Re: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multipli

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multipli

From:	Alex Bennée
Subject:	Re: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multiplication
Date:	Thu, 29 Mar 2018 11:00:31 +0100
User-agent:	mu4e 1.1.0; emacs 26.0.91

Emilio G. Cota <address@hidden> writes:

> On Wed, Mar 28, 2018 at 14:26:30 +0100, Alex Bennée wrote:
>> Emilio G. Cota <address@hidden> writes:
>> OK I've had a bit more of a play and I think we can drop the macro abuse
>> and have common wrappers for the host_fpu. We don't want to intermingle
>> with the soft float slow path to stop the compiler adding overhead. We
>> also need a wrapper for each float size and op count due to differences
>> in the classify functions. However the boiler plate is pretty common and
>> where there are differences the compiler is smart enough to fix it.
>>
>> See branch:
>> https://github.com/stsquad/qemu/tree/hostfloat/common-fpu-wrapper
>>
>> I keep the numbers for add/sub and doubled the speed of float32_mul on
>> my box, without any macros ;-)
>
> I really like the idea of letting the compiler unfold everything.
> In fact I just did that to re-implement fp-bench (now with support
> for -t host/soft, yay).

It's a poor mans templates but it certainly makes reading and debugging
the code a lot easier. Of course sometimes it does feel like you are
doing more work to guide the compiler to the right answer....

>
>> Full patch inline:
>>
>> diff --git a/fpu/softfloat.c b/fpu/softfloat.c
>> index d0f1f65c12..89217b5e67 100644
>> --- a/fpu/softfloat.c
>> +++ b/fpu/softfloat.c
>> @@ -879,56 +879,72 @@ soft_float64_sub(float64 a, float64 b, float_status 
>> *status)
>>      return float64_round_pack_canonical(pr, status);
>>  }
> (snip)
>> +static float fpu_mul32(float a, float b, bool *nocheck) {
>> +
>> +    if (float32_is_zero(a) || float32_is_zero(b)) {
>> +        bool signbit = float32_is_neg(a) ^ float32_is_neg(b);
>> +        *nocheck = true;
>> +        return float32_set_sign((0), signbit);
>> +    } else {
>> +        float ha = float32_to_float(a);
>> +        float hb = float32_to_float(b);
>> +        float hr = ha * hb;
>> +        return hr;
>>      }
>> +}
>
> This function is wrong :-(
>
> Note that a and b are floats, not float32's. So if any of
> them is 0.X then they get silently converted to 0, which goes via the
> fast(er) path above. This explains the speedup.

So we need to push the casting into the helpers. Well at least it didn't
get any slower once fixed ;-)

>
> Note that you could have caught this with:
>
>   $ ./fp-test -t soft ibm/* -w whitelist.txt -e x
>
> Compiling with -Wconversion would also point these out, but the output
> is way too noisy to be useful.
>
>
> That said, I'll take inspiration from your approach for v3--hopefully
> without (many) macros this time round.
>
> Thanks!
>
>               Emilio


--
Alex Bennée

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [PATCH v2 01/14] tests: add fp-bench, a collection of simple floating-point microbenchmarks, (continued)
- [Qemu-devel] [PATCH v2 01/14] tests: add fp-bench, a collection of simple floating-point microbenchmarks, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 13/14] hardfloat: support float32/64 comparison, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 04/14] fp-test: add muladd variants, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 10/14] hardfloat: support float32/64 division, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 14/14] hardfloat: support float32_to_float64, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 07/14] fpu: introduce hardfloat, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 12/14] hardfloat: support float32/64 square root, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multiplication, Emilio G. Cota, 2018/03/27
  - Re: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multiplication, Alex Bennée, 2018/03/28
    - Re: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multiplication, Emilio G. Cota, 2018/03/28
    - Re: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multiplication, Alex Bennée <=
- [Qemu-devel] [PATCH v2 11/14] hardfloat: support float32/64 fused multiply-add, Emilio G. Cota, 2018/03/27
- [Qemu-devel] [PATCH v2 02/14] tests: add fp-test, a floating point test suite, Emilio G. Cota, 2018/03/27
- Re: [Qemu-devel] [PATCH v2 00/14] fp-test + hardfloat, Bastian Koppelmann, 2018/03/27
  - Re: [Qemu-devel] [PATCH v2 00/14] fp-test + hardfloat, Bastian Koppelmann, 2018/03/27
    - [Qemu-devel] [PATCH] softfloat: rename canonicalize to sf_canonicalize, Emilio G. Cota, 2018/03/27
- Re: [Qemu-devel] [PATCH v2 00/14] fp-test + hardfloat, Alex Bennée, 2018/03/28
- Re: [Qemu-devel] [PATCH v2 00/14] fp-test + hardfloat, no-reply, 2018/03/29
- Re: [Qemu-devel] [PATCH v2 00/14] fp-test + hardfloat, no-reply, 2018/03/30

Prev by Date: Re: [Qemu-devel] [PATCH v2 00/14] fp-test + hardfloat
Next by Date: Re: [Qemu-devel] [PATCH for-2.13] pc-bios/s390-ccw: Increase virtio timeout to 30 seconds
Previous by thread: Re: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multiplication
Next by thread: [Qemu-devel] [PATCH v2 11/14] hardfloat: support float32/64 fused multiply-add
Index(es):
- Date
- Thread