

From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH v1 08/14] hostfloat: support float32/64 addition and subtraction
Date: Thu, 22 Mar 2018 15:57:21 -0400
User-agent: Mutt/1.5.24 (2015-08-30)

On Thu, Mar 22, 2018 at 14:41:05 +0800, Richard Henderson wrote:
(snip)
> Another thought re all of the soft_is_normal || soft_is_zero checks that 
> you're
> performing.  I think it would be nice if we could work with
> float*_unpack_canonical so that we don't have to duplicate work.  E.g.
> 
> /* Return true for float_class_normal || float_class_zero.  */
> static inline bool is_finite(FloatClass c) { return c <= float_class_zero; }
> 
> float32 float32_add(float32 a, float32 b, float_status *s)
> {
>   FloatClass a_cls = float32_classify(a);
>   FloatClass b_cls = float32_classify(b);

I just looked at this. It can be done, although it comes at the
price of some performance for fp-bench -o add:
180 MFlops vs. 196 MFlops, i.e. an 8% slowdown. That is with
adequate inlining etc.; otherwise perf is worse.

I'm not convinced we'd gain enough in simplicity to justify
the perf impact. Yes, we'd simplify canonicalize(), but we'd
probably need a float_class_denormal[*], which would
complicate everything else.

I think it makes sense to keep some inlines that work on
the float32/64 bit patterns directly.
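
To make that concrete, here is a minimal sketch of such an inline
(illustrative only: the name is made up, and it assumes float32 is
the usual 32-bit container with float32_val() yielding the raw bits):

/* Sketch: classify "zero or normal" straight from the float32 bit
 * pattern, without unpacking.  Layout: sign(1) exp(8) frac(23). */
static inline bool f32_is_zero_or_normal(float32 a)
{
    uint32_t bits = float32_val(a);
    uint32_t exp = (bits >> 23) & 0xff;

    if (exp == 0) {
        /* exponent 0: zero iff the fraction is also 0, else denormal */
        return (bits & 0x7fffff) == 0;
    }
    /* exponent 0xff is inf/NaN; anything in between is normal */
    return exp != 0xff;
}

That way the fast-path guard costs a couple of masks and compares
instead of a full unpack.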

>   if (is_finite(a_cls) && is_finite(b_cls) && ...) {
>       /* do hardfp thing */
>   }

[*] Accepting zeroes, denormals and normals would be OK
correctness-wise, but we really don't want to compute ops with
denormal inputs on the host: the output is very likely to be
denormal as well, and then we'd end up deferring to soft-fp
anyway, since computing whether the underflow exception has
occurred is expensive.
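
To make the footnote concrete, the guard I have in mind looks
roughly like this (a sketch only: f32_is_zero_or_normal and
soft_f32_add are the illustrative names from this mail, and the
inexact flag is ignored here, which the real code cannot do):

#include <math.h>    /* fpclassify */
#include <string.h>  /* memcpy */

float32 float32_add(float32 a, float32 b, float_status *s)
{
    if (likely(f32_is_zero_or_normal(a) && f32_is_zero_or_normal(b) &&
               s->float_rounding_mode == float_round_nearest_even)) {
        float ha, hb, hr;

        /* reinterpret the guest bit patterns as host floats */
        memcpy(&ha, &a, sizeof(ha));
        memcpy(&hb, &b, sizeof(hb));
        hr = ha + hb;
        if (likely(fpclassify(hr) == FP_NORMAL)) {
            float32 r;

            memcpy(&r, &hr, sizeof(r));
            return r;
        }
        /* zero, inf or denormal result: getting the exception flags
         * right is exactly the expensive part, so conservatively
         * defer to soft-fp */
    }
    return soft_f32_add(a, b, s);
}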

>   pa = float32_unpack(a, ca, s);
>   pb = float32_unpack(b, cb, s);
>   pr = addsub_floats(pa, pb, s, false);
>   return float32_round_pack(pr, s);
> }

It pays off to have two separate functions (add & sub) for the
slow path. With soft_f32_add/sub factored out:

$ taskset -c 0 x86_64-linux-user/qemu-x86_64 tests/fp-bench -o add
197.53 MFlops

With the above four lines (pa...return) as an else branch:
169.16 MFlops

BTW, flattening everything into a single function makes things
worse (150.63 MFlops).
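
For reference, the factoring that gets 197.53 looks roughly like
this (a sketch; the helpers follow the float*_unpack_canonical
style you mention, with addsub_floats taking the subtract flag as
a bool):

static __attribute__((noinline))
float32 soft_f32_add(float32 a, float32 b, float_status *s)
{
    FloatParts pa = float32_unpack_canonical(a, s);
    FloatParts pb = float32_unpack_canonical(b, s);
    FloatParts pr = addsub_floats(pa, pb, false, s);

    return float32_round_pack_canonical(pr, s);
}

static __attribute__((noinline))
float32 soft_f32_sub(float32 a, float32 b, float_status *s)
{
    FloatParts pa = float32_unpack_canonical(a, s);
    FloatParts pb = float32_unpack_canonical(b, s);
    FloatParts pr = addsub_floats(pa, pb, true, s);

    return float32_round_pack_canonical(pr, s);
}

Keeping these noinline is what keeps the fast path small; both the
else-branch and the flattened variants pull the slow path back into
the caller and lose MFlops.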

Note that fp-bench only tests normal numbers, but I think it's fair
to assume that this is the path we want to speed up.

Thanks,

                E.


