From: Richard Henderson
Subject: Re: [Qemu-devel] [V2 PATCH 12/14] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
Date: Wed, 20 Nov 2013 10:29:40 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0

On 11/19/2013 11:40 PM, Tom Musta wrote:
> + /* NOTE: in order to get accurate results, we must first round back */ \
> + /* to single precision and use the fused multiply add routine */ \
> + /* for 32-bit floats. */ \
> + float_status tstat = env->fp_status; \
> + float32 a32 = float64_to_float32(xa.f64[0], &tstat); \
> + float32 b32 = float64_to_float32(b->f64[0], &tstat); \
> + float32 c32 = float64_to_float32(c->f64[0], &tstat); \
> + \
> + set_float_exception_flags(0, &tstat); \
> + float32 t32 = float32_muladd(a32, b32, c32, maddflgs, &tstat); \

While this will produce correct results for the "normal" use case of correctly
rounded single-precision inputs, the spec says
# Except for xsresp or xsrsqrtesp, any double-precision value can
# be used in single-precision scalar arithmetic operations when
# OE=0 and UE=0.
Thus a more correct implementation would use the full double-precision inputs
while also correctly rounding. I pointed you at the glibc implementation to
show how that can be done using round-to-zero plus examining the inexact bit.

    float_status tstat = env->fp_status;
    set_float_exception_flags(0, &tstat);
    if (tstat.float_rounding_mode == float_round_nearest_even) {
        /* Avoid double rounding errors by rounding the intermediate
           result to odd.  See
           http://hal.inria.fr/docs/00/08/04/27/PDF/odd-rounding.pdf  */
        set_float_rounding_mode(float_round_to_zero, &tstat);
        res = float64_muladd(...);
        res |= (get_float_exception_flags(&tstat) & float_flag_inexact) != 0;
    } else {
        res = float64_muladd(...);
    }
    res = helper_frsp(env, res);
    apply tstat exceptions;

r~