Re: [Qemu-devel] [V3 PATCH 12/14] target-ppc: VSX Stage 4: Add Scalar SP


From: Richard Henderson
Subject: Re: [Qemu-devel] [V3 PATCH 12/14] target-ppc: VSX Stage 4: Add Scalar SP Fused Multiply-Adds
Date: Wed, 04 Dec 2013 13:23:22 +1300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0

On 12/04/2013 04:58 AM, Tom Musta wrote:
> This patch adds the Single Precision VSX Scalar Fused Multiply-Add
> instructions: xsmaddasp, xsmaddmsp, xsmsubasp, xsmsubmsp, xsnmaddasp,
> xsnmaddmsp, xsnmsubasp, xsnmsubmsp.
> 
> The existing VSX_MADD() macro is modified to support rounding of the
> intermediate double precision result to single precision.
> 
> V2: Re-implemented per feedback from Richard Henderson.  In order to
> avoid double rounding and incorrect results, the operands must be
> converted to true single precision values and use the single precision
> fused multiply/add routine.
> 
> V3: Re-implemented per feedback from Richard Henderson.  The implementation
> now uses a round-to-odd algorithm to address subtle double rounding errors.
> 
> Signed-off-by: Tom Musta <address@hidden>
> ---
>  target-ppc/fpu_helper.c |   84 ++++++++++++++++++++++++++++++----------------
>  target-ppc/helper.h     |    8 ++++
>  target-ppc/translate.c  |   16 +++++++++
>  3 files changed, 79 insertions(+), 29 deletions(-)
> 
> diff --git a/target-ppc/fpu_helper.c b/target-ppc/fpu_helper.c
> index 8825db2..077d057 100644
> --- a/target-ppc/fpu_helper.c
> +++ b/target-ppc/fpu_helper.c
> @@ -2192,7 +2192,7 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, f32, -126, 23)
>   *   afrm  - A form (1=A, 0=M)
>   *   sfprf - set FPRF
>   */
> -#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf)                    \
> +#define VSX_MADD(op, nels, tp, fld, maddflgs, afrm, sfprf, r2sp)              \
>  void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>  {                                                                             \
>      ppc_vsr_t xt_in, xa, xb, xt_out;                                          \
> @@ -2218,8 +2218,18 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>      for (i = 0; i < nels; i++) {                                              \
>          float_status tstat = env->fp_status;                                  \
>          set_float_exception_flags(0, &tstat);                                 \
> -        xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],          \
> -                                     maddflgs, &tstat);                       \
> +        if (r2sp && (tstat.float_rounding_mode == float_round_nearest_even)) {\
> +            /* Avoid double rounding errors by rounding the intermediate */   \
> +            /* result to odd.                                            */   \
> +            set_float_rounding_mode(float_round_to_zero, &tstat);             \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
> +                                       maddflgs, &tstat);                     \
> +            xt_out.fld[i] |= (get_float_exception_flags(&tstat) &             \
> +                              float_flag_inexact) != 0;                       \
> +        } else {                                                              \
> +            xt_out.fld[i] = tp##_muladd(xa.fld[i], b->fld[i], c->fld[i],      \
> +                                        maddflgs, &tstat);                    \
> +        }                                                                     \
>          env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
>                                                                                \
>          if (unlikely(tstat.float_exception_flags & float_flag_invalid)) {     \
> @@ -2242,6 +2252,13 @@ void helper_##op(CPUPPCState *env, uint32_t opcode)                           \
>                  fload_invalid_op_excp(env, POWERPC_EXCP_FP_VXISI, sfprf);     \
>              }                                                                 \
>          }                                                                     \
> +                                                                              \
> +        if (r2sp) {                                                           \
> +            float32 tmp32 = float64_to_float32(xt_out.fld[i],                 \
> +                                               &env->fp_status);              \
> +            xt_out.fld[i] = float32_to_float64(tmp32, &env->fp_status);       \
> +        }                                                                     \
> +                                                                              \

Use helper_frsp for this conversion.

Otherwise,

Reviewed-by: Richard Henderson <address@hidden>


r~
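
For anyone following the round-to-odd reasoning in the V3 note, here is a
minimal integer sketch of why it avoids the double rounding problem. It is
not QEMU or softfloat code: the function names (rne_shift, rto_shift,
round_direct, round_twice, round_via_odd) are hypothetical, and a plain
integer stands in for the exact significand. Dropping the low a bits models
the double precision intermediate, and dropping b more bits models the final
single precision result. The sketch assumes b >= 2, that is, the
intermediate keeps at least two more bits than the final format.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the low n bits of x, rounding to nearest with ties to even. */
static uint64_t rne_shift(uint64_t x, unsigned n)
{
    assert(n >= 1 && n < 64);
    uint64_t q = x >> n;
    uint64_t rem = x & ((1ULL << n) - 1);
    uint64_t half = 1ULL << (n - 1);

    if (rem > half || (rem == half && (q & 1))) {
        q++;
    }
    return q;
}

/*
 * Drop the low n bits of x, rounding to odd: truncate, then OR a sticky
 * "inexact" bit into the least significant bit of the result.  This is
 * the same shape as the patch, which runs the muladd in round-to-zero
 * and then ORs in the inexact flag.
 */
static uint64_t rto_shift(uint64_t x, unsigned n)
{
    assert(n >= 1 && n < 64);
    uint64_t q = x >> n;
    uint64_t rem = x & ((1ULL << n) - 1);

    return q | (rem != 0);
}

/* Reference: a single correctly rounded step that drops a + b bits. */
static uint64_t round_direct(uint64_t x, unsigned a, unsigned b)
{
    return rne_shift(x, a + b);
}

/* Naive: round to nearest twice, which can double round. */
static uint64_t round_twice(uint64_t x, unsigned a, unsigned b)
{
    return rne_shift(rne_shift(x, a), b);
}

/* Round to odd first, then to nearest; matches round_direct for b >= 2. */
static uint64_t round_via_odd(uint64_t x, unsigned a, unsigned b)
{
    return rne_shift(rto_shift(x, a), b);
}
```

With x = 0x178 and a = b = 4, round_direct() gives 1, round_twice() gives 2
(both steps hit a tie and round up), and round_via_odd() gives 1 again,
because the sticky bit keeps the intermediate value from landing exactly on
the second tie.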


