Re: [Qemu-arm] [RFC PATCH 14/30] softfloat: 16 bit helpers for shr, clz


From: Richard Henderson
Subject: Re: [Qemu-arm] [RFC PATCH 14/30] softfloat: 16 bit helpers for shr, clz and rounding and packing
Date: Sun, 15 Oct 2017 11:02:45 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0

On 10/13/2017 09:24 AM, Alex Bennée wrote:
> Half-precision helpers for float16 maths. I didn't bother hand-coding
> the count leading zeros as we could always fall-back to host-utils if
> we needed to.
> 
> Signed-off-by: Alex Bennée <address@hidden>
> ---
>  fpu/softfloat-macros.h | 39 +++++++++++++++++++++++++++++++++++++++
>  fpu/softfloat.c        | 21 +++++++++++++++++++++
>  2 files changed, 60 insertions(+)
> 
> diff --git a/fpu/softfloat-macros.h b/fpu/softfloat-macros.h
> index 9cc6158cb4..73091a88a8 100644
> --- a/fpu/softfloat-macros.h
> +++ b/fpu/softfloat-macros.h
> @@ -89,6 +89,31 @@ this code that are retained.
>  # define SOFTFLOAT_GNUC_PREREQ(maj, min) 0
>  #endif
>  
> +/*----------------------------------------------------------------------------
> +| Shifts `a' right by the number of bits given in `count'.  If any nonzero
> +| bits are shifted off, they are ``jammed'' into the least significant bit of
> +| the result by setting the least significant bit to 1.  The value of `count'
> +| can be arbitrarily large; in particular, if `count' is greater than 16, the
> +| result will be either 0 or 1, depending on whether `a' is zero or nonzero.
> +| The result is stored in the location pointed to by `zPtr'.
> +*----------------------------------------------------------------------------*/
> +
> +static inline void shift16RightJamming(uint16_t a, int count, uint16_t *zPtr)
> +{
> +    uint16_t z;
> +
> +    if ( count == 0 ) {
> +        z = a;
> +    }
> +    else if ( count < 16 ) {
> +        z = ( a>>count ) | ( ( a<<( ( - count ) & 16 ) ) != 0 );
> +    }
> +    else {
> +        z = ( a != 0 );
> +    }
> +    *zPtr = z;
> +
> +}

When are you going to use a shift-right-jamming (SRJ) of a uint16_t?  Isn't
most of your arithmetic actually done on uint32_t?
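
For illustration only, a minimal sketch of shift-right-jamming done on a
uint32_t working value rather than a uint16_t.  The name
shift32_right_jam_sketch is hypothetical and not part of the patch, and the
sketch assumes count is non-negative.

```c
#include <stdint.h>

/* Hypothetical helper for illustration; not taken from the patch.
 * Assumes count >= 0. */
static inline uint32_t shift32_right_jam_sketch(uint32_t a, int count)
{
    if (count == 0) {
        return a;
    }
    if (count < 32) {
        /* Collect the bits that fall off the bottom; if any is set,
         * "jam" a sticky 1 into the least significant bit. */
        uint32_t lost = a & ((UINT32_C(1) << count) - 1);
        return (a >> count) | (lost != 0);
    }
    /* Everything is shifted out: only the sticky bit can survive. */
    return a != 0;
}
```

With this shape, a half-precision significand held in a 32-bit variable needs
no separate 16-bit helper.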

> +/*----------------------------------------------------------------------------
> +| Returns the number of leading 0 bits before the most-significant 1 bit of
> +| `a'.  If `a' is zero, 16 is returned.
> +*----------------------------------------------------------------------------*/
> +
> +static int8_t countLeadingZeros16( uint16_t a )
> +{
> +    if (a) {
> +        return __builtin_clz(a);
> +    } else {
> +        return 16;
> +    }
> +}

__builtin_clz works on "unsigned int", so the uint16_t argument is promoted
and the result is 16 too large.  You need to use clz32(a) - 16.
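
A minimal sketch of that suggestion, assuming GCC or Clang (for
__builtin_clz) and a 32-bit unsigned int.  clz32_sketch is a local stand-in
written for this example, assumed to return 32 for a zero input, and
clz16_sketch is a hypothetical name.

```c
#include <stdint.h>

/* Local stand-in for a 32-bit count-leading-zeros that returns 32 for
 * zero.  Assumes GCC/Clang and a 32-bit unsigned int. */
static inline int clz32_sketch(uint32_t val)
{
    return val ? __builtin_clz(val) : 32;
}

/* Leading zeros within a 16-bit value: count in 32 bits, then discard
 * the 16 zero bits introduced by widening.  Zero yields 32 - 16 = 16. */
static inline int clz16_sketch(uint16_t a)
{
    return clz32_sketch(a) - 16;
}
```

The zero case falls out of the subtraction, so no separate branch is needed in
the 16-bit wrapper.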

> +/*----------------------------------------------------------------------------
> +| Takes an abstract floating-point value having sign `zSign', exponent `zExp',
> +| and significand `zSig', and returns the proper single-precision floating-

s/single/half/

> +| point value corresponding to the abstract input.  This routine is just like
> +| `roundAndPackFloat32' except that `zSig' does not have to be normalized.
> +| Bit 15 of `zSig' must be zero, and `zExp' must be 1 less than the ``true''
> +| floating-point exponent.
> +*----------------------------------------------------------------------------*/
> +
> +static float16
> + normalizeRoundAndPackFloat16(flag zSign, int zExp, uint16_t zSig,
> +                              float_status *status)
> +{
> +    int8_t shiftCount;
> +
> +    shiftCount = countLeadingZeros16( zSig ) - 1;
> +    return roundAndPackFloat16(zSign, zExp - shiftCount, zSig<<shiftCount,
> +                               true, status);

Do I recall correctly that your lsb is between bits 7:6, like
roundAndPackFloat32?  You've got 11 bits of sig; plus 7 extra bits, that
makes 18 bits, which doesn't fit in uint16_t.

So, the reason that roundAndPackFloat32 uses 7 bits is that 7 + 24 == 31.

We can either use a split at (15 - 11 =) 4 bits, and still fit in a uint16_t,
or we can drop uint16_t and admit that the compiler is going to promote to int,
or uint32_t, anyway.  If we do that, we have options of a split between 4 and
(31 - 11 =) 20 bits.

We talked this week about fp->int conversion, and it did seem Really Useful
when we noted that sig << exp is representable in a uint32_t.  That suggests
a split at or below (32 - 11 - 14 =) 7.
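
The bit budget above can be written out as a small sketch.  All names here
are hypothetical; the constants 11 (significand bits) and 14 (largest shift)
are the ones used in the arithmetic above, and "fits" means the top bit of the
container stays clear, as the quoted comment requires for bit 15.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    F16_SIG_BITS  = 11,  /* 10 fraction bits plus the implicit leading bit */
    F16_MAX_SHIFT = 14,  /* largest exponent shift, as used in the message */
};

/* Total bits occupied by the significand plus `extra` rounding bits. */
static inline int f16_sig_width(int extra)
{
    return F16_SIG_BITS + extra;
}

/* Fits in uint16_t with bit 15 clear. */
static inline bool f16_fits_u16(int extra)
{
    return f16_sig_width(extra) <= 15;
}

/* Fits in uint32_t with bit 31 clear. */
static inline bool f16_fits_u32(int extra)
{
    return f16_sig_width(extra) <= 31;
}

/* sig << exp stays representable in uint32_t for the largest shift. */
static inline bool f16_shift_fits_u32(int extra)
{
    return f16_sig_width(extra) + F16_MAX_SHIFT <= 32;
}
```

Under these assumptions, 7 extra bits gives a width of 18, which rules out
uint16_t, while 4 is the largest split that still fits there and 7 is the
largest that keeps the shifted value within 32 bits.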


r~


