qemu-devel

Re: [Qemu-devel] [PATCH v3 3/4] target-tilegx: Add double floating point


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v3 3/4] target-tilegx: Add double floating point implementation
Date: Thu, 10 Dec 2015 13:17:19 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

On 12/10/2015 06:15 AM, Chen Gang wrote:
> +#define TILEGX_F_MAN_HBIT   (1ULL << 59)
...
> +static uint64_t fr_to_man(float64 d)
> +{
> +    uint64_t val = get_f64_man(d) << 7;
> +
> +    if (get_f64_exp(d)) {
> +        val |= TILEGX_F_MAN_HBIT;
> +    }
> +
> +    return val;
> +}

One presumes that "HBIT" is the IEEE implicit one bit.
A better name or a better comment would help there.

Do we know for sure that "7" is the correct number of guard bits?  From the gcc
implementation of floatsidf, I might guess that the correct number is "4".

> +static uint32_t get_fdouble_vexp(uint64_t n)
> +{
> +    return extract32(n, 7, 13);
> +}

What's a "vexp"?

> +uint64_t helper_fdouble_unpack_min(CPUTLGState *env,
> +                                   uint64_t srca, uint64_t srcb)
> +{
> +    uint64_t v = 0;
> +    uint32_t expa = get_f64_exp(srca);
> +    uint32_t expb = get_f64_exp(srcb);
> +
> +    if (float64_is_any_nan(srca) || float64_is_any_nan(srcb)
> +        || float64_is_infinity(srca) || float64_is_infinity(srcb)) {
> +        return 0;
> +    } else if (expa > expb) {
> +        if (expa - expb < 64) {
> +            set_fdouble_man(&v, fr_to_man(srcb) >> (expa - expb));
> +        } else {
> +            return 0;
> +        }
> +    } else if (expa < expb) {
> +        if (expb - expa < 64) {
> +            set_fdouble_man(&v, fr_to_man(srca) >> (expb - expa));

I very sincerely doubt that a simple right-shift is correct.  In order to
obtain proper rounding for real computation, a sticky bit is required.  That
is, set bit 0 if any bits are shifted out.  See the implementation of
shift64RightJamming in fpu/softfloat-macros.h.
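
For illustration, a minimal standalone sketch of a right shift that keeps a
sticky bit.  The name shift_right_sticky64 is hypothetical and this is not the
softfloat routine itself, only the same idea: any bits shifted out are OR-ed
into bit 0 so that later rounding can still see them.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU's API): shift A right by COUNT bits and
 * set bit 0 of the result if any non-zero bits were shifted out.
 */
static uint64_t shift_right_sticky64(uint64_t a, unsigned count)
{
    if (count == 0) {
        return a;                   /* nothing is lost */
    }
    if (count >= 64) {
        return a != 0;              /* everything is lost; keep only sticky */
    }
    uint64_t lost = a & ((UINT64_C(1) << count) - 1);
    return (a >> count) | (lost != 0);
}
```

With this, the "< 64" checks on the exponent difference are no longer needed
for correctness, since large shift counts collapse to the sticky bit alone.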

> +uint64_t helper_fdouble_addsub(CPUTLGState *env,
> +                               uint64_t dest, uint64_t srca, uint64_t srcb)
> +{
> +    if (get_fdouble_calc(srcb) == TILEGX_F_CALC_ADD) {
> +        return dest + srca; /* maybe set addsub overflow bit */

Definitely not.  That would be part of packing.

> +/* absolute-add/mul may cause add/mul carry or overflow */
> +static bool proc_oflow(uint64_t *flags, uint64_t *v, uint64_t *srcb)
> +{
> +    if (get_fdouble_man_of(*v)) {
> +        set_fdouble_vexp(flags, get_fdouble_vexp(*flags) + 1);
> +        *srcb >>= 1;
> +        *srcb |= *v << 63;
> +        *v >>= 1;
> +        clear_fdouble_man_of(v);
> +    }
> +    return get_fdouble_vexp(*flags) > TILEGX_F_EXP_DMAX;
> +}
> +
> +uint64_t helper_fdouble_pack2(CPUTLGState *env, uint64_t flags /* dest */,
> +                              uint64_t srca, uint64_t srcb)
> +{
> +    uint64_t v = srca;
> +    float64 d = float64_set_sign(float64_zero, get_fdouble_sign(flags));
> +
> +    /*
> +     * fdouble_add_flags, fdouble_sub_flags, or fdouble_mul_flags have
> +     * processed exceptions. So need not process fp_status, again.
> +     */

No need to process fp_status at all, actually.  Tile-GX (and pro) do not
support exception flags, so everything we do with fp_status is discarded.

Indeed, we should probably not store fp_status in env at all, but create it on
the stack in any function that actually needs one.


> +
> +    if (get_fdouble_nan(flags)) {
> +        return float64_val(float64_default_nan);
> +    } else if (get_fdouble_inf(flags)) {
> +        return float64_val(d |= float64_infinity);

s/|=/|/

> +    /* absolute-mul needs left shift 4 + 1 bytes to match the real mantissa */
> +    if (get_fdouble_calc(flags) == TILEGX_F_CALC_MUL) {
> +        v <<= 5;
> +        v |= srcb >> 59;
> +        srcb <<= 5;
> +    }

As with single, I don't like this calc thing.  We can infer what's required
from first principles.

We're given two words containing mantissa, and a "flags" word containing sign,
exponent, and other flags.  For add, sub, and floatsidf, the compiler passes us
0 as the low word; for mul the compiler passes us the result of a 64x64->128
bit multiply.

The first step would be to normalize the 128-bit value so that the highest bit
set is TILEGX_F_MAN_HBIT in the high word, adjusting the exponent in the
process.  Fold the low word into the sticky bit of the high word (high |= (low
!= 0)) for rounding purposes.
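
A rough sketch of that first step, assuming the implicit bit lives at bit 59 as
TILEGX_F_MAN_HBIT does in the patch.  The names normalize128 and clz64 are
hypothetical, and exponent underflow (subnormal results) is deliberately left
out.

```c
#include <stdint.h>

#define MAN_HBIT_POS 59   /* bit position of TILEGX_F_MAN_HBIT in the patch */

/* Hypothetical helper: count leading zero bits; returns 64 for zero. */
static int clz64(uint64_t x)
{
    int n = 0;

    if (x == 0) {
        return 64;
    }
    while (!(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/*
 * Normalize the 128-bit mantissa HI:LO so that its highest set bit lands
 * on bit MAN_HBIT_POS of the high word, adjusting *EXP to compensate.
 * The low word is then folded into bit 0 of the result as a sticky bit.
 */
static uint64_t normalize128(uint64_t hi, uint64_t lo, int *exp)
{
    const int target = 63 - MAN_HBIT_POS;   /* wanted leading zeros */
    int lz, s;

    if (hi == 0 && lo == 0) {
        return 0;
    }
    lz = hi ? clz64(hi) : 64 + clz64(lo);

    if (lz < target) {
        /* Too large: shift right by 1..4 bits, keeping lost bits sticky. */
        s = target - lz;
        lo = (lo >> s) | (hi << (64 - s))
             | ((lo & ((UINT64_C(1) << s) - 1)) != 0);
        hi >>= s;
        *exp += s;
    } else if (lz > target) {
        /* Too small: shift left; no bits can be lost in this direction. */
        s = lz - target;
        if (s >= 64) {
            hi = lo << (s - 64);
            lo = 0;
        } else {
            hi = (hi << s) | (lo >> (64 - s));
            lo <<= s;
        }
        *exp -= s;
    }
    return hi | (lo != 0);
}
```

Note that nothing here depends on which operation produced the inputs, so no
"calc" field is required.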

The second step would be to round and pack, similar to roundAndPackFloat64,
except that your HBIT is at a different place than softfloat.c.
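
A sketch of that second step for round-to-nearest-even only.  It assumes the
patch's layout (implicit bit at bit 59, hence 7 bits below the fraction), which
is exactly the guard-bit count questioned above, and it assumes a normalized
mantissa with a biased exponent in the normal range 1..0x7fe.  The name
round_pack_f64_sketch is hypothetical; subnormals and explicit overflow
handling are omitted.

```c
#include <stdint.h>

/* Assumption taken from the patch: 59 - 52 = 7 bits below the fraction. */
#define GUARD_BITS 7

/*
 * Round MAN (implicit bit at bit 59, sticky bit in bit 0) to nearest-even
 * and pack it with SIGN and biased EXP into IEEE double format.
 */
static uint64_t round_pack_f64_sketch(unsigned sign, unsigned exp, uint64_t man)
{
    const uint64_t half = UINT64_C(1) << (GUARD_BITS - 1);
    uint64_t round_bits = man & ((UINT64_C(1) << GUARD_BITS) - 1);

    if (man == 0) {
        return (uint64_t)sign << 63;        /* signed zero */
    }
    man = (man + half) >> GUARD_BITS;       /* round half up ... */
    if (round_bits == half) {
        man &= ~UINT64_C(1);                /* ... then fix exact ties to even */
    }
    /*
     * MAN still contains the implicit bit at bit 52, so add it to exp - 1.
     * A carry out of the mantissa then bumps the exponent automatically.
     */
    return ((uint64_t)sign << 63) + ((uint64_t)(exp - 1) << 52) + man;
}
```

Adding rather than OR-ing the fields is the same trick packFloat64 relies on:
a mantissa that rounds up to the next power of two carries into the exponent
field, and from exponent 0x7fe that carry yields infinity.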

> +    d = calc(fsrca, fsrcb, fp_status); /* also check exceptions */

There are no exceptions to check.


r~


