Re: [Qemu-devel] [PATCH v3 3/4] target-tilegx: Add double floating point
From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v3 3/4] target-tilegx: Add double floating point implementation
Date: Thu, 10 Dec 2015 13:17:19 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0
On 12/10/2015 06:15 AM, Chen Gang wrote:
> +#define TILEGX_F_MAN_HBIT (1ULL << 59)
...
> +static uint64_t fr_to_man(float64 d)
> +{
> + uint64_t val = get_f64_man(d) << 7;
> +
> + if (get_f64_exp(d)) {
> + val |= TILEGX_F_MAN_HBIT;
> + }
> +
> + return val;
> +}
One presumes that "HBIT" is the IEEE implicit one bit.
A better name or a better comment would help there.
Do we know for sure that "7" is the correct number of guard bits? From the gcc
implementation of floatsidf, I might guess that the correct number is "4".
> +static uint32_t get_fdouble_vexp(uint64_t n)
> +{
> + return extract32(n, 7, 13);
> +}
What's a "vexp"?
> +uint64_t helper_fdouble_unpack_min(CPUTLGState *env,
> + uint64_t srca, uint64_t srcb)
> +{
> + uint64_t v = 0;
> + uint32_t expa = get_f64_exp(srca);
> + uint32_t expb = get_f64_exp(srcb);
> +
> + if (float64_is_any_nan(srca) || float64_is_any_nan(srcb)
> + || float64_is_infinity(srca) || float64_is_infinity(srcb)) {
> + return 0;
> + } else if (expa > expb) {
> + if (expa - expb < 64) {
> + set_fdouble_man(&v, fr_to_man(srcb) >> (expa - expb));
> + } else {
> + return 0;
> + }
> + } else if (expa < expb) {
> + if (expb - expa < 64) {
> + set_fdouble_man(&v, fr_to_man(srca) >> (expb - expa));
I very sincerely doubt that a simple right-shift is correct. In order to
obtain proper rounding for real computation, a sticky bit is required. That
is, set bit 0 if any bits are shifted out. See the implementation of
shift64RightJamming in fpu/softfloat-macros.h.
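To illustrate the sticky-bit idea, here is a minimal standalone sketch of a "jamming" right shift. It is written from the description above, not copied from fpu/softfloat-macros.h, and the name shift_right_jam64 is made up for this example.

```c
#include <stdint.h>

/*
 * Shift `a` right by `count` bits. If any nonzero bit is shifted out,
 * set bit 0 of the result so that later rounding still sees that the
 * value was inexact (the "sticky" bit).
 *
 * Hypothetical helper for illustration only.
 */
static uint64_t shift_right_jam64(uint64_t a, unsigned count)
{
    if (count == 0) {
        return a;
    }
    if (count < 64) {
        /* Bits that a plain right shift would silently discard. */
        uint64_t lost = a & ((UINT64_C(1) << count) - 1);
        return (a >> count) | (lost != 0);
    }
    /* Everything is shifted out: only the sticky bit can survive. */
    return a != 0;
}
```

With something like this, the "exponent difference >= 64" case would no longer need a separate "return 0" branch, since the helper collapses the whole mantissa into the sticky bit.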
> +uint64_t helper_fdouble_addsub(CPUTLGState *env,
> + uint64_t dest, uint64_t srca, uint64_t srcb)
> +{
> + if (get_fdouble_calc(srcb) == TILEGX_F_CALC_ADD) {
> + return dest + srca; /* maybe set addsub overflow bit */
Definitely not. That would be part of packing.
> +/* absolute-add/mul may cause add/mul carry or overflow */
> +static bool proc_oflow(uint64_t *flags, uint64_t *v, uint64_t *srcb)
> +{
> + if (get_fdouble_man_of(*v)) {
> + set_fdouble_vexp(flags, get_fdouble_vexp(*flags) + 1);
> + *srcb >>= 1;
> + *srcb |= *v << 63;
> + *v >>= 1;
> + clear_fdouble_man_of(v);
> + }
> + return get_fdouble_vexp(*flags) > TILEGX_F_EXP_DMAX;
> +}
> +
> +uint64_t helper_fdouble_pack2(CPUTLGState *env, uint64_t flags /* dest */,
> + uint64_t srca, uint64_t srcb)
> +{
> + uint64_t v = srca;
> + float64 d = float64_set_sign(float64_zero, get_fdouble_sign(flags));
> +
> + /*
> + * fdouble_add_flags, fdouble_sub_flags, or fdouble_mul_flags have
> + * processed exceptions. So need not process fp_status, again.
> + */
No need to process fp_status at all, actually. Tile-GX (and pro) do not
support exception flags, so everything we do with fp_status is discarded.
Indeed, we should probably not store fp_status in env at all, but create it on
the stack in any function that actually needs one.
> +
> + if (get_fdouble_nan(flags)) {
> + return float64_val(float64_default_nan);
> + } else if (get_fdouble_inf(flags)) {
> + return float64_val(d |= float64_infinity);
s/|=/|/
> + /* absolute-mul needs left shift 4 + 1 bytes to match the real mantissa */
> + if (get_fdouble_calc(flags) == TILEGX_F_CALC_MUL) {
> + v <<= 5;
> + v |= srcb >> 59;
> + srcb <<= 5;
> + }
As with single, I don't like this calc thing. We can infer what's required
from first principles.
We're given two words containing mantissa, and a "flags" word containing sign,
exponent, and other flags. For add, sub, and floatsidf, the compiler passes us
0 as the low word; for mul the compiler passes us the result of a 64x64->128
bit multiply.
The first step would be to normalize the 128-bit value so that the highest bit
set is TILEGX_F_MAN_HBIT in the high word, adjusting the exponent in the
process. Fold the low word into the sticky bit of the high word (high |= (low
!= 0)) for rounding purposes.
The second step would be to round and pack, similar to roundAndPackFloat64,
except that your HBIT is at a different place than softfloat.c.
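A rough sketch of the first step, to make the idea concrete. It assumes the implicit bit sits at bit 59 of the high word (TILEGX_F_MAN_HBIT in the patch) and that the exponent is carried as a plain signed integer. The names norm_result, clz64_sketch and normalize128_sketch are invented for this example. Subnormal results, overflow, and the rounding and packing of the second step are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define MAN_HBIT_POS 59   /* assumed position of TILEGX_F_MAN_HBIT */

typedef struct {
    uint64_t man;   /* normalized high word, sticky bit folded into bit 0 */
    int32_t exp;    /* exponent adjusted for the normalization shift */
} norm_result;

/* Portable count-leading-zeros; the caller must pass a nonzero value. */
static int clz64_sketch(uint64_t v)
{
    int n = 0;
    assert(v != 0);
    while (!(v & (UINT64_C(1) << 63))) {
        v <<= 1;
        n++;
    }
    return n;
}

static norm_result normalize128_sketch(uint64_t hi, uint64_t lo, int32_t exp)
{
    norm_result r = { 0, exp };
    int top, target = 64 + MAN_HBIT_POS;

    if (hi == 0 && lo == 0) {
        return r;                       /* exact zero: nothing to normalize */
    }
    /* Index (0..127) of the highest set bit of the 128-bit value. */
    top = hi ? 127 - clz64_sketch(hi) : 63 - clz64_sketch(lo);

    if (top > target) {                 /* carry out: shift right by 1..4 */
        int s = top - target;
        uint64_t lost = lo & ((UINT64_C(1) << s) - 1);
        lo = (lo >> s) | (hi << (64 - s)) | (lost != 0);
        hi >>= s;
        r.exp += s;
    } else if (top < target) {          /* shift left by 1..123 */
        int s = target - top;
        if (s >= 64) {                  /* only possible when hi == 0 */
            hi = lo << (s - 64);
            lo = 0;
        } else {
            hi = (hi << s) | (lo >> (64 - s));
            lo <<= s;
        }
        r.exp -= s;
    }
    /* Fold whatever is left in the low word into the sticky bit. */
    r.man = hi | (lo != 0);
    return r;
}
```

The point of the sketch is that the same code path serves add, sub, floatsidf (low word 0) and mul (full 128-bit product), so no separate "calc" field is needed to decide how far to shift.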
> + d = calc(fsrca, fsrcb, fp_status); /* also check exceptions */
There are no exceptions to check.
r~