Re: [Qemu-devel] Mips 64 emulation not compiling


From: J. Mayer
Subject: Re: [Qemu-devel] Mips 64 emulation not compiling
Date: Sat, 27 Oct 2007 14:24:51 +0200

On Sat, 2007-10-27 at 12:19 +0100, Thiemo Seufer wrote:
> J. Mayer wrote:
> > The latest patches in clo make gcc 3.4.6 fail to build the mips64
> > targets on my amd64 host (looks like a register allocation clash in the
> > optimizer code).
> 
> Your version is likely faster as well.
> 
> > Furthermore, the clz micro-op for Mips seems very suspect to me,
> > according to the changes made in the clo implementation.
> 
> It is correct, the sign-extension bits are zero in that case.

OK, you know better than me...

> > I did change the clz / clo implementation to use the same code as the
> > one used for the PowerPC implementation. It seems to me that the result
> > would be correct... And it compiles...
> > 
> > Please take a look at the following patch:
> 
> We have now clz/clo in several places, so I expanded your patch a
> bit. For now it is only used for the mips target. Comments?

I fully agree with the idea of sharing this code, if it's OK according
to all targets' specifications. Please commit and I'll update the PowerPC
and Alpha targets to use them.
Oh, I did an optimisation for clz64 when used on 32-bit hosts, avoiding
the use of 64-bit logical operations:
static always_inline int clz64(uint64_t val)
{
    int cnt = 0;

#if HOST_LONG_BITS == 64
    if (!(val & 0xFFFFFFFF00000000ULL)) {
        cnt += 32;
        val <<= 32;
    }
    if (!(val & 0xFFFF000000000000ULL)) {
        cnt += 16;
        val <<= 16;
    }
    if (!(val & 0xFF00000000000000ULL)) {
        cnt += 8;
        val <<= 8;
    }
    if (!(val & 0xF000000000000000ULL)) {
        cnt += 4;
        val <<= 4;
    }
    if (!(val & 0xC000000000000000ULL)) {
        cnt += 2;
        val <<= 2;
    }
    if (!(val & 0x8000000000000000ULL)) {
        cnt++;
        val <<= 1;
    }
    if (!(val & 0x8000000000000000ULL)) {
        cnt++;
    }
#else
    /* Make it easier on 32 bits host machines */
    if (!(val >> 32))
        cnt = _do_cntlzw(val) + 32;
    else
        cnt = _do_cntlzw(val >> 32);
#endif

    return cnt;
}

If gcc is really clever, this would not lead to better code, but it
seemed that the 32-bit implementation led to more optimized code
on 32-bit hosts. Maybe this implementation could also be used for 64-bit
hosts, avoiding the #ifdef.
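
To illustrate the #ifdef-free variant, here is a minimal sketch. The names
clz32_sketch and clz64_sketch are hypothetical; clz32_sketch stands in for
_do_cntlzw and is assumed to return 32 for a zero input. Plain
"static inline" is used in place of always_inline so that the fragment
compiles on its own.

```c
#include <stdint.h>

/* Hypothetical 32-bit helper (stand-in for _do_cntlzw); returns 32 for 0. */
static inline int clz32_sketch(uint32_t val)
{
    int cnt = 0;

    if (!(val & 0xFFFF0000U)) { cnt += 16; val <<= 16; }
    if (!(val & 0xFF000000U)) { cnt += 8;  val <<= 8;  }
    if (!(val & 0xF0000000U)) { cnt += 4;  val <<= 4;  }
    if (!(val & 0xC0000000U)) { cnt += 2;  val <<= 2;  }
    if (!(val & 0x80000000U)) { cnt++;     val <<= 1;  }
    if (!(val & 0x80000000U)) { cnt++; }

    return cnt;
}

/* Same code path on 32-bit and 64-bit hosts: only 32-bit logic is used. */
static inline int clz64_sketch(uint64_t val)
{
    uint32_t hi = (uint32_t)(val >> 32);

    if (hi == 0)
        return clz32_sketch((uint32_t)val) + 32;
    return clz32_sketch(hi);
}
```

Whether this is actually faster than the 64-bit shift sequence on a 64-bit
host would need to be measured.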

Count trailing zeros is also implemented on Alpha; it may be a good idea
to share that implementation too, if needed:
static always_inline int ctz32 (uint32_t val)
{
    int cnt = 0;

    if (!(val & 0x0000FFFFUL)) {
        cnt += 16;
        val >>= 16;
    }
    if (!(val & 0x000000FFUL)) {
        cnt += 8;
        val >>= 8;
    }
    if (!(val & 0x0000000FUL)) {
        cnt += 4;
        val >>= 4;
    }
    if (!(val & 0x00000003UL)) {
        cnt += 2;
        val >>= 2;
    }
    if (!(val & 0x00000001UL)) {
        cnt++;
        val >>= 1;
    }
    if (!(val & 0x00000001UL)) {
        cnt++;
    }

    return cnt;
}

static always_inline int ctz64 (uint64_t val)
{
    int cnt = 0;

    if (!(val & 0x00000000FFFFFFFFULL)) {
        cnt += 32;
        val >>= 32;
    }
    /* Make it easier for 32 bits hosts */
    cnt += ctz32(val);

    return cnt;
}

And of course cto32 and cto64 could also be added.
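
As a rough sketch of how that could look: counting trailing ones is the
same as counting trailing zeros of the complemented value. The *_sketch
names below are hypothetical, and the ctz helpers are repeated only so
that the fragment compiles on its own (with plain "static inline" instead
of always_inline).

```c
#include <stdint.h>

/* Count trailing zeros, 32 bits; returns 32 for 0. */
static inline int ctz32_sketch(uint32_t val)
{
    int cnt = 0;

    if (!(val & 0x0000FFFFU)) { cnt += 16; val >>= 16; }
    if (!(val & 0x000000FFU)) { cnt += 8;  val >>= 8;  }
    if (!(val & 0x0000000FU)) { cnt += 4;  val >>= 4;  }
    if (!(val & 0x00000003U)) { cnt += 2;  val >>= 2;  }
    if (!(val & 0x00000001U)) { cnt++;     val >>= 1;  }
    if (!(val & 0x00000001U)) { cnt++; }

    return cnt;
}

/* Count trailing zeros, 64 bits, built on the 32-bit helper. */
static inline int ctz64_sketch(uint64_t val)
{
    int cnt = 0;

    if (!(val & 0x00000000FFFFFFFFULL)) {
        cnt += 32;
        val >>= 32;
    }
    return cnt + ctz32_sketch((uint32_t)val);
}

/* Count trailing ones: trailing zeros of the bitwise complement. */
static inline int cto32_sketch(uint32_t val)
{
    return ctz32_sketch(~val);
}

static inline int cto64_sketch(uint64_t val)
{
    return ctz64_sketch(~val);
}
```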

I also have optimized versions of bit population count which could also
be shared:
static always_inline int ctpop32 (uint32_t val)
{
    int i;

    /* Each pass clears the lowest set bit, so the loop runs once per 1 bit */
    for (i = 0; val != 0; i++)
        val = val & (val - 1);

    return i;
}
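
A 64-bit variant could be sketched along the same lines, assuming the
usual clear-the-lowest-set-bit loop (val &= val - 1) and summing the two
32-bit halves so that 32-bit hosts never need 64-bit arithmetic. The
*_sketch names are hypothetical, and plain "static inline" is used so the
fragment compiles on its own.

```c
#include <stdint.h>

/* Population count, 32 bits: one loop iteration per set bit. */
static inline int ctpop32_sketch(uint32_t val)
{
    int i;

    for (i = 0; val != 0; i++)
        val &= val - 1;   /* clear the lowest set bit */

    return i;
}

/* Population count, 64 bits: sum of the counts of both halves. */
static inline int ctpop64_sketch(uint64_t val)
{
    return ctpop32_sketch((uint32_t)(val >> 32)) +
           ctpop32_sketch((uint32_t)val);
}
```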

If you prefer, I can add those shared functions (ctz32, ctz64, cto32,
cto64, ctpop32, ctpop64) later, as they do not seem as widely used as
clxxx functions.

-- 
J. Mayer <address@hidden>
Never organized




