Re: [Qemu-devel] [PATCH v2] utils: Add pow2ceil()
From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v2] utils: Add pow2ceil()
Date: Fri, 13 Mar 2015 12:04:02 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

On 03/12/2015 09:45 AM, Eric Blake wrote:
> On 03/12/2015 09:29 AM, Richard Henderson wrote:
>> On 02/25/2015 02:45 AM, Markus Armbruster wrote:
>>> return 0x8000000000000000u >> (clz64(value - 1) - 1);
>>
>> I realize this was weeks ago, but it would certainly be preferable to shift a
>> small constant left than a large constant right.
>>
>> Most RISC machines can't form 0x8000000000000000ull without loading 1 and
>> then left shifting to start with. So in the end you're better off with
>>
>> return 1ull << (63 - clz64(value));
>
> Since the value being shifted is a constant either way, can't gcc figure
> out the equivalence and generate the optimal code to begin with? If
> not, should it be opened as a gcc bug for potential optimization?

With the simplest of tests,

    unsigned long f(unsigned long x)
    {
        return 1UL << (63 - x);
    }

    unsigned long g(unsigned long x)
    {
        return 0x8000000000000000ul >> x;
    }

the code is of similar size: 3 operations each.
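
As a quick sanity check that the two test functions really are interchangeable, something along these lines could be used. It is only a sketch: the helper names are made up for illustration, fixed-width uint64_t stands in for unsigned long, and it says nothing about the generated code, only that the results agree for every shift count from 0 to 63.

```c
#include <assert.h>
#include <stdint.h>

/* Small constant shifted left by (63 - x); valid for x in 0..63. */
static uint64_t shift_one_left(unsigned x)
{
    return UINT64_C(1) << (63 - x);
}

/* Large constant shifted right by x; valid for x in 0..63. */
static uint64_t shift_top_right(unsigned x)
{
    return UINT64_C(0x8000000000000000) >> x;
}

/* Assert that both forms agree for every defined shift count. */
static void check_shift_forms(void)
{
    for (unsigned x = 0; x <= 63; x++) {
        assert(shift_one_left(x) == shift_top_right(x));
    }
}
```

Calling check_shift_forms() from a test program's main() is enough to exercise all 64 cases.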

But if you throw in the whole operation

    1ul << (63 - (__builtin_clzl(x - 1) - 1))

vs

    0x8...0ul >> (__builtin_clzl(x - 1) - 1)

then gcc is able to fold away one of the instructions, and the 1UL alternative
is shorter.

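
To confirm that the two full expressions compute the same power-of-two ceiling, a minimal sketch such as the following may help. Assumptions: the function names are hypothetical, __builtin_clzll (the GCC/Clang builtin for unsigned long long) replaces __builtin_clzl so the width is 64 bits regardless of platform, and inputs are restricted to 2 <= x <= 2^63, because the builtin is undefined for a zero argument and the shift count must stay non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Large constant shifted right. Assumes 2 <= x <= 2^63. */
static uint64_t pow2ceil_shift_right(uint64_t x)
{
    return UINT64_C(0x8000000000000000) >> (__builtin_clzll(x - 1) - 1);
}

/* Small constant shifted left. Assumes 2 <= x <= 2^63. */
static uint64_t pow2ceil_shift_left(uint64_t x)
{
    return UINT64_C(1) << (63 - (__builtin_clzll(x - 1) - 1));
}

/* Check both forms around every power of two from 2^1 to 2^63. */
static void check_pow2ceil_forms(void)
{
    for (unsigned k = 1; k <= 63; k++) {
        uint64_t p = UINT64_C(1) << k;

        /* An exact power of two maps to itself. */
        assert(pow2ceil_shift_right(p) == p);
        assert(pow2ceil_shift_left(p) == p);

        /* p - 1 rounds up to p (skip k == 1, where x - 1 would be 0). */
        if (k >= 2) {
            assert(pow2ceil_shift_right(p - 1) == p);
            assert(pow2ceil_shift_left(p - 1) == p);
        }

        /* p + 1 rounds up to 2p, as long as 2p still fits in 64 bits. */
        if (k <= 62) {
            assert(pow2ceil_shift_right(p + 1) == 2 * p);
            assert(pow2ceil_shift_left(p + 1) == 2 * p);
        }
    }
}
```

This only checks the results; comparing the instruction counts still means looking at the compiler's assembly output (for example with gcc -O2 -S) on the target of interest.
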
r~