From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v15 33/33] target-tilegx: Handle v1shl, v1shru, v1shrs
Date: Thu, 3 Sep 2015 08:38:19 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0

On 09/03/2015 08:02 AM, Eric Blake wrote:
> Is it any more efficient to use multiplies instead of looping, as in:
>
>     uint64_t m;
>
>     b &= 7;
>     m = 0x0101010101010101ULL * ((1 << (8 - b)) - 1);
>     return (a & m) << b;
>
> Or if multiplies are bad, what about straight-line expansion of the
> mask, as in:
>
>     uint64_t m;
>
>     b &= 7;
>     m = (1 << (8 - b)) - 1;
>     m |= m << 32;
>     m |= m << 16;
>     m |= m << 8;
>     return (a & m) << b;

The multiply will (likely) be expanded by the compiler to the shift, but yes, I should have played the mask game after Peter pointed out that this wasn't a per-element variable shift in v14.
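
For reference, a minimal sketch of the mask approach next to a per-byte loop, assuming v1shl shifts every byte lane left by the same count (b & 7). The names v1shl_loop and v1shl_mask are hypothetical, chosen for illustration, and are not taken from the patch.

```c
#include <stdint.h>

/* Reference version: shift each of the 8 byte lanes left by (b & 7),
 * one lane at a time, discarding bits that leave the lane.
 * Hypothetical name, not from the patch. */
uint64_t v1shl_loop(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    int i;

    b &= 7;
    for (i = 0; i < 64; i += 8) {
        uint64_t lane = (a >> i) & 0xff;
        r |= ((lane << b) & 0xff) << i;
    }
    return r;
}

/* Mask version: clear the top b bits of every lane first, so that one
 * 64-bit shift cannot carry bits into the neighbouring lane.
 * Hypothetical name, not from the patch. */
uint64_t v1shl_mask(uint64_t a, uint64_t b)
{
    uint64_t m;

    b &= 7;
    /* Replicate the per-lane mask (low 8 - b bits set) into all 8 bytes. */
    m = 0x0101010101010101ULL * ((1ULL << (8 - b)) - 1);
    return (a & m) << b;
}
```

Both functions should agree for any input; for example, with a = 0x0102030405060708 and b = 4, each returns 0x1020304050607080.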
r~