qemu-devel

Re: [Qemu-devel] [PATCH v4 6/9] target-ppc: add lxvh8x instruction


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v4 6/9] target-ppc: add lxvh8x instruction
Date: Wed, 28 Sep 2016 10:22:30 -0700

On 09/28/2016 10:11 AM, Nikunj A Dadhania wrote:
> Richard Henderson <address@hidden> writes:
> 
>> On 09/27/2016 10:31 PM, Nikunj A Dadhania wrote:
>>> +DEF_HELPER_1(bswap16x4, i64, i64)
>>
>> DEF_HELPER_FLAGS_1(bswap16x4, TCG_CALL_NO_RWG_SE, i64, i64)
>>
>>> +    uint64_t m = 0x00ff00ff00ff00ffull;
>>> +    return ((x & m) << 8) | ((x >> 8) & m);
>>
>> ... although I suppose this is only 5 instructions, and could reasonably be
>> done inline too.  Especially if you shared the one 64-bit constant across the
>> two bswaps.
> 
> Something like this:
> 
> static void gen_bswap16x4(TCGv_i64 val)
> {
>     TCGv_i64 mask = tcg_const_i64(0x00FF00FF00FF00FF);
>     TCGv_i64 t0 = tcg_temp_new_i64();
>     TCGv_i64 t1 = tcg_temp_new_i64();
> 
>     /* val = ((val & mask) << 8) | ((val >> 8) & mask) */
>     tcg_gen_and_i64(t0, val, mask);
>     tcg_gen_shli_i64(t0, t0, 8);
>     tcg_gen_shri_i64(t1, val, 8);
>     tcg_gen_and_i64(t1, t1, mask);
>     tcg_gen_or_i64(val, t0, t1);
> 
>     tcg_temp_free_i64(t0);
>     tcg_temp_free_i64(t1);
>     tcg_temp_free_i64(mask);
> }

Like that, except that since you always perform this twice, you should share
the expensive constant load.  Recall also that you need temporaries for the
store, so

static void gen_bswap16x8(TCGv_i64 outh, TCGv_i64 outl,
                          TCGv_i64 inh, TCGv_i64 inl)


r~


