From: Richard Henderson
Subject: Re: [PATCH 23/36] target/arm: Convert Neon 64-bit element 3-reg-same insns
Date: Fri, 1 May 2020 09:13:46 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.7.0

On 5/1/20 8:54 AM, Peter Maydell wrote:
> On Thu, 30 Apr 2020 at 21:54, Richard Henderson
> <address@hidden> wrote:
>>
>> On 4/30/20 11:09 AM, Peter Maydell wrote:
>>> +
>>> + rn = tcg_temp_new_i64();
>>> + rm = tcg_temp_new_i64();
>>> + rd = tcg_temp_new_i64();
>>> +
>>> + for (pass = 0; pass < (a->q ? 2 : 1); pass++) {
>>> + neon_load_reg64(rn, a->vn + pass);
>>> + neon_load_reg64(rm, a->vm + pass);
>>> + fn(rd, rm, rn);
>>> + neon_store_reg64(rd, a->vd + pass);
>>> + }
>>> +
>>> + tcg_temp_free_i64(rn);
>>> + tcg_temp_free_i64(rm);
>>> + tcg_temp_free_i64(rd);
>>> +
>>> + return true;
>>> +}
>>> +
>>> +#define DO_3SAME_64(INSN, FUNC) \
>>> + static bool trans_##INSN##_3s(DisasContext *s, arg_3same *a) \
>>> + { \
>>> + return do_3same_64(s, a, FUNC); \
>>> + }
>>
>> You can morph this into the gvec interface like so:
>>
>> #define DO_3SAME_64(INSN, FUNC) \
>>     static void gen_##INSN##_3s(unsigned vece, uint32_t rd_ofs, \
>>                                 uint32_t rn_ofs, uint32_t rm_ofs, \
>>                                 uint32_t oprsz, uint32_t maxsz) \
>>     { \
>>         static const GVecGen3 op = { .fni8 = FUNC }; \
>>         tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, \
>>                        oprsz, maxsz, &op); \
>>     } \
>>     DO_3SAME(INSN, gen_##INSN##_3s)
>>
>> The .fni8 function tells gvec that we have a helper that processes the
>> operation in 8 byte chunks. It will handle the pass loop for you.
>
> This doesn't quite work, because these are shift ops and
> so the operands are passed to the helper in the order
> rd, rm, rn. Reshuffling the order of arguments to
> tcg_gen_gvec_3() fixes this, though.
>
> I guess I should call the macro DO_3SAME_SHIFT64, I hadn't
> noticed it was shift specific because the only thing we do
> with it is shifts.
See my reply to patch 26. I think we should swap these operands during decode.
r~
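
For anyone following the operand-order point, here is a minimal
standalone sketch in plain C. It is not QEMU code: toy_fni8,
toy_expand_3, toy_shl_u64 and toy_vshl_q are hypothetical names made up
for illustration, and the shift helper is a simplified model (left shift
by the count modulo 64, no negative counts). It assumes only what the
thread states: the expander walks the operation in 8-byte chunks and
calls the helper as fn(d, a, b), while the shift helper wants the value
to shift (Vm) before the shift count (Vn).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a .fni8-style callback: one 64-bit chunk. */
typedef void (*toy_fni8)(uint64_t *d, uint64_t a, uint64_t b);

/* Byte offsets of three 128-bit registers in a toy register file. */
enum { VD_OFS = 0, VN_OFS = 16, VM_OFS = 32, REGFILE_BYTES = 48 };

/*
 * Toy 3-operand expander: processes oprsz bytes in 8-byte chunks, so the
 * caller needs no explicit "pass" loop. Operands reach fn in the order
 * (d, a, b), where a is read from a_ofs and b from b_ofs.
 */
static void toy_expand_3(uint8_t *env, size_t d_ofs, size_t a_ofs,
                         size_t b_ofs, size_t oprsz, toy_fni8 fn)
{
    for (size_t i = 0; i < oprsz; i += 8) {
        uint64_t a, b, d;
        memcpy(&a, env + a_ofs + i, sizeof(a));
        memcpy(&b, env + b_ofs + i, sizeof(b));
        fn(&d, a, b);
        memcpy(env + d_ofs + i, &d, sizeof(d));
    }
}

/* Simplified shift helper: expects (d, value, shift), i.e. (rd, rm, rn). */
static void toy_shl_u64(uint64_t *d, uint64_t value, uint64_t shift)
{
    *d = value << (shift & 63);
}

/*
 * Run the toy shift over a full 128-bit register (two 8-byte chunks).
 * swap_operands != 0 passes the Vm offset before the Vn offset, which is
 * the reshuffle discussed in the thread.
 */
static void toy_vshl_q(const uint64_t vn[2], const uint64_t vm[2],
                       uint64_t vd[2], int swap_operands)
{
    uint8_t env[REGFILE_BYTES] = {0};

    memcpy(env + VN_OFS, vn, 16);
    memcpy(env + VM_OFS, vm, 16);
    if (swap_operands) {
        toy_expand_3(env, VD_OFS, VM_OFS, VN_OFS, 16, toy_shl_u64);
    } else {
        toy_expand_3(env, VD_OFS, VN_OFS, VM_OFS, 16, toy_shl_u64);
    }
    memcpy(vd, env + VD_OFS, 16);
}

/* Small self-check showing both orderings on the same inputs. */
static void toy_self_check(void)
{
    const uint64_t vn[2] = { 4, 8 };      /* shift counts */
    const uint64_t vm[2] = { 1, 0x10 };   /* values to shift */
    uint64_t vd[2];

    toy_vshl_q(vn, vm, vd, 1);            /* value = Vm, shift = Vn */
    assert(vd[0] == 16 && vd[1] == 0x1000);

    toy_vshl_q(vn, vm, vd, 0);            /* value = Vn, shift = Vm */
    assert(vd[0] == 8 && vd[1] == 0x80000);
}
```

Calling toy_self_check() from a main() exercises both orderings. In this
model it makes no difference whether the swap happens at the expander
call site or earlier, when the operand fields are filled in; the second
option keeps the expander call looking like every other 3-operand op.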