qemu-devel

Re: [RFC PATCH v1 03/43] accel/tcg: Add gvec size changing operations


From: Richard Henderson
Subject: Re: [RFC PATCH v1 03/43] accel/tcg: Add gvec size changing operations
Date: Tue, 3 Dec 2024 15:14:58 -0600
User-agent: Mozilla Thunderbird

On 12/3/24 14:15, Anton Johansson wrote:
> The point is that we have a lot of Hexagon instructions where size
> changes are probably unavoidable, another example is V6_vshuffh which
> takes in a <16 x i16> vector and shuffles the upper <8xi16> into the upper
> 16-bits of a <8 x i32> vector
>
>      void emit_V6_vshuffh(intptr_t vec3, intptr_t vec7) {
>          VectorMem mem = {0};
>          intptr_t vec2 = temp_new_gvec(&mem, 128);
>          tcg_gen_gvec_zext(MO_32, MO_16, vec2, vec7, 128, 64, 128);
>
>          intptr_t vec0 = temp_new_gvec(&mem, 128);
>          tcg_gen_gvec_zext(MO_32, MO_16, vec0, (vec7 + 64ull), 128, 64, 128);
>
>          intptr_t vec1 = temp_new_gvec(&mem, 128);
>          tcg_gen_gvec_shli(MO_32, vec1, vec0, 16, 128, 128);
>          tcg_gen_gvec_or(MO_32, vec3, vec1, vec2, 128, 128);
>      }
>
> Not to bloat the email too much with examples, you can see 3 more here
>
>    https://pad.rev.ng/11IvAKhiRy2cPwC7MX9nXA
>
> Maybe we rely on the target defining size-changing operations if they
> are needed?

Perhaps.

I'll note that emit_V6_vpackwh_sat in particular should probably not use vectors at all. I'm sure it would be shorter to simply expand directly to integer code.
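
As a rough illustration of what "integer code" could look like for a saturating narrow, here is a plain C reference model. It is only a sketch under stated assumptions: sat16_from_i32() and pack_words_sat() are hypothetical names, the model assumes signed 32-bit words narrowed to signed 16-bit halfwords, and it does not model the element ordering of the real instruction. A translator would emit the equivalent TCG integer ops rather than run C like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamp a signed 32-bit value into the int16_t range. */
static int16_t sat16_from_i32(int32_t x)
{
    if (x > INT16_MAX) {
        return INT16_MAX;
    }
    if (x < INT16_MIN) {
        return INT16_MIN;
    }
    return (int16_t)x;
}

/*
 * Hypothetical model: narrow n signed words to n signed halfwords,
 * saturating each element independently. No vector temporaries needed.
 */
static void pack_words_sat(int16_t *dst, const int32_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = sat16_from_i32(src[i]);
    }
}
```

Each element is handled with a compare-and-clamp, which is why a direct integer expansion can end up shorter than a chain of vector operations with intermediate buffers.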

I'll also note that tcg's vector support isn't really designed for the way you're using it. It leads to the creation of many on-stack temporaries that would not otherwise be required.

When targets are emitting their own complex patterns, the expected method is to use the GVecGen* structures and the callbacks therein. This allows the JIT to select different expansions depending on the host CPU vector support (or lack thereof).

For a simple example, see gen_gvec_xar() in target/arm/tcg/gengvec64.c, which simply combines a rotate and an xor. For a more complex example, see gen_gvec_usqadd_qc() later in that same file, where in the worst case we call an out-of-line helper.
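
The general shape of that callback approach can be sketched in standalone C. This is not QEMU code: DemoGVecGen, demo_expand() and the other demo_* names are hypothetical stand-ins, the real GVecGen* structures and callback signatures differ, and the example uses 32-bit elements purely for brevity. It only shows the idea of one descriptor offering several expansions of "xor then rotate", with the expander picking the best one available.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical expansion callback: d[i] = ror32(a[i] ^ b[i], sh). */
typedef void DemoExpandFn(uint32_t *d, const uint32_t *a,
                          const uint32_t *b, size_t n, unsigned sh);

/* Hypothetical descriptor, loosely modelled on the idea of GVecGen*. */
typedef struct {
    DemoExpandFn *vec;     /* optional: used only if the host has vectors */
    DemoExpandFn *inl;     /* optional: inline integer expansion */
    DemoExpandFn *helper;  /* required: out-of-line fallback */
} DemoGVecGen;

typedef enum { DEMO_PATH_VEC, DEMO_PATH_INLINE, DEMO_PATH_HELPER } DemoPath;

static uint32_t demo_ror32(uint32_t x, unsigned sh)
{
    sh &= 31;
    /* Avoid the undefined shift by 32 when sh is 0. */
    return sh ? (x >> sh) | (x << (32 - sh)) : x;
}

/* One integer expansion; reused below for every slot to keep the demo short. */
static void demo_xar_int(uint32_t *d, const uint32_t *a,
                         const uint32_t *b, size_t n, unsigned sh)
{
    for (size_t i = 0; i < n; i++) {
        d[i] = demo_ror32(a[i] ^ b[i], sh);
    }
}

/* Pick the best expansion the descriptor offers for this host. */
static DemoPath demo_expand(const DemoGVecGen *g, bool host_has_vec,
                            uint32_t *d, const uint32_t *a,
                            const uint32_t *b, size_t n, unsigned sh)
{
    if (host_has_vec && g->vec) {
        g->vec(d, a, b, n, sh);
        return DEMO_PATH_VEC;
    }
    if (g->inl) {
        g->inl(d, a, b, n, sh);
        return DEMO_PATH_INLINE;
    }
    g->helper(d, a, b, n, sh);
    return DEMO_PATH_HELPER;
}
```

The point of the design is that the target describes the operation once, and the choice between vector, inline integer and helper expansion is made per host rather than being hard-coded into the emitted sequence.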


r~


