Re: [PATCH v3 42/69] target/arm: Introduce gen_gvec_rev{16,32,64}


From: Richard Henderson
Subject: Re: [PATCH v3 42/69] target/arm: Introduce gen_gvec_rev{16,32,64}
Date: Wed, 11 Dec 2024 11:31:23 -0600
User-agent: Mozilla Thunderbird

On 12/11/24 11:19, Philippe Mathieu-Daudé wrote:
> On 11/12/24 17:30, Richard Henderson wrote:
>> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>   target/arm/tcg/translate.h      |  6 +++
>>   target/arm/tcg/gengvec.c        | 58 ++++++++++++++++++++++
>>   target/arm/tcg/translate-neon.c | 88 +++++++--------------------------
>>   3 files changed, 81 insertions(+), 71 deletions(-)
>>
>> diff --git a/target/arm/tcg/translate.h b/target/arm/tcg/translate.h
>> index cb8e1b2586..342ebedafc 100644
>> --- a/target/arm/tcg/translate.h
>> +++ b/target/arm/tcg/translate.h
>> @@ -586,6 +586,12 @@ void gen_gvec_cnt(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>>                     uint32_t opr_sz, uint32_t max_sz);
>>   void gen_gvec_rbit(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>>                      uint32_t opr_sz, uint32_t max_sz);
>> +void gen_gvec_rev16(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>> +                    uint32_t opr_sz, uint32_t max_sz);
>> +void gen_gvec_rev32(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>> +                    uint32_t opr_sz, uint32_t max_sz);
>> +void gen_gvec_rev64(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs,
>> +                    uint32_t opr_sz, uint32_t max_sz);
>
> Remembering https://lore.kernel.org/qemu-devel/20230822124042.54739-1-philmd@linaro.org/, these gvec helpers might be useful for other targets.

These may be factored incorrectly for other uses. Here, for rev<N>, N is the size of the container, and vece specifies the size of the element within each container. This is the reverse of the usual meaning of vece, but it maps well to the Arm instruction encoding.
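
To make the container/element split concrete, here is a rough reference model in plain C. It is only a sketch of the semantics described above, not the code from the patch: the name ref_rev and its parameters are made up for illustration, and it works on byte arrays in memory order rather than on TCG vector offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical reference model, not QEMU code.
 *
 * container_bytes: N / 8 for rev<N>, i.e. 2, 4 or 8.
 * vece:            log2 of the element size in bytes
 *                  (0 = byte, 1 = halfword, 2 = word), as with MO_8/16/32.
 *
 * Reverses the order of the elements inside each container.  The bytes
 * inside one element keep their order.  dst and src must not overlap.
 */
static void ref_rev(uint8_t *dst, const uint8_t *src, size_t opr_sz,
                    size_t container_bytes, unsigned vece)
{
    size_t esize = (size_t)1 << vece;
    size_t nelem = container_bytes / esize;

    /* The element must be strictly smaller than the container. */
    assert(esize < container_bytes);
    assert(opr_sz % container_bytes == 0);

    for (size_t c = 0; c < opr_sz; c += container_bytes) {
        for (size_t e = 0; e < nelem; e++) {
            /* Element e moves to the mirrored slot in the same container. */
            memcpy(dst + c + (nelem - 1 - e) * esize,
                   src + c + e * esize, esize);
        }
    }
}
```

With the bytes 0..7 as input, ref_rev(dst, src, 8, 8, 0) models rev64 on byte elements and yields 7..0, while ref_rev(dst, src, 8, 8, 1) models rev64 on halfword elements and yields 6,7,4,5,2,3,0,1.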

The only other bswap operations with vector operands that I can recall are s390x VLBR/VSTBR, and similar for Power VSX, which perform the reversal at the same time as a load/store. So in those cases the heavy lifting of the bswap gets pushed off to MO_BSWAP.


r~


