[PULL 08/29] target/arm: Fix SQDMULH (by element) with Q=0
From: Peter Maydell
Subject: [PULL 08/29] target/arm: Fix SQDMULH (by element) with Q=0
Date: Mon, 1 Jul 2024 17:07:08 +0100
From: Richard Henderson <richard.henderson@linaro.org>
The inner loop, bounded by eltspersegment, must not be
larger than the outer loop, bounded by elements.
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20240625183536.1672454-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/tcg/vec_helper.c | 24 ++++++++++++++++--------
1 file changed, 16 insertions(+), 8 deletions(-)
diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 7b34cc98afe..d477479bb19 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -317,10 +317,12 @@ void HELPER(neon_sqdmulh_idx_h)(void *vd, void *vn, void *vm,
     intptr_t i, j, opr_sz = simd_oprsz(desc);
     int idx = simd_data(desc);
     int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx);
+    intptr_t elements = opr_sz / 2;
+    intptr_t eltspersegment = MIN(16 / 2, elements);
 
-    for (i = 0; i < opr_sz / 2; i += 16 / 2) {
+    for (i = 0; i < elements; i += 16 / 2) {
         int16_t mm = m[i];
-        for (j = 0; j < 16 / 2; ++j) {
+        for (j = 0; j < eltspersegment; ++j) {
             d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, false, vq);
         }
     }
@@ -333,10 +335,12 @@ void HELPER(neon_sqrdmulh_idx_h)(void *vd, void *vn, void *vm,
     intptr_t i, j, opr_sz = simd_oprsz(desc);
     int idx = simd_data(desc);
     int16_t *d = vd, *n = vn, *m = (int16_t *)vm + H2(idx);
+    intptr_t elements = opr_sz / 2;
+    intptr_t eltspersegment = MIN(16 / 2, elements);
 
-    for (i = 0; i < opr_sz / 2; i += 16 / 2) {
+    for (i = 0; i < elements; i += 16 / 2) {
         int16_t mm = m[i];
-        for (j = 0; j < 16 / 2; ++j) {
+        for (j = 0; j < eltspersegment; ++j) {
             d[i + j] = do_sqrdmlah_h(n[i + j], mm, 0, false, true, vq);
         }
     }
@@ -512,10 +516,12 @@ void HELPER(neon_sqdmulh_idx_s)(void *vd, void *vn, void *vm,
     intptr_t i, j, opr_sz = simd_oprsz(desc);
     int idx = simd_data(desc);
     int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx);
+    intptr_t elements = opr_sz / 4;
+    intptr_t eltspersegment = MIN(16 / 4, elements);
 
-    for (i = 0; i < opr_sz / 4; i += 16 / 4) {
+    for (i = 0; i < elements; i += 16 / 4) {
         int32_t mm = m[i];
-        for (j = 0; j < 16 / 4; ++j) {
+        for (j = 0; j < eltspersegment; ++j) {
             d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, false, vq);
         }
     }
@@ -528,10 +534,12 @@ void HELPER(neon_sqrdmulh_idx_s)(void *vd, void *vn, void *vm,
     intptr_t i, j, opr_sz = simd_oprsz(desc);
     int idx = simd_data(desc);
     int32_t *d = vd, *n = vn, *m = (int32_t *)vm + H4(idx);
+    intptr_t elements = opr_sz / 4;
+    intptr_t eltspersegment = MIN(16 / 4, elements);
 
-    for (i = 0; i < opr_sz / 4; i += 16 / 4) {
+    for (i = 0; i < elements; i += 16 / 4) {
         int32_t mm = m[i];
-        for (j = 0; j < 16 / 4; ++j) {
+        for (j = 0; j < eltspersegment; ++j) {
             d[i + j] = do_sqrdmlah_s(n[i + j], mm, 0, false, true, vq);
         }
     }
--
2.34.1
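
Note: the bounds rule from the commit message can be tried outside QEMU with
a small standalone sketch. Everything below is hypothetical and for
illustration only: toy_mul() is a plain multiply standing in for
do_sqrdmlah_h(), toy_mul_idx_h() is not a QEMU helper, and the operand size
is passed directly instead of being decoded from a descriptor. It assumes
that Q=0 corresponds to an 8-byte operand, i.e. four 16-bit elements, which
is fewer than the eight elements of a 16-byte segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEGMENT_BYTES 16 /* one 128-bit segment, as in the patch */

/* Hypothetical stand-in for do_sqrdmlah_h(): no doubling, no saturation. */
static int16_t toy_mul(int16_t a, int16_t b)
{
    return (int16_t)(a * b);
}

/*
 * Multiply each element of n by the selected element of m, one segment at
 * a time. m is assumed to point at the already-indexed element, as in the
 * helpers touched by the patch. opr_sz is the operand size in bytes.
 */
static void toy_mul_idx_h(int16_t *d, const int16_t *n, const int16_t *m,
                          size_t opr_sz)
{
    size_t elements = opr_sz / sizeof(int16_t);
    size_t per_segment = SEGMENT_BYTES / sizeof(int16_t);
    /* Clamp the inner bound so a short operand is not overrun. */
    size_t eltspersegment = elements < per_segment ? elements : per_segment;
    size_t i, j;

    assert(opr_sz % sizeof(int16_t) == 0);

    for (i = 0; i < elements; i += per_segment) {
        int16_t mm = m[i];
        for (j = 0; j < eltspersegment; ++j) {
            d[i + j] = toy_mul(n[i + j], mm);
        }
    }
}
```

Calling toy_mul_idx_h() with opr_sz = 8 on a destination pre-filled with a
sentinel value leaves lanes 4..7 untouched. Without the clamp, the inner
loop would run to 8 and process lanes beyond the operand.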