[PATCH v2 42/45] target/arm/vec_helper: Handle oprsz less than 16 bytes

qemu-arm

From:	Peter Maydell
Subject:	[PATCH v2 42/45] target/arm/vec_helper: Handle oprsz less than 16 bytes in indexed operations
Date:	Fri, 28 Aug 2020 19:33:51 +0100

In the gvec helper functions for indexed operations, the oprsz
(total size of the vector) can be less than 16 bytes for AArch32
Neon if the operation is on a D reg. Since the inner loop in these
helpers always iterates from 0 to segment, we must clamp segment
based on oprsz to avoid processing a full 16-byte segment when asked
to handle an 8-byte-wide vector.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
 target/arm/vec_helper.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
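
For illustration only, here is a standalone sketch of the clamping
described in the commit message. It is not the QEMU code: the names
sketch_mul_idx_u32 and sketch_d_reg_case are hypothetical, the element
type is fixed to uint32_t, oprsz is passed directly rather than decoded
from desc, and the host-endianness H() adjustment is omitted. Without
the MIN() clamp, segment would be 4 elements and the inner loop would
write elements 2 and 3 even though only 8 bytes were requested.

```c
#include <assert.h>
#include <stdint.h>

#ifndef MIN
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#endif

/*
 * Hypothetical stand-in for an indexed multiply helper on 32-bit
 * elements: each element of n is multiplied by the element of m
 * selected by idx within the current segment.
 */
void sketch_mul_idx_u32(uint32_t *d, const uint32_t *n, const uint32_t *m,
                        intptr_t oprsz, intptr_t idx)
{
    intptr_t elems = oprsz / (intptr_t)sizeof(uint32_t);
    /* Clamp the segment so an 8-byte operation covers only 2 elements. */
    intptr_t segment = MIN(16, oprsz) / (intptr_t)sizeof(uint32_t);
    intptr_t i, j;

    for (i = 0; i < elems; i += segment) {
        uint32_t mm = m[i + idx];
        for (j = 0; j < segment; j++) {
            d[i + j] = n[i + j] * mm;
        }
    }
}

/* Returns 1 if an 8-byte (D reg sized) operation leaves bytes 8..15 alone. */
int sketch_d_reg_case(void)
{
    uint32_t n[4] = { 1, 2, 3, 4 };
    uint32_t m[4] = { 10, 20, 30, 40 };
    uint32_t d[4] = { 0xdeadbeef, 0xdeadbeef, 0xdeadbeef, 0xdeadbeef };

    sketch_mul_idx_u32(d, n, m, 8, 1);

    /* Only the low two elements are written; the rest keep the sentinel. */
    return d[0] == 20 && d[1] == 40 &&
           d[2] == 0xdeadbeef && d[3] == 0xdeadbeef;
}
```

With oprsz of 16 the clamp has no effect, so the sketch behaves the
same as the unclamped loop for Q reg sized operations.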

diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 20f153b47a1..b27b90e1dd8 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1040,7 +1040,8 @@ DO_MULADD(gvec_vfms_s, float32_mulsub_f, float32)
 #define DO_MUL_IDX(NAME, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
 {                                                                          \
-    intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE);  \
+    intptr_t i, j, oprsz = simd_oprsz(desc);                               \
+    intptr_t segment = MIN(16, oprsz) / sizeof(TYPE);                      \
     intptr_t idx = simd_data(desc);                                        \
     TYPE *d = vd, *n = vn, *m = vm;                                        \
     for (i = 0; i < oprsz / sizeof(TYPE); i += segment) {                  \
@@ -1061,7 +1062,8 @@ DO_MUL_IDX(gvec_mul_idx_d, uint64_t, )
 #define DO_MLA_IDX(NAME, TYPE, OP, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc)   \
 {                                                                          \
-    intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE);  \
+    intptr_t i, j, oprsz = simd_oprsz(desc);                               \
+    intptr_t segment = MIN(16, oprsz) / sizeof(TYPE);                      \
     intptr_t idx = simd_data(desc);                                        \
     TYPE *d = vd, *n = vn, *m = vm, *a = va;                               \
     for (i = 0; i < oprsz / sizeof(TYPE); i += segment) {                  \
@@ -1086,7 +1088,8 @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -,   )
 #define DO_FMUL_IDX(NAME, TYPE, H) \
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
 {                                                                          \
-    intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE);  \
+    intptr_t i, j, oprsz = simd_oprsz(desc);                               \
+    intptr_t segment = MIN(16, oprsz) / sizeof(TYPE);                      \
     intptr_t idx = simd_data(desc);                                        \
     TYPE *d = vd, *n = vn, *m = vm;                                        \
     for (i = 0; i < oprsz / sizeof(TYPE); i += segment) {                  \
@@ -1108,7 +1111,8 @@ DO_FMUL_IDX(gvec_fmul_idx_d, float64, )
 void HELPER(NAME)(void *vd, void *vn, void *vm, void *va,                  \
                   void *stat, uint32_t desc)                               \
 {                                                                          \
-    intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE);  \
+    intptr_t i, j, oprsz = simd_oprsz(desc);                               \
+    intptr_t segment = MIN(16, oprsz) / sizeof(TYPE);                      \
     TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1);                    \
     intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1);                          \
     TYPE *d = vd, *n = vn, *m = vm, *a = va;                               \
-- 
2.20.1
