[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH v2 42/45] target/arm/vec_helper: Handle oprsz less than 16 bytes
From: |
Peter Maydell |
Subject: |
[PATCH v2 42/45] target/arm/vec_helper: Handle oprsz less than 16 bytes in indexed operations |
Date: |
Fri, 28 Aug 2020 19:33:51 +0100 |
In the gvec helper functions for indexed operations, for AArch32
Neon the oprsz (total size of the vector) can be less than 16 bytes
if the operation is on a D reg. Since the inner loop in these
helpers always goes from 0 to segment, we must clamp it based
on oprsz to avoid processing a full 16 byte segment when asked to
handle an 8 byte wide vector.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
target/arm/vec_helper.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/target/arm/vec_helper.c b/target/arm/vec_helper.c
index 20f153b47a1..b27b90e1dd8 100644
--- a/target/arm/vec_helper.c
+++ b/target/arm/vec_helper.c
@@ -1040,7 +1040,8 @@ DO_MULADD(gvec_vfms_s, float32_mulsub_f, float32)
#define DO_MUL_IDX(NAME, TYPE, H) \
void HELPER(NAME)(void *vd, void *vn, void *vm, uint32_t desc) \
{ \
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
intptr_t idx = simd_data(desc); \
TYPE *d = vd, *n = vn, *m = vm; \
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
@@ -1061,7 +1062,8 @@ DO_MUL_IDX(gvec_mul_idx_d, uint64_t, )
#define DO_MLA_IDX(NAME, TYPE, OP, H) \
void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, uint32_t desc) \
{ \
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
intptr_t idx = simd_data(desc); \
TYPE *d = vd, *n = vn, *m = vm, *a = va; \
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
@@ -1086,7 +1088,8 @@ DO_MLA_IDX(gvec_mls_idx_d, uint64_t, -, )
#define DO_FMUL_IDX(NAME, TYPE, H) \
void HELPER(NAME)(void *vd, void *vn, void *vm, void *stat, uint32_t desc) \
{ \
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
intptr_t idx = simd_data(desc); \
TYPE *d = vd, *n = vn, *m = vm; \
for (i = 0; i < oprsz / sizeof(TYPE); i += segment) { \
@@ -1108,7 +1111,8 @@ DO_FMUL_IDX(gvec_fmul_idx_d, float64, )
void HELPER(NAME)(void *vd, void *vn, void *vm, void *va, \
void *stat, uint32_t desc) \
{ \
- intptr_t i, j, oprsz = simd_oprsz(desc), segment = 16 / sizeof(TYPE); \
+ intptr_t i, j, oprsz = simd_oprsz(desc); \
+ intptr_t segment = MIN(16, oprsz) / sizeof(TYPE); \
TYPE op1_neg = extract32(desc, SIMD_DATA_SHIFT, 1); \
intptr_t idx = desc >> (SIMD_DATA_SHIFT + 1); \
TYPE *d = vd, *n = vn, *m = vm, *a = va; \
--
2.20.1
- [PATCH v2 38/45] target/arm: Implement fp16 for Neon VCVT fixed-point, (continued)
- [PATCH v2 38/45] target/arm: Implement fp16 for Neon VCVT fixed-point, Peter Maydell, 2020/08/28
- [PATCH v2 39/45] target/arm: Implement fp16 for Neon VCVT with rounding modes, Peter Maydell, 2020/08/28
- [PATCH v2 43/45] target/arm/vec_helper: Add gvec fp indexed multiply-and-add operations, Peter Maydell, 2020/08/28
- [PATCH v2 40/45] target/arm: Implement fp16 for Neon VRINT-with-specified-rounding-mode, Peter Maydell, 2020/08/28
- [PATCH v2 45/45] target/arm: Enable FP16 in '-cpu max', Peter Maydell, 2020/08/28
- [PATCH v2 42/45] target/arm/vec_helper: Handle oprsz less than 16 bytes in indexed operations,
Peter Maydell <=
- [PATCH v2 41/45] target/arm: Implement fp16 for Neon VRINTX, Peter Maydell, 2020/08/28
- [PATCH v2 44/45] target/arm: Implement fp16 for Neon VMUL, VMLA, VMLS, Peter Maydell, 2020/08/28