
From: Paolo Bonzini
Subject: [Qemu-devel] [PULL 01/50] target/i386: fix pmovsx/pmovzx in-place operations
Date: Tue, 19 Sep 2017 14:28:50 +0200

From: Joseph Myers <address@hidden>

The SSE4.1 pmovsx* and pmovzx* instructions take packed 1-byte, 2-byte
or 4-byte inputs and sign-extend or zero-extend them to a wider vector
output.  The associated helpers for these instructions do the
extension on each element in turn, starting with the lowest.  If the
input and output are the same register, this means that all the input
elements after the first have been overwritten before they are read.
This patch makes the helpers extend starting with the highest element,
not the lowest, to avoid such overwriting.  This fixes many GCC test
failures (161 in the gcc testsuite in my GCC 6-based testing) when
testing with a default CPU setting enabling those instructions.
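The overwrite can be reproduced outside QEMU with a small standalone sketch. It is not the QEMU helper code: DemoReg and all function names below are hypothetical, and only the byte-to-word case (as in pmovsxbw) is modelled. It assumes nothing beyond the two views sharing the same 16 bytes of storage.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit SSE register: the same 16 bytes
 * viewed either as signed bytes or as signed 16-bit words. */
typedef union {
    int8_t  b[16];
    int16_t w[8];
} DemoReg;

/* Sign-extend the low 8 bytes of *s into the 8 words of *d, lowest
 * element first.  If d == s, writing w[0] clobbers b[1] before it is read. */
static void extend_low_first(DemoReg *d, const DemoReg *s)
{
    for (int i = 0; i < 8; i++) {
        d->w[i] = s->b[i];
    }
}

/* Same operation, highest element first.  Writing w[i] touches bytes
 * 2*i and 2*i+1, and every byte above index i has already been read,
 * so this order is safe when d == s. */
static void extend_high_first(DemoReg *d, const DemoReg *s)
{
    for (int i = 7; i >= 0; i--) {
        d->w[i] = s->b[i];
    }
}

/* Run one variant in place on the bytes -1, -2, ..., -16 and report
 * whether word i ends up as the sign-extended value -(i + 1). */
static int in_place_matches(void (*extend)(DemoReg *, const DemoReg *))
{
    DemoReg r;
    for (int i = 0; i < 16; i++) {
        r.b[i] = (int8_t)-(i + 1);
    }
    extend(&r, &r);
    for (int i = 0; i < 8; i++) {
        if (r.w[i] != -(i + 1)) {
            return 0;
        }
    }
    return 1;
}

/* The high-first order survives in-place use; the low-first order does not. */
static void demo_self_check(void)
{
    assert(in_place_matches(extend_high_first));
    assert(!in_place_matches(extend_low_first));
}
```

Calling demo_self_check() from a test program exercises both orders. The same reasoning extends to the wider element sizes, since the output element at index i never starts below input element i.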

Signed-off-by: Joseph Myers <address@hidden>

Message-Id: <address@hidden>
Signed-off-by: Paolo Bonzini <address@hidden>
---
 target/i386/ops_sse.h | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 16509d0..d578216 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -1617,18 +1617,18 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 #define SSE_HELPER_F(name, elem, num, F)        \
     void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)     \
     {                                           \
-        d->elem(0) = F(0);                      \
-        d->elem(1) = F(1);                      \
         if (num > 2) {                          \
-            d->elem(2) = F(2);                  \
-            d->elem(3) = F(3);                  \
             if (num > 4) {                      \
-                d->elem(4) = F(4);              \
-                d->elem(5) = F(5);              \
-                d->elem(6) = F(6);              \
                 d->elem(7) = F(7);              \
+                d->elem(6) = F(6);              \
+                d->elem(5) = F(5);              \
+                d->elem(4) = F(4);              \
             }                                   \
+            d->elem(3) = F(3);                  \
+            d->elem(2) = F(2);                  \
         }                                       \
+        d->elem(1) = F(1);                      \
+        d->elem(0) = F(0);                      \
     }
 
 SSE_HELPER_F(helper_pmovsxbw, W, 8, (int8_t) s->B)
-- 
1.8.3.1




