qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/2] tcg-x86_64: Avoid unnecessary REX.B prefixe


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 2/2] tcg-x86_64: Avoid unnecessary REX.B prefixes.
Date: Thu, 14 Jan 2010 10:09:48 -0800
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.5) Gecko/20091209 Fedora/3.0-4.fc12 Thunderbird/3.0

On 01/14/2010 08:10 AM, Aurelien Jarno wrote:
On Tue, Jan 05, 2010 at 04:31:11PM -0800, Richard Henderson wrote:
A while ago Laurent pointed out that the setcc opcode emitted by
the setcond patch had unnecessary REX prefixes.

The existing P_REXB internal opcode flag unconditionally emits
the REX prefix.  Technically it's not needed if the register in
question is %al, %bl, %cl, %dl.

Eliding the prefix requires splitting the P_REXB flag into two,
in order to indicate whether the byte register in question is
in the REG or the R/M field.  Within TCG, the byte register is
in the REG field only for stores.

Signed-off-by: Richard Henderson<address@hidden>
---
  tcg/x86_64/tcg-target.c |   46 ++++++++++++++++++++++++++++++----------------
  1 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/tcg/x86_64/tcg-target.c b/tcg/x86_64/tcg-target.c
index f584c94..8b6b68c 100644
--- a/tcg/x86_64/tcg-target.c
+++ b/tcg/x86_64/tcg-target.c
@@ -217,9 +217,10 @@ static inline int tcg_target_const_match(tcg_target_long 
val,
  #define JCC_JLE 0xe
  #define JCC_JG  0xf

-#define P_EXT   0x100 /* 0x0f opcode prefix */
-#define P_REXW  0x200 /* set rex.w = 1 */
-#define P_REXB  0x400 /* force rex use for byte registers */
+#define P_EXT          0x100           /* 0x0f opcode prefix */
+#define P_REXW         0x200           /* set rex.w = 1 */
+#define P_REXB_R       0x400           /* REG field as byte register */
+#define P_REXB_RM      0x800           /* R/M field as byte register */

  static const uint8_t tcg_cond_to_jcc[10] = {
      [TCG_COND_EQ] = JCC_JE,
@@ -234,17 +235,30 @@ static const uint8_t tcg_cond_to_jcc[10] = {
      [TCG_COND_GTU] = JCC_JA,
  };

-static inline void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x)
+static void tcg_out_opc(TCGContext *s, int opc, int r, int rm, int x)
  {
-    int rex;
-    rex = ((opc>>  6)&  0x8) | ((r>>  1)&  0x4) |
-        ((x>>  2)&  2) | ((rm>>  3)&  1);
-    if (rex || (opc&  P_REXB)) {
+    int rex = 0;
+
+    rex |= (opc&  P_REXW)>>  6;              /* REX.W */
+    rex |= (r&  8)>>  1;             /* REX.R */
+    rex |= (x&  8)>>  2;             /* REX.X */
+    rex |= (rm&  8)>>  3;            /* REX.B */
+
+    /* P_REXB_{R,RM} indicates that the given register is the low byte.
+       For %[abcd]l we need no REX prefix, but for %{si,di,bp,sp}l we do,
+       as otherwise the encoding indicates %[abcd]h.  Note that the values
+       that are ORed in merely indicate that the REX byte must be present;
+       those bits get discarded in output.  */
+    rex |= (r>= 4 ? opc&  P_REXB_R : 0);
+    rex |= (rm>= 4 ? opc&  P_REXB_RM : 0);
+
+    if (rex) {
          tcg_out8(s, rex | 0x40);
      }

With the above change, rex can be > 0xff. Not sure it's not a good idea
to not have an explicit cast when calling tcg_out8(), even if it
technically works.

Same as below.


-    if (opc&  P_EXT)
+    if (opc&  P_EXT) {
          tcg_out8(s, 0x0f);
-    tcg_out8(s, opc&  0xff);
+    }
+    tcg_out8(s, opc);

What's the reason for removing the '&  0xff' part? tcg_out8() takes an uint8_t.

Yes, and the uint8_t truncates the value just fine. Is there any particular reason you want to clutter the code with a duplicate truncation? It might have been reasonable if the function name didn't quite clearly indicate that a single byte was going to be output...


r~




reply via email to

[Prev in Thread] Current Thread [Next in Thread]