qemu-devel

[PATCH 00/13] target/i386: optimize string operations


From: Paolo Bonzini
Subject: [PATCH 00/13] target/i386: optimize string operations
Date: Sun, 15 Dec 2024 10:05:59 +0100

This started as an "easy" fix for RF handling in string instructions.
I then realized how broken repz_opt is (patch 5) in that it was optimizing
for the wrong case; and that redoing the optimization would make the RF
handling basically free.

On a microbenchmark running x86-on-x86 user-mode emulation, stos and
movs execute about 40% fewer instructions and about 60% fewer branches.
Performance is very variable, because it is limited by memory bandwidth
and because the out-of-order processor does a great job of scheduling
all the useless instructions executed by the older code; but the
microbenchmark results seem to improve by 10-15%.
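
As a rough illustration of the guest-visible behaviour being optimized,
here is a minimal C model of REP STOSB. It is only a sketch and not the
TCG code from the series: struct guest_regs, rep_stosb_chunk() and the
"budget" parameter are hypothetical names made up for this example. It
assumes that several iterations can run back to back before control
returns to the caller, and that the per-iteration update value depends
only on the direction flag, so it can be computed once outside the loop.

```c
#include <stdint.h>

/* Hypothetical model of the guest state; this is not QEMU's CPUX86State. */
struct guest_regs {
    uint64_t cx;  /* remaining iteration count (CX/ECX/RCX) */
    uint64_t di;  /* destination index into guest memory */
    uint8_t  al;  /* byte value to store */
    int      df;  /* direction flag: 0 = ascending, 1 = descending */
};

/*
 * Run at most `budget` iterations of REP STOSB against `mem` and return
 * the number of iterations executed. The budget is an assumption of this
 * sketch: it stands in for whatever bounds the work done in one go.
 */
static unsigned rep_stosb_chunk(struct guest_regs *r, uint8_t *mem,
                                unsigned budget)
{
    /* The update value only depends on DF, so compute it once. */
    int64_t step = r->df ? -1 : 1;
    unsigned done = 0;

    while (r->cx != 0 && done < budget) {
        mem[r->di] = r->al;        /* store AL at [DI] */
        r->di += (uint64_t)step;   /* advance DI by +1 or -1 */
        r->cx--;                   /* one fewer iteration left */
        done++;
    }
    return done;
}
```

A caller would invoke rep_stosb_chunk() repeatedly until cx reaches zero;
because cx and di are updated on every iteration, the operation can be
resumed at any chunk boundary.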

Paolo

Paolo Bonzini (13):
  target/i386: inline gen_jcc into sole caller
  target/i386: remove trailing 1 from gen_{j,cmov,set}cc1
  target/i386: unify REP and REPZ/REPNZ generation
  target/i386: unify choice between single and repeated string
    instructions
  target/i386: reorganize ops emitted by do_gen_rep, drop repz_opt
  target/i386: tcg: move gen_set/reset_* earlier in the file
  target/i386: fix RF handling for string instructions
  target/i386: make cc_op handling more explicit for repeated string
    instructions.
  target/i386: do not use gen_op_jz_ecx for repeated string operations
  target/i386: optimize CX handling in repeated string operations
  target/i386: execute multiple REP/REPZ iterations without leaving TB
  target/i386: pull computation of string update value out of loop
  target/i386: avoid using s->tmp0 for add to implicit registers

 target/i386/tcg/translate.c | 342 +++++++++++++++++++++---------------
 target/i386/tcg/emit.c.inc  |  56 ++----
 2 files changed, 219 insertions(+), 179 deletions(-)

-- 
2.47.1



