
Re: [Qemu-devel] [patch] performance improvement (softmmu, x86, GCC 3)


From: André Braga
Subject: Re: [Qemu-devel] [patch] performance improvement (softmmu, x86, GCC 3)
Date: Wed, 4 Aug 2004 14:21:59 -0300

Hmmm, that's very, very interesting (and exciting)! Squeezing every
ounce of performance out of QEMU is *very* desirable IMO, provided
that it doesn't break compatibility with other architectures. I'd
personally enjoy seeing a patch (better yet, three discrete and
independent patches): one that disables GCSE (this one is done!), one
that introduces the "ecx thing" (sorry, I have no idea what you meant
in your message about this -- a patch with some code would certainly
help), and one that manually inlines the function you mentioned. All
in all, I'd like to see all the code working without special GCC
switches other than pro-optimization ones, because I see switches that
turn optimizations off as regressions.

Could you send me these patches? I'd be glad to test them! I'm not
sure I can be any more helpful than this, since I'm just beginning to
get familiar with QEMU's emulation techniques, let alone the code
itself...


----- Original Message -----
From: Piotr Krysik <address@hidden>
Date: Wed, 4 Aug 2004 05:50:18 -0700 (PDT)
Subject: Re: [Qemu-devel] [patch] performance improvement (softmmu, x86, GCC 3)
To: address@hidden


Hi, 
  
The "ecx thing" and disabling GCSE are not mutually
exclusive, but I didn't try to run or benchmark QEMU with
both. I'm not a GCC guru, but I believe that it should not
significantly impact QEMU performance. If you are willing
to do some tests, I could send you the "ecx" patch.
  
And yes, I tried different combinations of -fno-gcse 
suboptions, but none worked. 
  
To get more information about the problem, I used the
compiler's -da flag to trace GCC's optimizations of
op_rolb_kernel_T0_T1_cc. I discovered that the GCSE step
introduces a transformation that cannot be optimized
later. GCC insists on using a copy of the T0 value instead of
using the register ebx, which is globally reserved for T0 (and
as there are no free registers, it gives an error). The strangest
thing I noticed is that if I inline the stXXXX function by hand
instead of using the inline directive, the problem disappears.
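
For readers trying to picture the two variants compared above, here is a
minimal, portable sketch. It is not QEMU's code: st_byte, guest_ram, and
both op_rolb_* functions are hypothetical names, and T0/T1 are plain
globals standing in for the global register variables (T0 in ebx) that
the real build uses. It only shows what "inline directive" versus
"inlined by hand" means at the source level; it does not reproduce the
GCSE failure.

```c
#include <stdint.h>

/* Stand-ins for QEMU's T0/T1. In the build discussed above, T0 is a
   global register variable pinned to ebx; plain globals are used here
   (an assumption) so the sketch compiles on any target. */
static uint32_t T0, T1;

/* Hypothetical guest memory; not QEMU's real memory model. */
static uint8_t guest_ram[16];

/* Hypothetical store helper, standing in for a stXXXX function. */
static inline void st_byte(uint32_t addr, uint8_t val)
{
    guest_ram[addr % sizeof guest_ram] = val;
}

/* Rotate an 8-bit value left by count (taken modulo 8). */
static uint8_t rol8(uint8_t v, unsigned count)
{
    count &= 7;
    return (uint8_t)((v << count) | (v >> ((8 - count) & 7)));
}

/* Variant A: the store goes through the inline helper, so the compiler
   decides when and how the helper body is merged into the caller. */
void op_rolb_via_helper(uint32_t addr)
{
    T0 = rol8((uint8_t)T0, T1);
    st_byte(addr, (uint8_t)T0);
}

/* Variant B: the helper body is pasted in by hand, so the optimizer
   sees a single function from the start. */
void op_rolb_hand_inlined(uint32_t addr)
{
    T0 = rol8((uint8_t)T0, T1);
    guest_ram[addr % sizeof guest_ram] = (uint8_t)T0;
}
```

Both variants compute the same result; any difference shows up only in
the code GCC generates. Compiling the file to assembly with and without
-fno-gcse, or with -da to dump the intermediate passes, is one way to
compare them.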
  

Piotrek

--
"Dealing with failure is easy: Work hard to improve. Success is also
easy to handle: You've solved the wrong problem. Work hard to improve"
Alan J. Perlis



