qemu-devel

From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH v2 00/26] tcg: rework liveness analysis and register allocator
Date: Wed, 10 Oct 2012 10:24:38 +0200
User-agent: Mutt/1.5.20 (2009-06-14)

On Wed, Oct 10, 2012 at 09:49:53AM +0200, Paolo Bonzini wrote:
> On 10/10/2012 09:42, Aurelien Jarno wrote:
> > On Wed, Oct 10, 2012 at 08:59:43AM +0200, Paolo Bonzini wrote:
> >> On 09/10/2012 21:55, Aurelien Jarno wrote:
> >>> This patch series reworks the liveness analysis and register allocator
> >>> in order to generate more optimized code, by avoiding a lot of move
> >>> instructions. I have measured a 9% performance improvement in user mode
> >>> and 4% in system mode.
> >>>
> >>> The idea behind this patch series is to free registers as soon as the
> >>> temps are not used anymore instead of waiting for a basic block end or
> >>> an op with side effects.
> >>
> >> Would it make any sense to express the saves as real TCG ops?  This
> >> would have a couple of advantages:
> > 
> > It depends what you mean by that. Spills are decided more or less at the
> > last moment (no free registers available, clobbered registers in a 
> > function call).
> 
> I'm not talking of spills; only saves of dead globals and local temps.
> These can be computed before the optimizer runs, right?

It is doable, though it might not be that easy to implement. Such an op
should also mark the register as mem_coherent, and to do that it has to
associate the address passed to the st with the corresponding global.
Otherwise the register might be saved twice in case of a register spill
or a call clobber. The second problem is inserting ops into the TCG
stream, which is not difficult in itself, but might be harder to do
efficiently.

That said, I am not sure it will really help, besides providing a way to
save immediates directly to memory.
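
To illustrate the double-save concern, here is a minimal sketch in C. It
is a toy model and not QEMU code: ToyTemp, toy_save_op, toy_spill and
toy_count_stores are hypothetical names, and only the mem_coherent flag
name is taken from the discussion. It assumes a single temp held in a
host register and simply counts the stores that would be emitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not QEMU's data structures: one temp held in a host
 * register, with a flag saying whether its memory slot is up to date. */
typedef struct {
    bool in_reg;        /* value currently lives in a host register */
    bool mem_coherent;  /* memory slot already holds the same value */
    int  stores;        /* number of stores emitted for this temp */
} ToyTemp;

/* Hypothetical explicit "save" op. When the op is associated with the
 * temp (track == true), it marks the temp as coherent with memory. */
static void toy_save_op(ToyTemp *t, bool track)
{
    assert(t->in_reg);
    t->stores++;
    if (track) {
        t->mem_coherent = true;
    }
}

/* Spill or call clobber: the register is lost, so store the value first
 * unless the memory copy is already known to be current. */
static void toy_spill(ToyTemp *t)
{
    assert(t->in_reg);
    if (!t->mem_coherent) {
        t->stores++;
        t->mem_coherent = true;
    }
    t->in_reg = false;
}

/* Returns the number of stores emitted for a save followed by a spill. */
static int toy_count_stores(bool track)
{
    ToyTemp t = { .in_reg = true, .mem_coherent = false, .stores = 0 };
    toy_save_op(&t, track);
    toy_spill(&t);
    return t.stores;
}
```

In this model, toy_count_stores(true) emits a single store, while
toy_count_stores(false) emits two, because the spill cannot tell that
the earlier save already wrote the value to memory.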

> > If it's about inserting them in the TCG stream, as it is done at the
> > last step, i.e. after copy propagation and dead code elimination, it's not
> > really useful anymore.
> > 
> >> - more copy propagation and dead code elimination.  Something like this:
> >>
> >>         mov_i64 cc_dst,rax
> >>
> >> right now is compiled as follows:
> >>
> >> 0x5555557ac37a:  mov    %rbp,(%r14)          # spill rax
> >> 0x5555557ac381:  mov    (%r14),%rbp          # load rax from memory
> >> 0x5555557ac38f:  mov    %rbp,0x98(%r14)      # spill cc_dst to memory
> > 
> > I am surprised by this kind of code, and I think there's a bug somewhere
> > in TCG. With the current TCG code, given rax is not dead, it should be
> > spilled only after the move of cc_dst to memory, and thus the second line
> > is not supposed to be emitted. With this patch series applied the second
> > line should simply be removed.
> 
> Note that the above was without your series.

Do you still have the corresponding log, or at least the TCG code and
the corresponding assembly code for this TB?

> >> - constant propagation using constraints.  This would let tcg-i386 make
> >> effective use of the mov $imm,(addr) instruction for spills of
> >> known-constant values.
> > 
> > This is indeed quite frustrating, even more so when the same
> > immediate value is loaded multiple times. One way to do that would
> > be to provide an optional tcg_out_st_immediate().
> 
> Yes, that would be simple.
> 

At some point we should still try to avoid loading the same constant
multiple times. If an op is not able to take an immediate, this immediate
is going to be loaded into a register. I think it is better to reuse this
register instead of passing the same immediate to the op.
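
As a rough sketch of the register reuse idea, in C: this is a toy model
and not QEMU code. ToyConstCache, toy_get_const_reg and TOY_NUM_REGS are
hypothetical names, and the round-robin choice of a register for a new
constant is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_REGS 4

/* Toy model: remember which constant each host register holds. */
typedef struct {
    bool    holds_const[TOY_NUM_REGS];
    int64_t value[TOY_NUM_REGS];
    int     loads;  /* number of "load immediate" instructions emitted */
    int     next;   /* round-robin register for the next new constant */
} ToyConstCache;

/* Return a register holding imm, loading it only if no register
 * already contains the same constant. */
static int toy_get_const_reg(ToyConstCache *c, int64_t imm)
{
    for (int r = 0; r < TOY_NUM_REGS; r++) {
        if (c->holds_const[r] && c->value[r] == imm) {
            return r;   /* reuse the register, no new load emitted */
        }
    }
    int r = c->next;
    assert(r >= 0 && r < TOY_NUM_REGS);
    c->next = (c->next + 1) % TOY_NUM_REGS;
    c->holds_const[r] = true;
    c->value[r] = imm;
    c->loads++;
    return r;
}
```

Starting from a zero-initialized ToyConstCache, asking twice for the
same constant returns the same register and counts a single load. A real
allocator would also have to invalidate an entry whenever its register
is overwritten.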

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
address@hidden                 http://www.aurel32.net