qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 02/13] tcg/s390x: Remove TCG_REG_TB


From: Ilya Leoshkevich
Subject: Re: [PATCH v3 02/13] tcg/s390x: Remove TCG_REG_TB
Date: Thu, 08 Dec 2022 15:04:46 +0100
User-agent: Evolution 3.46.1 (3.46.1-1.fc37)

On Tue, 2022-12-06 at 18:42 -0600, Richard Henderson wrote:
> On 12/6/22 16:22, Richard Henderson wrote:
> > > Wouldn't it be worth keeping XILF/XIFH here?
> > 
> > I don't know.  It's difficult for me to guess whether a dependency
> > chain like
> > 
> >      val -> xor -> xor
> > 
> > (3 insns with serial dependencies) is better than
> > 
> >      val   --> xor
> >      load  -/
> > 
> > (3 insns, but only one serial dependency) is better.  But there may
> > also be instruction 
> > fusion going on at the micro-architectural level, so that there's
> > really only one xor.
> > 
> > If you have suggestions, I'm all ears.
> 
> Related microarchitectural question:
> 
> If a 32-bit insn and a 64-bit insn have a similar size encoding (and
> perhaps even if they 
> don't), is it better to produce a 64-bit output so that the hw
> doesn't have a false 
> dependency on the upper 32-bits of the register?
> 
> Just wondering whether most of the distinction between 32-bit and 64-
> bit opcodes ought to 
> be discarded, simplifying code generation.  The only items that seem
> most likely to have 
> real execution time differences are multiply and divide.

I think this will definitely lead to false dependencies if one produces
32 bits in one place, and then consumes 64 in the other. But if this
idea is applied consistently, then there should be hopefully not so
many such instances?

Two thing come to mind here: memory and CC generation.

The first is probably not very important: we can implement 32-bit loads
with lgf, which sign-extends to 64 bits.

CC generation can be tricker: for conditional jumps it's still going to
be okay, since the code would produce 64-bit values and consume 32-bit
ones, but if the back-end needs a CC from a 32-bit addition, then we
would probably need to sign-extend the result in order to get rid of a
false dependency later on. However, based on a quick inspection I could
not find any such instances.

So using 64-bit operations instead of 32-bit ones would be an
interesting experiment.

Best regards,
Ilya



reply via email to

[Prev in Thread] Current Thread [Next in Thread]