
Re: [Qemu-devel] [PATCH v2 00/13] tcg/sparc v8plus code generation


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v2 00/13] tcg/sparc v8plus code generation
Date: Sat, 18 Jul 2015 08:21:53 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1

On 07/17/2015 02:42 PM, Aurelien Jarno wrote:
> On 2015-07-17 12:23, Aurelien Jarno wrote:
>> On 2015-07-16 22:29, Richard Henderson wrote:
>>> On 07/15/2015 09:54 PM, Aurelien Jarno wrote:
>>>> While I understand why we need the new trunc_shr_i32 opcode for MIPS64
>>>> (the 32-bit values must be kept sign-extended), I currently fail to
>>>> see why it is needed for SPARC.
>>>
>>> As far as I recall, it improves code for extracting high parts of 64-bit
>>> quantities.  Without this, we wind up with a 64-bit shift, requiring a
>>> 64-bit temp register, followed by the "real" truncate which can copy the
>>> data to a 32-bit destination register.
>>
>> Ok, I understand the use case now. So it's not for correctness, but
>> rather to generate more optimized code.
>
> OTOH, it means that we always have to go through a 32-bit register first
> when truncating a 64-bit value.
>
> I mean we gain in the following case:
>    shr_i64 t64, t64, i
>    trunc_i64_i32 t32, t64
>    ...
>
> But we lose in the following case:
>    trunc_i64_i32 t32, t64
>    neg t32, t32
>    ...

Why do you believe we're using an extra temp here? Certainly you can't "neg t32, t64" in any circumstance.

Anyway, this comes up most often when interfacing with the sparcv8plus calling convention, in which 64-bit quantities must be passed in two registers. Before, we'd emit code like

        shrx    %g2, 32, %g1
        mov     %g1, %o0
        mov     %g2, %o1

After, we're able to write the shift output directly to %o0, without going through %g1.
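
For reference, the value semantics of that split can be sketched in C. This is only an illustration, not QEMU code: the names trunc_shr_u32, ArgPair and split_arg64 are made up for this example, and the sketch assumes the combined operation behaves like a 64-bit logical right shift followed by truncation to 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper (not a QEMU function): models a combined
 * "shift right, then truncate to 32 bits" operation.  The result goes
 * straight into a 32-bit destination, so no 64-bit temporary is needed
 * to hold the shifted value.  shift must be less than 64. */
static inline uint32_t trunc_shr_u32(uint64_t value, unsigned shift)
{
    return (uint32_t)(value >> shift);
}

/* Hypothetical type: the two 32-bit halves of a 64-bit argument, as
 * they would be passed in two registers (high half in %o0, low half in
 * %o1 in the example above). */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} ArgPair;

/* Split a 64-bit value into the two halves. */
static inline ArgPair split_arg64(uint64_t value)
{
    ArgPair pair;
    pair.hi = trunc_shr_u32(value, 32); /* plays the role of the shrx */
    pair.lo = (uint32_t)value;          /* plain truncation */
    return pair;
}
```

The point of the sketch is that the high half is produced in a single step, which corresponds to the shift landing directly in the outgoing argument register.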


r~


