From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 2/4] tcg/aarch64: implement new TCG target for aarch64
Date: Fri, 24 May 2013 10:02:09 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6

On 05/24/2013 01:53 AM, Claudio Fontana wrote:
>> No real need to special case zero; it's just an extra test slowing down the
>> compiler.
> 
> Yes, we need to handle the special case zero.
> Otherwise no instruction at all would be emitted for value 0.

Hmm, true.  Although I'd been thinking more along the lines of
arranging the code such that we'd use movz to set the zero.
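As a rough sketch of one such arrangement (not code from the patch; the
helper name emit_movi, the macro names and the plain output buffer are
illustrative assumptions): always emit MOVZ for the low halfword, then MOVK
for any nonzero upper halfwords. Zero then needs no separate test, since it
falls out as a single "movz rd, #0".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INSN_MOVZ 0x52800000u  /* MOVZ Wd, #imm16 (hw = 0) */
#define INSN_MOVK 0x72800000u  /* MOVK Wd, #imm16 (hw = 0) */
#define INSN_SF   0x80000000u  /* sf bit: selects the 64-bit (X) form */

/*
 * Hypothetical helper: writes the instructions that load `value` into
 * register `rd` to `buf` (room for 4 entries) and returns how many
 * instructions were written.
 */
static size_t emit_movi(uint32_t *buf, int ext, unsigned rd, uint64_t value)
{
    uint32_t sf = ext ? INSN_SF : 0;
    int nhalf = ext ? 4 : 2;   /* number of 16-bit halfwords to consider */
    size_t n = 0;

    assert(rd < 32);

    /* Always start with MOVZ of the low halfword; this covers value == 0. */
    buf[n++] = sf | INSN_MOVZ | ((uint32_t)(value & 0xffff) << 5) | rd;

    /* Patch in the remaining nonzero halfwords with MOVK. */
    for (int i = 1; i < nhalf; i++) {
        uint32_t half = (uint32_t)(value >> (16 * i)) & 0xffff;
        if (half != 0) {
            buf[n++] = sf | INSN_MOVK | ((uint32_t)i << 21) | (half << 5) | rd;
        }
    }
    return n;
}
```

The trade-off is that a value such as 0x10000 costs two instructions here
instead of one, so a real implementation would probably want something
smarter about which halfword gets the MOVZ.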

> I actually don't know whether to prefer ext=0 or ext=1,
> in the sense that it would be useful to know whether using the extended
> registers with a small constant is performance-wise preferable to using
> the 32bit operation, and relying on 0-extension. See also the rotation
> comment below.

From the ARMv8 ISA overview:

# Rationale: [...] By maintaining this semantic information in the instruction
# set, implementations can exploit this information to avoid expending energy
# or cycles to compute, forward and store the unused upper 32 bits of such
# data types. Implementations are free to exploit this freedom in whatever way
# they choose to save energy.

>> addr_reg almost certainly needs to be zero-extended for 32-bit guests, easily
>> done by setting ext = 0 here.
> 
> I can easily put an #ifdef just to be sure.

No ifdef needed; a plain TARGET_LONG_BITS == 64 comparison works.
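A minimal sketch of the idea, modelling the effect in plain C rather than
emitting code. The function guest_addr and the fallback definition of
TARGET_LONG_BITS are assumptions made so the example stands alone; in QEMU
the macro comes from the target configuration. Because the comparison is a
compile-time constant, the compiler folds away the untaken branch, so
nothing is lost relative to an #ifdef.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch only: default to a 32-bit guest. */
#ifndef TARGET_LONG_BITS
#define TARGET_LONG_BITS 32
#endif

/*
 * Hypothetical model of the address register handling: with ext == 0 the
 * 32-bit operation zero-extends, so the upper half of a 32-bit guest
 * address is cleared.
 */
static uint64_t guest_addr(uint64_t addr_reg)
{
    int ext = (TARGET_LONG_BITS == 64);   /* constant, no #ifdef required */

    assert(TARGET_LONG_BITS == 32 || TARGET_LONG_BITS == 64);
    return ext ? addr_reg : (uint64_t)(uint32_t)addr_reg;
}
```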

>> You initialize FP, but you don't reserve the register, so it's going to get
>> clobbered.  We don't actually use the frame pointer in the translated code,
>> so I don't think there's any call to actually initialize it either.
> 
> The FP is not going to be clobbered, not by code here and not by called code.
> 
> It is not going to be clobbered between our use before the jump and after the
> jump, because all the called functions need to preserve FP as mandated by the
> calling conventions.
> 
> It is not going to be clobbered from the point of view of our caller,
> because we save (FP, LR) along with (X19, X20) .. (X27, X28) and restore them
> before returning.

Ah, well, I didn't see it mentioned here,

> +    tcg_regset_clear(s->reserved_regs);
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_SP);
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP);
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_X18); /* platform register */

but hadn't noticed that it's not listed in the reg_alloc_order.
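To illustrate the distinction, here is a small stand-alone model (not TCG's
actual data structures). The type regset_t, the helper names, the register
number chosen for the temporary and the contents of alloc_order are all
assumptions for the example. A register stays out of the allocator's hands
either by being in the reserved set or simply by never appearing in the
allocation order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers; REG_TMP = 8 is an assumption for this sketch. */
enum { REG_TMP = 8, REG_X18 = 18, REG_FP = 29, REG_SP = 31 };

typedef uint32_t regset_t;   /* one bit per register */

static void regset_set_reg(regset_t *set, unsigned reg)
{
    assert(reg < 32);
    *set |= (regset_t)1 << reg;
}

static bool regset_test_reg(regset_t set, unsigned reg)
{
    return (set >> reg) & 1;
}

/* Illustrative allocation order that omits FP; not the patch's real list. */
static const unsigned alloc_order[] = { 20, 21, 22, 23, 9, 10, 11, 0, 1, 2 };

/* A register is allocatable only if it is listed and not reserved. */
static bool reg_allocatable(regset_t reserved, unsigned reg)
{
    if (regset_test_reg(reserved, reg)) {
        return false;
    }
    for (size_t i = 0; i < sizeof(alloc_order) / sizeof(alloc_order[0]); i++) {
        if (alloc_order[i] == reg) {
            return true;
        }
    }
    return false;
}
```

Relying on the omission works, but an explicit reservation makes the intent
visible in one place.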

> We use FP to point to the callee_saved registers, and to move to/from them
> in the tcg_out_store_pair and tcg_out_load_pair functions.

I hadn't noticed you'd hard-coded FP into the load/store_pair functions.
Let's *really* not do that.  Even if we decide to continue using it, let's
pass it in explicitly.

But I don't see that you're really gaining anything in the prologue from
using FP instead of SP.  It seems like a waste of a register to me.
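Something along these lines, as a sketch only: the function name
enc_ldst_pair and its signature are illustrative, not the patch's
tcg_out_store_pair/tcg_out_load_pair. It encodes the 64-bit signed-offset
forms of STP/LDP with the base register as an ordinary parameter, so the
prologue can pass SP (31) or FP (29) as it sees fit.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_STP 0xa9000000u  /* STP Xt, Xt2, [Xn|SP, #imm], signed offset */
#define INSN_LDP 0xa9400000u  /* LDP Xt, Xt2, [Xn|SP, #imm], signed offset */

/*
 * Hypothetical encoder: returns the instruction word for a 64-bit
 * load/store pair relative to an explicitly supplied base register.
 * `offset` is in bytes and must be a multiple of 8 in [-512, 504].
 */
static uint32_t enc_ldst_pair(int is_load, unsigned rt, unsigned rt2,
                              unsigned base, int offset)
{
    uint32_t imm7;

    assert(rt < 32 && rt2 < 32 && base < 32);
    assert(offset % 8 == 0 && offset >= -512 && offset <= 504);

    imm7 = (uint32_t)(offset / 8) & 0x7f;   /* scaled, 7-bit signed field */
    return (is_load ? INSN_LDP : INSN_STP)
           | (imm7 << 15) | (rt2 << 10) | (base << 5) | rt;
}
```

For example, enc_ldst_pair(0, 19, 20, 31, 16) yields the word for
"stp x19, x20, [sp, #16]", and changing the fourth argument to 29 gives the
FP-relative form without touching the encoder.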


r~


