From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH v2 07/18] tcg/i386: Implement field extraction opcodes
Date: Tue, 25 Oct 2016 18:48:03 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

On 25/10/2016 18:46, Richard Henderson wrote:
> On 10/25/2016 05:46 AM, Paolo Bonzini wrote:
>>
>>
>> On 18/10/2016 17:10, Richard Henderson wrote:
>>> +    case INDEX_op_extract_i32:
>>> +        /* On the off-chance that we can use the high-byte registers.
>>> +           Otherwise we emit the same ext16 + shift pattern that we
>>> +           would have gotten from the normal tcg-op.c expansion.  */
>>> +        tcg_debug_assert(args[2] == 8 && args[3] == 8);
>>> +        if (args[1] < 4 && args[0] < 8) {
>>> +            tcg_out_modrm(s, OPC_MOVZBL, args[0], args[1] + 4);
>>> +        } else {
>>> +            tcg_out_ext16u(s, args[0], args[1]);
>>> +            tcg_out_shifti(s, SHIFT_SHR, args[0], 8);
>>> +        }
>>
>> Since the opcode is pretty rare, perhaps it's worth restricting the
>> constraints to, respectively, a new constraint for 0xff ("R"?) and "Q"?
>> It should generate slightly better code without constraining the
>> register allocator too much.
>
> I tried that, but since our allocator does nothing to look forward to future
> uses, it will only properly load a value into Q if this is the first use of the
> value within the TB. Otherwise it'll generate an extra move to satisfy the
> constraint.
>
> Given that movzwl can operate on any source, and can copy to another
> destination at the same time, it's wasteful to force the register allocator to
> generate the extra move.
>
> This ext16u+shift form is what we'll generate without the special case here.
> So if you prefer I could drop the %[abcd]h special case entirely.
Nah, as you said there's always a chance of satisfying the constraint
(and of getting a better register allocator).
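
For readers following along, here is a rough sketch in plain C of why the two
code paths in the quoted hunk are interchangeable for the offset 8, length 8
case. It is not the backend code: the function names are made up for
illustration, and the comments only indicate which x86 instruction each step
is meant to stand in for.

```c
#include <assert.h>
#include <stdint.h>

/* Fallback path: zero-extend the low 16 bits, then shift right by 8.
   Stands in for tcg_out_ext16u() followed by SHIFT_SHR by 8. */
uint32_t extract8_8_ext16_shift(uint32_t src)
{
    uint32_t dst = (uint16_t)src;   /* movzwl: any source, any destination */
    return dst >> 8;                /* shrl $8 */
}

/* High-byte path: the value a movzbl from %ah/%bh/%ch/%dh would produce,
   i.e. bits 8..15 of the source, zero-extended. */
uint32_t extract8_8_high_byte(uint32_t src)
{
    return (uint8_t)(src >> 8);
}

/* Hypothetical self-check: both paths agree on a few sample inputs. */
void extract8_8_selfcheck(void)
{
    static const uint32_t samples[] = { 0u, 0xffu, 0xff00u, 0x12345678u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        assert(extract8_8_ext16_shift(samples[i]) == extract8_8_high_byte(samples[i]));
    }
}
```

The high-byte form is a single instruction but needs the source in one of the
four registers that have an %xh alias, which is the constraint being discussed;
the fallback takes any source register at the cost of a second instruction.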
> The one that's particularly valuable is the 32-bit shift as extraction from a
> 64-bit input. That turns out to happen lots for e.g. ppc64abi32 guest.
Sounds good, thanks!
Paolo
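
As a footnote on the 64-bit case mentioned above, a small sketch in plain C of
why extracting the upper 32 bits of a 64-bit value reduces to a single shift.
The helper names are hypothetical and the generic form assumes
1 <= len <= 64 and ofs + len <= 64; it is only meant to show the identity, not
how the backend emits it.

```c
#include <stdint.h>

/* Generic unsigned field extract: shift the field down, then mask to len bits.
   Assumes 1 <= len <= 64 and ofs + len <= 64. */
uint64_t extract64_sketch(uint64_t value, unsigned ofs, unsigned len)
{
    return (value >> ofs) & (~UINT64_C(0) >> (64 - len));
}

/* Special case ofs == 32, len == 32: the field reaches the top bit, so the
   logical shift already clears everything above it and no mask is needed. */
uint64_t high32_by_shift(uint64_t value)
{
    return value >> 32;
}
```

For example, with the input 0x1122334455667788 both helpers give 0x11223344.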