Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4

From:	Blue Swirl
Subject:	Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4
Date:	Sat, 21 Jan 2012 16:55:39 +0000

On Mon, Jan 9, 2012 at 23:04, Aurelien Jarno <address@hidden> wrote:
> On Sun, Jan 08, 2012 at 12:27:39PM +0000, Blue Swirl wrote:
>> On Sun, Jan 8, 2012 at 00:52, Aurelien Jarno <address@hidden> wrote:
>> > On Sat, Jan 07, 2012 at 10:24:09PM +0000, Blue Swirl wrote:
>> >> In this version, I made basic AREG0 free load/store implementations
>> >> for all targets. Only x86-64 is tested, others have probably problems,
>> >> especially 64 bit guest (Sparc64 in this case) on 32 bit hosts.
>> >>
>> >> I think this should be committed as a starting point if there are no
>> >> major objections.
>> >>
>> >
>> > If it is known to be broken on 32-bit hosts, I am not sure it's really
>> > a good idea to commit these patches. It's just going to prevent all
>> > people developing on x86 to continue testing or developing QEMU.
>>
>> No, the brokenness is limited to 32 bit hosts with 64 bit AREG0 free
>> targets, which currently means Sparc64 on i386, Sparc64 on ARM etc.
>> Sparc32 should run unless there are other problems. Other targets
>> which are not AREG0 free (like x86) should be unchanged on any host.
>>
>> I'm not familiar with various host ABIs, fixing all of the targets is too 
>> much.
>
> Yes, but I guess your goal is to have more qemu targets than only
> sparc, so it's going to become important at some point. I think it
> should not be committed until we have working x86 and x86-64 host
> support.

OK. The problem with 64 bit guest on i386 host is that arguments need
to be pushed to stack since the registers will not be enough. This
happens even before my patches in some cases, but the code that
handles this is not so clean.

> BTW, right now I have just got a quick overview of the patch series, I
> have mostly tested them. I am planning to review them more in details in
> the next days.
>
>
>> > Also I have to say I still don't get the goal of these patch series. I
>> > have just tried it on an x86-64 guest with a 32-bit sparc guest. It
>> > works well, but the generated TB are slightly bigger. I have done a
>> > quick and dirty performance test by compiling some code inside the
>> > guests, and I get a slow down of about 0.6%, though for this kind of
>> > performance changes, it should probably done with other methods.
>>
>> Thank you for testing. I also made some tests earlier which were near
>> that figure but there performance increased. But my test setup adds a
>> lot of noise (CPU throttling etc.) so I didn't trust the figures much.
>>
>> The slow down could come from changed compiler flags. IIRC looking at
>> disassembled functions, stack protector kicks in.
>
> Given all the translation blocks are bigger (there is one more mov
> instruction per memory load/store and also for some helper), I don't
> think it really work it at compiler flags or things like that. The code
> to execute is longer, which also has side effects, like not playing very
> well wrt instructions cache, or even qemu TB cache.
>
> That said 0.8% is not that big, we can probably accept that, if we have
> a good reason for that. And here come the next point...
>
>
>> > Is there another goal behind this patch series beside performance?
>>
>> Yes, we eliminate the global register variable, which has given us a
>> lot of trouble. Those are not supported by for example LLVM or sparse.
>> The semantics of calling code which has not been compiled with
>> register variables from code which has been compiled so is not
>> defined, but it happens to work most of the time by saving the
>> variable by hand. On Sparc hosts, global register handling is a mess.
>
> Even if I personally don't really have this use case, I understand it's
> something that might be important for others.
>
> My main concern there is how to reach this goal for all hosts and all
> targets in a reasonable time frame. I remember the difficult moments
> with both dyngen and TCG, with both softfloat-native and softfloat or
> even now with qdev and non-qdev devices.
>
> That's why I think we should start with at least x86 and x86-64 on the
> host side. On the target side, we should have at least two targets
> converted, and if possible with different word size and different
> endianness. That way maintainers of the TCG host code can tests all the
> possible cases before committing it (experience has shown that it's
> difficult to have all cases right at the first time). MIPS can be a good
> candidate, as it can do both 32 and 64 bits and both big and little
> endian.
>
> Ideally at least all hosts should be converted before the next release,
> so we don't ship QEMU with support for half of the hosts only.
>
> Do you have any comments or plans about that? I guess it's actually the
> critical part in the changes you would like to see.

I think the MIPS part should work except for 64 on 32, though it is by
no means optimal. If one day all QEMU targets are AREG0 free, the TCG
target maintainers probably want to optimize. It should be easy to
avoid most if not all extra register moves in the TLB slow path.

>> After the change has been made, all QEMU code is just plain C. With
>> TCI it should even be portable to any host.
>
> Isn't that already the case with TCI? As far as I know TCI doesn't use
> host registers, so it should already be fine (beside the doc says it's
> not really tested in cross-endianness mode).

I haven't tested. There could be also GCC specific hacks elsewhere
which are not supported by for example LLVM.

>> When AREG0 is only used internally to TCG generated code, we could
>> even put the CPUState structure to stack and use stack pointer to
>> access them, freeing a global variable there too.
>
> That might be a good idea on architectures with very few registers like
> i386, not sure it'll have a big impact on others.
>
> --
> Aurelien Jarno                          GPG: 1024D/F1BCDB73
> address@hidden                 http://www.aurel32.net

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [PATCH 0/6] AREG0 patches v4, Blue Swirl, 2012/01/07
- Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4, Aurelien Jarno, 2012/01/07
  - Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4, Blue Swirl, 2012/01/08
    - Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4, Aurelien Jarno, 2012/01/10
    - Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4, Blue Swirl <=

Prev by Date: Re: [Qemu-devel] [PATCH] Init win32 CRITICAL_SECTION before starting thread; crash when attaching disks
Next by Date: Re: [Qemu-devel] [PATCH] sparc-linux-user: Fix missing symbols in .rel/.rela.plt sections
Previous by thread: Re: [Qemu-devel] [PATCH 0/6] AREG0 patches v4
Next by thread: [Qemu-devel] [PATCH 1/6] TCG/i386: use iargs or oargs instead of register names
Index(es):
- Date
- Thread