[Qemu-devel] Re: [PATCH 0/4] Improve -icount, fix it with iothread


From: Jan Kiszka
Subject: [Qemu-devel] Re: [PATCH 0/4] Improve -icount, fix it with iothread
Date: Wed, 23 Feb 2011 13:45:38 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2011-02-23 13:40, Edgar E. Iglesias wrote:
> On Wed, Feb 23, 2011 at 12:39:52PM +0100, Jan Kiszka wrote:
>> On 2011-02-23 12:08, Edgar E. Iglesias wrote:
>>> On Wed, Feb 23, 2011 at 11:25:54AM +0100, Paolo Bonzini wrote:
>>>> On 02/23/2011 11:18 AM, Edgar E. Iglesias wrote:
>>>>> Sorry, I don't know the code well enough to give any sensible feedback
>>>>> on patch 2 - 4. I did test them with some of my guests and things seem
>>>>> to be OK with them but quite a bit slower.
>>>>> I saw around 10 - 20% slowdown with a cris guest and -icount 10.
>>>>>
>>>>> The slowdown might be related to the issue with super slow icount
>>>>> together with iothread (addressed by Marcelo's iothread timeout patch).
>>>>
>>>> No, this supersedes Marcelo's patch.  10-20% doesn't seem comparable to 
>>>> "looks like it deadlocked" anyway.  Also, Jan has ideas on how to remove 
>>>> the synchronization overhead in the main loop for TCG+iothread.
>>>
>>> I see. I tried booting two of my MIPS and CRIS linux guests with iothread
>>> and -icount 4. Without your patch, the boot crawls super slow. Your patch
>>> gives a huge improvement. This was the "deadlock" scenario which I
>>> mentioned in previous emails.
>>>
>>> Just to clarify the previous test where I saw slowdown with your patch:
>>> A CRIS setup that has a CRIS and basically only two peripherals,
>>> a timer block and a device (X) that computes stuff but delays the results
>>> with a virtual timer. The guest CPU is 99% of the time just
>>> busy-waiting for device X to get ready.
>>>
>>> This latter test runs in 3.7s with icount 4 and without iothread,
>>> with or without your patch.
>>>
>>> With icount 4 and iothread it runs in ~1m5s without your patch and
>>> ~1m20s with your patch. That was the 20% slowdown I mentioned earlier.
>>>
>>> Don't know if that info helps...
>>
>> You should try to trace the event flow in qemu, either via strace, via
>> the built-in tracer (which likely requires a bit more tracepoints), or
>> via a system-level tracer (ftrace / kernelshark).
> 
> Thanks, I'll see if I can get some time to run this more carefully during
> some weekend.
> 
>>
>> Did my patches contribute a bit to overhead reduction? They specifically
>> target the costly vcpu/iothread switches in TCG mode (caused by TCG's
>> excessive lock-holding times).
> 
> Do you have a tree for quick access to your patches? (couldn't find them
> in my inbox).

http://thread.gmane.org/gmane.comp.emulators.qemu/93765
(looks like I failed to CC you)

and they are also part of

git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream

> 
> I could give them a quick go and post results.
> 
> Cheers

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


