Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6


From: Ming Lei
Subject: Re: [Qemu-devel] [regression] dataplane: throughout -40% by commit 580b6b2aa2
Date: Tue, 1 Jul 2014 22:49:22 +0800

On Tue, Jul 1, 2014 at 10:31 PM, Stefan Hajnoczi <address@hidden> wrote:
> On Tue, Jul 1, 2014 at 3:53 PM, Ming Lei <address@hidden> wrote:
>> On Mon, Jun 30, 2014 at 4:08 PM, Stefan Hajnoczi <address@hidden> wrote:
>>>
>>> Try:
>>> $ perf record -e syscalls:* --tid <iothread-tid>
>>> ^C
>>> $ perf script # shows the trace log
>>>
>>> The difference between syscalls in QEMU 2.0 and qemu.git/master could
>>> reveal the problem.
>>
>> The difference is that there are tons of write() and rt_sigprocmask()
>> calls in qemu.git/master; I guess this is related to coroutines.
>>
>> For linux-aio, the coroutine shouldn't be necessary because
>> io_submit() won't block most of the time for O_DIRECT read/write.
>
> You're forgetting about image formats and the other QEMU block layer
> features like I/O throttling.  They do require coroutines.

I mean that from the linux-aio point of view, io_submit() won't block
most of the time, as with your previous implementation of dataplane.

>
> Are you sure it's the extra syscall overhead?  Any ideas for avoiding them?

Yes, I am sure; it is obvious when running perf to trace the
system calls :-)
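
Per-syscall totals can be produced from the `perf script` output
mentioned earlier in the thread. The following is only a minimal sketch:
it assumes each event line contains a tracepoint name of the form
`syscalls:sys_enter_<name>:`, and the helper name count_syscalls is
made up for illustration.

```python
import re
from collections import Counter

# Assumed shape of a `perf script` line for a syscall tracepoint, e.g.
#   qemu-system-x86  4242 [001]  100.000001: syscalls:sys_enter_write: fd: 0x5
# Only entry events are matched, so each call is counted once.
SYS_ENTER = re.compile(r"syscalls:sys_enter_(\w+):")


def count_syscalls(lines):
    """Return a Counter mapping syscall name to number of entry events."""
    counts = Counter()
    for line in lines:
        match = SYS_ENTER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, for illustration only (not real trace data).
sample = [
    "qemu 4242 [001] 1.000001: syscalls:sys_enter_write: fd: 0x5",
    "qemu 4242 [001] 1.000002: syscalls:sys_exit_write: 0x8",
    "qemu 4242 [001] 1.000003: syscalls:sys_enter_rt_sigprocmask: how: 0x2",
]
for name, n in count_syscalls(sample).most_common():
    print(f"{name}(): {n}")
```

On a real trace, the lines produced by `perf script` would be passed to
count_syscalls() instead of the sample list.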

Here is some data from running randread (bs=4k, libaio) in the VM
for 10 seconds:

1), qemu.git/master
- write(): 731K
- rt_sigprocmask(): 417K
- read(): 21K
- ppoll(): 10K
- io_submit(): 5K
- io_getevents(): 4K

2), qemu 2.0
- write(): 9K
- read(): 28K
- ppoll(): 16K
- io_submit(): 12K
- io_getevents(): 10K
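
To put the two runs on the same scale, the counts can be normalized by
the number of io_submit() calls. This is only a rough sketch: the inputs
are the rounded figures above (in thousands), one io_submit() call may
carry more than one request, and the helper name per_submit is made up
for illustration.

```python
# Counts from the two 10-second runs above, in thousands of calls (rounded).
master = {
    "write": 731, "rt_sigprocmask": 417, "read": 21,
    "ppoll": 10, "io_submit": 5, "io_getevents": 4,
}
qemu_2_0 = {
    "write": 9, "read": 28, "ppoll": 16,
    "io_submit": 12, "io_getevents": 10,
}


def per_submit(counts):
    """Divide every syscall count by the number of io_submit() calls."""
    submits = counts["io_submit"]
    return {name: n / submits for name, n in counts.items()}


for label, counts in (("qemu.git/master", master), ("qemu 2.0", qemu_2_0)):
    ratios = per_submit(counts)
    print(label, {name: round(r, 1) for name, r in ratios.items()})
```

With these figures, qemu.git/master issues roughly 146 write() and 83
rt_sigprocmask() calls per io_submit() call, while qemu 2.0 issues fewer
than one write() per io_submit() call.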

> The sigprocmask can probably be optimized away since the thread's
> signal mask remains unchanged most of the time.
>
> I'm not sure what is causing the write().

I am investigating it...


Thanks,
-- 
Ming Lei
