qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode


From: Ming Lei
Subject: Re: [Qemu-devel] [PATCH 01/15] qemu coroutine: support bypass mode
Date: Fri, 1 Aug 2014 23:21:58 +0800

On Fri, Aug 1, 2014 at 10:17 PM, Paolo Bonzini <address@hidden> wrote:
> Il 01/08/2014 15:48, Ming Lei ha scritto:
>> On Fri, Aug 1, 2014 at 9:13 PM, Stefan Hajnoczi <address@hidden> wrote:
>>> On Fri, Aug 01, 2014 at 10:54:02AM +0800, Ming Lei wrote:
>>>> On Fri, Aug 1, 2014 at 12:30 AM, Paolo Bonzini <address@hidden> wrote:
>>>>> Il 31/07/2014 18:13, Ming Lei ha scritto:
>>>>>> Follows 'perf report' result on cycles event for with/without bypass
>>>>>> coroutine:
>>>>>>
>>>>>>     http://pastebin.com/ae0vnQ6V
>>>>>>
>>>>>> From the profiling result, looks bdrv_co_do_preadv() is a bit slow
>>>>>> without bypass coroutine.
>>>>>
>>>>> Yeah, I can count at least 3.3% time spent here:
>>>>>
>>>>> 0.87%          bdrv_co_do_preadv
>>>>> 0.79%          bdrv_aligned_preadv
>>>>> 0.71%          qemu_coroutine_switch
>>>>> 0.52%          tracked_request_begin
>>>>> 0.45%          coroutine_swap
>>>>>
>>>>> Another ~3% wasted in malloc, etc.
>>>>
>>>> That should be related with coroutine and the BH in bdrv_co_do_rw().
>>>> In this post I didn't apply Stephan's coroutine resize patch which might
>>>> decrease usage of malloc() for coroutine.
>>>
>>> Please rerun with "[PATCH v3 0/2] coroutine: dynamically scale pool
>>> size".
>>
>> No problem, will do that. Actually in my last post with rfc, this patchset
>> was against your coroutine resize patches.
>>
>> I will provide the profile data tomorrow.
>>
>>>
>>>> At least, coroutine isn't cheap from the profile result.
>>>
>>> Instead of bypassing coroutines we should first understand the overhead
>>> that they impose.  Is it due to the coroutine implementation (switching
>>> stacks) or due to the bdrv_co_*() code that happens to use coroutines
>>> but slow for other reasons.
>>
>> From the 3th patch(block: support to bypass qemu coroutinue)
>> and the 5th patch(dataplane: enable selective bypassing coroutine),
>> the change is to bypass coroutine and BH, and the other bdrv code
>> path is same, so it is due to the coroutine implementation, IMO.
>
> But your code breaks all sort of invariants.  For example, the aiocb
> must be valid when bdrv_aio_readv/writev return.  virtio-blk does not
> use it, but virtio-scsi does.  If we apply your patches now, we will
> have to redo it soon.

It can be supported by scheduling BH in bdrv_co_io_em_complete() too.

>
> Basically we should be rewriting parts of block.c so that
> bdrv_co_readv/writev calls bdrv_aio_readv/writev instead of vice versa.
>  Coroutine creation should be pushed down to the
> bdrv_aligned_preadv/bdrv_aligned_pwritev and, in the fast path, you can
> simply call the driver's bdrv_aio_readv/writev.

This idea is very constructive, and I will investigate further to see
if it is easy to start.


Thanks,



reply via email to

[Prev in Thread] Current Thread [Next in Thread]