[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC P
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"" |
Date: |
Wed, 10 Oct 2012 09:11:39 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120911 Thunderbird/15.0.1 |
Il 09/10/2012 20:26, Anthony Liguori ha scritto:
> Paolo Bonzini <address@hidden> writes:
>
>> Il 09/10/2012 17:37, Anthony Liguori ha scritto:
>>>>>>> In the very short term, I can imagine an aio fastpath that was only
>>>>>>> implemented in terms of the device API. We could have a slow path that
>>>>>>> acquired the BQL.
>>>>>
>>>>> Not sure I follow.
>>>
>>> As long as the ioeventfd thread can acquire qemu_mutex in order to call
>>> bdrv_* functions. The new device-only API could do this under the
>>> covers for everything but the linux-aio fast path initially.
>>
>> Ok, so it's about the locking. I'm not even sure we need locking if we
>> have cooperative multitasking. For example if bdrv_aio_readv/writev
>> is called from a VCPU thread, it can just schedule a bottom half for
>> itself in the appropriate AioContext. Similarly for block jobs.
>
> Okay, let's separate out the two issues here though. One is whether we
> need a device specific block API. The second is whether we should short
> cut to a fast path in the short term and go after a fully unlocked bdrv_
> layer in the long(shortish?) term.
>
> So let's talk about your proposal...
>
>> The only part where I'm not sure how it would work is bdrv_read/write,
>> because of the strange "qemu_aio_wait() calls select with a lock taken".
>> Maybe we can just forbid synchronous I/O if you set a non-default
>> AioContext.
>
> Not sure how practical that is. The is an awful lot of sync I/O still left.
Hmm, yeah, perhaps we need to bite the bullet and use a recursive lock.
The lock would be taken by:
- sync I/O ops
- monitor commands that currently call bdrv_drain_all
- aio_poll when calling bottom halves or handlers
The rest of the proposal however would stand (especially with reference
to block jobs).
I think we can proceed incrementally. The first obvious step is to
s/qemu_bh_new/aio_bh_new/ in the whole block layer (including the
CoQueue stuff), which would also help fixing the qemu-char bug that Jan
reported.
>> This would be entirely hidden in the block layer. For example the
>> following does it for bdrv_aio_readv/writev:
>>
>> diff --git a/block.c b/block.c
>> index e95f613..7165e82 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -3712,15 +3712,6 @@ static AIOPool bdrv_em_co_aio_pool = {
>> .cancel = bdrv_aio_co_cancel_em,
>> };
>>
>> -static void bdrv_co_em_bh(void *opaque)
>> -{
>> - BlockDriverAIOCBCoroutine *acb = opaque;
>> -
>> - acb->common.cb(acb->common.opaque, acb->req.error);
>> - qemu_bh_delete(acb->bh);
>> - qemu_aio_release(acb);
>> -}
>> -
>> /* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
>> static void coroutine_fn bdrv_co_do_rw(void *opaque)
>> {
>> @@ -3735,8 +3726,17 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
>> acb->req.nb_sectors, acb->req.qiov, 0);
>> }
>>
>> - acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
>> - qemu_bh_schedule(acb->bh);
>> + acb->common.cb(acb->common.opaque, acb->req.error);
>> + qemu_aio_release(acb);
>> +}
>> +
>> +static void bdrv_co_em_bh(void *opaque)
>> +{
>> + BlockDriverAIOCBCoroutine *acb = opaque;
>> +
>> + qemu_bh_delete(acb->bh);
>> + co = qemu_coroutine_create(bdrv_co_do_rw);
>> + qemu_coroutine_enter(co, acb);
>> }
>>
>> static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
>> @@ -3756,8 +3756,8 @@ static BlockDriverAIOCB
>> *bdrv_co_aio_rw_vector(BlockDriverState *bs,
>> acb->req.qiov = qiov;
>> acb->is_write = is_write;
>>
>> - co = qemu_coroutine_create(bdrv_co_do_rw);
>> - qemu_coroutine_enter(co, acb);
>> + acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
>> + qemu_bh_schedule(acb->bh);
>>
>> return &acb->common;
>> }
>>
>>
>> Then we can add a bdrv_aio_readv/writev_unlocked API to the protocols, which
>> would run outside the bottom half and provide the desired fast path.
>
> This works for some of the block layer I think. How does this interact
> with thread pools for AIO?
>
> But this wouldn't work well with things like NBD or curl, right? What's
> the plan there?
NBD uses coroutines; curl can use the non-unlocked
bdrv_aio_readv/writev. In both cases they would execute in the
dataplane thread. qcow2-over-raw would also execute its read/write code
entirely from the dataplane thread, for example.
Paolo
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", (continued)
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Stefan Hajnoczi, 2012/10/09
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Anthony Liguori, 2012/10/09
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Paolo Bonzini, 2012/10/09
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Anthony Liguori, 2012/10/09
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Paolo Bonzini, 2012/10/09
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Anthony Liguori, 2012/10/09
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"",
Paolo Bonzini <=
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Anthony Liguori, 2012/10/10
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Paolo Bonzini, 2012/10/10
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Anthony Liguori, 2012/10/10
- Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts"", Kevin Wolf, 2012/10/11