


From: Paolo Bonzini
Subject: Re: [Qemu-devel] Block I/O outside the QEMU global mutex was "Re: [RFC PATCH 00/17] Support for multiple "AIO contexts""
Date: Tue, 09 Oct 2012 18:26:23 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120911 Thunderbird/15.0.1

On 09/10/2012 17:37, Anthony Liguori wrote:
>>> In the very short term, I can imagine an aio fastpath that was only
>>> implemented in terms of the device API.  We could have a slow path that
>>> acquired the BQL.
>>
>> Not sure I follow.
>
> As long as the ioeventfd thread can acquire qemu_mutex in order to call
> bdrv_* functions.  The new device-only API could do this under the
> covers for everything but the linux-aio fast path initially.

Ok, so it's about the locking.  I'm not even sure we need locking if we
have cooperative multitasking.  For example if bdrv_aio_readv/writev
is called from a VCPU thread, it can just schedule a bottom half for
itself in the appropriate AioContext.  Similarly for block jobs.
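
Roughly, the dispatch step could look like this (a sketch only;
aio_bh_new(ctx, cb, opaque) is the per-context constructor from this
series, while vcpu_side_submit() and bdrv_get_aio_context() are made-up
names used purely for illustration):

static void vcpu_side_submit(BlockDriverState *bs,
                             BlockDriverAIOCBCoroutine *acb)
{
    /* Made-up helper: look up the AioContext that owns this BDS. */
    AioContext *ctx = bdrv_get_aio_context(bs);

    /* The BH fires in the AioContext's thread, so the request
     * coroutine is created and entered there, never in the VCPU
     * thread. */
    acb->bh = aio_bh_new(ctx, bdrv_co_em_bh, acb);
    qemu_bh_schedule(acb->bh);
}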

The only part where I'm not sure how it would work is bdrv_read/write,
because of the strange "qemu_aio_wait() calls select() with a lock
taken" behavior.  Maybe we can just forbid synchronous I/O if you set a
non-default AioContext.
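
For reference, this is roughly the shape of the synchronous wrapper in
block.c today (paraphrased and heavily trimmed; the RwCo fields and the
coroutine entry point are abridged):

typedef struct RwCo {
    BlockDriverState *bs;
    int ret;                     /* NOT_DONE until the coroutine ends */
    /* sector_num, nb_sectors, qiov, is_write omitted */
} RwCo;

/* bdrv_rw_co_entry() is the coroutine entry point: it does the actual
 * read or write and stores the result in rwco->ret. */

static int bdrv_rw_co(RwCo *rwco)
{
    if (qemu_in_coroutine()) {
        /* Already in coroutine context, run the request directly. */
        bdrv_rw_co_entry(rwco);
    } else {
        Coroutine *co = qemu_coroutine_create(bdrv_rw_co_entry);
        qemu_coroutine_enter(co, rwco);
        while (rwco->ret == NOT_DONE) {
            qemu_aio_wait();     /* select() with the lock taken */
        }
    }
    return rwco->ret;
}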

This would be entirely hidden in the block layer.  For example, the
following does it for bdrv_aio_readv/writev:

diff --git a/block.c b/block.c
index e95f613..7165e82 100644
--- a/block.c
+++ b/block.c
@@ -3712,15 +3712,6 @@ static AIOPool bdrv_em_co_aio_pool = {
     .cancel             = bdrv_aio_co_cancel_em,
 };
 
-static void bdrv_co_em_bh(void *opaque)
-{
-    BlockDriverAIOCBCoroutine *acb = opaque;
-
-    acb->common.cb(acb->common.opaque, acb->req.error);
-    qemu_bh_delete(acb->bh);
-    qemu_aio_release(acb);
-}
-
 /* Invoke bdrv_co_do_readv/bdrv_co_do_writev */
 static void coroutine_fn bdrv_co_do_rw(void *opaque)
 {
@@ -3735,8 +3726,18 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
             acb->req.nb_sectors, acb->req.qiov, 0);
     }
 
-    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
-    qemu_bh_schedule(acb->bh);
+    acb->common.cb(acb->common.opaque, acb->req.error);
+    qemu_aio_release(acb);
+}
+
+static void bdrv_co_em_bh(void *opaque)
+{
+    BlockDriverAIOCBCoroutine *acb = opaque;
+    Coroutine *co;
+
+    qemu_bh_delete(acb->bh);
+    co = qemu_coroutine_create(bdrv_co_do_rw);
+    qemu_coroutine_enter(co, acb);
 }
 
 static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
@@ -3756,8 +3757,8 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
     acb->req.qiov = qiov;
     acb->is_write = is_write;
 
-    co = qemu_coroutine_create(bdrv_co_do_rw);
-    qemu_coroutine_enter(co, acb);
+    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
+    qemu_bh_schedule(acb->bh);
 
     return &acb->common;
 }


Then we can add a bdrv_aio_readv/writev_unlocked API to the protocols, which
would run outside the bottom half and provide the desired fast path.
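
Nothing like that hook exists yet; the exact signature is guesswork,
but it could mirror the existing .bdrv_aio_readv member in BlockDriver:

    /* Called without the global mutex, only from the BDS's
     * AioContext thread. */
    BlockDriverAIOCB *(*bdrv_aio_readv_unlocked)(BlockDriverState *bs,
        int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
        BlockDriverCompletionFunc *cb, void *opaque);

Protocols that implement it would let bdrv_co_aio_rw_vector skip the
bottom half above and submit directly from the calling thread.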

Paolo

> That means that we can convert block devices to use the device-only API
> across the board (provided we make BQL recursive).
> 
> It also means we get at least some of the benefits of data-plane in the
> short term.



