qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] fix bdrv_read/write_em and qemu_aio_flush


From: Andrea Arcangeli
Subject: [Qemu-devel] fix bdrv_read/write_em and qemu_aio_flush
Date: Thu, 28 May 2009 18:33:10 +0200

Hello,

the debug code in my ide_dma_cancel patch (not yet included upstream)
made us notice that when qemu_aio_flush returns, there are still
pending aio commands that waits to complete. Auditing the code I found
strange stuff like the fact qemu_aio_waits does nothing if there's an
unrelated (no aio related) bh executed. And I think I found the reason
of why there was still pending aio when qemu_aio_flush because
qemu_aio_wait does a lot more than wait, it can start aio, and if the
previous ->io_flush returned zero, the loop ends and ->io_flush isn't
repeated. The fact an unrelated bh can make qemu_aio_wait a noop seems
troublesome for all callers that aren't calling qemu_aio_wait in a
loop like qemu_aio_flush, so I preferred to change those callers to a
safer qemu_aio_flush in case the bh executed generates more pending
I/O. What you think about this patch against qemu git?

------------
Subject: fix bdrv_read/write_em and qemu_aio_flush
From: Andrea Arcangeli <address@hidden>

qemu_aio_wait() was used to start aio through qemu_bh_poll(), like in case of
qcow2 reads from holes. The bh is global. I can't see how it can be possibly
safe to make qemu_aio_wait a noop, if any unrelated bh was pending and had to
be executed. 

In addition qemu_aio_wait by invoking the bh, could execute new aio callbacks
that would issue more aio commands during their completion handlers, breaking
the invariant that qemu_aio_poll returns only when all ->io_flush methods
returns 0 (which lead to a failure in my fix to ide_dma_cancel that has debug
code to catch that, code that isn't yet in qemu upstream).

        ->io_flush returns 0
        qemu_aio_wait()
        qemu_aio_bh() -> qcow_aio_read_bh -> ide_read_dma_cb -> 
ide_dma_submit_check
        break the loop despite ->io_flush wouldn't return 0

I also changed most aio_wait callers to call aio_flush instead, this will
ensure that even a read from hole submitted with bdrv_aio_readv will be
complete by the time aio_flush returns.

Signed-off-by: Andrea Arcangeli <address@hidden>
---

diff --git a/aio.c b/aio.c
index 11fbb6c..b304852 100644
--- a/aio.c
+++ b/aio.c
@@ -103,6 +103,12 @@ void qemu_aio_flush(void)
     do {
         ret = 0;
 
+       /*
+        * If there are pending emulated aio start them so
+        * flush will be able to return 1.
+        */
+       qemu_bh_poll();
+
         LIST_FOREACH(node, &aio_handlers, node) {
             ret |= node->io_flush(node->opaque);
         }
@@ -115,9 +121,6 @@ void qemu_aio_wait(void)
 {
     int ret;
 
-    if (qemu_bh_poll())
-        return;
-
     do {
         AioHandler *node;
         fd_set rdfds, wrfds;
diff --git a/block.c b/block.c
index c80d4b9..78088a9 100644
--- a/block.c
+++ b/block.c
@@ -1487,7 +1487,7 @@ static int bdrv_read_em(BlockDriverState *bs, int64_t 
sector_num,
         return -1;
 
     while (async_ret == NOT_DONE) {
-        qemu_aio_wait();
+        qemu_aio_flush();
     }
 
     return async_ret;
@@ -1510,7 +1510,7 @@ static int bdrv_write_em(BlockDriverState *bs, int64_t 
sector_num,
     if (acb == NULL)
         return -1;
     while (async_ret == NOT_DONE) {
-        qemu_aio_wait();
+        qemu_aio_flush();
     }
     return async_ret;
 }
diff --git a/qemu-io.c b/qemu-io.c
index f0a17b9..0d52074 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -153,7 +153,7 @@ static int do_aio_readv(QEMUIOVector *qiov, int64_t offset, 
int *total)
                return -EIO;
 
        while (async_ret == NOT_DONE)
-               qemu_aio_wait();
+               qemu_aio_flush();
 
        *total = qiov->size;
        return async_ret < 0 ? async_ret : 1;
@@ -170,7 +170,7 @@ static int do_aio_writev(QEMUIOVector *qiov, int64_t 
offset, int *total)
                return -EIO;
 
        while (async_ret == NOT_DONE)
-               qemu_aio_wait();
+               qemu_aio_flush();
 
        *total = qiov->size;
        return async_ret < 0 ? async_ret : 1;




reply via email to

[Prev in Thread] Current Thread [Next in Thread]