From: Anthony Liguori
Subject: Re: [Qemu-devel] [PATCH][RFC] Linux AIO support when using O_DIRECT
Date: Mon, 23 Mar 2009 13:10:30 -0500
User-agent: Thunderbird 2.0.0.21 (X11/20090320)
Christoph Hellwig wrote:
On Mon, Mar 23, 2009 at 12:14:58PM -0500, Anthony Liguori wrote:
I'd like to see the O_DIRECT bounce buffering removed in favor of the DMA API bouncing. Once that happens, raw_read and raw_pread can disappear. block-raw-posix becomes much simpler.
See my vectored I/O patches for doing the bounce buffering at the optimal place for the aio path. Note that from my reading of the qcow/qcow2 code they might send down unaligned requests, which is something the dma api would not help with.
I was going to look today at applying those.
For the buffered I/O path we will always have to do some sort of buffering due to all the partition header reading / etc. And given how that part isn't performance critical my preference would be to keep doing it in bdrv_pread/write and guarantee the lowlevel drivers proper alignment.
I really dislike having so many APIs. I'd rather have an aio API that took byte accesses, or have pread/pwrite always be emulated with a full sector read/write.
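To make the second option concrete, here is a minimal sketch of a byte-granular pread emulated with full sector reads. The name emulated_pread and the 512-byte SECTOR_SIZE are assumptions for illustration, not existing QEMU code, and a real O_DIRECT path would also need a memory-aligned bounce buffer (for example from posix_memalign), which this sketch does not handle.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_SIZE 512 /* assumed sector size, for illustration only */

/* Hypothetical helper: serve a read at an arbitrary byte offset while
 * only ever issuing whole, sector-aligned reads to the file. */
static ssize_t emulated_pread(int fd, void *buf, size_t count, off_t offset)
{
    uint8_t sector[SECTOR_SIZE]; /* bounce buffer (not memory-aligned) */
    uint8_t *out = buf;
    size_t done = 0;

    assert(offset >= 0);

    while (done < count) {
        off_t pos = offset + (off_t)done;
        off_t sector_start = pos - (pos % SECTOR_SIZE);
        size_t skip = (size_t)(pos - sector_start); /* bytes before pos */
        size_t chunk = SECTOR_SIZE - skip;          /* bytes usable here */
        ssize_t n;

        if (chunk > count - done)
            chunk = count - done;

        /* Always read one full sector from an aligned offset. */
        n = pread(fd, sector, SECTOR_SIZE, sector_start);
        if (n < 0)
            return -1;
        if ((size_t)n <= skip)
            break; /* end of file before the requested bytes */
        if ((size_t)n - skip < chunk)
            chunk = (size_t)n - skip; /* partial last sector */

        memcpy(out + done, sector + skip, chunk);
        done += chunk;

        if ((size_t)n < SECTOR_SIZE)
            break; /* short read means end of file */
    }
    return (ssize_t)done;
}
```

A call such as emulated_pread(fd, buf, 100, 500) would touch sectors 0 and 1 and copy out only the 100 requested bytes, so the low-level driver never sees an unaligned offset.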
We would drop the signaling stuff and have the thread pool use an fd to signal. The big problem with that right now is that it'll cause a performance regression for certain platforms until we have the IO thread in place.
Talking about signaling, does anyone remember why the Linux signalfd/eventfd support is only in kvm but not in upstream qemu?
Because upstream QEMU doesn't yet have an IO thread.
TCG chains together TBs and if you have a tight loop in a VCPU, then the only way to break out of the loop is to receive a signal. The signal handler will call cpu_interrupt() which will unchain TBs allowing TCG execution to break once you return from the signal handler.
An IO thread solves this in a different way by letting select() always run in parallel to TCG VCPU execution. When select() returns you can send a signal to the TCG VCPU thread to break it out of chained TBs.
Not all IO in qemu generates a signal, so this is a potential problem, but in practice, if we don't generate a signal for disk IO completion, a number of real-world guests break (mostly non-x86 boards).
Regards, Anthony Liguori