From: Blue Swirl
Subject: Re: [Qemu-devel] Faster, generic IO/DMA model with vectored AIO?
Date: Mon, 29 Oct 2007 21:33:06 +0200

On 10/28/07, Jamie Lokier <address@hidden> wrote:
> Blue Swirl wrote:
> > Currently scsi-disk provides a buffer. For true zero copy, this needs
> > to be changed so that instead the buffer is provided by the caller at
> > each stage until we reach the host memory. But I'll use the scsi-disk
> > buffer for now.
>
> This might actually work in Qemu.
>
> But in general, a zero-copy I/O interface needs to allow for the
> possibility that either the source of data, or the sink, might need to
> be in charge of buffer allocations for a particular sequence.
> Otherwise you get situations where the data has to be copied to meet a
> technical constraint of a source or a sink, and the copy could have
> been avoided if the addresses were allocated to meet that constraint
> in the first place.  The most common technical constraint is probably
> the need for large contiguous blocks.
>
> I deal with this in my own program by having an I/O call from source
> to sink for requesting memory (through a chain of sources/sinks like
> your example if necessary), but only when the source is preparing to
> do an I/O and hasn't yet prepared the data.  If the data is already
> prepared before setting up the I/O for a write, then there's no point
> asking the sink to allocate memory, and if it has to anyway (e.g. if
> it needs a large contiguous block), that's an unavoidable copy anyway.
>
> A couple of examples of sinks with constraints are:
>
>    - Can't use writev().  E.g. you're using a slightly old Linux
>      kernel, want to do AIO, and it doesn't have async writev(), only async
>      write().
>
>    - Writing to sound card through memory-mapped ring buffer.  The
>      sink is the code which opens /dev/dsp, and then it can provide
>      buffers for zero-copy only if it picks the address where data
>      will be prepared.
>
>    - Async I/O using "database writer" style separate processes which
>      actually do the writes synchronously, and the data is passed to
>      them using shared memory.  For this, the sink is the code which
>      sends a request to one of the writer processes, and it must use a
>      buffer which is in the mapped shared memory.

I think this also shows that the system may become quite complex. Some
kind of hooks may be needed before and after the transfer.
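
A rough sketch of what such an interface could look like (the names are
hypothetical, not existing qemu code): the source asks the sink for a
buffer before preparing the data, and hooks run around the actual
transfer:

    #include <stddef.h>

    typedef struct IOSink IOSink;

    struct IOSink {
        /* Ask the sink for a buffer of 'len' bytes; a sink with an
         * allocation constraint (contiguous memory, shared memory, a
         * mapped ring) hands out its own memory here, otherwise it can
         * fall back to an ordinary allocation. */
        void *(*get_buffer)(IOSink *sink, size_t len);

        /* Hooks before and after the transfer, e.g. for IOMMU map/unmap
         * or cache maintenance. */
        void (*pre_transfer)(IOSink *sink, void *buf, size_t len);
        void (*post_transfer)(IOSink *sink, void *buf, size_t len);

        /* Submit 'len' bytes starting at 'buf'. */
        int (*submit)(IOSink *sink, void *buf, size_t len);

        void *opaque;   /* sink-private state, e.g. a mapped ring buffer */
    };

    /* Source side: get a buffer from the sink (since it may insist on
     * choosing addresses), generate the data directly into it, transfer. */
    static int source_do_io(IOSink *sink, size_t len,
                            void (*prepare)(void *buf, size_t len))
    {
        void *buf = sink->get_buffer(sink, len);
        int ret;

        prepare(buf, len);               /* data is prepared in place */
        if (sink->pre_transfer) {
            sink->pre_transfer(sink, buf, len);
        }
        ret = sink->submit(sink, buf, len);
        if (sink->post_transfer) {
            sink->post_transfer(sink, buf, len);
        }
        return ret;
    }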

We could cache the resolved addresses to overcome the additional setup
overhead. Each stage should install cache invalidation callbacks, or
provide a method to call to recalculate the addresses. For example,
IOMMU or ESPDMA mappings change very often.
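
Something like this (again only a sketch, the names don't exist
anywhere): a stage keeps the resolved host address, and the IOMMU or
ESPDMA emulation calls the invalidation callback whenever its mapping
changes, so the address is recomputed lazily on the next access:

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct AddrCache {
        uint64_t  guest_addr;   /* address as the device sees it */
        void     *host_ptr;     /* resolved host pointer, if valid */
        bool      valid;
        void     *(*resolve)(uint64_t guest_addr, void *opaque);
        void     *opaque;       /* e.g. the IOMMU or ESPDMA state */
    } AddrCache;

    /* Fast path in the transfer code: cheap while the cache is valid. */
    static void *addr_cache_get(AddrCache *c)
    {
        if (!c->valid) {
            c->host_ptr = c->resolve(c->guest_addr, c->opaque);
            c->valid = true;
        }
        return c->host_ptr;
    }

    /* Installed as the invalidation callback on the IOMMU/ESPDMA side,
     * called whenever the mapping changes. */
    static void addr_cache_invalidate(AddrCache *c)
    {
        c->valid = false;
    }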

An IO-vector-based API seems to be hard to use, so a simple list should
be better. The vectors may not be compatible with the host anyway.
I'll make a new version.
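
For comparison, roughly what the two shapes look like (hypothetical
types): a host-style struct iovec array as taken by writev(), versus a
simple linked list of buffer pieces that each stage can walk or split,
converted to an iovec only where the host actually supports vectored
I/O:

    #include <sys/uio.h>    /* struct iovec, as used by writev() */
    #include <stddef.h>

    typedef struct IOBuf {
        void         *base;
        size_t        len;
        struct IOBuf *next;   /* next piece of the transfer, or NULL */
    } IOBuf;

    /* Convert the list to an iovec array only at the point where the
     * host supports vectored I/O; otherwise each piece can simply be
     * submitted with a plain read()/write(). */
    static int iobuf_to_iov(const IOBuf *b, struct iovec *iov, int max)
    {
        int n = 0;

        for (; b != NULL && n < max; b = b->next, n++) {
            iov[n].iov_base = b->base;
            iov[n].iov_len  = b->len;
        }
        return n;   /* number of iovec entries filled */
    }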

It's good to get some feedback. Designing a high-performance IO
framework suitable for all use cases seems to be very challenging.



