Re: [Qemu-devel] Faster, generic IO/DMA model with vectored AIO?
From: Paul Brook
Subject: Re: [Qemu-devel] Faster, generic IO/DMA model with vectored AIO?
Date: Sun, 28 Oct 2007 02:29:09 +0100
User-agent: KMail/1.9.7
> I changed Slirp output to use vectored IO to avoid the slowdown from
> memcpy (see the patch for the work in progress; it gives a small
> performance improvement). But then I got the idea that using AIO would
> be nice at the outgoing end of the network IO processing. In fact, a
> vectored AIO model could even be used for the generic DMA! The benefit
> is that no buffering or copying should be needed.
An interesting idea; however, I don't want to underestimate the difficulty of
implementing it correctly. I suspect that to get real benefits you need to
support zero-copy async operation all the way through. Things get really
hairy if you allow some operations to complete synchronously and some to be
deferred.
I've done async operation for SCSI and USB. The latter is really not pretty,
and the former has some notable warts. A generic IO/DMA framework needs to
cover these requirements without making things worse. Hopefully it'll also
help fix the things that are wrong with them.
> For the specific Sparc32 case, unfortunately Lance bus byte swapping
> makes buffering necessary at that stage, unless we can make N vectors
> with just a single byte each faster than memcpy + bswap of a memory
> block of size N.
We really want to be dealing with largeish blocks. A {ptr, size} vector
element is 64 or 128 bits, so the overhead on blocks < 64 bytes is going to be
really brutal. Also, the time taken to do address translation will be
O(number of vector elements).
> Inside Qemu the vectors would use target physical addresses (struct
> qemu_iovec), but at some point the addresses would change to host
> pointers suitable for real AIO.
Phrases like "at some point" worry me :-)
I think it would be good to get a top-down description of what each different
entity (initiating device, host endpoint, bus translation, memory) is
responsible for, and how they all fit together.
I have some ideas, but without more detailed investigation I can't tell if
they will actually work in practice, or if they fit into the code fragments
you've posted. My suspicion is they don't, as I can't make head or tail of
how your gdma_aiov.diff patch would be used in practice.
Paul