Re: [Qemu-devel] Re: [PATCH 2 of 5] add can_dma/post_dma for direct IO


From: Avi Kivity
Subject: Re: [Qemu-devel] Re: [PATCH 2 of 5] add can_dma/post_dma for direct IO
Date: Sun, 14 Dec 2008 08:03:33 +0200
User-agent: Thunderbird 2.0.0.18 (X11/20081119)

Anthony Liguori wrote:

There are N users of this code, all of which would need to cope with the
failure.  Or there could be one user (dma.c) which handles the failure
and the bouncing.

N should be small in the long term. It should only be needed in places that interact directly with CPU memory: the PCI bus, the ISA bus, some specialty devices, and possibly virtio (although you could argue that virtio should go through the PCI bus).

Fine, then let's rename it pci-dma.c.


map() has to be able to fail, and that has nothing to do with bouncing or not bouncing. In the case of Xen, you can have a guest with 8GB of memory while you only have 2GB of virtual address space. If you try to DMA to more than 2GB of memory, there will be a failure. Whoever accesses memory directly in this fashion needs to cope with that.

The code already allows for failure by partitioning the DMA into segments. Currently this happens only on bounce buffer overflow; when the Xen code is integrated, it can be expanded to accommodate this case as well.

(There's a case for partitioning 2GB DMAs even without Xen, just to reduce the size of iovec allocations.)


dma.c _is_ a map/unmap API, except that it doesn't expose the mapped data,
which allows it to control scheduling and also makes it easier to use.

As I understand dma.c, it does the following: map() as much as possible, call an actor on the mapped memory, repeat until done, then signal completion.
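
A minimal sketch of that loop, using placeholder map_guest()/unmap_guest() names and callback types rather than the actual qemu interface:

    #include <stddef.h>
    #include <stdint.h>

    /* Placeholder prototypes standing in for whatever map/unmap pair the
     * bus provides; these are not the real qemu functions. */
    void *map_guest(uint64_t addr, size_t *len, int is_write);
    void unmap_guest(void *host, size_t len, int is_write, size_t access_len);

    typedef size_t (*dma_actor)(void *host, size_t len, void *opaque);
    typedef void (*dma_done)(void *opaque, int ret);

    /* Map as much of [addr, addr + len) as possible, run the actor on the
     * mapped chunk, repeat until everything is transferred, then signal
     * completion. */
    static void dma_run(uint64_t addr, size_t len, int is_write,
                        dma_actor actor, dma_done done, void *opaque)
    {
        while (len) {
            size_t chunk = len;
            void *host = map_guest(addr, &chunk, is_write);
            size_t xfer;

            if (!host || !chunk) {
                done(opaque, -1);        /* nothing could be mapped */
                return;
            }
            xfer = actor(host, chunk, opaque);
            unmap_guest(host, chunk, is_write, xfer);
            if (!xfer) {
                done(opaque, -1);        /* actor made no progress */
                return;
            }
            addr += xfer;
            len  -= xfer;
        }
        done(opaque, 0);
    }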

As an abstraction, it may be useful. I would argue that it should be a bit more generic, though. It should take function pointers for map and unmap too; then you wouldn't need a separate version of it for each different type of API.

I don't follow. What possible map/unmap pairs would it call, other than cpu_physical_memory_(map/unmap)()?


Right, but who would it notify?

We need some place that can deal with this, and it isn't
_map()/_unmap(), and it isn't ide.c or scsi.c.

The pattern of try to map(), do IO, unmap(), repeat only really works for block IO. It doesn't really work for network traffic: you have to map the entire packet and send it all at once, and you cannot accept a partial mapping result. The IO pattern for sending a packet is much simpler: try to map it; if the mapping fails, either wait until more space frees up or drop the packet. The same is true for the other uses of direct memory access, like kernel loading.
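
That all-or-nothing pattern might look roughly like this, reusing the placeholder map_guest()/unmap_guest() from the sketch above (send_to_backend() is likewise made up):

    /* Transmit one packet occupying a single guest-physical range.
     * A partial mapping is useless for a packet, so the caller either
     * retries later or drops it. */
    int send_to_backend(const void *buf, size_t len);    /* placeholder */

    static int net_tx_packet(uint64_t addr, size_t len)
    {
        size_t mapped = len;
        void *host = map_guest(addr, &mapped, 0 /* reading guest memory */);

        if (!host || mapped < len) {
            if (host) {
                unmap_guest(host, mapped, 0, 0);   /* give back partial map */
            }
            return -1;        /* caller decides: requeue or drop the packet */
        }

        send_to_backend(host, len);
        unmap_guest(host, len, 0, len);
        return 0;
    }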

If so, the API should be extended to support more I/O patterns.


What this is describing is not a DMA API. It's a very specific IO pattern. I think that's part of what's causing confusion in this series. It's certainly not at all related to PCI DMA.

It deals with converting scatter/gather lists to iovecs, bouncing when that is not possible, and managing the bounce buffers. If this is not DMA, I'm not sure what is. It certainly isn't part of block device emulation, and it isn't part of the block layer (since bouncing is common to non-block devices). What is it?
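
Roughly what that conversion amounts to, again with made-up names (SGEntry and the bounce handling are illustrative, and the data copies between guest memory and the bounce slots are omitted):

    #include <sys/uio.h>

    typedef struct {
        uint64_t addr;        /* guest-physical address */
        size_t   len;
    } SGEntry;

    /* Convert a scatter/gather list into an iovec of host pointers,
     * falling back to a caller-supplied bounce buffer for entries that
     * cannot be mapped directly.  Returns the number of entries filled;
     * if the bounce buffer runs out, the caller submits a partial
     * transfer and calls again for the remainder. */
    static int sg_to_iovec(const SGEntry *sg, int sg_count, struct iovec *iov,
                           int is_write, void *bounce, size_t bounce_size)
    {
        size_t bounce_used = 0;
        int i;

        for (i = 0; i < sg_count; i++) {
            size_t mapped = sg[i].len;
            void *host = map_guest(sg[i].addr, &mapped, is_write);

            if (host && mapped == sg[i].len) {
                iov[i].iov_base = host;                  /* direct mapping */
                iov[i].iov_len  = mapped;
                continue;
            }
            if (host) {
                unmap_guest(host, mapped, is_write, 0);  /* partial: give back */
            }
            if (bounce_used + sg[i].len > bounce_size) {
                return i;                                /* bounce buffer full */
            }
            iov[i].iov_base = (char *)bounce + bounce_used;   /* bounced entry */
            iov[i].iov_len  = sg[i].len;
            bounce_used    += sg[i].len;
        }
        return sg_count;
    }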


I would argue that you really want to add a block driver interface that takes the necessary information and implements this pattern, but that's not important. Reducing code duplication is a good thing, so however it ends up working out is fine.

Right now the qemu block layer is totally independent of device emulation, and I think that's a good thing.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




