
Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API


From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH 1/5] Add target memory mapping API
Date: Tue, 20 Jan 2009 19:23:14 +0200
User-agent: Thunderbird 2.0.0.19 (X11/20090105)

Ian Jackson wrote:
> I think the key points in Avi's message are these:
>
> Avi Kivity writes:
>> You don't know afterwards either. Maybe read() is specced as you
>> say, but practical implementations will return the minimum bytes
>> read, not exact.
>
> And this:
>
>> I really doubt that any guest will be affected by this. It's a
>> tradeoff between decent performance and needlessly accurate
>> emulation. I don't see how we can choose the latter.
>
> I don't think this is the right way to analyse this situation.  We are
> trying to define a general-purpose DMA API for _all_ emulated devices,
> not just the IDE emulation and block devices that you seem to be
> considering.

No. There already exists a general API: cpu_physical_memory_rw(). We are trying to define an API which will allow the high-throughput devices (IDE, scsi, virtio-blk, virtio-net) to be implemented efficiently.

If device X does not work well with the API, then, unless it's important for some reason, it shouldn't use it. If it is important, we can adapt the API then.
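
To make the tradeoff concrete, here is a rough sketch of the two paths a
device emulation can take.  Only cpu_physical_memory_rw() is the existing
interface; the map/unmap calls are written to match what this series
proposes but may not match the final signatures exactly, and the
ide_transfer_* helpers are made up for illustration:

/* Illustration only: the qemu-internal types and prototypes are assumed
 * to be in scope; error handling is mostly omitted. */
#include <stdint.h>
#include <unistd.h>

/* Existing general-purpose path: bounce everything through a local buffer. */
static void ide_transfer_slow(target_phys_addr_t guest_addr, int len, int fd)
{
    uint8_t buf[4096];

    while (len > 0) {
        int chunk = len < (int)sizeof(buf) ? len : (int)sizeof(buf);

        if (read(fd, buf, chunk) != chunk) {                /* host I/O */
            break;
        }
        cpu_physical_memory_rw(guest_addr, buf, chunk, 1);  /* copy into guest RAM */
        guest_addr += chunk;
        len -= chunk;
    }
}

/* Proposed fast path: map guest memory and let the host kernel write it
 * directly, falling back to the slow path when mapping is impossible. */
static void ide_transfer_fast(target_phys_addr_t guest_addr,
                              target_phys_addr_t len, int fd)
{
    target_phys_addr_t mapped_len = len;
    void *p = cpu_physical_memory_map(guest_addr, &mapped_len, 1 /* is_write */);
    ssize_t done;

    if (!p) {
        /* not mappable (e.g. MMIO, or the bounce buffer is busy) */
        ide_transfer_slow(guest_addr, len, fd);
        return;
    }
    done = read(fd, p, mapped_len);     /* data lands in guest RAM directly */
    cpu_physical_memory_unmap(p, mapped_len, 1, done > 0 ? done : 0);
}

The contentious part is what the last argument to unmap (the bytes actually
transferred) can honestly promise when read() comes back short.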

> If there is ever any hardware which behaves `properly' with partial
> DMA, and any host kernel device which can tell us what succeeded and
> what failed, then it is necessary for the DMA API we are now inventing
> to allow that device to be properly emulated.
>
> Even if we can't come up with an example of such a device right now,
> I would suggest that it's very likely that we will encounter one
> eventually.  But actually I can think of one straight away: a SCSI
> tapestreamer.  Tapestreamers often give partial transfers at the end
> of tapefiles; hosts (ie, qemu guests) talking to the SCSI controller
> do not expect the controller to DMA beyond the successful SCSI
> transfer length; and the (qemu host's) kernel's read() call will not
> overwrite beyond the successful transfer length either.

That will work out fine as the DMA will be to kernel memory, and read() will copy just the interesting parts.

> If it is difficult for a block device to provide the faithful
> behaviour then it might be acceptable for the block device to always
> indicate to the DMA API that the entire transfer had taken place, even
> though actually some of it had failed.
>
> But personally I think you're mistaken about the behaviour of the
> (qemu host's) kernel's {aio_,p,}read(2).

I'm pretty sure reads to software RAIDs will be submitted in parallel. If those reads are O_DIRECT, then it's impossible to maintain DMA ordering.
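
As a standalone illustration (nothing below is qemu code, and /dev/md0 just
stands in for some software RAID device): two O_DIRECT reads submitted in
parallel into the same buffer show why a short or failed first request tells
you nothing about what the second one has already written:

#define _GNU_SOURCE                 /* for O_DIRECT */
#include <aio.h>                    /* POSIX AIO; link with -lrt on glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK 65536                 /* multiple of the block size, as O_DIRECT needs */

int main(void)
{
    int fd = open("/dev/md0", O_RDONLY | O_DIRECT);
    void *guest;                    /* stands in for mapped guest memory */
    struct aiocb a, b;
    const struct aiocb *list[2] = { &a, &b };

    if (fd < 0 || posix_memalign(&guest, 4096, 2 * CHUNK) != 0) {
        perror("setup");
        return 1;
    }
    memset(&a, 0, sizeof(a));
    memset(&b, 0, sizeof(b));
    a.aio_fildes = fd; a.aio_buf = guest;
    a.aio_nbytes = CHUNK; a.aio_offset = 0;
    b.aio_fildes = fd; b.aio_buf = (char *)guest + CHUNK;
    b.aio_nbytes = CHUNK; b.aio_offset = CHUNK;

    aio_read(&a);
    aio_read(&b);                   /* both requests are now in flight at once */

    while (aio_error(&a) == EINPROGRESS || aio_error(&b) == EINPROGRESS) {
        aio_suspend(list, 2, NULL);
    }

    /* Even if the first request comes back short or failed, the second may
     * already have written guest+CHUNK..guest+2*CHUNK, so "nothing past the
     * successful transfer length was touched" cannot be promised. */
    printf("first: %zd, second: %zd\n", aio_return(&a), aio_return(&b));

    close(fd);
    free(guest);
    return 0;
}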

> In the initial implementation in Xen, we will almost certainly simply
> emulate everything with cpu_physical_memory_rw.  So it will happen all
> the time.

Try it out. I'm sure it will work just fine (if incredibly slowly, unless you provide multiple bounce buffers).
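
To spell out the bounce-buffer remark, here is a simplified sketch (not the
actual code in the patch) of a map() that falls back to one static bounce
buffer; guest_ram_ptr() is a hypothetical helper standing in for "this range
is ordinary RAM, return a direct pointer":

#include <stdint.h>

/* qemu-internal types and cpu_physical_memory_rw() assumed to be in scope */
extern void *guest_ram_ptr(target_phys_addr_t addr, target_phys_addr_t len);

static struct {
    uint8_t buf[4096];
    target_phys_addr_t addr;
    int in_use;
} bounce;

void *sketch_map(target_phys_addr_t addr, target_phys_addr_t *plen, int is_write)
{
    void *direct = guest_ram_ptr(addr, *plen);

    if (direct) {
        return direct;                  /* fast path: no copying at all */
    }
    if (bounce.in_use) {
        return NULL;                    /* only one bounced request at a time */
    }
    bounce.in_use = 1;
    bounce.addr = addr;
    if (*plen > sizeof(bounce.buf)) {
        *plen = sizeof(bounce.buf);     /* shorten the mapping to what fits */
    }
    if (!is_write) {
        /* the device will read guest memory: pre-fill the buffer */
        cpu_physical_memory_rw(addr, bounce.buf, *plen, 0);
    }
    return bounce.buf;
}

void sketch_unmap(void *buffer, target_phys_addr_t len, int is_write,
                  target_phys_addr_t access_len)
{
    if (buffer == bounce.buf) {
        if (is_write) {
            /* the device wrote into the buffer: copy what it touched back */
            cpu_physical_memory_rw(bounce.addr, bounce.buf, access_len, 1);
        }
        bounce.in_use = 0;
    }
    (void)len;
}

With only one such buffer, every request that cannot be mapped directly has
to wait for the previous one to be unmapped, which is where the "incredibly
slowly" comes from; more bounce buffers relax that.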

> It will certainly work except when (a) there are partial (interrupted)
> transfers and (b) the host relies on the partial DMA not overwriting
> more data than it successfully transferred.  So what that means is
> that if this introduces bugs they will be very difficult to find in
> testing.  I don't think testing is the answer here.

The only workaround I can think of is not to DMA. But that will be horribly slow.

--
error compiling committee.c: too many arguments to function




