From: Jeremy Fitzhardinge
Subject: Re: [Xen-devel] Re: [Qemu-devel] [PATCH] qemu and qemu-xen: support empty write barriers in xen_disk
Date: Wed, 24 Nov 2010 10:18:40 -0800
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.12) Gecko/20101027 Fedora/3.1.6-1.fc13 Lightning/1.0b3pre Thunderbird/3.1.6

On 11/24/2010 08:58 AM, Christoph Hellwig wrote:
> I had the discussion with Jeremy in Boston before, but let's repeat it
> here:
>
>  - is there actually any pre-existing Xen backend that does properly
>    implement empty barriers?  Back then we couldn't find any.
>  - if this is a new concept to Xen please do not define an empty
>    barrier primitive, but a new flush cache primitive.  That one
>    maps natively to the qemu I/O layer, and with recent Linux, NetBSD,
>    Windows, or Solaris guests will be a lot faster than a barrier
>    which drains the queue.
>
> Note that what your patch implements actually is a rather inefficient
> implementation of the latter.  You do none of the queue draining which
> the in-kernel blkback implementation does by submitting the old-style
> barrier bio.  While most filesystems do not care, you introduce a quite
> subtle chance of data corruption for reiserfs, or ext4 with
> asynchronous journal commits on pre-2.6.37 kernels.

Yeah, I agree.  I think semantically empty WRITE_BARRIERs are supposed
to work, as evidenced by the effort blkback makes in trying to
specifically support them.  I haven't traced through to find out why it
EIOs them regardless - it seems to be coming from deeper in the block
subsystem (is there a "no payload" flag or something missing?).

But in the future, I think defining an unordered FLUSH operator like
Linux wants is a useful thing to do and implement (especially since it
amounts to standardising the ?BSD extension).  I'm not sure of their
precise semantics (esp WRT ordering), but I think it's already OK.
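
To make the barrier-versus-flush distinction concrete, here is a toy C
model.  It is only a sketch: none of it is Xen, blkback or qemu code, and
every name in it (toy_disk, toy_submit, toy_flush, toy_barrier) is made
up for illustration.  It assumes a write is either in flight, completed,
or durable, and it ignores the ordering a real barrier also imposes on
requests submitted after it.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_WRITES 16

/* Lifecycle of one write in this toy model (hypothetical, not a Xen type). */
enum toy_state { TOY_INFLIGHT, TOY_COMPLETED, TOY_DURABLE };

struct toy_disk {
    enum toy_state writes[TOY_MAX_WRITES];
    size_t nwrites;
};

/* Queue a write.  'completed' says whether the guest has already seen it
 * complete.  Returns the slot index, or -1 if the toy queue is full. */
static int toy_submit(struct toy_disk *d, bool completed)
{
    if (d->nwrites == TOY_MAX_WRITES)
        return -1;
    d->writes[d->nwrites] = completed ? TOY_COMPLETED : TOY_INFLIGHT;
    return (int)d->nwrites++;
}

/* Unordered flush: make already-completed writes durable and leave
 * in-flight writes alone.  Returns the number of writes made durable. */
static size_t toy_flush(struct toy_disk *d)
{
    size_t flushed = 0;
    for (size_t i = 0; i < d->nwrites; i++) {
        if (d->writes[i] == TOY_COMPLETED) {
            d->writes[i] = TOY_DURABLE;
            flushed++;
        }
    }
    return flushed;
}

/* Draining barrier: first wait for every in-flight write (modelled here
 * by marking it completed), then flush.  Returns the number of writes
 * that had to be drained. */
static size_t toy_barrier(struct toy_disk *d)
{
    size_t drained = 0;
    for (size_t i = 0; i < d->nwrites; i++) {
        if (d->writes[i] == TOY_INFLIGHT) {
            d->writes[i] = TOY_COMPLETED;
            drained++;
        }
    }
    toy_flush(d);
    return drained;
}
```

In this model the flush never waits on an in-flight request, which is
why it should be cheaper for guests that do their own ordering; the
barrier pays for the drain every time, even when it carries no payload.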

(BTW, in case it wasn't clear, we're seriously considering - but not yet
committed to - using qemu as the primary PV block backend for Xen
instead of submitting the existing blkback code for upstream.  We still
need to do some proper testing and measuring to make sure it stacks up
OK, and work out how it would fit together with the rest of the
management stack.  But so far it looks promising.)

    J


