[Qemu-devel] Re: Notes on block I/O data integrity
From: Christoph Hellwig
Subject: [Qemu-devel] Re: Notes on block I/O data integrity
Date: Thu, 27 Aug 2009 15:42:39 +0200
User-agent: Mutt/1.3.28i
On Thu, Aug 27, 2009 at 08:21:55PM +0930, Rusty Russell wrote:
> > - virtio-blk needs to advertise ordered queue by default.
> > This makes cache=writethrough safe on virtio.
>
> From a guest POV, that's "we don't know, let's say we're ordered because that
> may make us safer". Of course, it may not help: how much does it cost to
> drain the queue?
>
> The bug, IMHO, is that we *should* know. And in future I'd like to fix that,
> either by adding a VIRTIO_BLK_F_ORDERED feature, or a VIRTIO_BLK_F_UNORDERED
> feature.
>
> > Action plan for QEMU:
> >
> > - IDE needs to set the write cache enabled bit
> > - virtio needs to implement a cache flush command and advertise it
> > (also needs a small change to the host driver)
>
> So, virtio-blk needs to be enhanced for this as well.
Really, enabling volatile write caches without advertising a cache flush
command is a bug in the storage, where in our case qemu is the storage.
So I don't really see the need for two feature bits. Here's my plan for
virtio-blk:
 - add a new VIRTIO_BLK_F_WCACHE feature. If this feature is set we
    (a) implement the prepare_flush queue operation to send a
        standalone cache flush
    (b) set a proper barrier ordering flag on the queue
   Now I'm not entirely sure which queue ordering feature we will
   use. It is not going to be QUEUE_ORDERED_TAG as for
   VIRTIO_BLK_F_BARRIER, as that leaves all the queue draining to
   the host. For anything that uses something resembling POSIX I/O
   as a backend and has more than one outstanding command at a time,
   that just means duplicating all the queue management we already
   do in the guest for no gain.
   The easiest one would be QUEUE_ORDERED_DRAIN_FLUSH, in which
   case the cache flush command really is everything we need.
   As a slight optimization of it we could make it
   QUEUE_ORDERED_DRAIN_FUA, which still does all the queue draining
   in the guest, but only sends one explicit cache flush before the
   barrier and then sets the FUA bit on the actual barrier
   request. In qemu we would still implement this as fdatasync
   before and after the request, but we would save one protocol
   roundtrip.
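To make the host side of the DRAIN_FUA idea concrete, here is a rough sketch of handling a barrier write that carries the FUA bit as "fdatasync, write, fdatasync" on a plain file descriptor. The function names (barrier_write_fua, write_all) are made up for illustration and are not qemu code; it also assumes the guest has already drained its queue, so nothing else is in flight on this fd.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer at the given offset,
 * retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return -1;          /* no progress, give up */
        p += n;
        off += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical host-side handling of a barrier write with FUA set:
 * flush everything written earlier, do the write, then flush again so
 * the barrier request itself is stable before completion is reported. */
int barrier_write_fua(int fd, const void *buf, size_t len, off_t off)
{
    assert(buf != NULL || len == 0);

    if (fdatasync(fd) < 0)      /* pre-flush: earlier writes reach disk */
        return -1;
    if (write_all(fd, buf, len, off) < 0)
        return -1;
    return fdatasync(fd);       /* post-flush: emulates the FUA bit */
}
```

Compared with DRAIN_FLUSH the work on the host is the same two fdatasync calls; the saving is only that the guest sends two requests (flush, then the FUA write) instead of three.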
Now the big question is when do we set the VIRTIO_BLK_F_WCACHE feature.
The proper thing to do would be to set it for cache=writeback and
cache=none, because they do need the fdatasync, and not for
cache=writethrough because it does not require it.
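As a sketch of that rule, the mapping from cache mode to the proposed feature bit could look roughly like this. The enum, the function name and the bit position are all placeholders invented for the example; the bit number in particular is not taken from any spec or from this mail.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three cache= settings discussed here. */
enum cache_mode { CACHE_WRITETHROUGH, CACHE_WRITEBACK, CACHE_NONE };

/* Placeholder bit position for the proposed VIRTIO_BLK_F_WCACHE. */
#define SKETCH_BLK_F_WCACHE (1u << 9)

/* Return the feature bits to advertise for a given cache mode:
 * writeback and none need an explicit fdatasync, so the guest must be
 * told there is a volatile cache; writethrough does not. */
uint32_t wcache_feature_for(enum cache_mode mode)
{
    switch (mode) {
    case CACHE_WRITEBACK:
    case CACHE_NONE:
        return SKETCH_BLK_F_WCACHE;
    case CACHE_WRITETHROUGH:
        return 0;
    }
    assert(!"unknown cache mode");
    return 0;
}
```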
Now Avi is a big advocate of the view that cache=writethrough should
mean "go fast and loose and don't care about data integrity". There's a
certain point to that, as I don't really see a good use case for that
mode, but I really hate to make something unsafe that doesn't explicitly
say so in the option name.
The complex (not to say over-engineered) version would be to split
the caching and data integrity settings into two options:
 (1) hostcache=on|off
       use buffered vs O_DIRECT I/O
 (2) integrity=osync|fsync|none
       use O_SYNC, use f(data)sync, or do not care about data integrity
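For illustration only, the two options could translate into open(2) flags plus a "sync on guest flush" switch roughly as below. Neither option exists in qemu; struct io_policy, io_policy_for and the enum are hypothetical names, and O_RDWR as the base flag is just an assumption for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* O_DIRECT on Linux */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* not available here: treat as a no-op */
#endif

/* Hypothetical values of the proposed integrity= option. */
enum integrity { INTEGRITY_OSYNC, INTEGRITY_FSYNC, INTEGRITY_NONE };

/* Hypothetical result: how to open the image, and whether a guest
 * cache flush has to be turned into f(data)sync on the host. */
struct io_policy {
    int  open_flags;
    bool sync_on_flush;
};

struct io_policy io_policy_for(bool hostcache, enum integrity integ)
{
    struct io_policy p = { O_RDWR, false };

    if (!hostcache)
        p.open_flags |= O_DIRECT;   /* hostcache=off: bypass page cache */

    switch (integ) {
    case INTEGRITY_OSYNC:
        p.open_flags |= O_SYNC;     /* every write is stable on return */
        break;
    case INTEGRITY_FSYNC:
        p.sync_on_flush = true;     /* stable only after explicit flush */
        break;
    case INTEGRITY_NONE:
        break;                      /* fast and loose */
    default:
        assert(!"unknown integrity mode");
    }
    return p;
}
```

In these terms today's cache=writethrough would be hostcache=on with integrity=osync, and cache=none with proper flushing would be hostcache=off with integrity=fsync.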