Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
From: Christoph Hellwig
Subject: Re: [Qemu-devel] [PATCH 1/4] block: add enable_write_cache flag
Date: Tue, 1 Sep 2009 01:06:12 +0200
User-agent: Mutt/1.3.28i

On Mon, Aug 31, 2009 at 11:46:45PM +0100, Jamie Lokier wrote:
> > On Mon, Aug 31, 2009 at 11:09:50PM +0100, Jamie Lokier wrote:
> > > Right now, on a Linux host O_SYNC is unsafe with hardware that has a
> > > volatile write cache. That might not be changed, but if it is then
> > > performance with cache=writethrough will plummet (due to issuing a
> > > CACHE FLUSH to the hardware after every write), while performance with
> > > cache=writeback will be reasonable.
> >
> > Currently all modes are more or less unsafe with volatile write caches,
> > at least when using ext3 or raw block device accesses. XFS is safe
> > two thirds due to doing the right thing and one third due to sheer
> > luck.
>
> Right, but now you've made it worse. By not calling fdatasync at all,
> you've reduced the integrity. Previously it would reach the drive's
> cache, and take whatever (short) time it took to reach the platter.
> Now you're leaving data in the host cache which can stay for much
> longer, and is vulnerable to host kernel crashes.
Your last comment applies to cache=writeback, which in Avi's proposal that
I implemented would indeed lose any guarantees and be for all practical
purposes unsafe. It's not true for any of the other options.
> Oh, and QEMU could call whatever "hdparm -F" does when using raw block
> devices ;-)
Actually for ide/scsi implementing cache control is on my todo list.
Not sure about virtio yet.
> Well I'd like to start by pointing out your patch introduces a
> regression in the combination cache=writeback with emulated SCSI,
> because it effectively removes the fdatasync calls in that case :-)
Yes, you already pointed this out above.
> It goes to show no matter how hard we try, data integrity is a
> slippery thing where getting it wrong does not show up under normal
> circumstances, only during catastrophic system failures.
Honestly, it should not. Digging through all this was a bit of work,
but I was extremely surprised how careless most people who touched it
before were. It's not rocket science and can be tested quite easily
using various tools: qemu being the easiest nowadays, but scsi_debug or
an instrumented iscsi target would do the same thing.
> It failed with fsync, which
> is also important to applications, but filesystem integrity is the
> most important thing and it's been good at that for many years.
Users might disagree with that. With my user hat on I couldn't care
less what state the internal metadata is in, as long as I get back the
data which the OS has guaranteed me reached the disk after a
successful fsync/fdatasync/O_SYNC write.
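
To make that guarantee concrete, here is a minimal sketch of what an
application relying on it typically does: write, then treat the data as
durable only once the sync call has returned without error. The helper
name durable_write is hypothetical and not taken from qemu. Whether the
data really is on stable storage at that point still depends on the
host flushing the drive's volatile write cache, which is the problem
under discussion here.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data to path; return only after the sync call succeeded.

    Hypothetical helper for illustration. An OSError from write or the
    sync call propagates, so the caller never assumes durability that
    the OS did not confirm.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(fd, view)
            view = view[written:]
        # fdatasync flushes the file data plus the metadata needed to
        # read it back; fall back to fsync where it is unavailable.
        sync = getattr(os, "fdatasync", os.fsync)
        sync(fd)
    finally:
        os.close(fd)
```

A guest that issues such a write expects the emulated disk to honour
it, regardless of which cache= mode the host was started with.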
> > E.g. if you want to move your old SCO Unix box into a VM it's the
> > only safe option.
>
> I agree, and for that reason, cache=writethrough or cache=none are the
> only reasonable defaults.
Despite the extremely misleading name, cache=none is _NOT_ an alternative,
unless we make it open the image using O_DIRECT|O_SYNC.
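
As a rough illustration of the flag combinations involved, the sketch
below maps each cache= mode to open(2) flags as this thread describes
them. It is not qemu's actual code: the function open_flags and its
none_is_safe parameter are hypothetical, and the mapping for
writethrough (O_SYNC) and writeback (no extra flags) is an assumption
based on the discussion above.

```python
import os

# O_DIRECT is Linux-specific; fall back to 0 where it is not defined.
O_DIRECT = getattr(os, "O_DIRECT", 0)


def open_flags(cache_mode, none_is_safe=False):
    """Return open(2) flags for a cache= mode (illustrative only).

    none_is_safe models the O_DIRECT|O_SYNC variant suggested above.
    """
    flags = os.O_RDWR
    if cache_mode == "writethrough":
        # Every write is synchronous and goes through the host page cache.
        flags |= os.O_SYNC
    elif cache_mode == "none":
        # Bypasses the host page cache, but on its own says nothing
        # about the drive's volatile write cache.
        flags |= O_DIRECT
        if none_is_safe:
            flags |= os.O_SYNC
    elif cache_mode == "writeback":
        # No extra flags: integrity depends on explicit flush requests.
        pass
    else:
        raise ValueError("unknown cache mode: %r" % (cache_mode,))
    return flags
```

With plain cache=none the O_SYNC bit is absent, which is why the name
suggests more safety than the mode delivers.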