qemu-devel

Re: [Qemu-devel] Ensuring data is written to disk


From: Jamie Lokier
Subject: Re: [Qemu-devel] Ensuring data is written to disk
Date: Tue, 1 Aug 2006 15:17:05 +0100
User-agent: Mutt/1.4.1i

Jens Axboe wrote:
> On Tue, Aug 01 2006, Jamie Lokier wrote:
> > > Of course, guessing the disk drive write buffer size and trying not to
> > > kill system I/O performance with all these writes is another question
> > > entirely
> > > ... sigh !!!
> > 
> > If you just want to evict all data from the drive's cache, and don't
> > actually have other data to write, there is a CACHEFLUSH command you
> > can send to the drive which will be more dependable than writing as
> > much data as the cache size.
> 
> Exactly, and this is what the OS fsync() should do once the drive has
> acknowledged that the data has been written (to cache). At least
> reiserfs w/barriers on Linux does this.

1. Are you sure this happens, w/ reiserfs on Linux, even if the disk
   is a SATA or SCSI type that supports ordered tagged commands?  My
   understanding is that barriers force an ordering between write
   commands, and that CACHEFLUSH is used only with disks that don't have
   more sophisticated write ordering commands.  Is the data still
   committed to the disk platter before fsync() returns on those?

2. Do you know if ext3 (in ordered mode) w/barriers on Linux does it too,
   for in-place writes which don't modify the inode and therefore don't
   have a journal entry?

On Darwin, fsync() does not issue CACHEFLUSH to the drive.  Instead,
there is an fcntl, F_FULLFSYNC, which does that.  It is documented in
Darwin's fsync() page as working with all of Darwin's filesystems,
provided the hardware honours CACHEFLUSH or the equivalent.
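
To make the Darwin case concrete, here is a minimal sketch in C of how an
application might ask for a full flush where the fcntl exists and fall
back to plain fsync() elsewhere.  The helper name flush_to_media() is
hypothetical, and F_FULLFSYNC is the spelling used in Darwin's <fcntl.h>.
The fallback to fsync() when the fcntl fails is my assumption about
reasonable behaviour, not something the documentation requires.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to push the file's data through the drive's
 * write cache.  Returns 0 on success, -1 on error with errno set.
 */
int flush_to_media(int fd)
{
#ifdef F_FULLFSYNC
    /* Darwin: ask the drive to flush its write cache as well. */
    if (fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
    /* The fcntl can fail on filesystems or devices that do not
       support it; fall through to an ordinary fsync(). */
#endif
    /* Elsewhere: whether this reaches the platter depends on the
       filesystem, kernel, mount options and drive settings. */
    return fsync(fd);
}
```

A caller would use flush_to_media(fd) wherever it currently calls
fsync(fd).  On systems without F_FULLFSYNC this is no stronger than
fsync() itself.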

From what little documentation I've found, on Linux it appears to be
much less predictable.  It seems that some filesystems, with some
kernel versions, and some mount options, on some types of disk, with
some drive settings, will commit data to a platter before fsync()
returns, and others won't.  And an application calling fsync() has no
easy way to find out.  Have I got this wrong?

PS (an aside): do you happen to know of a good patch which
implements IDE barriers w/ ext3 on 2.4 kernels?  I found a patch by
googling, but it seemed that the ext3 parts might not be finished, so
I don't trust it.  I've found turning off the IDE write cache makes
writes safe, but with a huge performance cost.

Thanks,
-- Jamie



