qemu-devel
Re: [Qemu-devel] [RFC] Disk integrity in QEMU


From: Steve Ofsthun
Subject: Re: [Qemu-devel] [RFC] Disk integrity in QEMU
Date: Mon, 13 Oct 2008 10:38:09 -0400
User-agent: Thunderbird 2.0.0.17 (X11/20080922)

Mark Wagner wrote:
> Anthony Liguori wrote:
>> Mark Wagner wrote:
>>> If you stopped and listened to yourself, you'd see that you are
>>> making my point...
>>>
>>> AFAIK, QEMU is neither designed nor intended to be an Enterprise
>>> Storage Array.  I thought this group was designing a virtualization
>>> layer.  However, the persistent argument is that since Enterprise
>>> Storage products will often acknowledge a write before the data is
>>> actually on the disk, it's OK for QEMU to do the same.
>>
>> I think you're a little lost in this thread.  We're going to have QEMU
>> only acknowledge writes when they complete.  I've already sent out a
>> patch.  Just waiting a couple days to let everyone give their input.
>>
> Actually, I'm just not being clear enough in trying to point out that I
> don't think just setting a default value for "cache" goes far enough. My
> argument has nothing to do with the default value. It has to do with
> what the right thing to do is in specific situations, regardless of the
> value of the cache setting.
> 
> My point is that if a file is opened in the guest with the O_DIRECT (or
> O_DSYNC) then QEMU *must* honor that regardless of whatever value the
> current value of "cache" is.

I disagree here.  QEMU's contract is not with any particular guest OS
interface.  QEMU's contract is with the faithfulness of the hardware emulation.
The guest OS must perform the appropriate actions to guarantee the behavior
it advertises to any particular application.  So the discussion should focus
on what QEMU should do when asked to flush an I/O stream on a virtual device.
While the specific actions QEMU performs may differ based on caching mode,
the end result should be that host caches are flushed to the underlying
storage hierarchy.  Note that this still doesn't guarantee the I/O is on the
disk unless the storage is configured properly.  QEMU shouldn't attempt to
provide stronger guarantees than the host OS provides.
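
To make the flush point concrete, here is a minimal sketch in C of the split I
have in mind.  It is not QEMU's actual code: the cache_mode enum and the
helpers host_open_flags(), open_backing_file() and handle_guest_flush() are
hypothetical names invented for illustration.  The caching mode only changes
how the host file is opened; a flush requested on the virtual device always
ends in fsync() on the host, which is as strong a guarantee as the host OS
offers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose O_DSYNC and fsync() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical cache modes, for illustration only (not QEMU's real enum). */
enum cache_mode { CACHE_WRITEBACK, CACHE_WRITETHROUGH };

/* Host open(2) flags implied by the configured cache mode. */
int host_open_flags(enum cache_mode mode)
{
    int flags = O_RDWR | O_CREAT;

    if (mode == CACHE_WRITETHROUGH)
        flags |= O_DSYNC;  /* each write returns only after the host
                              reports the data as written */
    return flags;
}

/* Open the host file backing a virtual disk. */
int open_backing_file(const char *path, enum cache_mode mode)
{
    return open(path, host_open_flags(mode), 0600);
}

/*
 * The guest asked the virtual device to flush.  Whatever the cache mode,
 * push host caching down to the underlying storage hierarchy.  Whether
 * the data then reaches the media depends on how that storage is
 * configured; this cannot promise more than the host OS does.
 */
int handle_guest_flush(int host_fd)
{
    assert(host_fd >= 0);
    return fsync(host_fd);
}
```

In this sketch a guest application that opens a file with O_DSYNC is served
by the guest OS issuing flushes to the virtual device, not by QEMU inspecting
guest open flags.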

Consider a parallel in the real world.  Most disk drives today ship with
write caching enabled.  Most OSes will accept this and allow delayed writes to
the actual media.  Is this completely safe?  No.  Is this accepted?  Yes.  Now,
to become safe an application will perform extraordinary actions (various sync
modes, etc.) to guarantee the data is on the media.  Yet even this can be
circumvented by specific performance modes in the storage hierarchy.  However,
there are best practices to follow to avoid unexpected vulnerabilities.  For
certain application environments it is mandatory to disable writeback caching
on the drives.  Yet we wouldn't want to impose this constraint on all
application environments.  There are always tradeoffs.

Now, given that there are data safety issues to deal with, it is important to
prevent a default behavior that recklessly endangers guest data.  A customer
will expect a single virtual machine to exhibit the same data safety as a
single physical machine.  Likewise, when a group of virtual machines runs on a
single host, guest users will expect the same reliability as from a group of
physical machines.  Note that the virtualization layer adds vulnerabilities (a
host OS crash, for example) that reduce the reliability of the virtual machines
relative to the physical machines they replace.  So the default behavior of a
virtualization stack may need to be more conservative than that of the
physical stack it replaces.

On the flip side though, the virtualization layer can exploit new opportunities
for optimization.  Imagine a single macro operation running within a virtual
machine (backup, OS installation).  Data integrity of the entire operation is
important, not that of the individual I/Os.  So by disabling all individual I/O
synchronization semantics, I get a backup or installation to run in half the
time.  This can be a key advantage for virtual deployments.  We don't want to
rule out this optimization just to guarantee the integrity of half a
backup, or half an install.
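
As a rough illustration of the macro-operation case, the C sketch below
writes a run of blocks either with a sync after every write or with a single
sync once the whole operation is done.  The function name
run_macro_operation(), the 512-byte block size and the fill pattern are
assumptions made up for this example; it only counts the sync calls and makes
no claim about the actual speedup.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose fsync() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 512  /* arbitrary block size for the sketch */

/*
 * Write nblocks blocks to path.  With per_io_sync set, fsync() after every
 * block; otherwise fsync() once after the last block, so only the operation
 * as a whole is made durable.  Returns the number of fsync() calls issued,
 * or -1 on error.  A short write is treated as an error to keep this brief.
 */
int run_macro_operation(const char *path, int nblocks, int per_io_sync)
{
    char block[BLOCK_SIZE];
    int syncs = 0;
    int fd;
    int i;

    assert(nblocks >= 0);
    memset(block, 0xAB, sizeof(block));

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    for (i = 0; i < nblocks; i++) {
        if (write(fd, block, sizeof(block)) != (ssize_t)sizeof(block))
            goto fail;
        if (per_io_sync) {
            if (fsync(fd) != 0)
                goto fail;
            syncs++;
        }
    }

    if (!per_io_sync) {
        /* One sync covers the entire operation. */
        if (fsync(fd) != 0)
            goto fail;
        syncs++;
    }

    if (close(fd) != 0)
        return -1;
    return syncs;

fail:
    close(fd);
    return -1;
}
```

If the host crashes partway through the second variant, the partial backup or
install is lost, but it would have to be redone in either case.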

Steve





