qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [Qemu-devel] [PATCH 06/10] qemu-img: Prepare for locked


From: Kevin Wolf
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH 06/10] qemu-img: Prepare for locked images
Date: Wed, 13 Jan 2016 15:19:58 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

Am 13.01.2016 um 09:44 hat Markus Armbruster geschrieben:
> Kevin Wolf <address@hidden> writes:
> 
> > Am 12.01.2016 um 16:20 hat Markus Armbruster geschrieben:
> >> Kevin Wolf <address@hidden> writes:
> >> 
> >> > Am 11.01.2016 um 16:49 hat Markus Armbruster geschrieben:
> >> >> Eric Blake <address@hidden> writes:
> >> >> 
> >> >> > On 12/22/2015 09:46 AM, Kevin Wolf wrote:
> >> >> >> This patch extends qemu-img for working with locked images. It 
> >> >> >> prints a
> >> >> >> helpful error message when trying to access a locked image 
> >> >> >> read-write,
> >> >> >> and adds a 'qemu-img force-unlock' command as well as a 'qemu-img 
> >> >> >> check
> >> >> >> -r all --force' option in order to override a lock left behind after 
> >> >> >> a
> >> >> >> qemu crash.
> >> >> >> 
> >> >> >> Signed-off-by: Kevin Wolf <address@hidden>
> >> >> >> ---
> >> >> >>  include/block/block.h |  1 +
> >> >> >>  include/qapi/error.h  |  1 +
> >> >> >>  qapi/common.json      |  3 +-
> >> >> >>  qemu-img-cmds.hx      | 10 ++++--
> >> >> >>  qemu-img.c | 96
> >> >> >> +++++++++++++++++++++++++++++++++++++++++++--------
> >> >> >>  qemu-img.texi         | 20 ++++++++++-
> >> >> >>  6 files changed, 113 insertions(+), 18 deletions(-)
> >> >> >> 
> >> >> >
> >> >> >> +++ b/include/qapi/error.h
> >> >> >> @@ -102,6 +102,7 @@ typedef enum ErrorClass {
> >> >> >>      ERROR_CLASS_DEVICE_NOT_ACTIVE = 
> >> >> >> QAPI_ERROR_CLASS_DEVICENOTACTIVE,
> >> >> >>      ERROR_CLASS_DEVICE_NOT_FOUND = QAPI_ERROR_CLASS_DEVICENOTFOUND,
> >> >> >>      ERROR_CLASS_KVM_MISSING_CAP = QAPI_ERROR_CLASS_KVMMISSINGCAP,
> >> >> >> +    ERROR_CLASS_IMAGE_FILE_LOCKED = 
> >> >> >> QAPI_ERROR_CLASS_IMAGEFILELOCKED,
> >> >> >>  } ErrorClass;
> >> >> >
> >> >> > Wow - a new ErrorClass.  It's been a while since we could justify one 
> >> >> > of
> >> >> > these, but I think you might have found a case.
> >> >> 
> >> >> Spell out the rationale for the new ErrorClass, please.
> >> >
> >> > Action to be taken for this error class: Decide whether the lock is a
> >> > leftover from a previous qemu run that ended in an unclean shutdown. If
> >> > so, retry with overriding the lock.
> >> >
> >> > Currently used by qemu-img when ordered to override a lock. libvirt
> >> > will need to do the same.
> >> 
> >> Let's see whether I understand the intended use:
> >> 
> >>     open image
> >>     if open fails with ImageFileLocked:
> >>         guess whether the lock is stale
> >>         if guessing not stale:
> >>             error out
> >>         open image with lock override
> >> 
> >> Correct?
> >
> > Yes. Where "guess" is more or less "check whether the management tool
> > started qemu with this image, but didn't cleanly shut it down". This can
> > guess wrong if, and only if, some other user used a different algorithm
> > and forced an unlock even though the image didn't belong to them before
> > the crash.
> >
> >> Obvious troublespots:
> >> 
> >> 1. If you guess wrong, you destroy the image.  No worse than before, so
> >>    okay, declare documentation problem.
> >> 
> >> 2. TOCTTOU open to open with lock override
> >>    [...]
> >> 
> >> 3. TOCTTOU within open (hypothetical, haven't read your code)
> >>    [...]
> >
> > Yes, these exist in theory. The question is what scenarios you want to
> > protect against and whether improving the mechanism to cover these cases
> > is worth the effort.
> >
> > The answer for what I wanted to protect is a manual action on an image
> > that is already in use. The user isn't quick enough to manually let two
> > processes open the same image at the same time, so I didn't consider
> > that scenario relevant.
> >
> > But assuming that everyone (including the human user) follows the above
> > protocol (force-unlock only what was yours before the crash), at least
> > cases 1 and 2 don't happen anyway.
> 
> "Force-unlock only what you locked yourself" is easier to stipulate than
> to adhere to when the tools can't give you a hint on who did the
> locking.  This is particularly true when "you" is a human, with human
> imperfect memory.
> 
> I understand that this locking can't provide complete protection, and
> merely aims to catch certain common accidents.
> 
> However, to avoid a false sense of security, its limitations need to be
> clearly documented.  This very much includes the rule "force-unlock only
> what you locked yourself".  In my opinion, it should also include the
> raciness.
> 
> Sometimes, solving a problem is easier than documenting it.

Maybe I'll just merge the migration fixes and shelve the rest of the
series until I'm bored enough to implement the "real thing" with an
incompatible feature flag, lock IDs with an autogenerated part and
another part from the user, saved host name and PID, and a qcow2 driver
that refuses to write anything to an image it doesn't hold the lock for
even in corner cases. For now, I've already used more time for this than
I intended (didn't expect all that live migration fun initially...).

And after all, it's not my VMs that get corrupted.

If you think that solving the problem "for real" is too easy to take
shortcuts, I'll happily create a BZ and assign it to you if you'd like
me to.

Kevin



reply via email to

[Prev in Thread] Current Thread [Next in Thread]