qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] BlockDriverState member in_use (was: KVM Forum 2012 Blo


From: Marcelo Tosatti
Subject: Re: [Qemu-devel] BlockDriverState member in_use (was: KVM Forum 2012 Block BoF minutes)
Date: Mon, 19 Nov 2012 18:13:00 -0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Nov 19, 2012 at 06:03:55PM +0100, Markus Armbruster wrote:
> Marcelo Tosatti <address@hidden> writes:
> 
> > On Thu, Nov 15, 2012 at 02:31:53PM +0100, Markus Armbruster wrote:
> [...]
> >> BlockDriverState member in_use and DriveInfo member refcount are a mess:
> >> 
> >> - in_use is used to avoid running certain things concurrently, but the
> >>   rules are unclear, except they're clearly incomplete; possible rules
> >>   could be about need for consistent views of image contents
> >
> >     block: enable in_use flag
> >
> >     Set block device in use during block migration, disallow drive_del and
> >     bdrv_truncate for in use devices.
> >
> > 1) drive_del is not allowed because memory object used by block
> > streaming/migration can disappear.
> > 2) bdrv_truncate changes device size, which the block migration code was
> > unable to handle at the time.
> >
> > These are the rules (accordingly to the changeset).
> 
> Yup.  Your commits db593f25^..8591675f.
> 
> >                                                     IIRC new rules have
> > been added (new uses for bdrv_in_use).
> 
> Stefan's commit 2d3735d3.
> 
> Let's have a look at the users of in_use (from notes I took back in
> August, possibly stale now):
> 
> * bdrv_commit()
> 
>   Since commit 2d3735d3:
> 
>     block: check bdrv_in_use() before blockdev operations
>     
>     Long-running block operations like block migration and image streaming
>     must have continual access to their block device.  It is not safe to
>     perform operations like hotplug, eject, change, resize, commit, or
>     external snapshot while a long-running operation is in progress.
>     
>     This patch adds the missing bdrv_in_use() checks so that block migration
>     and image streaming never have the rug pulled out from underneath
>     them.
> 
>   Users are monitor command commit and terminal escape 's', directly and
>   via bdrv_commit_all()
> 
>   What exactly would go wrong if someone commited during block migration
>   / image streaming?

    if (ro) {
        if (bdrv_reopen(bs->backing_hd, open_flags | BDRV_O_RDWR, NULL)) {
            return -EACCES;
        }
    }

>   Strange: also tests whether backing_hd is in_use.  The other users
>   don't, and I can't see how the test could ever succeed.
> 
>   Cock-up: bdrv_commit_all() stops on first error.
> 
> * bdrv_truncate()
> 
>   Since commit 8591675f:
>   
>     block: enable in_use flag
>     
>     Set block device in use during block migration, disallow drive_del and
>     bdrv_truncate for in use devices.
> 
>   Users are monitor command block_resize and a bunch of block drivers.
> 
>   I don't think block drivers are affected by in_use.  They resize their
>   internal BDSs such as bs->file, and I can't see how in_use could be
>   set there.
> 
>   "No resize while block migration is running" is (was?) an artificial
>   restriction; it should migrate the resize just like any other change.

>   Image streaming, too, I guess.

If its necessary to allow concurrent operation, then yes.

> * block_job_create()
> 
>   Since commit eeec61f2:
> 
>     block: add BlockJob interface for long-running operations
> 
>   Use is monitor command block-stream.
> 
>   Why does the block job infrastructure set in_use?  I suspect just
>   because its only user "image streaming" needs it, but future users
>   might not.  If that's correct, it's set in the wrong place.

Because block stream supposedly does not handle a BDS changing size (and
it was not necessary to handle this case at the time). Changing it
requires auditing the code.

> * qmp_transaction()
> 
>   Since commit 2d3735d3 (see above).  Function was called
>   qmp_blockdev_snapshot_sync() then.
> 
>   User is monitor command snapshot_blkdev.
> 
>   What exactly would go wrong if someone snaphotted during block
>   migration / image streaming?
> 
> * eject_device()
> 
>   Since commit 2d3735d3 (see above).
> 
>   Users are monitor command eject, change.
> 
>   What would go wrong is obvious, for a change: we can't just kill the
>   BDS while a long-running operation is using it.
> 
>   Could we just disassociate the BDS from the drive without killing it?
>   So that when the job completes, the BDS's reference count drops to
>   zero, and the BDS gets destroyed.
> 
> * drive_del
> 
>   Since commit 8591675f (see above).
> 
>   In theory, in_use is unnecessary here: reference counting should delay
>   the delete just fine.  In practice, do_drive_del() *ignores* the
>   reference count.  Even if that was fixed, the count only delays
>   drive_uninit(), not bdrv_drain_all()..bdrv_close().  Another fine
>   mess.
> 
>   Note how we treat the device model's reference to the drive: if it
>   exists, we make the drive anonymous instead of destroying it.  It gets
>   destroyed only when the device model drops the reference.  Similar to
>   reference counting.
> 
> Tests of in_use are spread over block.c and monitor commands.  Smells
> bad.  If it was only about excluding monitor commands during certain
> long-running operations, the montor commands would be the appropriate
> place.  Is it?
> 
> > "Consistent views of image contents" is not defined.
> 
> Imprecise language for "in_use seems to be about changes to the image,
> while refcount is for keeping objects alive".  We were trying to make
> sense of the mess.
> 
> > Question: why are the rules "clearly incomplete" ?
> 
> Are we sure we catch everything that could interfere with block
> migration / image streaming?  Why only these two?  What about other
> long-running jobs?  Do they need interference protection, too?

Good question. 

> Shouldn't there be clear rules on what changes can happen while a job
> runs, and which ones can be prevented, and how?

It depends on what operation can't fail to be performed while block
stream is executing. If such an operation exists, then you take the
energy to guarantee block streaming can perform with such an operation
in between.

There is no other basis for the rules.

Agree its a mess now (perhaps you'd want to work out the details of
every operation and corresponding effect on block streaming).

> >> - refcount is only used together with in_use, and appears to be for
> >>   avoiding premature deletion
> >
> > Refcount is used to stop block auto deletion when device is unplugged
> > by the guest. If you can reach BlockDriverState and use its in_use
> > flag, then you can kill refcount.
> 
> Except it'll come right back when we switch to QOM :)



reply via email to

[Prev in Thread] Current Thread [Next in Thread]