
Re: [Qemu-block] [Qemu-devel] [PATCH v7 22/24] block: Rewrite bdrv_close_all()


From: Fam Zheng
Subject: Re: [Qemu-block] [Qemu-devel] [PATCH v7 22/24] block: Rewrite bdrv_close_all()
Date: Wed, 18 Nov 2015 10:52:59 +0800
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, 11/16 17:15, Max Reitz wrote:
> >> @@ -1971,13 +1969,44 @@ static void bdrv_close(BlockDriverState *bs)
> >>  void bdrv_close_all(void)
> >>  {
> >>      BlockDriverState *bs;
> >> +    AioContext *aio_context;
> >> +    int original_refcount = 0;
> >>  
> >> -    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> >> -        AioContext *aio_context = bdrv_get_aio_context(bs);
> >> +    /* Drop references from requests still in flight, such as canceled block
> >> +     * jobs whose AIO context has not been polled yet */
> >> +    bdrv_drain_all();
> >>  
> >> -        aio_context_acquire(aio_context);
> >> -        bdrv_close(bs);
> >> -        aio_context_release(aio_context);
> >> +    blockdev_close_all_bdrv_states();
> >> +    blk_remove_all_bs();
> >> +
> >> +    /* Cancel all block jobs */
> >> +    while (!QTAILQ_EMPTY(&all_bdrv_states)) {
> >> +        QTAILQ_FOREACH(bs, &all_bdrv_states, bs_list) {
> >> +            aio_context = bdrv_get_aio_context(bs);
> >> +
> >> +            aio_context_acquire(aio_context);
> >> +            if (bs->job) {
> >> +                /* So we can safely query the current refcount */
> >> +                bdrv_ref(bs);
> >> +                original_refcount = bs->refcnt;
> >> +
> >> +                block_job_cancel_sync(bs->job);
> >> +                aio_context_release(aio_context);
> >> +                break;
> >> +            }
> >> +            aio_context_release(aio_context);
> >> +        }
> >> +
> >> +        /* All the remaining BlockDriverStates are referenced directly or
> >> +         * indirectly from block jobs, so there needs to be at least one BDS
> >> +         * directly used by a block job */
> >> +        assert(bs);
> >> +
> >> +        /* Wait for the block job to release its reference */
> >> +        while (bs->refcnt >= original_refcount) {
> >> +            aio_poll(aio_context, true);
> >> +        }
> >> +        bdrv_unref(bs);
> > 
> > If at this point bs->refcnt is greater than 1, why don't we care where the
> > remaining references come from?
> 
> We do care. A BDS will not be removed from all_bdrv_states until it is
> deleted (i.e. its refcount becomes 0). Therefore, this loop will
> continue until all BDSs have been deleted.
> 
> So where might additional references come from? Since this loop only
> cares about direct or indirect references from block jobs, that's
> exactly it:
> 
> (1) You might have multiple block jobs running on a BDS in the future.
>     Then, you'll cancel them one by one, and after having canceled the
>     first one, the refcount will still be greater than one before the
>     bdrv_unref().
> 
> (2) Imagine a BDS A with a parent BDS B. There are block jobs running on
>     both of them. Now, B is referenced by both its block job and by A
>     (indirectly by the block job referencing A). If we cancel the job on
>     B before the one on A, then the refcount on B will still be greater
>     than 1 before bdrv_unref() because it is still referenced by its
>     parent (until the block job on A is canceled, too).
> 
> The first cannot happen right now, but the second one may, I'm not sure
> (depending on whether op blockers allow it).

OK, that makes sense. all_bdrv_states is the central place for making sure
that all refcnts reach 0.
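
To convince myself of case (2), here is a minimal toy model of the outer loop.
It is only a sketch and not QEMU code: ToyNode, toy_unref(), toy_job_cancel()
and toy_close_all() are hypothetical names, the "job" is reduced to a flag that
owns one reference, and there is no AioContext or polling. The point is only
that cancelling jobs one at a time eventually brings every refcount to 0, and
that a stray reference shows up as "nodes left, but none has a job", which is
the situation the assert(bs) in the patch catches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a BlockDriverState; not QEMU's real structure. */
typedef struct ToyNode {
    int refcnt;             /* number of owners still holding this node */
    bool has_job;           /* true while a toy block job references the node */
    bool deleted;           /* set once refcnt has dropped to 0 */
    struct ToyNode *holds;  /* node this one keeps a reference to, or NULL */
} ToyNode;

/* Drop one reference; on reaching 0, delete the node and release what it holds. */
static void toy_unref(ToyNode *n)
{
    assert(n->refcnt > 0);
    if (--n->refcnt == 0) {
        n->deleted = true;
        if (n->holds) {
            toy_unref(n->holds);
        }
    }
}

/* Cancelling a toy job simply releases the job's reference to the node. */
static void toy_job_cancel(ToyNode *n)
{
    assert(n->has_job);
    n->has_job = false;
    toy_unref(n);
}

/*
 * Cancel jobs one by one until no node is left alive.
 * Returns the number of nodes still alive when no job can be found:
 * 0 means everything was deleted, anything else is a stray reference
 * (the real patch asserts instead of returning).
 */
static size_t toy_close_all(ToyNode **nodes, size_t count)
{
    for (;;) {
        size_t alive = 0;
        ToyNode *with_job = NULL;

        for (size_t i = 0; i < count; i++) {
            if (nodes[i]->deleted) {
                continue;
            }
            alive++;
            if (!with_job && nodes[i]->has_job) {
                with_job = nodes[i];
            }
        }
        if (alive == 0 || !with_job) {
            return alive;
        }
        toy_job_cancel(with_job);
    }
}

/* Case (2): A holds a reference to B, and both have a job; B's job goes first. */
static size_t toy_demo_parent_child(void)
{
    ToyNode b = { .refcnt = 2, .has_job = true, .holds = NULL }; /* job + A */
    ToyNode a = { .refcnt = 1, .has_job = true, .holds = &b };   /* job only */
    ToyNode *nodes[] = { &b, &a };

    return toy_close_all(nodes, 2);
}

/* A node referenced from somewhere unknown and with no job never goes away. */
static size_t toy_demo_stray_ref(void)
{
    ToyNode n = { .refcnt = 1, .has_job = false, .holds = NULL };
    ToyNode *nodes[] = { &n };

    return toy_close_all(nodes, 1);
}
```

In this model toy_demo_parent_child() ends with no nodes alive even though B
still has refcnt 1 after its own job is cancelled, while toy_demo_stray_ref()
reports one node left over.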

> 
> 
> And we do make sure that there are no additional references besides:
> - directly from the monitor (monitor-owned BDSs)
> - from a BB
> - from block jobs
> - (+ everything transitively through the respective BDS tree)
> 
> If there were additional references, the inner loop would at some point
> no longer find a BDS with a block job while all_bdrv_states is still not
> empty. That's what the "assert(bs)" after the inner loop is for.
> 
> If you can imagine another way where a reference to a BDS may come from,
> that would be a bug in this patch and we have to make sure to respect
> that case, too.

I think the only reference I'm not sure about is the one in xen_disk. blk_unref
is called in the .free and .disconnect callbacks, but I have no idea whether
they are called before bdrv_close_all.

Otherwise this patch looks good to me.

Thanks,

Fam



