From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH v2] block: Fix bdrv_drain in coroutine
Date: Tue, 5 Apr 2016 10:39:56 +0100
User-agent: Mutt/1.5.24 (2015-08-30)

On Tue, Apr 05, 2016 at 09:27:39AM +0800, Fam Zheng wrote:
> On Mon, 04/04 12:57, Stefan Hajnoczi wrote:
> > On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote:
> > > +    BdrvCoDrainData data;
> > > +
> > > +    assert(qemu_in_coroutine());
> > > +    data = (BdrvCoDrainData) {
> > > +        .co = qemu_coroutine_self(),
> > > +        .bs = bs,
> > > +        .done = false,
> > > +        .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, &data),
> > > +    };
> > > +    qemu_bh_schedule(data.bh);
> > > +
> > > +    qemu_coroutine_yield();
> > > +    /* If we are resumed from some other event (such as an aio completion or a
> > > +     * timer callback), it is a bug in the caller that should be fixed. */
> > > +    assert(data.done);
> > > +}
> > > +
> > >  /*
> > >   * Wait for pending requests to complete on a single BlockDriverState subtree,
> > >   * and suspend block driver's internal I/O until next request arrives.
> > > @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs)
> > >      bool busy = true;
> > >  
> > >      bdrv_drain_recurse(bs);
> > > +    if (qemu_in_coroutine()) {
> > > +        bdrv_co_drain(bs);
> > > +        return;
> > > +    }
> > >      while (busy) {
> > >          /* Keep iterating */
> > >           bdrv_flush_io_queue(bs);
> > > -- 
> > > 2.7.4
> > 
> > block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain()
> > should assert(!qemu_in_coroutine()).
> > 
> > Existing bdrv_read() and friends detect coroutine context at runtime
> > because they are meant for legacy code that runs in both coroutine and
> > non-coroutine contexts.
> 
> blk_unref/bdrv_unref are called in coroutine context (the .bdrv_create
> implementations), so this doesn't just work with the assertion. Should I
> clean up this "legacy" code first, i.e. move the bdrv_unref calls to BHs in
> the callers and assert(!qemu_in_coroutine()) there too? I avoided doing
> that because it complicates the code somewhat.

This is a messy problem.

In general I don't like introducing yields into non-coroutine_fn
functions, because doing so can lead to bugs when the caller doesn't
expect a yield point.

For example, I myself wouldn't have expected bdrv_unref() to be a yield
point.  So maybe coroutine code I've written would be vulnerable to
interference (I won't call it a race condition) from another coroutine
across the bdrv_unref() call.  This could mean that another coroutine
now sees intermediate state that would never be visible without the new
yield point.
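
To make the hazard concrete, here is a minimal hypothetical sketch
(MyJob, my_job_complete(), and its fields are made-up names for
illustration; bdrv_unref(), coroutine_fn, and BlockDriverState are the
real QEMU identifiers):

    #include "qemu/osdep.h"
    #include "block/block.h"

    /* Hypothetical job state: the two fields must change together. */
    typedef struct MyJob {
        BlockDriverState *old_bs;
        bool in_progress;       /* invariant: true iff old_bs != NULL */
    } MyJob;

    static void coroutine_fn my_job_complete(MyJob *job)
    {
        job->in_progress = false;
        /*
         * If bdrv_unref() gains a hidden yield point, another coroutine
         * can run right here and observe in_progress == false while
         * old_bs is still non-NULL -- an intermediate state that was
         * impossible to see when nothing between these two statements
         * yielded.
         */
        bdrv_unref(job->old_bs);
        job->old_bs = NULL;
    }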

I think attempting to invoke qemu_co_queue_run_restart() directly
instead of scheduling a BH and yielding does not improve the situation.
It's also a layering violation since qemu_co_queue_run_restart() is just
meant for the core coroutine code and isn't a public interface.

Anyway, let's treat bdrv_drain() as legacy code that may check
if (qemu_in_coroutine()), but please make bdrv_co_drain() public so
block/mirror.c can at least call it directly.
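
Concretely, the shape being suggested is something like this sketch
(based on the hunk quoted above; the prototype placement and the elided
polling loop are illustrative, not the final patch):

    /* include/block/block.h: export the coroutine version */
    void coroutine_fn bdrv_co_drain(BlockDriverState *bs);

    /* block/io.c: keep the runtime check for legacy coroutine callers */
    void bdrv_drain(BlockDriverState *bs)
    {
        bdrv_drain_recurse(bs);
        if (qemu_in_coroutine()) {
            bdrv_co_drain(bs);
            return;
        }
        /* ... existing synchronous polling loop ... */
    }

    /* block/mirror.c already runs in coroutine context, so it should
     * call bdrv_co_drain(bs) directly instead of going through
     * bdrv_drain(). */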

Stefan
