qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Qemu-block] [PATCH 15/18] block: only call aio_poll on


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 15/18] block: only call aio_poll on the current thread's AioContext
Date: Sun, 16 Oct 2016 17:40:10 +0100
User-agent: Mutt/1.7.0 (2016-08-17)

On Thu, Oct 13, 2016 at 07:34:19PM +0200, Paolo Bonzini wrote:
> aio_poll is not thread safe; for example bdrv_drain can hang if
> the last in-flight I/O operation is completed in the I/O thread after
> the main thread has checked bs->in_flight.
> 
> The bug remains latent as long as all of it is called within
> aio_context_acquire/aio_context_release, but this will change soon.
> 
> To fix this, if bdrv_drain is called from outside the I/O thread,
> signal the main AioContext through a dummy bottom half.  The event
> loop then only runs in the I/O thread.
> 
> Reviewed-by: Fam Zheng <address@hidden>
> Signed-off-by: Paolo Bonzini <address@hidden>
> ---
>  async.c                   |  1 +
>  block.c                   |  2 ++
>  block/io.c                |  7 +++++++
>  include/block/block.h     | 24 +++++++++++++++++++++---
>  include/block/block_int.h |  1 +
>  5 files changed, 32 insertions(+), 3 deletions(-)
> 
> diff --git a/async.c b/async.c
> index f30d011..fb37b03 100644
> --- a/async.c
> +++ b/async.c
> @@ -61,6 +61,7 @@ void aio_bh_schedule_oneshot(AioContext *ctx, QEMUBHFunc 
> *cb, void *opaque)
>      smp_wmb();
>      ctx->first_bh = bh;
>      qemu_mutex_unlock(&ctx->bh_lock);
> +    aio_notify(ctx);
>  }
>  
>  QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
> diff --git a/block.c b/block.c
> index fbe485c..a17baab 100644
> --- a/block.c
> +++ b/block.c
> @@ -2090,7 +2090,9 @@ int bdrv_reopen_multiple(AioContext *ctx, 
> BlockReopenQueue *bs_queue, Error **er
>  
>      assert(bs_queue != NULL);
>  
> +    aio_context_release(ctx);
>      bdrv_drain_all();
> +    aio_context_acquire(ctx);
>  
>      QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
>          if (bdrv_reopen_prepare(&bs_entry->state, bs_queue, &local_err)) {
> diff --git a/block/io.c b/block/io.c
> index 7d3dcfc..cd28909 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -474,8 +474,15 @@ void bdrv_inc_in_flight(BlockDriverState *bs)
>      atomic_inc(&bs->in_flight);
>  }
>  
> +static void dummy_bh_cb(void *opaque)
> +{
> +}
> +
>  void bdrv_wakeup(BlockDriverState *bs)
>  {
> +    if (bs->wakeup) {
> +        aio_bh_schedule_oneshot(qemu_get_aio_context(), dummy_bh_cb, NULL);
> +    }
>  }

Why use a dummy BH instead of aio_notify()?

>  
>  void bdrv_dec_in_flight(BlockDriverState *bs)
> diff --git a/include/block/block.h b/include/block/block.h
> index ba4318b..72d5d8e 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -343,9 +343,27 @@ void bdrv_drain_all(void);
>  #define bdrv_poll_while(bs, cond) ({                       \
>      bool waited_ = false;                                  \
>      BlockDriverState *bs_ = (bs);                          \
> -    while ((cond)) {                                       \
> -        aio_poll(bdrv_get_aio_context(bs_), true);         \
> -        waited_ = true;                                    \
> +    AioContext *ctx_ = bdrv_get_aio_context(bs_);          \
> +    if (aio_context_in_iothread(ctx_)) {                   \
> +        while ((cond)) {                                   \
> +            aio_poll(ctx_, true);                          \
> +            waited_ = true;                                \
> +        }                                                  \
> +    } else {                                               \
> +        assert(qemu_get_current_aio_context() ==           \
> +               qemu_get_aio_context());                    \

The assumption is that IOThread #1 will never call bdrv_poll_while() on
IOThread #2's AioContext.  I believe that is true today.  Is this what
you had in mind?

Please add a comment since it's not completely obvious from the assert
expression.

> +        /* Ask bdrv_dec_in_flight to wake up the main      \
> +         * QEMU AioContext.                                \
> +         */                                                \
> +        assert(!bs_->wakeup);                              \
> +        bs_->wakeup = true;                                \
> +        while ((cond)) {                                   \
> +            aio_context_release(ctx_);                     \
> +            aio_poll(qemu_get_aio_context(), true);        \
> +            aio_context_acquire(ctx_);                     \
> +            waited_ = true;                                \
> +        }                                                  \
> +        bs_->wakeup = false;                               \
>      }                                                      \
>      waited_; })
>  
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 11f877b..0516f62 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -470,6 +470,7 @@ struct BlockDriverState {
>      NotifierWithReturnList before_write_notifiers;
>  
>      /* number of in-flight requests; overall and serialising */
> +    bool wakeup;
>      unsigned int in_flight;
>      unsigned int serialising_in_flight;
>  
> -- 
> 2.7.4
> 
> 
> 

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]