qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH 2/2] blockjob: Fix hang in block_job_finish_sync


From: Paolo Bonzini
Subject: Re: [Qemu-block] [PATCH 2/2] blockjob: Fix hang in block_job_finish_sync
Date: Thu, 28 Jan 2016 10:10:06 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0


On 28/01/2016 08:02, Fam Zheng wrote:
> 
> This is because the aio_poll() only processes the AIO context of bs
> which has no more work to do, while the main loop BH that is scheduled
> for setting the job->completed flag is never processed.
> 
> Fix this by adding a "ctx" pointer in BlockJob structure, to track which
> context to poll for the block job to make progress. Its value is set to
> the BDS context at block job creation, until
> block_job_coroutine_complete() is called by the block job coroutine.
> After that point, the block job's work is deferred to main loop BH.
> 
> Signed-off-by: Fam Zheng <address@hidden>
> ---
>  blockjob.c               | 4 +++-
>  include/block/blockjob.h | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 4b16720..4ea1ce0 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -74,6 +74,7 @@ void *block_job_create(const BlockJobDriver *driver, 
> BlockDriverState *bs,
>      job->opaque        = opaque;
>      job->busy          = true;
>      job->refcnt        = 1;
> +    job->ctx           = bdrv_get_aio_context(bs);

Can the context change if dataplane is started/stopped in the middle of
a job?  (For example if you start migration).  Perhaps job->ctx == NULL
could mean "use bdrv_get_aio_context(bs)".

Paolo

>      bs->job = job;
>  
>      /* Only set speed when necessary to avoid NotSupported error */
> @@ -304,7 +305,7 @@ static int block_job_finish_sync(BlockJob *job,
>          return -EBUSY;
>      }
>      while (!job->completed) {
> -        aio_poll(bdrv_get_aio_context(bs), true);
> +        aio_poll(job->ctx, true);
>      }
>      ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
>      block_job_unref(job);
> @@ -497,6 +498,7 @@ void block_job_coroutine_complete(BlockJob *job,
>      data->aio_context = bdrv_get_aio_context(job->bs);
>      data->fn = fn;
>      data->opaque = opaque;
> +    job->ctx = qemu_get_aio_context();
>  
>      qemu_bh_schedule(data->bh);
>  }
> diff --git a/include/block/blockjob.h b/include/block/blockjob.h
> index de59fc2..5c6a884 100644
> --- a/include/block/blockjob.h
> +++ b/include/block/blockjob.h
> @@ -92,6 +92,8 @@ struct BlockJob {
>       */
>      char *id;
>  
> +    AioContext *ctx;
> +
>      /**
>       * The coroutine that executes the job.  If not NULL, it is
>       * reentered when busy is false and the job is cancelled.
> -- 2.4.3



reply via email to

[Prev in Thread] Current Thread [Next in Thread]