From: Fam Zheng
Subject: Re: [Qemu-devel] [RFC PATCH 4/5] block: Drop AioContext lock in bdrv_drain_poll_top_level()
Date: Fri, 24 Aug 2018 15:24:56 +0800
User-agent: Mutt/1.10.1 (2018-07-13)
On Fri, 08/17 19:02, Kevin Wolf wrote:
> Similar to AIO_WAIT_WHILE(), bdrv_drain_poll_top_level() needs to
> release the AioContext lock of the node to be drained before calling
> aio_poll(). Otherwise, callbacks called by aio_poll() would possibly
> take the lock a second time and run into a deadlock with a nested
> AIO_WAIT_WHILE() call.
>
> Signed-off-by: Kevin Wolf <address@hidden>
> ---
> block/io.c | 25 ++++++++++++++++++++++++-
> 1 file changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/block/io.c b/block/io.c
> index 7100344c7b..832d2536bf 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -268,9 +268,32 @@ bool bdrv_drain_poll(BlockDriverState *bs, bool recursive,
>  static bool bdrv_drain_poll_top_level(BlockDriverState *bs, bool recursive,
>                                        BdrvChild *ignore_parent)
>  {
> +    AioContext *ctx = bdrv_get_aio_context(bs);
> +
> +    /*
> +     * We cannot easily release the lock unconditionally here because many
> +     * callers of drain function (like qemu initialisation, tools, etc.) don't
> +     * even hold the main context lock.
> +     *
> +     * This means that we fix potential deadlocks for the case where we are in
> +     * the main context and polling a BDS in a different AioContext, but
> +     * draining a BDS in the main context from a different I/O thread would
> +     * still have this problem. Fortunately, this isn't supposed to happen
> +     * anyway.
> +     */
> +    if (ctx != qemu_get_aio_context()) {
> +        aio_context_release(ctx);
> +    } else {
> +        assert(qemu_get_current_aio_context() == qemu_get_aio_context());
> +    }
> +
>      /* Execute pending BHs first and check everything else only after the BHs
>       * have executed. */
> -    while (aio_poll(bs->aio_context, false));
> +    while (aio_poll(ctx, false));
> +
> +    if (ctx != qemu_get_aio_context()) {
> +        aio_context_acquire(ctx);
> +    }
>
>      return bdrv_drain_poll(bs, recursive, ignore_parent, false);
>  }
> --
> 2.13.6
>
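
[To make the deadlock from the commit message concrete, here is a minimal,
self-contained analogy in plain pthreads (illustrative only, not QEMU code).
The AioContext lock is recursive, so the callback's second acquisition
succeeds by itself; the hang comes from a nested AIO_WAIT_WHILE()-style
wait that can release only one level of the lock, leaving the thread that
would make the condition true unable to acquire it:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t lock;        /* recursive, like the AioContext lock */
static atomic_bool cond_done;       /* set by the "I/O thread" */

/* Stands in for the I/O thread that owns the drained node's AioContext. */
static void *iothread_fn(void *arg)
{
    pthread_mutex_lock(&lock);      /* blocks forever: depth never drops to 0 */
    atomic_store(&cond_done, true);
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&lock, &attr);

    pthread_t iothread;
    pthread_create(&iothread, NULL, iothread_fn, NULL);

    pthread_mutex_lock(&lock);      /* drain caller: lock depth 1 */
    pthread_mutex_lock(&lock);      /* callback run by aio_poll(): depth 2 */

    /* Nested AIO_WAIT_WHILE()-style wait: it can drop only one level
     * of the recursive lock before polling. */
    pthread_mutex_unlock(&lock);    /* back to depth 1, not 0 */
    while (!atomic_load(&cond_done)) {
        /* "poll": spins forever, iothread_fn can never take the lock */
    }

    /* Never reached: this program hangs by design. */
    pthread_mutex_unlock(&lock);
    pthread_join(iothread, NULL);
    return 0;
}

The patch avoids this by releasing the AioContext lock before the aio_poll()
loop, so any nested waiter starts without an extra lock level held above it.]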
The same question as patch 3: why not just use AIO_WAIT_WHILE() here? It
takes care not to release any lock when it is both running in and polling
the main context (taking the in_aio_context_home_thread() branch).
Fam
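
[For reference, the branch Fam points to: a simplified sketch of the
AIO_WAIT_WHILE() logic (the real macro lives in include/block/aio-wait.h
and additionally does waiter bookkeeping; the name, parameter list, and
details here are trimmed for illustration):

/* Simplified sketch only, not the actual QEMU macro. */
#define AIO_WAIT_WHILE_SKETCH(ctx, cond) ({                        \
    bool waited_ = false;                                          \
    if (in_aio_context_home_thread(ctx)) {                         \
        /* Polling our own context: progress is made by this very  \
         * aio_poll() call, so no lock needs to be dropped. */     \
        while ((cond)) {                                           \
            aio_poll(ctx, true);                                   \
            waited_ = true;                                        \
        }                                                          \
    } else {                                                       \
        /* Main thread polling another context: drop its lock so   \
         * the other thread's callbacks can run and change cond. */\
        assert(qemu_get_current_aio_context() ==                   \
               qemu_get_aio_context());                            \
        while ((cond)) {                                           \
            aio_context_release(ctx);                              \
            aio_poll(qemu_get_aio_context(), true);                \
            aio_context_acquire(ctx);                              \
            waited_ = true;                                        \
        }                                                          \
    }                                                              \
    waited_;                                                       \
})

Reusing this instead of open-coding the release/acquire pair would give
bdrv_drain_poll_top_level() the same "no lock juggling in the home thread"
behaviour for free.]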