From: Kevin Wolf
Subject: Re: [PATCH 1/1] block: Do not poll in bdrv_set_aio_context_ignore() when acquiring new_context
Date: Mon, 19 Jul 2021 12:24:53 +0200
On 12.07.2021 at 07:38, Zhiyong Ye wrote:
> When bdrv_set_aio_context_ignore() is called in the main loop to move a
> node to an IO thread's AioContext, bdrv_drain_invoke_entry() never gets
> to run and the IO thread hangs in co_schedule_bh_cb().
>
> This is because the main thread still holds the AioContext after it has
> been attached to the IO thread: the main thread polls in
> bdrv_drained_end() waiting for the in-flight I/O requests to be drained,
> but the IO thread cannot acquire the AioContext, which leads to a
> deadlock.
This shouldn't be the case:
* The caller must own the AioContext lock for the old AioContext of bs, but it
* must not own the AioContext lock for new_context (unless new_context is the
* same as the current context of bs).
bdrv_set_aio_context_ignore() then switches the AioContext of bs and
calls bdrv_drained_end() while holding only the lock for the new
context. AIO_WAIT_WHILE() will temporarily drop that lock, so aio_poll()
should run without holding any AioContext locks.
If I'm not missing anything, the scenario you're seeing means that the
caller already held a lock for the new AioContext, so that it's locked
twice while AIO_WAIT_WHILE() drops the lock only once. This would be a
bug in the caller because the documentation I quoted explicitly forbids
holding the AioContext lock for the new context.
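For reference, a caller that follows this contract looks roughly like the
sketch below. This is only an illustration of the locking rule, not code
taken from any real caller; the function name and new_ctx are made up, and
it assumes the usual QEMU block layer headers and that it runs in the main
loop:

    /* Sketch only: hold the lock of the *old* context of bs, never the
     * new one, when switching the AioContext. */
    static void move_to_iothread(BlockDriverState *bs, AioContext *new_ctx)
    {
        AioContext *old_ctx = bdrv_get_aio_context(bs);

        if (old_ctx != qemu_get_aio_context()) {
            aio_context_acquire(old_ctx);
        }

        /*
         * bs ends up in new_ctx. bdrv_drained_end() inside drops the new
         * context's lock exactly once via AIO_WAIT_WHILE(), which only
         * balances out because we did not lock new_ctx here.
         */
        bdrv_try_set_aio_context(bs, new_ctx, &error_abort);

        if (old_ctx != qemu_get_aio_context()) {
            aio_context_release(old_ctx);
        }
    }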
> Just like below:
>
> <------>
> [Switching to thread 1 (Thread 0x7fd810bbef40 (LWP 533312))]
> (gdb) bt
> ...
> 3 0x00005601f6ea93aa in fdmon_poll_wait at ../util/fdmon-poll.c:80
> 4 0x00005601f6e81a1c in aio_poll at ../util/aio-posix.c:607
> 5 0x00005601f6dcde87 in bdrv_drained_end at ../block/io.c:496
> 6 0x00005601f6d798cd in bdrv_set_aio_context_ignore at ../block.c:6502
> 7 0x00005601f6d7996c in bdrv_set_aio_context_ignore at ../block.c:6472
> 8 0x00005601f6d79cb8 in bdrv_child_try_set_aio_context at ../block.c:6587
> 9 0x00005601f6da86f2 in blk_do_set_aio_context at
> ../block/block-backend.c:2026
> 10 0x00005601f6daa96d in blk_set_aio_context at ../block/block-backend.c:2047
> 11 0x00005601f6c71883 in virtio_scsi_hotplug at ../hw/scsi/virtio-scsi.c:831
Which version of QEMU are you using? In current git master, line 831 is
something entirely different.
Are you using something before commit c7040ff6? Because this is a commit
that fixed a virtio-scsi bug where it would hold the wrong lock before
calling blk_set_aio_context().
>
> [Switching to thread 4 (Thread 0x7fd8092e7700 (LWP 533315))]
> (gdb) bt
> ...
> 4 0x00005601f6eab6a8 in qemu_mutex_lock_impl at
> ../util/qemu-thread-posix.c:79
> 5 0x00005601f6e7ce88 in co_schedule_bh_cb at ../util/async.c:489
> 6 0x00005601f6e7c404 in aio_bh_poll at ../util/async.c:164
> 7 0x00005601f6e81a46 in aio_poll at ../util/aio-posix.c:659
> 8 0x00005601f6d5ccf3 in iothread_run at ../iothread.c:73
> 9 0x00005601f6eab512 in qemu_thread_start at ../util/qemu-thread-posix.c:521
> 10 0x00007fd80d7b84a4 in start_thread at pthread_create.c:456
> 11 0x00007fd80d4fad0f in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
> (gdb) f 4
> 4 0x00005601f6eab6a8 in qemu_mutex_lock_impl at
> ../util/qemu-thread-posix.c:79
> (gdb) p *mutex
> $2 = {lock = {__data = {__lock = 2, __count = 1, __owner = 533312, __nusers =
> 1,
> __kind = 1, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next
> = 0x0}},
> __size = "\002\000\000\000\001\000\000\000@#\b\000\001\000\000\000\001",
> '\000' <repeats 22 times>, __align = 4294967298}, initialized = true}
> <------>
>
> Therefore, bdrv_set_aio_context_ignore() should never poll while it
> holds the new context. In fact, commit e037c09c already explains why we
> cannot poll in bdrv_do_drained_end().
>
> Signed-off-by: Zhiyong Ye <yezhiyong@bytedance.com>
> ---
> block.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/block.c b/block.c
> index be083f389e..ebbea72d64 100644
> --- a/block.c
> +++ b/block.c
> @@ -6846,6 +6846,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
> GSList *parents_to_process = NULL;
> GSList *entry;
> BdrvChild *child, *parent;
> + int drained_end_counter = 0;
>
> g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>
> @@ -6907,7 +6908,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
> aio_context_release(old_context);
> }
>
> - bdrv_drained_end(bs);
> + bdrv_drained_end_no_poll(bs, &drained_end_counter);
>
> if (qemu_get_aio_context() != old_context) {
> aio_context_acquire(old_context);
> @@ -6915,6 +6916,7 @@ void bdrv_set_aio_context_ignore(BlockDriverState *bs,
> if (qemu_get_aio_context() != new_context) {
> aio_context_release(new_context);
> }
> + BDRV_POLL_WHILE(bs, qatomic_read(&drained_end_counter) > 0);
This would be wrong because bs is already in the new context, but you
wouldn't hold the lock for it. AIO_WAIT_WHILE() would try to drop the
lock for a context that isn't even locked, resulting in a crash.
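To illustrate, the main-loop branch of AIO_WAIT_WHILE() does roughly this
(a simplified paraphrase, not the literal macro body; see
include/block/aio-wait.h for the real thing):

    /* ctx must be locked exactly once when we get here */
    while (cond) {
        aio_context_release(ctx);
        aio_poll(qemu_get_aio_context(), true);
        aio_context_acquire(ctx);
    }

With your patch, new_context has already been released a few lines above,
so the aio_context_release() here would run on a lock we don't hold.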
> }
>
> static bool bdrv_parent_can_set_aio_context(BdrvChild *c, AioContext *ctx,
Kevin