qemu-stable
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1] coroutine: avoid inserting duplicate coroutine to co_queu


From: Eric Blake
Subject: Re: [PATCH v1] coroutine: avoid inserting duplicate coroutine to co_queue_wakeup
Date: Thu, 28 Mar 2024 07:40:14 -0500
User-agent: NeoMutt/20240201

On Mon, Mar 25, 2024 at 05:18:50PM +0800, zhuyangyang via wrote:
> If g_main_loop_run()/aio_poll() is called in the coroutine context,
> the pending coroutine may be woken up repeatedly, and the co_queue_wakeup
> may be disordered.
> 
> When the poll() syscall exited in g_main_loop_run()/aio_poll(), it means
> some listened events is completed. Therefore, the completion callback
> function is dispatched.
> 
> If this callback function needs to invoke aio_co_enter(), it will only
> wake up the coroutine (because we are already in coroutine context),
> which may cause that the data on this listening event_fd/socket_fd
> is not read/cleared. When the next poll () exits, it will be woken up again
> and inserted into the wakeup queue again.
> 
> For example, if TLS is enabled in NBD, the server will call g_main_loop_run()
> in the coroutine, and repeatedly wake up the io_read event on a socket.
> The call stack is as follows:
> 
> aio_co_enter()
> aio_co_wake()
> qio_channel_restart_read()
> aio_dispatch_handler()
> aio_dispatch_handlers()
> aio_dispatch()
> aio_ctx_dispatch()
> g_main_context_dispatch()
> g_main_loop_run()
> nbd_negotiate_handle_starttls()
> nbd_negotiate_options()
> nbd_negotiate()
> nbd_co_client_start()
> coroutine_trampoline()

zhuyangyang, do you have a reliable reproduction setup for how you
were able to trigger this?  Obviously, it only happens when TLS is
enabled (we aren't creating a g_main_loop_run for any other NBD
command), and only when the server is first starting to serve a
client; is this a case where you were hammering a long-running qemu
process running an NBD server with multiple clients trying to
reconnect to the server all near the same time?

If we can come up with a reliable formula for reproducing the
corrupted coroutine list, it would make a great iotest addition
alongside the existing qemu-iotests 233 for ensuring that NBD TLS
traffic is handled correctly in both server and client.

> 
> Signed-off-by: zhuyangyang <zhuyangyang14@huawei.com>

Side note: this appears to be your first qemu contribution (based on
'git shortlog --author zhuyangyang').  While I am not in a position to
presume how you would like your name Anglicized, I will point out that
the prevailing style is to separate given name from family name (just
because your username at work has no spaces does not mean that your
S-o-b has to follow suit).  It is also permissible to list your name
in native characters alongside or in place of the Anglicized version;
for example, 'git log --author="Stefano Dong"' shows this technique.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




reply via email to

[Prev in Thread] Current Thread [Next in Thread]