qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/1] nbd/server: do not poll within a coroutine context


From: Zhu Yangyang
Subject: Re: [PATCH v2 1/1] nbd/server: do not poll within a coroutine context
Date: Tue, 2 Apr 2024 16:53:51 +0800

On Mon, 1 Apr 2024 11:33:09AM -0500, Eric Blake wrote:
> On Mon, Apr 01, 2024 at 08:41:20PM +0800, Zhu Yangyang wrote:
> > Coroutines are not supposed to block. Instead, they should yield.
> > 
> > Fixes: f95910f ("nbd: implement TLS support in the protocol negotiation")
> > Signed-off-by: Zhu Yangyang <zhuyangyang14@huawei.com>
> > ---
> >  nbd/client.c       |  7 ++++---
> >  nbd/common.c       | 19 ++++++++++++++++---
> >  nbd/nbd-internal.h |  6 +++---
> >  nbd/server.c       | 10 +++++-----
> >  4 files changed, 28 insertions(+), 14 deletions(-)
> > 
> > diff --git a/nbd/client.c b/nbd/client.c
> > index 29ffc609a4..1ab91ed205 100644
> > --- a/nbd/client.c
> > +++ b/nbd/client.c
> > @@ -619,18 +619,19 @@ static QIOChannel *nbd_receive_starttls(QIOChannel 
> > *ioc,
> >          return NULL;
> >      }
> >      qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-client-tls");
> > -    data.loop = g_main_loop_new(g_main_context_default(), FALSE);
> >      trace_nbd_receive_starttls_tls_handshake();
> >      qio_channel_tls_handshake(tioc,
> > -                              nbd_tls_handshake,
> > +                              nbd_client_tls_handshake,
> >                                &data,
> >                                NULL,
> >                                NULL);
> >  
> >      if (!data.complete) {
> > +        data.loop = g_main_loop_new(g_main_context_default(), FALSE);
> >          g_main_loop_run(data.loop);
> > +        g_main_loop_unref(data.loop);
> >      }
> > -    g_main_loop_unref(data.loop);
> > +
> 
> Aha - you figured out an elegant way around the client code not being
> in coroutine context that had been stumping me, at least for the sake
> of a minimal patch.
> 
> >      if (data.error) {
> >          error_propagate(errp, data.error);
> >          object_unref(OBJECT(tioc));
> > diff --git a/nbd/common.c b/nbd/common.c
> > index 3247c1d618..01ca30a5c4 100644
> > --- a/nbd/common.c
> > +++ b/nbd/common.c
> > @@ -47,14 +47,27 @@ int nbd_drop(QIOChannel *ioc, size_t size, Error **errp)
> >  }
> >  
> >  
> > -void nbd_tls_handshake(QIOTask *task,
> > -                       void *opaque)
> > +void nbd_client_tls_handshake(QIOTask *task, void *opaque)
> >  {
> >      struct NBDTLSHandshakeData *data = opaque;
> >  
> >      qio_task_propagate_error(task, &data->error);
> >      data->complete = true;
> > -    g_main_loop_quit(data->loop);
> > +    if (data->loop) {
> > +        g_main_loop_quit(data->loop);
> > +    }
> > +}
> > +
> > +
> > +void nbd_server_tls_handshake(QIOTask *task, void *opaque)
> > +{
> > +    struct NBDTLSHandshakeData *data = opaque;
> > +
> > +    qio_task_propagate_error(task, &data->error);
> > +    data->complete = true;
> > +    if (!qemu_coroutine_entered(data->co)) {
> > +        aio_co_wake(data->co);
> > +    }
> >  }
> 
> This matches up with what I was experimenting with last week, in that
> server is in coroutine context while client is not.  However, it means
> we no longer have a common implementation between the two, so keeping
> the code isolated in nbd/common.c makes less sense than just placing
> the two specific callbacks in the two specific files right where their
> only use case exists (and even making them static at that point).

Yes, we can implement nbd_tls_handshake() on both client and server side.
It looks a lot clearer.

We can even extract the common code to an nbd_tls_handshake_complete().

nbd/common.c
void nbd_tls_handshake_complete(QIOTask *task, void *opaque) {
    struct NBDTLSHandshakeData *data = opaque;

    qio_task_propagate_error(task, &data->error);
    data->complete = true;
}

server.c / client.c
static void nbd_tls_handshake(QIOTask *task, void *opaque)
{
    struct NBDTLSHandshakeData *data = opaque;

    nbd_tls_handshake_complete(task, opaque);
    ...
}

> 
> Or, can we still merge it into one helper?  It looks like we now have
> 3 viable possibilities:
> 
> data->loop data->co
> non-NULL   non-NULL    impossible
> NULL       NULL        client, qio_task completed right away
> non-NULL   NULL        client, qio_task did not complete right away
> NULL       non-NULL    server, waking the coroutine depends on if we are in 
> one

This seems a little complicated.

> 
> With that, we can still get by with one function, but need good
> documentation.  I'll post a v3 along those lines, to see what you
> think.
> 
> >  
> >  
> > diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
> > index dfa02f77ee..99cca9382c 100644
> > --- a/nbd/nbd-internal.h
> > +++ b/nbd/nbd-internal.h
> > @@ -74,13 +74,13 @@ static inline int nbd_write(QIOChannel *ioc, const void 
> > *buffer, size_t size,
> >  
> >  struct NBDTLSHandshakeData {
> >      GMainLoop *loop;
> > +    Coroutine *co;
> >      bool complete;
> >      Error *error;
> >  };
> 
> I had tried to get rid of the GMainLoop *loop member altogether, but
> your change has the benefit of a smaller diff than what I was facing
> (I got lost in the weeds trying to see if I could convert all of the
> client into running in coroutine context).

I saw your reply and also tried to put the client in the coroutine context,
And then I found that the event listener is registered to the default 
main_context,
This means that we can't use aio_poll(ctx) and AIO_WAIT_WHILE() to wait for
the coroutine to complete.

GMainLoop *loop may not be circumvented.

g_source_attach(source, NULL)
qio_channel_add_watch_full()
qio_channel_tls_handshake_task()
qio_channel_tls_handshake()
nbd_receive_starttls()
nbd_start_negotiate()

> >  
> > -
> > -void nbd_tls_handshake(QIOTask *task,
> > -                       void *opaque);
> > +void nbd_server_tls_handshake(QIOTask *task, void *opaque);
> > +void nbd_client_tls_handshake(QIOTask *task, void *opaque);
> >  
> >  int nbd_drop(QIOChannel *ioc, size_t size, Error **errp);
> >  
> > diff --git a/nbd/server.c b/nbd/server.c
> > index c3484cc1eb..b218512ced 100644
> > --- a/nbd/server.c
> > +++ b/nbd/server.c
> > @@ -777,17 +777,17 @@ static QIOChannel 
> > *nbd_negotiate_handle_starttls(NBDClient *client,
> >  
> >      qio_channel_set_name(QIO_CHANNEL(tioc), "nbd-server-tls");
> >      trace_nbd_negotiate_handle_starttls_handshake();
> > -    data.loop = g_main_loop_new(g_main_context_default(), FALSE);
> > +    data.co = qemu_coroutine_self();
> >      qio_channel_tls_handshake(tioc,
> > -                              nbd_tls_handshake,
> > +                              nbd_server_tls_handshake,
> >                                &data,
> >                                NULL,
> >                                NULL);
> >  
> > -    if (!data.complete) {
> > -        g_main_loop_run(data.loop);
> > +    while (!data.complete) {
> > +        qemu_coroutine_yield();
> >      }
> > -    g_main_loop_unref(data.loop);
> > +
> >      if (data.error) {
> >          object_unref(OBJECT(tioc));
> >          error_propagate(errp, data.error);
> > -- 
> > 2.33.0
> >
> 
> Thanks for the updated patch - it looks like we are heading in a good 
> direction.
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization:  qemu.org | libguestfs.o

--
Best Regards,
Zhu Yangyang



reply via email to

[Prev in Thread] Current Thread [Next in Thread]