qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag


From: Peter Xu
Subject: Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
Date: Tue, 15 Aug 2017 17:47:08 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Tue, Aug 15, 2017 at 10:27:07AM +0100, Daniel P. Berrange wrote:
> On Tue, Aug 15, 2017 at 04:50:06PM +0800, Peter Xu wrote:
> > On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote:
> > > On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> > > > Store the task tag for migration types: tcp/unix/fd/exec in current
> > > > MigrationIncomingState struct.
> > > > 
> > > > For defered migration, no need to store task tag since there is no task
> > > > running in the main loop at all. For RDMA, let's mark it as todo.
> > > > 
> > > > Signed-off-by: Peter Xu <address@hidden>
> > > > ---
> > > >  migration/migration.c | 22 ++++++++++++++++++----
> > > >  migration/migration.h |  2 ++
> > > >  2 files changed, 20 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > index c9b7085..daf356b 100644
> > > > --- a/migration/migration.c
> > > > +++ b/migration/migration.c
> > > > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
> > > >          mis->from_src_file = NULL;
> > > >      }
> > > >  
> > > > +    mis->listen_task_tag = 0;
> > > >      qemu_event_destroy(&mis->main_thread_load_event);
> > > >  }
> > > >  
> > > > @@ -265,25 +266,31 @@ int 
> > > > migrate_send_rp_req_pages(MigrationIncomingState *mis, const char 
> > > > *rbname,
> > > >  void qemu_start_incoming_migration(const char *uri, Error **errp)
> > > >  {
> > > >      const char *p;
> > > > +    guint task_tag = 0;
> > > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > > >  
> > > >      qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
> > > >      if (!strcmp(uri, "defer")) {
> > > >          deferred_incoming_migration(errp);
> > > >      } else if (strstart(uri, "tcp:", &p)) {
> > > > -        tcp_start_incoming_migration(p, errp);
> > > > +        task_tag = tcp_start_incoming_migration(p, errp);
> > > >  #ifdef CONFIG_RDMA
> > > >      } else if (strstart(uri, "rdma:", &p)) {
> > > > +        /* TODO: store task tag for RDMA migrations */
> > > >          rdma_start_incoming_migration(p, errp);
> > > >  #endif
> > > >      } else if (strstart(uri, "exec:", &p)) {
> > > > -        exec_start_incoming_migration(p, errp);
> > > > +        task_tag = exec_start_incoming_migration(p, errp);
> > > >      } else if (strstart(uri, "unix:", &p)) {
> > > > -        unix_start_incoming_migration(p, errp);
> > > > +        task_tag = unix_start_incoming_migration(p, errp);
> > > >      } else if (strstart(uri, "fd:", &p)) {
> > > > -        fd_start_incoming_migration(p, errp);
> > > > +        task_tag = fd_start_incoming_migration(p, errp);
> > > >      } else {
> > > >          error_setg(errp, "unknown migration protocol: %s", uri);
> > > > +        return;
> > > >      }
> > > > +
> > > > +    mis->listen_task_tag = task_tag;
> > > 
> > > This is unsafe as currently written. The main loop watches that are
> > > associated with these tags are all unregistered when incoming migration
> > > starts. So if you save them, and then unregister later when postcopy
> > > is "stuck", then you're likely unregistering a tag associated with
> > > some other random part of QEMU, because the saved value is no longer
> > > valid.
> > 
> > IIUC for incoming migration, the watch is not detached until
> > migration_fd_process_incoming() completes. So:
> > 
> > - for precopy, the watch should be detached when migration completes
> > 
> > - for postcopy, the watch should be detached when postcopy starts,
> >   then main loop thread returns, while the ram_load_thread starts to
> >   continue the migration.
> > 
> > So basically the watch is detaching itself after
> > migration_fd_process_incoming() completes. To handle that, this patch
> > has this change:
> > 
> > @@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
> >          co = qemu_coroutine_create(process_incoming_migration_co, f);
> >          qemu_coroutine_enter(co);
> >      }
> > +
> > +    /*
> > +     * When reach here, we should not need the listening port any
> > +     * more. We'll detach the listening task soon, let's reset the
> > +     * listen task tag.
> > +     */
> > +    mis->listen_task_tag = 0;
> > 
> > Would this make sure the listen_task_tag is always valid if nonzero?
> 
> Mixing 'return FALSE' from callbacks, with g_source_remove is a recipe
> for hard to diagnose bugs, so should be avoided in general.

Makes sense.

> 
> So, IMHO, you should change all the callbacks to 'return TRUE' so that
> they keep the watch registered, and then explicitly call g_source_remove()
> passing the listen tag, when incoming migration starts.

Yes this sounds better.  Thanks!

-- 
Peter Xu



reply via email to

[Prev in Thread] Current Thread [Next in Thread]