qemu-devel

Re: [Qemu-devel] [PATCH RFC 0/2] Fix migration issues


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH RFC 0/2] Fix migration issues
Date: Fri, 26 Oct 2018 16:24:58 +0100
User-agent: Mutt/1.10.1 (2018-07-13)

* Peter Xu (address@hidden) wrote:
> On Fri, Oct 26, 2018 at 09:10:19PM +0800, Fei Li wrote:
> > 
> > 
> > On 10/25/2018 08:58 PM, Peter Xu wrote:
> > > On Thu, Oct 25, 2018 at 05:04:00PM +0800, Fei Li wrote:
> > > 
> > > [...]
> > > 
> > > > @@ -1325,22 +1325,24 @@ bool multifd_recv_all_channels_created(void)
> > > >   /* Return true if multifd is ready for the migration, otherwise false */
> > > >   bool multifd_recv_new_channel(QIOChannel *ioc)
> > > >   {
> > > > +    MigrationIncomingState *mis = migration_incoming_get_current();
> > > >       MultiFDRecvParams *p;
> > > >       Error *local_err = NULL;
> > > >       int id;
> > > > 
> > > >       id = multifd_recv_initial_packet(ioc, &local_err);
> > > >       if (id < 0) {
> > > > -        multifd_recv_terminate_threads(local_err);
> > > > -        return false;
> > > > +        error_reportf_err(local_err,
> > > > +                          "failed to receive packet via multifd channel %x: ",
> > > > +                          multifd_recv_state->count);
> > > > +        goto fail;
> > > >       }
> > > > 
> > > >       p = &multifd_recv_state->params[id];
> > > >       if (p->c != NULL) {
> > > >           error_setg(&local_err, "multifd: received id '%d' already setup'",
> > > >                      id);
> > > > -        multifd_recv_terminate_threads(local_err);
> > > > -        return false;
> > > > +        goto fail;
> > > >       }
> > > >       p->c = ioc;
> > > >       object_ref(OBJECT(ioc));
> > > > @@ -1352,6 +1354,11 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
> > > >                          QEMU_THREAD_JOINABLE);
> > > >       atomic_inc(&multifd_recv_state->count);
> > > >       return multifd_recv_state->count == migrate_multifd_channels();
> > > > +fail:
> > > > +    multifd_recv_terminate_threads(local_err);
> > > > +    qemu_fclose(mis->from_src_file);
> > > > +    mis->from_src_file = NULL;
> > > > +    exit(EXIT_FAILURE);
> > > >   }
> > > Yeah I think it makes sense to at least report some details when error
> > > happens, but I'm not sure whether it's good to explicitly exit() here.
> > > IMHO you can add an Error** in multifd_recv_new_channel() parameter
> > > list to do that, and even through migration_ioc_process_incoming().
> > > What do you think?
> > > 
> > > Regards,
> > > 
> > Do you mean exit() in migration_ioc_process_incoming(), or in its
> > caller migration_channel_process_incoming()? Either is fine with
> > me. :) But today I found that when postcopy and multifd are used
> > together for live migration, the hang still occurs even with the
> > above code, which is unfortunate. I will keep debugging and see
> > how to fix this.
> 
> Maybe you can move the error_report_err() in
> migration_channel_process_incoming() out of the TLS path so we can
> report the error when either the TLS or the non-TLS case goes wrong.
> 
> And I don't even know whether multifd could work with postcopy...

Nope, it's not expected to work yet.

Dave

> Regards,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
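
The suggestion quoted above (pass an Error ** down through
multifd_recv_new_channel() and migration_ioc_process_incoming(), and report
once in migration_channel_process_incoming() for both the TLS and non-TLS
paths) could look roughly like the sketch below. This is only an
illustration under stated assumptions: the Error struct, sketch_error_set()
and every sketch_* function are hypothetical stand-ins, not QEMU's real types
or signatures, and the return convention is simplified to "true on success".

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for QEMU's Error object (assumption: not the real definition). */
typedef struct Error {
    char msg[128];
} Error;

/* Hypothetical helper, loosely modelled on error_setg(). */
static void sketch_error_set(Error **errp, const char *msg)
{
    if (errp == NULL || *errp != NULL) {
        return; /* caller ignores errors, or one is already pending */
    }
    Error *err = calloc(1, sizeof(*err));
    if (err == NULL) {
        abort();
    }
    snprintf(err->msg, sizeof(err->msg), "%s", msg);
    *errp = err;
}

/* Stand-in for multifd_recv_new_channel(): fail via errp, never exit(). */
static bool sketch_recv_new_channel(int id, Error **errp)
{
    if (id < 0) { /* simulates a bad initial packet */
        sketch_error_set(errp, "multifd: failed to receive initial packet");
        return false;
    }
    return true;
}

/* Stand-in for migration_ioc_process_incoming(): only forwards errp. */
static bool sketch_ioc_process_incoming(int id, Error **errp)
{
    return sketch_recv_new_channel(id, errp);
}

/* Stand-in for the TLS setup path; it also fails for a negative id. */
static bool sketch_tls_setup(int id, Error **errp)
{
    if (id < 0) {
        sketch_error_set(errp, "tls: channel setup failed");
        return false;
    }
    return true;
}

/* Stand-in for migration_channel_process_incoming(): one report site. */
static int sketch_channel_process_incoming(int id, bool use_tls)
{
    Error *local_err = NULL;

    if (use_tls) {
        sketch_tls_setup(id, &local_err);
    } else {
        sketch_ioc_process_incoming(id, &local_err);
    }

    /* Reported outside both branches, so neither path fails silently. */
    if (local_err != NULL) {
        fprintf(stderr, "migration: %s\n", local_err->msg);
        free(local_err);
        return -1;
    }
    return 0;
}
```

With this shape, the top-level caller decides what a failure means (report,
clean up, or terminate), while the lower layers only describe what went wrong.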


