Re: [Qemu-devel] [PATCH RFC 0/2] Fix migration issues
From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH RFC 0/2] Fix migration issues
Date: Fri, 26 Oct 2018 14:35:46 +0100
User-agent: Mutt/1.10.1 (2018-07-13)
On Fri, Oct 26, 2018 at 09:10:19PM +0800, Fei Li wrote:
>
>
> On 10/25/2018 08:58 PM, Peter Xu wrote:
> > On Thu, Oct 25, 2018 at 05:04:00PM +0800, Fei Li wrote:
> >
> > [...]
> >
> > > @@ -1325,22 +1325,24 @@ bool multifd_recv_all_channels_created(void)
> > > /* Return true if multifd is ready for the migration, otherwise false */
> > > bool multifd_recv_new_channel(QIOChannel *ioc)
> > > {
> > > + MigrationIncomingState *mis = migration_incoming_get_current();
> > > MultiFDRecvParams *p;
> > > Error *local_err = NULL;
> > > int id;
> > >
> > > id = multifd_recv_initial_packet(ioc, &local_err);
> > > if (id < 0) {
> > > - multifd_recv_terminate_threads(local_err);
> > > - return false;
> > > + error_reportf_err(local_err,
> > > + "failed to receive packet via multifd channel %x: ",
> > > + multifd_recv_state->count);
> > > + goto fail;
> > > }
> > >
> > > p = &multifd_recv_state->params[id];
> > > if (p->c != NULL) {
> > > error_setg(&local_err, "multifd: received id '%d' already setup'",
> > > id);
> > > - multifd_recv_terminate_threads(local_err);
> > > - return false;
> > > + goto fail;
> > > }
> > > p->c = ioc;
> > > object_ref(OBJECT(ioc));
> > > @@ -1352,6 +1354,11 @@ bool multifd_recv_new_channel(QIOChannel *ioc)
> > > QEMU_THREAD_JOINABLE);
> > > atomic_inc(&multifd_recv_state->count);
> > > return multifd_recv_state->count == migrate_multifd_channels();
> > > +fail:
> > > + multifd_recv_terminate_threads(local_err);
> > > + qemu_fclose(mis->from_src_file);
> > > + mis->from_src_file = NULL;
> > > + exit(EXIT_FAILURE);
> > > }
> > Yeah, I think it makes sense to at least report some details when an
> > error happens, but I'm not sure whether it's good to call exit()
> > explicitly here. IMHO you could add an Error ** parameter to
> > multifd_recv_new_channel() to do that, and pass it up through
> > migration_ioc_process_incoming() as well. What do you think?
> >
> > Regards,
> >
> You mean exit() in migration_ioc_process_incoming(), or in its
> caller migration_channel_process_incoming()? Actually either is
> OK for me. :) But today I found that when postcopy and multifd are
> used together for live migration, the hang still seems to occur even
> with the above code, which is a pity. I will keep debugging and see
> how to fix this.
Maybe you can move the error_report_err() in
migration_channel_process_incoming() out of the TLS path, so that we
can report the error when either the TLS or the non-TLS case goes wrong.
And I don't even know whether multifd could work with postcopy...
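
To make the error-reporting idea concrete, here is a rough standalone sketch
of what I have in mind. It does not use the real QEMU types: the Error struct
and every sketch_* function below are simplified, hypothetical stand-ins for
error_setg(), error_report_err(), multifd_recv_new_channel(),
migration_ioc_process_incoming() and migration_channel_process_incoming(),
and the integer id replaces the QIOChannel. The point is only the shape:
callees fill in an Error ** instead of calling exit(), and the top-level
function reports once for both paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-in for QEMU's Error object; not the real type. */
typedef struct Error {
    char msg[128];
} Error;

/* Stand-in for error_setg(): record a message if the caller asked for one. */
void sketch_error_set(Error **errp, const char *msg)
{
    if (errp == NULL) {
        return;                     /* caller does not want the details */
    }
    assert(*errp == NULL);          /* never overwrite a pending error */
    *errp = calloc(1, sizeof(Error));
    if (*errp == NULL) {
        abort();
    }
    snprintf((*errp)->msg, sizeof((*errp)->msg), "%s", msg);
}

/* Stand-in for error_report_err(): print the message, then free the error. */
void sketch_error_report(Error *err)
{
    fprintf(stderr, "%s\n", err->msg);
    free(err);
}

/* Hypothetical analogue of multifd_recv_new_channel(): no exit(), fill errp. */
bool sketch_recv_new_channel(int id, Error **errp)
{
    if (id < 0) {
        sketch_error_set(errp, "multifd: failed to receive initial packet");
        return false;
    }
    return true;
}

/* Hypothetical analogue of migration_ioc_process_incoming(): pass errp down. */
bool sketch_ioc_process_incoming(int id, Error **errp)
{
    return sketch_recv_new_channel(id, errp);
}

/* Hypothetical TLS path; for illustration it also fails on a negative id. */
bool sketch_tls_process_incoming(int id, Error **errp)
{
    if (id < 0) {
        sketch_error_set(errp, "tls: handshake setup failed");
        return false;
    }
    return true;
}

/* Hypothetical analogue of migration_channel_process_incoming(). */
bool sketch_channel_process_incoming(int id, bool use_tls)
{
    Error *local_err = NULL;

    if (use_tls) {
        sketch_tls_process_incoming(id, &local_err);
    } else {
        sketch_ioc_process_incoming(id, &local_err);
    }

    /* One report site, outside the TLS branch, covers both paths. */
    if (local_err) {
        sketch_error_report(local_err);
        return false;
    }
    return true;
}
```

With this shape, sketch_channel_process_incoming(-1, false) and
sketch_channel_process_incoming(-1, true) both print a message and return
false, and the decision whether to exit is left to the outermost caller.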
Regards,
--
Peter Xu