[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v4 33/47] Postcopy: Postcopy startup in migratio
From: |
Dr. David Alan Gilbert |
Subject: |
Re: [Qemu-devel] [PATCH v4 33/47] Postcopy: Postcopy startup in migration thread |
Date: |
Thu, 20 Nov 2014 17:12:44 +0000 |
User-agent: |
Mutt/1.5.23 (2014-03-12) |
* Paolo Bonzini (address@hidden) wrote:
> Il 03/10/2014 19:47, Dr. David Alan Gilbert (git) ha scritto:
> > From: "Dr. David Alan Gilbert" <address@hidden>
> >
> > Rework the migration thread to setup and start postcopy.
> >
> > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > ---
> > include/migration/migration.h | 3 +
> > migration.c | 201
> > ++++++++++++++++++++++++++++++++++++++----
> > 2 files changed, 185 insertions(+), 19 deletions(-)
> >
> > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > index b01cc17..f401775 100644
> > --- a/include/migration/migration.h
> > +++ b/include/migration/migration.h
<snip>
> > +/* Switch from normal iteration to postcopy
> > + * Returns non-0 on error
> > + */
> > +static int postcopy_start(MigrationState *ms)
> > +{
> > + int ret;
> > + const QEMUSizedBuffer *qsb;
> > + migrate_set_state(ms, MIG_STATE_ACTIVE, MIG_STATE_POSTCOPY_ACTIVE);
> > +
> > + DPRINTF("postcopy_start\n");
> > + qemu_mutex_lock_iothread();
> > + DPRINTF("postcopy_start: setting run state\n");
> > + ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
> > +
> > + if (ret < 0) {
> > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR);
> > + qemu_mutex_unlock_iothread();
> > + return -1;
>
> Please use "goto" for error returns, like
>
> fail_locked:
> qemu_mutex_unlock_iothread();
> fail:
> migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR);
> return -1;
Done; they all end up unlocking, but I've got another label
for a case that has to close the fb later.
> > + }
> > +
> > + /*
> > + * in Finish migrate and with the io-lock held everything should
> > + * be quiet, but we've potentially still got dirty pages and we
> > + * need to tell the destination to throw any pages it's already
> > received
> > + * that are dirty
> > + */
> > + if (ram_postcopy_send_discard_bitmap(ms)) {
> > + DPRINTF("postcopy send discard bitmap failed\n");
> > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR);
> > + qemu_mutex_unlock_iothread();
> > + return -1;
> > + }
> > +
> > + DPRINTF("postcopy_start: sending req 2\n");
> > + qemu_savevm_send_reqack(ms->file, 2);
>
> Perhaps move it below qemu_file_set_rate_limit, and add
> trace_qemu_savevm_send_reqack?
Trace added, and also moved as requested - was the request to move
it just to elimintate the other DPRINTF?
> Also what is 2/3/4? Is this just for debugging or is it part of the
> protocol?
Debug; they're very useful for matching the debug streams up, especially
when the timers on the two hosts are very different.
(I'm up for suggestions on how to mark the 2/3/4 for debug more clearly,
especially if it meant that it didn't make the ping (ne reqack) dedicated
to debug).
> > + /*
> > + * send rest of state - note things that are doing postcopy
> > + * will notice we're in MIG_STATE_POSTCOPY_ACTIVE and not actually
> > + * wrap their state up here
> > + */
> > + qemu_file_set_rate_limit(ms->file, INT64_MAX);
> > + DPRINTF("postcopy_start: do state_complete\n");
> > +
> > + /*
> > + * We need to leave the fd free for page transfers during the
> > + * loading of the device state, so wrap all the remaining
> > + * commands and state into a package that gets sent in one go
> > + */
>
> The comments in the code are very nice. Thanks. This is a huge
> improvement from the last version I received.
>
> > + QEMUFile *fb = qemu_bufopen("w", NULL);
> > + if (!fb) {
> > + error_report("Failed to create buffered file");
> > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR);
> > + qemu_mutex_unlock_iothread();
> > + return -1;
> > + }
> > +
> > + qemu_savevm_state_complete(fb);
> > + DPRINTF("postcopy_start: sending req 3\n");
> > + qemu_savevm_send_reqack(fb, 3);
> > +
> > + qemu_savevm_send_postcopy_ram_run(fb);
> > +
> > + /* <><> end of stuff going into the package */
> > + qsb = qemu_buf_get(fb);
> > +
> > + /* Now send that blob */
> > + if (qsb_get_length(qsb) > MAX_VM_CMD_PACKAGED_SIZE) {
> > + DPRINTF("postcopy_start: Unreasonably large packaged state: %lu\n",
> > + (unsigned long)(qsb_get_length(qsb)));
> > + migrate_set_state(ms, MIG_STATE_POSTCOPY_ACTIVE, MIG_STATE_ERROR);
> > + qemu_mutex_unlock_iothread();
> > + qemu_fclose(fb);
>
> Close fb above migrate_set_state, and use goto as above. Or just have
> three labels.
Done, it's a separate label.
>
> > + return -1;
> > + }
> > + qemu_savevm_send_packaged(ms->file, qsb);
> > + qemu_fclose(fb);
> > +
> > + qemu_mutex_unlock_iothread();
> > +
> > + DPRINTF("postcopy_start not finished sending ack\n");
> > + qemu_savevm_send_reqack(ms->file, 4);
> > +
> > + ret = qemu_file_get_error(ms->file);
> > + if (ret) {
> > + error_report("postcopy_start: Migration stream errored");
>
> This should have been reported already.
No, sorry - I don't trust qemu_file reporting errors by itself.
Dave
(Again, the rest of the comments on this patch can wait for another mail)
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK