
Re: [Qemu-devel] [RFC 16/29] qmp: hmp: add migrate "resume" option


From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [RFC 16/29] qmp: hmp: add migrate "resume" option
Date: Tue, 1 Aug 2017 12:03:48 +0100
User-agent: Mutt/1.8.3 (2017-05-23)

On Fri, Jul 28, 2017 at 04:06:25PM +0800, Peter Xu wrote:
> It will be used when we want to resume one paused migration.
> 
> Signed-off-by: Peter Xu <address@hidden>
> ---
>  hmp-commands.hx       | 7 ++++---
>  hmp.c                 | 4 +++-
>  migration/migration.c | 2 +-
>  qapi-schema.json      | 5 ++++-
>  4 files changed, 12 insertions(+), 6 deletions(-)

I'm not seeing explicit info about how we handle the original failure
and how it relates to this resume command, but this feels like a
potentially racy approach to me.

If we have a network problem between source & target, we could see
two results. Either the TCP stream will simply hang (it'll still
appear open to QEMU but no traffic will be flowing), or the connection
may actually break such that we get EOF and end up closing the file
descriptor.
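
To make the two outcomes concrete, here is a minimal sketch of how they
look to a reader of the socket. It is not QEMU code: the probe() and
demo() helpers are hypothetical names, and a local socketpair stands in
for the migration TCP stream. Note that a read timeout only shows that
no data arrived within the window, so it cannot prove the stream is
permanently hung.

```python
import socket


def probe(sock, timeout=0.2):
    """Classify one read attempt as 'data', 'eof', or 'hang'."""
    sock.settimeout(timeout)
    try:
        chunk = sock.recv(4096)
    except socket.timeout:
        # Still open as far as we can tell, but nothing arrived in time.
        return "hang"
    # An empty read means the peer closed the connection (EOF).
    return "data" if chunk else "eof"


def demo():
    # Case 1: the peer stays open but sends nothing.
    a, b = socket.socketpair()
    hung = probe(a)
    # Case 2: the peer closes, so the reader sees EOF.
    b.close()
    broken = probe(a)
    a.close()
    return hung, broken
```

In the "hang" case the descriptor is still valid, which is exactly why
the old stream could start delivering data again later.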

In the latter case, we're ok because the original channel is now
gone and we can safely establish the new one by issuing the new
'migrate --resume URI' command.

In the former case, however, there is the possibility that the
hung connection may come back to life at some point, concurrently
with us trying to do 'migrate --resume URI', and I'm unclear on the
semantics if that happens.

Should the original connection carry on, and thus cause the
'migrate --resume' command to fail, or will we forcibly terminate
the original connection no matter what and use the new "resumed"
connection?

There's also synchronization with the target host - at the time we
want to recover, we need to be able to tell the target to accept
new incoming clients again, but we don't want to do that if the
original connection comes back to life.

It feels to me that if the mgmt app or admin believes the migration
is in a stuck state, we should be able to explicitly terminate the
existing connection via a monitor command, then set up the target
host to accept a new client, and then issue this migrate resume on
the source.
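
The ordering suggested here can be sketched as a toy model. This is an
illustration only, under the assumption that resume is refused while
the old channel is still open: ToySource, terminate_channel() and
resume() are hypothetical names, not existing QEMU monitor commands or
the behaviour of this patch.

```python
from enum import Enum, auto


class ChannelState(Enum):
    ACTIVE = auto()   # traffic flowing, or hung but still open
    CLOSED = auto()   # EOF seen, or explicitly terminated


class ToySource:
    """Toy model of the source side; not QEMU code."""

    def __init__(self):
        self.channel = ChannelState.ACTIVE
        self.resumed = False

    def terminate_channel(self):
        # Hypothetical explicit "kill the old connection" step.
        self.channel = ChannelState.CLOSED

    def resume(self, target_listening):
        # Refuse to resume while the old channel could still come back.
        if self.channel is not ChannelState.CLOSED:
            raise RuntimeError("old channel still open; resume would race")
        if not target_listening:
            raise RuntimeError("target is not accepting new clients")
        self.channel = ChannelState.ACTIVE
        self.resumed = True
```

With this ordering, calling resume() on a hung but open channel fails
deterministically, and it only succeeds after terminate_channel() and
once the target is listening again.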


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


