qemu-devel

From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [PATCH 0/17 v3] Localhost migration with side channel for ram
Date: Fri, 22 Nov 2013 11:36:47 +0000
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Nov 22, 2013 at 07:29:05PM +0800, Lei Li wrote:
> On 11/21/2013 06:19 PM, Daniel P. Berrange wrote:
> >On Thu, Nov 21, 2013 at 05:11:23PM +0800, Lei Li wrote:
> >In discussions about supporting this for libvirt, we were told that
> >when this localhost migration fails, you cannot re-start the guest
> >on the original source QEMU.
> >
> >If this is true, this implementation is not satisfactory IMHO. One
> >of the main motivations of this feature is to allow for in-place
> >live upgrades of QEMU binaries, for people who can't tolerate the
> >downtime of restarting their guests, and who don't have a spare
> >host to migrate them to.
> >
> >If people are using this because they can't tolerate any downtime
> >of the guest, then we need to be able to fully deal with failure to
> >complete migration by switching back to the original QEMU process,
> >as we can do with normal non-localhost migration.
> 
> Hi Daniel,
> 
> Page flipping is introduced here not primarily for low downtime, but
> to avoid requiring enough free memory to hold an additional copy of
> the largest guest, which is what current localhost migration requires
> today, as Anthony explained for the first version of the proposal [1].
> 
> Of course, low downtime also matters for page flipping migration,
> since its use case is 'live' upgrade of a running QEMU instance, so
> we expect page flipping through vmsplice to be fast enough for that.
> In this initial implementation the downtime is not yet good, but we
> are working on it, as there has been some work on the kernel side [2].
> 
> During page flipping migration, the RAM pages of the source guest are
> flipped to the destination, which is why the source guest cannot be
> resumed. AFAICT, page flipping migration may fail at the connection
> stage (including the exchange of the pipe fd) or at the migration
> registration stage (say, a blocker such as a device that does not
> support migration), but the source can be resumed in those cases
> since its memory has not yet been flipped. Once the connection is
> successfully set up, migration proceeds to the transmission of RAM
> pages, which hardly ever fails. For failure handling in libvirt,
> ZhengSheng has proposed restarting the old QEMU instead of resuming
> it. I know 'hardly' is not a good answer to your concern, but IMO it
> is the cost of limited memory.

If you can flip the pages in one direction, then you can surely
flip them back in the other direction upon failure. Suggesting
people restart QEMU upon failure is just not an acceptable
"recovery" strategy, since it does not in fact recover anything
useful from the user's POV. You've lost all the state of whatever
was running.
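
To illustrate the rollback argument, here is a minimal sketch in Python.
It is a toy model of page ownership, not QEMU code: the names Qemu, flip,
migrate and fail_at are all hypothetical, and it assumes the destination
has not modified any page it received before the failure is detected.

```python
from dataclasses import dataclass, field


@dataclass
class Qemu:
    """Hypothetical stand-in for a QEMU process that owns guest RAM pages."""
    name: str
    pages: dict = field(default_factory=dict)  # page index -> page contents


class TransferError(Exception):
    """Raised when the (simulated) page transfer fails part-way through."""


def flip(src, dst, index):
    """Move ownership of one page; the source no longer holds it afterwards."""
    dst.pages[index] = src.pages.pop(index)


def migrate(src, dst, fail_at=None):
    """Flip every page from src to dst, flipping them back on failure.

    fail_at is a hypothetical fault-injection hook: the page index at which
    the transfer fails. Returns the process that should keep running the guest.
    """
    moved = []
    try:
        # sorted() copies the keys, so popping pages inside the loop is safe.
        for index in sorted(src.pages):
            if index == fail_at:
                raise TransferError(f"transfer failed at page {index}")
            flip(src, dst, index)
            moved.append(index)
    except TransferError:
        # Undo in reverse order so the source regains every page it gave up.
        for index in reversed(moved):
            flip(dst, src, index)
        return src
    return dst
```

With a failure injected part-way through, migrate() hands every page back
and returns the source, so the guest resumes there with its state intact
rather than being restarted from scratch.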

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


