qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 20/45] Modify savevm handlers for postcopy


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v5 20/45] Modify savevm handlers for postcopy
Date: Thu, 26 Mar 2015 11:44:36 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

* David Gibson (address@hidden) wrote:
> On Wed, Mar 25, 2015 at 04:40:11PM +0000, Dr. David Alan Gilbert wrote:
> > * Dr. David Alan Gilbert (address@hidden) wrote:
> > > * David Gibson (address@hidden) wrote:
> > > > On Tue, Mar 24, 2015 at 08:04:14PM +0000, Dr. David Alan Gilbert wrote:
> > > > > * David Gibson (address@hidden) wrote:
> > > > > > On Fri, Mar 20, 2015 at 12:37:59PM +0000, Dr. David Alan Gilbert 
> > > > > > wrote:
> > > > > > > * David Gibson (address@hidden) wrote:
> > > > > > > > On Fri, Mar 13, 2015 at 10:19:54AM +0000, Dr. David Alan 
> > > > > > > > Gilbert wrote:
> > > > > > > > > * David Gibson (address@hidden) wrote:
> > > > > > > > > > On Wed, Feb 25, 2015 at 04:51:43PM +0000, Dr. David Alan 
> > > > > > > > > > Gilbert (git) wrote:
> > > > > > > > > > > From: "Dr. David Alan Gilbert" <address@hidden>
> > > > > > > > > > > 
> > > > > > > > > > > Modify save_live_pending to return separate postcopiable 
> > > > > > > > > > > and
> > > > > > > > > > > non-postcopiable counts.
> > > > > > > > > > > 
> > > > > > > > > > > Add 'can_postcopy' to allow a device to state if it can 
> > > > > > > > > > > postcopy
> > > > > > > > > > 
> > > > > > > > > > What's the purpose of the can_postcopy callback?  There are 
> > > > > > > > > > no callers
> > > > > > > > > > in this patch - is it still necessary with the change to
> > > > > > > > > > save_live_pending?
> > > > > > > > > 
> > > > > > > > > The patch 'qemu_savevm_state_complete: Postcopy changes' uses
> > > > > > > > > it in qemu_savevm_state_postcopy_complete and 
> > > > > > > > > qemu_savevm_state_complete
> > > > > > > > > to decide which devices must be completed at that point.
> > > > > > > > 
> > > > > > > > Couldn't they check for non-zero postcopiable state from
> > > > > > > > save_live_pending instead?
> > > > > > > 
> > > > > > > That would be a bit weird.
> > > > > > > 
> > > > > > > At the moment for each device we call the:
> > > > > > >        save_live_setup method (from qemu_savevm_state_begin)
> > > > > > > 
> > > > > > >    0...multiple times we call:
> > > > > > >        save_live_pending
> > > > > > >        save_live_iterate
> > > > > > > 
> > > > > > >    and then we always call
> > > > > > >        save_live_complete
> > > > > > > 
> > > > > > > 
> > > > > > > To my mind we have to call save_live_complete for any device
> > > > > > > that we've called save_live_setup on (maybe it allocated something
> > > > > > > in _setup that it clears up in _complete).
> > > > > > > 
> > > > > > > save_live_pending could perfectly well return 0 remaining at the 
> > > > > > > end of
> > > > > > > the migrate for our device, and thus if we used that then we 
> > > > > > > wouldn't
> > > > > > > call save_live_complete.
> > > > > > 
> > > > > > Um.. I don't follow.  I was suggesting that at the precopy->postcopy
> > > > > > transition point you call save_live_complete for everything that
> > > > > > reports 0 post-copiable state.
> > > > > > 
> > > > > > 
> > > > > > Then again, a different approach would be to split the
> > > > > > save_live_complete hook into (possibly NULL) "complete precopy" and
> > > > > > "complete postcopy" hooks.  The core would ensure that every chunk 
> > > > > > of
> > > > > > state has both completion hooks called (unless NULL).  That might 
> > > > > > also
> > > > > > address my concerns about the no longer entirely accurate
> > > > > > save_live_complete function name.
> > > > > 
> > > > > OK, that one I prefer.  Are you OK with:
> > > > >     qemu_savevm_state_complete_precopy
> > > > >        calls -> save_live_complete_precopy
> > > > > 
> > > > >     qemu_savevm_state_complete_postcopy
> > > > >        calls -> save_live_complete_postcopy
> > > > > 
> > > > > ?
> > > > 
> > > > Sounds ok to me.  Fwiw, I was thinking that both the complete_precopy
> > > > and complete_postcopy hooks should always be called.  For a
> > > > non-postcopy migration, the postcopy hooks would just be called
> > > > immediately after the precopy hooks.
> > > 
> > > OK, I've made the change as described in my last mail; but I haven't 
> > > called
> > > the complete_postcopy hook in the precopy case.  If it was as simple as 
> > > making
> > > all devices use one or the other then it would work, however there are
> > > existing (precopy) assumptions about ordering of device state on the wire 
> > > that
> > > I want to be careful not to alter; for example RAM must come first is the 
> > > one
> > > I know.
> > 
> > Actually, I spoke too soon; testing this found a bad breakage.
> > 
> > the functions in savevm.c add the per-section headers, and then call the 
> > _complete
> > methods on the devices.  Those _complete methods can't elect to do nothing, 
> > because
> > a header has already been planted.
> 
> Hrm.. couldn't you move the test for presence of the hook earlier so
> you don't sent the header if the hook is NULL?

There's two tests that you have to make:
      a) in qemu_savevm_state_complete_precopy do you call 
save_live_complete_precopy
      b) in qemu_savevm_state_complete_postcopy do you call 
save_live_complete_postcopy

The obvious case is if either hook is NULL you don't call it.
(a) is the harder cases, if we're doing postcopy then we don't want to call the
   save_live_complete_precopy method on a device which isn't expecting to 
complete until
   postcopy.
   The code in qemu_savevm_state_complete_precopy checks for the presence of 
the *postcopy*
   hook, and doesn't emit the header or call the precopy commit if the postcopy 
hook
   is present and we're in postcopy.

> > I've ended up with something between the two;  we still have a 
> > complete_precopy and
> > complete_postcopy method on the devices; if the complete_postcopy method 
> > exists and
> > we're in postcopy mode, the complete_precopy method isn't called at all.
> > A device could decide to do something different in complete_postcopy from 
> > complete_precopy
> > but it must do something to complete the section.
> > Effectively the presence of the complete_postcopy is now doing what
> > can_postcopy() used to do.
> 
> Hmm.. but it means there's no per-device hook for the precopy to
> postcopy transition point.  I'm not sure if that might matter.

This is true, but if we needed a generic hook for that (which might be useful)
it probably shouldn't be 'complete'.

Dave

> 
> -- 
> David Gibson                  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au        | minimalist, thank you.  NOT _the_ 
> _other_
>                               | _way_ _around_!
> http://www.ozlabs.org/~dgibson


--
Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]