Re: [Qemu-devel] [PATCH v5 01/45] Start documenting how postcopy works.
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v5 01/45] Start documenting how postcopy works.
Date: Fri, 13 Mar 2015 13:07:40 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

* David Gibson (address@hidden) wrote:
> On Thu, Mar 05, 2015 at 09:21:39AM +0000, Dr. David Alan Gilbert wrote:
> > * David Gibson (address@hidden) wrote:
> > > On Wed, Feb 25, 2015 at 04:51:24PM +0000, Dr. David Alan Gilbert (git) wrote:
> [snip]
> > > > +=== Enabling postcopy ===
> > > > +
> > > > +To enable postcopy (prior to the start of migration):
> > > > +
> > > > +migrate_set_capability x-postcopy-ram on
> > > > +
> > > > +The migration will still start in precopy mode, however issuing:
> > > > +
> > > > +migrate_start_postcopy
> > > > +
> > > > +will now cause the transition from precopy to postcopy.
> > > > +It can be issued immediately after migration is started or any
> > > > +time later on. Issuing it after the end of a migration is harmless.
> > >
> > > It's not quite clear to me what this means. Does
> > > "migrate_start_postcopy" mean it will immediately transfer execution
> > > and transfer any remaining pages postcopy, or does it just mean it
> > > will start postcopying once the remaining data to transfer is small
> > > enough?
> >
> > The former; it will flip into postcopy soon after that command is issued,
> > irrespective of the amount of data remaining.
> >
> > > What's the reason for this rather awkward two stage activation of
> > > postcopy?
> >
> > We need to keep track of the pages that are received during the precopy
> > phase, and do some madvise and other setup on the destination RAM area
> > before precopy starts; and so we need to know we might want to do
> > postcopy - so we need to be told early. In the earliest posted version
> > of my patches I had a time-limit setting and after the time limit expired
> > QEMU would switch into the second phase of postcopy itself, but Paolo
> > suggested the migrate_start_postcopy command:
> >
> > https://lists.nongnu.org/archive/html/qemu-devel/2014-07/msg00943.html
> >
> > and it works out simpler anyway.
>
> Ok, that makes sense.
>
> > > > +=== Postcopy device transfer ===
> > > > +
> > > > +Loading of device data may cause the device emulation to access guest RAM
> > > > +that may trigger faults that have to be resolved by the source, as such
> > > > +the migration stream has to be able to respond with page data *during* the
> > > > +device load, and hence the device data has to be read from the stream completely
> > > > +before the device load begins to free the stream up. This is achieved by
> > > > +'packaging' the device data into a blob that's read in one go.
> > > > +
> > > > +Source behaviour
> > > > +
> > > > +Until postcopy is entered the migration stream is identical to normal
> > > > +precopy, except for the addition of a 'postcopy advise' command at
> > > > +the beginning, to tell the destination that postcopy might happen.
> > > > +When postcopy starts the source sends the page discard data and then
> > > > +forms the 'package' containing:
> > > > +
> > > > + Command: 'postcopy ram listen'
> > > > + The device state
> > > > + A series of sections, identical to the precopy streams device state stream
> > > > + containing everything except postcopiable devices (i.e. RAM)
> > > > + Command: 'postcopy ram run'
> > > > +
> > > > +The 'package' is sent as the data part of a Command: 'CMD_PACKAGED', and the
> > > > +contents are formatted in the same way as the main migration stream.
> > >
> > > It seems to me the "ram listen", "ram run" and CMD_PACKAGED really
> > > have to be used in conjunction this way, they don't really have any use
> > > on their own. So why not make it all CMD_POSTCOPY_TRANSITION and have
> > > the "listen" and "run" take effect implicitly at the beginning and end
> > > of the device data?
> >
> > CMD_PACKAGED seems like something that was generally useful; it's fairly
> > complicated on its own and so it seemed best to keep it separate.
>
> And can you actually think of another use case for it?
>
> The thing that bothers me is that the "listen" and "run" operations
> will not work correctly anywhere other than at the beginning and end
> of the packaged blob.

It feels similar to the packaged blobs that checkpointing schemes like
COLO and microcheckpointing use, although they seem to craft their own
wire protocol rather than sticking with a migration protocol.
What they do in terms of controlling the CPU etc. is certainly different
(so the run/listen stuff is different).

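To make the packaging under discussion concrete, here is a rough Python
sketch of the framing idea. It is only an illustration: the command IDs,
the 2-byte id / 4-byte length header, and the function names
encode_command, build_package and read_package are all assumptions of this
sketch, not the wire format or the code from the patch series.

```python
import struct

# Hypothetical command IDs, for illustration only; not the values QEMU uses.
CMD_PACKAGED = 1
CMD_POSTCOPY_LISTEN = 2
CMD_POSTCOPY_RUN = 3

# Assumed header layout: 2-byte command id, 4-byte big-endian payload length.
HEADER = ">HI"


def encode_command(cmd_id: int, payload: bytes = b"") -> bytes:
    """Frame a command as header followed by its payload."""
    return struct.pack(HEADER, cmd_id, len(payload)) + payload


def build_package(device_state: bytes) -> bytes:
    """Wrap 'listen', the device state and 'run' in one packaged command."""
    inner = (
        encode_command(CMD_POSTCOPY_LISTEN)
        + device_state
        + encode_command(CMD_POSTCOPY_RUN)
    )
    return encode_command(CMD_PACKAGED, inner)


def read_package(stream: bytes) -> bytes:
    """Read the whole package in one go and return its inner contents.

    Because the length is known up front, the blob can be pulled off the
    stream completely before any device state inside it is loaded.
    """
    cmd_id, length = struct.unpack_from(HEADER, stream, 0)
    if cmd_id != CMD_PACKAGED:
        raise ValueError("expected a packaged command")
    start = struct.calcsize(HEADER)
    inner = stream[start:start + length]
    if len(inner) != length:
        raise ValueError("truncated package")
    return inner
```

In this sketch 'listen' and 'run' are ordinary commands that happen to sit
at the start and end of the blob, which is the arrangement being questioned
above; folding them into a single transition command would simply drop the
two inner encode_command calls.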
Dave
> --
> David Gibson | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
> | _way_ _around_!
> http://www.ozlabs.org/~dgibson
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK