Re: [Qemu-devel] [PATCH v9 00/56] Postcopy implementation


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v9 00/56] Postcopy implementation
Date: Mon, 9 Nov 2015 15:13:59 +1100
User-agent: Mutt/1.5.23 (2015-06-09)

On Fri, Nov 06, 2015 at 12:22:23PM +0000, Dr. David Alan Gilbert wrote:
> * Bharata B Rao (address@hidden) wrote:
> > On Fri, Nov 6, 2015 at 2:39 PM, Dr. David Alan Gilbert
> > <address@hidden> wrote:
> > > * Bharata B Rao (address@hidden) wrote:
> > >> On Thu, Nov 05, 2015 at 06:10:27PM +0000, Dr. David Alan Gilbert (git) wrote:
> > >> > From: "Dr. David Alan Gilbert" <address@hidden>
> > >> >
> > >> >   This is the 9th cut of my version of postcopy.
> > >> >
> > >> > The userfaultfd linux kernel code is now in the upstream kernel
> > >> > tree, and so 4.3 can be used without modification.
> > >> >
> > >> > This qemu series can be found at:
> > >> > https://github.com/orbitfp7/qemu.git
> > >> > on the wp3-postcopy-v9 tag
> > >> >
> > >> > Testing status:
> > >> >   * Tested heavily on x86
> > >> >   * Smoke tested on aarch64 (so it does work on different page sizes)
> > >>
> > >> Tested minimally on ppc64 with back and forth postcopy migration of
> > >> unloaded pseries guest within the localhost - works as expected.
> > >>
> > >> However I am seeing a failure in one case. I am not sure if this is
> > >> a user error or a real issue in postcopy migration. If I switch to
> > >> postcopy migration immediately after starting the migration, I see
> > >> the migration failing with error:
> > >>
> > >> qemu-system-ppc64: qemu_savevm_send_packaged: Unreasonably large packaged state: 25905005
> > >
> > > I put an arbitrary limit of 16MB (see MAX_VM_CMD_PACKAGED_SIZE in
> > > include/sysemu/sysemu.h) on the size of the data accepted into the
> > > packaged blob.  How big is the htab data likely to be?
> > 
> > HTAB size is variable and depends on maxmem size: it is 1/128th of
> > maxmem. So for a 32G guest, the HTAB will be 256M in size.
> 
> OK, that does get a bit big.
> Two possible fixes:
>  1 - postcopy htab (I don't know htab well enough to know how hard that is)

It's... awkward.  We'd need a way to set up the mappings on the
destination so that faults on bits of the hash table not yet up to
date get flagged and handed to qemu, rather than causing a fatal fault
in the guest.  I suspect that will need host kernel changes, although
maybe there's a way of setting up the htab on the destination so that
unmapped things look like MMIO (which already goes to qemu).

>  2 - do one pass of iterable/non-postcopiable devices before we start the
>      package; I'm just writing a patch to try that; I'll send it to you to let
>      you try once I get it to not-break normal migration.

Hm.  So, depends a bit on what you mean by "one pass".  If we've had
one complete pass through the hash table, I'd expect that to be enough
to get the package down to a reasonable size.  But one pass through
the full hash table can be multiple calls to the htab iterator.
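
To illustrate the distinction between a pass and an iterator call, here
is a minimal sketch. It is not QEMU's htab code: the class, the function,
and the per-call entry budget are all hypothetical, and the only
assumption is that each iterator call sends a bounded number of entries
before returning.

```python
class HtabIterator:
    """Toy model of a save iterator that walks a hash table in chunks.

    Hypothetical: the names and the fixed per-call budget are illustrative
    assumptions, not QEMU's actual interface.
    """

    def __init__(self, total_entries, max_entries_per_call):
        if max_entries_per_call <= 0:
            raise ValueError("max_entries_per_call must be positive")
        self.total_entries = total_entries
        self.max_entries_per_call = max_entries_per_call
        self.index = 0  # next entry to send

    def iterate(self):
        """Send up to one call's worth of entries.

        Returns True once the whole table has been covered.
        """
        self.index = min(self.index + self.max_entries_per_call,
                         self.total_entries)
        return self.index >= self.total_entries


def calls_for_full_pass(total_entries, max_entries_per_call):
    """Count the iterator calls needed for one complete pass."""
    it = HtabIterator(total_entries, max_entries_per_call)
    calls = 1
    while not it.iterate():
        calls += 1
    return calls
```

With a budget smaller than the table, a single call covers only part of
it, so "one pass of the devices" only bounds the leftover data if it
means running the iterator until it reports completion.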

Which makes me think it's a bit odd that we're not already getting
most of the htab data across during the precopy phase.  Don't we
already delay entering the postcopy phase until precopy is "complete"
in the sense that the remaining non-postcopiable data is below the
downtime limit?  I would have thought that would also ensure we'd only
have a reasonable number of remaining htab updates for the package.
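
For a sense of the numbers in this thread, here is a rough sketch of the
arithmetic. The 1/128 ratio, the 16MB limit and the 25905005-byte figure
are taken from the messages quoted above; the helper names are
hypothetical, and treating the limit as "size must not exceed 16MB" is an
assumption about how the check is written.

```python
MiB = 1 << 20
GiB = 1 << 30

# 16MB limit on the packaged blob, as quoted earlier in the thread.
MAX_VM_CMD_PACKAGED_SIZE = 16 * MiB


def htab_size(maxmem_bytes):
    """HTAB size as 1/128th of maxmem, per the figure quoted above."""
    return maxmem_bytes // 128


def fits_in_package(remaining_bytes, limit=MAX_VM_CMD_PACKAGED_SIZE):
    """Assumed form of the check: the package is accepted if it does not exceed the limit."""
    return remaining_bytes <= limit


if __name__ == "__main__":
    full_htab = htab_size(32 * GiB)
    print(full_htab // MiB, "MiB of HTAB for a 32G guest")
    print("whole HTAB fits in package:", fits_in_package(full_htab))
    print("reported 25905005 bytes fits:", fits_in_package(25905005))
```

So a full, untransferred HTAB for a 32G guest is well over the limit, and
even the reported 25905005 bytes is about 1.5 times the 16MB cap; only a
small remainder of htab updates would fit.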

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
