Re: [Qemu-devel] Migration ToDo list (a.k.a. Rant)


From: Greg Kurz
Subject: Re: [Qemu-devel] Migration ToDo list (a.k.a. Rant)
Date: Wed, 4 May 2016 18:35:22 +0200

On Wed, 4 May 2016 13:47:12 +0100
"Dr. David Alan Gilbert" <address@hidden> wrote:

> * Juan Quintela (address@hidden) wrote:
> > 
> > Hi
> > 
> > I am often asked what the ToDo list for migration is.  It has lived in
> > my head and in random notes on my desk, so I am trying to organize it
> > (yes, I will put this in the wiki).
> 
> Let me add:
>   Getting everything to use VMState;  I intend to try and fix virtio to use
> VMState as much as possible.
> 

I tried to revive Juan's 41-patch series from 2009 some time ago, but it
was really tedious, and then the virtio 1.0 work started and I gave up...

> And yes, a wiki entry would be good; then people might notice it and fix
> things for us :-)
> 

But I'm still willing to help if I can. :)

> > - migration thread on reception
> >   would make it trivial to do other things while receiving, and would
> >   also make postcopy easier (I was going to put much easier, but
> >   postcopy is never easy).
> 
> I don't think it makes much difference to postcopy.
> 
> > - migration capabilities and parameters
> >   this is a mess.  No, it is worse than that.  I don't know who is to
> >   blame here, but something needs to be done:
> > 
> >      void qmp_migrate_set_parameters(bool has_compress_level,
> >                                 int64_t compress_level,
> >                                 bool has_compress_threads,
> >                                 int64_t compress_threads,
> >                                 bool has_decompress_threads,
> >                                 int64_t decompress_threads,
> >                                 bool has_x_cpu_throttle_initial,
> >                                 int64_t x_cpu_throttle_initial,
> >                                 bool has_x_cpu_throttle_increment,
> >                                 int64_t x_cpu_throttle_increment,
> >                                 bool has_multifd_threads,
> >                                 int64_t multifd_threads,
> >                                 Error **errp)
> > 
> > 
> > 
> >     Can we move this to an array of structs, please, pretty please?
> >     I think that for this one, the blame is on qmp  
> 
> Yes; zhanghailiang had a patch to try and help that and there was
> some discussion at about the same time (June last year?!)
> That function is VERY delicate; if you screw up and get those in the
> wrong order then everything will appear to be just fine....
> 
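
A minimal sketch of what the "array of structs" idea could look like,
assuming hypothetical names (MigParamId, MigParamUpdate, MigParams,
mig_params_apply) that are not existing QEMU or QAPI API. Rejecting
negative values is also just an assumption made for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; none of this is existing QEMU API. */
typedef enum {
    MIG_PARAM_COMPRESS_LEVEL,
    MIG_PARAM_COMPRESS_THREADS,
    MIG_PARAM_DECOMPRESS_THREADS,
    MIG_PARAM_X_CPU_THROTTLE_INITIAL,
    MIG_PARAM_X_CPU_THROTTLE_INCREMENT,
    MIG_PARAM_MULTIFD_THREADS,
    MIG_PARAM__MAX
} MigParamId;

/* One requested change: which parameter, and its new value. */
typedef struct {
    MigParamId id;
    int64_t value;
} MigParamUpdate;

/* Current parameter values, indexed by MigParamId. */
typedef struct {
    int64_t values[MIG_PARAM__MAX];
} MigParams;

/*
 * Apply n updates.  Returns 0 on success, or -1 if any update names an
 * unknown parameter or carries a negative value (assumed invalid here).
 * Validation runs first, so a failed call changes nothing.
 */
int mig_params_apply(MigParams *params, const MigParamUpdate *updates,
                     size_t n)
{
    size_t i;

    assert(params != NULL && (updates != NULL || n == 0));

    for (i = 0; i < n; i++) {
        if ((unsigned)updates[i].id >= (unsigned)MIG_PARAM__MAX ||
            updates[i].value < 0) {
            return -1;
        }
    }
    for (i = 0; i < n; i++) {
        params->values[updates[i].id] = updates[i].value;
    }
    return 0;
}
```

Each value travels with the name of the parameter it belongs to, so
listing the updates in a different order cannot silently set the wrong
field, and adding a parameter means adding one enum entry rather than
two more positional arguments.
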
> > - info migrate
> >   This deserves its own item.  Let's look at a typical output
> > 
> > (qemu)info migrate
> > 
> > capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: 
> > off compress: off events: off postcopy-ram: off x-multifd: on 
> > 
> >    Aha, we have the capabilities, but not the parameters.  This is
> >    historical, I know, but it doesn't belong here.
> 
> Well, for the HMP version we can fix any of this IMHO without a problem;
> let's add more detail/fix names/etc.
> 
> > And we still have more optional information that appears if we are doing
> > block migration, xbzrle, compression, rdma, etc, etc.
> > 
> > We also need to decide on the units used internally.  Some things are
> > in bytes, some are in kilobytes, some are in pages.  Some are in host
> > pages, or guest pages, or who knows :-(
> 
> I don't - every time I look at some of it I end up going back to the source.
> 
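
One way to make the unit explicit is to give each one its own C type, so
that mixing them up becomes a compile error rather than a trip back to
the source. A minimal sketch, assuming hypothetical type and function
names (Bytes, GuestPages, HostPages and the two converters) that do not
exist in QEMU:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper types: one distinct struct per unit. */
typedef struct { uint64_t v; } Bytes;
typedef struct { uint64_t v; } GuestPages;
typedef struct { uint64_t v; } HostPages;

/* Convert a count of guest pages to bytes. */
Bytes guest_pages_to_bytes(GuestPages pages, uint64_t guest_page_size)
{
    Bytes b = { pages.v * guest_page_size };
    return b;
}

/* Convert bytes to the number of host pages needed, rounding up. */
HostPages bytes_to_host_pages(Bytes bytes, uint64_t host_page_size)
{
    HostPages h;

    assert(host_page_size > 0);
    h.v = (bytes.v + host_page_size - 1) / host_page_size;
    return h;
}
```

Passing a GuestPages value where a HostPages is expected no longer
compiles, and every conversion has to state the page size it assumes.
With 4 KiB guest pages on a 64 KiB-page host, for instance, 16 guest
pages fit in a single host page and a 17th spills into a second one.
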
> > - Block migration (the migration/block.c one).  This is the bastard
> >   child of migration.  Much less tested, we should make a decision
> >   about letting it live or deprecating it.  Things needed from memory:
> >      - functions should return the same values as ram.c
> >        some functions don't have "exact" values, and return 1 when
> >        more than one block is dirty, etc, etc
> >      - if we continue maintaining it, should we allow it to have
> >        _some_ shared devices and some non-shared ones, instead of
> >        everything?
> 
> My vague understanding was that there were still configurations that were
> only useable with block migration; mostly those things that only wanted
> a single socket because they wanted to tunnel it;  this might change with
> Dan's TLS setup.
> Having said that, I don't understand all of the block migration alternatives.
> 
> > - RDMA: Another step child
> > 
> >   This is really, really weird.  We don't use the normal infrastructure
> >   for RDMA, we use the ram_control_* stuff.  We should really move to
> >   use the normal stuff here.  
> 
> I'm not sure that's possible - while the RDMA code is huge and horribly
> complex, some of that is just down to the kernel APIs and standards it
> has to deal with; it might be possible to glue it into ram.c better
> but I wouldn't bet on it.
> 
> > - autoconverge code:  This could be used outside of migration (i.e. just
> >   to slow down a guest).  We should really do some measurement here to
> >   see how useful it is for migration.  If the guest is dirtying lots of
> >   memory, we end up having to throttle the guest 90% or so :-(
> 
> Dan's doing some I think.  The other question is how it compares to using
> an external cgroup based converge (which I think is what oVirt does).
> 
> > - xbzrle.  We only have one cache, we should decide how to work with
> >   this for multithread/compression.
> > 
> > - When we do migration, we have spaghetti code to decide if:
> >   * it is a zero page
> >   * it is a duplicated page
> >   * it is a xbzrle page
> >   * it is a compressed page
> >   And as the code is written, it is not trivial to add new "options".  I
> >   think that we should "re-think" what combinations are allowed and which
> >   ones make no sense.
> 
> Yeh, and find a way to express to libvirt what combinations are legal.
> 
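
As an illustration of how new "options" could become cheaper to add, here
is a minimal table-driven sketch where the first matching classifier
wins. All names (PageKind, PageClassifier, classify_page) are
hypothetical and this is not how ram.c is structured today; only the two
stateless checks are shown, since xbzrle and compression need extra state
(cache, threads) that a real version would have to pass in:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; not taken from ram.c. */
typedef enum {
    PAGE_KIND_ZERO,   /* every byte is 0 */
    PAGE_KIND_DUP,    /* every byte equals the first byte */
    PAGE_KIND_RAW     /* nothing matched: send the page as-is */
} PageKind;

typedef struct {
    PageKind kind;
    bool (*matches)(const uint8_t *page, size_t len);
} PageClassifier;

/* True if all bytes of the page are identical. */
static bool page_is_dup(const uint8_t *page, size_t len)
{
    size_t i;

    for (i = 1; i < len; i++) {
        if (page[i] != page[0]) {
            return false;
        }
    }
    return true;
}

/* True if all bytes of the page are zero. */
static bool page_is_zero(const uint8_t *page, size_t len)
{
    return page[0] == 0 && page_is_dup(page, len);
}

/* Checked in order; adding an option means adding one row here. */
static const PageClassifier classifiers[] = {
    { PAGE_KIND_ZERO, page_is_zero },
    { PAGE_KIND_DUP,  page_is_dup  },
};

PageKind classify_page(const uint8_t *page, size_t len)
{
    size_t i;

    assert(page != NULL && len > 0);

    for (i = 0; i < sizeof(classifiers) / sizeof(classifiers[0]); i++) {
        if (classifiers[i].matches(page, len)) {
            return classifiers[i].kind;
        }
    }
    return PAGE_KIND_RAW;
}
```

The order of the table is the policy: a zero page also satisfies the
"duplicated" check, so the zero row has to come first. The same table
could, in principle, also be the place to record which options may be
combined, which would give one source of truth to report to libvirt.
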
> > - savevm and migration: they use two different paths for no really good
> >   reason.  We should really abstract this to a single code path.
> >   We always forget the savevm one when we make changes.
> > 
> > - error handling.  Every function should return an error.  Every
> >   function should return an error.  
> 
> Yeh.
> 
> > - qemu_get_buffer() doesn't give an error if there is nothing to read,
> >   sniff.
> > 
> > - Multipage support: Welcome to the XXI century.  Now almost all
> >   architectures have HugePages.  And others have different-sized pages
> >   (on PPC it is not unusual for host and guest page sizes to differ).
> >   We have work to do here.  For starters, sending huge pages as one
> >   chunk will make TransparentHugePages happier.
> 
> Yeh, Andrea has pushed me about this a bit; the only problem I have
> here is with postcopy where getting a page request stuck behind a huge
> page request would do nasty things to the latency - but your multifd might
> fix that.
> 
> > - Bitmaps.  Related to the previous one.  We should really be better at
> >   walking them and at synchronising them between qemu/kernel.
> 
> Oh yes, they're a nightmare on things with different page sizes; especially
> when people worry that the source and destination might have different host
> page sizes.
> 
> > - COLO: We need to integrate it.
> > 
> > I will continue the rant at some other point O:-)  Just now I need to
> > leave for the bar.
> 
> One that's related to that, is the big-lock around the last stage of migrate;
> we really could do with being able to recover from a migrate that hangs during
> the final stage due to a block-IO or network issue.
> 
> > Thanks for your attention, Juan.
> > 
> > PS.  While writing this I just looked at the channel code from Daniel;
> > it is a step in the right direction.
> 
> Dave
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 



