Re: [Qemu-devel] [PATCH 00/16] Multifd v4


From: Daniel P. Berrange
Subject: Re: [Qemu-devel] [PATCH 00/16] Multifd v4
Date: Tue, 14 Mar 2017 12:34:20 +0000
User-agent: Mutt/1.7.1 (2016-10-04)

On Tue, Mar 14, 2017 at 12:22:23PM +0000, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (address@hidden) wrote:
> > On Tue, Mar 14, 2017 at 10:21:43AM +0000, Dr. David Alan Gilbert wrote:
> > > * Juan Quintela (address@hidden) wrote:
> > > > Hi
> > > > 
> > > > This is the 4th version of multifd. Changes:
> > > > - XBZRLE doesn't need to be checked for
> > > > - Documentation and defaults are consistent
> > > > - split socketArgs
> > > > - use iovec instead of creating something similar.
> > > > - We now use the exported target page size (another HACK removal)
> > > > - created qio_channel_{writev,readv}_all functions; the _full() name
> > > >   was already taken.
> > > >   They do the same as the functions without _all(), but if the call
> > > >   returns because it would block, they redo it (a sketch of that retry
> > > >   loop follows the quoted cover letter).
> > > > - it is checkpatch.pl clean now.
> > > > 
> > > > Please comment, Juan.
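
For reference, a minimal sketch of the retry loop such an _all() helper needs,
using the existing qio_channel_writev(), qio_channel_wait() and
QIO_CHANNEL_ERR_BLOCK from the QIOChannel API; the function name and exact
structure below are illustrative, not the series' actual code:

/* Assumes QEMU's "io/channel.h" (QIOChannel, qio_channel_writev,
 * qio_channel_wait, QIO_CHANNEL_ERR_BLOCK) and <sys/uio.h> for struct iovec.
 * Note: this sketch modifies the caller's iovec array in place. */
static ssize_t sketch_writev_all(QIOChannel *ioc,
                                 struct iovec *iov, size_t niov,
                                 Error **errp)
{
    ssize_t done = 0;

    while (niov > 0) {
        ssize_t len = qio_channel_writev(ioc, iov, niov, errp);

        if (len == QIO_CHANNEL_ERR_BLOCK) {
            /* Non-blocking channel with a full buffer: wait until it
             * is writable again, then redo the call. */
            qio_channel_wait(ioc, G_IO_OUT);
            continue;
        }
        if (len < 0) {
            return -1;
        }

        done += len;
        /* Drop the iovec entries that were written completely... */
        while (niov > 0 && (size_t)len >= iov->iov_len) {
            len -= iov->iov_len;
            iov++;
            niov--;
        }
        /* ...and advance within a partially written entry. */
        if (niov > 0 && len > 0) {
            iov->iov_base = (char *)iov->iov_base + len;
            iov->iov_len -= len;
        }
    }
    return done;
}
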
> > > 
> > > High level things,
> > >   a) I think you probably need to do some bandwidth measurements to show
> > >     that multifd is managing to have some benefit - it would be good
> > >     for the cover letter.
> > 
> > Presumably this would be a building block for solving the latency problems
> > with post-copy, by reserving one channel for transferring out-of-band
> > pages required by target host page faults.
> 
> Right, it's on my list to look at; there are some interesting questions about
> the way in which the main fd carrying the headers interacts, and also what
> happens to pages immediately after the requested page. For example, let's
> say we're currently streaming at address 'S' and a postcopy request (P)
> comes in;
> so what we currently have on one FD is:
> 
>     S,S+1....S+n,P,P+1,P+2,P+n
> 
> Note that when a request comes in we flip location so we start sending
> background pages from P+1 on the assumption that they'll be wanted soon.
> 
> with 3 FDs this would go initially as:
>     S    S+3 P+1 P+4
>     S+1  S+4 P+2 ..
>     S+2  P   P+3 ..
> 
> now if we had a spare FD for postcopy we'd do:
>     S    S+3 P+1 P+4
>     S+1  S+4 P+2 ..
>     S+2  S+5 P+3 ..
>     -    P   -   -
> 
> So 'P' got there quickly - but P+1 is stuck behind the S's; is that what
> we want?
> An interesting alternative would be to switch which fd we keep free:
>     S    S+3 -   -   -
>     S+1  S+4 P+2 P+4
>     S+2  S+5 P+3 P+5
>     -    P   P+1 P+6
>   
> So depending on your buffering, P+1 might also now be pretty fast; but that's
> starting to get into heuristics about guessing how much you should put on
> your previously lightly-queued fd.
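
As a concrete illustration of the "keep one fd free for postcopy" layout
above, a channel picker could look roughly like this; the names and
structure are hypothetical, not something from the series:

#include <stdbool.h>

/* Hypothetical picker: background pages go round-robin over channels
 * 0..nchannels-2, while a page answering a fault always goes to the
 * reserved last channel, so it never queues behind the S, S+1, ... run. */
typedef struct MultiFDPickSketch {
    unsigned int nchannels;   /* total send channels, assumed >= 2 */
    unsigned int next_bg;     /* round-robin cursor for background pages */
} MultiFDPickSketch;

static unsigned int sketch_pick_channel(MultiFDPickSketch *s, bool urgent)
{
    if (urgent) {
        return s->nchannels - 1;        /* the fd kept free for postcopy */
    }
    unsigned int c = s->next_bg;
    s->next_bg = (s->next_bg + 1) % (s->nchannels - 1);
    return c;
}

Whether the reserved channel should also carry some P+1..P+n read-ahead is
exactly the heuristic question raised above.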

Ah, I see, so you're essentially trying to do read-ahead when post-copy
faults come in. It becomes even more fun when you have multiple page faults
coming in (quite likely with multi-vCPU guests), as you have P, Q, R, S
arriving, all of which want servicing quickly. So if you queue up too many
P+n pages for read-ahead, you'd delay Q, R & S:

     S    S+3 -   -   -
     S+1  S+4 P+2 P+4 Q   R   ...
     S+2  S+5 P+3 P+5 Q+1 R+1 ...
     -    P   P+1 P+6 Q+2 ... ...
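
One crude way to bound that effect (purely illustrative; the limit and all
names below are assumptions, not anything in the series) is to cap the
read-ahead queued per fault before moving on to the next pending request:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE  4096   /* placeholder; use the real target page size */
#define SKETCH_READAHEAD  4      /* assumed tunable, not a real QEMU knob */

/* One queued send: a guest address plus whether it answers a fault
 * (urgent) or is background read-ahead. */
typedef struct SketchPage {
    uint64_t addr;
    bool urgent;
} SketchPage;

/* Enqueue the faulting page plus a bounded run of read-ahead pages, so a
 * later fault (Q, R, ...) is not stuck behind a long P+1..P+n run. */
static size_t sketch_queue_fault(uint64_t fault_addr,
                                 SketchPage *out, size_t out_len)
{
    size_t n = 0;

    if (out_len == 0) {
        return 0;
    }
    out[n++] = (SketchPage){ .addr = fault_addr, .urgent = true };

    for (unsigned int i = 1; i <= SKETCH_READAHEAD && n < out_len; i++) {
        out[n++] = (SketchPage){
            .addr = fault_addr + (uint64_t)i * SKETCH_PAGE_SIZE,
            .urgent = false,
        };
    }
    return n;
}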

This tends to argue for overcommitting threads vs CPUs. E.g. even if QEMU
is confined to only use 2 host CPUs, it would be worth having 4 migration
threads. They would contend for CPU time for AES encryption, but you
would reduce the chance of getting stuck behind large send-buffers.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|


