Re: [Qemu-devel] [PATCH] XBRLE page delta compression for live migration


From: Shribman, Aidan
Subject: Re: [Qemu-devel] [PATCH] XBRLE page delta compression for live migration of large memory apps
Date: Wed, 6 Jul 2011 14:04:54 +0200

> From: Stefan Hajnoczi [mailto:address@hidden 
> Sent: Wednesday, June 22, 2011 3:26 PM
> 
> On Wed, Jun 22, 2011 at 1:01 PM, Anthony Liguori 
> <address@hidden> wrote:
> >>
> >> By using XBRLE (Xor Based Run-Length-Encoding) we can reduce required
> >> bandwidth for transferring of dirty memory pages during live migration
> >>         migrate_set_cachesize<size>
> >>         migrate -x<url>
> >
> > By how much?

See "Evaluation of delta compression techniques for efficient live migration of 
large virtual machines" (http://portal.acm.org/citation.cfm?id=1952698) 
subsection 5.2.3:

In the final test, a VM running a SAP Central Instance ERP system
was migrated over Gigabit Ethernet. With XBRLE, the total migration time was 
reduced from 235 s to 139 s, the suspension time was reduced from 3 s to 0.2 s, 
and the ping downtime from 5 s to 1 s...

> >
> > This is a change to the live migration protocol, it would also require
> > documentation and an understanding of how it affects compatibility.

The default behavior (i.e. migrating without the -x option) has not changed and 
is thus still compatible with previous qemu versions. When initiating a 
migration with XBRLE, the remote peer must also support XBRLE; otherwise the 
migration will fail.

With regard to documentation, please advise.

> >
> > The patch really needs to be split into logical pieces too.  It's a bit too
> > big for a meaningful review.
> 
> Two places where you could consider splitting the patch is the caching
> and the sampling.  Are they necessary for correctness and could they
> be submitted as follow-up patches to a core patch which does just the
> XBRLE?

Some changes were made to reduce code size:
(1) Sampling code, which was used for early detection of pages that changed so 
much that XBRLE is not applicable, has been replaced with a simple check that 
the XBRLE delta does not exceed 1/3 of a page (page size: 4096 bytes).
(2) Check-summing/page-logging code, which existed only under a debug compile 
ifdef option, was removed.
(3) XBRLE migration statistics were replaced with a detailed 'info migrate' 
output, which is vital for tracking the XBRLE operation.
(4) The 2-way associative cache was not separated from the XBRLE code, as the 
cache is a fundamental part of the XBRLE implementation (XBRLE updates are the 
difference between the new page and the old cached page on the sender side).
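
To make points (1) and (4) concrete, here is a minimal sketch of the idea: XOR 
the cached old page with the new page, run-length encode the result, and give 
up once the delta would exceed 1/3 of a page. The record layout (one byte for 
the unchanged-run length, one byte for the changed-run length, then the changed 
XOR bytes) and the names xbrle_encode, xbrle_decode and XBRLE_MAX_DELTA are 
assumptions made for illustration only; this is not the wire format or the 
code of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE        4096
#define XBRLE_MAX_DELTA  (PAGE_SIZE / 3)   /* give up above 1/3 of a page */

/*
 * Hypothetical encoder: writes records of
 *   [unchanged run length (1 byte)] [changed run length (1 byte)] [XOR bytes]
 * into out, which must hold XBRLE_MAX_DELTA bytes.
 * Returns the encoded length, or -1 if the delta would exceed
 * XBRLE_MAX_DELTA (the caller would then send the full page instead).
 */
static int xbrle_encode(const uint8_t *old_page, const uint8_t *new_page,
                        uint8_t *out)
{
    size_t i = 0, o = 0;

    assert(old_page && new_page && out);

    while (i < PAGE_SIZE) {
        size_t zrun = 0, nzrun = 0;

        /* Unchanged bytes XOR to zero; cap the run at one length byte. */
        while (i + zrun < PAGE_SIZE && zrun < 255 &&
               (old_page[i + zrun] ^ new_page[i + zrun]) == 0) {
            zrun++;
        }
        i += zrun;

        /* Changed bytes XOR to non-zero; same cap. */
        while (i + nzrun < PAGE_SIZE && nzrun < 255 &&
               (old_page[i + nzrun] ^ new_page[i + nzrun]) != 0) {
            nzrun++;
        }

        /* Overflow check: the delta is no longer worth sending. */
        if (o + 2 + nzrun > XBRLE_MAX_DELTA) {
            return -1;
        }

        out[o++] = (uint8_t)zrun;
        out[o++] = (uint8_t)nzrun;
        for (size_t k = 0; k < nzrun; k++) {
            out[o++] = old_page[i + k] ^ new_page[i + k];
        }
        i += nzrun;
    }
    return (int)o;
}

/*
 * Hypothetical decoder: page holds the old contents on entry and the new
 * contents on success. Returns 0 on success, -1 on a malformed delta.
 */
static int xbrle_decode(const uint8_t *delta, size_t delta_len, uint8_t *page)
{
    size_t i = 0, d = 0;

    while (d + 2 <= delta_len) {
        size_t zrun = delta[d++];
        size_t nzrun = delta[d++];

        if (i + zrun + nzrun > PAGE_SIZE || d + nzrun > delta_len) {
            return -1;
        }
        i += zrun;                       /* skip unchanged bytes */
        for (size_t k = 0; k < nzrun; k++) {
            page[i++] ^= delta[d++];     /* old ^ (old ^ new) == new */
        }
    }
    return (d == delta_len && i == PAGE_SIZE) ? 0 : -1;
}
```

The sender needs the previous copy of the page to compute the XOR, which is why 
the cache cannot be separated from the encoder. In this sketch a page that 
changed in every byte makes xbrle_encode return -1, which is the case the 
removed sampling code used to detect up front.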

Currently I don't see how to split the code into smaller meaningful pieces. For 
now I have re-submitted a single patch with corrections (see the separate email 
[PATCH v2]).

> 
> Also, whenever there are heuristics and use of floating point then
> there is some magic going on.  It may be necessary and give a huge
> performance boost but needs explanation so it is not a black box or
> fragile mechanism once it has been merged upstream.

The code mentioned (which sampled a page to test whether it was eligible for 
XBRLE encoding) has been removed.

> 
> Stefan

Aidan 
