qemu-devel

Re: [Qemu-devel] [PATCH] ram_save_live: add a no-progress convergence rule


From: Blue Swirl
Subject: Re: [Qemu-devel] [PATCH] ram_save_live: add a no-progress convergence rule
Date: Wed, 20 May 2009 20:28:45 +0300

On 5/20/09, Uri Lublin <address@hidden> wrote:
> On 05/19/2009 09:17 PM, Anthony Liguori wrote:
>
> > Glauber Costa wrote:
> >
> > > On Tue, May 19, 2009 at 05:59:14PM +0300, Dor Laor wrote:
> > >
> > >
> > > > We can also make it configurable using the monitor migrate command.
> > > > For example:
> > > > migrate -d -no_progress -threshold=x tcp:....
> > > >
> > > it can be done, but it fits better as a different monitor command
> > >
> > > anthony, do you have any strong opinions here, or is this scheme
> > > acceptable?
> > >
> >
> > Threshold is a bad metric. There's no way to choose a right number. If
> > we were going to have a means to support metrics-based forced
> > convergence (and I really think this belongs in libvirt) I'd rather see
> > something based on bandwidth or wall clock time.
> >
> > Let me put it this way, why 50? What were the guidelines for choosing
> > that number and how would you explain what number a user should choose?
> >
>
>  I've changed the threshold of the first convergence rule to 50 from 10.
> Why 10? For this rule the threshold (number of dirty pages) and the number
> of bytes to transfer are equivalent.
>
>  50 pages is about 200K, which can still be sent quickly.
>  I've added debug messages and noticed we never hit a number smaller than 10
> (excluding 0). The truth is there were very few runs with fewer than 50
> dirty pages either. I don't mind leaving it at 10 (it should be
> configurable too).
>
>  For the second migration convergence rule I've set the limit to 10, as it
> seems much larger than what I've needed (in all the runs I've made, 2-4
> no-progress iterations were enough, as the behavior seems to repeat after
> that), but I've enlarged it "just in case". No real research work was done
> here.
>
>  Note that a no-progress iteration depends on both network bandwidth and
> guest actions.
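
For reference, the two convergence rules described in the quoted text could be
sketched roughly as below. This is only an illustration, not the code from the
patch: the names (should_converge, converge_state, the two macros) are
hypothetical, the 4 KiB page size is assumed, and resetting the counter
whenever the dirty count shrinks is a guess at what "no-progress iteration"
means.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants, using the numbers quoted in this thread. */
#define DIRTY_PAGES_THRESHOLD 50  /* rule 1: 50 pages * 4096 B = 204800 B, about 200K */
#define NO_PROGRESS_LIMIT     10  /* rule 2: consecutive iterations without progress */

struct converge_state {
    uint64_t prev_dirty;   /* dirty page count seen in the previous iteration */
    unsigned no_progress;  /* consecutive iterations where the count did not shrink */
};

/*
 * Returns true when the live phase should end and the final stage start.
 * Initialise prev_dirty to UINT64_MAX so the first sample counts as progress.
 */
static bool should_converge(struct converge_state *s, uint64_t dirty_pages)
{
    /* Rule 1: so few dirty pages remain that they can be sent quickly. */
    if (dirty_pages < DIRTY_PAGES_THRESHOLD) {
        return true;
    }

    /* Rule 2: give up iterating after too many rounds without progress. */
    if (dirty_pages >= s->prev_dirty) {
        s->no_progress++;
    } else {
        s->no_progress = 0;  /* assumption: any decrease resets the counter */
    }
    s->prev_dirty = dirty_pages;

    return s->no_progress >= NO_PROGRESS_LIMIT;
}
```

Since a no-progress iteration depends on both bandwidth and guest activity,
neither constant has an obviously right value, which is the objection raised
earlier in the thread.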

Instead of freezing the guest or aborting the migration, the guest
could be throttled a bit by giving it less CPU time relative to
migration, or by incurring a small delay on each page-dirtying write
access. Maybe this method would find the balance faster.
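
As a rough illustration of the first option (less CPU time for the guest), a
feedback step might look like the sketch below. Everything here is
hypothetical: adjust_guest_share, the percentages and the step size are made
up for the example, and no existing QEMU interface is implied. The idea is
simply to lower the guest's share while it dirties pages at least as fast as
they are sent, and to give the time back once migration makes progress.

```c
#include <stdint.h>

/* Hypothetical tuning values, chosen only for the example. */
#define THROTTLE_STEP_PCT 10   /* change applied per iteration */
#define THROTTLE_MIN_PCT  20   /* never starve the guest completely */
#define THROTTLE_MAX_PCT  100  /* unthrottled */

/*
 * Given the guest's current CPU share (percent) and the page counts
 * observed during the last iteration, return the share to use next.
 */
static unsigned adjust_guest_share(unsigned share_pct,
                                   uint64_t pages_dirtied,
                                   uint64_t pages_sent)
{
    if (pages_dirtied >= pages_sent) {
        /* Guest dirties memory at least as fast as we send: slow it down. */
        if (share_pct >= THROTTLE_MIN_PCT + THROTTLE_STEP_PCT) {
            return share_pct - THROTTLE_STEP_PCT;
        }
        return THROTTLE_MIN_PCT;
    }

    /* Migration is gaining on the guest: hand some CPU time back. */
    if (share_pct + THROTTLE_STEP_PCT <= THROTTLE_MAX_PCT) {
        return share_pct + THROTTLE_STEP_PCT;
    }
    return THROTTLE_MAX_PCT;
}
```

Called once per iteration, this would settle near the share at which the
dirty rate and the send rate balance, instead of relying on a fixed threshold.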



