On 5/20/09, Uri Lublin <address@hidden> wrote:
On 05/19/2009 09:17 PM, Anthony Liguori wrote:
Glauber Costa wrote:
On Tue, May 19, 2009 at 05:59:14PM +0300, Dor Laor wrote:
We can also make it configurable using the monitor migrate command.
For example:
migrate -d -no_progress -threshold=x tcp:....
It can be done, but it fits better as a separate monitor command.
Anthony, do you have any strong opinions here, or is this scheme
acceptable?
Threshold is a bad metric. There's no way to choose the right number. If
we were going to have a means to support metrics-based forced
convergence (and I really think this belongs in libvirt) I'd rather see
something based on bandwidth or wall clock time.
Let me put it this way: why 50? What were the guidelines for choosing
that number and how would you explain what number a user should choose?
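
A criterion based on bandwidth and wall-clock time could be sketched
roughly as below. This is only an illustration, not QEMU code: the
function name, its parameters, and the 4 KiB page size are all
assumptions made for the example.

```python
PAGE_SIZE = 4096  # bytes; assumed page size, not taken from the thread


def should_stop_guest(dirty_pages, bandwidth_bytes_per_s, max_downtime_s):
    """Return True when the remaining dirty pages could be sent within
    the downtime the user is willing to accept (hypothetical helper)."""
    if bandwidth_bytes_per_s <= 0:
        return False  # no usable bandwidth estimate yet
    expected_downtime_s = dirty_pages * PAGE_SIZE / bandwidth_bytes_per_s
    return expected_downtime_s <= max_downtime_s
```

The user then picks a downtime in seconds, which is easier to explain
than a raw page count.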
I've changed the threshold of the first convergence rule from 10 to 50.
Why 10? For this rule the threshold (number of dirty pages) and the number
of bytes to transfer are equivalent.
50 pages is about 200K, which can still be sent quickly.
I've added debug messages and noticed we never hit a number smaller than 10
(excluding 0). In truth, there were also very few runs with fewer than 50
dirty pages. I don't mind leaving it at 10 (it should be configurable
too).
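
The "50 pages is about 200K" figure can be checked with a few lines,
assuming 4 KiB pages and ignoring any per-page protocol overhead. The
helper names and the example bandwidth are made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumed page size


def pages_to_kib(pages, page_size=PAGE_SIZE):
    """Size of `pages` dirty pages in KiB (raw page data only)."""
    return pages * page_size / 1024


def transfer_time_ms(pages, bandwidth_bytes_per_s, page_size=PAGE_SIZE):
    """Rough time in milliseconds to send `pages` pages at a given bandwidth."""
    return pages * page_size / bandwidth_bytes_per_s * 1000


print(pages_to_kib(50))  # 200.0
print(pages_to_kib(10))  # 40.0
```

At an assumed 100 MB/s link, transfer_time_ms(50, 100_000_000) comes out
at roughly 2 ms, so either threshold is cheap to send.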
For the second migration convergence rule I've set the limit to 10, which
seems much larger than what I've needed (in all the runs I've made, 2-4
no-progress iterations were enough, since the behavior seems to repeat
after that), but I've raised it "just in case". No real research work
was done here.
Note that a no-progress iteration depends on both network bandwidth and
guest actions.
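
The two convergence rules described above could be sketched like this.
It is not the actual patch: the class name is invented, a "no-progress
iteration" is assumed to mean one where the dirty-page count did not
drop compared with the previous iteration, and the counter is assumed
to be cumulative rather than reset on progress.

```python
class ConvergenceTracker:
    """Hypothetical tracker for the two rules: a dirty-page threshold
    and a limit on the number of no-progress iterations."""

    def __init__(self, dirty_threshold=50, max_no_progress=10):
        self.dirty_threshold = dirty_threshold
        self.max_no_progress = max_no_progress
        self.prev_dirty = None   # dirty-page count of the previous iteration
        self.no_progress = 0     # no-progress iterations seen so far

    def converged(self, dirty_pages):
        """Feed one iteration's dirty-page count; True means stop iterating."""
        # Rule 1: few enough dirty pages left to send quickly.
        if dirty_pages < self.dirty_threshold:
            return True
        # Rule 2: count iterations where the dirty set did not shrink
        # (assumed definition of "no progress").
        if self.prev_dirty is not None and dirty_pages >= self.prev_dirty:
            self.no_progress += 1
        self.prev_dirty = dirty_pages
        return self.no_progress >= self.max_no_progress
```

With max_no_progress=3, the sequence 1000, 800, 800, 900, 850, 850
triggers the second rule on the last iteration.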
Instead of freezing the guest or aborting the migration, the guest
could be throttled a bit by giving it less CPU time relative to
migration, or by incurring a small delay for each page-dirtying write
access. Maybe this method would find the balance faster.
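
To see why throttling might help, here is a toy model (purely
illustrative, with made-up numbers): each pre-copy iteration sends the
current dirty set, and meanwhile the guest dirties pages at a constant
rate. The function name, the constant-rate assumption, and the
`throttle` factor are all assumptions of this sketch.

```python
def remaining_after(iterations, initial_pages, dirty_rate, bandwidth,
                    throttle=1.0):
    """Dirty pages left after some pre-copy iterations in a toy model.

    dirty_rate and bandwidth are in pages per second; throttle in (0, 1]
    scales the guest's dirty rate (1.0 means no throttling).
    """
    remaining = float(initial_pages)
    for _ in range(iterations):
        send_time = remaining / bandwidth             # time to send current set
        remaining = dirty_rate * throttle * send_time  # pages dirtied meanwhile
    return remaining


# Guest dirties faster than the link can send: the dirty set grows.
print(remaining_after(5, 100000, 30000, 25000) > 100000)               # True
# Same guest throttled to half speed: the dirty set shrinks.
print(remaining_after(5, 100000, 30000, 25000, throttle=0.5) < 10000)  # True
```

In this model the dirty set shrinks only when dirty_rate * throttle is
below the bandwidth, so a modest throttle can turn a non-converging
migration into a converging one without freezing the guest.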