From: Wei Wang
Subject: Re: [Qemu-devel] [PATCH v2 2/3] migration: use the free page reporting feature from balloon
Date: Thu, 08 Feb 2018 11:54:47 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 02/07/2018 07:57 AM, Michael S. Tsirkin wrote:
> On Tue, Feb 06, 2018 at 07:08:18PM +0800, Wei Wang wrote:
>> Use the free page reporting feature from the balloon device to clear the
>> bits corresponding to guest free pages from the dirty bitmap, so that the
>> free memory is not sent.
>>
>> Signed-off-by: Wei Wang <address@hidden>
>> CC: Michael S. Tsirkin <address@hidden>
>> CC: Juan Quintela <address@hidden>
> What the patch seems to do is stop migration
> completely - blocking until guest completes the reporting.
>
> Which makes no sense to me, since it's just an optimization.
> Why not proceed with the migration? What do we have to lose?

If we want the optimization to run in parallel with the migration thread, we will need to create another polling thread, as multithreaded compression does, and that wastes some host CPU. For example, the migration thread may go on sending pages to the destination while the optimization thread is still in progress, but those pages may turn out to be free pages (this is likely in the bulk stage), which don't need to be sent. In that case, why not let the migration thread wait a little (i.e. put the optimization into the migration thread) and then proceed to do useful work, instead of pretending to make progress while doing useless work?

The current plan for this patch is to skip free pages in the bulk stage only. I'm not sure it would be useful from the 2nd stage onward, which basically relies on dirty logging to send pages that have been written by the guest. For example, if the guest is not very active while live migration happens, there will be very few dirty bits, so the optimization would mostly be clearing bits that are already "0" in the dirty bitmap.
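
To illustrate the difference between the two stages, here is a toy model in Python. It is only a sketch: the bitmap is a plain list, the page counts are made up, and `clear_free_pages` is a hypothetical helper, not a QEMU function.

```python
# Toy model of the dirty bitmap: one entry per guest page, 1 = must be sent.
# All names and sizes are hypothetical; this is not QEMU code.

def clear_free_pages(dirty, free_pages):
    """Clear the dirty bit of every page the guest reported as free.

    Returns how many bits actually changed from 1 to 0.
    """
    cleared = 0
    for pfn in free_pages:
        if dirty[pfn]:
            dirty[pfn] = 0
            cleared += 1
    return cleared

NUM_PAGES = 16
free_pages = range(4, 12)  # assume the guest reports pages 4..11 as free

# Bulk stage: every page starts out dirty, so each free page hint
# removes one page from the set to be sent.
bulk = [1] * NUM_PAGES
bulk_cleared = clear_free_pages(bulk, free_pages)

# Later stage on a mostly idle guest: only pages 1 and 5 were written,
# so nearly all of the hints hit bits that are already 0.
later = [0] * NUM_PAGES
later[1] = later[5] = 1
later_cleared = clear_free_pages(later, free_pages)

print(bulk_cleared, sum(bulk))    # 8 8
print(later_cleared, sum(later))  # 1 1
```

In the bulk case all 8 hints save a page transfer, while in the later case 7 of the 8 hints change nothing.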

> I imagine some people might want to defer migration until reporting
> completes to reduce the load on the network. Fair enough,
> but it does not look like you actually measured the reduction
> in traffic. So I suggest you work on that as a separate feature.

I do have the traffic data. Tested with an 8G idle guest, legacy vs. optimization: ~390MB vs. ~337MB. The legacy case already has the zero page checking optimization, so the traffic reduction is not very large. But zero checking has much more overhead, as the migration time shows (migration with this optimization takes ~14% of the legacy migration time).
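
The two percentages are easy to mix up, so here is the arithmetic as a short Python sketch. It uses only the approximate figures quoted above, and the variable names are chosen for illustration.

```python
# Approximate figures from the measurement above (8G idle guest).
legacy_mb = 390.0
optimized_mb = 337.0

saved_mb = legacy_mb - optimized_mb
traffic_reduction = saved_mb / legacy_mb

# Migration time with the optimization is ~14% of the legacy time,
# which means the time saved is the remaining fraction.
time_ratio = 0.14
time_saved = 1.0 - time_ratio

print(f"traffic: saved {saved_mb:.0f} MB ({traffic_reduction:.1%} less)")
print(f"time: {time_saved:.0%} shorter than legacy")
```

So the traffic shrinks by roughly 14%, while the migration time shrinks by roughly 86%.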

