From: Avi Kivity
Subject: Re: [Qemu-devel] Re: [Qemu-commits] [COMMIT 3086844] Instead of writing a zero page, madvise it away
Date: Mon, 22 Jun 2009 19:38:40 +0300
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1b3pre) Gecko/20090513 Fedora/3.0-2.3.beta2.fc11 Lightning/1.0pre Thunderbird/3.0b2

On 06/22/2009 07:25 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> On 06/22/2009 06:51 PM, Anthony Liguori wrote:
>>> From: Anthony Liguori <address@hidden>
>>>
>>> Otherwise, after migration, we end up with a much larger RSS size than we
>>> ought to have.
>>
>> We have the same issue on the migration source node. I don't see a simple
>> way to solve it, though.
>
> I don't follow. In this case, the issue is:
>
> 1) Start a guest with 1024, balloon down to 128MB. RSS size is now ~128MB
> 2) Live migrate to a different node
> 3) RSS on different node jumps to ~1GB

3.5) RSS on source node jumps to ~1GB, since reading the page instantiates the pte

> 4) Weep at all your lost memory

4.5) And at the swapping going on in the source node

> Xen had a similar issue. This ends up biting people who overcommit their VMs via ballooning, live migration, and badness ensues. At least for us, the error is swapping but madvise also avoids the issue by never consuming that memory to begin with.

Right. I'd love to do madvise() on the source node as well if we fault in a page and find out it's zero, but the guest (and aio) is still running and we might drop live data. We need a madvise(MADV_DONTNEED_IFZERO), or a mincore() flag that tells us if the page exists (vs. swapped). ksm would also do this, but it is overkill for some applications.
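
To make the gap concrete, here is a rough sketch of the two checks that are available today. The helper names page_is_zero() and page_is_resident() are made up for illustration and are not taken from the patch or from qemu. mincore() only reports whether a page is resident in RAM, so it cannot tell a never-instantiated page from a swapped-out one, and a zero check followed by a separate madvise() is racy while the guest is still writing.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes mincore() in glibc headers */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: true if every byte of the page is zero.
 * Note that reading the page is exactly what faults it in. */
static bool page_is_zero(const void *page, size_t len)
{
    const unsigned char *p = page;

    assert(page != NULL);
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical helper: 1 if the page is resident in RAM, 0 if not,
 * -1 if mincore() failed. The address must be page aligned.
 * A result of 0 does not say whether the page was never touched
 * or was swapped out, which is the missing information. */
static int page_is_resident(void *page)
{
    unsigned char vec = 0;
    long psz = sysconf(_SC_PAGESIZE);

    if (psz <= 0 || mincore(page, (size_t)psz, &vec) != 0) {
        return -1;
    }
    return vec & 1;
}
```

Even with both helpers, nothing stops the guest from storing to the page between the page_is_zero() check and a later madvise(MADV_DONTNEED), which is why an atomic "drop if zero" operation would be needed.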
Note that the patch contains a small bug -- the kernel is allowed to ignore the advice according to the manual page, so it's better to memset() the memory before dropping it.
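
Something along these lines is what I have in mind; this is only a sketch, and release_zero_page() is a made-up name rather than the function the patch touches. It assumes a page-aligned address inside a private anonymous mapping, as guest RAM typically is.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* exposes madvise() and MADV_DONTNEED */
#endif
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: make sure the page reads as zero, then ask the
 * kernel to drop its backing memory.
 *
 * The memset() guarantees correct contents even if the kernel ignores
 * the advice; the madvise() is then purely an optimization.
 * Returns the madvise() result (0 on success, -1 on error). */
static int release_zero_page(void *page, size_t len)
{
    assert(page != NULL);

    memset(page, 0, len);
    return madvise(page, len, MADV_DONTNEED);
}
```

The cost is that memset() may briefly instantiate a page that was not there before, but the contents are right whether or not the advice is honoured.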
-- error compiling committee.c: too many arguments to function