On Tue, Mar 19, 2013 at 01:49:34PM -0400, Michael R. Hines wrote:
I also did a test using RDMA + cgroup, and the kernel killed my QEMU :)
So, infiniband is not smart enough to know how to avoid pinning a
zero page, I guess.
- Michael
On 03/19/2013 01:14 PM, Paolo Bonzini wrote:
On 19/03/2013 18:09, Michael R. Hines wrote:
Is allowing QEMU to swap due to a cgroup limit during migration a viable
overcommit option?
I'm trying to keep an open mind, but that would kill the migration
time.....
Would it swap? Doesn't the kernel back all zero pages with a single
copy-on-write page? If that still accounts towards cgroup limits, it
would be a bug.
Old kernels do not have a shared zero hugepage, and that includes some
distro kernels. Perhaps that's the problem.
Paolo
It really shouldn't break COW if you don't request LOCAL_WRITE.
I think it's a kernel bug, and it has apparently been in the code since the
first version: the get_user_pages parameters are swapped.
I'll send a patch. If it's applied, you should also
change your code from
+ IBV_ACCESS_LOCAL_WRITE |
+ IBV_ACCESS_REMOTE_WRITE |
+ IBV_ACCESS_REMOTE_READ);
to
+ IBV_ACCESS_REMOTE_READ);
on send side.
Then, each time we detect that a page has changed, we must make sure to
unregister and re-register it. Or, if you want to be very
smart, check whether the PFN changed and re-register only
if it did.
This will make overcommit work.
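
A minimal sketch of the "check the PFN and re-register only if it changed"
idea, for illustration only. Everything named here is hypothetical and not
taken from the QEMU RDMA patch: struct tracked_page, struct page_ops and
refresh_registration() are made-up names, and the reg/dereg/lookup_pfn hooks
stand in for whatever the real code would use (presumably wrappers around
ibv_reg_mr()/ibv_dereg_mr() plus some PFN lookup). The demo_* functions are a
fake backend so the logic can be tried without RDMA hardware. The sketch
ignores the race between the PFN lookup and the registration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page bookkeeping (not from the QEMU RDMA patch). */
struct tracked_page {
    void    *addr;        /* page-aligned address of the registered page */
    uint64_t pinned_pfn;  /* frame observed when the page was registered */
    bool     registered;
};

/* Hypothetical hooks; each returns 0 on success. Real code might wrap
 * ibv_reg_mr()/ibv_dereg_mr() and a PFN lookup of its choice. */
struct page_ops {
    int (*reg)(void *addr);
    int (*dereg)(void *addr);
    int (*lookup_pfn)(void *addr, uint64_t *pfn);
};

/* Returns 1 if the page was (re-)registered, 0 if the existing
 * registration still matches the current frame, -1 on error. */
static int refresh_registration(struct tracked_page *p,
                                const struct page_ops *ops)
{
    uint64_t pfn;

    assert(p != NULL && ops != NULL);

    if (ops->lookup_pfn(p->addr, &pfn) != 0)
        return -1;
    if (p->registered && pfn == p->pinned_pfn)
        return 0;               /* same frame: registration still valid */

    /* Frame changed (e.g. COW after a guest write) or never registered. */
    if (p->registered) {
        if (ops->dereg(p->addr) != 0)
            return -1;
        p->registered = false;
    }
    if (ops->reg(p->addr) != 0)
        return -1;

    p->registered = true;
    p->pinned_pfn = pfn;
    return 1;
}

/* Stand-in backend: counts calls and reports a settable PFN. */
static uint64_t demo_pfn = 100;
static int demo_reg_calls, demo_dereg_calls;

static int demo_reg(void *addr)   { (void)addr; demo_reg_calls++;   return 0; }
static int demo_dereg(void *addr) { (void)addr; demo_dereg_calls++; return 0; }

static int demo_lookup_pfn(void *addr, uint64_t *pfn)
{
    (void)addr;
    *pfn = demo_pfn;
    return 0;
}
```

The intent is to call refresh_registration() for each page the migration
code sees as changed: an unchanged PFN costs one lookup, and only a page
that actually moved pays for the unregister/re-register round trip. How to
obtain the PFN is left open here; /proc/self/pagemap is one candidate, but
recent kernels hide PFNs from unprivileged processes.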