From: Anthony Liguori
Subject: Re: [Qemu-devel] [RFC PATCH] Exporting Guest RAM information for NUMA binding
Date: Mon, 21 Nov 2011 19:57:09 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.21) Gecko/20110831 Lightning/1.0b2 Thunderbird/3.1.13
On 11/21/2011 04:50 PM, Chris Wright wrote:
> * Peter Zijlstra (address@hidden) wrote:
>> On Mon, 2011-11-21 at 21:30 +0530, Bharata B Rao wrote:
>>> In the original post of this mail thread, I proposed a way to export guest RAM ranges (Guest Physical Address-GPA) and their corresponding host virtual mappings (Host Virtual Address-HVA) from QEMU (via QEMU monitor). The idea was to use these GPA to HVA mappings from tools like libvirt to bind specific parts of the guest RAM to different host nodes. This needed an extension to the existing mbind() to allow binding memory of a process (QEMU) from a different process (libvirt). This was needed since we wanted to do all this from libvirt.
>>>
>>> Hence I was coming from that background when I asked for extending ms_mbind() to take a tid parameter. If the QEMU community thinks that NUMA binding should all be done from outside of QEMU, it is needed, otherwise what you have should be sufficient.
>>
>> That's just retarded, and no you won't get such extensions. Poking at another process's virtual address space is just daft. Esp. if there's no actual reason for it.
>
> Need to separate the binding vs the policy mgmt. The policy mgmt could still be done outside, whereas the binding could still be done from w/in QEMU. A simple monitor interface to rebalance vcpu memory allocations to different nodes could very well schedule vcpu thread work in QEMU.
I really would prefer to avoid having such an interface. It's a shotgun that will only result in many poor feet being maimed. I can't tell you the number of times I've encountered people using CPU pinning when they have absolutely no business doing CPU pinning.
If we really believe such an interface should exist, then the interface should really come from the kernel. Once we have memgroups, there's no reason to involve QEMU at all. QEMU can define the memgroups based on the NUMA nodes and then it's up to the kernel as to whether it exposes controls to explicitly bind memgroups within a process or not.
Regards,

Anthony Liguori
> So, I agree, even if there is some external policy mgmt, it could still easily work w/ QEMU to use Peter's proposed interface.
>
> thanks,
> -chris
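
For readers following the thread, the GPA-to-HVA export described in the original proposal can be illustrated with a small sketch. It only shows the address arithmetic: given the RAM blocks a process like QEMU has mapped, it translates a guest physical range into the host virtual pieces that the process itself could then bind. The names `RamBlock` and `gpa_range_to_hva` and the addresses used are hypothetical, chosen for illustration; they are not QEMU's data structures or monitor output.

```python
from typing import List, NamedTuple, Tuple


class RamBlock(NamedTuple):
    """Hypothetical record: one contiguous chunk of guest RAM."""
    gpa: int   # guest physical start address
    hva: int   # host virtual start address inside the owning process
    size: int  # length in bytes


def gpa_range_to_hva(blocks: List[RamBlock], start: int,
                     length: int) -> List[Tuple[int, int]]:
    """Return (hva_start, length) pieces covering GPA [start, start+length).

    Parts of the range that fall into holes between blocks are skipped.
    """
    end = start + length
    pieces = []
    for b in sorted(blocks, key=lambda blk: blk.gpa):
        lo = max(start, b.gpa)
        hi = min(end, b.gpa + b.size)
        if lo < hi:  # the requested range overlaps this block
            pieces.append((b.hva + (lo - b.gpa), hi - lo))
    return pieces


# Assumed layout: 2 GiB below a hole, 2 GiB starting at the 4 GiB mark.
blocks = [
    RamBlock(gpa=0x0, hva=0x7F0000000000, size=0x80000000),
    RamBlock(gpa=0x100000000, hva=0x7F0080000000, size=0x80000000),
]

# A guest range that straddles the hole maps to two host virtual pieces.
pieces = gpa_range_to_hva(blocks, start=0x40000000, length=0x100000000)
for hva, size in pieces:
    print(hex(hva), hex(size))
```

Each resulting piece is an (address, length) pair in the owning process's address space, which is the form mbind() takes when that process binds its own memory. An external policy manager would only need to say which guest range goes to which host node.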