From: Thomas Huth
Subject: Re: [Qemu-devel] [PATCH] ppc: Yet another fix for the huge page support detection mechanism
Date: Fri, 15 Jul 2016 17:54:44 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2

On 15.07.2016 17:18, Greg Kurz wrote:
> On Fri, 15 Jul 2016 14:28:44 +0200
> Thomas Huth <address@hidden> wrote:
> 
>> On 15.07.2016 10:35, David Gibson wrote:
>>> On Fri, Jul 15, 2016 at 10:10:25AM +0200, Thomas Huth wrote:  
>>>> Commit 86b50f2e1bef ("Disable huge page support if it is not available
>>>> for main RAM") already made sure that huge page support is not announced
>>>> to the guest if the normal RAM of non-NUMA configurations is not backed
>>>> by a huge page filesystem. However, there is one more case that can go
>>>> wrong: NUMA is enabled, but the RAM of the NUMA nodes is not configured
>>>> with huge page support (and only the memory of a DIMM is configured with
>>>> it). When QEMU is started with the following command line for example,
>>>> the Linux guest currently crashes because it is trying to use huge pages
>>>> on a memory region that does not support huge pages:
>>>>
>>>>  qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
>>>>    memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 \
>>>>    -device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
>>>>    -numa node,nodeid=0 -numa node,nodeid=1
>>>>
>>>> To fix this issue, we've got to make sure to disable huge page support,
>>>> too, when there is a NUMA node that is not using a memory backend with
>>>> huge page support.
>>>>
>>>> Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740
>>>> Signed-off-by: Thomas Huth <address@hidden>
>>>> ---
>>>>  target-ppc/kvm.c | 10 +++++++---
>>>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>>>> index 884d564..7a8f555 100644
>>>> --- a/target-ppc/kvm.c
>>>> +++ b/target-ppc/kvm.c
>>>> @@ -389,12 +389,16 @@ static long getrampagesize(void)
>>>>  
>>>>      object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
>>>>  
>>>> -    if (hpsize == LONG_MAX) {
>>>> +    if (hpsize == LONG_MAX || hpsize == getpagesize()) {
>>>>          return getpagesize();
>>>>      }
>>>>  
>>>> -    if (nb_numa_nodes == 0 && hpsize > getpagesize()) {
>>>> -        /* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! */
>>>> +    /* If NUMA is disabled or the NUMA nodes are not backed with a
>>>> +     * memory-backend, then there is at least one node using "normal"
>>>> +     * RAM. And since normal RAM has not been configured with "-mem-path"
>>>> +     * (what we've checked earlier here already), we can not use huge pages!
>>>> +     */
>>>> +    if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) {  
>>>
>>> Is that second clause sufficient, or do you need to loop through and
>>> check the memdev of every node?  
>>
>> Checking the first entry should be sufficient. QEMU forces you to either
>> specify a memory backend for all NUMA nodes (which we should have looked
>> at during the object_child_foreach() some lines earlier), or to specify
>> no memory backend for any NUMA node at all. You cannot mix the settings,
>> so checking numa_info[0] is enough.
> 
> And what happens if we specify a hugepage memdev backend to one of the
> nodes and a regular RAM memdev backend to the other?

I think that should be handled with the object_child_foreach() logic in
that function ... unless I completely misunderstood the code ;-)
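
For illustration, the walker boils down to "the smallest page size of
any backend wins", so mixing hugetlbfs and regular backends can only
ever lower the result. Roughly like the sketch below -- this is just a
paraphrase of the idea in the context of target-ppc/kvm.c, not the
exact find_max_supported_pagesize() code; it assumes the QOM helpers
and the gethugepagesize() helper used in that file, and
find_min_backend_pagesize is only an illustrative name:

static int find_min_backend_pagesize(Object *obj, void *opaque)
{
    long *hpsize_min = opaque;

    if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
        char *mem_path = object_property_get_str(obj, "mem-path", NULL);
        /* A backend without a mem-path is plain anonymous RAM and can
         * only offer the normal page size. */
        long psize = mem_path ? gethugepagesize(mem_path) : getpagesize();

        if (psize < *hpsize_min) {
            *hpsize_min = psize;
        }
        g_free(mem_path);
    }
    return 0;   /* 0 == keep iterating over the remaining children */
}

In the mixed case you describe, the regular-RAM backend pushes hpsize
down to getpagesize(), so the "hpsize == getpagesize()" check from the
patch kicks in and huge pages stay disabled.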

> I actually wanted to try that but I hit an assertion, which isn't
> related to this patch I think:
> 
> qemu-system-ppc64: memory.c:1934: memory_region_add_subregion_common: Assertion `!subregion->container' failed.

I just tried that, too, and I did not get that assertion:

qemu-system-ppc64 -enable-kvm ... -m 2G,slots=4,maxmem=32G \
 -object memory-backend-file,policy=default,mem-path=/mnt/kvm_hugepage,size=1G,id=mem-mem1 \
 -object memory-backend-file,policy=default,mem-path=/mnt,size=1G,id=mem-mem2 \
 -smp 2 -numa node,nodeid=0,memdev=mem-mem1 \
 -numa node,nodeid=1,memdev=mem-mem2

And the guest was starting fine, with huge pages disabled.

> So I tried to trick the logic you are trying to fix the other way
> round:
> 
> -mem-path /dev/hugepages \
> -m 1G,slots=4,maxmem=32G \
> -object memory-backend-ram,policy=default,size=1G,id=mem-mem1 \
> -device pc-dimm,id=dimm-mem1,memdev=mem-mem1 \
> -smp 2 \
> -numa node,nodeid=0 -numa node,nodeid=1
> 
> The guest fails the same way as before your patch: the hugepage size is
> advertised to the guest, but the NUMA node is associated with regular RAM.

You're right, this is still an issue here! ... so we need yet another
fix for this case :-/
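
If I remember the beginning of getrampagesize() correctly, a global
-mem-path takes an early exit before the memory-backend walk, which
would explain why the memory-backend-ram DIMM never gets a chance to
pull the page size back down. Roughly (a sketch from memory, not a
verbatim quote of target-ppc/kvm.c):

static long getrampagesize(void)
{
    if (mem_path) {
        /* Early exit for "-mem-path": the memdev objects are never
         * inspected, so a memory-backend-ram pc-dimm cannot prevent
         * the huge page size from being advertised. */
        return gethugepagesize(mem_path);
    }

    /* ... the object_child_foreach() walk and the checks from the
     * patch above follow here ... */
}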

Thanks for the testing!

 Thomas

