From: Anshuman Khandual
Subject: Re: [Qemu-devel] [RFC PATCH v0] spapr: Disable memory hotplug when HTAB size is insufficient
Date: Wed, 09 Sep 2015 14:36:19 +0530
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 09/04/2015 09:42 PM, Michael Roth wrote:
> Quoting Nathan Fontenot (2015-09-04 10:49:18)
>> On 09/04/2015 10:33 AM, Michael Roth wrote:
>>> Quoting Nathan Fontenot (2015-09-03 13:50:59)
>>>> On 09/01/2015 10:28 PM, Bharata B Rao wrote:
>>>>> On Mon, Aug 24, 2015 at 09:01:51AM +0530, Bharata B Rao wrote:
>>>>>> The hash table size allocated to guest depends on the maxmem size.
>>>>>> If the host isn't able to allocate the required hash table size but
>>>>>> instead allocates less than the optimal requested size, then it will
>>>>>> not be possible to grow the RAM until maxmem via memory hotplug.
>>>>>> Attempts to hotplug memory up to maxmem can therefore fail, and this
>>>>>> failure is not currently handled gracefully by the guest kernel,
>>>>>> causing a guest kernel oops.
>>>>>>
>>>>>> This should eventually get fixed when we move to completely in-kernel
>>>>>> memory hotplug instead of the current method where userspace tool drmgr
>>>>>> drives the hotplug. Until the in-kernel memory hotplug is available
>>>>>> for PowerKVM, disable memory hotplug when requested hash table size
>>>>>> isn't allocated.
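For context on how the hash table size relates to maxmem, a minimal sketch, assuming the conventional PAPR sizing of roughly 1/128 of maximum RAM rounded up to a power of two (the exact QEMU spapr policy may differ; the function name and the minimum shift of 18 are illustrative assumptions):

```shell
# Hedged sketch: hash page table (HPT) shift for a given maxmem,
# assuming the PAPR convention of ~1/128 of max RAM rounded up to a
# power of two. The minimum shift of 18 is an illustrative assumption.
hpt_shift_for_maxmem() {
    local maxmem=$1 target s=18
    target=$(( maxmem / 128 ))          # ~1/128 of max memory
    while [ $(( 1 << s )) -lt "$target" ]; do
        s=$(( s + 1 ))                  # round up to a power of two
    done
    echo "$s"                           # HPT size is 2^s bytes
}

# Example: 256 GiB maxmem needs a 2^31-byte (2 GiB) hash table
hpt_shift_for_maxmem $(( 256 << 30 ))
```

If the host cannot allocate a contiguous region of that size, the guest gets a smaller table than the maxmem-derived optimum, which is the situation the patch guards against.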
>>>>>
>>>>> David - Do you have any views on how to go about this ? Due to the way
>>>>> we do hotplug currently using drmgr, it appears that it is very difficult
>>>>> to have a graceful recovery within the guest kernel when memory hotplug
>>>>> request can't be fulfilled due to insufficient HTAB size. (Anshuman can
>>>>> elaborate on this with the exact description on why it is so hard to
>>>>> recover).
>>>>>
>>>>> Do you think disabling memory hotplug upfront is a reasonable workaround
>>>>> for this problem ?
>>>>>
>>>>> Nathan - When you enable in-kernel memory hotplug for PowerKVM, will you
>>>>> be exporting something for the userspace (capability ?) to check and
>>>>> determine the presence of the in-kernel memory hotplug feature so that we
>>>>> can depend on graceful recovery instead of upfront disablement of
>>>>> memory hotplug from QEMU ?
>>>>>
>>>>
>>>> I did not have any plans currently to export something indicating we are
>>>> using the in-kernel memory hotplug code.
>>>>
>>>> Perhaps this is something we should consider adding to the PAPR update
>>>> proposal that is being worked on? Something to indicate we can gracefully
>>>> handle adding memory beyond the HTAB size.
>>>
>>> That might make sense, but I'm curious what constitutes graceful
>>> recovery in this context. What can we do with in-kernel hotplug that's not
>>> possible with userspace tools? If it's graceful failure, is there really
>>> nothing that can be done by QEMU at the DRC level to get the same
>>> result?
>>
>> I don't have an answer for how to recover gracefully or if it will be 
>> possible.
> 
> Sorry, I meant it as a general question. Bharata mentioned Anshuman might have
> some further details?

Graceful recovery in the kernel seems to be difficult (though I cannot
say whether it is impossible) because of the way we have implemented
memory hotplug with the help of the userspace tool 'drmgr'. It performs
memory hotplug in two distinct steps after receiving the platform
notification:

(1) Update /proc/ofdt
(2) Write into /sys/devices/system/memory/probe
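The two steps above can be sketched as follows. The device-tree payload and the memory block start address are made-up placeholders, and the commands are only printed rather than executed, since the real writes require a PAPR guest:

```shell
# Hedged sketch of the two userspace steps drmgr performs after a
# hotplug notification. The paths follow the description above; the
# payload and address are illustrative placeholders.
OFDT_UPDATE="add_node /memory@40000000"    # placeholder payload
MEM_BLOCK_ADDR=0x40000000                  # placeholder start address

run() { echo "would run: $*"; }            # print instead of execute

# (1) Update the device tree through /proc/ofdt
run "echo ${OFDT_UPDATE} > /proc/ofdt"

# (2) Probe the new memory block into the kernel
run "echo ${MEM_BLOCK_ADDR} > /sys/devices/system/memory/probe"

# If step (2) fails, drmgr can restore the /proc/ofdt contents, but any
# partial kernel-side state from the failed probe is not rolled back.
```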

Both of these steps add the new memory block into the kernel (the generic
and the arch-specific representations). If step (2) fails, we restore the
/proc/ofdt value to the state it had before the hotplug operation started.
In short, this does not gracefully roll back all the changes made in
steps (1) and (2), partly because the operation happens in two distinct
steps driven from user space.

Had it been attempted in a single step, the kernel could have reverted any
changes right away before returning to userspace. The new in-kernel memory
hotplug method follows this principle.



