qemu-devel

Re: [Qemu-devel] [PATCH 1/1] balloon: add a feature bit to let Guest OS


From: Christian Borntraeger
Subject: Re: [Qemu-devel] [PATCH 1/1] balloon: add a feature bit to let Guest OS deflate balloon on oom
Date: Fri, 12 Jun 2015 13:56:37 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

Am 10.06.2015 um 15:13 schrieb Michael S. Tsirkin:
> On Wed, Jun 10, 2015 at 03:02:21PM +0300, Denis V. Lunev wrote:
>> On 09/06/15 13:37, Christian Borntraeger wrote:
>>> Am 09.06.2015 um 12:19 schrieb Denis V. Lunev:
>>>> Excessive virtio_balloon inflation can cause invocation of OOM-killer,
>>>> when Linux is under severe memory pressure. Various mechanisms are
>>>> responsible for correct virtio_balloon memory management. Nevertheless it
>>>> is often the case that these control tools do not have enough time to
>>>> react to a fast-changing memory load. As a result the OS runs out of memory and
>>>> invokes OOM-killer. The balancing of memory by use of the virtio balloon
>>>> should not cause the termination of processes while there are pages in the
>>>> balloon. Currently there is no way for the virtio balloon driver to free
>>>> memory at the last moment before some process gets killed by the OOM-killer.
>>>>
>>>> This does not open a security hole, as the balloon itself runs
>>>> inside the Guest OS and works in cooperation with the host. Thus
>>>> some improvements from the Guest side should be considered normal.
>>>>
>>>> To solve the problem, introduce a virtio_balloon callback which is
>>>> expected to be called from the oom notifier call chain in out_of_memory()
>>>> function. If virtio balloon could release some memory, it will make the
>>>> system return and retry the allocation that forced the out of memory
>>>> killer to run.
>>>>
>>>> This behavior should be enabled if and only if appropriate feature bit
>>>> is set on the device. It is off by default.
>>> The balloon frees pages in this way:
>>>
>>> static void balloon_page(void *addr, int deflate)
>>> {
>>> #if defined(__linux__)
>>>     if (!kvm_enabled() || kvm_has_sync_mmu())
>>>         qemu_madvise(addr, TARGET_PAGE_SIZE,
>>>                 deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
>>> #endif
>>> }
>>>
>>> The guest can re-touch that page and get an empty zero page or the old
>>> page back without tampering with the host's integrity. This should work
>>> for all cases I am aware of (without sync_mmu it's a nop anyway), so why
>>> not enable that by default? Anything that I missed?
>>>
>>> Christian
>>
>> I'd like to do that :) Actually original version of kernel patch
>> has enabled this unconditionally. But Michael asked to make
>> it configurable and off by default.
>>
>> Den
> 
> That's not the question here.  The question is why is it limited by
> kvm_has_sync_mmu.

Well we have two interesting options here:

VIRTIO_BALLOON_F_MUST_TELL_HOST and VIRTIO_BALLOON_F_DEFLATE_ON_OOM

For any sane host with on-demand paging, just re-accessing the page
should simply work. So the common case could be
VIRTIO_BALLOON_F_MUST_TELL_HOST == off
VIRTIO_BALLOON_F_DEFLATE_ON_OOM == on
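
The "re-accessing should simply work" part can be tried from plain userspace
on Linux. This is only a rough sketch, not QEMU or guest code: it assumes an
anonymous private mapping standing in for guest RAM, and the function name
retouch_after_dontneed is made up for illustration. It dirties a page, drops
it with MADV_DONTNEED (what balloon_page() does on inflate), and then touches
it again without telling anyone.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo helper (not part of QEMU): dirty one anonymous private
 * page, drop it with MADV_DONTNEED, then touch it again.
 * Returns the first byte read after the re-touch, or -1 on error.
 */
int retouch_after_dontneed(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t len = (size_t)page_size;

    unsigned char *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    memset(page, 0x5a, len);            /* the "guest" dirties the page */
    assert(page[0] == 0x5a);

    /* "inflate": tell the kernel the contents are no longer needed */
    if (madvise(page, len, MADV_DONTNEED) != 0) {
        munmap(page, len);
        return -1;
    }

    /* re-touch without any notification; the kernel faults in a fresh page */
    int first = page[0];

    munmap(page, len);
    return first;
}
```

For a private anonymous mapping on Linux the re-touch should read back a
zero-filled page, so the call is expected to return 0 rather than 0x5a or a
fault.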

Only for the rare case of hypervisors without paging or with other
memory-related restrictions do we have to enable MUST_TELL_HOST.
Now: QEMU knows exactly which case we have, so why not let QEMU tell
the guest what the capabilities are (e.g. sync_mmu ---> no need to
tell the host)?
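
As a sketch of letting the host side derive the feature bits from what it
knows: the helper balloon_host_features and its two boolean parameters are
invented for illustration and are not QEMU code. The bit numbers are the ones
I believe the virtio balloon header uses, with DEFLATE_ON_OOM being the bit
proposed in this series; treat them as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers (assumed to match the virtio balloon uapi header). */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST  0
#define VIRTIO_BALLOON_F_STATS_VQ        1
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM  2

/* Turn a feature bit number into a mask. */
uint32_t feature_mask(unsigned bit)
{
    assert(bit < 32);
    return UINT32_C(1) << bit;
}

/*
 * Hypothetical helper: pick the balloon features to offer to the guest.
 *  - host_pages_on_demand: the host can fault a dropped page back in
 *    (e.g. the sync_mmu case), so the guest need not tell the host
 *    before reusing a ballooned page.
 *  - deflate_on_oom: admin knob, intended to default to true.
 */
uint32_t balloon_host_features(bool host_pages_on_demand, bool deflate_on_oom)
{
    uint32_t features = 0;

    if (!host_pages_on_demand)
        features |= feature_mask(VIRTIO_BALLOON_F_MUST_TELL_HOST);
    if (deflate_on_oom)
        features |= feature_mask(VIRTIO_BALLOON_F_DEFLATE_ON_OOM);

    return features;
}
```

With these assumptions the common case, balloon_host_features(true, true),
offers only DEFLATE_ON_OOM, and MUST_TELL_HOST shows up only for hosts that
cannot page on demand.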

I can at least imagine that some admin wants to make the OOM case
configurable, but a sane default seems to be to not kill random
guest processes.

Christian



