qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC PATCH qemu 0/4] memory: Reduce memory use


From: Alexey Kardashevskiy
Subject: Re: [Qemu-devel] [RFC PATCH qemu 0/4] memory: Reduce memory use
Date: Fri, 8 Sep 2017 12:08:08 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

On 08/09/17 00:54, Dr. David Alan Gilbert wrote:
> * Alexey Kardashevskiy (address@hidden) wrote:
>> On 07/09/17 19:51, Dr. David Alan Gilbert wrote:
>>> * Alexey Kardashevskiy (address@hidden) wrote:
>>>> This was inspired by https://bugzilla.redhat.com/show_bug.cgi?id=1481593
>>>>
>>>> What happens ithere is that every virtio block device creates 2 address
>>>> spaces - for modern config space (called "virtio-pci-cfg-as") and
>>>> for busmaster (common pci thing, called after the device name,
>>>> in my case "virtio-blk-pci").
>>>>
>>>> Each address_space_init() updates topology for every address space.
>>>> Every topology update (address_space_update_topology()) creates a new
>>>> dispatch tree - AddressSpaceDispatch with nodes (1KB) and
>>>> sections (48KB) and destroys the old one.
>>>>
>>>> However the dispatch destructor is postponed via RCU which does not
>>>> get a chance to execute until the machine is initialized but before
>>>> we get there, memory is not returned to the pool, and this is a lot
>>>> of memory which grows n^2.
>>>>
>>>> These patches are trying to address the memory use and boot time
>>>> issues but tbh only the first one provides visible outcome.
>>>
>>> Do you have a feel for how much memory is saved?
>>
>>
>> The 1/4 saves ~33GB (~44GB -> 11GB) for a 2GB guest and 400 virtio-pci
>> devices. These GB figures are the peak values (but it does not matter for
>> OOM killer), memory gets released in one go when RCU kicks in, it just
>> happens too late.
> 
> Nice saving!  Still, why is it using 11GB?

Yet to be discovered :) Not clear at the moment.


> What's it like for more sane configurations, say 2-3 virtio devices - is
> there anything noticable or is it just the huge setups?
> 
> Dave
> 
> 
>> The 3/4 saves less, I'd say 50KB per VCPU (more if you count peaks but so
>> much). Strangely, I do not see the difference in valgrind output when I run
>> a guest with 1024 or just 8 CPUs, probably "massif" is not the right tool
>> to catch this.

I did some more tests.

v2.10:
1024 CPUs, no virtio:     0:47   490.8MB   38/34
1 CPU, 500 virtio-block:  5:03   59.69GB   2354438/3

1/4 applied:
1024 CPUs, no virtio:     0:49   490.8MB   38/34
1 CPU, 500 virtio-block:  1:57   17.74GB   2186/3

3/4 applied:
1024 CPUs, no virtio:     0:53   491.1MB   20/17
1 CPU, 500 virtio-block:  2:01    17.7GB   2167/0


Time is what it takes to start QEMU with -S and then Q-Ax.

Memory amount is peak use from valgrind massif.

Last 2 numbers - "38/34" for example - 38 is the number of g_new(FlatView,
1), 34 is the number of g_free(view); the numbers are printed at
https://git.qemu.org/?p=qemu.git;a=blob;f=vl.c;h=8e247cc2a239ae8fb3d3cdf6d4ee78fd723d1053;hb=1ab5eb4efb91a3d4569b0df6e824cc08ab4bd8ec#l4666
before RCU kicks in.

500 virtio-block + bridges use around 1100 address spaces.





>>
>>>
>>> Dave
>>>
>>>> There are still things to polish and double check the use of RCU,
>>>> I'd like to get any feedback before proceeding - is this going
>>>> the right way or way too ugly?
>>>>
>>>>
>>>> This is based on sha1
>>>> 1ab5eb4efb Peter Maydell "Update version for v2.10.0 release".
>>>>
>>>> Please comment. Thanks.
>>>>
>>>>
>>>>
>>>> Alexey Kardashevskiy (4):
>>>>   memory: Postpone flatview and dispatch tree building till all devices
>>>>     are added
>>>>   memory: Prepare for shared flat views
>>>>   memory: Share flat views and dispatch trees between address spaces
>>>>   memory: Add flat views to HMP "info mtree"
>>>>
>>>>  include/exec/memory-internal.h |   6 +-
>>>>  include/exec/memory.h          |  93 +++++++++----
>>>>  exec.c                         | 242 +++++++++++++++++++--------------
>>>>  hw/alpha/typhoon.c             |   2 +-
>>>>  hw/dma/rc4030.c                |   4 +-
>>>>  hw/i386/amd_iommu.c            |   2 +-
>>>>  hw/i386/intel_iommu.c          |   9 +-
>>>>  hw/intc/openpic_kvm.c          |   2 +-
>>>>  hw/pci-host/apb.c              |   2 +-
>>>>  hw/pci/pci.c                   |   3 +-
>>>>  hw/ppc/spapr_iommu.c           |   4 +-
>>>>  hw/s390x/s390-pci-bus.c        |   2 +-
>>>>  hw/vfio/common.c               |   6 +-
>>>>  hw/virtio/vhost.c              |   6 +-
>>>>  memory.c                       | 299 
>>>> +++++++++++++++++++++++++++--------------
>>>>  monitor.c                      |   3 +-
>>>>  vl.c                           |   4 +
>>>>  hmp-commands-info.hx           |   7 +-
>>>>  18 files changed, 448 insertions(+), 248 deletions(-)
>>>>
>>>> -- 
>>>> 2.11.0
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / address@hidden / Manchester, UK
>>>
>>
>>
>> -- 
>> Alexey
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
> 


-- 
Alexey



reply via email to

[Prev in Thread] Current Thread [Next in Thread]