qemu-devel

Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage


From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
Date: Tue, 28 Jun 2016 08:56:10 -0400 (EDT)


----- Original Message -----
> From: "Peter Lieven" <address@hidden>
> To: "Paolo Bonzini" <address@hidden>
> Cc: address@hidden, address@hidden, "peter maydell" <address@hidden>, 
> address@hidden,
> address@hidden, address@hidden, address@hidden
> Sent: Tuesday, June 28, 2016 2:33:02 PM
> Subject: Re: [PATCH 00/15] optimize Qemu RSS usage
> 
> Am 28.06.2016 um 14:29 schrieb Paolo Bonzini:
> >> Am 28.06.2016 um 13:37 schrieb Paolo Bonzini:
> >>> On 28/06/2016 11:01, Peter Lieven wrote:
> >>>> I recently found that Qemu is using several hundred megabytes of RSS
> >>>> memory more than older versions such as Qemu 2.2.0. So I started
> >>>> tracing memory allocation and found 2 major reasons for this.
> >>>>
> >>>> 1) We changed the qemu coroutine pool to have a per thread and a global
> >>>>      release pool. The chosen pool size and the changed algorithm could
> >>>>      lead to up to 192 free coroutines with just a single iothread, each
> >>>>      of the coroutines in the pool having 1MB of stack memory.
> >>> But the fix, as you correctly note, is to reduce the stack size.  It
> >>> would be nice to compile block-obj-y with -Wstack-usage=2048 too.
> >> To reveal if there are any big stack allocations in the block layer?
> > Yes.  Most should be fixed by now, but a handful are probably still there.
> > (definitely one in vvfat.c).
> >
> >> It seems that reducing it to 64kB breaks live migration in some
> >> (non-reproducible) cases.
> > Does it hit the guard page?
> 
> What would that look like? I get segfaults like this:
> 
> segfault at 7f91aa642b78 ip 0000555ab714ef7d sp 00007f91aa642b50 error 6 in
> qemu-system-x86_64[555ab6f2c000+794000]
> 
> Most of the time it is error 6, sometimes error 7. The faulting address is
> near the sp.

You can use "p ((CoroutineUContext*)current)->stack" from gdb
to check the stack base of the currently running coroutine (do it in the thread
that received the segfault).

You can also check the instruction at that ip and try to get a backtrace.
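
To make the guard-page question concrete, here is a minimal sketch in C of
the comparison you would do by hand with the values from gdb and the kernel
log. It assumes the stack grows down from base + size towards base, and that
a guard region (if there is one) sits directly below base; whether your build
has a guard page at all is something to verify. The names classify_fault and
decode_pf_error are hypothetical helpers, not QEMU functions. The error-code
bits are the x86 page-fault ones: bit 0 set means protection violation (clear
means page not present), bit 1 set means write, bit 2 set means user mode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of QEMU. */

enum fault_where { FAULT_IN_STACK, FAULT_IN_GUARD, FAULT_ELSEWHERE };

/*
 * Classify a faulting address relative to a coroutine stack that occupies
 * [base, base + stack_size). Assumption: a guard region of guard_size bytes
 * lies immediately below base.
 */
static enum fault_where classify_fault(uintptr_t addr, uintptr_t base,
                                       size_t stack_size, size_t guard_size)
{
    if (addr >= base && addr - base < stack_size) {
        return FAULT_IN_STACK;
    }
    if (addr < base && base - addr <= guard_size) {
        return FAULT_IN_GUARD;
    }
    return FAULT_ELSEWHERE;
}

/* Decoded low bits of the x86 page-fault error code from the kernel log. */
typedef struct PfError {
    int protection; /* bit 0: 1 = protection violation, 0 = page not present */
    int write;      /* bit 1: 1 = write access, 0 = read access */
    int user;       /* bit 2: 1 = fault happened in user mode */
} PfError;

static PfError decode_pf_error(unsigned err)
{
    PfError e;
    e.protection = (err & 1u) != 0;
    e.write = (err & 2u) != 0;
    e.user = (err & 4u) != 0;
    return e;
}
```

With these definitions, "error 6" decodes to a user-mode write to a page that
is not present, and "error 7" to a user-mode write that violated the page
protection. Feeding classify_fault the faulting address from the log and the
stack base printed by gdb tells you whether the access was inside the
coroutine stack, just below it, or somewhere unrelated.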

Paolo


> >>>> 2) Between Qemu 2.2.0 and 2.3.0 RCU was introduced, which led to delayed
> >>>>      freeing of memory. This led to higher heap allocations which could
> >>>>      not effectively be returned to the kernel (most likely due to
> >>>>      fragmentation).
> >>> I agree that some of the exec.c allocations need some care, but I would
> >>> prefer to use a custom free list or lazy allocation instead of mmap.
> >> This would only help if the elements from the free list were allocated
> >> using mmap? The issue is that RCU delays the freeing so that the number of
> >> concurrent allocations is high and then a bunch is freed at once. If the
> >> memory was malloced it would still have caused trouble.
> > The free list should improve reuse and fragmentation.  I'll take a look at
> > lazy allocation of subpages, too.
> 
> Ok, that would be good. And for the PhysPageMap we use mmap and try to avoid
> the realloc?

I think that with lazy allocation of subpages the PhysPageMap will be much
smaller, but I need to check.
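
For reference, this is the general shape of what I mean by lazy allocation,
as a minimal sketch in C and not the actual exec.c code. LazyTable, lazy_get,
lazy_free_all and the two sizes are hypothetical names and values chosen for
illustration. The point is only that a slot costs one NULL pointer until it
is first touched, so memory use follows the number of slots actually used.

```c
#include <stddef.h>
#include <stdlib.h>

#define NUM_SLOTS  16    /* hypothetical table size */
#define SLOT_BYTES 4096  /* hypothetical payload size per slot */

/* Hypothetical table: every slot starts as NULL and is filled on demand. */
typedef struct LazyTable {
    void *slot[NUM_SLOTS];
    size_t allocated;    /* number of slots that currently own memory */
} LazyTable;

/* Return slot i, allocating zeroed memory for it on first access. */
static void *lazy_get(LazyTable *t, size_t i)
{
    if (i >= NUM_SLOTS) {
        return NULL;
    }
    if (!t->slot[i]) {
        t->slot[i] = calloc(1, SLOT_BYTES);
        if (t->slot[i]) {
            t->allocated++;
        }
    }
    return t->slot[i];
}

/* Release every slot that was actually allocated. */
static void lazy_free_all(LazyTable *t)
{
    size_t i;

    for (i = 0; i < NUM_SLOTS; i++) {
        free(t->slot[i]);
        t->slot[i] = NULL;
    }
    t->allocated = 0;
}
```

A zero-initialized LazyTable owns no slot memory at all; the first lazy_get()
on an index allocates it, and later calls on the same index return the same
pointer.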

Paolo


