From: ChristianEhrhardt
Subject: [Qemu-devel] [Bug 1740219] Re: static linux-user ARM emulation has several-second startup time
Date: Tue, 03 Apr 2018 09:49:14 -0000
Back again,
my question was more about whether we can take JUST
2a53535af471f4bee9d6cb5b363746b8d5ed21dd without the rest.
We are already in Feature Freeze for Ubuntu 18.04, so we can either
a) wait for the next release and pick it up in full by the new qemu
version (well we will do that anyway)
b) identify a fix-only patch (without all the cleanup and rework) that will
be good for the 2.11.1 in Bionic
Being "just slow" rather than broken makes this harder to justify the
closer we get to release (as a performance engineer I hate that too, but
minimizing regressions is a goal as well :-) ).
Essentially, to some extent, being in feature freeze is as if we were under [1]
already.
So will 2a53535af471f4bee9d6cb5b363746b8d5ed21dd alone be good in your opinion?
Or will it need more, and if so, what would be the minimal set of your changes?
[1]: https://wiki.ubuntu.com/StableReleaseUpdates
--
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1740219
Title:
static linux-user ARM emulation has several-second startup time
Status in QEMU:
New
Status in qemu package in Ubuntu:
Triaged
Bug description:
static linux-user emulation has several-second startup time
My problem: I'm a Parabola packager, and I'm updating our
qemu-user-static package from 2.8 to 2.11. With my new
statically-linked 2.11, running `qemu-arm /my/arm-chroot/bin/true`
went from taking 0.006s to 3s! This does not happen with the normal
dynamically linked 2.11, or the old static 2.8.
What happens is it gets stuck in
`linux-user/elfload.c:init_guest_space()`. What `init_guest_space`
does is map 2 parts of the address space: `[base, base+guest_size]`
and `[base+0xffff0000, base+0xffff0000+page_size]`; where it must find
an acceptable `base`. Its strategy is to `mmap(NULL, guest_size,
...)` to decide where the first range is, and then check if that
address +0xffff0000 is also available. If it isn't, then it starts trying
`mmap(base, ...)` for the entire address space from low-address to
high-address.
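To make the search concrete, here is a minimal Python model of the strategy described above. It is only a sketch and not QEMU's code: the names `overlaps`, `base_ok` and `linear_search` are hypothetical, the host address space is modelled as a list of already-mapped `(start, end)` ranges, ranges are treated as half-open, and no real `mmap` calls are made.

```python
COMMPAGE_OFFSET = 0xffff0000  # offset of the second range, per the description


def overlaps(start, end, occupied):
    """True if the half-open range [start, end) intersects any occupied range."""
    return any(start < o_end and o_start < end for o_start, o_end in occupied)


def base_ok(base, guest_size, page_size, occupied):
    """A base is acceptable only if both needed ranges are free."""
    main_free = not overlaps(base, base + guest_size, occupied)
    high_free = not overlaps(base + COMMPAGE_OFFSET,
                             base + COMMPAGE_OFFSET + page_size, occupied)
    return main_free and high_free


def linear_search(first_base, guest_size, page_size, occupied, limit):
    """Try page-aligned bases from low to high.

    Returns (base, tries) on success or (None, tries) if `limit` is reached.
    """
    tries = 0
    base = first_base
    while base < limit:
        tries += 1
        if base_ok(base, guest_size, page_size, occupied):
            return base, tries
        base += page_size
    return None, tries
```

In this model, one occupied page in the way costs one failed try for every candidate base whose ranges touch it, which is the behaviour that becomes expensive when the obstacle is large.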
"Normally," it finds an acceptable `base` within the first 2 tries.
With a static 2.11, it's taking thousands of tries.
----
Now, from my understanding, there are 2 factors working together to
cause that in static 2.11 but not the other builds:
- 2.11 increased the default `guest_size` from 0xf7000000 to 0xffff0000
- PIE (and thus ASLR) is disabled for static builds
For some reason that I don't understand, with the smaller
`guest_size` the initial `mmap(NULL, guest_size, ...)` usually
returns an acceptable address range; but larger `guest_size` makes it
consistently return a block of memory that butts right up against
another already mapped chunk of memory. This isn't just true on the
older builds, it's true with the 2.11 builds if I use the `-R` flag to
shrink the `guest_size` back down to 0xf7000000. That is with
linux-hardened 4.13.13 on x86-64.
So then, it falls back to crawling the entire address space; so it
tries base=0x00001000. With ASLR, that probably succeeds. But with
ASLR being disabled on static builds, the text segment is at
0x60000000, which does not leave room for the needed
0xffff1000-size block before it. So then it tries base=0x00002000.
And so on, more than 6000 times until it finally gets to and passes
the text segment; calling mmap more than 12000 times.
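The arithmetic in this paragraph can be checked directly. The snippet below is a sketch under stated assumptions: a 4 KiB host page size, the 2.11 default `guest_size`, and the text segment address reported above. `fits_below` is a hypothetical helper, not a QEMU function.

```python
PAGE_SIZE = 0x1000        # assumption: 4 KiB host pages
GUEST_SIZE = 0xffff0000   # default guest_size in 2.11, per the description
TEXT_START = 0x60000000   # text segment of the non-PIE static binary, as reported

# With guest_size == 0xffff0000 the two ranges are adjacent, so together
# they need one contiguous block of guest_size + one page.
SPAN = GUEST_SIZE + PAGE_SIZE


def fits_below(base, limit, span=SPAN):
    """True if a block of `span` bytes starting at `base` ends at or before `limit`."""
    return base + span <= limit


print(hex(SPAN))                          # 0xffff1000
print(fits_below(0x1000, TEXT_START))     # False
```

Since `SPAN` is larger than `TEXT_START`, no non-negative `base` can place the whole block below the text segment, so every candidate below it is bound to fail.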
----
I'm not sure what the fix is. Perhaps try to mmap a contiguous chunk
of size 0xffff1000, then munmap it and then mmap the 2 chunks that we
actually need. The disadvantage to that is that it does not support
the sparse address space that the current algorithm supports for
`guest_size < 0xffff0000`. If `guest_size < 0xffff0000` *and* the big
mmap fails, then it could fall back to a sparse search; though I'm not
sure the current algorithm is a good choice for it, as we see in this
bug. Perhaps it should inspect /proc/self/maps to try to find a
suitable range before ever calling mmap?
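As an illustration of that last idea, here is a small Python sketch that finds the lowest unmapped gap of a given size from `/proc/self/maps`-style text. It is not a proposed patch: `parse_maps`, `find_gap` and `SAMPLE_MAPS` are hypothetical names, the sample mappings are made up for illustration, the default `max_addr` assumes a 47-bit user address space, and only the contiguous case (one block of size 0xffff1000) is covered, not the sparse layout.

```python
# Made-up sample in /proc/<pid>/maps format, for illustration only.
SAMPLE_MAPS = """\
60000000-602f5000 r-xp 00000000 08:01 123 /usr/bin/qemu-arm-static
604f5000-60530000 rw-p 002f5000 08:01 123 /usr/bin/qemu-arm-static
7ffff7ff0000-7ffff7ffd000 rw-p 00000000 00:00 0
"""


def parse_maps(text):
    """Parse maps-format text into a sorted list of (start, end) tuples."""
    mappings = []
    for line in text.splitlines():
        if not line.strip():
            continue
        addr_range = line.split()[0]              # e.g. "60000000-602f5000"
        start_hex, end_hex = addr_range.split("-")
        mappings.append((int(start_hex, 16), int(end_hex, 16)))
    return sorted(mappings)


def find_gap(mappings, size, min_addr=0x1000, max_addr=1 << 47):
    """Return the lowest address >= min_addr with `size` unmapped bytes, or None."""
    candidate = min_addr
    for start, end in mappings:
        if start - candidate >= size and candidate + size <= max_addr:
            return candidate
        candidate = max(candidate, end)           # skip past this mapping
    if candidate + size <= max_addr:
        return candidate
    return None


print(hex(find_gap(parse_maps(SAMPLE_MAPS), 0xffff1000)))  # 0x60530000
```

On a Linux host the same functions could be fed the contents of `/proc/self/maps`. A gap found this way is only a hint, since the address space can change before the subsequent `mmap`, so the result of that call would still need to be checked.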
To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1740219/+subscriptions