From: Gleb Natapov
Subject: Re: [Qemu-devel] Anyone seeing huge slowdown launching qemu with Linux 2.6.35?
Date: Wed, 4 Aug 2010 18:48:30 +0300

On Wed, Aug 04, 2010 at 05:31:12PM +0200, Alexander Graf wrote:
> 
> On 04.08.2010, at 17:25, Gleb Natapov wrote:
> 
> > On Wed, Aug 04, 2010 at 09:57:17AM -0500, Anthony Liguori wrote:
> >> On 08/04/2010 09:51 AM, David S. Ahern wrote:
> >>> 
> >>> On 08/03/10 12:43, Avi Kivity wrote:
> >>>> libguestfs does not depend on an x86 architectural feature.
> >>>> qemu-system-x86_64 emulates a PC, and PCs don't have -kernel.  We should
> >>>> discourage people from depending on this interface for production use.
> >>> That is a feature of qemu - and an important one to me as well. Why
> >>> should it be discouraged? You end up at the same place -- a running
> >>> kernel and in-ram filesystem; why require going through a bootloader
> >>> just because the hardware case needs it?
> >> 
> >> It's smoke and mirrors.  We're still providing a boot loader; it's
> >> just a tiny one that we've written solely for this purpose.
> >> 
> >> And it works fine for production use.  The question is whether we
> >> ought to be aggressively optimizing it for large initrd sizes.  To
> >> be honest, after a lot of discussion of possibilities, I've come to
> >> the conclusion that it's just not worth it.
> >> 
> >> There are better ways, like using string I/O and optimizing the PIO
> >> path in the kernel.  That should cut the 1s slowdown with a
> >> 100MB initrd by a bit.  But honestly, shaving a couple hundred ms
> >> further off the initrd load is just not worth it under the current
> >> model.
> >> 
> > The slowdown is not 1s any more. String PIO emulation had many bugs
> > that were fixed in 2.6.35. I measured how long it takes to load 100M
> > via the fw_cfg interface on an older kernel and on 2.6.35. On older
> > kernels on my machine it took ~2-3 seconds; on 2.6.35 it took 26s.
> > Some optimizations that were already committed bring it down to 20s.
> > I have a code prototype that brings it to 11s. I don't see how we can
> > get below that, and surely not back to ~2-3 sec.
> 
> What exactly is the reason for the slowdown? It can't be only boundary and 
> permission checks, right?
> 
> 
The big part of the slowdown right now is that the write into memory is
done per byte: for each byte we call kvm_write_guest() and
kvm_mmu_pte_write(). The second call is needed in case the memory the
instruction is writing to is shadowed; previously we didn't check for
that at all. This can be mitigated by introducing a write cache that
does combined writes into memory, and by unshadowing the page if there
is more than one write into it. This optimization saves ~10 secs.
Currently string emulation also re-enters the guest from time to time
to check whether event injection is needed, and reads from userspace
are done in 1K chunks rather than the earlier 4K; but when I made the
reads 4K and disabled guest reentry, I didn't see any speed improvement
worth talking about.
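
For context, the transfer being timed here is the guest reading the
kernel/initrd blob out of QEMU's fw_cfg device, and the "string I/O"
Anthony mentions is a REP INSB loop against its data port. A minimal
guest-side sketch, assuming the documented x86 fw_cfg I/O ports
(selector 0x510, data 0x511); the helper names are illustrative, not
QEMU's actual option-ROM code:

#include <stdint.h>

#define FW_CFG_PORT_SEL   0x510  /* 16-bit item selector             */
#define FW_CFG_PORT_DATA  0x511  /* 8-bit data port, read repeatedly */

/* Select a fw_cfg item by key, e.g. 0x12 (FW_CFG_INITRD_DATA). */
static inline void fw_cfg_select(uint16_t key)
{
    __asm__ __volatile__("outw %0, %1"
                         : : "a"(key), "Nd"((uint16_t)FW_CFG_PORT_SEL));
}

/* Pull 'len' bytes of the selected item into 'buf' with one REP INSB.
 * A single string instruction replaces 'len' separate INB exits; the
 * cost discussed in this thread is how KVM emulates that one
 * instruction internally. */
static inline void fw_cfg_read(void *buf, uint32_t len)
{
    __asm__ __volatile__("rep insb"
                         : "+D"(buf), "+c"(len)
                         : "d"((uint16_t)FW_CFG_PORT_DATA)
                         : "memory");
}

Loading a 100MB initrd is then just fw_cfg_select() followed by one
fw_cfg_read() of ~100M bytes, which is why the per-byte emulation cost
described above dominates.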
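
To make the mitigation concrete, here is a minimal sketch of the kind
of write-combining cache described above, assuming the emulator funnels
every byte store through one helper. The names (emu_write_cache,
wc_flush, wc_write_byte) are hypothetical, not the actual prototype;
only kvm_write_guest() is a real KVM interface here:

#include <linux/kvm_host.h>

/* Hypothetical per-vcpu write cache for emulated byte stores. */
struct emu_write_cache {
    gpa_t gpa;          /* guest-physical start of the buffered run */
    unsigned int len;   /* bytes buffered so far                    */
    u8 buf[64];         /* combine up to 64 contiguous bytes        */
};

/* Flush the buffered run with a single kvm_write_guest() call
 * instead of one call (plus shadow-page bookkeeping) per byte. */
static void wc_flush(struct kvm_vcpu *vcpu, struct emu_write_cache *wc)
{
    if (!wc->len)
        return;
    kvm_write_guest(vcpu->kvm, wc->gpa, wc->buf, wc->len);
    wc->len = 0;
}

/* Called for every emulated byte store: append while the stores stay
 * contiguous, otherwise flush and start a new run at 'gpa'. */
static void wc_write_byte(struct kvm_vcpu *vcpu,
                          struct emu_write_cache *wc, gpa_t gpa, u8 val)
{
    if (wc->len &&
        (gpa != wc->gpa + wc->len || wc->len == sizeof(wc->buf)))
        wc_flush(vcpu, wc);
    if (!wc->len)
        wc->gpa = gpa;
    wc->buf[wc->len++] = val;
}

The unshadowing half of the idea would pair with this: keep a small
per-page write counter and, once a shadowed page has taken more than
one emulated write, unprotect it (KVM has kvm_mmu_unprotect_page() for
roughly this purpose) so later flushes skip the kvm_mmu_pte_write()
work entirely.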

--
                        Gleb.


