
Re: [Qemu-devel] [PATCH] i386: turn off l3-cache property by default


From: Roman Kagan
Subject: Re: [Qemu-devel] [PATCH] i386: turn off l3-cache property by default
Date: Thu, 30 Nov 2017 09:28:29 +0300
User-agent: Mutt/1.9.1 (2017-09-22)

On Wed, Nov 29, 2017 at 06:15:05PM +0100, Paolo Bonzini wrote:
> On 29/11/2017 14:35, Roman Kagan wrote:
> >>
> >>> IMO, the long term solution is to make Linux guests not misbehave
> >>> when we stop lying about the L3 cache.  Maybe we could provide a
> >>> "IPIs are expensive, please avoid them" hint in the KVM CPUID
> >>> leaf?
> > We already have it, it's the hypervisor bit ;)  Seriously, I'm unaware
> > of hypervisors where IPIs aren't expensive.
> > 
> 
> In theory, AMD's AVIC should optimize IPIs to running vCPUs.  Amazon's
> recently posted patches to disable HLT and MWAIT exits might tilt the
> balance in favor of IPIs even for Intel APICv (where sending the IPI is
> expensive, but receiving it isn't).
> 
> Being able to tie this to Amazon's other proposal, the "DEDICATED" CPUID
> bit, would be nice.  My plan was to disable all three of MWAIT/HLT/PAUSE
> when setting the dedicated bit.
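
For reference, the hypervisor bit I mentioned is CPUID leaf 1, ECX bit 31;
checking it from the guest is a one-liner (illustrative sketch, not code
from any of the patches):

/* Illustrative only: detect the hypervisor bit (CPUID.1:ECX bit 31),
 * which KVM, like hypervisors generally, sets for its guests. */
#include <stdbool.h>
#include <cpuid.h>              /* GCC/clang __get_cpuid() helper */

static bool running_under_hypervisor(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;
    return ecx & (1u << 31);
}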

Yes, the IPI cost can hopefully be mitigated in the case of dedicated and
busy vCPUs.
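
For concreteness, here is a minimal sketch of what the VMM side of that
plan could look like, assuming a KVM_CAP_X86_DISABLE_EXITS-style VM
capability (the names follow what is in mainline KVM; the patches under
discussion may differ in detail):

/* Minimal sketch (not the actual patches): ask KVM to stop intercepting
 * MWAIT/HLT/PAUSE for a VM whose vCPUs are dedicated to physical cores.
 * Must be done on the VM fd before any vCPU is created. */
#include <linux/kvm.h>
#include <sys/ioctl.h>

static int disable_idle_exits(int vm_fd)
{
    struct kvm_enable_cap cap = {
        .cap = KVM_CAP_X86_DISABLE_EXITS,
        .args[0] = KVM_X86_DISABLE_EXITS_MWAIT |
                   KVM_X86_DISABLE_EXITS_HLT |
                   KVM_X86_DISABLE_EXITS_PAUSE,
    };

    /* With these exits gone, an idle guest burns its host core, which is
     * only acceptable when the vCPU owns that core outright. */
    return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}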

However, in the max-density scenario this doesn't help.

Obviously, in the pipe benchmark, scheduling the two ends of the pipe on
different cores is detrimental to performance even on a physical machine;
however, IIUC this was a conscious decision by the scheduler folks,
because it gives acceptable latency on mostly-idle systems and decent
performance under heavier load.
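
For reference, a minimal sketch of the kind of pipe ping-pong benchmark
in question (illustrative, not the exact benchmark we ran): parent and
child bounce one byte back and forth, so the round-trip time is dominated
by where the scheduler wakes each side up.

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define ROUNDS 100000

int main(void)
{
    int ptc[2], ctp[2];         /* parent->child and child->parent pipes */
    char b = 0;
    struct timespec t0, t1;

    if (pipe(ptc) < 0 || pipe(ctp) < 0)
        return EXIT_FAILURE;

    if (fork() == 0) {          /* child: echo every byte straight back */
        for (int i = 0; i < ROUNDS; i++) {
            if (read(ptc[0], &b, 1) != 1 || write(ctp[1], &b, 1) != 1)
                return EXIT_FAILURE;
        }
        return EXIT_SUCCESS;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++) {  /* parent: timed ping-pong loop */
        if (write(ptc[1], &b, 1) != 1 || read(ctp[0], &b, 1) != 1)
            return EXIT_FAILURE;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    wait(NULL);

    printf("%.0f ns/round-trip\n",
           ((t1.tv_sec - t0.tv_sec) * 1e9 +
            (t1.tv_nsec - t0.tv_nsec)) / ROUNDS);
    return EXIT_SUCCESS;
}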

We wouldn't care about these pipe benchmark numbers per se, because the
latencies are still good for practical purposes.  However, in the case of
virtual machines, this extra overhead of remote scheduling in the guest
results in a slight (circa 15% in our Drupal-based test) increase in host
CPU consumption by the vCPU threads.  That, in turn, means the host CPU
overcommit threshold is reached with 15% fewer VMs (and, once overcommit
is reached, the Drupal response latency goes through the roof, so this is
effectively the cut-off for density).
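
Back-of-the-envelope, assuming host CPU is the binding resource: if each
vCPU thread burns ~1.15x the host CPU it used to, a host that fit N VMs
at the overcommit threshold now fits about N / 1.15 ~= 0.87 N, i.e. on
the order of 13-15% fewer VMs before the latency cliff.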

Roman.


