
Re: [Qemu-devel] [RFC 0/6] enable numa configuration before machine_init() from HMP/QMP


From: Igor Mammedov
Subject: Re: [Qemu-devel] [RFC 0/6] enable numa configuration before machine_init() from HMP/QMP
Date: Mon, 23 Oct 2017 10:45:41 +0200

On Fri, 20 Oct 2017 17:53:09 -0200
Eduardo Habkost <address@hidden> wrote:

> On Fri, Oct 20, 2017 at 12:21:30PM +1100, David Gibson wrote:
> > On Thu, Oct 19, 2017 at 02:23:04PM +0200, Paolo Bonzini wrote:  
> > > On 19/10/2017 13:49, David Gibson wrote:  
> > > > Note that describing socket/core/thread tuples as arch independent (or
> > > > even machine independent) is.. debatable.  I mean it's flexible enough
> > > > that most platforms can be fit to that scheme without too much
> > > > straining.  But, there's no arch independent way of defining what each
> > > > level means in terms of its properties.
> > > > 
> > > > So, for example, on spapr - being paravirt - there's no real
> > > > distinction between cores and sockets, how you divide them up is
> > > > completely arbitrary.  
> > > 
> > > Same on x86, actually.
> > > 
> > > It's _common_ that cores on the same socket share L3 cache and that a
> > > socket spans an integer number of NUMA nodes, but it doesn't have to be
> > > that way.
> > > 
> > > QEMU currently enforces the former (if it tells the guest at all that
> > > there is an L3 cache), but not the latter.  
> > 
> > Ok.  Correct me if I'm wrong, but doesn't ACPI describe the NUMA
> > architecture in terms of this thread/core/socket hierarchy?  That's
> > not true for PAPR, where the NUMA topology is described in an
> > independent set of (potentially arbitrarily nested) nodes.  
> 
> On PC, ACPI NUMA information only refers to CPU APIC IDs, which
> identify individual CPU threads; it doesn't care about CPU
> socket/core/thread topology.  If I'm not mistaken, the
> socket/core/thread topology is not represented in ACPI at all.
ACPI does node mapping per logical CPU (thread) in the SRAT table,
so in principle we are able to describe insane configurations.
That, however, doesn't mean we should go beyond what real hardware
does and confuse a guest that may have certain expectations.
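
As an illustrative sketch of that per-thread granularity (assuming
the -numa cpu syntax introduced in QEMU 2.10; the geometry here is
made up and the command line is untested), even a configuration that
splits a core's threads across nodes is expressible:

  # hypothetical layout: the two threads of each core land on
  # different nodes -- describable in SRAT, unlike any real hw
  qemu-system-x86_64 -smp 4,sockets=1,cores=2,threads=2 \
      -numa node,nodeid=0 -numa node,nodeid=1 \
      -numa cpu,node-id=0,socket-id=0,core-id=0,thread-id=0 \
      -numa cpu,node-id=1,socket-id=0,core-id=0,thread-id=1 \
      -numa cpu,node-id=0,socket-id=0,core-id=1,thread-id=0 \
      -numa cpu,node-id=1,socket-id=0,core-id=1,thread-id=1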

Currently, the expectation on x86 is that CPUs are mapped to NUMA
nodes either by whole cores or by whole sockets (AMD and Intel CPUs,
respectively). This might change in the future.
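
For example, a conforming mapping that keeps whole sockets on a node
might look like this (an illustrative command line; with this -smp
geometry, cpu indexes 0-3 are exactly socket 0):

  qemu-system-x86_64 -smp 8,sockets=2,cores=2,threads=2 \
      -numa node,nodeid=0,cpus=0-3 \
      -numa node,nodeid=1,cpus=4-7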


> Some guest OSes, however, may get very confused if they see an
> unexpected NUMA/CPU topology.  IIRC, it was possible to make old
> Linux kernel versions panic by generating a weird topology.

There were bugs, fixed on the QEMU or guest kernel side, when
unexpected mappings were present. While we can 'fix' guest
expectations in the Linux kernel, that might not be possible for
other OSes; one more reason we shouldn't allow blind assignment by
management.



