Re: [Qemu-devel] [RFC 0/6] enable numa configuration before machine_init


From: Eduardo Habkost
Subject: Re: [Qemu-devel] [RFC 0/6] enable numa configuration before machine_init() from HMP/QMP
Date: Wed, 18 Oct 2017 18:22:40 -0200
User-agent: Mutt/1.9.0 (2017-09-02)

On Wed, Oct 18, 2017 at 04:30:10PM +0100, Daniel P. Berrange wrote:
> On Tue, Oct 17, 2017 at 06:06:35PM +0200, Igor Mammedov wrote:
> > On Tue, 17 Oct 2017 16:07:59 +0100
> > "Daniel P. Berrange" <address@hidden> wrote:
> > 
> > > On Tue, Oct 17, 2017 at 09:27:02AM +0200, Igor Mammedov wrote:
> > > > On Mon, 16 Oct 2017 17:36:36 +0100
> > > > "Daniel P. Berrange" <address@hidden> wrote:
> > > >   
> > > > > On Mon, Oct 16, 2017 at 06:22:50PM +0200, Igor Mammedov wrote:  
> > > > > > > This series allows configuring NUMA mapping at runtime using the
> > > > > > > QMP/HMP interface. To make that possible, it introduces a new
> > > > > > > '-paused' CLI option, which pauses QEMU before machine_init() is
> > > > > > > run, and adds new set-numa-node HMP/QMP commands which, in
> > > > > > > conjunction with info hotpluggable-cpus/query-hotpluggable-cpus,
> > > > > > > allow configuring the NUMA mapping for CPUs.
> > > > > 
> > > > > > What's the problem we're seeking to solve here compared to what we
> > > > > > currently do for NUMA configuration?
> > > > From RHBZ1382425
> > > > "
> > > > The current -numa CLI interface is quite limited in how it allows
> > > > mapping CPUs to NUMA nodes, as it requires providing cpu_index
> > > > values, which are non-obvious and depend on machine/arch. As a
> > > > result, libvirt has to assume/re-implement the cpu_index allocation
> > > > logic to provide valid values for the -numa cpus=... QEMU CLI option.
> > > 
> > > In broad terms, this problem applies to every device / object libvirt
> > > asks QEMU to create. For everything else libvirt is able to assign a
> > > "id" string, which it can then use to identify the thing later. The
> > > CPU stuff is different because libvirt isn't able to provide 'id'
> > > strings for each CPU - QEMU generates a pseudo-id internally which
> > > libvirt has to infer. The latter is the same problem we had with
> > > devices before '-device' was introduced allowing 'id' naming.
> > > 
> > > IMHO we should take the same approach with CPUs and start modelling 
> > > the individual CPUs as something we can explicitly create with -object
> > > or -device. That way libvirt can assign names and does not have to 
> > > care about CPU index values, and it all works just the same way as
> > > any other device / object we create.
> > > 
> > > ie instead of:
> > > 
> > >   -smp 8,sockets=4,cores=2,threads=1
> > >   -numa node,nodeid=0,cpus=0-3
> > >   -numa node,nodeid=1,cpus=4-7
> > > 
> > > we could do:
> > > 
> > >   -object numa-node,id=numa0
> > >   -object numa-node,id=numa1
> > >   -object cpu,id=cpu0,node=numa0,socket=0,core=0,thread=0
> > >   -object cpu,id=cpu1,node=numa0,socket=0,core=1,thread=0
> > >   -object cpu,id=cpu2,node=numa0,socket=1,core=0,thread=0
> > >   -object cpu,id=cpu3,node=numa0,socket=1,core=1,thread=0
> > >   -object cpu,id=cpu4,node=numa1,socket=2,core=0,thread=0
> > >   -object cpu,id=cpu5,node=numa1,socket=2,core=1,thread=0
> > >   -object cpu,id=cpu6,node=numa1,socket=3,core=0,thread=0
> > >   -object cpu,id=cpu7,node=numa1,socket=3,core=1,thread=0
> > the follow up question would be where do "socket=3,core=1,thread=0"
> > come from, currently these options are a function of
> > (-M foo -smp ...) and can be queried via query-hotpluggable-cpus at
> > runtime after qemu parses -M and -smp options.
> 
> NB, I realize my example was open to mis-interpretation. The values I'm
> illustrating here for socket=3,core=1,thread=0 are *not* ID values, they
> are a plain enumeration of values. ie this is saying the 4th socket, the
> 2nd core and the 1st thread.  Internally QEMU might have the 2nd core
> with a core-id of 8, or 7038 or whatever architecture specific numbering
> scheme makes sense, but that's not what the mgmt app gives at the CLI
> level

I believe we have been trying to avoid index numbers to identify
entities as a reaction to the bad experience we had with the
cpu_index/apic_id mess in the past.

An interface using arch-independent socket/core/thread indexes
(not arch-dependent IDs) like you propose in the paragraph above
could be a solution, as long as it is documented very clearly
(and we include automated testing for those constraints).  But
note that this is _not_ how the socket/core/thread IDs on the
"-device *-cpu" and -numa command-line options work today.
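
To make the distinction concrete, here is a minimal Python sketch of what
"plain enumeration" indexes could look like. It assumes the hypothetical
"-object cpu" / "-object numa-node" syntax from the example quoted above
(this is a proposal, not an existing QEMU option), and the helper names and
the sockets_per_node parameter are made up purely for illustration:

```python
from itertools import product


def enumerate_cpus(sockets, cores, threads):
    """Return (socket, core, thread) index triples in plain enumeration order.

    These are arch-independent positions (0-based), not the arch-dependent
    IDs that QEMU may use internally.
    """
    return list(product(range(sockets), range(cores), range(threads)))


def cpu_object_args(sockets, cores, threads, sockets_per_node):
    """Build hypothetical '-object cpu,...' arguments, one per CPU.

    NUMA placement is an assumption for this sketch: consecutive sockets
    are grouped into nodes, sockets_per_node sockets at a time.
    """
    args = []
    for n, (s, c, t) in enumerate(enumerate_cpus(sockets, cores, threads)):
        node = s // sockets_per_node
        args.append(
            f"-object cpu,id=cpu{n},node=numa{node},"
            f"socket={s},core={c},thread={t}"
        )
    return args


if __name__ == "__main__":
    # Equivalent of "-smp 8,sockets=4,cores=2,threads=1" with two NUMA nodes.
    for arg in cpu_object_args(4, 2, 1, sockets_per_node=2):
        print(arg)
```

With sockets=4, cores=2, threads=1 and two sockets per node, this reproduces
the eight "-object cpu" lines from the quoted example. The values are
positions in the topology, so the management app never needs to know the
IDs QEMU assigns internally.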

Also, this might solve the problem for CPU socket/core/thread
identification, but might not be enough for the messy device
address assignment rules that libvirt needs to duplicate in
src/qemu/qemu_domain_address.c today.

-- 
Eduardo


