qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,clu


From: Salil Mehta
Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU {socket,cluster,core,thread}-id property
Date: Tue, 10 Sep 2024 11:01:05 +0000

HI Zhao,

>  From: Zhao Liu <zhao1.liu@intel.com>
>  Sent: Monday, September 9, 2024 4:28 PM
>  To: Salil Mehta <salil.mehta@huawei.com>
>  
>  On Wed, Sep 04, 2024 at 05:37:21PM +0000, Salil Mehta wrote:
>  > Date: Wed, 4 Sep 2024 17:37:21 +0000
>  > From: Salil Mehta <salil.mehta@huawei.com>
>  > Subject: RE: [PATCH RFC V3 01/29] arm/virt,target/arm: Add new ARMCPU
>  > {socket,cluster,core,thread}-id property
>  >
>  > Hi Zhao,
>  >
>  > >  From: zhao1.liu@intel.com <zhao1.liu@intel.com>
>  > >  Sent: Wednesday, September 4, 2024 3:43 PM
>  > >  To: Salil Mehta <salil.mehta@huawei.com>
>  > >
>  > >  Hi Salil,
>  > >
>  > >  On Mon, Aug 19, 2024 at 11:53:52AM +0000, Salil Mehta wrote:
>  > >  > Date: Mon, 19 Aug 2024 11:53:52 +0000  > From: Salil Mehta
>  > > <salil.mehta@huawei.com>  > Subject: RE: [PATCH RFC V3 01/29]
>  > > arm/virt,target/arm: Add new ARMCPU  >
>  > > {socket,cluster,core,thread}-id property
>  > >
>  > >  [snip]
>  > >
>  > >  > >  > NULL); @@ -2708,6 +2716,7 @@ static const CPUArchIdList  > >
>  > > *virt_possible_cpu_arch_ids(MachineState *ms)
>  > >  > >  >   {
>  > >  > >  >       int n;
>  > >  > >  >       unsigned int max_cpus = ms->smp.max_cpus;
>  > >  > >  > +    unsigned int smp_threads = ms->smp.threads;
>  > >  > >  >       VirtMachineState *vms = VIRT_MACHINE(ms);
>  > >  > >  >       MachineClass *mc = MACHINE_GET_CLASS(vms);
>  > >  > >  >
>  > >  > >  > @@ -2721,6 +2730,7 @@ static const CPUArchIdList  > >
>  > > *virt_possible_cpu_arch_ids(MachineState *ms)
>  > >  > >  >       ms->possible_cpus->len = max_cpus;
>  > >  > >  >       for (n = 0; n < ms->possible_cpus->len; n++) {
>  > >  > >  >           ms->possible_cpus->cpus[n].type = ms->cpu_type;
>  > >  > >  > +        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
>  > >  > >  >           ms->possible_cpus->cpus[n].arch_id =
>  > >  > >  >               virt_cpu_mp_affinity(vms, n);
>  > >  > >  >
>  > >  > >
>  > >  > >  Why @vcpus_count is initialized to @smp_threads? it needs to
>  > > be  > > documented in the commit log.
>  > >  >
>  > >  >
>  > >  > Because every thread internally amounts to a vCPU in QOM and
>  > > which is  > in 1:1 relationship with KVM vCPU. AFAIK, QOM does not
>  > > strictly  > follows any architecture. Once you start to get into
>  > > details of  > threads there are many aspects of shared resources one
>  > > will have to  > consider and these can vary across different
>  > > implementations of  architecture.
>  > >
>  > >  For SPAPR CPU, the granularity of >possible_cpus->cpus[] is "core",
>  > > and for  x86, it's "thread" granularity.
>  >
>  >
>  > We have threads per-core at microarchitecture level in ARM as well.
>  > But each thread appears like a vCPU to OS and AFAICS there are no
>  > special attributes attached to it. SMT can be enabled/disabled at
>  > firmware and should get reflected in the configuration accordingly
>  > i.e. value of *threads-per-core* changes between 1 and 'N'.  This
>  > means 'vcpus_count' has to reflect the correct configuration. But I
>  > think threads lack proper representation in Qemu QOM.
>  
>  In topology related part, SMT (of x86) usually represents the logical
>  processor level. And thread has the same meaning.


Agreed. It is same in ARM as well. The difference could be in how hardware
threads are implemented at microarchitecture level.  Nevertheless, we do
have such virtual configurations, and the meaning of *threads* as-in QOM
topology (socket,cluster,core,thread) is virtualized similar to the hardware
threads in host. And One should be able to configure threads support in the
virtual environment,  regardless whether or not underlying hardware
supports threads. That's my take.

Other aspect is how we then expose these threads to the guest. The guest
kernel (just like host kernel) should gather topology information using
ACPI PPTT Table (This is ARM specific?). Later is populated by the Qemu
(just like by firmware for the host kernel) by making use of the virtual
topology. ARM guest kernel, in absence of PPTT support can detect
presence of hardware threads by reading MT Bit within the MPIDR_EL1
register.

Every property in 'ms->possible_cpus->cpus[n].props should be exactly
same as finalized and part of the MachineState::CpuTopology.
Hence, number of threads-per-core 'vcpus_count'  should not be treated
differently. 

But there is  a catch! (I explained that earlier)


 To change these
>  meanings is also possible, but I think it should be based on the actual use
>  case. we can consider the complexity of the implementation when there is a
>  need.


Agreed. There is no ambiguity in the meaning of hardware threads or the 
virtualized MachineState::CpuTopology. Properties of all the possible vCPUs
should exactly be same as part of MachineState. This includes the number
of threads-per-core.

You mentioned 'vcpus_count' should be 1 but does that mean user can never
specify threads > 1 in virtual configuration for x86?


>  
>  > In Qemu, each vCPU reflects an execution context (which gets uniquely
>  > mapped to KVM vCPU). AFAICS, we only have *CPUState* (Struct
>  ArchCPU)
>  > as a placeholder for this execution context and there is no
>  > *ThreadState* (derived out of Struct CPUState). Hence, we've  to map
>  > all the threads as QOM vCPUs. This means the array of present or
>  > possible CPUs represented by 'struct CPUArchIdList' contains all
>  > execution contexts which actually might be vCPU or a thread. Hence,
>  > usage of
>  > *vcpus_count* seems quite superficial to me frankly.
>  >
>  > Also, AFAICS, KVM does not have the concept of the threads and only
>  > has KVM vCPUs, but you are still allowed to specify the topology with
>  > sockets, dies, clusters, cores, threads in most architectures.
>  
>  There are some uses for topology, such as it affects scheduling behavior,
>  and it affects feature emulation, etc.


True. And we should be flexible at the VMM level. We should let Admin of
the VMM control how he creates the virtual topology which best fits
on the underlying hardware features of the host. This includes, NUMA,
sub-NUMA, cores, hardware, threads, cache topology etc. 


>  
>  > >  And smp.threads means how many threads in one core, so for x86, the
>  > > vcpus_count of a "thread" is 1, and for spapr, the vcpus_count of a
>  "core" equals to smp.threads.
>  >
>  >
>  > Sure, but does the KVM specifies this?
>  
>  At least as you said, KVM (for x86) doesn't consider higher-level topologies
>  at the moment, but that's not to say that it won't in the future, as certain
>  registers do have topology dependencies.


sure. so you mean for x86 virtual topology, smp.threads = 1 always?


>  
>  > and how does these threads map to the QOM vCPU objects or execution
>  context?
>  
>  Each CPU object will create a (software) thread, you can refer the function
>  "kvm_start_vcpu_thread(CPUState *cpu)", which will be called when CPU
>  object realizes.


Yes, sure, and each such QOM vCPU thread and 'struct CPUState' is mapped to
the lowest granularity of execution specified within the QOM virtual topology.
It could be a 'thread' or a 'core'. And all these will run as a KVM vCPU 
scheduled
on some hardware core and maybe hardware thread (if enabled).

So there is no difference across architectures regarding this part. I was trying
to point that in QOM, even the threads will have their own 'struct CPUState'
and each one will be part of the "CPUArchIdList *possible_cpus" maintained
at the MachineState. At this level we loose the relationship information of
the cores and their corresponding threads (given by 'vcpus_count').


>  > AFAICS there is nothing but 'CPUState'
>  > which will be made part of the  possible vCPU list 'struct CPUArchIdList'.
>  
>  As I said, an example is spapr ("spapr_possible_cpu_arch_ids()"), which
>  maps possible_cpu to core object. However, this is a very specific example,
>  and like Igor's slides said, I understand it's an architectural requirement.


I'm sure there must have been some. I'm trying to understand it. Can you
share the slides?


>  
>  > >
>  > >  IIUC, your granularity is still "thread", so that this filed should be 
> 1.
>  >
>  >
>  > Well, again we need more discussion on this. I've stated my concerns
>  > against doing this. User should be allowed to create virtual topology
>  > which will include 'threads' as one of the parameter.
>  >
>  
>  I don't seem to understand...There is a "threads" parameter in -smp, does
>  this not satisfy your use case?

It certainly does. But this is what should get reflected in the 'vcpus_count' 
as well? 


Best regards
Salil

>  
>  Regards,
>  Zhao
>  




reply via email to

[Prev in Thread] Current Thread [Next in Thread]