(Also cc "machine core" maintainers.)
Hi Xiaoyao,
On Mon, Jan 15, 2024 at 12:18:17PM +0800, Xiaoyao Li wrote:
Date: Mon, 15 Jan 2024 12:18:17 +0800
From: Xiaoyao Li <xiaoyao.li@intel.com>
Subject: Re: [PATCH v7 10/16] i386/cpu: Introduce cluster-id to X86CPU
On 1/15/2024 11:27 AM, Zhao Liu wrote:
On Sun, Jan 14, 2024 at 09:49:18PM +0800, Xiaoyao Li wrote:
On 1/8/2024 4:27 PM, Zhao Liu wrote:
From: Zhuocheng Ding <zhuocheng.ding@intel.com>
Introduce cluster-id rather than module-id, to be consistent with
CpuInstanceProperties.cluster-id; this avoids confusion over
parameter names when hotplugging.
I don't think reusing 'cluster' from arm for x86's 'module' is a good idea.
It introduces confusion around the code.
There is a precedent: generic "socket" vs. i386 "package".
It's not the same thing. "socket" vs "package" is just software people and
hardware people choosing different names. It's just a naming difference.
No, it's a similar issue. Same physical device, different name only.
Furthermore, the topology was introduced for resource layout and silicon
fabrication, and similar design ideas and fabrication processes are fairly
consistent across common current arches. Therefore, it is possible to
abstract similar topological hierarchies for different arches.
However, the issue here is reusing a name, while 'cluster' has already been
defined for x86. It does introduce confusion.
There's nothing fundamentally different between the x86 module and the
generic cluster, is there? This is the reason that I don't agree with
introducing "modules" in -smp.
The direct definition of cluster is the level that is above the "core"
and shares the hardware resources including L2. In this sense, arm's
cluster is the same as x86's module.
Then what if Intel implements the tile level in the future? Why is ARM's
'cluster' mapped to 'module', and not to 'tile'?
This depends on the actual need.
Module (for x86) and cluster (in general) are similar, and tile (for x86)
is used for L3 in practice, so I use module rather than tile to map
generic cluster.
And, it should be noted that x86 module is mapped to the generic cluster,
not to ARM's. It's just that currently only ARM is using the clusters
option in -smp.
I believe QEMU provides the abstract and unified topology hierarchies in
-smp, not the arch-specific hierarchies.
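To make the mapping argued above concrete, here is a minimal sketch. It is not QEMU code: the table and the helper generic_level() are hypothetical names used only for illustration, and the level list is the ordering quoted from qapi/machine.json in this thread.

```python
# Illustration only; none of these names are QEMU identifiers.

# Generic -smp levels, coarsest to finest, as quoted from qapi/machine.json.
GENERIC_LEVELS = ["drawers", "books", "sockets", "dies",
                  "clusters", "cores", "threads"]

# x86 term -> generic level, following the argument in this mail:
# "package" is the generic socket, and "module" is the generic cluster.
X86_TO_GENERIC = {
    "package": "sockets",
    "die": "dies",
    "module": "clusters",
    "core": "cores",
    "thread": "threads",
}


def generic_level(x86_name):
    """Return the generic -smp level that an x86 topology term maps to."""
    level = X86_TO_GENERIC[x86_name]
    # Every arch-specific term must land on an existing generic level.
    assert level in GENERIC_LEVELS
    return level
```

With this view, x86 uses a subset of the generic levels (no drawers or books), and no new generic level is needed for "module".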
reusing 'cluster' for 'module' is just a bad idea.
Though different arches have different naming styles, QEMU's generic
code still needs a uniform topology hierarchy.
Generic code can provide as many topology levels as it can. Each arch can
choose to use the ones it supports.
e.g.,
in qapi/machine.json, it says,
# The ordering from highest/coarsest to lowest/finest is:
# @drawers, @books, @sockets, @dies, @clusters, @cores, @threads.
This ordering is well-defined...
#
# Different architectures support different subsets of topology
# containers.
#
# For example, s390x does not have clusters and dies, and the socket
# is the parent container of cores.
we can update it to
# The ordering from highest/coarsest to lowest/finest is:
# @drawers, @books, @sockets, @dies, @clusters, @module, @cores,
# @threads.
...but here it's impossible to figure out why cluster is above module,
and I can't even come up with a difference between cluster and module.
#
# Different architectures support different subsets of topology
# containers.
#
# For example, s390x does not have clusters and dies, and the socket
# is the parent container of cores.
#
# For example, x86 does not have drawers and books, and does not support
# cluster.
Even if x86's cluster is supported someday in the future, we can remove the
ordering requirement from the description above.
x86's cluster is above the package.
To reserve this name for x86, we can't have the well-defined topology
ordering.
But topology ordering is necessary in generic code, and many
calculations depend on the topology ordering.
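As a rough illustration of why the ordering matters, the sketch below derives per-level ID bit offsets and the total CPU count from the ordered list of levels. This is a simplified model under my own assumptions, similar in spirit to how ID field offsets are stacked from the finest level upward, and not QEMU's actual implementation; ORDER, bits_for(), id_offsets() and max_cpus() are hypothetical names.

```python
from math import prod

# Generic levels, coarsest to finest (ordering quoted from qapi/machine.json).
ORDER = ["drawers", "books", "sockets", "dies",
         "clusters", "cores", "threads"]


def bits_for(count):
    """Bits needed to encode `count` distinct IDs (0 bits when count == 1)."""
    return (count - 1).bit_length()


def id_offsets(counts):
    """Bit offset of each level's ID field, accumulated from the finest level.

    Levels missing from `counts` are treated as having a single instance.
    """
    offsets = {}
    shift = 0
    for level in reversed(ORDER):
        offsets[level] = shift
        shift += bits_for(counts.get(level, 1))
    return offsets


def max_cpus(counts):
    """Total logical CPUs: the product of the counts at every level."""
    return prod(counts.get(level, 1) for level in ORDER)
```

For example, with counts = {"sockets": 2, "clusters": 2, "cores": 4, "threads": 2}, id_offsets() places cores at bit 1, clusters at bit 3 and sockets at bit 4. Inserting a level whose position relative to "clusters" is undefined would leave these offsets undefined as well.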
I don't think it's a problem to add 'module' to SMPConfiguration.
Adding an option is simple, but it is not conducive to the
topology maintenance of QEMU. Reusing the existing generic structure
should be the first consideration, except when the new level is
fundamentally different.
Thanks,
Zhao