qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn


From: Xiaoyao Li
Subject: Re: [PATCH] cpu: Initialize nr_cores and nr_threads in cpu_common_initfn()
Date: Thu, 5 Dec 2024 19:53:24 +0800
User-agent: Mozilla Thunderbird

On 11/25/2024 5:38 PM, Igor Mammedov wrote:
On Fri, 22 Nov 2024 11:03:17 -0500
Xiaoyao Li <xiaoyao.li@intel.com> wrote:

Currently cpu->nr_cores and cpu->nr_threads are initialized in
qemu_init_vcpu(), which is called a bit late in *cpu_realizefn() for
each ARCHes.

x86 arch would like to use nr_cores and nr_threads earlier in its
realizefn(). To serve this purpose, initialize nr_cores and nr_threads
in cpu_common_initfn(), which is earlier than *cpu_realizefn().

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
  hw/core/cpu-common.c | 10 +++++++++-
  system/cpus.c        |  4 ----
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 09c79035949b..6de92ed854bc 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -237,14 +237,22 @@ static void cpu_common_unrealizefn(DeviceState *dev)
  static void cpu_common_initfn(Object *obj)
  {
      CPUState *cpu = CPU(obj);
+    Object *machine = qdev_get_machine();
+    MachineState *ms;
gdb_init_cpu(cpu);
      cpu->cpu_index = UNASSIGNED_CPU_INDEX;
      cpu->cluster_index = UNASSIGNED_CLUSTER_INDEX;
      /* user-mode doesn't have configurable SMP topology */
-    /* the default value is changed by qemu_init_vcpu() for system-mode */
      cpu->nr_cores = 1;
      cpu->nr_threads = 1;
+#ifndef CONFIG_USER_ONLY
+    if (object_dynamic_cast(machine, TYPE_MACHINE)) {
+        ms = MACHINE(machine);
+        cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
+        cpu->nr_threads = ms->smp.threads;
+    }
+#endif

Can't say, that I'm fond of adding/moving hack to access MachineState
from CPU context. Granted we did/still do it elsewhere, But I'd rather
prefer getting rid of those remnants that access globals.
It's basically technical debt we are carrying since 2009 (dc6b1c09849).
Moving that around doesn't help with getting rid of arbitrary access to globals.

As Paolo've noted there are other ways to set cores/threads,
albeit at expense of adding more code. And that could be fine
if it's done within expected cpu initialization flow.

Instead of accessing MachineState directly from CPU code (which is
essentially a layer violation), I'd suggest to set cores_nr/threads_nr
from pre_plug handler (which is machine code).
We do similar thing for nr_dies/nr_modules already, and we should do
same for cores/trheads.

Quick hack would be do the same for cores/threads in x86_cpu_pre_plug(),
and make qemu_init_vcpu() conditional to avoid touching other targets/machines.

I'd even ack that, however that's just leaves us with the same
old technical debt. So I'd like to coax a promise to fix it properly
(i.e. get rid of access to machine from CPU code).

(here goes typical ask to rewrite whole QEMU before doing thing you
actually need)

To do that we would need to:
   1. audit usage of cpu->nr_cores/cpu->nr_threads across QEMU, to figure out
      what targets/machines need them
   2. then add pre_plug() handlers to those machines to set them.
   3. In the end get rid of initializing them in cpu_common_initfn().

here is the update:

For cpu->nr_cores, it's only used by x86 ARCH. We can remove it and maintain one for x86 separately.

For cpu->nr_threads, besides x86, it's also used by

1) hw/mips/malta.c

    env->mvp->CP0_MVPConf0 = deposit32(env->mvp->CP0_MVPConf0,
                                       CP0MVPC0_PTC, 8,
                                       smp_cpus * cs->nr_threads - 1);

2) target/mips/tcg/sysemu

    vpe_idx = tc_idx / cs->nr_threads;
    *tc = tc_idx % cs->nr_threads;

3) target/ppc/compat.c

    int n_threads = CPU(cpu)->nr_threads;

There are no existing CPU pre_plug() function for above cases, and I don't know how to add it because I know nothing about MIPS/PPC at all.

If desire is still to go with direction, I need someone else to help MIPS/PPC. Or is it OK that only change the X86 implementation to initialize cpu->nr_threads earlier in x86_cpu_pre_plug() and leave other ARCHes as todo?
        

With that done we can then add a common helper to generalize topo config
based on -smp from pre_plug() handlers to reduce duplication caused by
per machine pre_plug handlers.

Or introduce a generic cpu_pre_plug() handler at Machine level and make
_pre_plug call chain to call it (sort of what we do with nested realize calls);

I'd prefer the 1st option (#2) as it explicitly documents what
targets/machines care about cores/threads at expense of some boiler plate code
duplication, instead of blanket generic fallback like we have now (regardless of
if it's actually needed).

      cpu->cflags_next_tb = -1;
/* allocate storage for thread info, initialise condition variables */
diff --git a/system/cpus.c b/system/cpus.c
index 1c818ff6828c..c1547fbfd39b 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -664,10 +664,6 @@ const AccelOpsClass *cpus_get_accel(void)
void qemu_init_vcpu(CPUState *cpu)
  {
-    MachineState *ms = MACHINE(qdev_get_machine());
-
-    cpu->nr_cores = machine_topo_get_cores_per_socket(ms);
-    cpu->nr_threads =  ms->smp.threads;
      cpu->stopped = true;
      cpu->random_seed = qemu_guest_random_seed_thread_part1();





reply via email to

[Prev in Thread] Current Thread [Next in Thread]