qemu-devel

Re: [Qemu-devel] SeaBIOS, FW_CFG_NUMA, and FW_CFG_MAX_CPUS


From: Eduardo Habkost
Subject: Re: [Qemu-devel] SeaBIOS, FW_CFG_NUMA, and FW_CFG_MAX_CPUS
Date: Mon, 23 Jul 2012 16:37:40 -0300
User-agent: Mutt/1.5.21 (2010-09-15)

On Mon, Jul 23, 2012 at 07:25:01PM +0000, Blue Swirl wrote:
> On Mon, Jul 23, 2012 at 7:09 PM, Eduardo Habkost <address@hidden> wrote:
> > On Mon, Jul 23, 2012 at 06:40:51PM +0000, Blue Swirl wrote:
> >> On Fri, Jul 20, 2012 at 8:00 PM, Eduardo Habkost <address@hidden> wrote:
> >> > Hi,
> >> >
> >> > While working on the CPU index vs APIC ID changes, I stumbled upon
> >> > another not-very-well-defined interface between SeaBIOS and QEMU, and I
> >> > would like to clarify the semantics and constraints of some FW_CFG
> >> > entries.
> >> >
> >> > First, the facts/assumptions:
> >> >
> >> > - There's no concept of "CPU index" or "CPU identifier" that SeaBIOS and
> >> >   QEMU agree upon, except for the APIC ID. All SeaBIOS can really see
> >> >   are the CPU APIC IDs, on boot or on CPU hotplug.
> >> > - The APIC ID is already a perfectly good CPU identifier, that is
> >> >   present on bare metal hardware too.
> >> >   - Adding a new kind of "CPU identifier" in addition to the APIC ID
> >> >     would just make things more complex.
> >> >   - The only problem with APIC IDs is that they may not be contiguous.
> >> >
> >> > That said, I would like to clarify the meaning of:
> >> >
> >> > - FW_CFG_MAX_CPUS
> >> >
> >> > What are the basic semantics and expectations about FW_CFG_MAX_CPUS?
> >>
> >> FYI: This originates from Sparc and PPC, it says how many SMP CPUs
> >> there are in the system. There we don't have (at least now) any CPU
> >> IDs and of course no APIC.
> >
> > Aren't you describing FW_CFG_NB_CPUS? If not, what's the difference
> > between FW_CFG_NB_CPUS and FW_CFG_MAX_CPUS on those architectures?
> 
> Yes, sorry. There's no difference.
> 
> >
> > Until now, the only purpose I see for max_cpus/FW_CFG_MAX_CPUS is to
> > allow CPU hotplug. I don't know if there are other use cases where
> > max_cpus/FW_CFG_MAX_CPUS is useful.
> >
> >
> >>
> >> But I have no idea what x86 should use. As a general rule, what would
> >> happen on a real machine should be emulated, but QEMU can also assist
> >> BIOS (for example to skip some complex HW probes).
> >
> > Right now I am divided between two approaches:
> >
> > - In case FW_CFG_MAX_CPUS' only purpose is to allow CPU hotplug, make it
> >   really mean "upper limit to APIC ID values" in x86;
> > - Otherwise, I am inclined to add a FW_CFG_MAX_APIC_ID entry to x86, so
> >   the firmware can (optionally) choose appropriate sizes for its
> >   internal APIC-ID-based data structures.
> 
> One integer does not tell very much.

It tells a lot: it contains exactly 32 perfectly good bits of
information.  ;-)

Jokes aside: one integer like that would be very helpful to SeaBIOS. By
knowing the maximum APIC ID value it will ever see, it can build the
ACPI tables and support CPU hotplug without having to maintain and
update Processor ID => APIC ID tables inside the ACPI ASL code (which is
very hard to debug).
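
To make the arithmetic concrete, here is a minimal sketch of how such a
limit could be derived for the 2-socket, 3-core, 3-thread example quoted
further down. It assumes each topology level is padded to a power-of-two
bit field, which is the packing that reproduces the APIC IDs listed in
that example. The function names (field_width, apic_id_for,
apic_id_limit) are made up for illustration; this is not actual QEMU or
SeaBIOS code.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed to encode the values 0..count-1 (0 when count <= 1).
 * Assumes small counts, as in any realistic CPU topology. */
unsigned field_width(unsigned count)
{
    unsigned bits = 0;
    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Hypothetical helper: pack (socket, core, thread) into an APIC ID,
 * assuming each level occupies its own power-of-two-sized bit field. */
uint32_t apic_id_for(unsigned socket, unsigned core, unsigned thread,
                     unsigned cores_per_socket, unsigned threads_per_core)
{
    unsigned thread_bits = field_width(threads_per_core);
    unsigned core_bits = field_width(cores_per_socket);

    return ((uint32_t)socket << (core_bits + thread_bits)) |
           ((uint32_t)core << thread_bits) |
           (uint32_t)thread;
}

/* Smallest value strictly greater than every APIC ID in the topology,
 * i.e. the size of a table indexed directly by APIC ID. */
uint32_t apic_id_limit(unsigned sockets, unsigned cores_per_socket,
                       unsigned threads_per_core)
{
    assert(sockets > 0 && cores_per_socket > 0 && threads_per_core > 0);

    return apic_id_for(sockets - 1, cores_per_socket - 1,
                       threads_per_core - 1,
                       cores_per_socket, threads_per_core) + 1;
}
```

Under that assumption, apic_id_limit(2, 3, 3) is 27 (0x1b) even though
only 18 CPUs exist, which is exactly the gap between options (a) and (b)
in the quoted question.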

> 
> >
> >>
> >> > Considering that the APIC IDs may not be contiguous, is it supposed to
> >> > be:
> >> >
> >> > a) the maximum number of CPUs that will ever be online, regardless
> >> >    of their APIC IDs, or
> >> > b) a value so that every CPU has APIC ID < MAX_CPUS.
> >> >
> >> > A practical example: suppose we have a machine with 18 CPUs with the
> >> > following APIC IDs: 0x00, 0x01, 0x02, 0x04, 0x05, 0x06, 0x08, 0x09,
> >> > 0x0a, 0x10, 0x11, 0x12, 0x14, 0x15, 0x16, 0x18, 0x19, 0x1a.
> >> >
> >> > (That's the expected result for a machine with 2 sockets, 3 cores per
> >> > socket, 3 threads per core.)
> >> >
> >> > In that case, should FW_CFG_MAX_CPUS be: a) 18, or b) 27 (0x1b)?
> >> >
> >> > If it should be 18, it will require additional work on SeaBIOS to make:
> >> > - CPU hotplug work
> >> > - SRAT/MADT/SSDT tables be built with Processor ID != APIC ID
> >> > - SRAT/MADT/SSDT tables be kept stable if the system is hibernated and
> >> >   resumed after a CPU is hot-plugged.
> >> >
> >> > (Probably in that case I would suggest introducing a FW_CFG_MAX_APIC_ID
> >> > entry, so that SeaBIOS can still build the ACPI tables more easily).
> >> >
> >> >
> >> > - FW_CFG_NUMA
> >> >
> >> > The first problem with FW_CFG_NUMA is that it depends on FW_CFG_MAX_CPUS
> >> > (so it inherits the same questions above). The second is that
> >> > FW_CFG_NUMA is a CPU-based table, but there's nothing SeaBIOS can use to
> >> > know what CPUs FW_CFG_NUMA is referring to, except for the APIC IDs. So,
> >> > should FW_CFG_NUMA be indexed by APIC IDs?
> >> >
> >> >
> >> > - My proposal:
> >> >
> >> > My proposal is to try to keep things simple, and just use the following
> >> > rule:
> >> >
> >> >  - Never have a CPU with APIC ID >= FW_CFG_MAX_CPUS.
> >> >
> >> > This way:
> >> > - The SeaBIOS ACPI code can be kept simple.
> >> > - The current CPU hotplug interface can work as-is (up to 256 CPUs),
> >> >   based on APIC IDs.
> >> > - The current FW_CFG_NUMA interface can work as-is, indexed by APIC IDs.
> >> > - The ACPI tables can be easily kept stable between hibernate and
> >> >   resume, after CPU hotplug.
> >> >
> >> > This is the direction I am trying to go, and I am sending this just to
> >> > make sure nobody is against it, and to not surprise anybody when I send
> >> > a QEMU patch to make FW_CFG_MAX_CPUS be based on APIC IDs.
> >> >
> >> >
> >> > My second proposal would be to introduce a FW_CFG_MAX_APIC_ID entry, so
> >> > the SeaBIOS ACPI code can be kept simple.
> >> >
> >> > My third proposal would be to introduce a FW_CFG CPU Index => APIC ID
> >> > table, but I really wouldn't like to introduce a new type of CPU
> >> > identifier to be used between QEMU and SeaBIOS, when the APIC ID is a
> >> > perfectly good unique CPU identifier that already exists in bare metal
> >> > hardware.
> >> >
> >> > --
> >> > Eduardo
> >> >
> >>
> >
> > --
> > Eduardo
> 

-- 
Eduardo