qemu-devel

Re: [PATCH] smbios: make memory device size configurable per Machine


From: Michael S. Tsirkin
Subject: Re: [PATCH] smbios: make memory device size configurable per Machine
Date: Sat, 20 Jul 2024 15:36:10 -0400

On Thu, Jul 11, 2024 at 03:05:11PM +0200, Igor Mammedov wrote:
> On Thu, 11 Jul 2024 07:13:27 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Thu, Jul 11, 2024 at 09:48:22AM +0200, Igor Mammedov wrote:
> > > Currently the SMBIOS maximum memory device chunk is capped at 16GiB,
> > > which is fine for most cases (QEMU uses it to describe initial
> > > RAM (type 17 SMBIOS table entries)).
> > > However, when starting a guest with terabytes of RAM this leads to
> > > too many memory device structures, which eventually upsets the Linux
> > > kernel, as it reserves only 64K for these entries and runs out of
> > > reserved memory once that limit is crossed.
> > > 
> > > Instead of partitioning initial RAM into 16GiB chunks, use the maximum
> > > possible chunk size that the SMBIOS spec allows[1], which lets us
> > > encode RAM in MiB units in a uint32_t-1 field (up to 2047TiB).
> > > As a result, initial RAM will generate only one type 17 structure
> > > until hosts/guests become able to use more RAM in the future.
> > > 
> > > Compat changes:
> > > We can't unconditionally change the chunk size as it would break
> > > the QEMU<->guest ABI (and migration). Thus introduce a new machine class
> > > field that lets older versioned machines keep using 16GiB chunks
> > > while new machine types use the maximum possible chunk size.
> > > 
> > > While it might seem risky to raise the max entry size this much
> > > (far beyond what current physical RAM modules support),
> > > I don't expect it to cause many issues, modulo uncovering bugs
> > > in software running within the guest. Those should be fixed
> > > on the guest side to handle the SMBIOS spec properly, especially if
> > > the guest is expected to support such huge RAM configs.
> > > In the worst case, QEMU can reduce the chunk size later if we
> > > care enough about introducing a workaround for some 'unfixable'
> > > guest OS, either by fixing up the next machine type or by
> > > giving users a CLI option to customize it.
> > > 
> > > 1) SMBIOS 3.1.0 7.18.5 Memory Device — Extended Size
> > > 
> > > PS:
> > > * tested on an 8TiB host with a RHEL6 guest, which seems to parse
> > >   type 17 SMBIOS table entries correctly (according to 'dmidecode').
> > > 
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > ---
> > >  include/hw/boards.h |  4 ++++
> > >  hw/arm/virt.c       |  1 +
> > >  hw/core/machine.c   |  1 +
> > >  hw/i386/pc_piix.c   |  1 +
> > >  hw/i386/pc_q35.c    |  1 +
> > >  hw/smbios/smbios.c  | 11 ++++++-----
> > >  6 files changed, 14 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/include/hw/boards.h b/include/hw/boards.h
> > > index ef6f18f2c1..48ff6d8b93 100644
> > > --- a/include/hw/boards.h
> > > +++ b/include/hw/boards.h
> > > @@ -237,6 +237,9 @@ typedef struct {
> > >   *    purposes only.
> > >   *    Applies only to default memory backend, i.e., explicit memory backend
> > >   *    wasn't used.
> > > + * @smbios_memory_device_size:
> > > + *    Default size of memory device,
> > > + *    SMBIOS 3.1.0 "7.18 Memory Device (Type 17)"  
> > 
> > Maybe it would be better to just make this a boolean,
> > and put the spec related logic in smbios.c ?
> > WDYT?
> 
> Using a bool here seems awkward to me,
> i.e. the semantics would be unclear and compat handling would be
> complicated as well.
> 
> And if we have to expose it to users someday,
> it would be logical to make it a machine property.
> Given it's used not only by x86, having it as a value
> here lets each machine customize it if necessary
> using a well-established pattern (incl. compat machinery).
> 
> 
> > >   */
> > >  struct MachineClass {
> > >      /*< private >*/
> > > @@ -304,6 +307,7 @@ struct MachineClass {
> > >      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
> > >      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> > >      ram_addr_t (*fixup_ram_size)(ram_addr_t size);
> > > +    uint64_t smbios_memory_device_size;
> > >  };
> > >  
> > >  /**
> > > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > > index b0c68d66a3..719e83e6a1 100644
> > > --- a/hw/arm/virt.c
> > > +++ b/hw/arm/virt.c
> > > @@ -3308,6 +3308,7 @@ DEFINE_VIRT_MACHINE_AS_LATEST(9, 1)
> > >  static void virt_machine_9_0_options(MachineClass *mc)
> > >  {
> > >      virt_machine_9_1_options(mc);
> > > +    mc->smbios_memory_device_size = 16 * GiB;
> > >      compat_props_add(mc->compat_props, hw_compat_9_0, hw_compat_9_0_len);
> > >  }
> > >  DEFINE_VIRT_MACHINE(9, 0)
> > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > index bc38cad7f2..3cfdaec65d 100644
> > > --- a/hw/core/machine.c
> > > +++ b/hw/core/machine.c
> > > @@ -1004,6 +1004,7 @@ static void machine_class_init(ObjectClass *oc, void *data)
> > >      /* Default 128 MB as guest ram size */
> > >      mc->default_ram_size = 128 * MiB;
> > >      mc->rom_file_has_mr = true;
> > > +    mc->smbios_memory_device_size = 2047 * TiB;
> > >  
> > >      /* numa node memory size aligned on 8MB by default.
> > >       * On Linux, each node's border has to be 8MB aligned  
> > 
> > 
> > 
> > All these values really should be documented.
> It's in the commit message, but right, I'll document the value here
> on respin so it is easier for the reader to see where it
> comes from.
> 
> > And I feel 
> ???

Sorry, I meant what I said above: can we find some place where it is
not awkward to quote the spec?
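
To make the origin of the 2047 TiB default concrete, here is a minimal
sketch of how a type 17 size could be encoded, assuming the SMBIOS
layout where a Size value of 0x7FFF redirects to the Extended Size
field, which holds the size in MB in bits 30:0. The struct and the
function name are hypothetical and are not taken from hw/smbios/smbios.c.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/* Hypothetical holder for the two size fields of a type 17 structure. */
struct t17_size {
    uint16_t size;          /* Size: in MB when bit 15 is clear */
    uint32_t extended_size; /* Extended Size: bits 30:0, in MB */
};

/* Encode a device size given in bytes into the two fields above. */
static struct t17_size t17_encode_size(uint64_t bytes)
{
    struct t17_size s = { 0, 0 };
    uint64_t mb = bytes / MiB;

    /* Bit 31 of Extended Size is reserved, so only 31 bits are usable. */
    assert(mb <= 0x7FFFFFFFu);

    if (mb < 0x7FFF) {
        s.size = (uint16_t)mb;          /* fits in the 16-bit Size field */
    } else {
        s.size = 0x7FFF;                /* marker: see Extended Size */
        s.extended_size = (uint32_t)mb;
    }
    return s;
}
```

With this encoding a 16 GiB chunk still fits in the plain Size field,
while 2047 TiB is the largest whole-TiB value that fits in the 31 usable
bits of Extended Size.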

> > 
> > 
> > 
> > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> > > index 9445b07b4f..d9e69243b4 100644
> > > --- a/hw/i386/pc_piix.c
> > > +++ b/hw/i386/pc_piix.c
> > > @@ -495,6 +495,7 @@ static void pc_i440fx_machine_9_0_options(MachineClass *m)
> > >      pc_i440fx_machine_9_1_options(m);
> > >      m->alias = NULL;
> > >      m->is_default = false;
> > > +    m->smbios_memory_device_size = 16 * GiB;
> > >  
> > >      compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
> > >      compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
> > > diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> > > index 71d3c6d122..9d108b194e 100644
> > > --- a/hw/i386/pc_q35.c
> > > +++ b/hw/i386/pc_q35.c
> > > @@ -374,6 +374,7 @@ static void pc_q35_machine_9_0_options(MachineClass *m)
> > >      PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> > >      pc_q35_machine_9_1_options(m);
> > >      m->alias = NULL;
> > > +    m->smbios_memory_device_size = 16 * GiB;
> > >      compat_props_add(m->compat_props, hw_compat_9_0, hw_compat_9_0_len);
> > >      compat_props_add(m->compat_props, pc_compat_9_0, pc_compat_9_0_len);
> > >      pcmc->isa_bios_alias = false;
> > > diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
> > > index 3b7703489d..a394514264 100644
> > > --- a/hw/smbios/smbios.c
> > > +++ b/hw/smbios/smbios.c
> > > @@ -1093,6 +1093,7 @@ static bool smbios_get_tables_ep(MachineState *ms,
> > >                         Error **errp)
> > >  {
> > >      unsigned i, dimm_cnt, offset;
> > > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > >      ERRP_GUARD();
> > >  
> > >      assert(ep_type == SMBIOS_ENTRY_POINT_TYPE_32 ||
> > > @@ -1123,12 +1124,12 @@ static bool smbios_get_tables_ep(MachineState *ms,
> > >      smbios_build_type_9_table(errp);
> > >      smbios_build_type_11_table();
> > >  
> > > -#define MAX_DIMM_SZ (16 * GiB)
> > > -#define GET_DIMM_SZ ((i < dimm_cnt - 1) ? MAX_DIMM_SZ \
> > > -                                        : ((current_machine->ram_size - 1) % MAX_DIMM_SZ) + 1)
> > > +#define GET_DIMM_SZ ((i < dimm_cnt - 1) ? mc->smbios_memory_device_size \
> > > +    : ((current_machine->ram_size - 1) % mc->smbios_memory_device_size) + 1)
> > >  
> > > -    dimm_cnt = QEMU_ALIGN_UP(current_machine->ram_size, MAX_DIMM_SZ) /
> > > -               MAX_DIMM_SZ;
> > > +    dimm_cnt = QEMU_ALIGN_UP(current_machine->ram_size,
> > > +                             mc->smbios_memory_device_size) /
> > > +               mc->smbios_memory_device_size;
> > >  
> > >      /*
> > >       * The offset determines if we need to keep additional space between
> > > -- 
> > > 2.43.0  
> > 
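
For reference, the chunking arithmetic in the hunk above can be sketched
as standalone C. This is an illustration only, assuming a nonzero RAM
size; dimm_count() and dimm_size() are hypothetical helpers that mirror
the dimm_cnt computation and the GET_DIMM_SZ macro, not functions from
the patch.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

/* Number of type 17 entries needed to cover ram_size (round up). */
static uint64_t dimm_count(uint64_t ram_size, uint64_t chunk_sz)
{
    assert(ram_size > 0 && chunk_sz > 0);
    return (ram_size + chunk_sz - 1) / chunk_sz;
}

/*
 * Size of entry i: every entry is a full chunk except the last one,
 * which holds whatever remains (a full chunk if RAM divides evenly).
 */
static uint64_t dimm_size(uint64_t ram_size, uint64_t chunk_sz, uint64_t i)
{
    uint64_t cnt = dimm_count(ram_size, chunk_sz);

    assert(i < cnt);
    return (i < cnt - 1) ? chunk_sz : ((ram_size - 1) % chunk_sz) + 1;
}
```

For an 8 TiB guest, dimm_count(8 * TiB, 16 * GiB) gives 512 entries,
while dimm_count(8 * TiB, 2047 * TiB) gives a single entry of 8 TiB.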



