[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity struc
From: |
Haozhong Zhang |
Subject: |
Re: [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity structures for NVDIMM |
Date: |
Thu, 22 Feb 2018 09:40:00 +0800 |
User-agent: |
NeoMutt/20171027 |
On 02/21/18 14:55 +0100, Igor Mammedov wrote:
> On Tue, 20 Feb 2018 17:17:58 -0800
> Dan Williams <address@hidden> wrote:
>
> > On Tue, Feb 20, 2018 at 6:10 AM, Igor Mammedov <address@hidden> wrote:
> > > On Sat, 17 Feb 2018 14:31:35 +0800
> > > Haozhong Zhang <address@hidden> wrote:
> > >
> > >> ACPI 6.2A Table 5-129 "SPA Range Structure" requires the proximity
> > >> domain of a NVDIMM SPA range must match with corresponding entry in
> > >> SRAT table.
> > >>
> > >> The address ranges of vNVDIMM in QEMU are allocated from the
> > >> hot-pluggable address space, which is entirely covered by one SRAT
> > >> memory affinity structure. However, users can set the vNVDIMM
> > >> proximity domain in NFIT SPA range structure by the 'node' property of
> > >> '-device nvdimm' to a value different than the one in the above SRAT
> > >> memory affinity structure.
> > >>
> > >> In order to solve such proximity domain mismatch, this patch build one
> > >> SRAT memory affinity structure for each NVDIMM device with the
> > >> proximity domain used in NFIT. The remaining hot-pluggable address
> > >> space is covered by one or multiple SRAT memory affinity structures
> > >> with the proximity domain of the last node as before.
> > >>
> > >> Signed-off-by: Haozhong Zhang <address@hidden>
> > > If we consider hotpluggable system, correctly implemented OS should
> > > be able pull proximity from Device::_PXM and override any value from SRAT.
> > > Do we really have a problem here (anything that breaks if we would use
> > > _PXM)?
> > > Maybe we should add _PXM object to nvdimm device nodes instead of
> > > massaging SRAT?
> >
> > Unfortunately _PXM is an awkward fit. Currently the proximity domain
> > is attached to the SPA range structure. The SPA range may be
> > associated with multiple DIMM devices and those individual NVDIMMs may
> > have conflicting _PXM properties.
> There shouldn't be any conflict here as NVDIMM device's _PXM method,
> should override in runtime any proximity specified by parent scope.
> (as parent scope I'd also count boot time NFIT/SRAT tables).
>
> To make it more clear we could clear valid proximity domain flag in SPA
> like this:
>
> diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
> index 59d6e42..131bca5 100644
> --- a/hw/acpi/nvdimm.c
> +++ b/hw/acpi/nvdimm.c
> @@ -260,9 +260,7 @@ nvdimm_build_structure_spa(GArray *structures,
> DeviceState *dev)
> */
> nfit_spa->flags = cpu_to_le16(1 /* Control region is strictly for
> management during hot add/online
> - operation */ |
> - 2 /* Data in Proximity Domain field is
> - valid*/);
> + operation */);
>
> /* NUMA node. */
> nfit_spa->proximity_domain = cpu_to_le32(node);
>
> > Even if that was unified across
> > DIMMs it is ambiguous whether a DIMM-device _PXM would relate to the
> > device's control interface, or the assembled persistent memory SPA
> > range.
> I'm not sure what you mean under 'device's control interface',
> could you clarify where the ambiguity comes from?
>
> I read spec as: _PXM applies to address range covered by NVDIMM
> device it belongs to.
>
> As for assembled SPA, I'd assume that it applies to interleaved set
> and all NVDIMMs with it should be on the same node. It's somewhat
> irrelevant question though as QEMU so far implements only
> 1:1:1/SPA:Region Mapping:NVDIMM Device/
> mapping.
>
> My main concern with using static configuration tables for proximity
> mapping, we'd miss on hotplug side of equation. However if we start
> from dynamic side first, we could later complement it with static
> tables if there really were need for it.
This patch affects only the static tables and static-plugged NVDIMM.
For hot-plugged NVDIMMs, guest OSPM still needs to evaluate _FIT to
get the information of the new NVDIMMs including their proximity
domains.
One intention of this patch is to simulate the bare metal as much as
possible. I have been using this patch to develop and test NVDIMM
enabling work on Xen, and think it might be useful for developers of
other OS and hypervisors.
Haozhong
Re: [Qemu-devel] [PATCH] hw/acpi-build: build SRAT memory affinity structures for NVDIMM, no-reply, 2018/02/23