Re: [Qemu-devel] [PATCH 33/35] pc: ACPI BIOS: reserve SRAT entry for hot
From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH 33/35] pc: ACPI BIOS: reserve SRAT entry for hotplug mem hole
Date: Tue, 15 Apr 2014 17:55:22 +0200

On Tue, 15 Apr 2014 14:37:01 +0800
Hu Tao <address@hidden> wrote:
> On Mon, Apr 14, 2014 at 06:44:42PM +0200, Igor Mammedov wrote:
> > On Mon, 14 Apr 2014 15:25:01 +0800
> > Hu Tao <address@hidden> wrote:
> >
> > > On Fri, Apr 04, 2014 at 03:36:58PM +0200, Igor Mammedov wrote:
> > > > Needed for Windows to use hotplugged memory device, otherwise
> > > > it complains that server is not configured for memory hotplug.
> > > > Tests show that afterwards it uses dynamically provided
> > > > proximity value from _PXM() method if available.
> > > >
> > > > Signed-off-by: Igor Mammedov <address@hidden>
> > > > ---
> > > > hw/i386/acpi-build.c | 14 ++++++++++++++
> > > > 1 file changed, 14 insertions(+)
> > > >
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index ef89e99..012b100 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1197,6 +1197,8 @@ build_srat(GArray *table_data, GArray *linker,
> > > > uint64_t curnode;
> > > > int srat_start, numa_start, slots;
> > > > uint64_t mem_len, mem_base, next_base;
> > > > + PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> > > > + ram_addr_t hotplug_as_size = memory_region_size(&pcms->hotplug_memory);
> > > >
> > > > srat_start = table_data->len;
> > > >
> > > > @@ -1261,6 +1263,18 @@ build_srat(GArray *table_data, GArray *linker,
> > > > acpi_build_srat_memory(numamem, 0, 0, 0, MEM_AFFINITY_NOFLAGS);
> > > > }
> > > >
> > > > + /*
> > > > + * Fake entry required by Windows to enable memory hotplug in OS.
> > > > + * Individual DIMM devices override proximity set here via _PXM method,
> > > > + * which returns associated with it NUMA node id.
> > > > + */
> > > > + if (hotplug_as_size) {
> > > > + numamem = acpi_data_push(table_data, sizeof *numamem);
> > > > + acpi_build_srat_memory(numamem, pcms->hotplug_memory_base,
> > > > + hotplug_as_size, 0, MEM_AFFINITY_HOTPLUGGABLE |
> > > > + MEM_AFFINITY_ENABLED);
> > > > + }
> > > > +
> > >
> > > Hi Igor,
> > >
> > > With the faked entry, memory unplug doesn't work. Entries should be set
> > > up for each node with correct flags(enable, hotpluggable) to make memory
> > > unplug work.
> > Could you be more specific: what doesn't work, how does it fail, and why is
> > there a need for SRAT entries per DIMM?
> > I've briefly tested with your unplug patches and Linux seemed to be OK with
> > unplug, i.e. the device node was removed from /sys after receiving the remove
> > notification.
>
>
> Following are fail cases:
>
I did some testing with an upstream kernel with hot-remove enabled.
I tested only the "this patch" case.
> ------------------------------------------------------------------------+----------------------------+-------------
> guest commands                                                          | this patch                 | hacked SRAT
> ------------------------------------------------------------------------+----------------------------+-------------
> echo 'online' > /sys/devices/system/memory/memory32/state && \          |                            |
> echo 'offline' > /sys/devices/system/memory/memory32/state              | fail                       | success
Works for me, but offlining is allowed to fail, since page migration
may fail if the memory section or part of it is not movable.
> ------------------------------------------------------------------------+----------------------------+-------------
> echo 'online' > /sys/devices/system/memory/memory32/state && \          |                            |
> echo 1 > /sys/devices/LNXSYSTM\:00/device\:00/PNP0C80\:00/eject         | fail                       | success
The same as case #1.
> ------------------------------------------------------------------------+----------------------------+-------------
> echo 'online_movable' > /sys/devices/system/memory/memory32/state       | fail[first memory block]   | fail
This is Linux implementation specific; it should be fixed in the guest and
has nothing to do with the QEMU side.
PS: all hot-added memory sections can be onlined with 'online_movable'
if it is done in reverse order.
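
A minimal sketch of the reverse-order onlining mentioned above, for illustration only. The function name online_movable_reverse and the MEMORY_SYSFS override are assumptions made for this example (they are not part of the kernel or of QEMU); the block indexes 32..35 are taken from the test cases in this thread. Whether the kernel accepts 'online_movable' for a given block still depends on the guest kernel version.

```shell
#!/bin/sh
# Hypothetical helper: online memory blocks from the highest index down to
# the lowest, so that each block is adjacent to an already movable one.
# MEMORY_SYSFS is an illustrative override for trying the loop on a scratch
# directory; on a real guest it defaults to the usual sysfs location.
MEMORY_SYSFS="${MEMORY_SYSFS:-/sys/devices/system/memory}"

online_movable_reverse() {
    first="$1"   # lowest block index, e.g. 32
    last="$2"    # highest block index, e.g. 35
    i="$last"
    while [ "$i" -ge "$first" ]; do
        state="$MEMORY_SYSFS/memory$i/state"
        if echo online_movable > "$state"; then
            echo "memory$i: online_movable requested"
        else
            echo "memory$i: write failed" >&2
            return 1
        fi
        i=$((i - 1))
    done
}

# Usage on a guest (as root), not executed here:
#   online_movable_reverse 32 35
```

The loop stops at the first block that refuses the request, which makes it easy to see where the movable range ends.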
> ------------------------------------------------------------------------+----------------------------+-------------
> echo 'online_movable' > /sys/devices/system/memory/memory35/state && \  |                            |
> echo 'offline' > /sys/devices/system/memory/memory35/state              | success[last memory block] | success
> ------------------------------------------------------------------------+----------------------------+-------------
> echo 'online_movable' > /sys/devices/system/memory/memory32/state && \  |                            |
> echo 1 > /sys/devices/LNXSYSTM\:00/device\:00/PNP0C80\:00/eject         | success[last memory block] | success
> ------------------------------------------------------------------------+----------------------------+-------------
Offlining a movable memory section is guaranteed to succeed, hence no issue.
Reading the upstream kernel code, it honors the PNP0C80._PXM value, which
overrides anything that was provided in SRAT. So I don't see why a hacked SRAT
would make any difference.
Could you verify with the latest upstream kernel?
PS: do not forget to check the "removable" attribute before marking a case as
failed. One time I saw a guest panic on a "successful" eject of a ZONE_NORMAL
memory section, since the guest was still using it (so there are still
hot-remove bugs in the kernel), and "removable" doesn't guarantee anything for
a ZONE_NORMAL memory section.
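
A rough sketch of how the "removable" check could be folded into the test procedure. The helper name try_offline and the MEMORY_SYSFS override are made up for this example; the sysfs attributes 'removable' and 'state' are the ones used in the test cases above. As noted, removable=1 is only a hint and does not guarantee a safe removal for ZONE_NORMAL memory.

```shell
#!/bin/sh
# Hypothetical helper: only count an offline attempt as a failure when the
# block reported removable=1 beforehand.
# MEMORY_SYSFS is an illustrative override; it defaults to the real sysfs path.
MEMORY_SYSFS="${MEMORY_SYSFS:-/sys/devices/system/memory}"

try_offline() {
    block="$MEMORY_SYSFS/memory$1"
    removable="$(cat "$block/removable" 2>/dev/null)"
    if [ "$removable" != "1" ]; then
        # Not reported as removable: the kernel is allowed to refuse offlining.
        echo "memory$1: skipped (removable=${removable:-unknown})"
        return 0
    fi
    if echo offline > "$block/state" 2>/dev/null; then
        echo "memory$1: offlined"
    else
        echo "memory$1: offline failed although removable=1"
        return 1
    fi
}

# Usage on a guest (as root), not executed here:
#   try_offline 32
```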
>
> Hacked SRAT memory entry:
>
> PXM: 0
> range: 4G ~ 4G + 512M
> flags: Enabled Hot-Pluggable
>
> PXM: 1
> range: 4G + 512M ~ 5G
> flags: Enabled Hot-Pluggable
>
> So I think we should add maxmem to -numa and build SRAT accordingly.
> But there is something I'm not sure about. I added a dimm in node 1, but
> its memory range fell in node 0. Users can always cause such a mismatch
> with dimm,start,node.
>
>
>
> This is the relevant part of the command line:
>
> qemu command line: -m 512M,slots=4,maxmem=2G \
> -object memory-ram,id=foo,size=512M \
> -numa node,id=n0,mem=256M -numa node,id=n1,mem=256M
>
> (qemu monitor) device_add dimm,id=d0,memdev=foo,node=1
>
> >
> > >
> > > Windows has not been tested yet. I encountered a problem that there is
> > > no SRAT in Windows so even memory hotplug doesn't work. (but there is
> > > in Linux with the same configuration).
> > For Windows to work one needs to add "-numa node" CLI option so that
> > SRAT would be exposed to guest.
>
> Thanks. I need to double-check.
>
> > Paolo suggested to enable -numa node by default, I guess we can do it
> > once NUMA re-factoring is merged.
> >
> > That said, I haven't found any information that Windows supports
> > memory hot-remove. Google tells that only hot-add is supported
> > for up to WS2008R2. I've tested WS2012R2, it doesn't work either,
> > i.e. it sees but ignores Notify request.
> >
> > >
> > > Regards,
> > > Hu Tao
> > >
--
Regards,
Igor