Re: [Qemu-devel] [PATCH v5 05/14] vl: handle "-device dimm"


From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH v5 05/14] vl: handle "-device dimm"
Date: Tue, 16 Jul 2013 14:00:06 +0200

On Tue, 16 Jul 2013 12:31:46 +0200
Paolo Bonzini <address@hidden> wrote:

> On 16/07/2013 12:19, Igor Mammedov wrote:
> > On Tue, 16 Jul 2013 08:19:48 +0200
> > Paolo Bonzini <address@hidden> wrote:
> > 
> >> On 16/07/2013 03:27, Hu Tao wrote:
> >>>> I think it's the same.  One "-numa mem" option = one "-device dimm"
> >>>> option; both define one range.  Unused memory ranges may remain if you
> >>>> stumble upon an unusable range such as the PCI window.  For example two
> >>>> "-numa mem,size=2G" options would allocate memory from 0 to 2G and from
> >>>> 4 to 6G.
> >>>
> >>> So we can drop -dimm if we agree on -numa mem?
> >>
> >> Yes, the point of the "-numa mem" proposal was to avoid the concept of a
> >> "partially initialized device" that you had for DIMMs.
> > I thought -numa mem was for mapping initial memory to NUMA nodes.
> > It seems wrong to use it for representing a dimm device and also to limit
> > possible hot-plugged regions to ranges specified at startup.
> 
> It's not for DIMM devices, it is for reserving areas of the address
> space for hot-plugged RAM.  DIMM hotplug is done with "device_add dimm"
> (and you can also use "-numa mem,populated=no,... -device dimm,..." to
> start a VM with hot-unpluggable memory).
From the ACPI point of view there is no real need to reserve ranges: a memory
device in ACPI can provide a _PXM() method that returns its mapping to a NUMA
node. In my testing, both Linux and Windows guests use it, even when there is
an unnecessary mapping in the SRAT table, overriding the SRAT mapping with the
dynamic one.

It would be better not to use the "populated" concept at all. If there is a
-device dimm on the command line, then it is populated, and for a hot-plugged
dimm all necessary information could be generated dynamically.

> > we can leave -numa for initial memory mapping and manage the mapping
> > of hotpluggable regions with -device dimm,node=X,size=Y.
> > 
> > In that case command line -device dimm will provide a fully initialized
> > dimm device usable at startup (but hot-unpluggable) and
> >   (monitor) device_add dimm,node=X,size=Y
> > would serve the hot-plug case.
> > 
> > That way arbitrarily sized dimms could be hot-plugged without specifying
> > them at startup, like it's done on bare-metal.
> 
> But the memory ranges need to be specified at startup in the ACPI
> tables, and that's what "-numa mem" is for.
Not really. There is a caveat with Windows, which needs a hotpluggable SRAT
entry that tells it the max possible limit (otherwise Windows sees the new dimm
device but refuses to use it, saying "server is not configured for hotplug" or
something like this). But as long as such an entry exists, Windows happily uses
dynamic _CRS() and _PXM() if they are below that limit (even if a new range is
not in any range defined by SRAT).

And the ACPI spec doesn't say that SRAT MUST be populated with hotplug ranges.

It's somewhat simpler for bare-metal, where they might do it because supported
DIMM capacity is limited: they reserve static entries with the max supported
range per DIMM and know the DIMM count for the platform in advance. But the
actual _CRS() is dynamic anyway, since a plugged-in DIMM could have a smaller
capacity than the supported max for the slot.

To summarize the ACPI + Windows limitations:
 - ACPI needs to pre-allocate memory devices, i.e. the number of possible
   increments OSPM could utilize. It might be possible to overcome this
   limitation by using Load() or LoadTable() at runtime, but I haven't tried it.
 - Windows needs to know the max supported limit; a fake entry in SRAT from
   RamSize to max_mem works nicely there (tested with ws2008r2DC and ws2012DC).
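
To make the second point concrete, here is a rough Python model of that fake
entry. It is only a sketch, not QEMU code: SratMemEntry, fake_hotplug_entry()
and below_limit() are names made up for this mail, and it assumes a flat
address space where RamSize can be used directly as the base address (no PCI
hole).

```python
from dataclasses import dataclass

GiB = 1 << 30


@dataclass(frozen=True)
class SratMemEntry:
    """Hypothetical, simplified view of an SRAT memory affinity entry."""
    base: int
    length: int
    hotpluggable: bool


def fake_hotplug_entry(ram_size: int, max_mem: int) -> SratMemEntry:
    """Build the single fake hotpluggable entry covering [ram_size, max_mem)."""
    if max_mem <= ram_size:
        raise ValueError("max_mem must be larger than the initial RAM size")
    return SratMemEntry(base=ram_size, length=max_mem - ram_size,
                        hotpluggable=True)


def below_limit(entry: SratMemEntry, base: int, size: int) -> bool:
    """True if a dynamically reported range fits inside the fake entry."""
    return base >= entry.base and base + size <= entry.base + entry.length
```

With 4G of initial RAM and max_mem of 16G, the entry starts at 4G and is 12G
long; a 2G dimm reported at 8G is below the limit, one at 15G is not.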

That's why I was proposing to extend the "-m" option with a "slots" number
(i.e. nr of memory devices) and 'max_mem', to make Windows happy and keep mgmt
tools from going over the initially configured limit.

Then -device dimm could be used for hotpluggable mem available at startup,
and device_add for adding more dimms with user-defined sizes to desired nodes
at runtime.

This works nicely without any need for 'populated=xxx' and predefined ranges.
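
The bookkeeping this implies can be sketched in a few lines of Python. Again
this is only an illustration under my own assumptions, not the planned
implementation: the HotplugMemory class and its method names are hypothetical,
and "slots" and "max_mem" stand for the proposed "-m" extensions.

```python
GiB = 1 << 30


class HotplugMemory:
    """Hypothetical model: initial RAM plus dimms limited by slots/max_mem."""

    def __init__(self, ram_size: int, slots: int, max_mem: int):
        if max_mem < ram_size:
            raise ValueError("max_mem must not be below the initial RAM size")
        self.ram_size = ram_size
        self.slots = slots        # number of pre-allocated memory devices
        self.max_mem = max_mem    # upper limit advertised to the guest
        self.dimms = []           # (node, size) pairs, startup or hot-plugged

    def total(self) -> int:
        """Initial RAM plus everything plugged so far."""
        return self.ram_size + sum(size for _, size in self.dimms)

    def add_dimm(self, node: int, size: int) -> None:
        """Models both '-device dimm' at startup and 'device_add dimm'."""
        if len(self.dimms) >= self.slots:
            raise RuntimeError("no free memory slots")
        if self.total() + size > self.max_mem:
            raise RuntimeError("would exceed max_mem")
        self.dimms.append((node, size))
```

Sizes and nodes are chosen freely at plug time; the only things fixed at
startup are the number of slots and the overall limit.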

PS:
I'll be able to post a more or less usable RFC that does it on top of mst's
ACPI tables in QEMU by the end of this week.

> 
> > In addition command line -device would be used in migration case to describe
> > already hot-plugged dimms on target.
> 
> Yep.
> 
> Paolo



