
From: Bharata B Rao
Subject: Re: [Qemu-devel] [Qemu-ppc] [RFC PATCH v1] spapr: Memory hot-unplug support
Date: Wed, 11 Nov 2015 12:50:57 +0530
User-agent: Mutt/1.5.23 (2014-03-12)

On Wed, Nov 11, 2015 at 12:36:30PM +1100, David Gibson wrote:
> On Mon, Oct 26, 2015 at 03:23:05PM +0530, Bharata B Rao wrote:
> > Add support to hot remove pc-dimm memory devices.
> 
> Sorry it's taken me so long to look at this.
> 
> > TODO: In response to memory hot removal operation on a DIMM device,
> > guest kernel might refuse to offline a few LMBs that are part of that 
> > device.
> > In such cases, we will have a DIMM device that has some LMBs online and some
> > LMBs offline. To avoid this situation, drmgr could be enhanced to support
> > a command line option that results in removal of all the LMBs or none.

I am realizing that it might not be possible to support the above notion
of all-or-nothing removal of a DIMM unless we have an association of LMBs
with DIMMs. drmgr could be handling unplug requests for multiple DIMMs at
a time, and there is really no way to determine how many LMBs of which
DIMM were offlined successfully.
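
One way to get that association would be to derive, on the QEMU side, the
range of memory blocks a DIMM owns from its base address and size. The
sketch below is only illustrative and not part of the posted patch; the
function name and the 256 MiB LMB size are assumptions.

#include <stdint.h>
#include <stdio.h>

#define SPAPR_MEMORY_BLOCK_SIZE (256ULL << 20) /* assumed 256 MiB per LMB */

/* Compute which memory blocks (and hence LMB DRCs) back a given DIMM. */
static void dimm_lmb_range(uint64_t dimm_addr, uint64_t dimm_size,
                           uint32_t *first_block, uint32_t *nr_lmbs)
{
    *first_block = dimm_addr / SPAPR_MEMORY_BLOCK_SIZE;
    *nr_lmbs = dimm_size / SPAPR_MEMORY_BLOCK_SIZE;
}

int main(void)
{
    uint32_t first, count;

    /* example: a 1 GiB DIMM plugged at guest physical address 4 GiB */
    dimm_lmb_range(4ULL << 30, 1ULL << 30, &first, &count);
    printf("DIMM owns memory blocks %u..%u\n", first, first + count - 1);
    return 0;
}

With such a mapping, QEMU could tell, per DIMM, which of its LMBs the
guest actually released, which is what the all-or-nothing behaviour would
need.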

With the memory unplug patch that I have posted here and the existing
drmgr, the behaviour is as follows:

- If the guest kernel refuses to offline some LMBs of the DIMM for which
  the hot-unplug request was issued, those LMBs will remain online while
  the rest of the LMBs of that DIMM go offline. The DIMM device will
  continue to exist from QEMU's point of view and will be removed only
  when all the LMBs belonging to it have been removed.

- Since we implement LMB addition/removal by count, during an unplug
  request drmgr will walk through all the removable/hotplugged DIMMs and
  attempt to unplug the specified number of LMBs. Among other things,
  this involves first marking each LMB offline (via
  /sys/devices/system/memory/memoryXX/online) and then releasing its DRC,
  which requires isolating the DRC.

  QEMU will fail the isolation request for any LMB DRC that has not been
  marked for removal by a prior device_del of its DIMM (see the sketch
  after this list). In response to this failure, drmgr brings the LMB
  back online by writing to /sys/devices/system/memory/memoryXX/online.

  Thus some LMBs that are unrelated to the current DIMM removal request
  are unnecessarily taken offline only to be brought back online
  immediately.

  There is another possible situation here. Suppose DIMM1 has 4 LMBs, of
  which 2 went offline in response to its unplug request. Now, during the
  removal of DIMM2, there is nothing that prevents the remaining 2 LMBs
  of DIMM1 from going away (if the guest allows it). So we end up with
  some LMBs of DIMM1 going offline while the unplug request for DIMM2 is
  being serviced. Though I have not seen this in practice, I believe it
  is possible.
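
Here is the sketch referred to above. It is only illustrative -- the
structure and function names are hypothetical and not taken from the
posted patch -- but it shows the idea: an isolate request for an LMB DRC
succeeds only if a prior device_del on the owning DIMM marked that DRC
for removal; otherwise drmgr sees a failure and brings the LMB back
online.

#include <stdbool.h>
#include <stdio.h>

#define RTAS_OUT_SUCCESS     0
#define RTAS_OUT_HW_ERROR   (-1)   /* placeholder return codes */

typedef struct LmbDrc {
    bool awaiting_release;  /* set by device_del on the owning DIMM */
} LmbDrc;

static int lmb_drc_isolate(LmbDrc *drc)
{
    if (!drc->awaiting_release) {
        /* the guest is offlining memory QEMU never asked it to remove */
        return RTAS_OUT_HW_ERROR;
    }
    /* ...detach the LMB; remove the DIMM once all of its LMBs are gone... */
    return RTAS_OUT_SUCCESS;
}

int main(void)
{
    LmbDrc unmarked = { .awaiting_release = false };
    LmbDrc marked   = { .awaiting_release = true };

    printf("isolate unmarked LMB: %d\n", lmb_drc_isolate(&unmarked)); /* fails */
    printf("isolate marked LMB:   %d\n", lmb_drc_isolate(&marked));   /* succeeds */
    return 0;
}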

> 
> Hm.. what would be the end result of such a situation?  We want to
> handle it as gracefully as we can, even if the guest has old tools.
> Is there some way we can detect this failure condition, and re-connect
> the DIMM?
> 
> It does highlight the fact that the PAPR hotplug interface and the
> pc-dimm model don't work together terribly well.  I think we have to

I believe we will have a saner model once PAPR is updated to support
removal of a specified number of LMBs starting at a particular DRC index.
With that in place, we could support removal in the pc-dimm model
correctly.
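
To make the idea concrete, the request would carry a starting DRC index
and a count rather than just a count. The layout and field names below
are hypothetical -- they are not from the PAPR spec or from the posted
patch -- and only illustrate why a partial failure could then be
attributed to exactly one DIMM.

#include <stdint.h>
#include <stdio.h>

struct lmb_unplug_request {
    uint32_t first_drc_index;   /* DRC index of the DIMM's first LMB */
    uint32_t nr_lmbs;           /* consecutive LMBs to release */
};

int main(void)
{
    /* example: release the 4 LMBs backing a single 1 GiB DIMM */
    struct lmb_unplug_request req = {
        .first_drc_index = 0x80000010,  /* made-up index */
        .nr_lmbs         = 4,
    };

    printf("release %u LMBs starting at DRC index 0x%x\n",
           req.nr_lmbs, req.first_drc_index);
    return 0;
}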

> try to support it for the sake of management layers, but I do wonder
> if we ought to think about an alternative "lmb-pool" backend, where
> the precise location of memory blocks isn't so important.  With some
> thought such a backend might also be useful for paravirt x86.
> 
> Which also makes me think, I wonder if it would be possible to wire up
> a PAPR compatible interface to qemu's balloon backend, since in some
> ways the PAPR memory hotplug model acts more like a balloon (in that
> the guest physical address of removed LMBs isn't usually important to
> the host).
> 
> Still, we need to get the dimm backed model working first, I guess.
> 

So,

1. can we live with the current hot-unplug behaviour that I described
   above for now, or
2. should we put pc-dimm based memory hot-unplug on the backburner until
   we get the PAPR spec changes in place, or
3. should we start exploring the lmb-pool model?

Regards,
Bharata.



