
From: Nishanth Aravamudan
Subject: Re: [Qemu-devel] Regression (?) due to c4177479 ('spapr: make sure RMA is in first mode of first memory node')
Date: Thu, 17 Apr 2014 16:03:23 -0700
User-agent: Mutt/1.5.21 (2010-09-15)

On 18.04.2014 [08:46:55 +1000], Benjamin Herrenschmidt wrote:
> On Fri, 2014-04-18 at 08:43 +1000, Alexey Kardashevskiy wrote:
> > On 04/18/2014 06:25 AM, Nishanth Aravamudan wrote:
> > > Hi Alexey,
> > > 
> > > Prior to the $SUBJECT commit, we could present memoryless node0s to
> > > guests. Now, we indicate that we don't have the requisite 128M for the
> > > RMA if node 0 has no memory. Note that a memoryless node0 is possible
> > > under PowerVM (but not predictably present) so I was hoping to use KVM
> > > to test relevant fixes for memoryless nodes.
> > > 
> > > I think this change is a misinterpretation of the PAPR standard, though.
> > > Yes, the RMA must be in the first block of memory, but that isn't
> > > necessarily on node 0. The topology of a PAPR-compliant guest does not
> > > require a node 0 (and in fact, under PowerVM, Linux doesn't actually
> > > require node 0 either, but it would under KVM).
> > > 
> > > Thoughts? I suppose it's fine to say that node 0 must be sufficiently
> > > populated under KVM -- there's not really a reason to not have memory on
> > > a given node (except maybe ballooning). I can keep the commit reverted
> > > locally for testing purposes. Just wanted to see if the semantic change
> > > was intentional.
> > 
> > 
> > PAPR spec 2.7:
> > C.6.6 Memory Node
> > ===
> > This section defines the PAPR modifications to the OF /memory node. In
> > PAPR, the memory allocated to an OS image may be divided into a number
> > of allocation units called "regions" or "Logical Memory Blocks" (LMB).
> > An OS image may be dynamically allocated additional regions or may be
> > asked to release regions. Each LMB is either represented in the device
> > tree by its own /memory node or by an entry in
> > /ibm,dynamic-reconfiguration-memory nodes (see Section C.6.6.2,
> > "ibm,dynamic-reconfiguration-memory" on page 1089).
> > The /memory node that refers to the storage starting at real address
> > zero ("reg" property starting at the value zero) always remains
> > allocated to an OS image.
> >
> > The client program is initially loaded into this storage, called the
> > RMA, that is represented by the first value of the "reg" property of
> > this first /memory node.
> > 
> > The last sentence is why the change was made. It does not say "first
> > populated" node. I am adding Ben as he had very strong opinion about this
> > thing.
> 
> You are confusing device-tree node with NUMA nodes.
> 
> Yes, it must be the LMB at address 0, which is the /memory node, but
> that doesn't have to be NUMA node 0.

Yeah, so I think the check that was added:

-    if (spapr->rma_size > node0_size) {
-        fprintf(stderr, "Error: Numa node 0 has to span the RMA (%#08"HWADDR_PRIx")\n",
-                spapr->rma_size);
-        exit(1);
-    }

is incorrectly checking against node0_size? It should be checking
against the first LMB instead, right?

Thanks,
Nish



