qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [Qemu-ppc] [FIX PATCH] spapr: Fix QEMU abort during mem


From: Daniel Henrique Barboza
Subject: Re: [Qemu-devel] [Qemu-ppc] [FIX PATCH] spapr: Fix QEMU abort during memory unplug
Date: Wed, 19 Jul 2017 15:03:34 -0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1



On 07/19/2017 05:54 AM, Bharata B Rao wrote:
Commit 0cffce56 (hw/ppc/spapr.c: adding pending_dimm_unplugs to
sPAPRMachineState) introduced a new way to track pending LMBs of DIMM
device that is marked for removal. Since this commit we can hit the
assert in spapr_pending_dimm_unplugs_add() in the following situation:

- DIMM device removal fails as the guest doesn't allow the removal.
- Subsequent attempt to remove the same DIMM would hit the assert
   as the corresponding sPAPRDIMMState is still part of the
   pending_dimm_unplugs list.

I've asked Mike if there was any way of knowing if the unplug were successful
in the guest, and aside from the DRC state changes we don't have
any other way. I wanted to propose a simple cleanup of the remaining
sPAPRDIMMState from the failed unplug to avoid this situation altogether.

It appears that we'll have to deal with it with extra logic to cover this.

Fix this by removing the assert and conditionally adding the
sPAPRDIMMState to pending_dimm_unplugs list only when it is not
already present.

I think there is a better place to put this verification. The function
spapr_pending_dimm_unplugs_add is called in two places:

1- in spapr_memory_unplug_request, when it first creates the structure
before starting the DRC unplug requests;

2- in spapr_recover_pending_dimm_state, to restore the DIMM state when
it is lost after a migration for example.

spapr_recover_pending_dimm_state is called as follows from spapr_lmb_release:


sPAPRDIMMState *ds = spapr_pending_dimm_unplugs_find(spapr, PC_DIMM(dev));

    /* This information will get lost if a migration occurs
     * during the unplug process. In this case recover it. */
    if (ds == NULL) {
        ds = spapr_recover_pending_dimm_state(spapr, PC_DIMM(dev));
        /* The DRC being examined by the caller at least must be counted */
        g_assert(ds->nr_lmbs);
    }


As we can see there is a find() call before calling recover(). This means that if you do the verification inside unplugs_add() you're end up doing find() twice when this recover() condition happens. This also means that you're most likely hitting this situation from the regular LMB unplug flow from spapr_memory_unplug_request.

My suggestion would be to move this verification to spapr_memory_unplug_request
like this:

    (...)
    if (local_err) {
        goto out;
    }

/* If there were prior unplugs attempts from this same DIMM that failed, there will be a dimm state object inside spapr->pending_dimm_unplugs list.
       In this case, do not add a new sPAPRDIMMState. */
    ds = spapr_recover_pending_dimm_state(spapr, dimm);
    if (ds == NULL) {
        ds = g_malloc0(sizeof(sPAPRDIMMState));
        ds->nr_lmbs = nr_lmbs;
        ds->dimm = dimm;
        spapr_pending_dimm_unplugs_add(spapr, ds);
    }


I prefer this way because you avoid a g_malloc0 that would be thrown away in
the unplugs_add().

Now, if we decide to stick with the verification inside unplugs_add() like
you proposed ....


Fixes: 0cffce56ae3501c5783d779f97993ce478acf856
Signed-off-by: Bharata B Rao <address@hidden>
---
  hw/ppc/spapr.c | 5 +++--
  1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1cb09e7..990bb2d 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2853,8 +2853,9 @@ static sPAPRDIMMState 
*spapr_pending_dimm_unplugs_find(sPAPRMachineState *s,
  static void spapr_pending_dimm_unplugs_add(sPAPRMachineState *spapr,
                                             sPAPRDIMMState *dimm_state)
  {
-    g_assert(!spapr_pending_dimm_unplugs_find(spapr, dimm_state->dimm));
-    QTAILQ_INSERT_HEAD(&spapr->pending_dimm_unplugs, dimm_state, next);
+    if (!spapr_pending_dimm_unplugs_find(spapr, dimm_state->dimm)) {
+        QTAILQ_INSERT_HEAD(&spapr->pending_dimm_unplugs, dimm_state, next);
+    }
... you'll need to free() the incoming dimm_state object that was allocated
outside of unplugs_add() if you're not going to add it:


+    if (!spapr_pending_dimm_unplugs_find(spapr, dimm_state->dimm)) {
+        QTAILQ_INSERT_HEAD(&spapr->pending_dimm_unplugs, dimm_state, next);
+    }

    else {
        g_free(dimm_state);
    }



Now that I think about it, in this approach you'll waste not just a g_malloc() but also a g_free(). Making the verification inside spapr_memory_unplug_request
like I said above would be my way to go in this case.



Daniel

  }

  static void spapr_pending_dimm_unplugs_remove(sPAPRMachineState *spapr,




reply via email to

[Prev in Thread] Current Thread [Next in Thread]