qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC 5/5] spapr: Work around spurious warnings from vfio INTx initia


From: Cédric Le Goater
Subject: Re: [RFC 5/5] spapr: Work around spurious warnings from vfio INTx initialization
Date: Thu, 17 Oct 2019 15:42:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0

On 17/10/2019 10:43, Cédric Le Goater wrote:
> On 17/10/2019 07:42, David Gibson wrote:
>> Traditional PCI INTx for vfio devices can only perform well if using
>> an in-kernel irqchip.  Therefore, vfio_intx_update() issues a warning
>> if an in kernel irqchip is not available.
>>
>> We usually do have an in-kernel irqchip available for pseries machines
>> on POWER hosts.  However, because the platform allows feature
>> negotiation of what interrupt controller model to use, we don't
>> currently initialize it until machine reset.  vfio_intx_update() is
>> called (first) from vfio_realize() before that, so it can issue a
>> spurious warning, even if we will have an in kernel irqchip by the
>> time we need it.
>>
>> To workaround this, make a call to spapr_irq_update_active_intc() from
>> spapr_irq_init() which is called at machine realize time, before the
>> vfio realize.  This call will be pretty much obsoleted by the later
>> call at reset time, but it serves to suppress the spurious warning
>> from VFIO.
>>
>> Cc: Alex Williamson <address@hidden>
>> Cc: Alexey Kardashevskiy <address@hidden>
>>
>> Signed-off-by: David Gibson <address@hidden>
>> ---
>>  hw/ppc/spapr_irq.c | 11 ++++++++++-
>>  1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
>> index 45544b8976..bb91c61fa0 100644
>> --- a/hw/ppc/spapr_irq.c
>> +++ b/hw/ppc/spapr_irq.c
>> @@ -345,6 +345,14 @@ void spapr_irq_init(SpaprMachineState *spapr, Error 
>> **errp)
>>  
>>      spapr->qirqs = qemu_allocate_irqs(spapr_set_irq, spapr,
>>                                        smc->nr_xirqs + SPAPR_XIRQ_BASE);
>> +
>> +    /*
>> +     * Mostly we don't actually need this until reset, except that not
>> +     * having this set up can cause VFIO devices to issue a
>> +     * false-positive warning during realize(), because they don't yet
>> +     * have an in-kernel irq chip.
>> +     */
>> +    spapr_irq_update_active_intc(spapr);
> 
> This will call the de/activate hooks of the irq chip very early 
> before reset and CAS, and won't call them at reset if the intc
> pointers are the same (xive only for instance). It breaks very 
> obviously the emulated XIVE device which won't reset the OS CAM 
> line with the CPU id values and CPU notification will be broken.
> 
> We have to find something else.

Here is a possible fix for the (re)setting of the OS CAM line. 

Removing spapr_xive_set_tctx_os_cam() is a good thing but this solution
shows some issues in our modeling of hot-plugged CPUS with a reset() 
being called at realize().


Thanks,

C.


>From eb6086645b1db13a8e021e8a5b8a2c6faba875b3 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <address@hidden>
Date: Thu, 17 Oct 2019 15:35:36 +0200
Subject: [PATCH] spapr/xive: Set the OS CAM line at reset
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When a Virtual Processor is scheduled to run on a HW thread, the
hypervisor pushes its identifier in the OS CAM line. When running in
TCG or kernel_irqchip=off, QEMU needs to emulate the same behavior.

Introduce a 'os-cam' property which will be used to set the OS CAM
line at reset.

Complexity arise with hotplug CPUs which need an extra reset of the
intc presenter of the interrupt controller selected by CAS. The sPAPR
IRQ backend is extended with new handler cpu_intc_reset()

spapr_realize_vcpu() is also modified to call the CPU reset after the
creation of the intc presenter.

This is a combo patch which needs to be splitted.

Signed-off-by: Cédric Le Goater <address@hidden>
---
 include/hw/ppc/spapr_irq.h  |  4 ++++
 include/hw/ppc/spapr_xive.h |  1 -
 include/hw/ppc/xics.h       |  1 +
 include/hw/ppc/xive.h       |  5 ++++-
 hw/intc/spapr_xive.c        | 37 ++++++++++++-------------------------
 hw/intc/xics.c              |  5 +++++
 hw/intc/xics_spapr.c        |  8 ++++++++
 hw/intc/xive.c              | 33 +++++++++++++++++++++++++++++----
 hw/ppc/pnv.c                |  3 ++-
 hw/ppc/spapr_cpu_core.c     |  8 ++++++--
 hw/ppc/spapr_irq.c          | 21 +++++++++++++++++++++
 11 files changed, 92 insertions(+), 34 deletions(-)

diff --git a/include/hw/ppc/spapr_irq.h b/include/hw/ppc/spapr_irq.h
index 5e150a667902..78327496c102 100644
--- a/include/hw/ppc/spapr_irq.h
+++ b/include/hw/ppc/spapr_irq.h
@@ -52,6 +52,8 @@ typedef struct SpaprInterruptControllerClass {
      */
     int (*cpu_intc_create)(SpaprInterruptController *intc,
                             PowerPCCPU *cpu, Error **errp);
+    int (*cpu_intc_reset)(SpaprInterruptController *intc, PowerPCCPU *cpu,
+                          Error **errp);
     int (*claim_irq)(SpaprInterruptController *intc, int irq, bool lsi,
                      Error **errp);
     void (*free_irq)(SpaprInterruptController *intc, int irq);
@@ -68,6 +70,8 @@ void spapr_irq_update_active_intc(SpaprMachineState *spapr);
 
 int spapr_irq_cpu_intc_create(SpaprMachineState *spapr,
                               PowerPCCPU *cpu, Error **errp);
+int spapr_irq_cpu_intc_reset(SpaprMachineState *spapr,
+                             PowerPCCPU *cpu, Error **errp);
 void spapr_irq_print_info(SpaprMachineState *spapr, Monitor *mon);
 void spapr_irq_dt(SpaprMachineState *spapr, uint32_t nr_servers,
                   void *fdt, uint32_t phandle);
diff --git a/include/hw/ppc/spapr_xive.h b/include/hw/ppc/spapr_xive.h
index d84bd5c229f0..742b7e834f2a 100644
--- a/include/hw/ppc/spapr_xive.h
+++ b/include/hw/ppc/spapr_xive.h
@@ -57,7 +57,6 @@ typedef struct SpaprXive {
 void spapr_xive_pic_print_info(SpaprXive *xive, Monitor *mon);
 
 void spapr_xive_hcall_init(SpaprMachineState *spapr);
-void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx);
 void spapr_xive_mmio_set_enabled(SpaprXive *xive, bool enable);
 void spapr_xive_map_mmio(SpaprXive *xive);
 
diff --git a/include/hw/ppc/xics.h b/include/hw/ppc/xics.h
index 1e6a9300eb2b..602173c12250 100644
--- a/include/hw/ppc/xics.h
+++ b/include/hw/ppc/xics.h
@@ -161,6 +161,7 @@ void icp_set_mfrr(ICPState *icp, uint8_t mfrr);
 uint32_t icp_accept(ICPState *ss);
 uint32_t icp_ipoll(ICPState *ss, uint32_t *mfrr);
 void icp_eoi(ICPState *icp, uint32_t xirr);
+void icp_reset(ICPState *icp);
 
 void ics_write_xive(ICSState *ics, int nr, int server,
                     uint8_t priority, uint8_t saved_priority);
diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
index fd3319bd3202..e273069c25a9 100644
--- a/include/hw/ppc/xive.h
+++ b/include/hw/ppc/xive.h
@@ -319,6 +319,7 @@ typedef struct XiveTCTX {
     qemu_irq    os_output;
 
     uint8_t     regs[XIVE_TM_RING_COUNT * XIVE_TM_RING_SIZE];
+    uint32_t    os_cam;
 } XiveTCTX;
 
 /*
@@ -414,7 +415,9 @@ void xive_tctx_tm_write(XiveTCTX *tctx, hwaddr offset, 
uint64_t value,
 uint64_t xive_tctx_tm_read(XiveTCTX *tctx, hwaddr offset, unsigned size);
 
 void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
-Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp);
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, uint32_t os_cam,
+                         Error **errp);
+void xive_tctx_reset(XiveTCTX *tctx);
 
 static inline uint32_t xive_nvt_cam_line(uint8_t nvt_blk, uint32_t nvt_idx)
 {
diff --git a/hw/intc/spapr_xive.c b/hw/intc/spapr_xive.c
index ba32d2cc5b0f..71f138512a1c 100644
--- a/hw/intc/spapr_xive.c
+++ b/hw/intc/spapr_xive.c
@@ -205,21 +205,13 @@ void spapr_xive_mmio_set_enabled(SpaprXive *xive, bool 
enable)
     memory_region_set_enabled(&xive->end_source.esb_mmio, false);
 }
 
-/*
- * When a Virtual Processor is scheduled to run on a HW thread, the
- * hypervisor pushes its identifier in the OS CAM line. Emulate the
- * same behavior under QEMU.
- */
-void spapr_xive_set_tctx_os_cam(XiveTCTX *tctx)
+static uint32_t spapr_xive_get_os_cam(PowerPCCPU *cpu)
 {
     uint8_t  nvt_blk;
     uint32_t nvt_idx;
-    uint32_t nvt_cam;
 
-    spapr_xive_cpu_to_nvt(POWERPC_CPU(tctx->cs), &nvt_blk, &nvt_idx);
-
-    nvt_cam = cpu_to_be32(TM_QW1W2_VO | xive_nvt_cam_line(nvt_blk, nvt_idx));
-    memcpy(&tctx->regs[TM_QW1_OS + TM_WORD2], &nvt_cam, 4);
+    spapr_xive_cpu_to_nvt(cpu, &nvt_blk, &nvt_idx);
+    return xive_nvt_cam_line(nvt_blk, nvt_idx);
 }
 
 static void spapr_xive_end_reset(XiveEND *end)
@@ -537,19 +529,21 @@ static int 
spapr_xive_cpu_intc_create(SpaprInterruptController *intc,
     SpaprXive *xive = SPAPR_XIVE(intc);
     Object *obj;
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
+    uint32_t os_cam = spapr_xive_get_os_cam(cpu);
 
-    obj = xive_tctx_create(OBJECT(cpu), XIVE_ROUTER(xive), errp);
+    obj = xive_tctx_create(OBJECT(cpu), XIVE_ROUTER(xive), os_cam, errp);
     if (!obj) {
         return -1;
     }
 
     spapr_cpu->tctx = XIVE_TCTX(obj);
+    return 0;
+}
 
-    /*
-     * (TCG) Early setting the OS CAM line for hotplugged CPUs as they
-     * don't beneficiate from the reset of the XIVE IRQ backend
-     */
-    spapr_xive_set_tctx_os_cam(spapr_cpu->tctx);
+static int spapr_xive_cpu_intc_reset(SpaprInterruptController *intc,
+                                     PowerPCCPU *cpu, Error **errp)
+{
+    xive_tctx_reset(spapr_cpu_state(cpu)->tctx);
     return 0;
 }
 
@@ -643,14 +637,6 @@ static void spapr_xive_dt(SpaprInterruptController *intc, 
uint32_t nr_servers,
 static int spapr_xive_activate(SpaprInterruptController *intc, Error **errp)
 {
     SpaprXive *xive = SPAPR_XIVE(intc);
-    CPUState *cs;
-
-    CPU_FOREACH(cs) {
-        PowerPCCPU *cpu = POWERPC_CPU(cs);
-
-        /* (TCG) Set the OS CAM line of the thread interrupt context. */
-        spapr_xive_set_tctx_os_cam(spapr_cpu_state(cpu)->tctx);
-    }
 
     if (kvm_enabled()) {
         int rc = spapr_irq_init_kvm(kvmppc_xive_connect, intc, errp);
@@ -697,6 +683,7 @@ static void spapr_xive_class_init(ObjectClass *klass, void 
*data)
     sicc->activate = spapr_xive_activate;
     sicc->deactivate = spapr_xive_deactivate;
     sicc->cpu_intc_create = spapr_xive_cpu_intc_create;
+    sicc->cpu_intc_reset = spapr_xive_cpu_intc_reset;
     sicc->claim_irq = spapr_xive_claim_irq;
     sicc->free_irq = spapr_xive_free_irq;
     sicc->set_irq = spapr_xive_set_irq;
diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index b5ac408f7b74..652771d6a5a5 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -295,6 +295,11 @@ static void icp_reset_handler(void *dev)
     }
 }
 
+void icp_reset(ICPState *icp)
+{
+    icp_reset_handler(icp);
+}
+
 static void icp_realize(DeviceState *dev, Error **errp)
 {
     ICPState *icp = ICP(dev);
diff --git a/hw/intc/xics_spapr.c b/hw/intc/xics_spapr.c
index 4f64b9a9fc66..c0b2a576effe 100644
--- a/hw/intc/xics_spapr.c
+++ b/hw/intc/xics_spapr.c
@@ -346,6 +346,13 @@ static int 
xics_spapr_cpu_intc_create(SpaprInterruptController *intc,
     return 0;
 }
 
+static int xics_spapr_cpu_intc_reset(SpaprInterruptController *intc,
+                                     PowerPCCPU *cpu, Error **errp)
+{
+    icp_reset(spapr_cpu_state(cpu)->icp);
+    return 0;
+}
+
 static int xics_spapr_claim_irq(SpaprInterruptController *intc, int irq,
                                 bool lsi, Error **errp)
 {
@@ -433,6 +440,7 @@ static void ics_spapr_class_init(ObjectClass *klass, void 
*data)
     sicc->activate = xics_spapr_activate;
     sicc->deactivate = xics_spapr_deactivate;
     sicc->cpu_intc_create = xics_spapr_cpu_intc_create;
+    sicc->cpu_intc_reset = xics_spapr_cpu_intc_reset;
     sicc->claim_irq = xics_spapr_claim_irq;
     sicc->free_irq = xics_spapr_free_irq;
     sicc->set_irq = xics_spapr_set_irq;
diff --git a/hw/intc/xive.c b/hw/intc/xive.c
index d420c6571e14..be4f2c974178 100644
--- a/hw/intc/xive.c
+++ b/hw/intc/xive.c
@@ -547,7 +547,7 @@ void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon)
     }
 }
 
-static void xive_tctx_reset(void *dev)
+static void xive_tctx_reset_handler(void *dev)
 {
     XiveTCTX *tctx = XIVE_TCTX(dev);
 
@@ -566,6 +566,23 @@ static void xive_tctx_reset(void *dev)
         ipb_to_pipr(tctx->regs[TM_QW1_OS + TM_IPB]);
     tctx->regs[TM_QW3_HV_PHYS + TM_PIPR] =
         ipb_to_pipr(tctx->regs[TM_QW3_HV_PHYS + TM_IPB]);
+
+    /*
+     * (TCG) Set the OS CAM line of the thread interrupt context.
+     *
+     * When a Virtual Processor is scheduled to run on a HW thread,
+     * the hypervisor pushes its identifier in the OS CAM line.
+     * Emulate the same behavior under QEMU.
+     */
+    if (tctx->os_cam) {
+        uint32_t qw1w2 = cpu_to_be32(TM_QW1W2_VO | tctx->os_cam);
+        memcpy(&tctx->regs[TM_QW1_OS + TM_WORD2], &qw1w2, 4);
+    }
+}
+
+void xive_tctx_reset(XiveTCTX *tctx)
+{
+    xive_tctx_reset_handler(tctx);
 }
 
 static void xive_tctx_realize(DeviceState *dev, Error **errp)
@@ -608,12 +625,12 @@ static void xive_tctx_realize(DeviceState *dev, Error 
**errp)
         }
     }
 
-    qemu_register_reset(xive_tctx_reset, dev);
+    qemu_register_reset(xive_tctx_reset_handler, dev);
 }
 
 static void xive_tctx_unrealize(DeviceState *dev, Error **errp)
 {
-    qemu_unregister_reset(xive_tctx_reset, dev);
+    qemu_unregister_reset(xive_tctx_reset_handler, dev);
 }
 
 static int vmstate_xive_tctx_pre_save(void *opaque)
@@ -662,11 +679,17 @@ static const VMStateDescription vmstate_xive_tctx = {
     },
 };
 
+static Property  xive_tctx_properties[] = {
+    DEFINE_PROP_UINT32("os-cam", XiveTCTX, os_cam, 0),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
 static void xive_tctx_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
 
     dc->desc = "XIVE Interrupt Thread Context";
+    dc->props = xive_tctx_properties;
     dc->realize = xive_tctx_realize;
     dc->unrealize = xive_tctx_unrealize;
     dc->vmsd = &vmstate_xive_tctx;
@@ -684,7 +707,8 @@ static const TypeInfo xive_tctx_info = {
     .class_init    = xive_tctx_class_init,
 };
 
-Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, Error **errp)
+Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, uint32_t os_cam,
+                         Error **errp)
 {
     Error *local_err = NULL;
     Object *obj;
@@ -693,6 +717,7 @@ Object *xive_tctx_create(Object *cpu, XiveRouter *xrtr, 
Error **errp)
     object_property_add_child(cpu, TYPE_XIVE_TCTX, obj, &error_abort);
     object_unref(obj);
     object_property_add_const_link(obj, "cpu", cpu, &error_abort);
+    object_property_set_int(obj, os_cam, "os-cam", &local_err);
     object_property_set_bool(obj, true, "realized", &local_err);
     if (local_err) {
         goto error;
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
index 7cf64b6d2533..99c06842573e 100644
--- a/hw/ppc/pnv.c
+++ b/hw/ppc/pnv.c
@@ -806,7 +806,8 @@ static void pnv_chip_power9_intc_create(PnvChip *chip, 
PowerPCCPU *cpu,
      * controller object is initialized afterwards. Hopefully, it's
      * only used at runtime.
      */
-    obj = xive_tctx_create(OBJECT(cpu), XIVE_ROUTER(&chip9->xive), &local_err);
+    obj = xive_tctx_create(OBJECT(cpu), XIVE_ROUTER(&chip9->xive), 0,
+                           &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 3e4302c7d596..416aa75e5fba 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -33,6 +33,7 @@ static void spapr_cpu_reset(void *opaque)
     PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
     SpaprCpuState *spapr_cpu = spapr_cpu_state(cpu);
     target_ulong lpcr;
+    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
 
     cpu_reset(cs);
 
@@ -77,9 +78,11 @@ static void spapr_cpu_reset(void *opaque)
     spapr_cpu->dtl_addr = 0;
     spapr_cpu->dtl_size = 0;
 
-    spapr_caps_cpu_apply(SPAPR_MACHINE(qdev_get_machine()), cpu);
+    spapr_caps_cpu_apply(spapr, cpu);
 
     kvm_check_mmu(cpu, &error_fatal);
+
+    spapr_irq_cpu_intc_reset(spapr, cpu, &error_fatal);
 }
 
 void spapr_cpu_set_entry_state(PowerPCCPU *cpu, target_ulong nip, target_ulong 
r3)
@@ -235,12 +238,13 @@ static void spapr_realize_vcpu(PowerPCCPU *cpu, 
SpaprMachineState *spapr,
     kvmppc_set_papr(cpu);
 
     qemu_register_reset(spapr_cpu_reset, cpu);
-    spapr_cpu_reset(cpu);
 
     if (spapr_irq_cpu_intc_create(spapr, cpu, &local_err) < 0) {
         goto error_unregister;
     }
 
+    spapr_cpu_reset(cpu);
+
     if (!sc->pre_3_0_migration) {
         vmstate_register(NULL, cs->cpu_index, &vmstate_spapr_cpu_state,
                          cpu->machine_data);
diff --git a/hw/ppc/spapr_irq.c b/hw/ppc/spapr_irq.c
index bb91c61fa000..5d2b64029cd5 100644
--- a/hw/ppc/spapr_irq.c
+++ b/hw/ppc/spapr_irq.c
@@ -220,6 +220,27 @@ int spapr_irq_cpu_intc_create(SpaprMachineState *spapr,
     return 0;
 }
 
+int spapr_irq_cpu_intc_reset(SpaprMachineState *spapr,
+                             PowerPCCPU *cpu, Error **errp)
+{
+    SpaprInterruptController *intcs[] = ALL_INTCS(spapr);
+    int i;
+    int rc;
+
+    for (i = 0; i < ARRAY_SIZE(intcs); i++) {
+        SpaprInterruptController *intc = intcs[i];
+        if (intc) {
+            SpaprInterruptControllerClass *sicc = SPAPR_INTC_GET_CLASS(intc);
+            rc = sicc->cpu_intc_reset(intc, cpu, errp);
+            if (rc < 0) {
+                return rc;
+            }
+        }
+    }
+
+    return 0;
+}
+
 static void spapr_set_irq(void *opaque, int irq, int level)
 {
     SpaprMachineState *spapr = SPAPR_MACHINE(opaque);
-- 
2.21.0





reply via email to

[Prev in Thread] Current Thread [Next in Thread]