qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v16 00/14] vfio-pci: pass the aer error to guest


From: Chen Fan
Subject: Re: [Qemu-devel] [PATCH v16 00/14] vfio-pci: pass the aer error to guest
Date: Wed, 3 Feb 2016 16:54:01 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0


On 01/17/2016 02:34 AM, Michael S. Tsirkin wrote:
On Tue, Jan 12, 2016 at 10:43:01AM +0800, Cao jin wrote:
From: Chen Fan <address@hidden>

For now, for vfio pci passthough devices when qemu receives
an error from host aer report, currentlly just terminate the guest,
but usually user want to know what error occurred but stopping the
guest, so this patches add aer capability support for vfio device,
and pass the error to guest, and have guest driver to recover
from the error.
I would like to see a version of this patchset that doesn't
depend on pci core changes.
I think that if you make this simplifying assumption:

- all devices on same bus in guest are on same bus in host

then you can handle both reset and hotplug simply in function 0
since it will belong to vfio.

So we can have a version without pci core changes that simply assumes
this, and things will just work.


Now, if we wanted to enforce this limitation, I think the
cleanest way would be to add a callback in struct PCIDevice:

        bool is_valid_function(PCIDevice *newfunction)

and call it as each function is added.
This way aer function can validate that each function
added shares the same bus.
And this way issues will be detected directly and not when
function 0 is added.

I would prefer this validation code to be a patch on top so we can merge
the functionality directly and avoid blocking it while we figure out the
best api to validate things.

I don't see why making guest topology match host would
ever be a problem, but if it's required to support
configurations where these differ, I'd like to see
an attempt to address that be split out, after aer
is supported.
Hi Michael,

Just think about this more,  I think we also should check the vfio
devices whether on the same bus at the time of function 0 is added.
because we don't know the affected devices by a bus reset have
already all been assigned to VM. for example, the multi-function's hotplug.
devices on same bus in host are added to VM one by one. when we
test one device, we haven't yet added the other devices. so I think
the patch should like below. then we could add a vfio_is_valid_function in vfio
to test each device whether the affected devices on the same bus.

Thanks,
Chen

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index d940f79..7163b56 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1836,6 +1836,38 @@ PCIDevice *pci_find_device(PCIBus *bus, int bus_num, uint8_t devfn)
     return bus->devices[devfn];
 }

+static int pci_bus_check_devices(PCIBus *bus)
+{
+    PCIDeviceClass *pc;
+    int i, ret = 0;
+
+    for (i = 0; i < ARRAY_SIZE(bus->devices); ++i) {
+        if (!bus->devices[i]) {
+            continue;
+        }
+
+        pc = PCI_DEVICE_GET_CLASS(bus->devices[i]);
+        if (!pc->is_valid_func) {
+            continue;
+        }
+
+        ret = pc->is_valid_func(bus->devices[i], bus);
+        if (!ret) {
+            return -1;
+        }
+    }
+    return 0;
+}
+
+static bool pci_is_valid_function(PCIDevice *pdev, PCIBus *bus)
+{
+    if (pdev->bus == bus) {
+        return true;
+    }
+
+    return false;
+}
+
 static void pci_qdev_realize(DeviceState *qdev, Error **errp)
 {
     PCIDevice *pci_dev = (PCIDevice *)qdev;
@@ -1878,6 +1910,14 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
         pci_qdev_unrealize(DEVICE(pci_dev), NULL);
         return;
     }
+
+    if (DEVICE(pci_dev)->hotplugged &&
+        pci_get_function_0(pci_dev) == pci_dev &&
+        pci_bus_check_devices(bus)) {
+        error_setg(errp, "failed to hotplug function 0");
+        pci_qdev_unrealize(DEVICE(pci_dev), NULL);
+        return;
+    }
 }

 static void pci_default_realize(PCIDevice *dev, Error **errp)
@@ -2390,6 +2430,7 @@ static void pci_device_class_init(ObjectClass *klass, void *data)
     k->bus_type = TYPE_PCI_BUS;
     k->props = pci_props;
     pc->realize = pci_default_realize;
+    pc->is_valid_func = pci_is_valid_function;
 }

 AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index dedf277..a89580f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -191,6 +191,7 @@ typedef struct PCIDeviceClass {

     void (*realize)(PCIDevice *dev, Error **errp);
     int (*init)(PCIDevice *dev);/* TODO convert to realize() and remove */
+    bool (*is_valid_func)(PCIDevice *dev, PCIBus *bus);
     PCIUnregisterFunc *exit;
     PCIConfigReadFunc *config_read;
     PCIConfigWriteFunc *config_write;






v15-v16:
    10/14, 11/14 are new to introduce a reset sequence id to specify the
    vfio devices has been reset for that reset. other patches aren't modified.

v14-v15:
    1. add device hot reset callback
    2. add bus_in_reset for vfio device to avoid multi do host bus reset

v13-v14:
    1. for multifunction device, requiring all functions enable AER.(9/13)
    2. due to all affected functions receive error signal, ignore no
       error occurred function. (12/13)

v12-v13:
    1. since support multifuncion hotplug, here add callback to enable aer.
    2. add pci device pre+post reset for aer host reset.

Chen Fan (14):
   vfio: extract vfio_get_hot_reset_info as a single function
   vfio: squeeze out vfio_pci_do_hot_reset for support bus reset
   pcie: modify the capability size assert
   vfio: make the 4 bytes aligned for capability size
   vfio: add pcie extanded capability support
   aer: impove pcie_aer_init to support vfio device
   vfio: add aer support for vfio device
   vfio: add check host bus reset is support or not
   add check reset mechanism when hotplug vfio device
   pci: introduce pci bus pre reset
   vfio: introduce last reset sequence id
   pcie_aer: expose pcie_aer_msg() interface
   vfio-pci: pass the aer error to guest
   vfio: add 'aer' property to expose aercap

  hw/pci-bridge/ioh3420.c            |   2 +-
  hw/pci-bridge/xio3130_downstream.c |   2 +-
  hw/pci-bridge/xio3130_upstream.c   |   2 +-
  hw/pci/pci.c                       |  42 +++
  hw/pci/pci_bridge.c                |   3 +
  hw/pci/pcie.c                      |   2 +-
  hw/pci/pcie_aer.c                  |   6 +-
  hw/vfio/pci.c                      | 616 +++++++++++++++++++++++++++++++++----
  hw/vfio/pci.h                      |   9 +
  include/hw/pci/pci.h               |   1 +
  include/hw/pci/pci_bus.h           |   8 +
  include/hw/pci/pcie_aer.h          |   3 +-
  12 files changed, 624 insertions(+), 72 deletions(-)

--
1.9.3



.







reply via email to

[Prev in Thread] Current Thread [Next in Thread]