From: Gavin Shan
Subject: Re: [Qemu-ppc] [PATCH v3 1/2] VFIO: Clear stale MSIx table during EEH reset
Date: Wed, 1 Apr 2015 14:05:12 +1100
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Mar 31, 2015 at 07:16:43PM -0600, Alex Williamson wrote:
>On Wed, 2015-04-01 at 11:20 +1100, Gavin Shan wrote:
>> On Tue, Mar 31, 2015 at 01:36:30PM -0600, Alex Williamson wrote:
>> >On Mon, 2015-03-30 at 20:34 +1100, Gavin Shan wrote:
>> >> On Mon, Mar 30, 2015 at 01:39:16PM +1100, David Gibson wrote:
>> >> >On Thu, Mar 26, 2015 at 04:35:01PM +1100, Gavin Shan wrote:
>> >> >> The PCI device MSIx table is cleaned out in hardware after EEH PE
>> >> >> reset. However, we still hold the stale MSIx entries in QEMU, which
>> >> >> should be cleared accordingly. Otherwise, we will run into another
>> >> >> (recursive) EEH error and the PCI devices contained in the PE have
>> >> >> to be offlined exceptionally.
>> >> >> 
>> >> >> The patch introduces function vfio_eeh_pe_reset(), which is called
>> >> >> by sPAPR when asserting hot or fundamental reset, to clear stale MSIx
>> >> >> table before EEH PE reset so that MSIx table could be restored properly
>> >> >> after EEH PE reset.
>> >> >> 
>> >> >> Signed-off-by: Gavin Shan <address@hidden>
>> >> >> ---
>> >> >>  hw/ppc/spapr_pci_vfio.c | 13 +++++++++----
>> >> >>  hw/vfio/Makefile.objs   |  6 +++++-
>> >> >>  hw/vfio/pci-stub.c      | 16 ++++++++++++++++
>> >> >>  hw/vfio/pci.c           | 36 ++++++++++++++++++++++++++++++++++++
>> >> >>  include/hw/vfio/vfio.h  |  2 ++
>> >> >>  5 files changed, 68 insertions(+), 5 deletions(-)
>> >> >>  create mode 100644 hw/vfio/pci-stub.c
>> >> >> 
>> >> >> diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c
>> >> >> index 99a1be5..6fa3afe 100644
>> >> >> --- a/hw/ppc/spapr_pci_vfio.c
>> >> >> +++ b/hw/ppc/spapr_pci_vfio.c
>> >> >> @@ -151,19 +151,24 @@ static int spapr_phb_vfio_eeh_reset(sPAPRPHBState *sphb, int option)
>> >> >>      switch (option) {
>> >> >>      case RTAS_SLOT_RESET_DEACTIVATE:
>> >> >>          op.op = VFIO_EEH_PE_RESET_DEACTIVATE;
>> >> >> +        ret = vfio_container_ioctl(&svphb->phb.iommu_as,
>> >> >> +                                   svphb->iommugroupid,
>> >> >> +                                   VFIO_EEH_PE_OP, &op);
>> >> >
>> >> >For consistency, I think all the reset operations should go through
>> >> >vfio_eeh_pe_reset(), even though in this case it won't do more than
>> >> >call vfio_container_ioctl().
>> >> >
>> >> 
>> >> Fair enough. I'll fix :-)
>> >> 
>> >> 
>> >> >>          break;
>> >> >>      case RTAS_SLOT_RESET_HOT:
>> >> >> -        op.op = VFIO_EEH_PE_RESET_HOT;
>> >> >> +        ret = vfio_eeh_pe_reset(&svphb->phb.iommu_as,
>> >> >> +                                svphb->iommugroupid,
>> >> >> +                                VFIO_EEH_PE_RESET_HOT);
>> >> >>          break;
>> >> >>      case RTAS_SLOT_RESET_FUNDAMENTAL:
>> >> >> -        op.op = VFIO_EEH_PE_RESET_FUNDAMENTAL;
>> >> >> +        ret = vfio_eeh_pe_reset(&svphb->phb.iommu_as,
>> >> >> +                                svphb->iommugroupid,
>> >> >> +                                VFIO_EEH_PE_RESET_FUNDAMENTAL);
>> >> >>          break;
>> >> >>      default:
>> >> >>          return RTAS_OUT_PARAM_ERROR;
>> >> >>      }
>> >> >>  
>> >> >> -    ret = vfio_container_ioctl(&svphb->phb.iommu_as, svphb->iommugroupid,
>> >> >> -                               VFIO_EEH_PE_OP, &op);
>> >> >>      if (ret < 0) {
>> >> >>          return RTAS_OUT_HW_ERROR;
>> >> >>      }
>> >> >> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
>> >> >> index e31f30e..1b8a065 100644
>> >> >> --- a/hw/vfio/Makefile.objs
>> >> >> +++ b/hw/vfio/Makefile.objs
>> >> >> @@ -1,4 +1,8 @@
>> >> >>  ifeq ($(CONFIG_LINUX), y)
>> >> >>  obj-$(CONFIG_SOFTMMU) += common.o
>> >> >> -obj-$(CONFIG_PCI) += pci.o
>> >> >> +ifeq ($(CONFIG_PCI), y)
>> >> >> +obj-y += pci.o
>> >> >> +else
>> >> >> +obj-y += pci-stub.o
>> >> >> +endif
>> >> >>  endif
>> >> >> diff --git a/hw/vfio/pci-stub.c b/hw/vfio/pci-stub.c
>> >> >> new file mode 100644
>> >> >> index 0000000..f317c1e
>> >> >> --- /dev/null
>> >> >> +++ b/hw/vfio/pci-stub.c
>> >> >> @@ -0,0 +1,16 @@
>> >> >> +/*
>> >> >> + * To include the file on !CONFIG_PCI
>> >> >> + *
>> >> >> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> >> >> + * the COPYING file in the top-level directory.
>> >> >> + */
>> >> >> +
>> >> >> +#include <linux/vfio.h>
>> >> >> +
>> >> >> +#include "exec/memory.h"
>> >> >> +#include "hw/vfio/vfio.h"
>> >> >> +
>> >> >> +int vfio_eeh_pe_reset(AddressSpace *as, int32_t groupid, uint32_t option)
>> >> >> +{
>> >> >> +    return -1;
>> >> >
>> >> >Probably should have assert(0) here - this should never be called if !CONFIG_PCI.
>> >> >
>> >> 
>> >> Indeed, assert(0) would be better. I just replied to ask for dropping the
>> >> stub for !CONFIG_PCI if you and Alex.W agree.
>> >
>> >I certainly don't see the reason for the stub, it was only suggested
>> >before because a previous version had the callout in hw/vfio/common.c
>> >
>> 
>> Ok. I'll drop it in next revision.
>> 
>> >> >> +}
>> >> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> >> >> index 6b80539..d0fd4b4 100644
>> >> >> --- a/hw/vfio/pci.c
>> >> >> +++ b/hw/vfio/pci.c
>> >> >> @@ -3319,6 +3319,42 @@ static void vfio_unregister_req_notifier(VFIOPCIDevice *vdev)
>> >> >>      vdev->req_enabled = false;
>> >> >>  }
>> >> >>  
>> >> >> +int vfio_eeh_pe_reset(AddressSpace *as, int32_t groupid, uint32_t option)
>> >> >> +{
>> >> >> +    VFIOGroup *group;
>> >> >> +    VFIODevice *vbasedev;
>> >> >> +    VFIOPCIDevice *vdev;
>> >> >> +    struct vfio_eeh_pe_op op = {
>> >> >> +        .argsz = sizeof(op),
>> >> >> +        .op = option
>> >> >> +    };
>> >> >> +
>> >> >> +    group = vfio_get_group(groupid, as);
>> >> >> +    if (!group) {
>> >> >> +        error_report("vfio: group %d not found\n", groupid);
>> >> >> +        return -1;
>> >> >> +    }
>> >> >> +
>> >> >> +    /*
>> >> >> +     * The MSIx table will be cleaned out by reset. We need
>> >> >> +     * disable it so that it can be reenabled properly. Also,
>> >> >> +     * the cached MSIx table should be cleared as it's not
>> >> >> +     * reflecting the contents in hardware.
>> >> >> +     */
>> >> >> +    QLIST_FOREACH(vbasedev, &group->device_list, next) {
>> >> >> +        vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>> >> >> +        if (msix_enabled(&vdev->pdev)) {
>> >> >> +            vfio_disable_msix(vdev);
>> >> >> +        }
>> >> >> +
>> >> >> +        msix_reset(&vdev->pdev);
>> >> >> +    }
>> >> >> +
>> >> >> +    vfio_put_group(group);
>> >> >> +
>> >> >> +    return vfio_container_ioctl(as, groupid, VFIO_EEH_PE_OP, &op);
>> >> >> +}
>> >
>> >So all you're trying to do here is find the devices in the PE and
>> >disable/reset MSI-X, but do you really need yet another ugly callback
>> >into vfio to do that?  Isn't it possible to find the devices based on
>> >the address space or PCI topology?  If we have EEH emulation, don't you
>> >also want to do this for emulated devices?  The vfio_disable_msix() call
>> >could be replaced by the equivalent config space access to make it look
>> >like the guest disabled MSI-X.
>> >
>> 
>> EEH for emulated PCI device is out of scope for now, which depends on
>> fully emulated IBM's PHB. The PE reset is requested by guest and guest
>> is aware of losing MSIx table after that.
>> 
>> I'm not sure I'm following your suggestion, but yes, the VFIO PCI devices
>> can be identified by checking its class string with help of some QOM helper
>> functions. So I guess you are suggesting something as follows, which would
>> make the code a bit cleaner.
>> 
>> - In hw/ppc/spapr_pci_vfio.c::spapr_phb_vfio_eeh_reset(), check all PCI
>>   devices hooked to the PHB and if it's a VFIO PCI device, disable MSIx
>>   interrupt by clearing MSIX_ENABLE in the config space and cleaning out
>>   the MSIx table if MSIx interrupt has been enabled on the PCI device.
>
>That's what I'm suggesting, but why do you even need to check whether
>the subordinate device is vfio?  I imagine you can't mix emulated and
>vfio devices behind a phb, but even if you could, what's the harm in
>doing the same MSI-X reset on emulated devices?  You don't need to
>support EEH on emulated, but you also don't need to handle vfio uniquely
>here.  Thanks,
>

Thanks for confirming. Yes, that's fair enough, since EEH is a
platform-specific feature. I'll move the logic to the sPAPR platform code.
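
To make the idea concrete, here is a rough sketch of the per-PHB walk
discussed above. It is a standalone model, not QEMU code: ModelPCIDevice
and the model_* functions are hypothetical names made up for illustration,
and the real change would work on QEMU's own PCI device and MSI-X helpers
instead. Only the register layout (Message Control at capability offset 2,
enable bit 0x8000, 16-byte table entries with a mask bit in Vector Control)
is taken from the PCI specification. The sketch assumes every device behind
the PHB is treated the same way, with no VFIO check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Register layout from the PCI specification. */
#define MSIX_FLAGS_OFFSET       2       /* Message Control, relative to the capability */
#define MSIX_FLAGS_ENABLE       0x8000  /* MSI-X Enable bit in Message Control */
#define MSIX_ENTRY_SIZE         16      /* bytes per MSI-X table entry */
#define MSIX_ENTRY_VECTOR_CTRL  12      /* Vector Control offset within an entry */
#define MSIX_ENTRY_CTRL_MASKBIT 0x1     /* per-vector mask bit */

#define MODEL_MAX_VECTORS 4

/* Hypothetical stand-in for a PCI device; NOT QEMU's PCIDevice. */
typedef struct ModelPCIDevice {
    uint8_t config[256];                                  /* config space */
    uint8_t msix_cap;                                     /* capability offset, 0 if none */
    unsigned msix_entries;                                /* number of table entries */
    uint8_t msix_table[MODEL_MAX_VECTORS * MSIX_ENTRY_SIZE]; /* cached table */
} ModelPCIDevice;

/* Config space is little-endian. */
uint16_t model_get_word(const ModelPCIDevice *dev, unsigned off)
{
    return (uint16_t)(dev->config[off] | (dev->config[off + 1] << 8));
}

void model_set_word(ModelPCIDevice *dev, unsigned off, uint16_t val)
{
    dev->config[off] = (uint8_t)(val & 0xff);
    dev->config[off + 1] = (uint8_t)(val >> 8);
}

bool model_msix_enabled(const ModelPCIDevice *dev)
{
    if (!dev->msix_cap) {
        return false;
    }
    return model_get_word(dev, dev->msix_cap + MSIX_FLAGS_OFFSET) &
           MSIX_FLAGS_ENABLE;
}

/*
 * Make one device look as if the guest had disabled MSI-X, then drop the
 * stale cached table. Returns true if MSI-X had been enabled.
 */
bool model_msix_clear(ModelPCIDevice *dev)
{
    unsigned off, i;
    bool was_enabled;

    assert(dev != NULL);
    if (!dev->msix_cap) {
        return false;               /* no MSI-X capability: nothing to do */
    }
    assert(dev->msix_entries <= MODEL_MAX_VECTORS);

    was_enabled = model_msix_enabled(dev);
    off = dev->msix_cap + MSIX_FLAGS_OFFSET;
    model_set_word(dev, off,
                   (uint16_t)(model_get_word(dev, off) & ~MSIX_FLAGS_ENABLE));

    /* Zero the cached entries and leave every vector masked. */
    memset(dev->msix_table, 0, sizeof(dev->msix_table));
    for (i = 0; i < dev->msix_entries; i++) {
        dev->msix_table[i * MSIX_ENTRY_SIZE + MSIX_ENTRY_VECTOR_CTRL] |=
            MSIX_ENTRY_CTRL_MASKBIT;
    }
    return was_enabled;
}

/* Walk every device behind the (modelled) PHB before the PE reset. */
unsigned model_phb_clear_msix(ModelPCIDevice *devs, size_t ndevs)
{
    unsigned disabled = 0;
    size_t i;

    for (i = 0; i < ndevs; i++) {
        if (model_msix_clear(&devs[i])) {
            disabled++;
        }
    }
    return disabled;
}
```

In this model the walk is done once before the hot or fundamental reset is
issued, and the return value only counts how many devices had MSI-X enabled
at that point. Devices without an MSI-X capability are skipped rather than
treated as an error.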

The VFIO PCI devices could be attached to a PCI bus that sits behind an
emulated PCI bridge. Resetting the MSIx table of the upstream emulated
bridge might be harmless, but it is pointless, as the bridge is not part of
the PE on which we're applying the reset. However, I doubt an emulated PCI
bridge needs MSIx at all.

Thanks,
Gavin

>Alex
>
