Re: [Qemu-devel] [questions] about using vfio to assign sr-iov vf to vm


From: Zhang Haoyu
Subject: Re: [Qemu-devel] [questions] about using vfio to assign sr-iov vf to vm
Date: Mon, 18 Aug 2014 17:49:01 +0800

>>> >> >> Hi, all
>>> >> >> I'm using VFIO to assign Intel 82599 VFs to VMs, and I've run into a
>>> >> >> problem: the 82599 PF and its VFs belong to the same iommu_group, but
>>> >> >> I only want to assign some VFs to one VM and other VFs to another VM.
>>> >> >> How can I unbind only (some of) the VFs, but not the PF?
>>> >> >> I read the kernel doc vfio.txt, and I'm not sure whether I should
>>> >> >> unbind all of the devices that belong to one iommu_group.
>>> >> >> If so, because the PF and its VFs belong to the same iommu_group, if I
>>> >> >> unbind the PF, its VFs also disappear.
>>> >> >> I think I'm misunderstanding something.
>>> >> >> Any advice?
>>> >> >
>>> >> >This occurs when the PF is installed behind components in the system
>>> >> >that do not support PCIe Access Control Services (ACS).  The IOMMU group
>>> >> >contains both the PF and the VF because upstream transactions can be
>>> >> >re-routed downstream by these non-ACS components before being translated
>>> >> >by the IOMMU.  Please provide 'sudo lspci -vvv', 'lspci -n', and kernel
>>> >> >version and we might be able to give you some advice on how to work
>>> >> >around the problem.  Thanks,
>>> >> >
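
To see which devices share a group with the PF, group membership can be read from sysfs. This is a minimal sketch, assuming the usual /sys/kernel/iommu_groups/<group>/devices layout. The function names and the `root` parameter are illustrative and are not part of any tool mentioned in this thread.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses in it.

    `root` is a parameter only so the function can be pointed at a test
    directory; on a live host the default sysfs path is assumed.
    """
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU disabled, or the sysfs layout differs
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


def group_of(address, groups):
    """Return the group number containing `address`, or None if absent."""
    for number, members in groups.items():
        if address in members:
            return number
    return None
```

If the PF and its VFs are reported under the same group number, VFIO treats them as a single unit, which is the situation described above.
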
>>> >> The Intel 82599 (02:00.0 or 02:00.1) is behind the PCI bridge (00:01.1).
>>> >> Does the 00:01.1 PCI bridge support ACS?
>>> >
>>> >It does not and that's exactly the problem.  We must assume that the
>>> >root port can redirect a transaction from a subordinate device back to
>>> >another subordinate device without IOMMU translation when ACS support is
>>> >not present.  If you had a device plugged in below 00:01.0, we'd also
>>> >need to assume that non-IOMMU translated peer-to-peer between devices
>>> >behind either function, 00:01.0 or 00:01.1, is possible.
>>> >
>>> >Intel has indicated that processor root ports for all Xeon class
>>> >processors should support ACS and have verified isolation for PCH based
>>> >root ports allowing us to support quirks in place of ACS support.  I'm
>>> >not aware of any efforts at Intel to verify isolation capabilities of
>>> >root ports on client processors.  They are however aware that lack of
>>> >ACS is a limiting factor for usability of VT-d, and I hope that we'll
>>> >see future products with ACS support.
>>> >
>>> >Chances are good that the PCH root port at 00:1c.0 is supported by an
>>> >ACS quirk, but it seems that your system has a PCIe switch below the
>>> >root port.  If the PCIe switch downstream ports support ACS, then you
>>> >may be able to move the 82599 to the empty slot at bus 07 to separate
>>> >the VFs into different IOMMU groups.  Thanks,
>>> >
>>> Thanks, Alex,
>>> How can I tell whether a PCI bridge/device supports the ACS capability?
>>> 
>>> I ran "lspci -vvv -s | grep -i ACS", and nothing matched.
>>> # lspci -vvv -s 00:1c.0
>>> 00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
>>
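
The grep-style check can also be scripted. This is a minimal sketch, assuming the verbose lspci output has already been captured as text (reading the extended capabilities usually requires root). For a device that exposes the capability, lspci prints a line of the form "Capabilities: [148 v1] Access Control Services". The function name is illustrative.

```python
import re

# Matches lspci -vvv capability lines such as
# "Capabilities: [148 v1] Access Control Services".
_ACS_LINE = re.compile(
    r"Capabilities:\s*\[[0-9a-fA-F]+(?:\s+v\d+)?\]\s*Access Control Services"
)


def has_acs_capability(lspci_verbose_output):
    """Return True if the captured lspci text contains an ACS capability line."""
    return any(
        _ACS_LINE.search(line) for line in lspci_verbose_output.splitlines()
    )
```

A False result only means that no ACS capability is advertised; it does not by itself rule out isolation provided through a device-specific quirk.
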
>>
>>Ideally there would be capabilities for it, something like:
>>
>>Capabilities [xxx] Access Control Services...
>>
>>But, Intel failed to provide this, so we enable "effective" ACS
>>capabilities via a quirk:
>>
>>drivers/pci/quirks.c:
>>/*
>> * Many Intel PCH root ports do provide ACS-like features to disable peer
>> * transactions and validate bus numbers in requests, but do not provide an
>> * actual PCIe ACS capability.  This is the list of device IDs known to fall
>> * into that category as provided by Intel in Red Hat bugzilla 1037684.
>> */
>>static const u16 pci_quirk_intel_pch_acs_ids[] = {
>>        /* Ibexpeak PCH */
>>        0x3b42, 0x3b43, 0x3b44, 0x3b45, 0x3b46, 0x3b47, 0x3b48, 0x3b49,
>>        0x3b4a, 0x3b4b, 0x3b4c, 0x3b4d, 0x3b4e, 0x3b4f, 0x3b50, 0x3b51,
>>        /* Cougarpoint PCH */
>>        0x1c10, 0x1c11, 0x1c12, 0x1c13, 0x1c14, 0x1c15, 0x1c16, 0x1c17,
>>        0x1c18, 0x1c19, 0x1c1a, 0x1c1b, 0x1c1c, 0x1c1d, 0x1c1e, 0x1c1f,
>>        /* Pantherpoint PCH */
>>        0x1e10, 0x1e11, 0x1e12, 0x1e13, 0x1e14, 0x1e15, 0x1e16, 0x1e17,
>>        0x1e18, 0x1e19, 0x1e1a, 0x1e1b, 0x1e1c, 0x1e1d, 0x1e1e, 0x1e1f,
>>        /* Lynxpoint-H PCH */
>>        0x8c10, 0x8c11, 0x8c12, 0x8c13, 0x8c14, 0x8c15, 0x8c16, 0x8c17,
>>        0x8c18, 0x8c19, 0x8c1a, 0x8c1b, 0x8c1c, 0x8c1d, 0x8c1e, 0x8c1f,
>>        /* Lynxpoint-LP PCH */
>>        0x9c10, 0x9c11, 0x9c12, 0x9c13, 0x9c14, 0x9c15, 0x9c16, 0x9c17,
>>        0x9c18, 0x9c19, 0x9c1a, 0x9c1b,
>>        /* Wildcat PCH */
>>        0x9c90, 0x9c91, 0x9c92, 0x9c93, 0x9c94, 0x9c95, 0x9c96, 0x9c97,
>>        0x9c98, 0x9c99, 0x9c9a, 0x9c9b,
>>        /* Patsburg (X79) PCH */
>>        0x1d10, 0x1d12, 0x1d14, 0x1d16, 0x1d18, 0x1d1a, 0x1d1c, 0x1d1e,
>>};
>>
>>Hopefully if you run 'lspci -n', you'll see your device ID listed among
>>these.  We don't currently have any quirks for PCIe switches, so if your
>>IOMMU group is still bigger than it should be, that may be the reason.
>>Thanks,
>>
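
Checking 'lspci -n' output against the quoted list can be sketched as follows. The device IDs are copied from the pci_quirk_intel_pch_acs_ids[] excerpt above. The function name, the regular expression, and the assumed line format ("00:1c.0 0604: 8086:1c10 (rev b5)") are illustrative assumptions.

```python
import re


def _id_range(first, last):
    """Inclusive range of device IDs as a set."""
    return set(range(first, last + 1))


# IDs from the quoted pci_quirk_intel_pch_acs_ids[] excerpt.
PCH_ACS_QUIRK_IDS = (
    _id_range(0x3B42, 0x3B51)            # Ibexpeak PCH
    | _id_range(0x1C10, 0x1C1F)          # Cougarpoint PCH
    | _id_range(0x1E10, 0x1E1F)          # Pantherpoint PCH
    | _id_range(0x8C10, 0x8C1F)          # Lynxpoint-H PCH
    | _id_range(0x9C10, 0x9C1B)          # Lynxpoint-LP PCH
    | _id_range(0x9C90, 0x9C9B)          # Wildcat PCH
    | set(range(0x1D10, 0x1D1F, 2))      # Patsburg (X79) PCH, even IDs only
)

INTEL_VENDOR_ID = 0x8086

# Assumed 'lspci -n' line format: "<address> <class>: <vendor>:<device> ..."
_LSPCI_N = re.compile(
    r"^(\S+)\s+[0-9a-f]{4}:\s+([0-9a-f]{4}):([0-9a-f]{4})", re.IGNORECASE
)


def quirked_devices(lspci_n_output):
    """Return addresses of Intel devices whose ID is in the quoted list."""
    matches = []
    for line in lspci_n_output.splitlines():
        m = _LSPCI_N.match(line.strip())
        if not m:
            continue  # skip lines that do not look like 'lspci -n' entries
        address = m.group(1)
        vendor = int(m.group(2), 16)
        device = int(m.group(3), 16)
        if vendor == INTEL_VENDOR_ID and device in PCH_ACS_QUIRK_IDS:
            matches.append(address)
    return matches
```

A match only shows that the ID appears in the list as quoted here; the list in a given kernel version may differ, and the kernel may apply further conditions before treating a port as isolated.
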
>Using device-specific mechanisms to enable and verify ACS-like capabilities
>is okay, but what should we do about devices that don't support ACS-like
>capabilities at all? How about applying "[PATCH] pci: Enable overrides for
>missing ACS capabilities"? And how can we reduce the risk of data corruption
>and information leakage between VMs?
>
Is there any update since
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/110726/focus=111515 ?

>Thanks,
>Zhang Haoyu
>>Alex



