Re: [Qemu-devel] [questions] about using vfio to assign sr-iov vf to vm
From: Zhang Haoyu
Subject: Re: [Qemu-devel] [questions] about using vfio to assign sr-iov vf to vm
Date: Mon, 18 Aug 2014 09:00:16 +0800
>> >> Hi all,
>> >> I'm using VFIO to assign Intel 82599 VFs to VMs, and I've run into a problem:
>> >> the 82599 PF and its VFs belong to the same iommu_group, but I want to
>> >> assign some VFs to one VM and some other VFs to another VM, ...,
>> >> so how can I unbind only (some of) the VFs and not the PF?
>> >> I read the kernel doc vfio.txt, and I'm not sure whether I should unbind
>> >> all of the devices that belong to one iommu_group.
>> >> If so, because the PF and its VFs belong to the same iommu_group, if I unbind
>> >> the PF, its VFs also disappear.
>> >> I think I've misunderstood something.
>> >> Any advice?
>> >
>> >This occurs when the PF is installed behind components in the system
>> >that do not support PCIe Access Control Services (ACS). The IOMMU group
>> >contains both the PF and the VF because upstream transactions can be
>> >re-routed downstream by these non-ACS components before being translated
>> >by the IOMMU. Please provide 'sudo lspci -vvv', 'lspci -n', and kernel
>> >version and we might be able to give you some advice on how to work
>> >around the problem. Thanks,
>> >
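To see which devices actually share a group, it can help to read the grouping straight from sysfs. The following is only a minimal sketch, assuming the kernel exposes groups under /sys/kernel/iommu_groups/<N>/devices/ as described in vfio.txt; the function names iommu_groups and group_of are made up for illustration and are not part of any tool.

```python
import os


def iommu_groups(sysfs_root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number (as a string) to the sorted PCI addresses in it."""
    if not os.path.isdir(sysfs_root):
        return {}  # IOMMU disabled, or groups not exposed on this kernel
    groups = {}
    for group in os.listdir(sysfs_root):
        devices_dir = os.path.join(sysfs_root, group, "devices")
        if os.path.isdir(devices_dir):
            # Each entry is a symlink named after the device, e.g. 0000:02:00.0
            groups[group] = sorted(os.listdir(devices_dir))
    return groups


def group_of(bdf, groups):
    """Return the group number containing the device `bdf`, or None if absent."""
    for group, devices in groups.items():
        if bdf in devices:
            return group
    return None
```

For example, group_of("0000:02:00.0", iommu_groups()) would name the group holding the PF, and the device list for that group shows everything that has to move together.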
>> # lspci | grep Ether
>> 02:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+
>> Network Connection (rev 01)
>> 02:00.1 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+
>> Network Connection (rev 01)
>> 08:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 08:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 09:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 09:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 0a:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 0a:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 0b:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 0b:00.1 Ethernet controller: Intel Corporation 82576 Gigabit Network
>> Connection (rev 01)
>> 0c:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
>> RTL-8110SC/8169SC Gigabit Ethernet (rev 10)
>>
>> I want to direct-assign the VFs of the Intel 82599 (02:00.0 or 02:00.1) to VMs.
>> # lspci -t
>> -[0000:00]-+-00.0
>>            +-01.0-[01]--
>>            +-01.1-[02-03]--+-00.0
>>            |               \-00.1
>>            +-02.0
>>            +-06.0-[04]--
>>            +-16.0
>>            +-1a.0
>>            +-1c.0-[05-0b]----00.0-[06-0b]--+-04.0-[07]--
>>            |                               +-05.0-[08]--+-00.0
>>            |                               |            \-00.1
>>            |                               +-06.0-[09]--+-00.0
>>            |                               |            \-00.1
>>            |                               +-08.0-[0a]--+-00.0
>>            |                               |            \-00.1
>>            |                               \-09.0-[0b]--+-00.0
>>            |                                            \-00.1
>>            +-1d.0
>>            +-1e.0-[0c]----00.0
>>            +-1f.0
>>            +-1f.2
>>            \-1f.3
>>
>> # lspci -vvv -s 02:00.0
>> 02:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit SFI/SFP+
>> Network Connection (rev 01)
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B- DisINTx+
>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
>> <MAbort- >SERR- <PERR- INTx-
>> Latency: 0, Cache Line Size: 64 bytes
>> Interrupt: pin A routed to IRQ 17
>> Region 0: Memory at f7e20000 (64-bit, non-prefetchable) [size=128K]
>> Region 2: I/O ports at e020 [size=32]
>> Region 4: Memory at f7e44000 (64-bit, non-prefetchable) [size=16K]
>> Capabilities: [40] Power Management version 3
>> Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
>> Capabilities: [70] MSI-X: Enable+ Count=64 Masked-
>> Capabilities: [a0] Express (v2) Endpoint, MSI 00
>> Capabilities: [e0] Vital Product Data
>> Capabilities: [100 v1] Advanced Error Reporting
>> Capabilities: [140 v1] Device Serial Number 00-90-0b-ff-ff-29-33-c2
>> Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
>> Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
>> Kernel driver in use: ixgbe
>>
>> # lspci -vvv -s 00:01.1
>> 00:01.1 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor
>> PCI Express Root Port (rev 09) (prog-if 00 [Normal decode])
>> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
>> Stepping- SERR- FastB2B- DisINTx+
>> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
>> <MAbort- >SERR- <PERR- INTx-
>> Latency: 0, Cache Line Size: 64 bytes
>> Bus: primary=00, secondary=02, subordinate=03, sec-latency=0
>> I/O behind bridge: 0000e000-0000efff
>> Memory behind bridge: f7e00000-f7efffff
>> Prefetchable memory behind bridge: 00000000dfb00000-00000000dfefffff
>> Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
>> <MAbort- <SERR- <PERR-
>> BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
>> PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
>> Capabilities: [88] Subsystem: Intel Corporation Xeon E3-1200 v2/3rd Gen
>> Core processor PCI Express Root Port
>> Capabilities: [80] Power Management version 3
>> Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
>> Capabilities: [a0] Express (v2) Root Port (Slot+), MSI 00
>> Capabilities: [100 v1] Virtual Channel
>> Capabilities: [140 v1] Root Complex Link
>> Capabilities: [d94 v1] #19
>> Kernel driver in use: pcieport
>>
>> The Intel 82599 (02:00.0 or 02:00.1) is behind the PCI bridge (00:01.1).
>> Does the 00:01.1 PCI bridge support ACS?
>
>It does not and that's exactly the problem. We must assume that the
>root port can redirect a transaction from a subordinate device back to
>another subordinate device without IOMMU translation when ACS support is
>not present. If you had a device plugged in below 00:01.0, we'd also
>need to assume that non-IOMMU translated peer-to-peer between devices
>behind either function, 00:01.0 or 00:01.1, is possible.
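The reasoning above amounts to walking from the device up to the root bus and checking every bridge on the way. Below is a rough sketch of that walk, not the kernel's actual logic: the real check also considers multifunction devices and treats some port types specially. The function name isolation_breaks is hypothetical, the topology is copied from the `lspci -t` output in this thread, and 07:00.0 stands for a card moved to the empty slot on bus 07.

```python
def isolation_breaks(device, upstream, has_acs):
    """Return the bridges above `device` that lack ACS, nearest first.

    upstream: maps a PCI address to the bridge directly above it
              (devices on the root bus have no entry).
    has_acs:  set of bridges that expose ACS or are covered by a quirk.
    """
    missing = []
    node = upstream.get(device)
    while node is not None:
        if node not in has_acs:
            missing.append(node)
        node = upstream.get(node)
    return missing


# Topology from the `lspci -t` output quoted in this thread.
UPSTREAM = {
    "02:00.0": "00:01.1",  # 82599 port 0 behind the processor root port
    "02:00.1": "00:01.1",  # 82599 port 1 behind the same root port
    "07:00.0": "06:04.0",  # hypothetical: card moved to the empty bus 07
    "06:04.0": "05:00.0",  # switch downstream port -> switch upstream port
    "05:00.0": "00:1c.0",  # switch upstream port -> PCH root port
}
```

With an empty has_acs set, isolation_breaks("02:00.0", UPSTREAM, set()) reports 00:01.1, which matches the situation described here. Whether the switch ports on the 00:1c.0 path belong in has_acs is exactly what still has to be checked on the real hardware.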
>
>Intel has indicated that processor root ports for all Xeon class
>processors should support ACS and have verified isolation for PCH based
>root ports allowing us to support quirks in place of ACS support. I'm
>not aware of any efforts at Intel to verify isolation capabilities of
>root ports on client processors. They are however aware that lack of
>ACS is a limiting factor for usability of VT-d, and I hope that we'll
>see future products with ACS support.
>
>Chances are good that the PCH root port at 00:1c.0 is supported by an
>ACS quirk, but it seems that your system has a PCIe switch below the
>root port. If the PCIe switch downstream ports support ACS, then you
>may be able to move the 82599 to the empty slot at bus 07 to separate
>the VFs into different IOMMU groups. Thanks,
>
Thanks, Alex.
How can I tell whether a PCI bridge/device supports the ACS capability?
I ran "lspci -vvv -s | grep -i ACS", but nothing matched.
# lspci -vvv -s 00:1c.0
00:1c.0 PCI bridge: Intel Corporation 6 Series/C200 Series Chipset Family PCI
Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Bus: primary=00, secondary=05, subordinate=0b, sec-latency=0
I/O behind bridge: 00002000-00003fff
Memory behind bridge: f7800000-f7cfffff
Prefetchable memory behind bridge: 00000000f0000000-00000000f03fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort-
<MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns,
L1 <1us
ExtTag- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+
TransPend-
LnkCap: Port #1, Speed 5GT/s, Width x4, ASPM L0s L1, Latency L0
<1us, L1 <4us
ClockPM- Surprise- LLActRep+ BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+
BWMgmt+ ABWMgmt+
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug-
Surprise-
Slot #0, PowerLimit 25.000W; Interlock- NoCompl+
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq-
LinkChg-
Control: AttnInd Unknown, PwrInd Unknown, Power-
Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+
Interlock-
Changed: MRL- PresDet- LinkState-
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna-
CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range BC, TimeoutDis+ ARIFwd-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- ARIFwd-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-,
Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-,
LinkEqualizationRequest-
Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
Address: 00000000 Data: 0000
Capabilities: [90] Subsystem: Intel Corporation 6 Series/C200 Series
Chipset Family PCI Express Root Port 1
Capabilities: [a0] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Kernel driver in use: pcieport
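For checking output like the above mechanically, here is a small sketch that scans `lspci -vvv` text for the ACS capability. It assumes lspci names the capability "Access Control Services" and prints its control bits on an "ACSCtl:" line, which is how recent pciutils versions show it; an older lspci may print only a raw capability ID instead. The helper names has_acs_capability and acs_control_flags are made up for illustration. ACS is a PCIe extended capability, so lspci needs enough privileges to read the full config space.

```python
import re

# Matches e.g. "Capabilities: [148 v1] Access Control Services"
ACS_HEADER = re.compile(
    r"Capabilities: \[[0-9a-f]+(?: v\d+)?\] Access Control Services"
)


def has_acs_capability(lspci_text):
    """True if the lspci -vvv text lists an ACS capability."""
    return ACS_HEADER.search(lspci_text) is not None


def acs_control_flags(lspci_text):
    """Parse the first ACSCtl line into {flag_name: enabled}."""
    match = re.search(r"ACSCtl:\s*(.*)", lspci_text)
    if match is None:
        return {}
    flags = {}
    for token in match.group(1).split():
        if token[-1] in "+-":
            # "SrcValid+" -> enabled, "TransBlk-" -> disabled
            flags[token[:-1]] = token.endswith("+")
    return flags
```

Run against the 00:1c.0 output above, has_acs_capability returns False, which is consistent with grep finding nothing.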
Thanks,
Zhang Haoyu
>Alex