qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v9 03/11] vfio: add aer support for vfio device


From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH v9 03/11] vfio: add aer support for vfio device
Date: Tue, 13 Sep 2016 09:14:09 -0600

Hi Dou,

On Tue, 13 Sep 2016 14:56:15 +0800
Dou Liyang <address@hidden> wrote:

> Hi Alex,
> 
> I am Dou.
> I am testing these patches.
> 
> At 09/01/2016 03:44 AM, Alex Williamson wrote:
> 
> [...]
> 
> >> +
> >> +    pcie_cap_deverr_init(pdev);
> >> +    return pcie_aer_init(pdev, pos, size);  
> >
> > pcie_aer_init() adds a v2 AER capability regardless of the version of
> > the AER capability on the device.  Is this expected?  
> 
> I think it is not good.
> 
>   v2 defines a lot
> > more bits in various registers than v1, so are we simply hoping that
> > devices have reserved bits as zero like they're supposed to?  
> 
> I read the Capability Version in PCIe Spec. it says:
> 
> Capability Version – This field is a PCI-SIG defined version number
> that indicates the version of the Capability structure present.
> OR
> This field must be 2h if the End-End TLP Prefix Supported bit (see
> Section 7.8.15) is Set and must be 1h or 2h otherwise.
> 
> In my option, I guess that the spec may mean that v1/v2 is used for the
> End-End TLP Prefix Supported bit supported.
> 
> And I find that Kernel and QEmu don't do some special work for it. this
> feature may not affect the registers in AER.
> 
> can you give me some AER bits that are defined in v2 ,but not in v1?

There are quite a lot.  In the uncorrectable error status register bits
21-25 are only defined for v2, same for the mask and severity for this
register.  Bits 14 & 15 are new for v2 in the correctable error status
registers, status and mask.  Bits 9-11 are new in the control register.

> It's a
> > bit strange that for an Intel 82576 NIC I get a v1 AER capability w/o
> > aer=on and a v2 with.  Thanks,
> >  
> 
> Yes, I do the test, and get the same result for me.
> 
> I use the following device to test:
> 
> 06:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network
> Connection (rev 01)
> 
> Firstly, In host, I use the command "lspci -vvv -s 06:00.0":
> 
> [...]
> Capabilities: [100 v1] Advanced Error Reporting
> UESta:        DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
>   MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk:        DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
> MalfTLP- ECRC- UnsupReq+ ACSViol-
> UESvrt:       DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+
> MalfTLP+ ECRC- UnsupReq- ACSViol-
> CESta:        RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> CEMsk:        RxErr- BadTLP+ BadDLLP- Rollover- Timeout- NonFatalErr+
> AERCap:       First Error Pointer: 12, GenCap- CGenEn- ChkCap- ChkEn-
> [...]
> 
> Secondly, I pass through the device to the guest. In the guest:
> 
> [...]
> Capabilities: [100 v2] Advanced Error Reporting
> UESta:        DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
>   MalfTLP- ECRC- UnsupReq- ACSViol-
> UEMsk:        DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF-
> MalfTLP- ECRC- UnsupReq+ ACSViol-
> UESvrt:       DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+
> MalfTLP+ ECRC- UnsupReq- ACSViol-
> CESta:        RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
> CEMsk:        RxErr- BadTLP+ BadDLLP- Rollover- Timeout- NonFatalErr+
> AERCap:       First Error Pointer: 12, GenCap- CGenEn- ChkCap- ChkEn-
> [...]
> 
> They look the same, except the capabilities version.
> 
> Thirdly, I drop the patches and do the test. After I launch QEmu,
> In guest:
> 
> [...]
> Capabilities: [100 v1] Advanced Error Reporting
> [...]
> 
> Fourthly, I modify the PCI_ERR_VER: #define PCI_ERR_VER  1.
> In guest:
> [...]
> Capabilities: [100 v1] Advanced Error Reporting
> [...]
> 
> They also have the same features.
> 
> At last, I think the v1/v2 is not influence on our AER function.
> But, it is actually strange that we can get a v1 AER capability
> w/o aer=on and a v2 with.
> I suggest that we add a capability version parameter to extend the
> pcie_aer_init(),or add a new pcie_aer_v1_init() for the devices with
> v1 AER cap.

Agreed.  Thanks,

Alex



reply via email to

[Prev in Thread] Current Thread [Next in Thread]