Re: [Qemu-devel] VFIO VGA test branches


From: Knut Omang
Subject: Re: [Qemu-devel] VFIO VGA test branches
Date: Tue, 14 May 2013 08:42:27 +0200

On Mon, 2013-05-13 at 16:23 -0600, Alex Williamson wrote:
> On Mon, 2013-05-13 at 22:55 +0200, Knut Omang wrote:
> > Hi all,
> > 
> > Perfect timing from my perspective, thanks Alex!
> > 
> > I spent the better part of the weekend testing your branches on a new 
> > system 
> > I just put together for this purpose, results below..
> > 
> > On Fri, 2013-05-03 at 16:56 -0600, Alex Williamson wrote:
> > ...
> > > git://github.com/awilliam/linux-vfio.git vfio-vga-reset
> > > git://github.com/awilliam/qemu-vfio.git vfio-vga-reset
> > 
> > System setup: 
> > 
> > - Fedora 18 on
> > - Gigabyte Z77X-UD5H motherboard
> > - Intel Core i7 3770 (Ivy bridge w/integrated graphics)
> > - 2 discrete graphics cards:
> > 
> > lspci | egrep 'VGA|Audio'
> > 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v2/3rd 
> > Gen Core processor Graphics Controller (rev 09)
> > 00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family 
> > High Definition Audio Controller (rev 04)
> > 01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI 
> > Caicos [Radeon HD 6450]
> > 01:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Caicos HDMI 
> > Audio [Radeon HD 6400 Series]
> > 02:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI 
> > Cape Verde PRO [Radeon HD 7700 Series]
> > 02:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cape 
> > Verde/Pitcairn HDMI Audio [Radeon HD 7700/7800 Series]
> > 
> > Short summary:
> > 
> > - Once I got past a few time consuming obstacles explained below
> >    - the graphics part of the graphics/hdmi audio passthrough seems to work 
> > perfectly
> >      on both discrete graphics cards 
> >      (though so far only one at a time and with some minor issues, see 
> > below)
> >    - no success with the hdmi audio yet (ideas for further investigation 
> > appreciated!)
> 
> I've had hdmi audio working with an HD7850, but only in Windows (7) and
> it was using legacy interrupts for some reason instead of MSI.  I wonder
> if Linux guests might work with snd_hda_intel.enable_msi=0.  I'm not sure
> what's wrong with MSI, but it seems to be new with the PCI bus reset
> support.

I tried 

modprobe -r snd_hda_intel
modprobe snd_hda_intel enable_msi=0

This did not seem to have any effect on Linux.
Here is the guest's lspci -vvv entry for the audio device after the above:

01:00.1 Audio device: Advanced Micro Devices [AMD] nee ATI Cape Verde/Pitcairn 
HDMI Audio [Radeon HD 7700/7800 Series]
        Subsystem: PC Partner Limited Device aab0
        Physical Slot: 0
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- 
<MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin B routed to IRQ 17
        Region 0: Memory at fea60000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA 
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 
unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- 
Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- 
TransPend-
                LnkCap: Port #1, Speed 8GT/s, Width x16, ASPM L0s L1, Latency 
L0 <64ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- 
BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, 
OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, 
OBFF Disabled
                LnkSta2: Current De-emphasis Level: -3.5dB, 
EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, 
LinkEqualizationRequest-
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 00000000fee00000  Data: 4072
        Capabilities: [100 v1] Vendor Specific Information: ID=0001 Rev=1 
Len=010 <?>
        Capabilities: [150 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- 
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- 
RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- 
RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Kernel driver in use: snd_hda_intel
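
For comparing dumps like the one above between attempts, the two interrupt-related flags can be pulled out mechanically. This is only a minimal sketch in Python, not something that was run as part of these tests; the function name msi_state is made up for illustration, and it assumes nothing beyond the "MSI: Enable+/-" and "DisINTx+/-" notation that lspci -vvv prints:

```python
import re


def msi_state(lspci_text):
    """Return (msi_enabled, intx_disabled) parsed from one device's lspci -vvv text.

    Each value is True or False, or None if the flag does not appear in the text.
    """
    def flag(pattern):
        match = re.search(pattern, lspci_text)
        return None if match is None else match.group(1) == "+"

    # "MSI: Enable" does not match "MSI-X: Enable", so only plain MSI is reported.
    msi_enabled = flag(r"MSI: Enable([+-])")
    intx_disabled = flag(r"DisINTx([+-])")
    return msi_enabled, intx_disabled


# Two lines taken from the dump above (unwrapped).
sample = """\
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
"""
print(msi_state(sample))
```

For the dump above both flags come out as False, i.e. MSI is not enabled and INTx is not disabled, which fits the "pin B routed to IRQ 17" line.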

> > - Contrary to address@hidden I had no success with using pci-assign for VGA
> >   with a standard fedora 18 kernel and fairly recent qemu, nor with your 
> > branches, 
> > 
> > Details:
> > 
> > - I started off with the required kernel parameter 'intel_iommu=on' + 
> > necessary parameters for disabling radeon
> >    (radeon.modeset=0 rd.driver.blacklist=radeon) using the integrated 
> > graphics as primary display
> >    - this caused the system to freeze (with color artifacts on the console)
> > 
> > - In my naivety and because of the "i" in igfx I tried both with 
> >   'intel_iommu=igfx_off' and then 'intel_iommu=on,igfx_off' 
> >   and a full set of combinations of vfio, cards, kernels and pci-assign 
> > before I suspected 
> >   that iommu support was turned off for **all** graphics cards with igfx_off
> 
> I'm not sure why this is, looks like the code only tries to turn it off
> when only graphics is under the remapping device.  We'd probably need to
> see the DMAR to know more (/sys/firmware/acpi/tables/DMAR).

Attaching both a decoded one and the raw dump.

> > - The solution was to have integrated graphics turned off in the BIOS, and 
> > 'intel_iommu=on':
> > 
> > - iommu groups:
> > 
> > ls -l /sys/bus/pci/devices/0000:01:00.0/iommu_group/devices
> > total 0
> > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:00:01.0 -> 
> > ../../../../devices/pci0000:00/0000:00:01.0
> > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:00:01.1 -> 
> > ../../../../devices/pci0000:00/0000:00:01.1
> > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:01:00.0 -> 
> > ../../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
> > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:01:00.1 -> 
> > ../../../../devices/pci0000:00/0000:00:01.0/0000:01:00.1
> > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:02:00.0 -> 
> > ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.0
> > lrwxrwxrwx 1 root root 0 May 11 08:55 0000:02:00.1 -> 
> > ../../../../devices/pci0000:00/0000:00:01.1/0000:02:00.1
> > 
> > - i.e. both the VGA/HDMI Audio pairs + the two root ports they are plugged 
> > into are in the same group:
> 
> Ick.  Intel has been pretty good about advertising ACS support on their
> root ports.  I wonder if this is an oversight or if they are actually
> not isolated from each other.

Sad state I'm afraid - one of the reasons I went for Intel this time - I
have usually chosen AMD in the past but had a bad experience with an FM1
board with no IOMMU support.. 

No ACS on any of the root ports (or anything else..) - see attachment.. 
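
The group membership quoted above can also be listed for every group at once. A minimal sketch in Python, assuming the standard /sys/kernel/iommu_groups/<n>/devices layout; the function name iommu_groups and its sysfs_root parameter are made up for illustration (the parameter only exists so the function can be pointed at a copy of the tree):

```python
import os


def iommu_groups(sysfs_root="/sys"):
    """Map IOMMU group name -> sorted list of device addresses in that group."""
    base = os.path.join(sysfs_root, "kernel", "iommu_groups")
    groups = {}
    if not os.path.isdir(base):
        return groups  # no groups exposed, e.g. the IOMMU is not enabled
    for name in os.listdir(base):
        devdir = os.path.join(base, name, "devices")
        if os.path.isdir(devdir):
            groups[name] = sorted(os.listdir(devdir))
    return groups


if __name__ == "__main__":
    # Group names are numbers, so sort them numerically for readability.
    for group, devices in sorted(iommu_groups().items(), key=lambda kv: int(kv[0])):
        print(group, " ".join(devices))
```

A group that lists both root ports together with both VGA/audio pairs, as on this board, is the sign that the ports are not treated as isolated from each other.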

I wish there were lspci -vvv's out there for all hardware - it is quite a
gamble to buy a motherboard if one wishes to use it for something more
than a plain Windows install..

> > # lspci -n
> > ...
> > 01:00.0 0300: 1002:683f
> > 01:00.1 0403: 1002:aab0
> > 02:00.0 0300: 1002:6779
> > 02:00.1 0403: 1002:aa98
> > ...
> > 
> > modprobe vfio_pci
> > echo 0000:01:00.1 > /sys/bus/pci/devices/0000\:01\:00.1/driver/unbind
> > echo 0000:02:00.1 > /sys/bus/pci/devices/0000\:02\:00.1/driver/unbind
> > echo 1002 683f > /sys/bus/pci/drivers/vfio-pci/new_id
> > echo 1002 aab0 > /sys/bus/pci/drivers/vfio-pci/new_id
> > echo 1002 6779 > /sys/bus/pci/drivers/vfio-pci/new_id
> > echo 1002 aa98 > /sys/bus/pci/drivers/vfio-pci/new_id
> > 
> > # lsusb 
> > ...
> > Bus 001 Device 008: ID 046d:c315 Logitech, Inc. Classic New Touch Keyboard
> > Bus 001 Device 004: ID 046d:c05b Logitech, Inc. M-U0004 810-001317 [B110 
> > Optical USB Mouse]
> > ...
> > 
> > - I also applied your suggested patch to the quirk function in VFIO (see 
> > below)
> > 
> > - Here is a (trimmed for readability) command line I successfully used to 
> > boot from the Windows 7 install DVD, 
> >   notice the cd and disk device descriptions and the bus parameter - I 
> > struggled a while with that 
> >   until I came across a comment by Gerd Hoffmann here: 
> > https://bugzilla.redhat.com/show_bug.cgi?id=922670 (Thanks, Gerd!)
> > 
> > 
> > qemu-kvm -M q35 \
> >   -nodefconfig -readconfig $SRC/qemu/docs/q35-chipset.cfg \
> >   -device vfio-pci,host=2:00.0,x-vga=on,multifunction=on,bus=ich9-pcie-port-1,addr=0.0 \
> >   -device vfio-pci,host=2:00.1,bus=ich9-pcie-port-1,addr=0.1 \
> >   -L $SRC/seabios/out/ -L $SRC/qemu/pc-bios \
> >   -vga none -nographic -cpu host -rtc base=localtime -k no -m 8192 -smp 2 \
> >   -drive file=/dev/sr0,index=2,media=cdrom,id=cd \
> >   -drive file=ivm03.img,index=0,media=disk,id=ivm03 \
> >   -device ide-drive,drive=ivm03,bus=ide.0 \
> >   -device ide-cd,drive=cd,bus=ide.1 \
> >   -net nic,vlan=0,model=virtio -net tap,vlan=0 \
> >   -enable-kvm \
> >   -device usb-host,hostbus=1,hostaddr=8 \
> >   -device usb-host,hostbus=1,hostaddr=4
> > 
> > - Both graphics cards seem to have a ROM, 
> > but only the HD6450 lent itself to "scraping". 
> 
> Did you try scraping the HD6450 while the HD7700 was the boot VGA and
> vice versa?  The boot VGA ROM is handled in a special way and what you
> really get is the shadow copy, which isn't what we want.

I did all the scraping work with the Radeons with my initial setup while
the integrated graphics was the primary display. I tried once more now
to scrape the HD7700 while the HD6450 is the primary VGA and still get
the same result:

# echo 1 > /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0/rom
# cat /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0/rom > HD7700.rom 
cat: /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0/rom: Input/output error

system log reports:
May 14 07:15:54 asu kernel: [   82.344189] pci 0000:02:00.0: Invalid ROM 
contents
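
For a ROM that does read out, the dump can be sanity-checked before handing it to vfio. A minimal sketch in Python, assuming the standard PCI expansion ROM layout (0x55 0xAA signature at offset 0, a 16-bit little-endian pointer at offset 0x18 to a "PCIR" structure that carries the vendor and device ID); the function name check_option_rom is made up for illustration:

```python
import struct


def check_option_rom(image):
    """Return (ok, detail) for a PCI option ROM image given as bytes.

    On success, detail is the "vendor:device" ID pair found in the image.
    """
    if len(image) < 0x1a:
        return False, "image too short for a ROM header"
    if image[0:2] != b"\x55\xaa":
        return False, "missing 0x55 0xAA signature"
    # Offset 0x18 holds the pointer to the PCI data structure.
    (pcir_offset,) = struct.unpack_from("<H", image, 0x18)
    if pcir_offset + 8 > len(image):
        return False, "PCI data structure pointer out of range"
    if image[pcir_offset:pcir_offset + 4] != b"PCIR":
        return False, "missing PCIR data structure"
    vendor, device = struct.unpack_from("<HH", image, pcir_offset + 4)
    return True, "%04x:%04x" % (vendor, device)
```

Used as check_option_rom(open(path, "rb").read()) on a saved dump, the ID pair it returns should match the lspci -n entry for that card. A dump that fails the first signature check is the sort of image the "Invalid ROM contents" message points at, though that link is an assumption here and not something verified on this system.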

> 
> > Anyway, supplying it to vfio did not seem to make any difference.
> > 
> > find /sys -name rom
> > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/rom
> > /sys/devices/pci0000:00/0000:00:01.1/0000:02:00.0/rom
> > ...
> > 
> > Some observations and remaining unresolved issues:
> > 
> > - VFIO patch:
> >   Initially (while still running with igfx_off) I observed exactly the same 
> > behaviour as address@hidden
> >   reported a while ago: With vfio_pci debug enabled, vfio_pci ended up 
> > spinning with repeated calls to
> >   vfio_ati_3c3_quirk_read and repeated logs: 
> >     vfio: vfio_vga_read(0x3c3, 1) = 0x0
> >   I patched up accordingly with 
> > 
> > 
> > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> > index da0e5f9..a361d06 100644
> > --- a/hw/misc/vfio.c
> > +++ b/hw/misc/vfio.c
> > @@ -1291,7 +1291,7 @@ static uint64_t vfio_ati_3c3_quirk_read(void *opaque,
> >      uint64_t data = vfio_vga_read(&vdev->vga.region[QEMU_PCI_VGA_IO_HI],
> >                                    addr + quirk->data.base_offset, size);
> >  
> > -    if (data == quirk->data.address_match) {
> > +    if (1 || data == quirk->data.address_match) {
> >          data = vfio_pci_read_config(&vdev->pdev, quirk->data.address_val, 
> > size);
> >          DPRINTF("%s(0x3c3, 1) = 0x%"PRIx64"\n", __func__, data);
> >      }
> > 
> > 
> >   This of course did not help much until I actually got the iommu 
> >   enabled for the radeons (similar "repeated patterns" as deniv reported)
> >   but what I have observed after I got it working is that if 
> >   I disable the patch above, things do not go that well: the Fedora VM 
> >   comes up with VGA and the Fedora boot screen, then goes blank when 
> >   switching to X.
> 
> Hmm, I think we'd probably have better luck making that unconditional
> until we have reason to do otherwise.

I'm starting to wonder whether there's some timing issue or maybe
something with the initial state of the hardware affecting this.
It might be that the blank screen situation is more likely to occur if
debug is enabled - this morning I saw the same behavior even with the
patch enabled - then tried once more and got success. This was right
after a reboot so I tried a "warm" reboot and the same happened again:
First attempt got through the initial VGA phase then blanked, I ^C'ed
the VM then restarted and got all the way to the GUI again..

> > - The fact that the iommu group now extends across all my available 
> >   graphics devices makes it difficult to get the radeon (or catalyst) 
> >   driver to use the other card, since the vfio_pci driver needs to hold it.
> >   Not a complete showstopper since the vesa driver comes up with 1024x768..
> >   Might it be a good idea to have an override option (exception list or 
> > similar?) 
> >   to allow the vfio_pci to be less restrictive about owning the whole group 
> >    - allow functionality over security in such case? This of course is 
> > further complicated
> >   by the need for graphics drivers to be disabled/enabled already at the 
> > kernel prompt..
> 
> We have a quirk in the kernel that enables us to whitelist devices, but
> yes, there is no flexibility in this w/o modifying the code and
> rebuilding.  (see drivers/pci/quirks.c:pci_dev_acs_enabled and follow
> the example above w/ pci_dev_dma_source - function can just return 1)

Thanks, I'll have a look at that.

> > - There seems to be a bug in the (version F8) UEFI BIOS on the motherboard. 
> >   The BIOS offers (undocumented) a full range of selections of which PCIe 
> >   (or PCIe 1x) graphics card to use as primary, but any other selection 
> >   than the first PCIe 16x slot has no effect and the motherboard reverts 
> >   to the first slot, so to be able to test both cards, I had to put the 
> > card under test
> >   into the second (8x) PCIe slot. I am waiting for feedback from Gigabyte 
> > on possible 
> >   fixes for this in newer BIOSes.
> > 
> > - The ultimate goal is to try to consolidate some older Windows desktops as 
> > "seats" 
> >   on the new system, using the discrete graphics with HDMI/Displayport 
> > audio. 
> >   With the HD7700 moved to the second PCIe slot I tested both Windows and 
> >   Linux guests to try to get some sound through the HDMI audio device. 
> >   Windows complains that no usable device is available. On Linux (Fedora 
> > 18, KDE desktop), 
> >   the system settings -> multimedia dialogue never opens up which seems to 
> > indicate that 
> >   PulseAudio has problems communicating with the passed through device (?), 
> >   any hints/pointers here appreciated. From the vfio log it seems at least
> >   config space is accessed ok.
> > 
> > - There also seem to be issues with radeon and intel_iommu=on - if I try 
> >   to enable modesetting and normal X support for the radeon cards, X fails 
> > to start.
> > 
> > - It would be nice if the integrated graphics could be used as the host 
> > primary display - 
> >   I would be happy if someone has any hints as to if/how the igfx_off 
> > option 
> >   could be extended/modified to only affect iommu operation on selected 
> > device(s),
> >   if at all possible..
> 
> Let's see what we can discover from your DMAR.  Also send along sudo
> lspci -vvv.  Thanks,

Attached.

One "interesting feature" I have never seen before on a motherboard is 
that I get "pcilib: sysfs_read_vpd: read failed: Connection timed out"
while doing the lspci -vvv but this appears to come from trying to read
the Vital Product Data capability of the secondary onboard ethernet
[07:00.0 Ethernet controller: Atheros Communications Inc. AR8151 v2.0
Gigabit Ethernet (rev c0)] which should not have any significance here..

Thanks,

Knut

Attachment: lspci_asu.txt
Description: Text document

Attachment: DMAR_asu.dsl
Description: Text Data

Attachment: DMAR_asu.raw
Description: image/panasonic-raw

