From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH v7 00/17] VT-d: vfio enablement and misc enhances
Date: Mon, 20 Feb 2017 16:32:49 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Mon, Feb 20, 2017 at 08:17:32AM +0000, Liu, Yi L wrote:
> > -----Original Message-----
> > From: Qemu-devel [mailto:address@hidden] On Behalf Of Peter Xu
> > Sent: Monday, February 20, 2017 3:48 PM
> > To: Alex Williamson <address@hidden>
> > Cc: Lan, Tianyu <address@hidden>; Tian, Kevin <address@hidden>;
> > address@hidden; address@hidden; address@hidden;
> > qemu-address@hidden; address@hidden; David Gibson <address@hidden>
> > Subject: Re: [Qemu-devel] [PATCH v7 00/17] VT-d: vfio enablement and misc enhances
> > 
> > On Fri, Feb 17, 2017 at 10:18:35AM -0700, Alex Williamson wrote:
> > > On Tue,  7 Feb 2017 16:28:02 +0800
> > > Peter Xu <address@hidden> wrote:
> > >
> > > > This is v7 of vt-d vfio enablement series.
> > > [snip]
> > > > =========
> > > > Test Done
> > > > =========
> > > >
> > > > Build test passed for x86_64/arm/ppc64.
> > > >
> > > > Simply tested with x86_64, assigning two PCI devices to a single VM,
> > > > boot the VM using:
> > > >
> > > > bin=x86_64-softmmu/qemu-system-x86_64
> > > > $bin -M q35,accel=kvm,kernel-irqchip=split -m 1G \
> > > >      -device intel-iommu,intremap=on,eim=off,caching-mode=on \
> > > >      -netdev user,id=net0,hostfwd=tcp::5555-:22 \
> > > >      -device virtio-net-pci,netdev=net0 \
> > > >      -device vfio-pci,host=03:00.0 \
> > > >      -device vfio-pci,host=02:00.0 \
> > > >      -trace events=".trace.vfio" \
> > > >      /var/lib/libvirt/images/vm1.qcow2
> > > >
> > > > pxdev:bin [vtd-vfio-enablement]# cat .trace.vfio
> > > > vtd_page_walk*
> > > > vtd_replay*
> > > > vtd_inv_desc*
> > > >
> > > > Then, in the guest, run the following tool:
> > > >
> > > > https://github.com/xzpeter/clibs/blob/master/gpl/userspace/vfio-bind-group/vfio-bind-group.c
> > > >
> > > > With parameter:
> > > >
> > > >   ./vfio-bind-group 00:03.0 00:04.0
> > > >
> > > > Checking the host side trace log, I can see pages are replayed and mapped
> > > > in the 00:04.0 device address space, like:
> > > >
> > > > ...
> > > > vtd_replay_ce_valid replay valid context device 00:04.00 hi 0x401 lo 0x38fe1001
> > > > vtd_page_walk Page walk for ce (0x401, 0x38fe1001) iova range 0x0 - 0x8000000000
> > > > vtd_page_walk_level Page walk (base=0x38fe1000, level=3) iova range 0x0 - 0x8000000000
> > > > vtd_page_walk_level Page walk (base=0x35d31000, level=2) iova range 0x0 - 0x40000000
> > > > vtd_page_walk_level Page walk (base=0x34979000, level=1) iova range 0x0 - 0x200000
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x0 -> gpa 0x22dc3000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x1000 -> gpa 0x22e25000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x2000 -> gpa 0x22e12000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x3000 -> gpa 0x22e2d000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x4000 -> gpa 0x12a49000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x5000 -> gpa 0x129bb000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x6000 -> gpa 0x128db000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x7000 -> gpa 0x12a80000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x8000 -> gpa 0x12a7e000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0x9000 -> gpa 0x12b22000 mask 0xfff perm 3
> > > > vtd_page_walk_one Page walk detected map level 0x1 iova 0xa000 -> gpa 0x12b41000 mask 0xfff perm 3
> > > > ...
> > >
> > > Hi Peter,
> > >
> > > I'm trying to make use of this, with your vtd-vfio-enablement-v7
> > > branch (HEAD 0c1c4e738095).  I'm assigning an 82576 PF to a VM.  It
> > > works with iommu=pt, but if I remove that option, the device does not
> > > work and vfio_iommu_map_notify is never called.  Any suggestions?  My
> > > commandline is below.  Thanks,
> > >
> > > Alex
> > >
> > > /usr/local/bin/qemu-system-x86_64 \
> > >         -name guest=l1,debug-threads=on -S \
> > >         -machine pc-q35-2.9,accel=kvm,usb=off,dump-guest-core=off,kernel-irqchip=split \
> > >         -cpu host -m 10240 -realtime mlock=off -smp 4,sockets=1,cores=2,threads=2 \
> > >         -no-user-config -nodefaults -monitor stdio -rtc base=utc,driftfix=slew \
> > >         -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown \
> > >         -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 \
> > >         -boot strict=on \
> > >         -device ioh3420,port=0x10,chassis=1,id=pci.1,bus=pcie.0,addr=0x2 \
> > >         -device i82801b11-bridge,id=pci.2,bus=pcie.0,addr=0x1e \
> > >         -device pci-bridge,chassis_nr=3,id=pci.3,bus=pci.2,addr=0x0 \
> > >         -device ioh3420,port=0x18,chassis=4,id=pci.4,bus=pcie.0,addr=0x3 \
> > >         -device ioh3420,port=0x20,chassis=5,id=pci.5,bus=pcie.0,addr=0x4 \
> > >         -device ioh3420,port=0x28,chassis=6,id=pci.6,bus=pcie.0,addr=0x5 \
> > >         -device ioh3420,port=0x30,chassis=7,id=pci.7,bus=pcie.0,addr=0x6 \
> > >         -device ioh3420,port=0x38,chassis=8,id=pci.8,bus=pcie.0,addr=0x7 \
> > >         -device ich9-usb-ehci1,id=usb,bus=pcie.0,addr=0x1d.0x7 \
> > >         -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pcie.0,multifunction=on,addr=0x1d \
> > >         -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pcie.0,addr=0x1d.0x1 \
> > >         -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pcie.0,addr=0x1d.0x2 \
> > >         -device virtio-serial-pci,id=virtio-serial0,bus=pci.4,addr=0x0 \
> > >         -drive file=/dev/vg_s20/lv_l1,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native \
> > >         -device virtio-blk-pci,scsi=off,bus=pci.5,addr=0x0,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
> > >         -netdev user,id=hostnet0 \
> > >         -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:c2:62:30,bus=pci.1,addr=0x0 \
> > >         -device usb-tablet,id=input0,bus=usb.0,port=1 \
> > >         -vnc :0 -vga std \
> > >         -device vfio-pci,host=01:00.0,id=hostdev0,bus=pci.8,addr=0x0 \
> > >         -device intel-iommu,intremap=on,eim=off,caching-mode=on -trace events=/trace-events.txt -msg timestamp=on
> > 
> > Alex,
> > 
> > Thanks for testing this series.
> > 
> > I think I reproduced it using my 10g nic as well. What I got is:
> > 
> > [   23.724787] ixgbe 0000:01:00.0 enp1s0: Detected Tx Unit Hang
> > [   23.724787]   Tx Queue             <0>
> > [   23.724787]   TDH, TDT             <0>, <1>
> > [   23.724787]   next_to_use          <1>
> > [   23.724787]   next_to_clean        <0>
> > [   23.724787] tx_buffer_info[next_to_clean]
> > [   23.724787]   time_stamp           <fffbb8bb>
> > [   23.724787]   jiffies              <fffbc780>
> > [   23.729580] ixgbe 0000:01:00.0 enp1s0: tx hang 1 detected on queue 0, resetting adapter
> > [   23.730752] ixgbe 0000:01:00.0 enp1s0: initiating reset due to tx timeout
> > [   23.731768] ixgbe 0000:01:00.0 enp1s0: Reset adapter
> > 
> > Is this the problem you have encountered? (adapter continuously reset)
> > 
> > Interestingly, I found that the problem goes away after I move the
> > "-device intel-iommu,..." option before all the other devices.
> 
> I also encountered this interesting behavior. Yes, you must place
> "-device intel-iommu" before the vfio-pci devices. If I remember correctly,
> when "-device intel-iommu" is not in front of the others, vtd_realize() is
> called after vfio_initfn(), which means the following code snippet is never
> executed. Then there is no channel between the vfio device and intel-iommu,
> and anything can go wrong once that channel is missing. So it is better to
> place "intel-iommu" in first place ^_^
> 
> hw/vfio/common.c: vfio_listener_region_add()
>     if (memory_region_is_iommu(section->mr)) {
>         VFIOGuestIOMMU *giommu;
> 
>         trace_vfio_listener_region_add_iommu(iova, end);
>         /*
>          * FIXME: For VFIO iommu types which have KVM acceleration to
>          * avoid bouncing all map/unmaps through qemu this way, this
>          * would be the right place to wire that up (tell the KVM
>          * device emulation the VFIO iommu handles to use).
>          */
>         giommu = g_malloc0(sizeof(*giommu));
>         giommu->iommu = section->mr;
>         giommu->iommu_offset = section->offset_within_address_space -
>                                section->offset_within_region;
>         giommu->container = container;
>         giommu->n.notify = vfio_iommu_map_notify;
>         giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
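> 
> (For completeness, the part that never gets to run continues roughly like
> below -- I am quoting this from memory, so please double-check against the
> actual tree:)
> 
>         QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
> 
>         /* This registration is the "channel": from now on the IOMMU
>          * notifies vfio of every map/unmap on this address space. */
>         memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
>         /* Existing mappings get replayed into the container as well. */
>         memory_region_iommu_replay(giommu->iommu, &giommu->n, false);
> 
> If vtd_realize() runs after vfio_initfn(), none of this ever happens.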

Yeah. I think that's possibly because when we put "-device vfio-pci"
before "-device intel-iommu", vfio ends up listening on
&address_space_memory, so any real update on the IOMMU address space
is lost.
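
(The fallback comes from pci_device_iommu_address_space() in hw/pci/pci.c,
which vfio_initfn() uses to pick the device's address space. A simplified
sketch from memory -- not the verbatim source, so take it as illustration
only:

    AddressSpace *pci_device_iommu_address_space(PCIDevice *dev)
    {
        PCIBus *bus = PCI_BUS(dev->bus);

        /* Walk up the hierarchy looking for a bus with an IOMMU hook. */
        while (bus && !bus->iommu_fn && bus->parent_dev) {
            bus = PCI_BUS(bus->parent_dev->bus);
        }
        if (bus && bus->iommu_fn) {
            /* intel-iommu realized first: per-device IOMMU address space. */
            return bus->iommu_fn(bus, bus->iommu_opaque, dev->devfn);
        }
        /* No hook installed yet: fall back to plain system memory. */
        return &address_space_memory;
    }

So if vfio-pci is realized before intel-iommu has installed iommu_fn via
pci_setup_iommu(), vfio caches &address_space_memory here and never sees
the IOMMU address space at all.)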

IMHO, forcing the user to add "-device intel-iommu" first might be a little
bit "tough" indeed. Not sure whether we should just provide (or do we
already have?) a way to decide the init order of the device list.

Thanks,

-- peterx


