Re: [Qemu-devel] [RFC PATCH v3 2/3] VFIO driver for vGPU device


From: Neo Jia
Subject: Re: [Qemu-devel] [RFC PATCH v3 2/3] VFIO driver for vGPU device
Date: Wed, 4 May 2016 14:14:19 -0700
User-agent: Mutt/1.5.24 (2015-08-30)

On Wed, May 04, 2016 at 11:06:19AM -0600, Alex Williamson wrote:
> On Wed, 4 May 2016 03:23:13 +0000
> "Tian, Kevin" <address@hidden> wrote:
> 
> > > From: Alex Williamson [mailto:address@hidden]
> > > Sent: Wednesday, May 04, 2016 6:43 AM  
> > > > +
> > > > +               if (gpu_dev->ops->write) {
> > > > +                       ret = gpu_dev->ops->write(vgpu_dev,
> > > > +                                                 user_data,
> > > > +                                                 count,
> > > > +                                                 vgpu_emul_space_config,
> > > > +                                                 pos);
> > > > +               }
> > > > +
> > > > +               memcpy((void *)(vdev->vconfig + pos), (void *)user_data, count);
> > > 
> > > So write is expected to modify user_data to allow only the writable bits to be
> > > changed?  What's really being saved in the vconfig here vs the vendor
> > > vgpu driver?  It seems like we're only using it to cache the BAR
> > > values, but we're not providing the BAR emulation here, which seems
> > > like one of the few things we could provide so it's not duplicated in
> > > every vendor driver.  But then we only need a few u32s to do that, not
> > > all of config space.  
> > 
> > We can borrow the same vconfig emulation from the existing vfio-pci
> > driver.  But doing so doesn't mean the vendor vGPU driver cannot layer
> > its own vconfig emulation on top.  A vGPU is not like a real device,
> > since there may be no physical config space implemented for each vGPU.
> > So the vendor vGPU driver needs to create/emulate the virtualized
> > config space anyway, and how it is created might be vendor specific.
> > It is therefore better to keep an interface for the vendor vGPU driver
> > to access the raw vconfig space.
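
As an aside, the common BAR emulation mentioned above is essentially
what vfio-pci already does for the BAR registers in vconfig: only the
address bits above the region size are writable, so a guest write of ~0
followed by a read reports the BAR size.  A rough sketch, with a
hypothetical helper name, assuming a power-of-two sized 32-bit BAR:

static void vgpu_cache_bar_write(u32 *vbar, u32 val, u64 region_size)
{
	u32 addr_mask = ~((u32)region_size - 1);	/* writable address bits */

	/* update only the address bits; flags and size bits stay read-only */
	*vbar = (val & addr_mask) | (*vbar & ~addr_mask);
}

This is not code from the series, just an illustration of what a common
layer could provide instead of each vendor driver duplicating it.
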
> 
> I'm hoping config space will be very simple for a vgpu, so I don't know
> that it makes sense to add that complexity early on.  Neo/Kirti, what
> capabilities do you expect to provide?  Who provides the MSI
> capability?  Is a PCIe capability provided?  Others?

Currently only standard PCI caps.

MSI cap is emulated by the vendor drivers via the above interface.

No PCIe caps so far.
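
Roughly, the MSI emulation split looks like this.  It is only a sketch:
the callback prototype follows the ->write hook in the hunk above,
while the capability offset and helper names are made up:

static ssize_t my_vgpu_write(struct vgpu_device *vgpu_dev, char *buf,
			     size_t count, uint32_t space, loff_t pos)
{
	if (space != vgpu_emul_space_config)
		return my_vgpu_bar_write(vgpu_dev, buf, count, pos);

	/* MY_MSI_CAP_POS/LEN are hypothetical, wherever the vendor places the cap */
	if (pos >= MY_MSI_CAP_POS && pos < MY_MSI_CAP_POS + MY_MSI_CAP_LEN)
		return my_vgpu_msi_write(vgpu_dev, buf, count,
					 pos - MY_MSI_CAP_POS);

	/* everything else falls through to the generic config handling */
	return my_vgpu_config_write(vgpu_dev, buf, count, pos);
}
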

>  
> > > > +static ssize_t vgpu_dev_rw(void *device_data, char __user *buf,
> > > > +               size_t count, loff_t *ppos, bool iswrite)
> > > > +{
> > > > +       unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
> > > > +       struct vfio_vgpu_device *vdev = device_data;
> > > > +
> > > > +       if (index >= VFIO_PCI_NUM_REGIONS)
> > > > +               return -EINVAL;
> > > > +
> > > > +       switch (index) {
> > > > +       case VFIO_PCI_CONFIG_REGION_INDEX:
> > > > +               return vgpu_dev_config_rw(vdev, buf, count, ppos, iswrite);
> > > > +
> > > > +       case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
> > > > +               return vgpu_dev_bar_rw(vdev, buf, count, ppos, iswrite);
> > > > +
> > > > +       case VFIO_PCI_ROM_REGION_INDEX:
> > > > +       case VFIO_PCI_VGA_REGION_INDEX:  
> > > 
> > > Wait a sec, who's doing the VGA emulation?  We can't be claiming to
> > > support a VGA region and then fail to provide read/write access to it
> > > like we said it has.  
> > 
> > On the Intel side we plan not to support the VGA region when
> > upstreaming our KVMGT work, which means an Intel vGPU will be exposed
> > only as a secondary graphics card, so legacy VGA is not required.
> > There is also no VBIOS/ROM requirement.  I guess we can remove the
> > above two regions.
> 
> So this needs to be optional based on what the mediation driver
> provides.  It seems like we're just making passthroughs for the vendor
> mediation driver to speak vfio.
> 
> > > > +
> > > > +static int vgpu_dev_mmio_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > > +{
> > > > +       int ret = 0;
> > > > +       struct vfio_vgpu_device *vdev = vma->vm_private_data;
> > > > +       struct vgpu_device *vgpu_dev;
> > > > +       struct gpu_device *gpu_dev;
> > > > +       u64 virtaddr = (u64)vmf->virtual_address;
> > > > +       u64 offset, phyaddr;
> > > > +       unsigned long req_size, pgoff;
> > > > +       pgprot_t pg_prot;
> > > > +
> > > > +       if (!vdev || !vdev->vgpu_dev)
> > > > +               return -EINVAL;
> > > > +
> > > > +       vgpu_dev = vdev->vgpu_dev;
> > > > +       gpu_dev  = vgpu_dev->gpu_dev;
> > > > +
> > > > +       offset   = vma->vm_pgoff << PAGE_SHIFT;
> > > > +       phyaddr  = virtaddr - vma->vm_start + offset;
> > > > +       pgoff    = phyaddr >> PAGE_SHIFT;
> > > > +       req_size = vma->vm_end - virtaddr;
> > > > +       pg_prot  = vma->vm_page_prot;
> > > > +
> > > > +       if (gpu_dev->ops->validate_map_request) {
> > > > +               ret = gpu_dev->ops->validate_map_request(vgpu_dev, virtaddr, &pgoff,
> > > > +                                                        &req_size, &pg_prot);
> > > > +               if (ret)
> > > > +                       return ret;
> > > > +
> > > > +               if (!req_size)
> > > > +                       return -EINVAL;
> > > > +       }
> > > > +
> > > > +       ret = remap_pfn_range(vma, virtaddr, pgoff, req_size, pg_prot);
> > > 
> > > So not supporting validate_map_request() means that the user can
> > > directly mmap BARs of the host GPU and as shown below, we assume a 1:1
> > > mapping of vGPU BAR to host GPU BAR.  Is that ever valid in a vGPU
> > > scenario or should this callback be required?  It's not clear to me how
> > > the vendor driver determines what this maps to, do they compare it to
> > > the physical device's own BAR addresses?  
> > 
> > I didn't quite understand this either.  Based on earlier discussion, do
> > we need something like this, or could we achieve the purpose just by
> > leveraging the recent sparse mmap support?
> 
> The reason for faulting in the mmio space, if I recall correctly, is to
> enable an ordering where the user driver (QEMU) can mmap regions of the
> device prior to resources being allocated on the host GPU to handle
> them.  Sparse mmap only partially handles that; it's not dynamic.  With
> this faulting mechanism, the host GPU doesn't need to commit resources
> until the mmap is actually accessed.  Thanks,

Correct.
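
In other words, the mmap path only installs the fault handler and
defers everything else.  A rough sketch (only vgpu_dev_mmio_fault()
comes from the patch above; the rest is illustrative, not the actual
implementation):

static const struct vm_operations_struct vgpu_mmio_vm_ops = {
	.fault = vgpu_dev_mmio_fault,
};

static int vgpu_dev_mmap(void *device_data, struct vm_area_struct *vma)
{
	struct vfio_vgpu_device *vdev = device_data;

	vma->vm_private_data = vdev;
	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_ops = &vgpu_mmio_vm_ops;

	/* no remap_pfn_range() here: PFNs are mapped lazily on first fault,
	 * so the host GPU commits resources only when QEMU touches the BAR */
	return 0;
}
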

Thanks,
Neo

> 
> Alex


