Re: [Qemu-devel] [RFC PATCH v3 2/3] VFIO driver for vGPU device


From: Alex Williamson
Subject: Re: [Qemu-devel] [RFC PATCH v3 2/3] VFIO driver for vGPU device
Date: Wed, 4 May 2016 11:06:19 -0600

On Wed, 4 May 2016 03:23:13 +0000
"Tian, Kevin" <address@hidden> wrote:

> > From: Alex Williamson [mailto:address@hidden
> > Sent: Wednesday, May 04, 2016 6:43 AM  
> > > +
> > > +         if (gpu_dev->ops->write) {
> > > +                 ret = gpu_dev->ops->write(vgpu_dev,
> > > +                                           user_data,
> > > +                                           count,
> > > +                                           vgpu_emul_space_config,
> > > +                                           pos);
> > > +         }
> > > +
> > > +         memcpy((void *)(vdev->vconfig + pos), (void *)user_data, count);
> > 
> > So write is expected to modify user_data so that only the writable bits are
> > changed?  What's really being saved in the vconfig here vs the vendor
> > vgpu driver?  It seems like we're only using it to cache the BAR
> > values, but we're not providing the BAR emulation here, which seems
> > like one of the few things we could provide so it's not duplicated in
> > every vendor driver.  But then we only need a few u32s to do that, not
> > all of config space.  
> 
> We can borrow the same vconfig emulation from the existing vfio-pci
> driver. But doing so doesn't mean the vendor vgpu driver cannot layer
> its own vconfig emulation on top. A vGPU is not like a real device,
> since there may be no physical config space implemented for each vGPU.
> So the vendor vGPU driver needs to create/emulate the virtualized
> config space anyway, and how it is created might be vendor specific.
> So it's better to keep an interface for the vendor vGPU driver to
> access the raw vconfig space.

I'm hoping config space will be very simple for a vgpu, so I don't know
that it makes sense to add that complexity early on.  Neo/Kirti, what
capabilities do you expect to provide?  Who provides the MSI
capability?  Is a PCIe capability provided?  Others?
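
As a point of reference, here is a minimal sketch of the BAR write
masking mentioned above that a common vgpu core could provide, so only
a few cached u32s are needed rather than a full vconfig copy in every
vendor driver. The structure and helper names below (vgpu_bar_emu,
bar_mask, vgpu_bar_write) are hypothetical, not from the patch:

/*
 * Hypothetical sketch: the core caches just the six BAR registers and
 * applies vfio-pci style write masking, so read-only bits (memory type,
 * prefetch flag) survive guest writes and BAR sizing works without any
 * help from the vendor driver.
 */
#include <linux/types.h>

struct vgpu_bar_emu {
	u32 bar[6];		/* last value written by the guest */
	u32 bar_mask[6];	/* writable bits, from the virtual BAR size */
};

static void vgpu_bar_write(struct vgpu_bar_emu *emu, int index, u32 val)
{
	u32 mask = emu->bar_mask[index];

	/* Writable (address) bits take the new value; read-only bits keep
	 * whatever was set when the virtual BAR was initialized. */
	emu->bar[index] = (val & mask) | (emu->bar[index] & ~mask);
}

With something like this in the core, the vendor driver would only need
to be told about config writes that actually change device behaviour.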
 
> > > +static ssize_t vgpu_dev_rw(void *device_data, char __user *buf,
> > > +         size_t count, loff_t *ppos, bool iswrite)
> > > +{
> > > + unsigned int index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
> > > + struct vfio_vgpu_device *vdev = device_data;
> > > +
> > > + if (index >= VFIO_PCI_NUM_REGIONS)
> > > +         return -EINVAL;
> > > +
> > > + switch (index) {
> > > + case VFIO_PCI_CONFIG_REGION_INDEX:
> > > +         return vgpu_dev_config_rw(vdev, buf, count, ppos, iswrite);
> > > +
> > > + case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
> > > +         return vgpu_dev_bar_rw(vdev, buf, count, ppos, iswrite);
> > > +
> > > + case VFIO_PCI_ROM_REGION_INDEX:
> > > + case VFIO_PCI_VGA_REGION_INDEX:  
> > 
> > Wait a sec, who's doing the VGA emulation?  We can't be claiming to
> > support a VGA region and then fail to provide read/write access to it
> > like we said it has.  
> 
> On the Intel side we plan not to support the VGA region when
> upstreaming our KVMGT work, which means the Intel vGPU will be exposed
> only as a secondary graphics card, so legacy VGA is not required.
> There is also no VBIOS/ROM requirement. I guess we can remove the
> above two regions.

So this needs to be optional based on what the mediation driver
provides.  It seems like we're just making passthroughs for the vendor
mediation driver to speak vfio.
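
One possible shape for making the ROM and VGA regions optional, sketched
here with hypothetical rom_rw/vga_rw callbacks in the vendor ops (struct
gpu_device follows the patch; the callbacks do not):

/*
 * Sketch, not part of the patch: a region is only exposed when the
 * mediation driver implements it, so VFIO_DEVICE_GET_REGION_INFO can
 * report a zero-sized region and vgpu_dev_rw() can return -EINVAL for
 * ROM/VGA when they are absent.
 */
static bool vgpu_region_supported(struct gpu_device *gpu_dev,
				  unsigned int index)
{
	switch (index) {
	case VFIO_PCI_ROM_REGION_INDEX:
		return gpu_dev->ops->rom_rw != NULL;	/* hypothetical op */
	case VFIO_PCI_VGA_REGION_INDEX:
		return gpu_dev->ops->vga_rw != NULL;	/* hypothetical op */
	default:
		return true;
	}
}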

> > > +
> > > +static int vgpu_dev_mmio_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> > > +{
> > > + int ret = 0;
> > > + struct vfio_vgpu_device *vdev = vma->vm_private_data;
> > > + struct vgpu_device *vgpu_dev;
> > > + struct gpu_device *gpu_dev;
> > > + u64 virtaddr = (u64)vmf->virtual_address;
> > > + u64 offset, phyaddr;
> > > + unsigned long req_size, pgoff;
> > > + pgprot_t pg_prot;
> > > +
> > > + if (!vdev || !vdev->vgpu_dev)
> > > +         return -EINVAL;
> > > +
> > > + vgpu_dev = vdev->vgpu_dev;
> > > + gpu_dev  = vgpu_dev->gpu_dev;
> > > +
> > > + offset   = vma->vm_pgoff << PAGE_SHIFT;
> > > + phyaddr  = virtaddr - vma->vm_start + offset;
> > > + pgoff    = phyaddr >> PAGE_SHIFT;
> > > + req_size = vma->vm_end - virtaddr;
> > > + pg_prot  = vma->vm_page_prot;
> > > +
> > > + if (gpu_dev->ops->validate_map_request) {
> > > +         ret = gpu_dev->ops->validate_map_request(vgpu_dev, virtaddr, &pgoff,
> > > +                                                  &req_size, &pg_prot);
> > > +         if (ret)
> > > +                 return ret;
> > > +
> > > +         if (!req_size)
> > > +                 return -EINVAL;
> > > + }
> > > +
> > > + ret = remap_pfn_range(vma, virtaddr, pgoff, req_size, pg_prot);  
> > 
> > So not supporting validate_map_request() means that the user can
> > directly mmap BARs of the host GPU and as shown below, we assume a 1:1
> > mapping of vGPU BAR to host GPU BAR.  Is that ever valid in a vGPU
> > scenario or should this callback be required?  It's not clear to me how
> > the vendor driver determines what this maps to, do they compare it to
> > the physical device's own BAR addresses?  
> 
> I didn't quite understand this either. Based on the earlier
> discussion, do we need something like this, or could we achieve the
> purpose just by leveraging the recent sparse mmap support?

The reason for faulting in the mmio space, if I recall correctly, is to
enable an ordering where the user driver (QEMU) can mmap regions of the
device prior to resources being allocated on the host GPU to handle
them.  Sparse mmap only partially handles that; it's not dynamic.  With
this faulting mechanism, the host GPU doesn't need to commit resources
until the mmap is actually accessed.  Thanks,

Alex
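
To illustrate the ordering described above, a minimal sketch of an mmap
path that defers all mapping work to the fault handler. Only
vgpu_dev_mmio_fault() and struct vfio_vgpu_device come from the patch;
the mmap callback and its wiring here are assumptions for illustration:

/*
 * Sketch only: mmap() installs the fault handler and maps nothing, so
 * the host GPU does not have to commit backing resources until the
 * region is first touched and vgpu_dev_mmio_fault() runs.
 */
static const struct vm_operations_struct vgpu_dev_mmio_ops = {
	.fault = vgpu_dev_mmio_fault,
};

static int vgpu_dev_mmap(void *device_data, struct vm_area_struct *vma)
{
	struct vfio_vgpu_device *vdev = device_data;

	vma->vm_private_data = vdev;
	vma->vm_ops = &vgpu_dev_mmio_ops;	/* no remap_pfn_range() here */
	return 0;
}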


