Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device


From: Alex Williamson
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Fri, 30 Sep 2011 15:59:29 -0600

On Fri, 2011-09-30 at 10:37 -0600, Alex Williamson wrote:
> On Fri, 2011-09-30 at 18:46 +1000, David Gibson wrote:
> > On Mon, Sep 26, 2011 at 12:34:52PM -0600, Alex Williamson wrote:
> > > On Mon, 2011-09-26 at 12:04 +0200, Alexander Graf wrote:
> > > > On 26.09.2011 at 09:51, David Gibson <address@hidden> wrote:
> > [snip]
> > > > Also, if you can come up with an interface that does not have variable
> > > > length descriptors but is still able to export all the required
> > > > generic information, please send a proposal to the list :)
> > > > 
> > > 
> > > Hi,
> > > 
> > > The other obvious possibility is a pure ioctl interface.  To match what
> > > this proposal is trying to describe, plus the runtime interfaces, we'd
> > > need something like:
> > 
> > Right, this also seems a reasonable possibility to me, depending on
> > the details.
> > 
> > > /* :0 - PCI devices, :1 - Devices path device, 63:2 - reserved */
> > > #define VFIO_DEVICE_GET_FLAGS                     _IOR(, , u64)
> > > 
> > > 
> > > /* Return number of mmio/iop/config regions.
> > >  * For PCI this is always 8 (BAR0-5 + ROM + Config) */
> > > #define VFIO_DEVICE_GET_NUM_REGIONS               _IOR(, , int)
> > > 
> > > /* Return length for region index (may be zero) */
> > > #define VFIO_DEVICE_GET_REGION_LEN                _IOWR(, , u64)
> > > 
> > > /* Return flags for region index
> > >  * :0 - mmap'able, :1 - read-only, 63:2 - reserved */
> > > #define VFIO_DEVICE_GET_REGION_FLAGS              _IOR(, , u64)
> > > 
> > > /* Return file offset for region index */
> > > #define VFIO_DEVICE_GET_REGION_OFFSET             _IOWR(, , u64)
> > 
> > The above 3 can be folded into one "getregioninfo" call.
> 
> Yep, and the phys addr one below.  We can use a flags bit to indicate
> whether it's valid.
> 
> > > /* Return physical address for region index - not implemented for PCI */
> > > #define VFIO_DEVICE_GET_REGION_PHYS_ADDR  _IOWR(, , u64)
> > > 
> > > 
> > > 
> > > /* Return number of IRQs (Not including MSI/MSI-X for PCI) */
> > > #define VFIO_DEVICE_GET_NUM_IRQ                   _IOR(, , int)
> > > 
> > > /* Set IRQ eventfd for IRQ index, arg[0] = index, arg[1] = fd */
> > > #define VFIO_DEVICE_SET_IRQ_EVENTFD               _IOW(, , int)
> > > 
> > > /* Unmask IRQ index */
> > > #define VFIO_DEVICE_UNMASK_IRQ                    _IOW(, , int)
> > > 
> > > /* Set unmask eventfd for index, arg[0] = index, arg[1] = fd */
> > > #define VFIO_DEVICE_SET_UNMASK_IRQ_EVENTFD        _IOW(, , int)
> > > 
> > > 
> > > /* Return the device tree path for type/index into the user
> > >  * allocated buffer */
> > > struct dtpath {
> > > >   u32     type;   /* 0 = region, 1 = IRQ */
> > >   u32     index;
> > >   u32     buf_len;
> > >   char    *buf;
> > > };
> > > #define VFIO_DEVICE_GET_DTPATH                    _IOWR(, , struct dtpath)
> > > 
> > > /* Return the device tree index for type/index */
> > > struct dtindex {
> > > >   u32     type;   /* 0 = region, 1 = IRQ */
> > >   u32     index;
> > >   u32     prop_type;
> > >   u32     prop_index;
> > > };
> > > > #define VFIO_DEVICE_GET_DTINDEX                   _IOWR(, , struct dtindex)
> > 
> > I think those need some work, but that doesn't impinge on the core
> > semantics.
> > 
> > > /* Reset the device */
> > > #define VFIO_DEVICE_RESET                 _IO(, ,)
> > > 
> > > 
> > > /* PCI MSI setup, arg[0] = #, arg[1-n] = eventfds */
> > > #define VFIO_DEVICE_PCI_SET_MSI_EVENTFDS  _IOW(, , int)
> > > #define VFIO_DEVICE_PCI_SET_MSIX_EVENTFDS _IOW(, , int)
> > 
> > > Why does this need separate controls, rather than just treating MSIs
> > as interrupts beyond the first for PCI devices?
> 
> Well, we could say that PCI will always report 3 for
> VFIO_DEVICE_GET_NUM_IRQ where 0 = legacy, 1 = MSI, 2 = MSI-X.  ioctls on
> unimplemented IRQs will fail, UNMASK* ioctls on non-level triggered
> interrupts will fail, and the parameter to SET_IRQ_EVENTFD becomes
> arg[0] = index, arg[1] = count, arg[2-n] = fd.  Maybe we'd then have a
> GET_IRQ_INFO that takes something like:
> 
> struct vfio_irq_info {
>       int index;
>       unsigned int count;
>       u64 flags;
> #define VFIO_IRQ_INFO_FLAGS_LEVEL     (1 << 0)
> };
> 
> count would be 0 on PCI if the type of interrupt isn't supported.
> Better?  Thanks,
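
To make the argument layout above concrete, here is a minimal userspace-side sketch. It assumes the index mapping described above (0 = legacy, 1 = MSI, 2 = MSI-X) and the arg[0] = index, arg[1] = count, arg[2-n] = fd layout; the enum names and both helper functions are hypothetical and exist only for illustration, and no ioctl is actually issued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names for the three PCI IRQ indexes described above. */
enum { PCI_IRQ_LEGACY = 0, PCI_IRQ_MSI = 1, PCI_IRQ_MSIX = 2, PCI_NUM_IRQ = 3 };

struct vfio_irq_info {
    int index;
    unsigned int count;   /* 0 if this interrupt type is unsupported */
    uint64_t flags;
#define VFIO_IRQ_INFO_FLAGS_LEVEL (1 << 0)
};

/* UNMASK* only makes sense for implemented, level-triggered interrupts. */
static int irq_can_unmask(const struct vfio_irq_info *info)
{
    return info->count > 0 && (info->flags & VFIO_IRQ_INFO_FLAGS_LEVEL) != 0;
}

/* Build the int array for SET_IRQ_EVENTFD:
 * arg[0] = index, arg[1] = count, arg[2..] = eventfds.
 * Returns a malloc'ed array (caller frees) or NULL on error. */
static int *build_set_irq_eventfd_arg(int index, const int *fds, int count,
                                      size_t *out_len)
{
    if (count < 0 || (count > 0 && fds == NULL) || out_len == NULL)
        return NULL;

    size_t n = (size_t)count + 2;
    int *arg = malloc(n * sizeof(*arg));
    if (arg == NULL)
        return NULL;

    arg[0] = index;
    arg[1] = count;
    for (int i = 0; i < count; i++)
        arg[2 + i] = fds[i];

    *out_len = n;
    return arg;
}
```

A caller would first query the info for each index, skip any with count == 0, and pass the resulting array to the ioctl once its number is defined.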

FYI for all, I've pushed a branch out to github with the current state
of the re-write.  You can find it here

https://address@hidden/awilliam/linux-vfio.git
git://github.com/awilliam/linux-vfio.git

The vfio-ng branch is the latest.  The framework is quite a bit more
solid now, so I figure it's time to move into the device and iommu
implementation.  vfio-pci is now its own module that depends on vfio, and I
expect vfio-dt to be implemented the same way.  The PCI ioctl is in place
and supports the interface described above.  I'll continue to port
pieces of the old vfio code into this new infrastructure.  Comments and
patches welcome.  Thanks,

Alex
