qemu-devel

Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device


From: Stuart Yoder
Subject: Re: [Qemu-devel] RFC [v2]: vfio / device assignment -- layout of device fd files
Date: Mon, 26 Sep 2011 14:57:31 -0500

On Mon, Sep 26, 2011 at 2:51 AM, David Gibson
<address@hidden> wrote:
> On Fri, Sep 09, 2011 at 08:11:54AM -0500, Stuart Yoder wrote:
>> Based on the discussions over the last couple of weeks
>> I have updated the device fd file layout proposal and
>> tried to specify it a bit more formally.
>>
>> ===============================================================
>>
>> 1.  Overview
>>
>>   This specification describes the layout of device files
>>   used in the context of vfio, which gives user space
>>   direct access to I/O devices that have been bound to
>>   vfio.
>>
>>   When a device fd is opened and read, offset 0x0 contains
>>   a fixed sized header followed by a number of variable length
>>   records that describe different characteristics
>>   of the device-- addressable regions, interrupts, etc.
>>
>>   0x0  +-------------+-------------+
>>        |         magic             | u32  // identifies this as a vfio device file
>>        +---------------------------+      // and identifies the type of bus
>>        |         version           | u32  // specifies the version of this
>>        +---------------------------+
>>        |         flags             | u32  // encodes any flags
>>        +---------------------------+
>>        |  dev info record 0        |
>>        |    type                   | u32   // type of record
>>        |    rec_len                | u32   // length in bytes of record
>>        |                           |          (including record header)
>>        |    flags                  | u32   // type specific flags
>>        |    ...content...          |       // record content, which could
>>        +---------------------------+       // include sub-records
>>        |  dev info record 1        |
>>        +---------------------------+
>>        |  dev info record N        |
>>        +---------------------------+
>
> I really should have chimed in on this earlier, but I've been very
> busy.
>
> Um, not to put too fine a point on it, this is madness.
>
> Yes, it's very flexible and can thereby cover a very wide range of
> cases.  But it's much, much too complex.  Userspace has to parse a
> complex, multilayered data structure, with variable length elements
> just to get an address at which to do IO.  I can pretty much guarantee
> that if we went with this, most userspace programs using this
> interface would just ignore this metadata and directly map the
> offsets at which they happen to know the kernel will put things for
> the type of device they care about.
>
> _At least_ for PCI, I think the original VFIO layout of each BAR at a
> fixed, well known offset is much better.  Despite its limitations,
> just advertising a "device type" ID which describes one of a few fixed
> layouts would be preferable to this.  I'm still hoping that we can do
> a bit better than that.  But we should try really hard to at the very
> least force the metadata into a simple array of resources each with a
> fixed size record describing it, even if it means some space wastage
> with occasionally-used fields.  Anything more complex than that and
> userspace is just never going to use it properly.

So, is your issue really the variable length nature of what was
proposed?
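
For reference, here is a minimal sketch of what userspace would have to do to walk the variable-length records in the layout quoted above. The struct and function names are hypothetical; the proposal only gives the field order and widths. The sketch also assumes that the records run to the end of the buffer read from offset 0x0 and that the fields are in host byte order, neither of which the proposal states.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; only the field order and u32 widths come from the proposal. */
struct vfio_file_hdr {
    uint32_t magic;
    uint32_t version;
    uint32_t flags;
};

struct vfio_rec_hdr {
    uint32_t type;
    uint32_t rec_len;   /* length in bytes, including this record header */
    uint32_t flags;
};

/*
 * Count the records of a given type in a buffer read from offset 0x0
 * of the device fd. Assumes records fill the buffer up to 'len'.
 * Returns -1 if the buffer is truncated or a rec_len is inconsistent.
 */
int count_records(const uint8_t *buf, size_t len, uint32_t type)
{
    size_t off = sizeof(struct vfio_file_hdr);
    int n = 0;

    if (len < off)
        return -1;

    while (off < len) {
        struct vfio_rec_hdr rec;

        if (len - off < sizeof(rec))
            return -1;                      /* truncated record header */
        memcpy(&rec, buf + off, sizeof(rec));

        if (rec.rec_len < sizeof(rec) || rec.rec_len > len - off)
            return -1;                      /* rec_len too small or past the end */

        if (rec.type == type)
            n++;
        off += rec.rec_len;                 /* skip content and any sub-records */
    }
    return n;
}
```

Even this flat walk needs three bounds checks per record, and it does not yet descend into sub-records.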

I don't think it would be that hard to make the different resources
fixed length.   I think we have 2 types of resources now-- address
regions and interrupts.

The only thing that gets a bit tricky is device tree paths, which
are obviously variable length.

We could put a description of all the resources in an array with
each element being something like 4KB??
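
A rough sketch of what such a fixed-size element could look like, purely as an illustration. Every name here is hypothetical, as are the type values and the offset and size fields; only the two resource types, the device tree path, and the roughly 4KB element size come from the discussion above. The path is stored in a padded, NUL-terminated buffer so that the element size stays constant.

```c
#include <stddef.h>
#include <stdint.h>

#define VFIO_RES_SIZE 4096u     /* "something like 4KB" per element */

/* Hypothetical type values for the two resource kinds mentioned above. */
enum vfio_res_type {
    VFIO_RES_REGION = 1,        /* addressable region */
    VFIO_RES_IRQ    = 2         /* interrupt */
};

/* Hypothetical fixed-size resource descriptor. */
struct vfio_resource {
    uint32_t type;              /* enum vfio_res_type */
    uint32_t flags;             /* type-specific flags */
    uint64_t offset;            /* assumed: where the resource sits in the device fd */
    uint64_t size;              /* assumed: length of the region, 0 for interrupts */
    char     dt_path[VFIO_RES_SIZE - 24];  /* NUL-terminated device tree path, zero padded */
};

_Static_assert(sizeof(struct vfio_resource) == VFIO_RES_SIZE,
               "resource descriptor must be exactly one element in size");

/*
 * File position of element 'index', given the offset 'base' at which the
 * array starts (the starting offset is not specified in this thread).
 */
uint64_t vfio_resource_pos(uint64_t base, uint32_t index)
{
    return base + (uint64_t)index * VFIO_RES_SIZE;
}
```

With a fixed element size, finding resource N is a single multiplication rather than a walk over every preceding record. The cost is mostly unused padding in each element.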

Stuart


