From: Scott Wood
Subject: Re: [Qemu-devel] RFC: vfio / device assignment -- layout of device fd files
Date: Mon, 29 Aug 2011 18:14:29 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc15 Thunderbird/3.1.10

On 08/29/2011 05:46 PM, Alex Williamson wrote:
> On Mon, 2011-08-29 at 16:58 -0500, Scott Wood wrote:
>> On 08/29/2011 02:51 PM, Alex Williamson wrote:
>>> On Mon, 2011-08-29 at 16:51 +0000, Yoder Stuart-B08248 wrote:
>>>> The device info records following the file header have the following
>>>> record types each with content encoded in a record specific way:
>>>>
>>>> REGION - describes an addressable address range for the device
>>>> DTPATH - describes the device tree path for the device
>>>> DTINDEX - describes the index into the related device tree
>>>> property (reg,ranges,interrupts,interrupt-map)
>>>
>>> I don't quite understand if these are physical or virtual.
>>
>> If what are physical or virtual?
>
> Can you give an example of a path vs an index? I don't understand
> enough about these to ask a useful question about what they're
> describing.
You'd have both path and index.

Example, for this tree:

/ {
        ...
        foo {
                ...
                bar {
                        reg = <0x1000 64 0x1800 64>;
                        ranges = <0 0x20000 0x10000>;
                        ...
                        child {
                                reg = <0x100 0x100>;
                                ...
                        };
                };
        };
};

There would be 4 regions if you bind to /foo/bar:

// this is 64 bytes at 0x1000
DTPATH "/foo/bar"
DTINDEX prop_type=REG prop_index=0

// this is 64 bytes at 0x1800
DTPATH "/foo/bar"
DTINDEX prop_type=REG prop_index=1

// this is 64K at 0x20000
DTPATH "/foo/bar"
DTINDEX prop_type=RANGES prop_index=0

// this is 256 bytes at 0x20100
DTPATH "/foo/bar/child"
DTINDEX prop_type=REG prop_index=0

Both ranges and the child reg are needed, since ranges could be a simple
"ranges;" that passes everything with no translation, and child nodes
could be absent-but-implied in some other cases (such as when they
represent PCI devices which can be probed -- we still need to map the
ranges that correspond to PCI controller windows).
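
As a rough illustration of how the four regions above fall out of the example tree, here is a minimal Python sketch. It is not a device tree parser: the TREE dictionary, translate() and regions_for() are hypothetical names made up for this example, and it assumes #address-cells = 1 and #size-cells = 1, that the addresses in /foo/bar's own reg need no further translation, and that only direct children are considered.

```python
# Hypothetical in-memory model of the example tree (not a real DT parser).
# Assumes #address-cells = 1 and #size-cells = 1 everywhere.
TREE = {
    "/foo/bar": {
        "reg": [(0x1000, 64), (0x1800, 64)],     # (address, size)
        "ranges": [(0x0, 0x20000, 0x10000)],     # (child base, parent base, size)
    },
    "/foo/bar/child": {
        "reg": [(0x100, 0x100)],
    },
}


def translate(addr, ranges):
    """Map a child bus address into the parent address space via ranges."""
    for child_base, parent_base, size in ranges:
        if child_base <= addr < child_base + size:
            return parent_base + (addr - child_base)
    raise ValueError(f"address {addr:#x} is not covered by ranges")


def regions_for(bound_path, tree=TREE):
    """Return (dtpath, prop_type, prop_index, address, size) tuples."""
    node = tree[bound_path]
    ranges = node.get("ranges", [])
    out = []
    # The bound node's own reg entries.
    for i, (addr, size) in enumerate(node.get("reg", [])):
        out.append((bound_path, "REG", i, addr, size))
    # One region per ranges entry, at its parent-side address.
    for i, (_child_base, parent_base, size) in enumerate(ranges):
        out.append((bound_path, "RANGES", i, parent_base, size))
    # reg entries of direct children, translated through the parent's ranges.
    prefix = bound_path + "/"
    for path, child in tree.items():
        if path.startswith(prefix) and "/" not in path[len(prefix):]:
            for i, (addr, size) in enumerate(child.get("reg", [])):
                out.append((path, "REG", i, translate(addr, ranges), size))
    return out


for path, ptype, index, addr, size in regions_for("/foo/bar"):
    print(f'DTPATH "{path}" DTINDEX prop_type={ptype} prop_index={index} '
          f"addr={addr:#x} size={size:#x}")
```

The child's reg lands at 0x20100 because its address 0x100 is offset from the child base 0 of the ranges entry, which maps to 0x20000 in the parent.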
>>>> INTERRUPT - describes an interrupt for the device
>>>> PCI_CONFIG_SPACE - describes config space for the device
>>>
>>> I would have expected this to be a REGION with a property of
>>> PCI_CONFIG_SPACE.
>>
>> Could be, if physical address is made optional.
>
> Or physical address is also a property, aka sub-region.
A subrecord of REGION is fine with me.
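
To make the subrecord idea concrete, here is a small Python sketch of a REGION that carries optional typed subrecords. It does not reflect the encoding in the RFC: the Region class, the subrecord keys "PHYS_ADDR" and "PCI_CONFIG_SPACE", and the sizes used are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Region:
    """Hypothetical REGION record: a size plus optional typed subrecords."""
    size: int
    subrecords: Dict[str, object] = field(default_factory=dict)

    def phys_addr(self) -> Optional[int]:
        # None when the region carries no physical-address subrecord.
        return self.subrecords.get("PHYS_ADDR")

    def is_pci_config_space(self) -> bool:
        return "PCI_CONFIG_SPACE" in self.subrecords


# An MMIO region that exposes its physical address.
mmio = Region(size=64, subrecords={"PHYS_ADDR": 0x1000})
# A config-space region with no physical address at all.
cfg = Region(size=256, subrecords={"PCI_CONFIG_SPACE": True})

print(mmio.phys_addr(), cfg.phys_addr(), cfg.is_pci_config_space())
```

With this shape, physical address stays optional and config space needs no record type of its own.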
>>> Would we only need to expose phys addr for 1:1 mapping requirements?
>>> I'm not sure why we'd care to expose this otherwise.
>>
>> It's more important for non-PCI, where it avoids the need for userspace
>> to parse the device tree to find the guest address (we'll usually want
>> 1:1), or to consolidate pages shared by multiple regions. It could be
>> nice for debugging, as well.
>
> So the device tree path is ripped straight from the system, so it's the
> actual 1:1, matching physical hardware, path.
Yes.
>>> Even for non-PCI we need to
>>> know if the region is pio/mmio32/mmio64/prefetchable/etc.
>>
>> Outside of PCI, what standardized form would you put such information
>> in? Where would the kernel get this information? What does
>> mmio32/mmio64 mean in this context?
>
> I could imagine a platform device described by ACPI that might want to
> differentiate. The physical device doesn't get moved of course, but
> guest drivers might care how the device is described if we need to
> rebuild those ACPI tables. ACPI might even be a good place to leverage
> these data structures... /me ducks.
ACPI info could be another subrecord type, but in the device tree
system-bus case we generally don't have this information at the generic
infrastructure level. Drivers are expected to know how their devices'
regions should be mapped.
>>> BAR index could really just translate to a REGION instance number.
>>
>> How would that work if you make non-BAR things (such as config space)
>> into regions?
>
> Put their instance numbers outside of the BAR region? We have a fixed
> REGION space on PCI, so we could just define BAR0 == instance 0, BAR1 ==
> instance 1... ROM == instance 6, CONFIG == instance 0xF (or 7).
Seems more awkward than just having each region say what it is. What do
you do to fill in the gaps?
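
To illustrate the gap question, here is a minimal Python sketch comparing the fixed numbering quoted above (BAR0..BAR5 = 0..5, ROM = 6, CONFIG = 0xF) with regions that each say what they are. FIXED_INSTANCE, fixed_layout() and self_described() are hypothetical names, and the device with only BAR0, BAR2 and config space is an invented example.

```python
# Fixed instance numbers as proposed in the quoted text.
FIXED_INSTANCE = {f"BAR{i}": i for i in range(6)}
FIXED_INSTANCE["ROM"] = 6
FIXED_INSTANCE["CONFIG"] = 0xF


def fixed_layout(present):
    """Place each present region at its fixed instance number; gaps are None."""
    slots = [None] * (max(FIXED_INSTANCE.values()) + 1)
    for name in present:
        slots[FIXED_INSTANCE[name]] = name
    return slots


def self_described(present):
    """Number regions densely and let each record carry its own type."""
    return [{"instance": i, "type": name} for i, name in enumerate(present)]


present = ["BAR0", "BAR2", "CONFIG"]
slots = fixed_layout(present)
print(len(slots), "slots,", slots.count(None), "of them empty")
print(self_described(present))
```

In the fixed scheme, userspace has to be told somehow which instance numbers are empty; in the self-described scheme there is nothing to fill in.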
-Scott