qemu-devel

Re: [Qemu-devel] device assignment for embedded Power


From: Alexander Graf
Subject: Re: [Qemu-devel] device assignment for embedded Power
Date: Fri, 1 Jul 2011 13:33:51 +0200

On 01.07.2011, at 13:16, Paul Brook wrote:

>> One feature we need for QEMU/KVM on embedded Power Architecture is the
>> ability to do passthru assignment of SoC I/O devices and memory.  An
>> important use case in embedded is creating static partitions--
>> taking physical memory and I/O devices (non-PCI) and partitioning
>> them between the host Linux and several virtual machines.   Things like
>> live migration would not be needed or supported in these types of
>> scenarios.
>> 
>> SoC devices do not sit on a probeable bus and there are no identifiers
>> like 01:00.0 with PCI that we can use to identify devices--  the host
>> Linux kernel is made aware of SoC I/O devices from nodes/properties in a
>> device tree structure passed at boot.   QEMU needs to generate a
>> device tree to pass to the guest as well with all the guest's virtual
>> and physical resources.  Today a number of mostly complete guest device
>> trees are kept under ./pc-bios in QEMU, but this is too static and
>> inflexible.
> 
> I doubt you're going to get generic passthrough of arbitrary devices working
> in a useful way. My expectation is that, at minimum, you'll need a
> bus-specific proxy device, i.e. create a virtual device in qemu that responds
> to the guest, and happens to poke at a host device rather than emulating
> things directly.
> 
> For busses like I2C this is fairly trivial - all communication with the
> device goes down a single well defined and easily proxied channel.  For more
> complex busses you end up having to emulate a lot more.  Basically you have
> to emulate everything that is different between the host and guest.  If that
> happens to include device specific state then you lose.
> 
> Using PCI devices as an example: The resources provided by the device are
> self-describing, so proxying those is fairly straightforward, and doesn't
> even require manual configuration.  However replicating the environment seen
> by the device is trickier as PCI devices can initiate memory accesses (i.e.
> bus-master).  For machines without an IOMMU this means passthrough in general
> can't work, and substantial amounts of device specific knowledge are required.
> You'd need to intercept and modify and/or proxy all data relating to DMA
> addresses.  In practice you need to emulate an IOMMU inside qemu (so you can
> determine the address space accessed by the device), and arrange for the host
> IOMMU to present the same virtual address space to the real device.

Well, for DMA the solution is reasonably simple. We have basically two choices:

  * run 1:1 mapped, so the guest physical address == host physical address,
    at which point DMA works, but everything is insecure
  * use an IOMMU
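
To make the difference between the two choices concrete, here is a minimal
Python sketch. It is purely illustrative and is not QEMU or KVM code: the
names translate_identity and ToyIommu, and the 4 KiB page size, are
assumptions made up for this example. It only shows that with a 1:1 mapping
the device's DMA address is used unchanged (so nothing stops it from reaching
any host memory), while an IOMMU translates each access through a table and
can refuse addresses that were never mapped.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def translate_identity(guest_phys):
    """1:1 mapped: guest physical address == host physical address."""
    return guest_phys


class ToyIommu:
    """Hypothetical page-granular IOMMU model, not a real hardware interface."""

    def __init__(self):
        self._pages = {}  # device-visible page number -> host physical page base

    def map(self, dev_addr, host_phys, size):
        if dev_addr % PAGE_SIZE or host_phys % PAGE_SIZE or size % PAGE_SIZE:
            raise ValueError("mappings must be page aligned")
        for off in range(0, size, PAGE_SIZE):
            self._pages[(dev_addr + off) // PAGE_SIZE] = host_phys + off

    def translate(self, dev_addr):
        page, offset = divmod(dev_addr, PAGE_SIZE)
        if page not in self._pages:
            # The access is blocked instead of reaching arbitrary host memory.
            raise PermissionError("DMA to unmapped address 0x%x" % dev_addr)
        return self._pages[page] + offset
```

With the IOMMU model, only the ranges that were explicitly mapped for the
guest are reachable by the device, which is what makes that choice the
secure one.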

We can easily limit it to those two cases. The more challenging part here (and
the main reason for the email) is the question of how to configure all of that
in a flexible, yet simple way. We can find the IO regions for devices from the
host device tree - no problem there.
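
As a rough illustration of finding IO regions from a device tree node, the
sketch below decodes a raw "reg" property in Python. A "reg" property is a
sequence of big-endian 32-bit cells forming (address, size) pairs, with the
number of cells per field given by the parent node's #address-cells and
#size-cells. The function name decode_reg is an assumption for this example,
and the sketch deliberately ignores translation through the parent's "ranges"
property, which a real implementation would need in order to get CPU physical
addresses.

```python
import struct


def decode_reg(reg_bytes, address_cells, size_cells):
    """Decode a raw 'reg' property into a list of (address, size) tuples.

    address_cells / size_cells come from the parent node's
    #address-cells / #size-cells properties.
    """
    if len(reg_bytes) % 4:
        raise ValueError("reg property is not a whole number of cells")
    cells = struct.unpack(">%dI" % (len(reg_bytes) // 4), reg_bytes)
    stride = address_cells + size_cells
    if stride == 0 or len(cells) % stride:
        raise ValueError("reg length does not match the cell counts")

    def join(parts):
        # Combine consecutive 32-bit cells into one integer, most significant first.
        value = 0
        for cell in parts:
            value = (value << 32) | cell
        return value

    regions = []
    for i in range(0, len(cells), stride):
        addr = join(cells[i:i + address_cells])
        size = join(cells[i + address_cells:i + stride])
        regions.append((addr, size))
    return regions
```

The addresses returned here are relative to the parent bus, so they are only
a starting point for deciding which host regions to map into the guest.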

But the real challenge is how to expose the device in the guest device tree,
especially when it comes to links between dt nodes, interrupt maps, etc. We
basically have three choices there:

  * take the host device tree pieces and modify them
  * provide device tree chunks for each device (manually or through qdev
    parameters)
  * use the device tree as machine config file and base everything on it
    (solves the linking problem)
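
To show where the linking problem comes from, here is a small Python sketch
of the first choice (take a host node and modify it). It models a device tree
node as a plain dict, which is an assumption made for illustration only, as
are the names export_node and phandle_map. It is not how QEMU represents
device trees. The point is that properties such as interrupt-parent hold
phandles that refer to other host nodes, so each one has to be rewritten to
point at the corresponding guest node.

```python
import copy


def export_node(host_node, phandle_map, guest_reg):
    """Build a guest copy of a host device tree node (hypothetical helper).

    host_node   -- dict of property name -> value for the host device
    phandle_map -- dict mapping host phandles to guest phandles
    guest_reg   -- the 'reg' value the device should have in the guest
    """
    guest_node = copy.deepcopy(host_node)  # never modify the host's view
    guest_node["reg"] = list(guest_reg)

    # Links between nodes are expressed as phandles; a host phandle is
    # meaningless in the guest tree, so it must be remapped or rejected.
    if "interrupt-parent" in guest_node:
        host_phandle = guest_node["interrupt-parent"]
        if host_phandle not in phandle_map:
            raise KeyError("no guest node for host phandle %r" % host_phandle)
        guest_node["interrupt-parent"] = phandle_map[host_phandle]
    return guest_node
```

For example, a host UART node whose interrupt-parent is the host interrupt
controller would be exported with interrupt-parent rewritten to the guest's
virtual interrupt controller. Interrupt maps and other cross-references would
need the same kind of treatment, which is why basing the whole machine on a
single device tree sidesteps the problem.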

The main question is which one would be the cleanest solution, and how it
would be implemented.


Alex



