Re: [Qemu-devel] Towards an ivshmem 2.0?


From: Jan Kiszka
Subject: Re: [Qemu-devel] Towards an ivshmem 2.0?
Date: Mon, 16 Jan 2017 14:10:17 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

Hi Marc-André,

On 2017-01-16 13:41, Marc-André Lureau wrote:
> Hi
> 
> On Mon, Jan 16, 2017 at 12:37 PM Jan Kiszka <address@hidden
> <mailto:address@hidden>> wrote:
> 
>     Hi,
> 
>     some of you may know that we are using a shared memory device similar to
>     ivshmem in the partitioning hypervisor Jailhouse [1].
> 
>     We started out compatible with the original ivshmem that QEMU
>     implements, but we quickly deviated in some details, and even more so
>     in recent months. Some of the deviations are about making the
>     implementation simpler: the new ivshmem takes <500 LoC - Jailhouse is
>     aiming at safety-critical systems and, therefore, a small code base.
>     Other changes address deficits in the original design, like missing
>     life-cycle management.
> 
>     Now the question is whether there is interest in defining a common new
>     revision of this device and maybe also of some protocols used on top,
>     such as virtual network links. Ideally, this would enable us to share
>     Linux drivers. We will definitely go for upstreaming at least a network
>     driver such as [2], a UIO driver and maybe also a serial port/console.
> 
> 
> This sounds like duplicating efforts already underway with virtio and
> vhost-pci. Have you looked at Wei Wang's proposal?

I haven't followed it recently, but the original concept was about
introducing an IOMMU model into the picture, and that is complexity-wise
a no-go for us (we can do this whole thing in less than 500 lines; even
virtio itself is more complex). IIUC, the alternative to an IOMMU is
mapping the whole frontend VM memory into the backend VM - that is
security/safety-wise an absolute no-go.

> 
>     I've attached a first draft of the specification of our new ivshmem
>     device. A working implementation can be found in the wip/ivshmem2 branch
>     of Jailhouse [3], the corresponding ivshmem-net driver in [4].
> 
> 
> You don't have a QEMU branch, right?

Right, not yet. I would look into creating a QEMU device model if there is
serious interest.

>  
> 
> 
>     Deviations from the original design:
> 
>     - Only two peers per link
> 
> 
> Sounds sane; that's also what vhost-pci aims for, AFAIK.
>  
> 
>       This simplifies the implementation and also the interfaces (think of
>       life-cycle management in a multi-peer environment). Moreover, we do
>       not have an urgent use case for multiple peers, and thus also no
>       reference protocol that could be used in such setups. If someone
>       else happens to share such a protocol, it would be possible to discuss
>       potential extensions and their implications.
> 
>     - Side-band registers to discover and configure shared memory regions
> 
>       This was one of the first changes: We removed the memory regions from
>       the PCI BARs and gave them special configuration space registers. By
>       now, these registers are embedded in a PCI capability. The reasons are
>       that Jailhouse does not allow relocating the regions in guest address
>       space (but other hypervisors may, if they like) and that we now have
>       up to three of them.
> 
> 
>  Sorry, I can't comment on that.
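No problem. To make that part a bit more concrete anyway: the capability
basically carries a region count plus base address and size registers for up
to three regions. Purely as an illustration - the authoritative layout is in
the attached spec draft, and the names, types and field order below are
invented for this example - it could look roughly like this:

#include <stdint.h>

/* Illustrative sketch only, not the layout from the spec draft. */
struct ivshmem2_shmem_cap {
        uint8_t  cap_id;          /* vendor-specific capability ID (0x09) */
        uint8_t  cap_next;        /* offset of the next capability */
        uint8_t  cap_len;         /* length of this capability */
        uint8_t  num_regions;     /* 1..3 shared memory regions */
        uint32_t reserved;
        uint64_t region_addr[3];  /* guest-physical base addresses */
        uint64_t region_size[3];  /* region sizes in bytes */
} __attribute__((packed));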
> 
> 
>     - Changed PCI base class code to 0xff (unspecified class)
> 
>       This allows us to define our own subclasses and interfaces. That is
>       now exploited for specifying the shared memory protocol the two
>       connected peers should use. It also allows the Linux drivers to match
>       on that.
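For illustration of the driver side - the vendor/device IDs and the subclass
value below are placeholders for this example, not values from the spec
draft - matching in a Linux driver could then look like this:

#include <linux/module.h>
#include <linux/pci.h>

/* Placeholder IDs, for illustration only. */
static const struct pci_device_id ivshmem_net_ids[] = {
        {
                PCI_DEVICE(0x110a, 0x4106),     /* example vendor/device ID */
                .class = 0xff0100,      /* base class 0xff, example subclass 0x01 */
                .class_mask = 0xffff00, /* match base class and subclass only */
        },
        { 0 }
};
MODULE_DEVICE_TABLE(pci, ivshmem_net_ids);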
> 
> 
> Why not, but it worries me that you are going to reinvent protocols
> similar to those of the virtio devices, aren't you?

That partly follows from the desire to simplify the transport (pure shared
memory). With ivshmem-net, we are at least reusing virtio rings and will
try to do this with the new (and faster) virtio ring format as well.
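Just to illustrate what I mean by reusing the rings - the details below are
simplified assumptions, the ivshmem-net driver in [4] is the authoritative
code - the driver can lay out a classic vring directly in its output region:

#include <linux/cache.h>
#include <linux/virtio_ring.h>

/* Sketch: place the TX descriptor ring in the local output region; the
 * peer sees the very same memory read-only through its input region. */
static void ivshmem2_setup_tx_ring(struct vring *vr, void *out_region,
                                   unsigned int num)
{
        /* num must be a power of two, as for any virtio ring */
        vring_init(vr, num, out_region, SMP_CACHE_BYTES);
        /* vr->desc, vr->avail and vr->used now point into shared memory */
}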

>  
> 
>     - INTx interrupt support is back
> 
>       This is needed on target platforms without MSI controllers, i.e.
>       without the required guest support. Namely, some PCI-less ARM SoCs
>       required the reintroduction. While doing this, we also took care of
>       keeping the MMIO registers free of privileged controls so that a
>       guest OS can safely map them into a guest userspace application.
> 
> 
> Right, it's not completely removed from ivshmem QEMU upstream, although
> it should probably be allowed to set up a doorbell-ivshmem with msi=off
> (this may be quite trivial to add back).
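Right. On the guest driver side the fallback is cheap as well; roughly like
this (just a sketch, not the actual driver code):

#include <linux/pci.h>

/* Sketch: prefer MSI-X/MSI, fall back to legacy INTx where no MSI
 * controller is available. Returns the Linux IRQ number or an error. */
static int ivshmem2_map_irq(struct pci_dev *pdev)
{
        int ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_ALL_TYPES);

        if (ret < 0)
                return ret;
        return pci_irq_vector(pdev, 0);
}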
>  
> 
>     And then there are some extensions of the original ivshmem:
> 
>     - Multiple shared memory regions, including unidirectional ones
> 
>       It is now possible to expose up to three different shared memory
>       regions: The first one is read/writable for both sides. The second
>       region is read/writable for the local peer and read-only for the
>       remote peer (useful for output queues). And the third is read-only
>       locally but read/writable remotely (i.e. for input queues).
>       Unidirectional regions prevent the receiver of some data from
>       interfering with the sender while it is still building the message,
>       a property that, we are sure, is useful not only for safety-critical
>       communication.
> 
> 
> Sounds like a good idea, and something we may want in virtio too
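Good. To summarize the access matrix in driver terms - the region numbering
here is just for this illustration, the spec draft defines the actual
indices:

/* Illustrative numbering only. */
enum ivshmem2_region {
        IVSHMEM2_REGION_COMMON = 0,     /* read/write for both peers */
        IVSHMEM2_REGION_OUTPUT = 1,     /* local: read/write, remote: read-only */
        IVSHMEM2_REGION_INPUT  = 2,     /* local: read-only, remote: read/write */
};

A TX queue then naturally lives in the output region, an RX queue in the
input region.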
> 
> 
>     - Life-cycle management via local and remote state
> 
>       Each device can now signal its own state in the form of a value to
>       the remote side, which triggers an event there. Moreover, state
>       changes done by the hypervisor to one peer are signalled to the other
>       side. And we introduced a write-to-shared-memory mechanism for the
>       respective remote state so that guests do not have to issue an MMIO
>       access in order to check the state.
> 
> 
> There is also ongoing work to better support disconnect/reconnect in
> virtio.
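Good to know, we will keep an eye on that. For the state mechanism described
above, the guest side boils down to something like the following sketch -
the register offset, the state values and the shared-memory location are
assumptions for this example, not taken from the spec draft:

#include <linux/compiler.h>
#include <linux/io.h>
#include <linux/types.h>

/* Example state values; the actual encoding is protocol-specific. */
enum ivshmem2_state {
        IVSHMEM2_STATE_RESET = 0,
        IVSHMEM2_STATE_INIT  = 1,
        IVSHMEM2_STATE_READY = 2,
};

/* Announce the local state; this triggers an event on the remote side. */
static void ivshmem2_set_state(void __iomem *regs, u32 state)
{
        writel(state, regs + 0x10);     /* hypothetical "local state" register */
}

/* Check the remote state via its shared-memory mirror - no MMIO access,
 * and hence no VM exit, is needed for this. */
static bool ivshmem2_peer_ready(const u32 *remote_state)
{
        return READ_ONCE(*remote_state) == IVSHMEM2_STATE_READY;
}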
>  
> 
> 
>     So, this is our proposal. It would be great to hear some opinions on
>     whether you see value in adding support for such an "ivshmem 2.0"
>     device to QEMU as well and expanding its ecosystem towards Linux
>     upstream, maybe also DPDK again. If you see problems in the new design
>     with respect to what QEMU provides so far with its ivshmem device,
>     let's discuss how to resolve them. Looking forward to any feedback!
> 
> 
> My feeling is that ivshmem is not being actively developed in QEMU, but
> rather virtio-based solutions are (vhost-pci for vm2vm).

As pointed out, for us it's most important to keep the design simple -
even at the price of "reinventing" some drivers for upstream (at least,
we do not need two sets of drivers because our interface is fully
symmetric). I don't yet see how vhost-pci could achieve the same, but
I'm open to learning more!

Thanks,
Jan

> 
>     Jan
> 
>     [1] https://github.com/siemens/jailhouse
>     [2]
>     
> http://git.kiszka.org/?p=linux.git;a=blob;f=drivers/net/ivshmem-net.c;h=0e770ca293a4aca14a55ac0e66871b09c82647af;hb=refs/heads/queues/jailhouse
>     [3] https://github.com/siemens/jailhouse/commits/wip/ivshmem2
>     [4]
>     
> http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/jailhouse-ivshmem2
> 
>     --
>     Siemens AG, Corporate Technology, CT RDA ITP SES-DE
>     Corporate Competence Center Embedded Linux
> 
> -- 
> Marc-André Lureau

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux


