[Qemu-devel] Towards an ivshmem 2.0?


From: Jan Kiszka
Subject: [Qemu-devel] Towards an ivshmem 2.0?
Date: Mon, 16 Jan 2017 09:36:51 +0100

Hi,

Some of you may know that we are using a shared memory device similar to
ivshmem in the partitioning hypervisor Jailhouse [1].

We started out compatible with the original ivshmem that QEMU
implements, but we quickly deviated in some details, and even more so in
recent months. Some of the deviations make the implementation simpler:
the new ivshmem takes fewer than 500 lines of code, since Jailhouse is
aiming at safety-critical systems and, therefore, a small code base.
Other changes address deficits in the original design, such as missing
life-cycle management.

Now the question is whether there is interest in defining a common new
revision of this device and maybe also of some protocols used on top,
such as virtual network links. Ideally, this would enable us to share
Linux drivers. We will definitely go for upstreaming at least a network
driver such as [2], a UIO driver and maybe also a serial port/console.

I've attached a first draft of the specification of our new ivshmem
device. A working implementation can be found in the wip/ivshmem2 branch
of Jailhouse [3], the corresponding ivshmem-net driver in [4].

Deviations from the original design:

- Only two peers per link

  This simplifies the implementation and also the interfaces (think of
  life-cycle management in a multi-peer environment). Moreover, we do
  not have an urgent use case for multiple peers, and thus no reference
  for a protocol that could be used in such setups. If someone else
  happens to share such a protocol, it would be possible to discuss
  potential extensions and their implications.

- Side-band registers to discover and configure shared memory regions

  This was one of the first changes: We removed the memory regions from
  the PCI BARs and gave them special configuration space registers. By
  now, these registers are embedded in a PCI capability. The reasons are
  that Jailhouse does not allow relocating the regions in the guest
  address space (though other hypervisors may, if they like) and that we
  now have up to three of them.

- Changed PCI base class code to 0xff (unspecified class)

  This allows us to define our own sub classes and interfaces. That is
  now exploited for specifying the shared memory protocol the two
  connected peers should use. It also allows the Linux drivers to match
  on that.

- INTx interrupt support is back

  This is needed on target platforms without MSI controllers, i.e.
  without the required guest support. In particular, some PCI-less ARM
  SoCs required the reintroduction. While doing this, we also took care
  to keep the MMIO registers free of privileged controls so that a
  guest OS can safely map them into a guest userspace application.
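
To make the discovery-related points above more concrete, here is a
minimal sketch in C of how a guest driver might recognize such a device
and locate a side-band capability. It is an illustration only, not code
from Jailhouse or the draft specification. The config space offsets and
the capability list layout are the standard PCI ones; the use of the
vendor-specific capability ID (0x09), the struct ivshmem_class type and
the function names are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard PCI configuration space offsets and bits. */
#define PCI_STATUS           0x06
#define PCI_STATUS_CAP_LIST  0x10  /* device implements a capability list */
#define PCI_CLASS_REVISION   0x08  /* bits 7:0 revision, bits 31:8 class code */
#define PCI_CAPABILITY_LIST  0x34  /* offset of the first capability */

/* Assumption: the region registers sit in a vendor-specific capability. */
#define PCI_CAP_ID_VNDR      0x09

/* From the proposal: base class 0xff (unspecified class). */
#define IVSHMEM_BASE_CLASS   0xff

/* Hypothetical decoded view of the 24-bit class code. */
struct ivshmem_class {
    uint8_t base_class;
    uint8_t sub_class;
    uint8_t interface;
};

/* Read a little-endian 32-bit value from a config space snapshot. */
uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
    return (uint32_t)cfg[off] |
           ((uint32_t)cfg[off + 1] << 8) |
           ((uint32_t)cfg[off + 2] << 16) |
           ((uint32_t)cfg[off + 3] << 24);
}

/* Split the class code into base class, sub class and interface. */
struct ivshmem_class ivshmem_decode_class(const uint8_t *cfg)
{
    uint32_t reg = cfg_read32(cfg, PCI_CLASS_REVISION);
    struct ivshmem_class c = {
        .base_class = (uint8_t)(reg >> 24),
        .sub_class  = (uint8_t)(reg >> 16),
        .interface  = (uint8_t)(reg >> 8),
    };
    return c;
}

/*
 * Walk the capability list of a 256-byte config space snapshot.
 * Returns the config space offset of the first capability with the
 * given ID, or 0 if there is none.
 */
uint8_t pci_find_cap(const uint8_t *cfg, uint8_t cap_id)
{
    uint16_t status = (uint16_t)(cfg[PCI_STATUS] | (cfg[PCI_STATUS + 1] << 8));
    uint8_t pos;
    int ttl = 48; /* bound the walk in case of a malformed, looping list */

    if (!(status & PCI_STATUS_CAP_LIST))
        return 0;

    /* The two low bits of capability pointers are reserved. */
    pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
    while (pos >= 0x40 && ttl-- > 0) {
        if (cfg[pos] == cap_id)
            return pos;
        pos = cfg[pos + 1] & 0xfc;
    }
    return 0;
}
```

A driver would first check that base_class equals IVSHMEM_BASE_CLASS,
then use the sub class and interface values to pick the protocol, and
finally read the region registers relative to the offset returned by
pci_find_cap(). The layout of those registers is defined in the attached
draft and is deliberately not reproduced here.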

And then there are some extensions of the original ivshmem:

- Multiple shared memory regions, including unidirectional ones

  It is now possible to expose up to three different shared memory
  regions: The first one is read/writable for both sides. The second
  region is read/writable for the local peer and read-only for the
  remote peer (useful for output queues). And the third is read-only
  locally but read/writable remotely (i.e. for input queues).
  Unidirectional regions prevent the receiver of some data from
  interfering with the sender while it is still building the message, a
  property that, we are sure, is useful not only for safety-critical
  communication.

- Life-cycle management via local and remote state

  Each device can now signal its own state in the form of a value to
  the remote side, which triggers an event there. Moreover, state changes
  done by the hypervisor to one peer are signalled to the other side.
  And we introduced a write-to-shared-memory mechanism for the
  respective remote state so that guests do not have to issue an MMIO
  access in order to check the state.
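
The access rules of the three regions and the shared-memory state check
can be summarized in a few lines of C. This is a sketch under stated
assumptions, not the interface from the draft: all type and function
names are hypothetical, and treating a state value of 0 as "peer not
ready" is a convention chosen for this example only.

```c
#include <stdbool.h>
#include <stdint.h>

/* The three region kinds from the proposal, named from the local view. */
enum ivshmem_region {
    IVSHMEM_RGN_RW,     /* read/writable for both peers */
    IVSHMEM_RGN_OUTPUT, /* read/writable locally, read-only remotely */
    IVSHMEM_RGN_INPUT,  /* read-only locally, read/writable remotely */
};

enum ivshmem_side {
    IVSHMEM_LOCAL,
    IVSHMEM_REMOTE,
};

/* Both sides may always read; only write access differs per region. */
bool ivshmem_can_write(enum ivshmem_region rgn, enum ivshmem_side side)
{
    switch (rgn) {
    case IVSHMEM_RGN_RW:
        return true;
    case IVSHMEM_RGN_OUTPUT:
        return side == IVSHMEM_LOCAL;
    case IVSHMEM_RGN_INPUT:
        return side == IVSHMEM_REMOTE;
    }
    return false;
}

/*
 * Read the remote peer's state from the memory location the device
 * writes it to. This is a plain load, so no MMIO access is needed.
 * The pointer is volatile because the value changes behind the
 * compiler's back.
 */
uint32_t ivshmem_read_remote_state(const volatile uint32_t *state_slot)
{
    return *state_slot;
}

/* Hypothetical convention for this sketch: state 0 means "not ready". */
bool ivshmem_remote_is_up(const volatile uint32_t *state_slot)
{
    return ivshmem_read_remote_state(state_slot) != 0;
}
```

A sender would build its message in the output region, where
ivshmem_can_write() is false for the remote side, and only then notify
the peer. Before doing so it can call ivshmem_remote_is_up() to check
that the peer has announced a non-zero state.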

So, this is our proposal. It would be great to hear your opinions on
whether you see value in adding support for such an "ivshmem 2.0" device
to QEMU as well and in expanding its ecosystem towards Linux upstream,
maybe also DPDK again. If you see problems in the new design with
respect to what QEMU provides so far with its ivshmem device, let's
discuss how to resolve them. Looking forward to any feedback!

Jan

[1] https://github.com/siemens/jailhouse
[2] http://git.kiszka.org/?p=linux.git;a=blob;f=drivers/net/ivshmem-net.c;h=0e770ca293a4aca14a55ac0e66871b09c82647af;hb=refs/heads/queues/jailhouse
[3] https://github.com/siemens/jailhouse/commits/wip/ivshmem2
[4] http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/jailhouse-ivshmem2

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux

Attachment: ivshmem-v2-specification.md
Description: Text Data

