Re: [Qemu-devel] Towards an ivshmem 2.0?
From: Markus Armbruster
Subject: Re: [Qemu-devel] Towards an ivshmem 2.0?
Date: Mon, 23 Jan 2017 15:19:23 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux)
Jan Kiszka <address@hidden> writes:
> Hi,
>
> some of you may know that we are using a shared memory device similar to
> ivshmem in the partitioning hypervisor Jailhouse [1].
>
> We started as being compatible to the original ivshmem that QEMU
> implements, but we quickly deviated in some details, and in the recent
> months even more. Some of the deviations are related to making the
> implementation simpler. The new ivshmem takes <500 LoC - Jailhouse is
> aiming at safety critical systems and, therefore, a small code base.
> Other changes address deficits in the original design, like missing
> life-cycle management.
Compare: hw/misc/ivshmem.c ~1000 SLOC, measured with sloccount.
>
> Now the question is if there is interest in defining a common new
> revision of this device and maybe also of some protocols used on top,
> such as virtual network links. Ideally, this would enable us to share
> Linux drivers. We will definitely go for upstreaming at least a network
> driver such as [2], a UIO driver and maybe also a serial port/console.
>
> I've attached a first draft of the specification of our new ivshmem
> device. A working implementation can be found in the wip/ivshmem2 branch
> of Jailhouse [3], the corresponding ivshmem-net driver in [4].
>
> Deviations from the original design:
>
> - Only two peers per link
Uh, define "link".
> This simplifies the implementation and also the interfaces (think of
> life-cycle management in a multi-peer environment). Moreover, we do
> not have an urgent use case for multiple peers, and thus also no
> reference for a protocol that could be used in such setups. If someone
> else happens to share such a protocol, it would be possible to discuss
> potential extensions and their implications.
>
> - Side-band registers to discover and configure shared memory regions
>
> This was one of the first changes: We removed the memory regions from
> the PCI BARs and gave them special configuration space registers. By
> now, these registers are embedded in a PCI capability. The reasons are
> that Jailhouse does not allow relocating the regions in guest address
> space (but other hypervisors may if they like) and that we now have
> up to three of them.
I'm afraid I don't quite understand the change, nor the rationale. I
guess I could figure out the former by studying the specification.
> - Changed PCI base class code to 0xff (unspecified class)
Changed from 0x5 (memory controller).
> This allows us to define our own sub classes and interfaces. That is
> now exploited for specifying the shared memory protocol the two
> connected peers should use. It also allows the Linux drivers to match
> on that.
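For reference, the PCI class code is a 24-bit value composed of base class, subclass, and programming interface; with base class 0xff ("unspecified") the lower two bytes are free for the device to encode the shared memory protocol. A trivial sketch of the encoding (the concrete subclass/interface values below are invented):

```c
#include <stdint.h>

/*
 * The 24-bit PCI class code: base class | subclass | programming
 * interface.  Base class 0xff is "unspecified", so the subclass and
 * prog-if bytes can identify the protocol the two peers should speak.
 * The value 0x01 used in the test is made up for illustration.
 */
#define IVSHMEM2_BASE_CLASS 0xffu

static inline uint32_t pci_class_code(uint8_t base, uint8_t sub,
                                      uint8_t progif)
{
    return ((uint32_t)base << 16) | ((uint32_t)sub << 8) | progif;
}
```

A Linux driver could then match on the full 24-bit value (e.g. via `PCI_DEVICE_CLASS` with an all-ones mask) and bind only to devices advertising its protocol.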
>
> - INTx interrupt support is back
>
> This is needed on target platforms without MSI controllers, i.e.
> without the required guest support. Namely some PCI-less ARM SoCs
> required the reintroduction. While doing this, we also took care of
> keeping the MMIO registers free of privileged controls so that a
> guest OS can map them safely into a guest userspace application.
So you need interrupt capability. Current upstream ivshmem requires a
server such as the one in contrib/ivshmem-server/. What about yours?
The interrupt feature enables me to guess a definition of "link": A and
B are peers of the same link if they can interrupt each other.
Does your ivshmem2 support interrupt-less operation similar to
ivshmem-plain?
> And then there are some extensions of the original ivshmem:
>
> - Multiple shared memory regions, including unidirectional ones
>
> It is now possible to expose up to three different shared memory
> regions: The first one is read/writable for both sides. The second
> region is read/writable for the local peer and read-only for the
> remote peer (useful for output queues). And the third is read-only
> locally but read/writable remotely (ie. for input queues).
> Unidirectional regions prevent the receiver of some data from
> interfering with the sender while it is still building the message, a
> property that is useful not only for safety critical communication,
> we are sure.
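The access rights described above can be summarized in a few lines; the region names below are mine, and note that each peer sees the regions from its own perspective, so one peer's output region is the other's input region:

```c
/*
 * Writability of the three shared memory regions as described in the
 * proposal.  Names are invented; indices follow the order in the text:
 * 0 = shared r/w, 1 = local output (remote read-only), 2 = local input
 * (remote writable).
 */
enum ivshmem2_region { IVSHMEM2_RW, IVSHMEM2_OUTPUT, IVSHMEM2_INPUT };

static int region_writable(enum ivshmem2_region r, int by_local_peer)
{
    switch (r) {
    case IVSHMEM2_RW:     return 1;              /* both peers may write */
    case IVSHMEM2_OUTPUT: return by_local_peer;  /* only the local peer */
    case IVSHMEM2_INPUT:  return !by_local_peer; /* only the remote peer */
    }
    return 0;
}
```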
>
> - Life-cycle management via local and remote state
>
> Each device can now signal its own state in the form of a value to
> the remote side, which triggers an event there.
How are "events" related to interrupts?
> Moreover, state changes
> done by the hypervisor to one peer are signalled to the other side.
> And we introduced a write-to-shared-memory mechanism for the
> respective remote state so that guests do not have to issue an MMIO
> access in order to check the state.
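A minimal sketch of how that write-to-shared-memory mechanism might look from a guest's point of view, assuming the hypervisor mirrors the remote peer's state value into a well-known location in shared memory (structure and state encoding below are assumptions, not taken from the specification):

```c
#include <stdint.h>

/*
 * Hypothetical: the hypervisor mirrors the remote peer's state into a
 * fixed location in shared memory, so the local guest can poll it with
 * an ordinary read instead of an MMIO access.  The state values are
 * invented for illustration.
 */
enum peer_state { PEER_RESET = 0, PEER_INIT = 1, PEER_READY = 2 };

struct ivshmem2_state_mirror {
    volatile uint32_t remote_state;  /* updated by hypervisor on change */
};

static int peer_is_ready(const struct ivshmem2_state_mirror *m)
{
    return m->remote_state >= PEER_READY;
}
```

The interrupt on state change would then only be needed to wake a sleeping guest; a busy peer can cheaply re-check the mirrored value.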
>
> So, this is our proposal. It would be great to hear some opinions on
> whether you see value in adding support for such an "ivshmem 2.0"
> device to QEMU as well and expanding its ecosystem towards Linux
> upstream, maybe also DPDK again. If you see problems in the new
> design w.r.t. what QEMU provides so
> far with its ivshmem device, let's discuss how to resolve them. Looking
> forward to any feedback!
My general opinion on ivshmem is well-known, but I repeat it for the
record: merging it was a mistake, and using it is probably a mistake. I
detailed my concerns in "Why I advise against using ivshmem"[*].
My philosophical concerns remain. Perhaps you can assuage them.
Only some of my practical concerns have since been addressed. In part
by myself, because having a flawed implementation of a bad idea is
strictly worse than the same with flaws corrected as far as practical.
But even today, docs/specs/ivshmem-spec.txt is a rather depressing read.
However, there's one thing that's still worse than a more or less flawed
implementation of a bad idea: two implementations of a bad idea. Could
ivshmem2 be done in a way that permits *replacing* ivshmem?
> [1] https://github.com/siemens/jailhouse
> [2] http://git.kiszka.org/?p=linux.git;a=blob;f=drivers/net/ivshmem-net.c;h=0e770ca293a4aca14a55ac0e66871b09c82647af;hb=refs/heads/queues/jailhouse
> [3] https://github.com/siemens/jailhouse/commits/wip/ivshmem2
> [4] http://git.kiszka.org/?p=linux.git;a=shortlog;h=refs/heads/queues/jailhouse-ivshmem2
[*] http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg02968.html