Re: [Qemu-arm] [Qemu-devel] [RFC 0/8] VIRTIO-IOMMU device
From: Jean-Philippe Brucker
Subject: Re: [Qemu-arm] [Qemu-devel] [RFC 0/8] VIRTIO-IOMMU device
Date: Wed, 7 Jun 2017 11:19:50 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
Hi Jason,
On 07/06/17 10:17, Jason Wang wrote:
> On 07/06/17 16:35, Eric Auger wrote:
>> This series implements the virtio-iommu device. This is a proof
>> of concept based on the virtio-iommu specification written by
>> Jean-Philippe Brucker [1]. This was tested with a guest using
>> the virtio-iommu driver [2] and exposed with a virtio-net-pci
>> using dma ops.
>>
>> The device gets instantiated using the "-device virtio-iommu-device"
>> option. It currently works with the ARM virt machine only, as the machine
>> must handle the DT binding between the virtio-mmio "iommu" node and
>> the PCI host bridge node. ACPI boot is not yet supported.
>>
>> This should allow starting some benchmarking activity against a purely
>> emulated IOMMU (especially the ARM SMMU).
>
> Yes, it would also be interesting to compare it with the Intel IOMMU. The
> core function is actually similar to a subset of the Intel one with caching
> mode (CM) enabled. Since each map and unmap requires a command, it would be
> very slow for dynamic mappings. I wonder whether we can do any optimization
> on this.
In general we will have to send the same number of map/unmap requests as
the number of invalidations needed for an emulated IOMMU such as the Intel
one (if I understand correctly, with CM there are invalidations on both map
and unmap, to avoid trapping the page tables). Using virtio allows us to
reduce the number of round-trips to the host by batching map/unmap
requests where possible. Adding vhost-iommu in the host could further
reduce the latency of map/unmap requests.
To actually reduce the number of requests, I see two possible
optimizations (loosely described in [1]), both requiring invasive changes.
* Relaxed (insecure) mode, where the guest batches unmap requests or
doesn't send them at all. Map will override existing mappings if
necessary. You end up sending far fewer unmap requests, but there is a
vulnerability window during which devices can access stale mappings, so
you have to trust your peripherals. I believe the x86 IOMMU drivers in
Linux already allow this.
* Page table handover, a new mode orthogonal to map/unmap. This
uses nested translation: the guest has one set of page tables for
gva->gpa and the host has one set for gpa->hpa. After setup, the guest
populates the page tables and only sends invalidation requests, no map
requests. I think that with the Intel IOMMU this would only be possible
with PASID traffic. But nested translation is inherently slower than
"classic" mode, so it might end up being overall slower than map/unmap if
there is a lot of TLB invalidation and thrashing. This mode is mostly
useful for SVM virtualization.
Thanks,
Jean