qemu-devel

Re: [Qemu-devel] [PATCH v7 0/5] IOMMU: intel_iommu support map and unmap


From: Lan Tianyu
Subject: Re: [Qemu-devel] [PATCH v7 0/5] IOMMU: intel_iommu support map and unmap notifications
Date: Wed, 7 Dec 2016 22:04:57 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.5.0

On 2016-12-07 14:43, Peter Xu wrote:
On Wed, Dec 07, 2016 at 02:09:16PM +0800, Lan Tianyu wrote:
On 2016-12-06 18:59, Peter Xu wrote:
On Tue, Dec 06, 2016 at 04:27:39PM +0800, Lan Tianyu wrote:

[...]


A user-space driver (e.g. DPDK) can also enable/disable
IOVA for a device dynamically.

Could you provide more details (or any pointer) on how to do that? I
did try to find it myself; I see a VFIO_IOMMU_ENABLE ioctl, but it looks
like it is for ppc only.

No, I was just giving an example of what user space might do; I have not
researched it further. But since QEMU can already enable a device's IOVA,
other user applications should also be able to do that with the same
VFIO interface, right?

AFAIU we can't do that, at least on x86. We can use the vfio interface
to bind a group into a container, but we should not be able to
dynamically disable IOMMU protection. IIUC that needs to taint the kernel.

The only way I know is to probe vfio-pci in no-iommu mode; in that
case we have disabled the IOMMU, but we can never dynamically enable
it either.

Please correct me if I am wrong.


Actually, disabling a device's IOVA doesn't require disabling the
kernel's global DMA protection; it only clears the device's VT-d context
entry in the context table. Going through the IOMMU and VFIO code, I
find this happens when the VFIO_GROUP_UNSET_CONTAINER ioctl is called,
which QEMU does when destroying a VM or unplugging an assigned device.
Please help to double check.

Call trace:
__vfio_group_unset_container()
vfio_iommu_type1_detach_group()
iommu_detach_group()
dmar_remove_one_dev_info()
__dmar_remove_one_dev_info()
domain_context_clear()


The legacy KVM device assignment code also calls iommu_detach_device()
when deassigning a device.

From the device emulation point of view, we need to ensure correct
register emulation regardless of guest behavior.

Even if the context entry is cleared and invalidated, IMHO it does not
mean that we should be using GPA address space, nor do we need to put
it into guest physical address space. Instead, it simply means this
device cannot do any IO at that time. If IO comes, IOMMU should do
fault reporting to guest OS, which should be treated as error.

Yes, that looks right, and the pIOMMU will report a fault event if the
context entry is not present for an untranslated DMA request. This goes
back to the first gap: reporting pIOMMU fault events to the vIOMMU.

For disabling DMA translation by clearing the gcmd.TE bit, the assigned
device should still work after the clearing operation, and it is still
necessary to restore the GPA->HPA mapping, since we can't assume the
guest won't clear the bit after enabling DMA translation.

This may be a low-priority gap, since the Linux IOMMU driver doesn't
disable DMA translation frequently or dynamically. But we should also
consider this situation.
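
The two cases discussed so far (context entry cleared while translation
is on, versus gcmd.TE cleared globally) can be told apart with a small
toy model. This is only a sketch under simplifying assumptions: the
class and names (ToyVtd, Outcome) are invented for illustration, the
mappings are per-address rather than per-page, and it is not taken from
intel_iommu.c.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    PASSTHROUGH = "passthrough"  # TE=0: address is used as a GPA
    TRANSLATED = "translated"    # TE=1 and a mapping exists
    FAULT = "fault"              # TE=1 but no context entry or mapping


@dataclass
class ToyVtd:
    # Global translation-enable bit (gcmd.TE in the discussion above).
    te: bool = False
    # Per-device context entries: device id -> {iova: gpa}.
    contexts: dict = field(default_factory=dict)

    def attach(self, dev, mappings):
        """Install a context entry for dev with the given mappings."""
        self.contexts[dev] = dict(mappings)

    def detach(self, dev):
        """Clear dev's context entry; nothing else is touched."""
        self.contexts.pop(dev, None)

    def dma(self, dev, addr):
        """Return (outcome, resulting address or None) for one access."""
        if not self.te:
            return Outcome.PASSTHROUGH, addr
        table = self.contexts.get(dev)
        if table is None or addr not in table:
            return Outcome.FAULT, None
        return Outcome.TRANSLATED, table[addr]


iommu = ToyVtd(te=True)
iommu.attach("00:03.0", {0x1000: 0x8000})
mapped = iommu.dma("00:03.0", 0x1000)

iommu.detach("00:03.0")
after_detach = iommu.dma("00:03.0", 0x1000)

iommu.te = False
after_te_clear = iommu.dma("00:03.0", 0x1000)
```

In this model a detached device faults on every access while TE is set,
and only clearing TE brings back the plain GPA view, which is the case
where the GPA->HPA mapping would have to be restored for an assigned
device.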


So I think we are emulating the correct guest behavior here - we don't
need to do anything if a device is detached from an existing IOMMU
domain in guest. If we do (e.g., we replay the GPA address space on
that device when it is detached, so the shadow page table for that
device maps the whole guest memory), that is dangerous, because with
that the device can DMA to anywhere it wants in guest memory.

If the guest wants to disable DMA translation, this is expected behavior
and the device model should follow the guest configuration. This is just
like most distributions not enabling VT-d DMA translation by default;
it's the OS's choice.

--
Best regards
Tianyu Lan


