From: Eduardo Habkost
Subject: [Qemu-devel] [PULL 08/15] intel_iommu: allow dynamic switch of IOMMU region
Date: Thu, 20 Apr 2017 15:46:58 -0300
From: Peter Xu <address@hidden>

This is preparation work to finally enable dynamic switching ON/OFF for
VT-d protection. The old VT-d code uses a static IOMMU address space,
which does not satisfy vfio-pci device listeners.

Let me explain.

vfio-pci devices depend on the memory region listener and the IOMMU
replay mechanism to keep the device mapping coherent with the guest
even across domain switches. There are two kinds of domain switch:

  (1) switch from domain A -> B
  (2) switch from domain A -> no domain (e.g., turn DMAR off)

Case (1) is handled by the VT-d replay logic during context entry
invalidation. The replay function only needs to replay the existing
page mappings in domain B.

For case (2), however, we don't want to replay any domain mappings. We
just need the default GPA->HPA mappings (the address_space_memory
mapping). This patch handles case (2) by building up that mapping
automatically, leveraging the vfio-pci memory listeners.

Another important thing this patch does is to separate IR (Interrupt
Remapping) from DMAR (DMA Remapping). The IR region should not depend
on the DMAR region, as it did before this patch. It should be a
standalone region that can be activated without DMAR (which is common
behavior for the Linux kernel: by default it enables IR while leaving
DMAR disabled).
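
The region switching described above can be sketched with a small
standalone model. This is an illustration under stated assumptions, not
QEMU code: the Toy* types and toy_* functions are hypothetical and only
mimic the "highest-priority enabled region wins" lookup. The region
names, address ranges and priorities are taken from the comment table
this patch adds to vtd_find_add_as().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, NOT QEMU's MemoryRegion API: every Toy* name is hypothetical. */
typedef struct {
    const char *name;
    uint64_t first;     /* first address covered (inclusive) */
    uint64_t last;      /* last address covered (inclusive) */
    int priority;       /* higher priority wins on overlap */
    bool enabled;
} ToyRegion;

enum { TOY_IOMMU, TOY_SYS_ALIAS, TOY_IR, TOY_NREGIONS };

typedef struct {
    ToyRegion r[TOY_NREGIONS];
} ToyAddressSpace;

/* Lay out the three subregions; start with DMAR off (sys_alias active). */
static void toy_init(ToyAddressSpace *as)
{
    assert(as);
    as->r[TOY_IOMMU] =
        (ToyRegion){ "intel_iommu", 0, UINT64_MAX, 1, false };
    as->r[TOY_SYS_ALIAS] =
        (ToyRegion){ "vtd_sys_alias", 0, UINT64_MAX, 1, true };
    as->r[TOY_IR] =
        (ToyRegion){ "intel_iommu_ir", 0xfee00000ULL, 0xfeefffffULL, 64, true };
}

/* Mirror of the switch logic: turn one region off, then the other on. */
static void toy_switch(ToyAddressSpace *as, bool dmar_enabled)
{
    assert(as);
    if (dmar_enabled) {
        as->r[TOY_SYS_ALIAS].enabled = false;
        as->r[TOY_IOMMU].enabled = true;
    } else {
        as->r[TOY_IOMMU].enabled = false;
        as->r[TOY_SYS_ALIAS].enabled = true;
    }
    /* The IR region is never touched: it stays enabled either way. */
}

/* Return the index of the enabled region with the highest priority
 * that covers addr, or -1 if no enabled region covers it. */
static int toy_resolve(const ToyAddressSpace *as, uint64_t addr)
{
    int best = -1;

    assert(as);
    for (int i = 0; i < TOY_NREGIONS; i++) {
        const ToyRegion *r = &as->r[i];

        if (!r->enabled || addr < r->first || addr > r->last) {
            continue;
        }
        if (best < 0 || r->priority > as->r[best].priority) {
            best = i;
        }
    }
    return best;
}
```

In this model, calling toy_switch(&as, false) makes an ordinary address
resolve to TOY_SYS_ALIAS (the plain GPA->HPA view), toy_switch(&as, true)
makes it resolve to TOY_IOMMU, and an address inside
fee00000-feefffff resolves to TOY_IR in both states.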
Reviewed-by: Jason Wang <address@hidden>
Reviewed-by: David Gibson <address@hidden>
Reviewed-by: "Michael S. Tsirkin" <address@hidden>
Signed-off-by: Peter Xu <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Eduardo Habkost <address@hidden>
---
include/hw/i386/intel_iommu.h | 2 ++
hw/i386/intel_iommu.c | 81 ++++++++++++++++++++++++++++++++++++++++---
hw/i386/trace-events | 2 +-
3 files changed, 79 insertions(+), 6 deletions(-)
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index fe645aa93a..8f212a1198 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -83,6 +83,8 @@ struct VTDAddressSpace {
uint8_t devfn;
AddressSpace as;
MemoryRegion iommu;
+ MemoryRegion root;
+ MemoryRegion sys_alias;
MemoryRegion iommu_ir; /* Interrupt region: 0xfeeXXXXX */
IntelIOMMUState *iommu_state;
VTDContextCacheEntry context_cache_entry;
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 7af4e22958..f7dec82066 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1291,9 +1291,49 @@ static void vtd_handle_gcmd_sirtp(IntelIOMMUState *s)
vtd_set_clear_mask_long(s, DMAR_GSTS_REG, 0, VTD_GSTS_IRTPS);
}
+static void vtd_switch_address_space(VTDAddressSpace *as)
+{
+ assert(as);
+
+ trace_vtd_switch_address_space(pci_bus_num(as->bus),
+ VTD_PCI_SLOT(as->devfn),
+ VTD_PCI_FUNC(as->devfn),
+ as->iommu_state->dmar_enabled);
+
+ /* Turn off first then on the other */
+ if (as->iommu_state->dmar_enabled) {
+ memory_region_set_enabled(&as->sys_alias, false);
+ memory_region_set_enabled(&as->iommu, true);
+ } else {
+ memory_region_set_enabled(&as->iommu, false);
+ memory_region_set_enabled(&as->sys_alias, true);
+ }
+}
+
+static void vtd_switch_address_space_all(IntelIOMMUState *s)
+{
+ GHashTableIter iter;
+ VTDBus *vtd_bus;
+ int i;
+
+ g_hash_table_iter_init(&iter, s->vtd_as_by_busptr);
+ while (g_hash_table_iter_next(&iter, NULL, (void **)&vtd_bus)) {
+ for (i = 0; i < X86_IOMMU_PCI_DEVFN_MAX; i++) {
+ if (!vtd_bus->dev_as[i]) {
+ continue;
+ }
+ vtd_switch_address_space(vtd_bus->dev_as[i]);
+ }
+ }
+}
+
/* Handle Translation Enable/Disable */
static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
{
+ if (s->dmar_enabled == en) {
+ return;
+ }
+
VTD_DPRINTF(CSR, "Translation Enable %s", (en ? "on" : "off"));
if (en) {
@@ -1308,6 +1348,8 @@ static void vtd_handle_gcmd_te(IntelIOMMUState *s, bool en)
/* Ok - report back to driver */
vtd_set_clear_mask_long(s, DMAR_GSTS_REG, VTD_GSTS_TES, 0);
}
+
+ vtd_switch_address_space_all(s);
}
/* Handle Interrupt Remap Enable/Disable */
@@ -2529,15 +2571,44 @@ VTDAddressSpace *vtd_find_add_as(IntelIOMMUState *s, PCIBus *bus, int devfn)
vtd_dev_as->devfn = (uint8_t)devfn;
vtd_dev_as->iommu_state = s;
vtd_dev_as->context_cache_entry.context_cache_gen = 0;
+
+ /*
+ * Memory region relationships looks like (Address range shows
+ * only lower 32 bits to make it short in length...):
+ *
+ * |-----------------+-------------------+----------|
+ * | Name | Address range | Priority |
+ * |-----------------+-------------------+----------+
+ * | vtd_root | 00000000-ffffffff | 0 |
+ * | intel_iommu | 00000000-ffffffff | 1 |
+ * | vtd_sys_alias | 00000000-ffffffff | 1 |
+ * | intel_iommu_ir | fee00000-feefffff | 64 |
+ * |-----------------+-------------------+----------|
+ *
+ * We enable/disable DMAR by switching enablement for
+ * vtd_sys_alias and intel_iommu regions. IR region is always
+ * enabled.
+ */
memory_region_init_iommu(&vtd_dev_as->iommu, OBJECT(s),
- &s->iommu_ops, "intel_iommu", UINT64_MAX);
+ &s->iommu_ops, "intel_iommu_dmar",
+ UINT64_MAX);
+ memory_region_init_alias(&vtd_dev_as->sys_alias, OBJECT(s),
+ "vtd_sys_alias", get_system_memory(),
+ 0, memory_region_size(get_system_memory()));
memory_region_init_io(&vtd_dev_as->iommu_ir, OBJECT(s),
&vtd_mem_ir_ops, s, "intel_iommu_ir",
VTD_INTERRUPT_ADDR_SIZE);
- memory_region_add_subregion(&vtd_dev_as->iommu, VTD_INTERRUPT_ADDR_FIRST,
- &vtd_dev_as->iommu_ir);
- address_space_init(&vtd_dev_as->as,
- &vtd_dev_as->iommu, name);
+ memory_region_init(&vtd_dev_as->root, OBJECT(s),
+ "vtd_root", UINT64_MAX);
+ memory_region_add_subregion_overlap(&vtd_dev_as->root,
+ VTD_INTERRUPT_ADDR_FIRST,
+ &vtd_dev_as->iommu_ir, 64);
+ address_space_init(&vtd_dev_as->as, &vtd_dev_as->root, name);
+ memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
+ &vtd_dev_as->sys_alias, 1);
+ memory_region_add_subregion_overlap(&vtd_dev_as->root, 0,
+ &vtd_dev_as->iommu, 1);
+ vtd_switch_address_space(vtd_dev_as);
}
return vtd_dev_as;
}
diff --git a/hw/i386/trace-events b/hw/i386/trace-events
index f725bca33e..3c3a16755f 100644
--- a/hw/i386/trace-events
+++ b/hw/i386/trace-events
@@ -4,7 +4,6 @@
x86_iommu_iec_notify(bool global, uint32_t index, uint32_t mask) "Notify IEC invalidation: global=%d index=%" PRIu32 " mask=%" PRIu32
# hw/i386/intel_iommu.c
-vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
vtd_inv_desc(const char *type, uint64_t hi, uint64_t lo) "invalidate desc type %s high 0x%"PRIx64" low 0x%"PRIx64
vtd_inv_desc_invalid(uint64_t hi, uint64_t lo) "invalid inv desc hi 0x%"PRIx64" lo 0x%"PRIx64
vtd_inv_desc_cc_domain(uint16_t domain) "context invalidate domain 0x%"PRIx16
@@ -37,6 +36,7 @@ vtd_page_walk_one(uint32_t level, uint64_t iova, uint64_t gpa, uint64_t mask, in
vtd_page_walk_skip_read(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"PRIx64" - 0x%"PRIx64" due to unable to read"
vtd_page_walk_skip_perm(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"PRIx64" - 0x%"PRIx64" due to perm empty"
vtd_page_walk_skip_reserve(uint64_t iova, uint64_t next) "Page walk skip iova 0x%"PRIx64" - 0x%"PRIx64" due to rsrv set"
+vtd_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)"
# hw/i386/amd_iommu.c
amdvi_evntlog_fail(uint64_t addr, uint32_t head) "error: fail to write at addr 0x%"PRIx64" + offset 0x%"PRIx32
--
2.11.0.259.g40922b1