qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.repla


From: Jason Wang
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add MemoryRegionIOMMUOps.replay() callback
Date: Sat, 1 Apr 2017 13:00:43 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0



On 2017年03月31日 15:30, Liu, Yi L wrote:
-----Original Message-----
From: Jason Wang [mailto:address@hidden
Sent: Friday, March 31, 2017 3:17 PM
To: Liu, Yi L <address@hidden>; 'Peter Xu' <address@hidden>
Cc: Lan, Tianyu <address@hidden>; Tian, Kevin <address@hidden>;
'address@hidden' <address@hidden>; 'address@hidden'
<address@hidden>; 'address@hidden' <address@hidden>; 'qemu-
address@hidden' <address@hidden>; 'address@hidden'
<address@hidden>; 'David Gibson' <address@hidden>
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
MemoryRegionIOMMUOps.replay() callback



On 2017年03月31日 13:34, Liu, Yi L wrote:
-----Original Message-----
From: Jason Wang [mailto:address@hidden
Sent: Thursday, March 30, 2017 7:58 PM
To: Liu, Yi L <address@hidden>; 'Peter Xu' <address@hidden>
Cc: 'address@hidden' <address@hidden>; Lan,
Tianyu <address@hidden>; Tian, Kevin <address@hidden>;
'address@hidden'
<address@hidden>; 'address@hidden' <address@hidden>;
'address@hidden' <address@hidden>; 'David Gibson'
<address@hidden>; 'address@hidden' <qemu-
address@hidden>
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
MemoryRegionIOMMUOps.replay() callback



On 2017年03月30日 19:06, Liu, Yi L wrote:
-----Original Message-----
From: Liu, Yi L
Sent: Monday, March 27, 2017 5:22 PM
To: Peter Xu <address@hidden>
Cc: address@hidden; Lan, Tianyu <address@hidden>;
Tian, Kevin <address@hidden>; address@hidden;
address@hidden; address@hidden; address@hidden;
David Gibson <address@hidden>; address@hidden
Subject: RE: [Qemu-devel] [PATCH v7 14/17] memory: add
MemoryRegionIOMMUOps.replay() callback

-----Original Message-----
From: Peter Xu [mailto:address@hidden
Sent: Monday, March 27, 2017 5:12 PM
To: Liu, Yi L <address@hidden>
Cc: address@hidden; Lan, Tianyu
<address@hidden>; Tian, Kevin <address@hidden>;
address@hidden; address@hidden; address@hidden;
address@hidden; David Gibson <address@hidden>;
address@hidden
Subject: Re: [Qemu-devel] [PATCH v7 14/17] memory: add
MemoryRegionIOMMUOps.replay() callback

On Mon, Mar 27, 2017 at 08:35:05AM +0000, Liu, Yi L wrote:
-----Original Message-----
From: Qemu-devel
[mailto:address@hidden On
Behalf Of Peter Xu
Sent: Tuesday, February 7, 2017 4:28 PM
To: address@hidden
Cc: Lan, Tianyu <address@hidden>; Tian, Kevin
<address@hidden>; address@hidden; address@hidden;
address@hidden; address@hidden;
address@hidden; address@hidden; David Gibson
<address@hidden>
Subject: [Qemu-devel] [PATCH v7 14/17] memory: add
MemoryRegionIOMMUOps.replay() callback

Originally we have one memory_region_iommu_replay() function,
which is the default behavior to replay the translations of the
whole IOMMU region. However, on some platform like x86, we may
want our own
replay logic for IOMMU regions.
This patch add one more hook for IOMMUOps for the callback, and
it'll override the default if set.

Signed-off-by: Peter Xu <address@hidden>
---
    include/exec/memory.h | 2 ++
    memory.c              | 6 ++++++
    2 files changed, 8 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h index
0767888..30b2a74 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -191,6 +191,8 @@ struct MemoryRegionIOMMUOps {
        void (*notify_flag_changed)(MemoryRegion *iommu,
                                    IOMMUNotifierFlag old_flags,
                                    IOMMUNotifierFlag new_flags);
+    /* Set this up to provide customized IOMMU replay function */
+    void (*replay)(MemoryRegion *iommu, IOMMUNotifier
+ *notifier);
    };

    typedef struct CoalescedMemoryRange CoalescedMemoryRange;
diff --git a/memory.c b/memory.c index 7a4f2f9..9c253cc 100644
--- a/memory.c
+++ b/memory.c
@@ -1630,6 +1630,12 @@ void
memory_region_iommu_replay(MemoryRegion
*mr, IOMMUNotifier *n,
        hwaddr addr, granularity;
        IOMMUTLBEntry iotlb;
+    /* If the IOMMU has its own replay callback, override */
+    if (mr->iommu_ops->replay) {
+        mr->iommu_ops->replay(mr, n);
+        return;
+    }
Hi Alex, Peter,

Will all the other vendors(e.g. PPC, s390, ARM) add their own
replay callback as well? I guess it depends on whether the
original replay algorithm work well for them? Do you have such knowledge?
I guess so. At least for VT-d we had this callback since the
default replay mechanism did not work well on x86 due to its
extremely large memory region size. Thanks,
thx. that would make sense.
Peter,

Just come to mind that there may be a corner case here.

Intel VT-d actually has a "pt" mode which allows device use physical
address even when VT-d is enabled. In kernel, there is a
iommu_identity_mapping.
If a device is in this map, then it would use "pt" mode. So that
IOMMU driver would not build second-level page table for it.
Yes, but qemu does not support ECAP_PT now, so guest will still have
a page table in this case.
That's true. Without ECAP_PT, IOMMU driver would create a 1:1 map. So
this solution can work well even a device is in identify_map.

Back to the virtual IOVA implementation, if an assigned device is in
the iommu_identity_mapping(e.g. VGA controller), it uses GPA directly to do
DMA.
So it demands a GPA->HPA mapping in host. However, the
iommu->ops.replay is not able to build it when guest SL page table is empty.

So I think building an entire guest PA->HPA mapping before guest
kernel boot would be recommended. Any thoughts?
We plan to add PT in 2.10, a possible rough idea is disabled iommu
dmar region and use another region without iommu_ops. Then
vfio_listener_region_add() will just do the correct mappings.
Good to know it. Actually, I also need to expose ECAP_PT for vSVM. So
just comes to realize that the current replay solution may not work well when I
expose ECAP_PT to guest.
I also have a rough idea here. The current listener in container
listens to address space named with devfn if virtual VTd is added. How
about adding one more listener to listen memory address space. So that the
listener can build entire guest PA->HPA mapping.

This is only needed for PT. So looks like current code is sufficient to do this 
I think.
See the else part of if (memory_region_is_iommu()) of 
vfio_listener_region_add().
Jason, when the listener listen to device address space, the "else part" may 
not work
even we set the mr->iommu_ops = NULL. The mr would be a non-ram region when the
time region_add is called since it is actually listen to changes from device 
address space.

Regards,
Yi L


See Peter's patch ("intel_iommu: allow dynamic switch of IOMMU region"). It has

+        memory_region_init_alias(&vtd_dev_as->sys_alias, OBJECT(s),
+                                 "vtd_sys_alias", get_system_memory(),
+ 0, memory_region_size(get_system_memory()));

We can enable sys_alias in when PT is used which should work I think.

Thanks



reply via email to

[Prev in Thread] Current Thread [Next in Thread]