Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device


From: Jean-Philippe Brucker
Subject: Re: [Qemu-arm] [Qemu-devel] [RFC v2 0/8] VIRTIO-IOMMU device
Date: Fri, 14 Jul 2017 12:25:37 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1

On 14/07/17 08:20, Tian, Kevin wrote:
>> From: Jean-Philippe Brucker [mailto:address@hidden]
>> Sent: Friday, July 7, 2017 11:15 PM
>>
>> On 07/07/17 07:21, Tian, Kevin wrote:
>>> sorry I didn't quite get this part, and here is my understanding:
>>>
>>> Guest programs vIOMMU to map a gIOVA (used by MSI) to a GPA
>>> of doorbell register of virtual irqchip. vIOMMU then
>>> triggers VFIO map/unmap to update physical IOMMU page
>>> table for gIOVA -> HPA of real doorbell of physical irqchip
>>
>> At the moment (non-SVM), physical and virtual MSI doorbell are completely
>> dissociated. VFIO itself maps the doorbell GPA->HPA during container
>> initialization. The GPA, chosen arbitrarily by the host, is then removed
>> from the guest GPA space.
> 
> got you. I also got some basic understanding from below link. :-)
> 
> https://www.linaro.org/blog/core-dump/kvm-pciemsi-passthrough-armarm64/
> 
>>
>> When the guest programs the vIOMMU to map a gIOVA to the virtual irqchip
>> doorbell, I suppose Qemu will notice that the GPA doesn't correspond to
>> RAM and will withhold sending a VFIO_IOMMU_MAP_DMA request.
>>
>> (For SVM I don't want to go into the details just now, but we will
>> probably need a separate VFIO mechanism to update the physical MSI-X
>> tables with whatever gIOVA the guest mapped in its private stage-1 page
>> tables.)
> 
> I guess there may be either a terminology difference or a hardware
> difference here, since I noted you mentioned IOVA with stage-1
> multiple times.
> 
> For Intel VT-d:
> 
> - stage-1 is only for VA translation, tagged with PASID
> - stage-2 can be used for IOVA translation on bare metal or GPA/gIOVA
> translation in virtualization, w/o PASID tagged

The terminology is indeed a bit confusing, and the hardware slightly
different. For me IOVA is the address used as input of the pIOMMU, PA is
the output address, and GPA only exists if there is stage-1 + stage-2. So
I think what I meant by gIOVA above was VA in your description.

I understand your "stage-1" and "stage-2" are named "first-level" and
"second-level" in the VT-d spec?

If I read the VT-d spec correctly, I think the main difference on ARM SMMU
is that stage-2 always follows stage-1 translation, but either stage may
be disabled (or both, for bypass mode). There is no mode like in VT-d,
where non-PASID transactions go only through stage-2 and PASID
transactions go only through stage-1. I believe this is (NESTE=0,
T=000b/001b) in the Extended-Context-Entry.

Something equivalent in SMMU is disabling stage-2 and using entry 0 in
the PASID table for non-PASID traffic. In this mode, traffic that uses
PASID#0 would be aborted. So using your terms, the SMMU can have both VAs
and IOVAs translated by stage-1 and then, if enabled, by stage-2 as well.
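
To make the comparison concrete, here is a toy Python model of the two
behaviours described above. It is only a sketch: the names (Smmu,
vtd_non_nested, TranslationAbort) are hypothetical, translations are
plain functions on integers, and nothing here reflects real register
layouts or page table formats. It assumes the SMMU mode where entry 0 of
the PASID table serves non-PASID traffic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# A translation stage is modelled as a function from address to address.
Translate = Callable[[int], int]


class TranslationAbort(Exception):
    """Raised when the modelled IOMMU rejects a transaction."""


@dataclass
class Smmu:
    # Hypothetical PASID table (entry -> stage-1 translation),
    # or None when stage-1 is disabled.
    stage1: Optional[Dict[int, Translate]] = None
    # Stage-2 translation, or None when stage-2 is disabled.
    stage2: Optional[Translate] = None

    def translate(self, addr: int, pasid: Optional[int] = None) -> int:
        """Stage-2 always follows stage-1; either stage may be disabled."""
        if self.stage1 is not None:
            if pasid == 0:
                # Assumption: entry 0 is reserved for non-PASID traffic,
                # so a transaction explicitly tagged PASID#0 is aborted.
                raise TranslationAbort("PASID#0 is reserved")
            entry = 0 if pasid is None else pasid
            if entry not in self.stage1:
                raise TranslationAbort(f"no PASID table entry {entry}")
            addr = self.stage1[entry](addr)
        if self.stage2 is not None:
            addr = self.stage2(addr)
        return addr


def vtd_non_nested(first_level: Dict[int, Translate],
                   second_level: Translate,
                   addr: int,
                   pasid: Optional[int] = None) -> int:
    """Non-nested VT-d mode as described above: non-PASID transactions go
    only through second-level, PASID transactions only through first-level."""
    if pasid is None:
        return second_level(addr)
    if pasid not in first_level:
        raise TranslationAbort(f"no PASID table entry {pasid}")
    return first_level[pasid](addr)
```

With both stages enabled, Smmu.translate() applies stage-1 then stage-2
to PASID and non-PASID traffic alike, whereas vtd_non_nested() picks
exactly one level depending on whether the transaction carries a PASID.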

Thanks,
Jean

> Does ARM SMMU allow stage-1 used for both VA and IOVA? IIRC
> you said PASID#0 reserved for traffic w/o PASID in some mail...
>
>>> (assume your irqchip will provide multiple doorbells so each
>>> device can have its own channel).
>>
>> In existing irqchips the doorbell is shared by endpoints, which are
>> differentiated by their device ID (generally the BDF). I'm not sure why
>> this matters here?
> 
> It doesn't matter now, given the device ID.
> 
>>
>>> then once this update is
>>> done, later MSI interrupts from assigned device will go
>>> through physical IOMMU (gIOVA->HPA) then reach irqchip
>>> for irq remapping. vIOMMU is involved only in configuration
>>> path instead of actual interrupt path.
>>
>> Yes, the vIOMMU is used to correlate the IOVA written by the guest in its
>> virtual MSI-X table with the MAP request received by the vIOMMU. That is
>> probably used to set up IRQFD routes with KVM. But the vIOMMU is not
>> involved further than that in MSIs.
>>
>>> If my understanding is correct, above will be the natural flow then
>>> why is additional virtio-iommu change required? :-)
>>
>> The change is not *required* for ARM systems, I only proposed removing the
>> doorbell address translation stage to make host implementation simpler
>> (and since virtio-iommu on x86 won't translate the doorbell anyway, we
>> have to add support for this to virtio-iommu). But for Qemu, since vSMMU
>> needs to implement the natural flow anyway, it might not be a lot of
>> effort to also do it for virtio-iommu. Other implementations (e.g.
>> kvmtool) might piggy-back on the x86 way and declare the irqchip doorbell
>> as untranslated.
>>
>> My proposal also breaks when confronted with virtual SVM in a physical ARM
>> system, where the guest owns stage-1 page tables and *has* to map the
>> doorbell if it wants MSIs to work, so you can disregard it :)
>>
> 
> It is a good learning. thanks.
> 



