
Re: Multiple vIOMMU instance support in QEMU?


From: Peter Xu
Subject: Re: Multiple vIOMMU instance support in QEMU?
Date: Thu, 18 May 2023 15:45:24 -0400

On Thu, May 18, 2023 at 11:56:46AM -0300, Jason Gunthorpe wrote:
> On Thu, May 18, 2023 at 10:16:24AM -0400, Peter Xu wrote:
> 
> > What you mentioned above makes sense to me from the POV that one vIOMMU
> > may not suffice, but that's a totally new area to me, because I've never
> > used more than one IOMMU even on bare metal (excluding the case where,
> > e.g., a GPU can have its own IOMMU-like DMA translator).
> 
> Even x86 systems are multi-iommu, one iommu per physical CPU socket.

I looked at a 2-node system on hand, and I indeed see two DMAR units:

[    4.444788] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df
[    4.459673] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap 8d2078c106f0466 ecap f020df

They do not seem to be fully parallel in how devices attach to them,
though.  E.g., most of the devices on this host are attached to dmar1,
while only two devices are attached to dmar0:

80:05.2 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D IIO RAS/Control Status/Global Errors (rev 01)
80:05.0 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Map/VTd_Misc/System Management (rev 01)
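
(For reference, the per-unit attachment can also be listed from sysfs.  A
quick check, assuming an Intel VT-d host where each DMAR unit shows up
under /sys/class/iommu; the output below is illustrative, reconstructed
from the two devices above:

  $ ls /sys/class/iommu/
  dmar0  dmar1
  $ ls /sys/class/iommu/dmar0/devices/
  0000:80:05.0  0000:80:05.2

Each entry under devices/ is a symlink back to the PCI device that the
unit translates for.)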

> 
> I'm not sure how they model this though - Kevin, do you know?  Do we get
> multiple iommu instances in Linux, or is all the broadcasting of
> invalidates and sharing of tables hidden?
> 
> > What's the system layout of your multi-vIOMMU world?  Is there still a
> > central vIOMMU, or can multiple vIOMMUs run fully in parallel, so that
> > e.g. we can have DEV1,DEV2 under vIOMMU1 and DEV3,DEV4 under vIOMMU2?
> 
> Just like physical hardware, each vIOMMU is parallel and independent.
> Each has its own caches, ASIDs, DIDs, etc., and thus its own invalidation
> domains.
> 
> The separate caches are the motivating reason to do this, as something
> like vCMDQ is a direct command channel for invalidations that targets
> only the caches of a single IOMMU block.

From a cache invalidation POV, shouldn't the ideal granule be per-device
(like dev-iotlb in VT-d?  No idea about ARM)?

But those are two separate angles, I assume - currently dev-iotlb is still
emulated, at least in QEMU.  Having a hardware-accelerated queue is
definitely another thing.
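
(Side note, in case anyone wants to poke at the emulated path: the
dev-iotlb emulation mentioned above is gated by the "device-iotlb"
property of QEMU's intel-iommu device.  A minimal invocation sketch,
untested here:

  qemu-system-x86_64 -M q35,kernel-irqchip=split \
      -device intel-iommu,intremap=on,device-iotlb=on \
      ...

The split irqchip is there because intremap=on on the emulated
intel-iommu does not work with the full in-kernel irqchip.)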

> 
> > Is it a common hardware layout, or is it NVIDIA-specific?
> 
> I think it is pretty normal, you have multiple copies of the IOMMU and
> its caches for physical reasons.
> 
> The only choice is if the platform HW somehow routes invalidations to
> all IOMMUs or requires SW to route/replicate invalidates.
> 
> ARM's IP seems to be designed toward the latter so I expect it is
> going to be common on ARM.

Thanks for the information, Jason.

I see that Intel is already copied here (at least Yi and Kevin), so I
assume there is already some synchronization between the multi-vIOMMU work
and the recent work on the Intel side, which is definitely nice and avoids
conflicting efforts.

We should probably also copy Jason Wang and mst when there's any formal
proposal.  I've got them all copied here too.

-- 
Peter Xu



