From: Peter Xu
Subject: Re: [Qemu-devel] QEMU and vIOMMU support for emulated VF passthrough to nested (L2) VM
Date: Mon, 8 Apr 2019 13:56:29 +0800
User-agent: Mutt/1.10.1 (2018-07-13)

On Mon, Apr 08, 2019 at 12:32:12AM +0000, Tian, Kevin wrote:

[...]

> > > > Probably.  Currently VT-d emulation does not support snooping control,
> > > > and if you modify only that ecap bit you will probably hit this
> > > > problem: the guest kernel will then set the SNP bit in the IOMMU page
> > > > table entries, which violates the reserved-bit check in the emulation
> > > > code, and so you see these errors.
> > > >
> > > > Now talking about implementing Snoop Control for the Intel IOMMU for
> > > > real (which corresponds to VT-d ecap bit 7) - I confess I'm not 100%
> > > > clear on what "snooping" means and what we need to do as an
> > > > emulator. Quoting from the spec:
> > > >
> > > >   "Snoop behavior for a memory access (to a translation structure
> > > >   entry or access to the mapped page) specifies if the access is
> > > >   coherent (snoops the processor caches) or not."
> > > >
> > > > If it is only a capability showing whether the hardware is
> > > > capable of snooping processor caches, then I don't think we need to do
> > > > much here as an emulator of VT-d, simply because when we access the
> > > > data we are still on the processor's side (we're emulating the IOMMU
> > > > behavior only), so the caches should always be coherent from
> > > > the POV of guest vCPUs, just like how the processors provide cache
> > > > coherence between two cores (so IMHO the VT-d emulation code can
> > > > be run on one core/thread, and the vcpu which runs the guest iommu
> > > > driver can be run on another core/thread).  If so, maybe we can simply
> > > > declare support of that, but we at least also need to remove the SNP
> > > > bit from the vtd_paging_entry_rsvd_field[] array to reflect that we
> > > > understand that bit.
> > > >
> > > > CCing Alex and Kevin to see whether I'm misunderstanding or in case of
> > > > any further input on the snooping support.
> > > >
> > >
> > > For software DMA, yes, snoop is guaranteed since it's just CPU access.
> > >
> > > However for a VFIO device, i.e. hardware DMA, snoop should be reported
> > > based on the physical IOMMU capability. It's fine to report no snoop
> > > control on the vIOMMU (current state) even when it's physically
> > > supported. It just means that the L1 VMM must favor guest cache
> > > attributes instead of forcing WB in L1 EPT when doing nested
> > > passthrough. However it's incorrect to report snoop control on the
> > > vIOMMU when physically it's not supported; otherwise the L1 VMM may
> > > force WB in L1 EPT and enable the snoop field in the vIOMMU 2nd level
> > > PTE on the assumption that hardware snoop is guaranteed (when it
> > > isn't). Then it becomes a correctness issue.
> > >
> > 
> > If my device is fully emulated, can I ignore the SNP bit in the SLPTE?
> > What is the cost of ignoring it in such a case? What could go wrong?
> > (I tried ignoring it and translations seem to work for me now.)
> > 
> 
> I'm not sure what you meant by 'ignore' here. But as Peter pointed
> out earlier, for emulated devices you don't need to do anything special
> here. You can just report the snoop capability and then remove the bit
> from the reserved bit check in the SLPTE.

Yes.  For simplicity, you can add a new patch for a new property
"x-snooping" into vtd_properties and make it false by default, then
allow the user to turn it on manually considering that the user should
be clear on the consequence of this knob.
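
As a rough illustration of the idea above, here is a toy, self-contained
sketch (not an actual QEMU patch). The ToyIommu struct, the toy_* helpers
and the TOY_* macros are made-up names; the bit positions (ecap bit 7 for
Snoop Control, bit 11 for SNP in a second-level PTE) are how I read the
VT-d spec and should be double-checked against it. The point is only that
one boolean knob decides both whether the capability is advertised and
whether SNP counts as a reserved bit:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, taken from my reading of the VT-d spec. */
#define TOY_ECAP_SC    (1ULL << 7)   /* extended capability: Snoop Control */
#define TOY_SLPTE_SNP  (1ULL << 11)  /* second-level PTE: Snoop bit */

/* Hypothetical, heavily reduced stand-in for the vIOMMU device state. */
typedef struct ToyIommu {
    bool snooping;       /* plays the role of "x-snooping", default false */
    uint64_t ecap;       /* capability bits exposed to the guest */
    uint64_t rsvd_mask;  /* bits that must be zero in a leaf SLPTE */
} ToyIommu;

/* Derive the advertised capability and the reserved mask from the knob. */
static void toy_iommu_init(ToyIommu *s, bool snooping, uint64_t base_rsvd)
{
    s->snooping = snooping;
    s->ecap = snooping ? TOY_ECAP_SC : 0;
    /* SNP is reserved only while snoop control is not advertised. */
    s->rsvd_mask = snooping ? (base_rsvd & ~TOY_SLPTE_SNP)
                            : (base_rsvd | TOY_SLPTE_SNP);
}

/* Returns true if the entry sets any bit the emulation treats as reserved. */
static bool toy_slpte_has_rsvd(const ToyIommu *s, uint64_t slpte)
{
    return (slpte & s->rsvd_mask) != 0;
}
```

With the knob off, an SLPTE carrying SNP trips the reserved-bit check (the
errors seen earlier in this thread); with it on, the same entry passes.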

Later on we can consider enriching this property by checking the host
configuration when assigned devices are detected (I feel like it could
be a VFIO_DMA_CC_IOMMU ioctl on every assigned device, or container), or
more.
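
For the host-side check, a minimal sketch might look like the following.
It assumes a Linux host with the VFIO uAPI header available and an
already-open VFIO container fd; toy_container_dma_coherent() is a
hypothetical helper name, not an existing QEMU function. It only queries
the VFIO_DMA_CC_IOMMU extension through VFIO_CHECK_EXTENSION, and how the
result would be combined across containers or devices is left open:

```c
#include <stdbool.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Hypothetical helper: ask the VFIO container whether the host IOMMU
 * enforces cache-coherent DMA. VFIO_CHECK_EXTENSION returns a positive
 * value when the extension is supported, 0 when it is not, and -1 on
 * error (for example an invalid fd), which is treated as "not supported".
 */
static bool toy_container_dma_coherent(int container_fd)
{
    int ret = ioctl(container_fd, VFIO_CHECK_EXTENSION, VFIO_DMA_CC_IOMMU);

    return ret > 0;
}
```

A conservative policy would be to allow the snooping knob only if every
container backing an assigned device reports true here.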

Regards,

-- 
Peter Xu
