From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH 08/10] intel-iommu: maintain per-device iova ranges
Date: Thu, 3 May 2018 20:01:20 +0800
User-agent: Mutt/1.9.3 (2018-01-21)

On Thu, May 03, 2018 at 05:53:59PM +0800, Peter Xu wrote:
> On Thu, May 03, 2018 at 05:22:03PM +0800, Jason Wang wrote:
> > 
> > 
> > > On 2018/05/03 15:53, Peter Xu wrote:
> > > On Thu, May 03, 2018 at 03:43:35PM +0800, Jason Wang wrote:
> > > > 
> > > > > On 2018/05/03 15:28, Peter Xu wrote:
> > > > > On Thu, May 03, 2018 at 03:20:11PM +0800, Jason Wang wrote:
> > > > > > On 2018/05/03 14:04, Peter Xu wrote:
> > > > > > > IMHO the guest can't really detect this, but it'll find that the
> > > > > > > device is not working correctly if it's doing something like what
> > > > > > > Jason has mentioned.
> > > > > > > 
> > > > > > > Actually now I have an idea for how to live well even with
> > > > > > > Jason's example: maybe we'll need to distinguish PSI from DSI.
> > > > > > > For DSI, we don't remap already-mapped pages; for PSI, we unmap
> > > > > > > and remap the mapped pages.  That'll complicate things a bit,
> > > > > > > but it should satisfy everyone.
> > > > > > > 
> > > > > > > Thanks,
> > > > > > So it looks like there will still be unnecessary unmaps.
> > > > > Could I ask what you mean by "unnecessary unmaps"?
> > > > It's about "for PSI, we unmap and remap the mapped pages".  For the
> > > > first "unmap", how do you know it was really necessary without
> > > > knowing the state of the current shadow page table?
> > > I don't.  Could I just unmap it anyway?  Say the guest has _modified_
> > > the PTE already.  Yes, I think that follows the spec, but it is really
> > > _unsafe_, and we can know that from what it has done already.  So I
> > > really think an unmap+map would be good enough for us...  After all,
> > > that behavior can cause DMA errors even on real hardware; we can
> > > never tell.
> > 
> > I mean the following case:
> > 
> > 1) guest maps A1 (iova) to XXX
> > 2) guest maps A2 (A1 + 4K) (iova) to YYY
> > 3) guest maps A3 (A1 + 8K) (iova) to ZZZ
> > 4) guest unmaps A2 and A3; to reduce the number of PSIs, it can
> > invalidate A1 with a range of 2M
> > 
> > If this is allowed by the spec, it looks like A1 will be unmapped and remapped.
> 
> My follow-up patch won't survive this case, but the original patch
> will work.
> 
> Jason and I discussed this a bit on IRC.  Here's the conclusion we
> reached: for now we use my original patch (which solves everything
> except PTE modifications) and mark the modify-PTE problem as a TODO.
> Then at least we can have nested device assignment working well on
> known OSes first.
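
Just to make Jason's A1/A2/A3 example above concrete, here's a rough,
hypothetical sketch (not the actual intel-iommu code; the helpers
shadow_unmap, shadow_map and guest_pte_present are made-up names) of
what a naive "unmap+remap everything covered by a PSI" sync would do.
With A1, A2 and A3 mapped and a single 2M-range PSI sent after
unmapping A2 and A3, the loop tears down and rebuilds A1 as well, even
though its PTE never changed -- that's the "unnecessary unmap":

#include <stdint.h>
#include <stdbool.h>

#define VTD_PAGE_SIZE 0x1000ULL

/* Assumed helpers standing in for the real shadow/VFIO operations. */
void shadow_unmap(uint64_t iova, uint64_t size);
void shadow_map(uint64_t iova, uint64_t size, uint64_t gpa);
bool guest_pte_present(uint64_t iova, uint64_t *gpa);

static void naive_psi_sync(uint64_t start, uint64_t size)
{
    uint64_t gpa;

    for (uint64_t iova = start; iova < start + size; iova += VTD_PAGE_SIZE) {
        /* Unconditionally tear down the shadow entry ... */
        shadow_unmap(iova, VTD_PAGE_SIZE);
        /* ... and rebuild it if the guest PTE is still present. */
        if (guest_pte_present(iova, &gpa)) {
            shadow_map(iova, VTD_PAGE_SIZE, gpa);
        }
    }
}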

Here I just want to mention that we actually have no way to emulate a
PTE modification.  The problem is that we can never atomically modify
a PTE on the host with Linux, either via the VFIO interface or even by
using the IOMMU API directly in the kernel.  To be more specific to
our use case: VFIO provides VFIO_IOMMU_MAP_DMA and VFIO_IOMMU_UNMAP_DMA,
but it provides no VFIO_IOMMU_MODIFY_DMA to modify a mapping atomically.
That means even if we know the PTE has changed, we can only unmap it
and remap it.  We'll still have the same "invalid window" problem we
have discussed, since between the unmap and the remap the page is
invalid (while from the guest's point of view it should never be,
since the PTE modification is atomic).
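
For illustration, a minimal sketch (assuming container_fd is an
already-set-up VFIO container; error handling trimmed) of the only way
a "PTE modification" can be expressed through the VFIO type1 interface
today -- an unmap followed by a map, with the invalid window in between:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int vfio_modify_mapping(int container_fd, uint64_t iova,
                               uint64_t size, uint64_t new_vaddr)
{
    struct vfio_iommu_type1_dma_unmap unmap = {
        .argsz = sizeof(unmap),
        .iova  = iova,
        .size  = size,
    };
    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        .vaddr = new_vaddr,
        .iova  = iova,
        .size  = size,
    };

    if (ioctl(container_fd, VFIO_IOMMU_UNMAP_DMA, &unmap) < 0) {
        return -1;
    }
    /*
     * "Invalid window": between the two ioctls any in-flight DMA to
     * this IOVA faults, although from the guest's point of view the
     * PTE update was atomic.
     */
    if (ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map) < 0) {
        return -1;
    }
    return 0;
}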

-- 
Peter Xu


