Re: [Qemu-devel] [PATCH 08/10] intel-iommu: maintain per-device iova ran


From: Peter Xu
Subject: Re: [Qemu-devel] [PATCH 08/10] intel-iommu: maintain per-device iova ranges
Date: Thu, 3 May 2018 14:04:42 +0800
User-agent: Mutt/1.9.3 (2018-01-21)

On Fri, Apr 27, 2018 at 11:37:24PM +0000, Tian, Kevin wrote:

[...]

> > Self NAK on this...
> > 
> > More than half of the whole series tries to solve the sole problem
> > that we unmapped some pages that were already mapped, which proved
> > to be wrong.  Now if we squash the change we will do the same wrong
> > thing, so we'll still have a very small window during which the
> > remapped page is missing from a device's POV.
> > 
> > Now to solve this I suppose we'll need to cache every translation then
> > we can know whether a mapping has changed, and we only remap when it
> > really has changed.  But I'm afraid that can be a big amount of data
> > for nested guests.  For a most common 4G L2 guest, I think the worst
> > case (e.g., no huge page at all, no continuous pages) is 4G/4K=1M
> > entries in that tree.
> 
> I think one key factor to think about is the effect of PSI. From
> VT-d spec, all internal caches (IOTLB entries, paging-structure
> cache entries, etc.) associated with specified address range
> must be flushed accordingly, i.e. no cache on stale mapping.
> Now per-device iova ranges is a new type of cache introduced
> by Qemu VT-d. It doesn't cache the actual mapping, but its purpose
> is to decide whether to notify VFIO of an updated mapping. In
> this manner, if we don't differentiate whether an entry is
> for a stale mapping, it looks like the definition of PSI is broken.
> 
> Another question: in reality, how much improvement can this
> patch bring, i.e. is it usual to see a guest map an already
> mapped region, or unmap an already unmapped region?

The funny thing is that there is actually no MAP/UNMAP flag for
PSI/DSI.  For either a MAP or an UNMAP, the guest sends one PSI for
that range, or even a DSI (Domain-Selective Invalidation).

This patch is mostly helpful not for PSIs, but for DSIs.  Please have
a look at Issue (4) that I mentioned in the cover letter.  This patch
is the core part of solving that DMA error problem.

For example, we can get a DSI for a domain that already has existing
mappings.  In QEMU, we handle a DSI as a whole-range PSI, so before
this patch we would unmap those already-mapped pages and then remap
all of them.  However, we can't map the same page again, so we cache
what we have already mapped here.
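
To make the caching idea above concrete, here is a minimal sketch, not
the code from the patch.  It assumes a tiny IOVA space tracked per
page, and every name in it (DeviceIovaCache, sync_range, the
guest_present array standing in for a guest page table walk) is
hypothetical.  Notifications are only counted, where real code would
call into VFIO.  Like the patch, it tracks only whether a page is
mapped, so it cannot tell when an already-mapped page now points
somewhere else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PAGES 64    /* tiny IOVA space, for illustration only */

/* Hypothetical per-device record of which IOVA pages are mapped. */
typedef struct {
    bool mapped[MAX_PAGES];
    unsigned map_notifies;      /* MAP notifications that would be sent */
    unsigned unmap_notifies;    /* UNMAP notifications that would be sent */
} DeviceIovaCache;

/*
 * Handle an invalidation covering pages [first, first + npages).
 * guest_present[] stands in for walking the guest page table.
 * A DSI is simply the case where the range covers everything.
 */
static void sync_range(DeviceIovaCache *c, const bool *guest_present,
                       uint64_t first, uint64_t npages)
{
    assert(first + npages <= MAX_PAGES);

    for (uint64_t pfn = first; pfn < first + npages; pfn++) {
        bool now = guest_present[pfn];

        if (now && !c->mapped[pfn]) {
            c->mapped[pfn] = true;      /* newly mapped: notify MAP */
            c->map_notifies++;
        } else if (!now && c->mapped[pfn]) {
            c->mapped[pfn] = false;     /* went away: notify UNMAP */
            c->unmap_notifies++;
        }
        /* Unchanged pages are skipped: no unmap-then-remap window. */
    }
}
```

With this shape, a second whole-range invalidation over unchanged
mappings sends nothing, which is the property that avoids the window
described in Issue (4).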

> 
> > 
> > Is it really worth it to solve this possibly-buggy-guest-OS problem
> > with such an overhead?  I don't know...
> 
> If adding overhead removes the benefit of this patch, then 
> definitely not a good thing.

For the problem that we are trying to solve, this patch is not really
an optimization; it fixes a critical bug.  Again, please refer to
Issue (4) of the cover letter for that 3ms window problem.

> 
> > 
> > I'm not sure whether it's still acceptable that we put this issue
> > aside.  We should know that normal OSs should not do this, and if they
> > do, IMHO it proves a buggy OS already (so even from hardware POV we
> > allow this, from software POV it should still be problematic), then
> > it'll have problem for sure, but only within the VM itself, and it
> > won't affect other VMs or the host.  That sounds still reasonable to
> > me so far.
> 
> As said earlier, what I'm worried about is whether there is a way to
> detect such a case when your assumption is violated. Usually
> emulation can choose not to implement all the features which
> are supported on the real device, but it's done in a way that
> non-supported features/behaviors can be reported to the guest
> (based on the spec definition), thus the guest knows what to expect
> from the emulated device...

IMHO the guest can't really detect this, but it'll find that the
device is not working properly if it's doing something like what
Jason has mentioned.

Actually, I now have an idea in case we really want to behave well
even with Jason's example: maybe we'll need to distinguish PSI from
DSI.  For a DSI, we don't remap already-mapped pages; for a PSI, we
unmap and remap the mapped pages.  That'll complicate things a bit,
but it should satisfy everyone.
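
The PSI/DSI split above could be sketched as a per-page decision like
the following.  This is only an illustration of the idea, not existing
QEMU code; the enum names and page_action() are hypothetical, and
"was_mapped" means the page is in the per-device cache while
"is_mapped" means the guest page table currently has it present.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { INV_PSI, INV_DSI } InvType;
typedef enum { ACT_NONE, ACT_MAP, ACT_UNMAP, ACT_REMAP } PageAction;

/* Decide what to tell VFIO about one page covered by an invalidation. */
static PageAction page_action(InvType type, bool was_mapped, bool is_mapped)
{
    assert(type == INV_PSI || type == INV_DSI);

    if (!was_mapped) {
        /* Nothing cached: map it if the guest has it now. */
        return is_mapped ? ACT_MAP : ACT_NONE;
    }
    if (!is_mapped) {
        /* Cached but gone from the guest page table. */
        return ACT_UNMAP;
    }
    /*
     * Mapped before and still mapped.  A PSI targets this range
     * explicitly, so unmap and remap in case the translation changed.
     * A DSI covers the whole domain, so leave the page alone.
     */
    return type == INV_PSI ? ACT_REMAP : ACT_NONE;
}
```

The cost of this choice is that a changed mapping is only picked up
when the guest sends a PSI for it; a DSI alone would keep the old
translation for pages that stayed present.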

Thanks,

-- 
Peter Xu


