[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support
From: |
Alex Williamson |
Subject: |
Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support |
Date: |
Mon, 31 Mar 2014 13:59:46 -0600 |
On Fri, 2014-03-28 at 16:12 +1100, Alexey Kardashevskiy wrote:
> On 03/20/2014 04:25 PM, David Gibson wrote:
> > On Wed, Mar 19, 2014 at 01:57:41PM -0600, Alex Williamson wrote:
> >> On Wed, 2014-03-12 at 16:52 +1100, Alexey Kardashevskiy wrote:
> >>> From: David Gibson <address@hidden>
> > [snip]
> >>> + if (!memory_region_is_ram(mr)) {
> >>> + DPRINTF("iommu map to non memory area %"HWADDR_PRIx"\n",
> >>> + xlat);
> >>> + return;
> >>> + }
> >>> + if (len & iotlb->addr_mask) {
> >>> + DPRINTF("iommu has granularity incompatible with target AS\n");
> >>
> >> Is this possible? Assuming len is initially a power-of-2, would the
> >> translate function change it? Maybe worth a comment to explain.
> >
> > translate can absolutely change the length. It will generally
> > truncate it to the IOMMU page size, in fact.
> >
> > [snip]
> >>> + DPRINTF("region_add [iommu] %"HWADDR_PRIx" - %"HWADDR_PRIx"\n",
> >>> + iova, int128_get64(int128_sub(llend, int128_one())));
> >>> + /*
> >>> + * FIXME: We should do some checking to see if the
> >>> + * capabilities of the host VFIO IOMMU are adequate to model
> >>> + * the guest IOMMU
> >>> + *
> >>> + * FIXME: This assumes that the guest IOMMU is empty of
> >>> + * mappings at this point - we should either enforce this, or
> >>> + * loop through existing mappings to map them into VFIO.
> >>> + *
> >>> + * FIXME: For VFIO iommu types which have KVM acceleration to
> >>> + * avoid bouncing all map/unmaps through qemu this way, this
> >>> + * would be the right place to wire that up (tell the KVM
> >>> + * device emulation the VFIO iommu handles to use).
> >>> + */
> >>
> >> That's a lot of FIXMEs... The second one in particular looks like it
> >> needs to expand a bit on why this is likely a valid assumption. The
> >> last one is more of a TODO than a FIXME.
> >
> > I think #2 isn't a valid assumption in general. It was true for the
> > situation I was testing at the time, due to the order of pseries
> > initialization, so I left it to get a proof of concept reasonably
> > quickly.
> >
> > But I think that one's a FIXME that actually needs to be fixed.
>
>
> But how?
>
> At the moment, for SPAPR, the table gets cleaned when group is attached to
> a container (VFIO_GROUP_SET_CONTAINER ioctl) which happens right before
> registering the memory listener so we are fine (at least for SPAPR).
> Is that true for x86 or we need something more?
>
> "loop through existing mappings to map them into VFIO" - this I do not
> really understand. We do not track mapping in QEMU so we cannot really loop
> through them.
Making registration of a guest IOMMU more like a MemoryListener would
solve this, the infrastructure should replay existing mappings on
startups and clear them on shutdown, then we could also get rid of the
FIXME on the region_del path.
> > [snip]
> >>> + /*
> >>> + * FIXME: We assume the one big unmap below is adequate to
> >>> + * remove any individual page mappings in the IOMMU which
> >>> + * might have been copied into VFIO. That may not be true for
> >>> + * all IOMMU types
> >>> + */
> >>
> >> We assume this because the IOVA that gets unmapped is the same
> >> regardless of whether a guest IOMMU is present?
> >
> > Uh.. no. This assumption works for a page table based IOMMU where a
> > big unmap just flattens a large range of IO-PTEs.
>
>
> Sorry for my poor english but how exactly is that different from what Alex
> said? IOVA is a PCI bus address between dma_window_start and
> dma_window_start+dma_window_size.
I think David is trying to describe that the IOMMU must be able to unmap
a sparse sub-region of a larger unmap, while I'm trying to make sure
there's no IOVA translation that might interfere with using the
"standard" unmap path rather than the guest IOMMU unmap path. Thanks,
Alex
> > It might not work
> > for some kind of extent or TLB based IOMMU, where operations are
> > expected to exactly match the addresses of map operations.
> >
> > I don't know if IOMMUs that have trouble with this are a realistic
> > prospect, but they're at least a theoretical possibility, hence the
> > comment.
> >
> >>
> >>> + }
> >>> +
> >>> iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> >>> end = (section->offset_within_address_space +
> >>> int128_get64(section->size)) &
> >>> TARGET_PAGE_MASK;
> >>
> >>
> >>
> >
>
>
- Re: [Qemu-devel] [PATCH v5 05/11] vfio: Introduce VFIO address spaces, (continued)
[Qemu-devel] [PATCH v5 03/11] vfio: Fix 128 bit handling, Alexey Kardashevskiy, 2014/03/12
[Qemu-devel] [PATCH v5 04/11] vfio: rework to have error paths, Alexey Kardashevskiy, 2014/03/12
[Qemu-devel] [PATCH v5 09/11] spapr vfio: add vfio_container_spapr_get_info(), Alexey Kardashevskiy, 2014/03/12
[Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support, Alexey Kardashevskiy, 2014/03/12
Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support, Alexey Kardashevskiy, 2014/03/21
Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support, Alex Williamson, 2014/03/21
Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support, Paolo Bonzini, 2014/03/21
Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support, Alexey Kardashevskiy, 2014/03/28
Re: [Qemu-devel] [PATCH v5 07/11] vfio: Add guest side IOMMU support, Alex Williamson, 2014/03/31
[Qemu-devel] [PATCH v5 02/11] int128: add int128_exts64(), Alexey Kardashevskiy, 2014/03/12
[Qemu-devel] [PATCH v5 08/11] spapr-iommu: add SPAPR VFIO IOMMU device, Alexey Kardashevskiy, 2014/03/12
[Qemu-devel] [PATCH v5 11/11] spapr-vfio: enable for spapr, Alexey Kardashevskiy, 2014/03/12