Re: [Qemu-devel] RFC: vfio API changes needed for powerpc

qemu-devel

From:	Scott Wood
Subject:	Re: [Qemu-devel] RFC: vfio API changes needed for powerpc
Date:	Wed, 3 Apr 2013 16:19:36 -0500

On 04/02/2013 10:37:20 PM, Alex Williamson wrote:

On Tue, 2013-04-02 at 17:50 -0500, Scott Wood wrote:
> On 04/02/2013 04:38:45 PM, Alex Williamson wrote:
> > On Tue, 2013-04-02 at 16:08 -0500, Stuart Yoder wrote:
> > > > On Tue, Apr 2, 2013 at 3:57 PM, Scott Wood <address@hidden> wrote:

> > > >> > C. Explicit mapping using normal DMA map. The last idea is that
> > > >> > we would introduce a new ioctl to give user-space an fd to
> > > >> > the MSI bank, which could be mmapped. The flow would be
> > > >> >        something like this:
> > > >> >           -for each group user space calls new ioctl
> > > >> > VFIO_GROUP_GET_MSI_FD
> > > >> >           -user space mmaps the fd, getting a vaddr
> > > >> >           -user space does a normal DMA map for desired iova
> > > >> > This approach makes everything explicit, but adds a new ioctl
> > > >> > applicable most likely only to the PAMU (type2 iommu).

> > > >>
> > > >> And the DMA_MAP of that mmap then allows userspace to select the window
> > > >> used?  This one seems like a lot of overhead, adding a new ioctl, new
> > > >> fd, mmap, special mapping path, etc.
> > > >
> > > >
> > > > > There's going to be special stuff no matter what.  This would keep it
> > > > > separated from the IOMMU map code.
> > > > >
> > > > > I'm not sure what you mean by "overhead" here... the runtime overhead of
> > > > > setting things up is not particularly relevant as long as it's reasonable.
> > > > > If you mean development and maintenance effort, keeping things well
> > > > > separated should help.
> > >

> > > We don't need to change DMA_MAP. If we can simply add a new"type

> > 2"
> > > ioctl that allows user space to set which windows are MSIs, it
> > seems vastly
> > > less complex than an ioctl to supply a new fd, mmap of it, etc.
> > >
> > > So maybe 2 ioctls:
> > >     VFIO_IOMMU_GET_MSI_COUNT
>
> Do you mean a count of actual MSIs or a count of MSI banks used by the
> whole VFIO group?

I hope the latter, which would clarify how this is distinct from
DEVICE_GET_IRQ_INFO.  Is hotplug even on the table?  Presumably
dynamically adding a device could bring along additional MSI banks?

I'm not sure -- maybe we could say that hotplug can add banks, but not
remove them or change the order, so userspace would just need to check
if the number of banks changed, and map the extras.
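
The append-only rule suggested above could be modeled as follows. This is a minimal sketch, not an existing VFIO interface: `map_new_banks`, its arguments, and the `map_bank` callback are hypothetical names chosen for illustration, and the bank count is assumed to come from whatever query ioctl is eventually settled on.

```python
# Hypothetical model of the "hotplug may only append banks" rule.
# None of these names are real VFIO interfaces.

def map_new_banks(mapped_count, current_count, map_bank):
    """Map banks [mapped_count, current_count) and return the new mapped count.

    Relies on the append-only assumption: existing banks keep their index,
    so only the extra indices need to be mapped after a hotplug event.
    """
    if current_count < mapped_count:
        raise ValueError("bank count shrank; append-only assumption violated")
    for index in range(mapped_count, current_count):
        map_bank(index)
    return current_count


mapped = []
count = map_new_banks(0, 2, mapped.append)      # initial setup: banks 0 and 1
count = map_new_banks(count, 3, mapped.append)  # hotplug brought along bank 2
```

Because banks are never removed or reordered in this model, userspace only has to remember how many banks it has already mapped.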

The current VFIO MSI support has the host handling everything about MSI.
The user never programs an MSI vector to the physical device, they set
up everything through ioctl.  On interrupt, we simply trigger an eventfd
and leave it to things like KVM irqfd or QEMU to do the right thing in a
virtual machine.

Here the MSI vector has to go through a PAMU window to hit the correct
MSI bank.  So that means it has some component of the iova involved,
which we're proposing here is controlled by userspace (whether that
vector uses an offset from 0x10000000 or 0x00000000 depending on which
window slot is used to make the MSI bank). I assume we're still working
in a model where the physical interrupt fires into the host and a
host-based interrupt handler triggers an eventfd, right?


Yes (subject to possible future optimizations).

So that means the vector also has host components so we trigger the
correct ISR.  How is that coordinated?

Everything but the iova component needs to come from the host MSI allocator.
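
A rough illustration of that split, assuming equally sized window slots of 0x10000000 bytes (matching the example addresses above). The function names, the in-bank offset, and the data value are hypothetical placeholders; the real values would come from the host MSI allocator.

```python
# Hypothetical model: userspace picks the window slot (the iova component),
# the host MSI allocator supplies the offset within the bank and the data.

WINDOW_SIZE = 0x10000000  # assumed slot size, taken from the example above


def window_base(slot):
    """iova at which the given window slot starts."""
    return slot * WINDOW_SIZE


def compose_msi(slot, host_offset, host_data):
    """Return the (address, data) pair a device would be programmed with."""
    if not 0 <= host_offset < WINDOW_SIZE:
        raise ValueError("host offset does not fit inside one window")
    return window_base(slot) + host_offset, host_data


# Same host-provided offset and data, two different userspace slot choices.
in_slot_0 = compose_msi(0, 0x140, 5)
in_slot_1 = compose_msi(1, 0x140, 5)
```

Only the address changes with the slot choice; the host-provided parts stay identical.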

Would it be possible for userspace to simply leave room for MSI bank
mapping (how much room could be determined by something like
VFIO_IOMMU_GET_MSI_BANK_COUNT) then document the API that userspace can
DMA_MAP starting at the 0x0 address of the aperture, growing up, and
VFIO will map banks on demand at the top of the aperture, growing down?
Wouldn't that avoid a lot of issues with userspace needing to know
anything about MSI banks (other than count) and coordinating irq numbers
and enabling handlers?

This would restrict a (possibly unlikely) use case where the user wants
to map something near the top of the aperture but has another place
MSIs can go (or is willing to live without MSIs).  Otherwise it could
be workable, as long as we can require an explicit MSI enabling on a
device to happen after the aperture and subwindow count are set up.
I'm not sure it would really buy anything over having userspace iterate
over the MSI bank count, though -- it would probably be a bit more
complicated.
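
The grow-up/grow-down layout discussed here can be sketched as a toy allocator. This is only a model under stated assumptions: the `Aperture` class and its methods are hypothetical, the aperture is assumed to be split into equal subwindows, and each MSI bank is assumed to occupy one subwindow.

```python
# Toy model: userspace DMA mappings use low slots, MSI banks are placed
# on demand from the top of the aperture downward. All names are hypothetical.

class Aperture:
    def __init__(self, size, subwindows):
        if size % subwindows:
            raise ValueError("aperture size must divide evenly into subwindows")
        self.window_size = size // subwindows
        self.user_slots = set()              # slots mapped by userspace
        self.next_msi_slot = subwindows - 1  # next slot handed to an MSI bank

    def dma_map(self, slot):
        """Userspace mapping; refused if the slot is held by an MSI bank."""
        if not 0 <= slot <= self.next_msi_slot:
            raise ValueError("slot is out of range or reserved for MSI")
        self.user_slots.add(slot)
        return slot * self.window_size

    def map_msi_bank(self):
        """On-demand bank mapping: take the highest slot not yet used for MSI."""
        slot = self.next_msi_slot
        if slot < 0 or slot in self.user_slots:
            raise RuntimeError("no room left for an MSI bank")
        self.next_msi_slot -= 1
        return slot * self.window_size


ap = Aperture(0x40000000, 4)   # setup happens before any MSI is enabled
low = ap.dma_map(0)            # userspace grows up from 0x0
bank0 = ap.map_msi_bank()      # banks grow down from the top
bank1 = ap.map_msi_bank()
```

The model also shows the restriction mentioned above: once a bank holds a top slot, userspace can no longer map there, and a bank cannot be placed where userspace already has a mapping.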

> > On x86 MSI count is very
> > device specific, which means it would be a VFIO_DEVICE_* ioctl (actually
> > VFIO_DEVICE_GET_IRQ_INFO does this for us on x86). The trouble with it
> > being a device ioctl is that you need to get the device FD, but the
> > IOMMU protection needs to be established before you can get that... so
> > there's an ordering problem if you need it from the device before
> > configuring the IOMMU.  Thanks,
>
> What do you mean by "IOMMU protection needs to be established"?
> Wouldn't we just start with no mappings in place?
If no mappings blocks all DMA, sure, that's fine. Once the VFIO device
FD is accessible by userspace we have to protect the host against DMA.
If any IOMMU_SET_ATTR calls temporarily disable DMA protection, that
could be exploitable.  Thanks,

Unless the PAMU is globally in bypass mode (which it wouldn't be),
there's no way to disable protection other than creating one giant
mapping.


-Scott
