
From: Jean-Philippe Brucker
Subject: Re: [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
Date: Tue, 30 Jan 2024 18:22:39 +0000

On Mon, Jan 29, 2024 at 05:38:55PM +0100, Eric Auger wrote:
> > There may be a separate argument for clearing bypass. With a coldplugged
> > VFIO device the flow is:
> >
> > 1. Map the whole guest address space in VFIO to implement boot-bypass.
> >    This allocates all guest pages, which takes a while and is wasteful.
> >    I've actually crashed a host that way, when spawning a guest with too
> >    much RAM.
> interesting
> > 2. Start the VM
> > 3. When the virtio-iommu driver attaches a (non-identity) domain to the
> >    assigned endpoint, then unmap the whole address space in VFIO, and most
> >    pages are given back to the host.
> >
> > We can't disable boot-bypass because the BIOS needs it. But instead the
> > flow could be:
> >
> > 1. Start the VM, with only the virtual endpoints. Nothing to pin.
> > 2. The virtio-iommu driver disables bypass during boot
> We needed this boot-bypass mode for booting with virtio-blk-scsi
> protected with virtio-iommu for instance.
> That was needed because we don't have any virtio-iommu driver in edk2,
> as opposed to the intel iommu driver, right?

Yes. What I had in mind is the x86 SeaBIOS which doesn't have any IOMMU
driver and accesses the default SATA device:

 $ qemu-system-x86_64 -M q35 -device virtio-iommu,boot-bypass=off
 qemu: virtio_iommu_translate sid=250 is not known!!
 qemu: no buffer available in event queue to report event
 qemu: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address

But it's the same problem with edk2. Also a guest OS without a
virtio-iommu driver needs boot-bypass. Once firmware boot is complete, the
OS with a virtio-iommu driver can normally turn bypass off in the config
space, since it is no longer useful. If it needs to put some endpoints in
bypass, then it can attach them to a bypass domain.

> > 3. Hotplug the VFIO device. With bypass disabled there is no need to pin
> >    the whole guest address space, unless the guest explicitly asks for an
> >    identity domain.
> >
> > However, I don't know if this is a realistic scenario that will actually
> > be used.
> >
> > By the way, do you have an easy way to reproduce the issue described here?
> > I've had to enable iommu.forcedac=1 on the command-line, otherwise Linux
> > just allocates 32-bit IOVAs.
> I don't have a simple generic reproducer. It happens when assigning this
> device:
> Ethernet Controller E810-C for QSFP (Ethernet Network Adapter E810-C-Q2)
> 
> I have not encountered that issue with another device yet.
> I see on guest side in dmesg:
> [    6.849292] ice 0000:00:05.0: Using 64-bit DMA addresses
> 
> That's emitted in dma-iommu.c iommu_dma_alloc_iova().
> Looks like the guest first tries to allocate an iova in the 32-bit AS
> and if this fails use the whole dma_limit.
> Seems the 32b IOVA alloc failed here ;-)

Interesting, are you running a demanding workload with a lot of CPUs?
That's a lot of IOVAs used up; I'm curious what kind of DMA pattern
does that.

Thanks,
Jean


