From: David Hildenbrand
Subject: Re: [PATCH 1/2] s390x/pci: add support for guests that request direct mapping
Date: Tue, 10 Dec 2024 09:58:36 +0100
User-agent: Mozilla Thunderbird

On 10.12.24 00:22, Matthew Rosato wrote:
> On 12/9/24 5:09 PM, David Hildenbrand wrote:
>> On 09.12.24 22:45, Matthew Rosato wrote:
>>> On 12/9/24 4:01 PM, David Hildenbrand wrote:
>>>> On 09.12.24 20:29, Matthew Rosato wrote:

>>>> Hi,
>>>>
>>>> Trying to wrap my head around that ... you mention that "pin the entirety of guest memory".
>>>>
>>>> Do you mean that we will actually end up longterm pinning all guest RAM in the kernel, similar to what vfio ends up doing?

>>> Yes.  Actually, the use case here is specifically PCI passthrough via vfio-pci on s390.  Unlike other platforms, the default s390 approach only pins on demand and doesn't longterm pin all guest RAM, which is nice from a memory footprint perspective but pays a price via all those guest-2 RPCIT instructions.  The goal now is to provide an optional alternative that longterm pins like other platforms do.

>> Okay, thanks for confirming. One more question: who will trigger this longterm-pinning? Is it vfio?
>>
>> (the code flow from your code to the pinning code would be nice)


Thanks for the details.


> Yes, the vfio IOMMU code triggers it.  My s390_pci_setup_stage2_map added by this patch calls memory_region_notify_iommu in a loop such that we trigger iommu notifications to map iova X+pba -> GPA X for all GPA, where pba = a "base address" offset that has to be applied when mapping on s390.
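For illustration, the shape of such a loop is roughly the following (a sketch only, not the patch's actual code; the function name, the chunk size, and the IOMMUMemoryRegion argument are placeholders):

/*
 * Sketch: emit IOMMU MAP notifications covering iova = GPA + pba -> GPA
 * for all of initial guest RAM; the vfio notifier picks these up and
 * DMA-maps (and thereby pins) each range.
 */
static void stage2_map_sketch(IOMMUMemoryRegion *iommu_mr, hwaddr pba,
                              hwaddr ram_size, hwaddr chunk)
{
    hwaddr gpa;

    for (gpa = 0; gpa < ram_size; gpa += chunk) {
        IOMMUTLBEvent event = {
            .type = IOMMU_NOTIFIER_MAP,
            .entry = {
                .target_as = &address_space_memory,
                .iova = gpa + pba,        /* s390 base address offset */
                .translated_addr = gpa,   /* identity map modulo pba */
                .addr_mask = chunk - 1,   /* chunk: power-of-two size */
                .perm = IOMMU_RW,
            },
        };
        memory_region_notify_iommu(iommu_mr, 0, event);
    }
}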

Ah, now I see that you use "ms->ram_size", so indeed, this will map initial RAM only.

The more-future-proof approach would indeed be to register a memory listener on &address_space_memory, to then map/unmap whenever notified about map/unmap.

See "struct vfio_memory_listener" with its region_add/region_del callbacks that do that, and also implement the RAMDiscardManager plumbing.


And now I wonder if there would be a way to just reuse "struct vfio_memory_listener" here? Or at least reuse the map/unmap functions, because ...

> The notifications are sent in the largest batches possible to minimize vfio ioctls / use the least number of vfio dma mappings.
>
> The iommu notifications get picked up in vfio_iommu_map_notify where we will follow the container DMA path down to vfio_legacy_dma_map; then ultimately vfio is handling the pinning via the VFIO_IOMMU_MAP_DMA ioctl.
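At the bottom of that path, the kernel-facing call looks roughly like this (sketch; the wrapper function is made up, but the structure and flags are the vfio type1 UAPI):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/*
 * Sketch of the mapping that vfio_legacy_dma_map() ends up issuing;
 * VFIO_IOMMU_MAP_DMA is where the kernel longterm-pins the pages.
 */
static int dma_map_sketch(int container_fd, void *vaddr, uint64_t iova,
                          uint64_t size)
{
    struct vfio_iommu_type1_dma_map map = {
        .argsz = sizeof(map),
        .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
        .vaddr = (uintptr_t)vaddr,  /* QEMU virtual address backing the GPA */
        .iova  = iova,              /* guest IOVA (GPA + pba on s390) */
        .size  = size,
    };

    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}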

... vfio_listener_region_add() will also just call vfio_container_dma_map().

Maybe there is a reason s390x needs to handle this using memory_region_notify_iommu() callbacks instead of doing it similar to "struct vfio_memory_listener" when registered on &address_space_memory without a viommu.




>>>> In that case, it would be incompatible with virtio-balloon (and, without modifications, with upcoming virtio-mem). Is there already a mechanism in place to handle that -- a call to ram_block_discard_disable() -- or even a way to support coordinated discarding of RAM (e.g., virtio-mem + vfio)?

>>> Good point, we should be adding a call to ram_block_discard_disable(true) during set register + a corresponding (false) during deregister...  Will add for v2.
>>>
>>> As for supporting coordinated discard, I was hoping to subsequently look at virtio-mem for this.
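That pairing is small (sketch; the function names are placeholders for wherever set register/deregister are handled):

/*
 * Fails with -EBUSY if RAM discarding is already active, e.g. an
 * inflated virtio-balloon; in that case direct mapping must be refused.
 */
static int dm_enable(void)
{
    return ram_block_discard_disable(true);
}

static void dm_disable(void)
{
    ram_block_discard_disable(false);
}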

>> As long as discarding is blocked for now, we're good. To support it, the RAMDiscardManager would have to be wired up, similar to vfio.

>> I think the current way of handling it via
>>
>> +    IOMMUTLBEvent event = {
>> +        .type = IOMMU_NOTIFIER_MAP,
>> +        .entry = {
>> +            .target_as = &address_space_memory,
>> +            .translated_addr = 0,
>> +            .perm = IOMMU_RW,
>> +        },
>> +    };


>> is probably not ideal: it cannot cope with memory holes (which virtio-mem would create).
>>
>> Likely, you'd instead want an address space notifier, and really only map the memory region sections you get notified about.
>>
>> There, you can test for RAMDiscardManager and handle it like vfio does.
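The plumbing in question, modeled on vfio_register_ram_discard_listener() (sketch; all names below are placeholders): for a section where memory_region_has_ram_discard_manager(section->mr) is true, register populate/discard callbacks instead of mapping the whole section, so only populated parts get pinned:

static int dm_notify_populate(RamDiscardListener *rdl,
                              MemoryRegionSection *section)
{
    /* DMA-map (and thus pin) only this now-populated part */
    return 0;
}

static void dm_notify_discard(RamDiscardListener *rdl,
                              MemoryRegionSection *section)
{
    /* DMA-unmap the discarded part so its pages can really be freed */
}

static void dm_register_rdm(MemoryRegionSection *section,
                            RamDiscardListener *rdl)
{
    RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr);

    ram_discard_listener_init(rdl, dm_notify_populate,
                              dm_notify_discard, true);
    ram_discard_manager_register_listener(rdm, rdl, section);
}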


> I'll start looking into this; for the moment I'll plan on blocking discarding in this series with a follow-on to then enable virtio-mem, but if I get something working sooner I'll add it to this series.  Either way I'll put you on CC.

Thanks, exploring whether we can reuse the vfio bits to map/unmap might be very valuable and simplify things.

--
Cheers,

David / dhildenb



