
Re: [RFC PATCH 0/6] Enable shared device assignment


From: David Hildenbrand
Subject: Re: [RFC PATCH 0/6] Enable shared device assignment
Date: Wed, 31 Jul 2024 13:18:35 +0200
User-agent: Mozilla Thunderbird

Sorry for the late reply!

Current users must skip it, yes. How private memory would have to be
handled, and who would handle it, is rather unclear.

Again, maybe we'd want separate RamDiscardManager for private and shared
memory (after all, these are two separate memory backends).

We also considered distinguishing the populate and discard operations for
private and shared memory. As in method 2 above, we mentioned adding a new
argument to indicate the memory attribute to operate on. The two approaches
seem to follow a similar idea.
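For illustration only, the extra argument being discussed might look something
like the sketch below (the MemAttribute enum and the added parameter are made
up for this example, not the existing QEMU RamDiscardManager interface):

/* Hypothetical sketch: populate/discard notifications that also carry
 * the memory attribute (shared vs. private) they operate on. */
typedef enum {
    MEM_ATTR_SHARED,
    MEM_ATTR_PRIVATE,
} MemAttribute;

typedef int (*NotifyPopulateFn)(RamDiscardListener *rdl,
                                MemoryRegionSection *section,
                                MemAttribute attr);
typedef void (*NotifyDiscardFn)(RamDiscardListener *rdl,
                                MemoryRegionSection *section,
                                MemAttribute attr);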

Yes. Likely it's just some implementation detail. I think the following states would be possible:

* Discarded in shared + discarded in private (not populated)
* Discarded in shared + populated in private (private populated)
* Populated in shared + discarded in private (shared populated)

Indeed, one could map these to the states discarded/private/shared.
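As a rough sketch only (assumed names, not existing code), that mapping could
look like:

#include <assert.h>
#include <stdbool.h>

/* Sketch: combined per-block state derived from two per-attribute bitmaps. */
typedef enum {
    BLOCK_DISCARDED,  /* discarded in shared + discarded in private */
    BLOCK_PRIVATE,    /* discarded in shared + populated in private */
    BLOCK_SHARED,     /* populated in shared + discarded in private */
} BlockState;

static BlockState block_state(bool shared_populated, bool private_populated)
{
    if (shared_populated) {
        /* populated in both attributes is not one of the states listed above */
        assert(!private_populated);
        return BLOCK_SHARED;
    }
    return private_populated ? BLOCK_PRIVATE : BLOCK_DISCARDED;
}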

[...]

I've had this talk with Intel, because the 4K granularity is a pain. I
was told that ship has sailed ... and we have to cope with random 4K
conversions :(

The many mappings will likely add both memory and runtime overhead in the
kernel, but we'll only know once we measure.

In the normal case, the main runtime overhead comes from the private<->shared
flips in SWIOTLB, which defaults to 6% of memory with a maximum of 1 GiB. I
think this overhead is acceptable. In non-default cases, e.g. dynamically
allocated DMA buffers, the runtime overhead will increase. As for the memory
overhead, it is indeed unavoidable.
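To put rough numbers on that default (6% of guest memory, capped at 1 GiB),
here is a small back-of-the-envelope helper; the sizing rule is taken from the
description above, the code itself is just illustrative:

#include <stdint.h>
#include <stdio.h>

/* Illustrative only: default SWIOTLB size as described above --
 * 6% of guest memory, capped at 1 GiB. */
static uint64_t swiotlb_default_bytes(uint64_t guest_mem_bytes)
{
    uint64_t six_percent = guest_mem_bytes * 6 / 100;
    uint64_t cap = 1ULL << 30; /* 1 GiB */
    return six_percent < cap ? six_percent : cap;
}

int main(void)
{
    /* A 16 GiB guest stays just under the cap (~983 MiB of SWIOTLB). */
    printf("%llu MiB\n",
           (unsigned long long)(swiotlb_default_bytes(16ULL << 30) >> 20));
    return 0;
}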

Will these performance issues be a deal breaker for enabling shared
device assignment in this way?

I see the most problematic part being the dma_entry_limit and all of these individual MAP/UNMAP calls on 4KiB granularity.

dma_entry_limit is an "unsigned int" and defaults to U16_MAX. So the possible maximum is 4294967295 (U32_MAX), and the default is 65535.

So we should be able to have a maximum of 16 TiB shared memory all in 4KiB chunks.

sizeof(struct vfio_dma) is probably something like <= 96 bytes, implying a per-page overhead of ~2.4%, excluding the actual rbtree.
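Spelling out the arithmetic behind those two estimates (the 96-byte entry size
is an assumption, as noted above):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* dma_entry_limit is an unsigned int, so at most ~2^32 tracked mappings. */
    uint64_t max_entries = UINT32_MAX;
    uint64_t page_size   = 4096;  /* 4 KiB per MAP/UNMAP */
    uint64_t entry_size  = 96;    /* assumed upper bound for struct vfio_dma */

    /* ~16 TiB of shared memory mappable in 4 KiB chunks */
    printf("max mapped: ~%.0f TiB\n",
           (double)max_entries * page_size / (1ULL << 40));

    /* per-page tracking overhead, excluding the rbtree nodes themselves */
    printf("per-page overhead: %.2f%%\n", 100.0 * entry_size / page_size);
    return 0;
}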

Tree lookup/modifications with that many nodes might also get a bit slower, but likely still tolerable as you note.

Deal breaker? Not sure. Rather "suboptimal" :) ... but maybe unavoidable for your use case?

--
Cheers,

David / dhildenb



