qemu-devel

Re: [PATCH v3] intel-iommu: Document iova_tree


From: Eric Auger
Subject: Re: [PATCH v3] intel-iommu: Document iova_tree
Date: Wed, 7 Dec 2022 10:51:24 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.1

Hi Peter,

On 12/6/22 23:13, Peter Xu wrote:
> It is not very clear when iova_tree is used, and why.  Add a detailed
> comment above iova_tree to document why we need the iova_tree, and when
> we need it.
>
> Also comment the map/unmap messages, on how they're used and their
> implications (e.g. an unmap can be larger than the mapped ranges).
>
> Suggested-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> ---
> v3:
> - Adjust according to Eric's comment
> ---
>  include/exec/memory.h         | 28 ++++++++++++++++++++++++++
>  include/hw/i386/intel_iommu.h | 38 ++++++++++++++++++++++++++++++++++-
>  2 files changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index 91f8a2395a..269ecb873b 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -129,6 +129,34 @@ struct IOMMUTLBEntry {
>  /*
>   * Bitmap for different IOMMUNotifier capabilities. Each notifier can
>   * register with one or multiple IOMMU Notifier capability bit(s).
> + *
> + * Normally there are two use cases for the notifiers:
> + *
> + *   (1) When the device needs accurate synchronization of the vIOMMU page
> + *       tables, it needs to register for both MAP and UNMAP notifications
> + *       (defined together as IOMMU_NOTIFIER_IOTLB_EVENTS below).
> + *
> + *       Accurate synchronization is needed when the notified device
> + *       maintains a shadow page table and must be notified on each
> + *       guest MAP (page table entry creation) and UNMAP (invalidation)
> + *       event (e.g. VFIO). Both notifications must be accurate so that
> + *       the shadow page table is fully in sync with the guest view.
> + *
> + *   (2) When the device doesn't need accurate synchronization of the
> + *       vIOMMU page tables, it needs to register only for UNMAP or
> + *       DEVIOTLB_UNMAP notifications.
> + *
> + *       This is the case when the device maintains a cache of IOMMU
> + *       translations (IOTLB) and is able to fill that cache by requesting
> + *       translations from the vIOMMU through a protocol similar to ATS
> + *       (Address Translation Service).
> + *
> + *       Note that in this mode the vIOMMU will not maintain a shadowed
> + *       page table for the address space, and the UNMAP messages can
> + *       actually be larger than the real invalidations (just like how the
> + *       Linux IOMMU driver normally works, where an invalidation can be
> + *       enlarged as long as it still covers the target range).  The IOMMU
> + *       notifiee should be able to take care of over-sized invalidations.
>   */
>  typedef enum {
>      IOMMU_NOTIFIER_NONE = 0,
> diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
> index 46d973e629..89dcbc5e1e 100644
> --- a/include/hw/i386/intel_iommu.h
> +++ b/include/hw/i386/intel_iommu.h
> @@ -109,7 +109,43 @@ struct VTDAddressSpace {
>      QLIST_ENTRY(VTDAddressSpace) next;
>      /* Superset of notifier flags that this address space has */
>      IOMMUNotifierFlag notifier_flags;
> -    IOVATree *iova_tree;          /* Traces mapped IOVA ranges */
> +    /*
> +     * @iova_tree traces mapped IOVA ranges.
> +     *
> +     * The tree is not needed if no MAP notifier is registered with the
> +     * current VTD address space, because all guest invalidate commands
> +     * can be directly passed to the IOMMU UNMAP notifiers without any
> +     * further reshuffling.
> +     *
> +     * The tree OTOH is required for MAP-typed IOMMU notifiers for a few
> +     * reasons.
> +     *
> +     * Firstly, there's no way to identify whether a PSI (Page Selective
> +     * Invalidation) or DSI (Domain Selective Invalidation) event is a
> +     * MAP or UNMAP event within the message itself.  Without prior
> +     * knowledge of the existing state, the vIOMMU doesn't know whether it
> +     * should notify MAP or UNMAP for a PSI message it received when
> +     * caching mode is enabled (for MAP notifiers).
> +     *
> +     * Secondly, PSI messages received from the guest driver can be
> +     * enlarged in range, covering but not limited to what the guest
> +     * driver wanted to invalidate.  When the range to invalidate gets
> +     * bigger than the limit of a PSI message, it can even become a DSI,
> +     * which will invalidate the whole domain.  If the vIOMMU directly
> +     * notifies the registered device with the unmodified range, it may
> +     * confuse the registered drivers (e.g. vfio-pci) by either:
> +     *
> +     *   (1) Trying to map the same region more than once (for
> +     *       VFIO_IOMMU_MAP_DMA, -EEXIST will trigger), or,
> +     *
> +     *   (2) Trying to UNMAP a range that is still partially mapped.
> +     *
> +     * That accuracy is not required for UNMAP-only notifiers, but it is a
> +     * must-have for notifiers registered with MAP events, because the
> +     * vIOMMU needs to make sure the shadow page table is always in sync
> +     * with the guest IOMMU pgtables for a device.
> +     */
> +    IOVATree *iova_tree;
>  };
>  
>  struct VTDIOTLBEntry {
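
For readers of the archive, case (2) in the memory.h comment can be illustrated with a small standalone sketch. It is not QEMU code: CacheEntry and cache_invalidate() are hypothetical names, and the ranges are inclusive [first, last] purely to keep the arithmetic free of overflow. It only shows why an UNMAP-only notifiee can tolerate an over-sized invalidation: it drops whatever cached translations overlap the range and refills them on demand later.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device-side translation cache entry (not a QEMU type). */
typedef struct {
    uint64_t first;  /* first IOVA covered by the cached translation */
    uint64_t last;   /* last IOVA covered (inclusive) */
    bool valid;
} CacheEntry;

/*
 * Drop every valid entry that overlaps the inclusive range [first, last].
 * The range may be much larger than what the guest really changed; that is
 * harmless here, because a dropped entry is simply re-requested from the
 * vIOMMU the next time the device needs it.
 * Returns the number of entries dropped.
 */
static size_t cache_invalidate(CacheEntry *cache, size_t n,
                               uint64_t first, uint64_t last)
{
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (!cache[i].valid) {
            continue;
        }
        /* Two inclusive ranges overlap iff each starts before the other ends. */
        if (cache[i].first <= last && first <= cache[i].last) {
            cache[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}
```

A whole-domain invalidation is then just cache_invalidate(cache, n, 0, UINT64_MAX); no bookkeeping of previously mapped ranges is needed on the vIOMMU side.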

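The role that the new iova_tree comment describes can likewise be sketched in a toy model. Everything here is an assumption made for illustration: a fixed 16-page address space, a bool array standing in for the IOVA tree, a second bool array standing in for the guest page table, and the hypothetical names Shadow, Event and sync_range(). The point is only that, with a record of what is already mapped, an enlarged or repeated invalidation is turned into exactly the MAP and UNMAP events that change the shadow state.

```c
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16  /* toy address space, in pages (assumption) */

typedef enum { EV_MAP, EV_UNMAP } EventType;

typedef struct {
    EventType type;
    size_t page;
} Event;

/* Stand-in for the IOVA tree: which pages the notifiee currently has mapped. */
typedef struct {
    bool mapped[NPAGES];
} Shadow;

/*
 * Handle an invalidation covering pages [first, last] (inclusive). The
 * message itself does not say whether it is a MAP or an UNMAP, so each page
 * is compared against the recorded state: only pages whose guest state
 * differs from the shadow produce an event. Returns the number of events
 * written to out (at most out_cap).
 */
static size_t sync_range(Shadow *s, const bool *guest_present,
                         size_t first, size_t last,
                         Event *out, size_t out_cap)
{
    size_t n = 0;

    for (size_t p = first; p <= last && p < NPAGES; p++) {
        if (guest_present[p] == s->mapped[p]) {
            continue;           /* already in sync: no duplicate MAP/UNMAP */
        }
        if (n == out_cap) {
            break;              /* caller's buffer is full */
        }
        out[n].type = guest_present[p] ? EV_MAP : EV_UNMAP;
        out[n].page = p;
        n++;
        s->mapped[p] = guest_present[p];  /* record the new state */
    }
    return n;
}
```

Replaying the same domain-wide invalidation twice yields no events the second time, which is the property that avoids the -EEXIST and partial-unmap cases listed in the comment.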