qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v4 1/3] memory: introduce IOMMUNotifier and its


From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v4 1/3] memory: introduce IOMMUNotifier and its caps
Date: Wed, 14 Sep 2016 17:15:03 +1000
User-agent: Mutt/1.7.0 (2016-08-17)

On Wed, Sep 14, 2016 at 03:48:32PM +1000, David Gibson wrote:
> On Fri, Sep 09, 2016 at 10:57:42AM +0800, Peter Xu wrote:
> > IOMMU Notifier list is used for notifying IO address mapping changes.
> > Currently VFIO is the only user.
> > 
> > However it is possible that future consumer like vhost would like to
> > only listen to part of its notifications (e.g., cache invalidations).
> > 
> > This patch introduced IOMMUNotifier and IOMMUNotfierFlag bits for a
> > finer grained control of it.
> > 
> > IOMMUNotifier contains a bitfield for the notify consumer describing
> > what kind of notification it is interested in. Currently two kinds of
> > notifications are defined:
> > 
> > - IOMMU_NOTIFIER_MAP:    for newly mapped entries (additions)
> > - IOMMU_NOTIFIER_UNMAP:  for entries to be removed (cache invalidates)
> > 
> > When registering the IOMMU notifier, we need to specify one or multiple
> > types of messages to listen to.
> > 
> > When notifications are triggered, its type will be checked against the
> > notifier's type bits, and only notifiers with registered bits will be
> > notified.
> > 
> > Signed-off-by: Peter Xu <address@hidden>
> > ---
> >  hw/vfio/common.c              |  3 ++-
> >  include/exec/memory.h         | 38 +++++++++++++++++++++++++++++++-------
> >  include/hw/vfio/vfio-common.h |  2 +-
> >  memory.c                      | 37 ++++++++++++++++++++++++++++---------
> >  4 files changed, 62 insertions(+), 18 deletions(-)
> > 
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index b313e7c..41b6a13 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -293,7 +293,7 @@ static bool 
> > vfio_listener_skipped_section(MemoryRegionSection *section)
> >             section->offset_within_address_space & (1ULL << 63);
> >  }
> >  
> > -static void vfio_iommu_map_notify(Notifier *n, void *data)
> > +static void vfio_iommu_map_notify(IOMMUNotifier *n, void *data)
> >  {
> >      VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
> >      VFIOContainer *container = giommu->container;
> > @@ -454,6 +454,7 @@ static void vfio_listener_region_add(MemoryListener 
> > *listener,
> >                                 section->offset_within_region;
> >          giommu->container = container;
> >          giommu->n.notify = vfio_iommu_map_notify;
> > +        giommu->n.notifier_flags = IOMMU_NOTIFIER_ALL;
> >          QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
> >  
> >          memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
> > diff --git a/include/exec/memory.h b/include/exec/memory.h
> > index 3e4d416..e69e984 100644
> > --- a/include/exec/memory.h
> > +++ b/include/exec/memory.h
> > @@ -67,6 +67,27 @@ struct IOMMUTLBEntry {
> >      IOMMUAccessFlags perm;
> >  };
> >  
> > +/*
> > + * Bitmap for differnet IOMMUNotifier capabilities. Each notifier can
> > + * register with one or multiple IOMMU Notifier capability bit(s).
> > + */
> > +typedef enum {
> > +    IOMMU_NOTIFIER_NONE = 0,
> > +    /* Notify cache invalidations */
> > +    IOMMU_NOTIFIER_UNMAP = 0x1,
> > +    /* Notify entry changes (newly created entries) */
> > +    IOMMU_NOTIFIER_MAP = 0x2,
> > +} IOMMUNotifierFlag;
> > +
> > +#define IOMMU_NOTIFIER_ALL (IOMMU_NOTIFIER_MAP | IOMMU_NOTIFIER_UNMAP)
> > +
> > +struct IOMMUNotifier {
> > +    void (*notify)(struct IOMMUNotifier *notifier, void *data);
> > +    IOMMUNotifierFlag notifier_flags;
> > +    QLIST_ENTRY(IOMMUNotifier) node;
> > +};
> > +typedef struct IOMMUNotifier IOMMUNotifier;
> > +
> >  /* New-style MMIO accessors can indicate that the transaction failed.
> >   * A zero (MEMTX_OK) response means success; anything else is a failure
> >   * of some kind. The memory subsystem will bitwise-OR together results
> > @@ -201,7 +222,7 @@ struct MemoryRegion {
> >      const char *name;
> >      unsigned ioeventfd_nb;
> >      MemoryRegionIoeventfd *ioeventfds;
> > -    NotifierList iommu_notify;
> > +    QLIST_HEAD(, IOMMUNotifier) iommu_notify;
> >  };
> >  
> >  /**
> > @@ -620,11 +641,12 @@ void memory_region_notify_iommu(MemoryRegion *mr,
> >   * IOMMU translation entries.
> >   *
> >   * @mr: the memory region to observe
> > - * @n: the notifier to be added; the notifier receives a pointer to an
> > - *     #IOMMUTLBEntry as the opaque value; the pointer ceases to be
> > - *     valid on exit from the notifier.
> > + * @n: the IOMMUNotifier to be added; the notify callback receives a
> > + *     pointer to an #IOMMUTLBEntry as the opaque value; the pointer
> > + *     ceases to be valid on exit from the notifier.
> >   */
> > -void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n);
> > +void memory_region_register_iommu_notifier(MemoryRegion *mr,
> > +                                           IOMMUNotifier *n);
> >  
> >  /**
> >   * memory_region_iommu_replay: replay existing IOMMU translations to
> > @@ -636,7 +658,8 @@ void memory_region_register_iommu_notifier(MemoryRegion 
> > *mr, Notifier *n);
> >   * @is_write: Whether to treat the replay as a translate "write"
> >   *     through the iommu
> >   */
> > -void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, bool 
> > is_write);
> > +void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
> > +                                bool is_write);
> >  
> >  /**
> >   * memory_region_unregister_iommu_notifier: unregister a notifier for
> > @@ -646,7 +669,8 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
> > Notifier *n, bool is_write);
> >   *      needs to be called
> >   * @n: the notifier to be removed.
> >   */
> > -void memory_region_unregister_iommu_notifier(MemoryRegion *mr, Notifier 
> > *n);
> > +void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
> > +                                             IOMMUNotifier *n);
> >  
> >  /**
> >   * memory_region_name: get a memory region's name
> > diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > index 94dfae3..c17602e 100644
> > --- a/include/hw/vfio/vfio-common.h
> > +++ b/include/hw/vfio/vfio-common.h
> > @@ -93,7 +93,7 @@ typedef struct VFIOGuestIOMMU {
> >      VFIOContainer *container;
> >      MemoryRegion *iommu;
> >      hwaddr iommu_offset;
> > -    Notifier n;
> > +    IOMMUNotifier n;
> >      QLIST_ENTRY(VFIOGuestIOMMU) giommu_next;
> >  } VFIOGuestIOMMU;
> >  
> > diff --git a/memory.c b/memory.c
> > index 0eb6895..f65c600 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -1418,7 +1418,7 @@ void memory_region_init_iommu(MemoryRegion *mr,
> >      memory_region_init(mr, owner, name, size);
> >      mr->iommu_ops = ops,
> >      mr->terminates = true;  /* then re-forwards */
> > -    notifier_list_init(&mr->iommu_notify);
> > +    QLIST_INIT(&mr->iommu_notify);
> >  }
> >  
> >  static void memory_region_finalize(Object *obj)
> > @@ -1513,13 +1513,16 @@ bool memory_region_is_logging(MemoryRegion *mr, 
> > uint8_t client)
> >      return memory_region_get_dirty_log_mask(mr) & (1 << client);
> >  }
> >  
> > -void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n)
> > +void memory_region_register_iommu_notifier(MemoryRegion *mr,
> > +                                           IOMMUNotifier *n)
> >  {
> > +    /* We need to register for at least one bitfield */
> > +    assert(n->notifier_flags != IOMMU_NOTIFIER_NONE);
> >      if (mr->iommu_ops->notify_started &&
> > -        QLIST_EMPTY(&mr->iommu_notify.notifiers)) {
> > +        QLIST_EMPTY(&mr->iommu_notify)) {
> >          mr->iommu_ops->notify_started(mr);
> >      }
> > -    notifier_list_add(&mr->iommu_notify, n);
> > +    QLIST_INSERT_HEAD(&mr->iommu_notify, n, node);
> >  }
> >  
> >  uint64_t memory_region_iommu_get_min_page_size(MemoryRegion *mr)
> > @@ -1531,7 +1534,8 @@ uint64_t 
> > memory_region_iommu_get_min_page_size(MemoryRegion *mr)
> >      return TARGET_PAGE_SIZE;
> >  }
> >  
> > -void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n, bool 
> > is_write)
> > +void memory_region_iommu_replay(MemoryRegion *mr, IOMMUNotifier *n,
> > +                                bool is_write)
> >  {
> >      hwaddr addr, granularity;
> >      IOMMUTLBEntry iotlb;
> > @@ -1552,11 +1556,12 @@ void memory_region_iommu_replay(MemoryRegion *mr, 
> > Notifier *n, bool is_write)
> >      }
> >  }
> >  
> > -void memory_region_unregister_iommu_notifier(MemoryRegion *mr, Notifier *n)
> > +void memory_region_unregister_iommu_notifier(MemoryRegion *mr,
> > +                                             IOMMUNotifier *n)
> >  {
> > -    notifier_remove(n);
> > +    QLIST_REMOVE(n, node);
> >      if (mr->iommu_ops->notify_stopped &&
> > -        QLIST_EMPTY(&mr->iommu_notify.notifiers)) {
> > +        QLIST_EMPTY(&mr->iommu_notify)) {
> >          mr->iommu_ops->notify_stopped(mr);
> >      }
> >  }
> > @@ -1564,8 +1569,22 @@ void 
> > memory_region_unregister_iommu_notifier(MemoryRegion *mr, Notifier *n)
> >  void memory_region_notify_iommu(MemoryRegion *mr,
> >                                  IOMMUTLBEntry entry)
> >  {
> > +    IOMMUNotifier *iommu_notifier;
> > +    IOMMUNotifierFlag request_flags;
> > +
> >      assert(memory_region_is_iommu(mr));
> > -    notifier_list_notify(&mr->iommu_notify, &entry);
> > +
> > +    if (entry.perm & IOMMU_RW) {
> > +        request_flags = IOMMU_NOTIFIER_MAP;
> > +    } else {
> > +        request_flags = IOMMU_NOTIFIER_UNMAP;
> > +    }
> 
> This is still wrong.  UNMAP depends on the *previous* state of the
> mapping, not the new state.

Peter pointed out to be on IRC that VFIO already assumes that it's
only an unmap if the new permissions are NONE.  So one can argue that
it's an existing constraint of the IOMMUTLBEntry interface that a
mapping can only ever transition from valid->invalid or
invalid->valid.  Changing one valid entry to another would require two
notifications one switching it to a blank entry with NONE permissions,
then another notifying the new valid mapping.

Assuming that constraint, Peter's patch is correct.

I'm pretty uneasy about that constraint, because it's not necessarily
obvious to someone implementing a new vIOMMU device, which is
responsible for triggering the notifies.  From just the callback, it
looks like it should be fine to just fire the notify with the new
mapping which replaced the old.

Peter suggested commenting this next to the IOTLBEntry definition, and
I think that's probably ok for now.  I do think we should consider
changing the notify interface to make this more obvious.  I can see
one of two ways to do that:

    * Fully allow in-place changes to be notified - the callback would
      need to be passed both the new entry and at least the old
      permissions, if not the old entry.

    * Instead have separate map and unmap notifier chains with
      separate callbacks.  That should make it obvious to a vIOMMU
      author that an in-place change would need first an unmap
      notify, then a map notify.

> 
> > +
> > +    QLIST_FOREACH(iommu_notifier, &mr->iommu_notify, node) {
> > +        if (iommu_notifier->notifier_flags & request_flags) {
> > +            iommu_notifier->notify(iommu_notifier, &entry);
> > +        }
> > +    }
> >  }
> >  
> >  void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
> 



-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]