qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v7 09/16] hw/vfio/platform: add vfio-platform su


From: Eric Auger
Subject: Re: [Qemu-devel] [PATCH v7 09/16] hw/vfio/platform: add vfio-platform support
Date: Wed, 26 Nov 2014 10:45:52 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 11/05/2014 11:29 AM, Alexander Graf wrote:
> 
> 
> On 31.10.14 15:05, Eric Auger wrote:
>> Minimal VFIO platform implementation supporting
>> - register space user mapping,
>> - IRQ assignment based on eventfds handled on qemu side.
>>
>> irqfd kernel acceleration comes in a subsequent patch.
>>
>> Signed-off-by: Kim Phillips <address@hidden>
>> Signed-off-by: Eric Auger <address@hidden>
>>
>> ---
>> v6 -> v7:
>> - compat is not exposed anymore as a user option. Rationale is
>>   the vfio device became abstract and a specialization is needed
>>   anyway. The derived device must set the compat string.
>> - in v6 vfio_start_irq_injection was exposed in vfio-platform.h.
>>   A new function dubbed vfio_register_irq_starter replaces it. It
>>   registers a machine init done notifier that programs & starts
>>   all dynamic VFIO device IRQs. This function is supposed to be
>>   called by the machine file. A set of static helper routines are
>>   added too. It must be called before the creation of the platform
>>   bus device.
>>
>> v5 -> v6:
>> - vfio_device property renamed into host property
>> - correct error handling of VFIO_DEVICE_GET_IRQ_INFO ioctl
>>   and remove PCI related comment
>> - remove declaration of vfio_setup_irqfd and irqfd_allowed
>>   property.Both belong to next patch (irqfd)
>> - remove declaration of vfio_intp_interrupt in vfio-platform.h
>> - functions that can be static get this characteristic
>> - remove declarations of vfio_region_ops, vfio_memory_listener,
>>   group_list, vfio_address_spaces. All are moved to vfio-common.h
>> - remove vfio_put_device declaration and definition
>> - print_regions removed. code moved into vfio_populate_regions
>> - replace DPRINTF by trace events
>> - new helper routine to set the trigger eventfd
>> - dissociate intp init from the injection enablement:
>>   vfio_enable_intp renamed into vfio_init_intp and new function
>>   named vfio_start_eventfd_injection
>> - injection start moved to vfio_start_irq_injection (not anymore
>>   in vfio_populate_interrupt)
>> - new start_irq_fn field in VFIOPlatformDevice corresponding to
>>   the function that will be used for starting injection
>> - user handled eventfd:
>>   x add mutex to protect IRQ state & list manipulation,
>>   x correct misleading comment in vfio_intp_interrupt.
>>   x Fix bugs thanks to fake interrupt modality
>> - VFIOPlatformDeviceClass becomes abstract
>> - add error_setg in vfio_platform_realize
>>
>> v4 -> v5:
>> - vfio-plaform.h included first
>> - cleanup error handling in *populate*, vfio_get_device,
>>   vfio_enable_intp
>> - vfio_put_device not called anymore
>> - add some includes to follow vfio policy
>>
>> v3 -> v4:
>> [Eric Auger]
>> - merge of "vfio: Add initial IRQ support in platform device"
>>   to get a full functional patch although perfs are limited.
>> - removal of unrealize function since I currently understand
>>   it is only used with device hot-plug feature.
>>
>> v2 -> v3:
>> [Eric Auger]
>> - further factorization between PCI and platform (VFIORegion,
>>   VFIODevice). same level of functionality.
>>
>> <= v2:
>> [Kim Philipps]
>> - Initial Creation of the device supporting register space mapping
>> ---
>>  hw/vfio/Makefile.objs           |   1 +
>>  hw/vfio/platform.c              | 672 
>> ++++++++++++++++++++++++++++++++++++++++
>>  include/hw/vfio/vfio-common.h   |   1 +
>>  include/hw/vfio/vfio-platform.h |  87 ++++++
>>  trace-events                    |  12 +
>>  5 files changed, 773 insertions(+)
>>  create mode 100644 hw/vfio/platform.c
>>  create mode 100644 include/hw/vfio/vfio-platform.h
>>
>> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
>> index e31f30e..c5c76fe 100644
>> --- a/hw/vfio/Makefile.objs
>> +++ b/hw/vfio/Makefile.objs
>> @@ -1,4 +1,5 @@
>>  ifeq ($(CONFIG_LINUX), y)
>>  obj-$(CONFIG_SOFTMMU) += common.o
>>  obj-$(CONFIG_PCI) += pci.o
>> +obj-$(CONFIG_SOFTMMU) += platform.o
>>  endif
>> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
>> new file mode 100644
>> index 0000000..9f66610
>> --- /dev/null
>> +++ b/hw/vfio/platform.c
>> @@ -0,0 +1,672 @@
>> +/*
>> + * vfio based device assignment support - platform devices
>> + *
>> + * Copyright Linaro Limited, 2014
>> + *
>> + * Authors:
>> + *  Kim Phillips <address@hidden>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + *
>> + * Based on vfio based PCI device assignment support:
>> + *  Copyright Red Hat, Inc. 2012
>> + */
>> +
>> +#include <linux/vfio.h>
>> +#include <sys/ioctl.h>
>> +
>> +#include "hw/vfio/vfio-platform.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/range.h"
>> +#include "sysemu/sysemu.h"
>> +#include "exec/memory.h"
>> +#include "qemu/queue.h"
>> +#include "hw/sysbus.h"
>> +#include "trace.h"
>> +#include "hw/platform-bus.h"
>> +
>> +static void vfio_intp_interrupt(VFIOINTp *intp);
>> +typedef void (*eventfd_user_side_handler_t)(VFIOINTp *intp);
>> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
>> +                                    eventfd_user_side_handler_t handler);
>> +
>> +/*
>> + * Functions only used when eventfd are handled on user-side
>> + * ie. without irqfd
>> + */
>> +
>> +/**
>> + * vfio_platform_eoi - IRQ completion routine
>> + * @vbasedev: the VFIO device
>> + *
>> + * de-asserts the active virtual IRQ and unmask the physical IRQ
>> + * (masked by the  VFIO driver). Handle pending IRQs if any.
>> + * eoi function is called on the first access to any MMIO region
>> + * after an IRQ was triggered. It is assumed this access corresponds
>> + * to the IRQ status register reset. With such a mechanism, a single
>> + * IRQ can be handled at a time since there is no way to know which
>> + * IRQ was completed by the guest (we would need additional details
>> + * about the IRQ status register mask)
>> + */
>> +static void vfio_platform_eoi(VFIODevice *vbasedev)
>> +{
>> +    VFIOINTp *intp;
>> +    VFIOPlatformDevice *vdev =
>> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> +
>> +    qemu_mutex_lock(&vdev->intp_mutex);
>> +    QLIST_FOREACH(intp, &vdev->intp_list, next) {
>> +        if (intp->state == VFIO_IRQ_ACTIVE) {
>> +            trace_vfio_platform_eoi(intp->pin,
>> +                                event_notifier_get_fd(&intp->interrupt));
>> +            intp->state = VFIO_IRQ_INACTIVE;
>> +
>> +            /* deassert the virtual IRQ and unmask physical one */
>> +            qemu_set_irq(intp->qemuirq, 0);
>> +            vfio_unmask_irqindex(vbasedev, intp->pin);
>> +
>> +            /* a single IRQ can be active at a time */
>> +            break;
>> +        }
>> +    }
>> +    /* in case there are pending IRQs, handle them one at a time */
>> +    if (!QSIMPLEQ_EMPTY(&vdev->pending_intp_queue)) {
>> +        intp = QSIMPLEQ_FIRST(&vdev->pending_intp_queue);
>> +        trace_vfio_platform_eoi_handle_pending(intp->pin);
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
>> +        vfio_intp_interrupt(intp);
>> +        qemu_mutex_lock(&vdev->intp_mutex);
>> +        QSIMPLEQ_REMOVE_HEAD(&vdev->pending_intp_queue, pqnext);
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
>> +    } else {
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
>> +    }
>> +}
>> +
>> +/**
>> + * vfio_mmap_set_enabled - enable/disable the fast path mode
>> + * @vdev: the VFIO platform device
>> + * @enabled: the target mmap state
>> + *
>> + * true ~ fast path = MMIO region is mmaped (no KVM TRAP)
>> + * false ~ slow path = MMIO region is trapped and region callbacks
>> + * are called slow path enables to trap the IRQ status register
>> + * guest reset
>> +*/
>> +
>> +static void vfio_mmap_set_enabled(VFIOPlatformDevice *vdev, bool enabled)
>> +{
>> +    VFIORegion *region;
>> +    int i;
>> +
>> +    trace_vfio_platform_mmap_set_enabled(enabled);
>> +
>> +    for (i = 0; i < vdev->vbasedev.num_regions; i++) {
>> +        region = vdev->regions[i];
>> +
>> +        /* register space is unmapped to trap EOI */
>> +        memory_region_set_enabled(&region->mmap_mem, enabled);
>> +    }
>> +}
>> +
>> +/**
>> + * vfio_intp_mmap_enable - timer function, restores the fast path
>> + * if there is no more active IRQ
>> + * @opaque: actually points to the VFIO platform device
>> + *
>> + * Called on mmap timer timout, this function checks whether the
>> + * IRQ is still active and in the negative restores the fast path.
>> + * by construction a single eventfd is handled at a time.
>> + * if the IRQ is still active, the timer is restarted.
>> + */
>> +static void vfio_intp_mmap_enable(void *opaque)
>> +{
>> +    VFIOINTp *tmp;
>> +    VFIOPlatformDevice *vdev = (VFIOPlatformDevice *)opaque;
>> +
>> +    qemu_mutex_lock(&vdev->intp_mutex);
>> +    QLIST_FOREACH(tmp, &vdev->intp_list, next) {
>> +        if (tmp->state == VFIO_IRQ_ACTIVE) {
>> +            trace_vfio_platform_intp_mmap_enable(tmp->pin);
>> +            /* re-program the timer to check active status later */
>> +            timer_mod(vdev->mmap_timer,
>> +                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
>> +                          vdev->mmap_timeout);
>> +            qemu_mutex_unlock(&vdev->intp_mutex);
>> +            return;
>> +        }
>> +    }
>> +    vfio_mmap_set_enabled(vdev, true);
>> +    qemu_mutex_unlock(&vdev->intp_mutex);
>> +}
>> +
>> +/**
>> + * vfio_intp_interrupt - The user-side eventfd handler
>> + * @opaque: opaque pointer which in practice is the VFIOINTp*
>> + *
>> + * the function can be entered
>> + * - in event handler context: this IRQ is inactive
>> + *   in that case, the vIRQ is injected into the guest if there
>> + *   is no other active or pending IRQ.
>> + * - in IOhandler context: this IRQ is pending.
>> + *   there is no ACTIVE IRQ
>> + */
>> +static void vfio_intp_interrupt(VFIOINTp *intp)
>> +{
>> +    int ret;
>> +    VFIOINTp *tmp;
>> +    VFIOPlatformDevice *vdev = intp->vdev;
>> +    bool delay_handling = false;
>> +
>> +    qemu_mutex_lock(&vdev->intp_mutex);
>> +    if (intp->state == VFIO_IRQ_INACTIVE) {
>> +        QLIST_FOREACH(tmp, &vdev->intp_list, next) {
>> +            if (tmp->state == VFIO_IRQ_ACTIVE ||
>> +                tmp->state == VFIO_IRQ_PENDING) {
>> +                delay_handling = true;
>> +                break;
>> +            }
>> +        }
>> +    }
>> +    if (delay_handling) {
>> +        /*
>> +         * the new IRQ gets a pending status and is pushed in
>> +         * the pending queue
>> +         */
>> +        intp->state = VFIO_IRQ_PENDING;
>> +        trace_vfio_intp_interrupt_set_pending(intp->pin);
>> +        QSIMPLEQ_INSERT_TAIL(&vdev->pending_intp_queue,
>> +                             intp, pqnext);
>> +        ret = event_notifier_test_and_clear(&intp->interrupt);
>> +        qemu_mutex_unlock(&vdev->intp_mutex);
>> +        return;
>> +    }
>> +
>> +    /* no active IRQ, the new IRQ can be forwarded to the guest */
>> +    trace_vfio_platform_intp_interrupt(intp->pin,
>> +                              event_notifier_get_fd(&intp->interrupt));
>> +
>> +    if (intp->state == VFIO_IRQ_INACTIVE) {
>> +        ret = event_notifier_test_and_clear(&intp->interrupt);
>> +        if (!ret) {
>> +            error_report("Error when clearing fd=%d (ret = %d)\n",
>> +                         event_notifier_get_fd(&intp->interrupt), ret);
>> +        }
>> +    } /* else this is a pending IRQ that moves to ACTIVE state */
>> +
>> +    intp->state = VFIO_IRQ_ACTIVE;
>> +
>> +    /* sets slow path */
>> +    vfio_mmap_set_enabled(vdev, false);
>> +
>> +    /* trigger the virtual IRQ */
>> +    qemu_set_irq(intp->qemuirq, 1);
>> +
>> +    /* schedule the mmap timer which will restore mmap path after EOI*/
>> +    if (vdev->mmap_timeout) {
>> +        timer_mod(vdev->mmap_timer,
>> +                  qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
>> +                      vdev->mmap_timeout);
>> +    }
>> +    qemu_mutex_unlock(&vdev->intp_mutex);
>> +}
>> +
>> +/**
>> + * vfio_start_eventfd_injection - starts the virtual IRQ injection using
>> + * user-side handled eventfds
>> + * @intp: the IRQ struct pointer
>> + */
>> +
>> +static int vfio_start_eventfd_injection(VFIOINTp *intp)
>> +{
>> +    int ret;
>> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
>> +
>> +    vfio_mask_irqindex(vbasedev, intp->pin);
>> +
>> +    ret = vfio_set_trigger_eventfd(intp, vfio_intp_interrupt);
>> +    if (ret) {
>> +        error_report("vfio: Error: Failed to pass IRQ fd to the driver: 
>> %m");
>> +        vfio_unmask_irqindex(vbasedev, intp->pin);
>> +        return ret;
>> +    }
>> +    vfio_unmask_irqindex(vbasedev, intp->pin);
>> +    return 0;
>> +}
>> +
>> +/*
>> + * Functions used whatever the injection method
>> + */
>> +
>> +/**
>> + * vfio_set_trigger_eventfd - set VFIO eventfd handling
>> + * ie. program the VFIO driver to associates a given IRQ index
>> + * with a fd handler
>> + *
>> + * @intp: IRQ struct pointer
>> + * @handler: handler to be called on eventfd trigger
>> + */
>> +static int vfio_set_trigger_eventfd(VFIOINTp *intp,
>> +                                    eventfd_user_side_handler_t handler)
>> +{
>> +    VFIODevice *vbasedev = &intp->vdev->vbasedev;
>> +    struct vfio_irq_set *irq_set;
>> +    int argsz, ret;
>> +    int32_t *pfd;
>> +
>> +    argsz = sizeof(*irq_set) + sizeof(*pfd);
>> +    irq_set = g_malloc0(argsz);
>> +    irq_set->argsz = argsz;
>> +    irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | 
>> VFIO_IRQ_SET_ACTION_TRIGGER;
>> +    irq_set->index = intp->pin;
>> +    irq_set->start = 0;
>> +    irq_set->count = 1;
>> +    pfd = (int32_t *)&irq_set->data;
>> +    *pfd = event_notifier_get_fd(&intp->interrupt);
>> +    qemu_set_fd_handler(*pfd, (IOHandler *)handler, NULL, intp);
>> +    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set);
>> +    g_free(irq_set);
>> +    if (ret < 0) {
>> +        error_report("vfio: Failed to set trigger eventfd: %m");
>> +        qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
>> +    }
>> +    return ret;
>> +}
>> +
>> +/* not implemented yet */
>> +static bool vfio_platform_compute_needs_reset(VFIODevice *vdev)
>> +{
>> +return false;
>> +}
>> +
>> +/* not implemented yet */
>> +static int vfio_platform_hot_reset_multi(VFIODevice *vdev)
>> +{
>> +return 0;
>> +}
>> +
>> +/**
>> + * vfio_init_intp - allocate, initialize the IRQ struct pointer
>> + * and add it into the list of IRQ
>> + * @vbasedev: the VFIO device
>> + * @index: VFIO device IRQ index
>> + */
>> +static VFIOINTp *vfio_init_intp(VFIODevice *vbasedev, unsigned int index)
>> +{
>> +    int ret;
>> +    VFIOPlatformDevice *vdev =
>> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(vdev);
>> +    VFIOINTp *intp;
>> +
>> +    /* allocate and populate a new VFIOINTp structure put in a queue list */
>> +    intp = g_malloc0(sizeof(*intp));
>> +    intp->vdev = vdev;
>> +    intp->pin = index;
>> +    intp->state = VFIO_IRQ_INACTIVE;
>> +    sysbus_init_irq(sbdev, &intp->qemuirq);
>> +
>> +    /* Get an eventfd for trigger */
>> +    ret = event_notifier_init(&intp->interrupt, 0);
>> +    if (ret) {
>> +        g_free(intp);
>> +        error_report("vfio: Error: trigger event_notifier_init failed ");
>> +        return NULL;
>> +    }
>> +
>> +    /* store the new intp in qlist */
>> +    QLIST_INSERT_HEAD(&vdev->intp_list, intp, next);
>> +    return intp;
>> +}
>> +
>> +/**
>> + * vfio_populate_device - initialize MMIO region and IRQ
>> + * @vbasedev: the VFIO device
>> + *
>> + * query the VFIO device for exposed MMIO regions and IRQ and
>> + * populate the associated fields in the device struct
>> + */
>> +static int vfio_populate_device(VFIODevice *vbasedev)
>> +{
>> +    struct vfio_irq_info irq = { .argsz = sizeof(irq) };
>> +    struct vfio_region_info reg_info = { .argsz = sizeof(reg_info) };
>> +    VFIOINTp *intp;
>> +    int i, ret = 0;
>> +    VFIOPlatformDevice *vdev =
>> +        container_of(vbasedev, VFIOPlatformDevice, vbasedev);
>> +
>> +    vdev->regions = g_malloc0(sizeof(VFIORegion *) * vbasedev->num_regions);
>> +
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        vdev->regions[i] = g_malloc0(sizeof(VFIORegion));
>> +        reg_info.index = i;
>> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, &reg_info);
>> +        if (ret) {
>> +            error_report("vfio: Error getting region %d info: %m", i);
>> +            goto error;
>> +        }
>> +        vdev->regions[i]->flags = reg_info.flags;
>> +        vdev->regions[i]->size = reg_info.size;
>> +        vdev->regions[i]->fd_offset = reg_info.offset;
>> +        vdev->regions[i]->nr = i;
>> +        vdev->regions[i]->vbasedev = vbasedev;
>> +
>> +        trace_vfio_platform_populate_regions(vdev->regions[i]->nr,
>> +                            (unsigned long)vdev->regions[i]->flags,
>> +                            (unsigned long)vdev->regions[i]->size,
>> +                            vdev->regions[i]->vbasedev->fd,
>> +                            (unsigned long)vdev->regions[i]->fd_offset);
>> +    }
>> +
>> +    vdev->mmap_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
>> +                                    vfio_intp_mmap_enable, vdev);
>> +
>> +    QSIMPLEQ_INIT(&vdev->pending_intp_queue);
>> +
>> +    for (i = 0; i < vbasedev->num_irqs; i++) {
>> +        irq.index = i;
>> +
>> +        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
>> +        if (ret) {
>> +            error_printf("vfio: error getting device %s irq info",
>> +                         vbasedev->name);
>> +            return ret;
>> +        } else {
>> +            trace_vfio_platform_populate_interrupts(irq.index,
>> +                                                    irq.count,
>> +                                                    irq.flags);
>> +            intp = vfio_init_intp(vbasedev, irq.index);
>> +            if (!intp) {
>> +                error_report("vfio: Error installing IRQ %d up", i);
>> +                return ret;
>> +            }
>> +        }
>> +    }
>> +    return 0;
>> +error:
>> +    return ret;
>> +}
>> +
>> +/*
>> + * vfio_start_irq_injection - associates a virtual irq to a
>> + * VFIO IRQ index and start the injection of this IRQ
>> + * @s: SysBus Device
>> + * @index: VFIO IRQ index
>> + * @virq: the virtual IRQ number, aka gsi
>> + *
>> + * this function is called when the device tree is built
>> + */
>> +static void vfio_start_irq_injection(SysBusDevice *s, int index, int virq)
>> +{
>> +    VFIOPlatformDevice *vdev = container_of(s, VFIOPlatformDevice, sbdev);
>> +    VFIOINTp *intp;
>> +
>> +    QLIST_FOREACH(intp, &vdev->intp_list, next) {
>> +        if (intp->pin == index) {
>> +            intp->virtualID = virq;
>> +            vdev->start_irq_fn(intp);
>> +        }
>> +    }
>> +}
>> +
>> +/* specialized functions ofr VFIO Platform devices */
>> +static VFIODeviceOps vfio_platform_ops = {
>> +    .vfio_compute_needs_reset = vfio_platform_compute_needs_reset,
>> +    .vfio_hot_reset_multi = vfio_platform_hot_reset_multi,
>> +    .vfio_eoi = vfio_platform_eoi,
>> +    .vfio_populate_device = vfio_populate_device,
>> +};
>> +
>> +/**
>> + * vfio_base_device_init - implements some of the VFIO mechanics
>> + * @vbasedev: the VFIO device
>> + *
>> + * retrieves the group the device belongs to and get the device fd
>> + * returns the VFIO device fd
>> + * precondition: the device name must be initialized
>> + */
>> +static int vfio_base_device_init(VFIODevice *vbasedev)
>> +{
>> +    VFIOGroup *group;
>> +    VFIODevice *vbasedev_iter;
>> +    char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
>> +    ssize_t len;
>> +    struct stat st;
>> +    int groupid;
>> +    int ret;
>> +
>> +    /* name must be set prior to the call */
>> +    if (!vbasedev->name) {
>> +        return -EINVAL;
>> +    }
>> +
>> +    /* Check that the host device exists */
>> +    snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
>> +             vbasedev->name);
>> +
>> +    if (stat(path, &st) < 0) {
>> +        error_report("vfio: error: no such host device: %s", path);
>> +        return -errno;
>> +    }
>> +
>> +    strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
>> +    len = readlink(path, iommu_group_path, sizeof(path));
>> +    if (len <= 0 || len >= sizeof(path)) {
>> +        error_report("vfio: error no iommu_group for device");
>> +        return len < 0 ? -errno : ENAMETOOLONG;
>> +    }
>> +
>> +    iommu_group_path[len] = 0;
>> +    group_name = basename(iommu_group_path);
>> +
>> +    if (sscanf(group_name, "%d", &groupid) != 1) {
>> +        error_report("vfio: error reading %s: %m", path);
>> +        return -errno;
>> +    }
>> +
>> +    trace_vfio_platform_base_device_init(vbasedev->name, groupid);
>> +
>> +    group = vfio_get_group(groupid, &address_space_memory);
>> +    if (!group) {
>> +        error_report("vfio: failed to get group %d", groupid);
>> +        return -ENOENT;
>> +    }
>> +
>> +    snprintf(path, sizeof(path), "%s", vbasedev->name);
>> +
>> +    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
>> +        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
>> +            error_report("vfio: error: device %s is already attached", 
>> path);
>> +            vfio_put_group(group);
>> +            return -EBUSY;
>> +        }
>> +    }
>> +    ret = vfio_get_device(group, path, vbasedev);
>> +    if (ret) {
>> +        error_report("vfio: failed to get device %s", path);
>> +        vfio_put_group(group);
>> +    }
>> +    return ret;
>> +}
>> +
>> +/**
>> + * vfio_map_region - initialize the 2 mr (mmapped on ops) for a
>> + * given index
>> + * @vdev: the VFIO platform device
>> + * @nr: the index of the region
>> + *
>> + * init the top memory region and the mmapped memroy region beneath
>> + * VFIOPlatformDevice is used since VFIODevice is not a QOM Object
>> + * and could not be passed to memory region functions
>> +*/
>> +static void vfio_map_region(VFIOPlatformDevice *vdev, int nr)
>> +{
>> +    VFIORegion *region = vdev->regions[nr];
>> +    unsigned size = region->size;
>> +    char name[64];
>> +
>> +    if (!size) {
>> +        return;
>> +    }
>> +
>> +    snprintf(name, sizeof(name), "VFIO %s region %d",
>> +             vdev->vbasedev.name, nr);
>> +
>> +    /* A "slow" read/write mapping underlies all regions */
>> +    memory_region_init_io(&region->mem, OBJECT(vdev), &vfio_region_ops,
>> +                          region, name, size);
>> +
>> +    strncat(name, " mmap", sizeof(name) - strlen(name) - 1);
>> +
>> +    if (vfio_mmap_region(OBJECT(vdev), region, &region->mem,
>> +                         &region->mmap_mem, &region->mmap, size, 0, name)) {
>> +        error_report("%s unsupported. Performance may be slow", name);
>> +    }
>> +}
>> +
>> +/**
>> + * vfio_platform_realize  - the device realize function
>> + * @dev: device state pointer
>> + * @errp: error
>> + *
>> + * initialize the device, its memory regions and IRQ structures
>> + * IRQ are started separately
>> + */
>> +static void vfio_platform_realize(DeviceState *dev, Error **errp)
>> +{
>> +    VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(dev);
>> +    SysBusDevice *sbdev = SYS_BUS_DEVICE(dev);
>> +    VFIODevice *vbasedev = &vdev->vbasedev;
>> +    int i, ret;
>> +
>> +    vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
>> +    vbasedev->ops = &vfio_platform_ops;
>> +    vdev->start_irq_fn = vfio_start_eventfd_injection;
>> +
>> +    trace_vfio_platform_realize(vbasedev->name, vdev->compat);
>> +
>> +    ret = vfio_base_device_init(vbasedev);
>> +    if (ret) {
>> +        error_setg(errp, "vfio: vfio_base_device_init failed for %s",
>> +                   vbasedev->name);
>> +        return;
>> +    }
>> +
>> +    for (i = 0; i < vbasedev->num_regions; i++) {
>> +        vfio_map_region(vdev, i);
>> +        sysbus_init_mmio(sbdev, &vdev->regions[i]->mem);
>> +    }
>> +}
>> +
>> +/*
>> + * Mechanics to program/start irq injection on machine init done notifier:
>> + * this is needed since at finalize time, the device IRQ are not yet
>> + * bound to the platform bus IRQ. It is assumed here dynamic instantiation
>> + * always is used. Binding to the platform bus IRQ happens on a machine
>> + * init done notifier registered by the machine file. After its execution
>> + * we execute a new notifier that actually starts the injection. When using
>> + * irqfd, programming the injection consists in associating eventfds to
>> + * GSI number,ie. virtual IRQ number
>> + */
>> +
>> +typedef struct VfioIrqStarterNotifierParams {
>> +    unsigned int platform_bus_first_irq;
>> +    Notifier notifier;
>> +} VfioIrqStarterNotifierParams;
>> +
>> +typedef struct VfioIrqStartParams {
>> +    PlatformBusDevice *pbus;
>> +    int platform_bus_first_irq;
>> +} VfioIrqStartParams;
>> +
>> +/* Start injection of IRQ for a specific VFIO device */
>> +static int vfio_irq_starter(SysBusDevice *sbdev, void *opaque)
>> +{
>> +    int i;
>> +    VfioIrqStartParams *p = opaque;
>> +    VFIOPlatformDevice *vdev;
>> +    VFIODevice *vbasedev;
>> +    uint64_t irq_number;
>> +    PlatformBusDevice *pbus = p->pbus;
>> +    int platform_bus_first_irq = p->platform_bus_first_irq;
>> +
>> +    if (object_dynamic_cast(OBJECT(sbdev), TYPE_VFIO_PLATFORM)) {
>> +        vdev = VFIO_PLATFORM_DEVICE(sbdev);
>> +        vbasedev = &vdev->vbasedev;
>> +        for (i = 0; i < vbasedev->num_irqs; i++) {
>> +            irq_number = platform_bus_get_irqn(pbus, sbdev, i)
>> +                             + platform_bus_first_irq;
>> +            vfio_start_irq_injection(sbdev, i, irq_number);
>> +        }
>> +    }
>> +    return 0;
>> +}
>> +
>> +/* loop on all VFIO platform devices and start their IRQ injection */
>> +static void vfio_irq_starter_notify(Notifier *notifier, void *data)
>> +{
>> +    VfioIrqStarterNotifierParams *p =
>> +        container_of(notifier, VfioIrqStarterNotifierParams, notifier);
>> +    DeviceState *dev =
>> +        qdev_find_recursive(sysbus_get_default(), TYPE_PLATFORM_BUS_DEVICE);
>> +    PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(dev);
>> +
>> +    if (pbus->done_gathering) {
>> +        VfioIrqStartParams data = {
>> +            .pbus = pbus,
>> +            .platform_bus_first_irq = p->platform_bus_first_irq,
>> +        };
>> +
>> +        foreach_dynamic_sysbus_device(vfio_irq_starter, &data);
>> +    }
>> +}
>> +
>> +/* registers the machine init done notifier that will start VFIO IRQ */
>> +void vfio_register_irq_starter(int platform_bus_first_irq)
>> +{
>> +    VfioIrqStarterNotifierParams *p = g_new(VfioIrqStarterNotifierParams, 
>> 1);
>> +
>> +    p->platform_bus_first_irq = platform_bus_first_irq;
>> +    p->notifier.notify = vfio_irq_starter_notify;
>> +    qemu_add_machine_init_done_notifier(&p->notifier);
> 
> Could you add a notifier for each device instead? Then the notifier
> would be part of the vfio device struct and not some dangling random
> pointer :).
> 
> Of course instead of foreach_dynamic_sysbus_device() you would directly
> know the device you're dealing with and only handle a single device per
> notifier.

Hi Alex,

I don't see how to practically follow your request:

- at machine init time, VFIO devices are not yet instantiated so I
cannot call foreach_dynamic_sysbus_device() there - I was definitively
wrong in my first reply :-().

- I can't register a per VFIO device notifier in the VFIO device
finalize function because this latter is called after the platform bus
instantiation. So the IRQ binding notifier (registered in platform bus
finalize fn) would be called after the IRQ starter notifier.

- then to simplify things a bit I could use a qemu_register_reset in
place of a machine init done notifier (would relax the call order
constraint) but the problem consists in passing the platform bus first
irq (all the more so you requested it became part of a const struct)

Do I miss something?

Best Regards

Eric
> 
> 
> Alex
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]