qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC] New device API


From: Marcelo Tosatti
Subject: Re: [Qemu-devel] [RFC] New device API
Date: Fri, 8 May 2009 02:27:55 -0300
User-agent: Mutt/1.5.19 (2009-01-05)

Paul,

On Tue, May 05, 2009 at 12:31:09PM +0100, Paul Brook wrote:
> The attached patch is my attempt at a new internal API for device creation in 
> qemu.
> 
> The long term goal here is fully dynamic machine construction, i.e. a user 
> supplied configuration file/tree rather than the current hardcoded board 
> configs.
> 
> As a first step towards this, I've implemented an API that allows devices to 
> be created and configured without requiring device specific knowledge.
> 
> There are two sides to this API.  The device side allows a device to 
> register/configure functionality (e.g. MMIO regions, IRQs, character devices, 
> etc). Most of the infrastructure for this is already present, we just need to 
> configure it consistently, rather than on an ad-hoc basis. I expect this API 
> to remain largely unchanged[1].
> 
> The board side allows creation on configuration of devices. In the medium 
> term 
> I expect this to go away, and be replaced by data driven configuration.
> 
> I also expect that this device abstraction will let itself to a system bus 
> model to solve many of the IOMMU/DMA/address mapping issues that qemu 
> currently can't handle.
> 
> There are still a few rough edges. Busses aren't handled in a particularly 
> consistent way, PCI isn't particularly well integrated, and the 
> implementation of things like qdev_get_chardev is an ugly hack mapping onto 
> current commandline options. However I believe the device side API is fairly 
> solid, and is a necessary prerequisite for fixing the bigger configuration 
> problem.
> 
> I've converted a bunch of devices, anyone interested in seeing how it fits 
> together in practice can pull from
>   git://repo.or.cz/qemu/pbrook.git qdev
> 
> It you have objections to or suggestion about this approach please speak up 
> now, but please bear in ming that this code is still somewhat in flux and the 
> caveats mentioned above.
> 
> Paul
> 
> [1] I subscribe to the linux kernel theory that stable internal APIs are a 
> coincidence, not a feature. i.e. a consistent internal API is good, but that 
> it should be changed whenever technically desirable. I have no interest in 
> maintaining a fixed API for the benefit of third parties.

Makes lots of sense. And also its perfectly fine for the API to evolve.

> commit 11c765848af6cb345a81f2722f141b305538182d
> Author: Paul Brook <address@hidden>
> Date:   Mon May 4 17:13:08 2009 +0100
> 
>     Basic qdev infrastructure.
>     
>     Signed-off-by: Paul Brook <address@hidden>
> 
> diff --git a/Makefile.target b/Makefile.target
> index f735105..a35e724 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -498,6 +498,7 @@ endif #CONFIG_BSD_USER
>  # System emulator target
>  ifndef CONFIG_USER_ONLY
>  
> +DEVICES=
>  OBJS=vl.o osdep.o monitor.o pci.o loader.o isa_mmio.o machine.o dma-helpers.o
>  # virtio has to be here due to weird dependency between PCI and virtio-net.
>  # need to fix this properly
> @@ -686,6 +687,13 @@ ifeq ($(TARGET_BASE_ARCH), m68k)
>  OBJS+= an5206.o mcf5206.o mcf_uart.o mcf_intc.o mcf5208.o mcf_fec.o
>  OBJS+= m68k-semi.o dummy_m68k.o
>  endif
> +
> +DEVICE_OBJS=$(addsuffix .o,$(DEVICES))
> +$(DEVICE_OBJS): CPPFLAGS+=-DDEVICE_NAME=$(subst -,_,$(@:.o=))
> +
> +OBJS+=$(DEVICE_OBJS)
> +OBJS+=devices.o qdev.o
> +
>  ifdef CONFIG_GDBSTUB
>  OBJS+=gdbstub.o gdbstub-xml.o
>  endif
> @@ -752,6 +760,9 @@ endif
>  qemu-options.h: $(SRC_PATH)/qemu-options.hx
>       $(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@,"  GEN   
> $(TARGET_DIR)$@")
>  
> +devices.c: gen_devices.sh Makefile.target config.mak
> +     $(call quiet-command,$(SHELL) $(SRC_PATH)/gen_devices.sh $@ $(subst 
> -,_,$(DEVICES)),"  GEN   $(TARGET_DIR)$@")
> +
>  clean:
>       rm -f *.o *.a *~ $(PROGS) nwfpe/*.o fpu/*.o qemu-options.h gdbstub-xml.c
>       rm -f *.d */*.d tcg/*.o
> diff --git a/gen_devices.sh b/gen_devices.sh
> new file mode 100644
> index 0000000..1d6ec12
> --- /dev/null
> +++ b/gen_devices.sh
> @@ -0,0 +1,17 @@
> +#! /bin/sh
> +# Call device init functions.
> +
> +file="$1"
> +shift
> +devices="$@"
> +echo '/* Generated by gen_devices.sh */' > $file
> +echo '#include "sysemu.h"' >> $file
> +for x in $devices ; do
> +    echo "void ${x}_register(void);" >> $file
> +done
> +echo "void register_devices(void)" >> $file
> +echo "{" >> $file
> +for x in $devices ; do
> +    echo "    ${x}_register();" >> $file
> +done
> +echo "}" >> $file
> diff --git a/hw/qdev.c b/hw/qdev.c
> new file mode 100644
> index 0000000..eb8c172
> --- /dev/null
> +++ b/hw/qdev.c
> @@ -0,0 +1,294 @@
> +/*
> + *  Dynamic device configuration and creation.
> + *
> + *  Copyright (c) 2009 CodeSourcery
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library; if not, write to the Free Software
> + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
> + */
> +
> +#include "hw.h"
> +#include "qdev.h"
> +#include "sysemu.h"
> +
> +#include <assert.h>
> +
> +struct DeviceProperty
> +{
> +    const char *name;
> +    union {
> +        int i;
> +        void *ptr;
> +    } value;
> +    DeviceProperty *next;
> +};
> +
> +struct DeviceType
> +{
> +    const char *name;
> +    qdev_initfn init;
> +    void *opaque;
> +    int size;
> +    DeviceType *next;
> +};
> +
> +static DeviceType *device_type_list;

Perhaps "device driver" is more suitable? Or "device emulator"??

> +
> +/* Register a new device type.  */
> +DeviceType *qdev_register(const char *name, int size, qdev_initfn init,
> +                          void *opaque)
> +{
> +    DeviceType *t;
> +
> +    assert(size >= sizeof(DeviceState));
> +
> +    t = qemu_mallocz(sizeof(DeviceType));
> +    t->next = device_type_list;
> +    device_type_list = t;
> +    t->name = qemu_strdup(name);
> +    t->size = size;
> +    t->init = init;
> +    t->opaque = opaque;
> +
> +    return t;
> +}

void virtio_blk_register(void)
{
    pci_qdev_register("virtio-blk", sizeof(VirtIOBlock), virtio_blk_init);
}

So you'd expect all "device types" to be registered by the time the
actual machine initialization starts. Similar to modular device drivers
in Linux.

    bs = qdev_init_bdrv(&pci_dev->qdev, IF_VIRTIO);
    s->vdev.get_config = virtio_blk_update_config;

And eventually you'd move qdev_init_bdrv (and all BlockDriverState
knowledge) to happen somewhere else other than device emulation code?

> +
> +/* Create a new device.  This only initializes the device state structure
> +   and allows properties to be set.  qdev_init should be called to
> +   initialize the actual device emulation.  */
> +DeviceState *qdev_create(const char *name)
> +{
> +    DeviceType *t;
> +    DeviceState *dev;
> +
> +    for (t = device_type_list; t; t = t->next) {
> +        if (strcmp(t->name, name) == 0) {
> +            break;
> +        }
> +    }
> +    if (!t) {
> +        fprintf(stderr, "Unknown device '%s'\n", name);
> +        exit(1);
> +    }
> +
> +    dev = qemu_mallocz(t->size);
> +    dev->name = name;
> +    dev->type = t;
> +    return dev;
> +}
> +
> +/* Initialize a device.  Device properties should be set before calling
> +   this function.  IRQs and MMIO regions should be connected/mapped after
> +   calling this function.  */
> +void qdev_init(DeviceState *dev)
> +{
> +    dev->type->init(dev, dev->type->opaque);
> +}
>
> +
> +void qdev_connect_irq(DeviceState *dev, int n, qemu_irq irq)
> +{
> +    assert(n >= 0 && n < dev->num_irq);
> +    dev->irqs[n] = 0;
> +    if (dev->irqp[n]) {
> +        *dev->irqp[n] = irq;
> +    }
> +}
> +
> +void qdev_mmio_map(DeviceState *dev, int n, target_phys_addr_t addr)
> +{
> +    assert(n >= 0 && n < dev->num_mmio);
> +
> +    if (dev->mmio[n].addr == addr) {
> +        /* ??? region already mapped here.  */
> +        return;
> +    }
> +    if (dev->mmio[n].addr != (target_phys_addr_t)-1) {
> +        /* Unregister previous mapping.  */
> +        cpu_register_physical_memory(dev->mmio[n].addr, dev->mmio[n].size,
> +                                     IO_MEM_UNASSIGNED);
> +    }
> +    dev->mmio[n].addr = addr;
> +    if (dev->mmio[n].cb) {
> +        dev->mmio[n].cb(dev, addr);
> +    } else {
> +        cpu_register_physical_memory(addr, dev->mmio[n].size,
> +                                     dev->mmio[n].iofunc);
> +    }
> +}
> +
> +void qdev_set_bus(DeviceState *dev, void *bus)
> +{
> +    assert(!dev->bus);
> +    dev->bus = bus;
> +}
> +
> +static DeviceProperty *create_prop(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    /* TODO: Check for duplicate properties.  */
> +    prop = qemu_mallocz(sizeof(*prop));
> +    prop->name = qemu_strdup(name);
> +    prop->next = dev->props;
> +    dev->props = prop;
> +
> +    return prop;
> +}
> +
> +void qdev_set_prop_int(DeviceState *dev, const char *name, int value)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = create_prop(dev, name);
> +    prop->value.i = value;
> +}
> +
> +void qdev_set_prop_ptr(DeviceState *dev, const char *name, void *value)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = create_prop(dev, name);
> +    prop->value.ptr = value;
> +}
> +
> +
> +qemu_irq qdev_get_irq_sink(DeviceState *dev, int n)
> +{
> +    assert(n >= 0 && n < dev->num_irq_sink);
> +    return dev->irq_sink[n];
> +}
> +
> +void qdev_init_mmio(DeviceState *dev, target_phys_addr_t size, int iofunc)
> +{
> +    int n;
> +
> +    assert(dev->num_mmio < QDEV_MAX_MMIO);
> +    n = dev->num_mmio++;
> +    dev->mmio[n].addr = -1;
> +    dev->mmio[n].size = size;
> +    dev->mmio[n].iofunc = iofunc;
> +}
> +
> +void qdev_init_mmio_cb(DeviceState *dev, target_phys_addr_t size,
> +                       mmio_mapfunc cb)
> +{
> +    int n;
> +
> +    assert(dev->num_mmio < QDEV_MAX_MMIO);
> +    n = dev->num_mmio++;
> +    dev->mmio[n].addr = -1;
> +    dev->mmio[n].size = size;
> +    dev->mmio[n].cb = cb;
> +}
> +
> +/* Request an IRQ source.  The actual IRQ object may be populated later.  */
> +void qdev_init_irq(DeviceState *dev, qemu_irq *p)
> +{
> +    int n;
> +
> +    assert(dev->num_irq < QDEV_MAX_IRQ);
> +    n = dev->num_irq++;
> +    dev->irqp[n] = p;
> +}
> +
> +/* Pass IRQs from a target device.  */
> +void qdev_pass_irq(DeviceState *dev, DeviceState *target)
> +{
> +    int i;
> +    assert(dev->num_irq == 0);
> +    dev->num_irq = target->num_irq;
> +    for (i = 0; i < dev->num_irq; i++) {
> +        dev->irqp[i] = target->irqp[i];
> +    }
> +}
> +
> +/* Register device IRQ sinks.  */
> +void qdev_init_irq_sink(DeviceState *dev, qemu_irq_handler handler, int nirq)
> +{
> +    dev->num_irq_sink = nirq;
> +    dev->irq_sink = qemu_allocate_irqs(handler, dev, nirq);
> +}
> +
> +/* Get a character (serial) device interface.  */
> +CharDriverState *qdev_init_chardev(DeviceState *dev)
> +{
> +    static int next_serial;
> +    static int next_virtconsole;
> +    if (strncmp(dev->name, "virtio", 6) == 0) {
> +        return virtcon_hds[next_virtconsole++];
> +    } else {
> +        return serial_hds[next_serial++];
> +    }
> +}
> +
> +void *qdev_get_bus(DeviceState *dev)
> +{
> +    return dev->bus;
> +}
> +
> +static DeviceProperty *find_prop(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    for (prop = dev->props; prop; prop = prop->next) {
> +        if (strcmp(prop->name, name) == 0) {
> +            return prop;
> +        }
> +    }
> +    /* TODO: When this comes from a config file we will need to handle
> +       missing properties gracefully.  */
> +    abort();
> +}
> +
> +int qdev_get_prop_int(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = find_prop(dev, name);
> +    return prop->value.i;
> +}
> +
> +void *qdev_get_prop_ptr(DeviceState *dev, const char *name)
> +{
> +    DeviceProperty *prop;
> +
> +    prop = find_prop(dev, name);
> +    return prop->value.ptr;
> +}
> +
> +DeviceState *qdev_create_varargs(const char *name,
> +                                 target_phys_addr_t addr, ...)
> +{
> +    DeviceState *dev;
> +    va_list va;
> +    qemu_irq irq;
> +    int n;
> +
> +    dev = qdev_create(name);
> +    qdev_init(dev);
> +    if (addr != (target_phys_addr_t)-1) {
> +        qdev_mmio_map(dev, 0, addr);
> +    }
> +    va_start(va, addr);
> +    n = 0;
> +    while (1) {
> +        irq = va_arg(va, qemu_irq);
> +        if (!irq) {
> +            break;
> +        }
> +        qdev_connect_irq(dev, n, irq);
> +        n++;
> +    }
> +    return dev;
> +}
> diff --git a/hw/qdev.h b/hw/qdev.h
> new file mode 100644
> index 0000000..08b2713
> --- /dev/null
> +++ b/hw/qdev.h
> @@ -0,0 +1,89 @@
> +#ifndef QDEV_H
> +#define QDEV_H
> +
> +#include "irq.h"
> +
> +typedef struct DeviceType DeviceType;
> +
> +#define QDEV_MAX_MMIO 5
> +#define QDEV_MAX_IRQ 32
> +
> +typedef struct DeviceProperty DeviceProperty;
> +
> +typedef void (*mmio_mapfunc)(DeviceState *dev, target_phys_addr_t addr);
> +
> +/* This structure should not be accessed directly.  We declare it here
> +   so that it can be embedded in individual device state structures.  */
> +struct DeviceState
> +{
> +    const char *name;
> +    DeviceType *type;
> +    void *bus;

I suppose parent_bus is more descriptive?.

Here you would have a tree node, eventually.

> +    int num_irq;
> +    qemu_irq irqs[QDEV_MAX_IRQ];
> +    qemu_irq *irqp[QDEV_MAX_IRQ];
> +    int num_irq_sink;
> +    qemu_irq *irq_sink;

This is somewhat complicated. So you need irqp pointers and irqs array
due to the way IRQ initialization is performed? 

> +    struct {
> +        target_phys_addr_t addr;
> +        target_phys_addr_t size;
> +        mmio_mapfunc cb;
> +        int iofunc;
> +    } mmio[QDEV_MAX_MMIO];
> +    DeviceProperty *props;
> +};
> +
> +
> +/*** Board API.  This should go away once we have a machine config file.  
> ***/
> +
> +DeviceState *qdev_create(const char *name);
> +void qdev_init(DeviceState *dev);
> +
> +/* Set properties between creation and init.  */
> +void qdev_set_bus(DeviceState *dev, void *bus);
> +void qdev_set_prop_int(DeviceState *dev, const char *name, int value);
> +void qdev_set_prop_ptr(DeviceState *dev, const char *name, void *value);
> +
> +/* Configure device after init.   */
> +void qdev_connect_irq(DeviceState *dev, int n, qemu_irq irq);
> +void qdev_mmio_map(DeviceState *dev, int n, target_phys_addr_t addr);
> +qemu_irq qdev_get_irq_sink(DeviceState *dev, int n);
> +
> +
> +/*** Device API.  ***/
> +
> +typedef void (*qdev_initfn)(DeviceState *dev, void *opaque);
> +
> +DeviceType *qdev_register(const char *name, int size, qdev_initfn init,
> +                          void *opaque);

Don't you want to cover savevm/restore callbacks at this level too? 

Device destruction surely (for hotunplug). Passing a structure with
callbacks would be nicer.

> +
> +/* Register device properties.  */
> +void qdev_init_mmio(DeviceState *dev, target_phys_addr_t size, int iofunc);
> +void qdev_init_mmio_cb(DeviceState *dev, target_phys_addr_t size,
> +                       mmio_mapfunc cb);
> +void qdev_init_irq(DeviceState *dev, qemu_irq *p);
> +void qdev_pass_irq(DeviceState *dev, DeviceState *target);

Can you explain how init_irq_sink/pass_irq are supposed to work?

> +void qdev_init_scsi(DeviceState *dev, SCSIAttachFn attach)
> +{
> +   int bus = next_scsi_bus++;
> +   int unit;
> +   int index;
> +
> +   for (unit = 0; unit < MAX_SCSI_DEVS; unit++) {
> +       index = drive_get_index(IF_SCSI, bus, unit);
> +       if (index == -1) {
> +           continue;
> +       }
> +       attach(dev, drives_table[index].bdrv, unit);
> +   }
> +}
> diff --git a/hw/qdev.h b/hw/qdev.h
> index 08b2713..79eb22f 100644
> --- a/hw/qdev.h
> +++ b/hw/qdev.h
> @@ -54,6 +54,7 @@ qemu_irq qdev_get_irq_sink(DeviceState *dev, int n);
>  /*** Device API.  ***/
>  
>  typedef void (*qdev_initfn)(DeviceState *dev, void *opaque);
> +typedef void (*SCSIAttachFn)(void *opaque, BlockDriverState *bdrv, int unit);
>  
>  DeviceType *qdev_register(const char *name, int size, qdev_initfn init,
>                            void *opaque);
> @@ -65,6 +66,7 @@ void qdev_init_mmio_cb(DeviceState *dev, target_phys_addr_t 
> size,
>  void qdev_init_irq(DeviceState *dev, qemu_irq *p);
>  void qdev_pass_irq(DeviceState *dev, DeviceState *target);
>  void qdev_init_irq_sink(DeviceState *dev, qemu_irq_handler handler, int 
> nirq);
> +void qdev_init_scsi(DeviceState *dev, SCSIAttachFn attach);
>  
>  CharDriverState *qdev_init_chardev(DeviceState *dev);
>  

Looks like a good start to me.

Markus should have more substantial comments (he's on vacation
this week). The tree driven machine initialization introduces some
complications (such as parent_bus->child device dependencies) while
handling a lot of the problems you are ignoring (eg the ->config method
is responsible for linking host device -> emulated device information,
etc).




reply via email to

[Prev in Thread] Current Thread [Next in Thread]