qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 13/27] acpi: memory hotplug ACPI hardware implem


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH 13/27] acpi: memory hotplug ACPI hardware implementation
Date: Thu, 21 Nov 2013 16:38:47 +0200

On Thu, Nov 21, 2013 at 03:21:37PM +0100, Igor Mammedov wrote:
> On Thu, 21 Nov 2013 11:42:02 +0200
> "Michael S. Tsirkin" <address@hidden> wrote:
> 
> > On Thu, Nov 21, 2013 at 03:38:34AM +0100, Igor Mammedov wrote:
> > > - implements QEMU hardware part of memory hotplug protocol
> > >   described at "docs/specs/acpi_mem_hotplug.txt"
> > > - handles only memory add notification event for now
> > > 
> > > Signed-off-by: Igor Mammedov <address@hidden>
> > > ---
> > >  docs/specs/acpi_mem_hotplug.txt |   38 ++++++++++
> > >  hw/acpi/core.c                  |  144 
> > > +++++++++++++++++++++++++++++++++++++++
> > >  include/hw/acpi/acpi.h          |   24 +++++++
> > >  3 files changed, 206 insertions(+), 0 deletions(-)
> > >  create mode 100644 docs/specs/acpi_mem_hotplug.txt
> > > 
> > > diff --git a/docs/specs/acpi_mem_hotplug.txt 
> > > b/docs/specs/acpi_mem_hotplug.txt
> > > new file mode 100644
> > > index 0000000..fc34142
> > > --- /dev/null
> > > +++ b/docs/specs/acpi_mem_hotplug.txt
> > > @@ -0,0 +1,38 @@
> > > +QEMU<->ACPI BIOS memory hotplug interface
> > > +--------------------------------------
> > > +
> > > +ACPI BIOS GPE.3 handler is dedicated for notifying OS about memory 
> > > hot-add
> > > +or hot-remove events.
> > > +
> > > +Memory hot-plug interface (IO port 0xa00-0xa17, 1-4 byte access):
> > > +---------------------------------------------------------------
> > > +0xa00:
> > > +  read access:
> > > +      [0x0-0x3] Lo part of memory device phys address
> > > +      [0x4-0x7] Hi part of memory device phys address
> > > +      [0x8-0xb] Lo part of memory device size in bytes
> > > +      [0xc-0xf] Hi part of memory device size in bytes
> > > +      [0x14] highest memory hot-plug interface version supported by QEMU
> > 
> > So this can make guest fail gracefully but it appears that
> > detecting guest version would be nicer?
> > It would let us actually support old guests ...
> 
> my idea of how to it was,
> guest writes its version into [0x14] register and reads QEMU version
> from it back, if they do not match then than BIOS ignores GPE.3 event
> effectively disabling hotplug on guest side.
> I haven't thought about supporting multiple implementations in QEMU though.
> Do we really want it?

I'm talking about old bios which does not read acpi from qemu.
We want it to work even if it can't see hotplugged memory.

> > 
> > > +      [0x15] Memory device status fields
> > > +          bits:
> > > +              1: device is enabled and may be used by guest
> > > +              2: device insert event, used by ACPI BIOS to distinguish
> > > +                 device for which no device check event to OSPM was 
> > > issued
> > 
> > what does the above mean?
> After OSPM issued device check on selected device it clears this bit to mark 
> event
> as handled.
> It allows to avoid keeping this state in ASL (as it's done for CPU hotplug, 
> see CPON)

That's fine.

> > what if device is not present?
> ASL will issue device check and clear bit, it might be a bug since _STA would 
> report
> not present but no eject event was issued.
> 
> Papering over it ASL could check present bit first and issue device check 
> only if
> it's present.

Is this a problem? If yes - that will still be racy won't it?


Also, should guest reset eject memory that we requested unplug for?

> > 
> > > +      [0x16-0x17] reserved
> > > +
> > > +  write access:
> > > +      [0x0-0x3] Memory device slot selector, selects active memory 
> > > device.
> > > +                All following accesses to other registers in 
> > > 0xaf80-0xaf97
> > > +                region will read/store data from/to selected memory 
> > > device.
> > > +      [0x4-0x7] OST event code reported by OSPM
> > > +      [0x8-0xb] OST status code reported by OSPM
> > > +      [0x15] Memory device status fields
> > 
> > this is control, not status?
> Thanks, I'll fix it.
> 
> > 
> > > +          bits:
> > > +              2: if set to 1 clears device insert event, set by ACPI BIOS
> > > +                 after it has sent device check event to OSPM for
> > > +                 seleted memory device
> > 
> > selected?
> see "write access: [0x0-0x3]"

yes but you have a typo above

> > 
> > How about we actually require guest to enable memory?
> > 
> > This way if we hotplug, but old guest does not enable
> > and puts a PCI device there, it just works.
> I've lost you here, could you elaborate pls?


Assume qemu adds memory by hotplug.
Is it immediately enabled?
I suggest it's not enabled, and only enable after ACPI
enables it (or after reboot?)

> > 
> > > +
> > > +Selecting memory device slot beyond present range has no effect on 
> > > platform:
> > > +   - not documented above write accesses to memory hot-plug registers
> > > +     are ignored;
> > > +   - not documented above read accesses to memory hot-plug registers 
> > > return 0xFF
> > 
> > 00 would be cleaner (matches not enabled - no event)?
> I'm following pattern, where reading from not present IO port returns 0xFF on 
> hardware.
> Fact that ASL reads 0xFF could be used as not supported indication.

But isn't this a valid pattern when all bits are set?

> > 
> > > diff --git a/hw/acpi/core.c b/hw/acpi/core.c
> > > index 8c0d48c..18e169c 100644
> > > --- a/hw/acpi/core.c
> > > +++ b/hw/acpi/core.c
> > > @@ -680,3 +680,147 @@ void acpi_update_sci(ACPIREGS *regs, qemu_irq irq, 
> > > uint32_t gpe0_sts_mask)
> > >                         (regs->pm1.evt.en & ACPI_BITMASK_TIMER_ENABLE) &&
> > >                         !(pm1a_sts & ACPI_BITMASK_TIMER_STATUS));
> > >  }
> > > +
> > > +static uint64_t acpi_memory_hotplug_read(void *opaque, hwaddr addr,
> > > +                                         unsigned int size)
> > > +{
> > > +    uint32_t val = 0;
> > > +    MemHotplugState *mem_st = opaque;
> > > +    MemStatus *mdev;
> > > +
> > > +    if (mem_st->selector >= mem_st->dev_count) {
> > > +        return 0;
> > > +    }
> > > +
> > > +    mdev = &mem_st->devs[mem_st->selector];
> > > +    switch (addr) {
> > > +    case 0x0: /* Lo part of phys address where DIMM is mapped */
> > > +        val = object_property_get_int(OBJECT(mdev->dimm), "start", NULL);
> > > +        break;
> > > +    case 0x4: /* Hi part of phys address where DIMM is mapped */
> > > +        val = object_property_get_int(OBJECT(mdev->dimm), "start", NULL) 
> > > >> 32;
> > > +        break;
> > > +    case 0x8: /* Lo part of DIMM size */
> > > +        val = object_property_get_int(OBJECT(mdev->dimm), "size", NULL);
> > > +        break;
> > > +    case 0xc: /* Hi part of DIMM size */
> > > +        val = object_property_get_int(OBJECT(mdev->dimm), "size", NULL) 
> > > >> 32;
> > > +        break;
> > > +    case 0x10: /* node proximity for _PXM method */
> > > +        val = object_property_get_int(OBJECT(mdev->dimm), "node", NULL);
> > > +        break;
> > > +    case 0x14: /* intf version */
> > > +        val = 1;
> > > +        break;
> > > +    case 0x15: /* pack and return is_* fields */
> > > +        val |= mdev->is_enabled   ? 1 : 0;
> > > +        val |= mdev->is_inserting ? 2 : 0;
> > > +        break;
> > > +    }
> > > +    return val;
> > > +}
> > > +
> > > +static void acpi_memory_hotplug_write(void *opaque, hwaddr addr, 
> > > uint64_t data,
> > > +                                      unsigned int size)
> > > +{
> > > +    MemHotplugState *mem_st = opaque;
> > > +    MemStatus *mdev;
> > > +
> > > +    if (!mem_st->dev_count) {
> > > +        return;
> > > +    }
> > > +
> > > +    if (addr) {
> > > +        if (mem_st->selector >= mem_st->dev_count) {
> > > +            return;
> > > +        }
> > > +    }
> > > +
> > > +    switch (addr) {
> > > +    case 0x0: /* DIMM slot selector */
> > > +        mem_st->selector = data;
> > > +        break;
> > > +    case 0x4: /* _OST event  */
> > > +        mdev = &mem_st->devs[mem_st->selector];
> > > +        if (data == 1) {
> > > +            /* TODO: handle device insert OST event */
> > > +        } else if (data == 3) {
> > > +            /* TODO: handle device remove OST event */
> > > +        }
> > > +        mdev->ost_event = data;
> > > +        break;
> > > +    case 0x8: /* _OST status */
> > > +        mdev = &mem_st->devs[mem_st->selector];
> > > +        mdev->ost_status = data;
> > > +        /* TODO: report async error */
> > > +        /* TODO: implement memory removal on guest signal */
> > > +        break;
> > > +    case 0x15:
> > > +        mdev = &mem_st->devs[mem_st->selector];
> > > +        if (data & 2) { /* clear insert event */
> > > +            mdev->is_inserting  = false;
> > > +        }
> > > +        break;
> > > +    }
> > > +
> > > +}
> > > +static const MemoryRegionOps acpi_memory_hotplug_ops = {
> > > +    .read = acpi_memory_hotplug_read,
> > > +    .write = acpi_memory_hotplug_write,
> > > +    .endianness = DEVICE_LITTLE_ENDIAN,
> > > +    .valid = {
> > > +        .min_access_size = 1,
> > > +        .max_access_size = 4,
> > > +    },
> > > +};
> > > +
> > > +void acpi_memory_hotplug_init(Object *owner, MemoryRegion *io,
> > > +                              MemHotplugState *state)
> > > +{
> > > +    QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"), NULL);
> > > +    g_assert(opts);
> > > +
> > > +    state->dev_count = qemu_opt_get_number(opts, "slots", 0);
> > > +
> > > +    if (!state->dev_count) {
> > > +        return;
> > > +    }
> > > +
> > > +    state->devs = g_malloc0(sizeof(*state->devs) * state->dev_count);
> > > +    memory_region_init_io(io, owner, &acpi_memory_hotplug_ops, state,
> > > +                          "apci-mem-hotplug", 
> > > ACPI_MEMORY_HOTPLUG_IO_LEN);
> > > +}
> > > +
> > > +int acpi_memory_hotplug_cb(ACPIREGS *regs, MemHotplugState *mem_st,
> > > +                           DeviceState *dev, HotplugState state)
> > > +{
> > > +    MemStatus *mdev;
> > > +    Error *local_err = NULL;
> > > +    int slot = object_property_get_int(OBJECT(dev), "slot", &local_err);
> > > +
> > > +    if (error_is_set(&local_err)) {
> > > +        qerror_report_err(local_err);
> > > +        error_free(local_err);
> > > +        return -1;
> > > +    }
> > > +
> > > +    if (slot >= mem_st->dev_count) {
> > > +        char *dev_path = object_get_canonical_path(OBJECT(dev));
> > > +        qerror_report(ERROR_CLASS_GENERIC_ERROR, 
> > > "acpi_memory_hotplug_cb: "
> > > +                      "device [%s] returned invalid memory slot[%d]",
> > > +                      dev_path, slot);
> > > +        g_free(dev_path);
> > > +        return -1;
> > > +    }
> > > +
> > > +    mdev = &mem_st->devs[slot];
> > > +    if (state == HOTPLUG_ENABLED) {
> > > +        mdev->dimm = dev;
> > > +        mdev->is_enabled = true;
> > > +        mdev->is_inserting = true;
> > > +    }
> > > +
> > > +    /* do ACPI magic */
> > > +    regs->gpe.sts[0] |= ACPI_MEMORY_HOTPLUG_STATUS;
> > > +    return 0;
> > > +}
> > > diff --git a/include/hw/acpi/acpi.h b/include/hw/acpi/acpi.h
> > > index c4ae7d7..e5717df 100644
> > > --- a/include/hw/acpi/acpi.h
> > > +++ b/include/hw/acpi/acpi.h
> > > @@ -168,6 +168,30 @@ uint32_t acpi_gpe_ioport_readb(ACPIREGS *ar, 
> > > uint32_t addr);
> > >  
> > >  void acpi_update_sci(ACPIREGS *acpi_regs, qemu_irq irq, uint32_t 
> > > gpe0_sts_mask);
> > >  
> > > +#define ACPI_MEMORY_HOTPLUG_IO_LEN 24
> > > +#define ACPI_MEMORY_HOTPLUG_BASE 0x0a00
> > > +
> > > +#define ACPI_MEMORY_HOTPLUG_STATUS 8
> > > +
> > > +typedef struct MemStatus {
> > > +    DeviceState *dimm;
> > > +    bool is_enabled;
> > > +    bool is_inserting;
> > > +    uint32_t ost_event;
> > > +    uint32_t ost_status;
> > > +} MemStatus;
> > > +
> > > +typedef struct MemHotplugState {
> > > +    uint32_t selector;
> > > +    uint32_t dev_count;
> > > +    MemStatus *devs;
> > > +} MemHotplugState;
> > > +
> > > +void acpi_memory_hotplug_init(Object *owner, MemoryRegion *io,
> > > +                              MemHotplugState *state);
> > > +
> > > +int acpi_memory_hotplug_cb(ACPIREGS *regs, MemHotplugState *mem_st,
> > > +                           DeviceState *dev, HotplugState state);
> > >  /* acpi.c */
> > >  extern int acpi_enabled;
> > >  extern char unsigned *acpi_tables;
> > > -- 
> > > 1.7.1



reply via email to

[Prev in Thread] Current Thread [Next in Thread]