[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v3 2/3] qdev: add cleanup logic in device_set_re
From: |
Gonglei (Arei) |
Subject: |
Re: [Qemu-devel] [PATCH v3 2/3] qdev: add cleanup logic in device_set_realized() to avoid resource leak |
Date: |
Tue, 2 Sep 2014 02:13:35 +0000 |
> From: address@hidden
> [mailto:address@hidden On Behalf Of Peter Crosthwaite
> Sent: Tuesday, September 02, 2014 6:51 AM
> >> On Mon, Sep 1, 2014 at 3:50 PM, <address@hidden> wrote:
> >> > From: Gonglei <address@hidden>
> >> >
> >> > At present, this function doesn't have partial cleanup implemented,
> >> > which will cause resource leak in some scenarios.
> >> >
> >> > Example:
> >> >
> >> > 1. Assuming that "dc->realize(dev, &local_err)" execute successful
> >> > and local_err == NULL;
> >> > 2. Executing device hotplug in hotplug_handler_plug(), but failed
> >> > (It is prone to occur). Then local_err != NULL;
> >> > 3. error_propagate(errp, local_err) and return. But the resources
> >> > which been allocated in dc->realize() will be leaked.
> >> > Simple backtrace:
> >> > dc->realize()
> >> > |->device_realize
> >> > |->pci_qdev_init()
> >> > |->do_pci_register_device()
> >> > |->etc.
> >> >
> >> > Adding fuller cleanup logic which assure that function can
> >> > goto appropriate error label as local_err population is
> >> > detected as each relevant point.
> >> >
> >> > Signed-off-by: Gonglei <address@hidden>
> >> > ---
> >> > hw/core/qdev.c | 73
> >> ++++++++++++++++++++++++++++++++++++++++++----------------
> >> > 1 file changed, 53 insertions(+), 20 deletions(-)
> >> >
> >> > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> >> > index 4a1ac5b..2a82cc1 100644
> >> > --- a/hw/core/qdev.c
> >> > +++ b/hw/core/qdev.c
> >> > @@ -834,12 +834,14 @@ static void device_set_realized(Object *obj,
> bool
> >> value, Error **errp)
> >> > dc->realize(dev, &local_err);
> >> > }
> >> >
> >> > - if (dev->parent_bus && dev->parent_bus->hotplug_handler &&
> >> > - local_err == NULL) {
> >> > + if (local_err != NULL) {
> >> > + goto fail;
> >> > + }
> >> > +
> >> > + if (dev->parent_bus && dev->parent_bus->hotplug_handler) {
> >> >
> hotplug_handler_plug(dev->parent_bus->hotplug_handler,
> >> > dev, &local_err);
> >> > - } else if (local_err == NULL &&
> >> > - object_dynamic_cast(qdev_get_machine(),
> >> TYPE_MACHINE)) {
> >> > + } else if (object_dynamic_cast(qdev_get_machine(),
> >> TYPE_MACHINE)) {
> >> > HotplugHandler *hotplug_ctrl;
> >> > MachineState *machine =
> MACHINE(qdev_get_machine());
> >> > MachineClass *mc = MACHINE_GET_CLASS(machine);
> >> > @@ -852,21 +854,25 @@ static void device_set_realized(Object *obj,
> bool
> >> value, Error **errp)
> >> > }
> >> > }
> >> >
> >> > - if (qdev_get_vmsd(dev) && local_err == NULL) {
> >> > + if (local_err != NULL) {
> >> > + goto post_realize_fail;
> >> > + }
> >> > +
> >> > + if (qdev_get_vmsd(dev)) {
> >> > vmstate_register_with_alias_id(dev, -1,
> >> qdev_get_vmsd(dev), dev,
> >> >
> >> dev->instance_id_alias,
> >> >
> >> dev->alias_required_for_version);
> >> > }
> >> > - if (local_err == NULL) {
> >> > - QLIST_FOREACH(bus, &dev->child_bus, sibling) {
> >> > - object_property_set_bool(OBJECT(bus), true,
> "realized",
> >> > - &local_err);
> >> > - if (local_err != NULL) {
> >> > - break;
> >> > - }
> >> > +
> >> > + QLIST_FOREACH(bus, &dev->child_bus, sibling) {
> >> > + object_property_set_bool(OBJECT(bus), true, "realized",
> >> > + &local_err);
> >> > + if (local_err != NULL) {
> >> > + goto set_realized_fail;
> >>
> >> should this label be child_realize_fail for clarity? The object is
> >> already realized by now so this name unqualified is a little
> >> misleading.
> >>
> >> > }
> >> > }
> >> > - if (dev->hotplugged && local_err == NULL) {
> >> > +
> >> > + if (dev->hotplugged) {
> >> > device_reset(dev);
> >> > }
> >> > dev->pending_deleted_event = false;
> >> > @@ -875,24 +881,51 @@ static void device_set_realized(Object *obj,
> bool
> >> value, Error **errp)
> >> > object_property_set_bool(OBJECT(bus), false, "realized",
> >> > &local_err);
> >> > if (local_err != NULL) {
> >> > - break;
> >> > + goto set_unrealized_fail;
> >> > }
> >> > }
> >> > - if (qdev_get_vmsd(dev) && local_err == NULL) {
> >> > + if (qdev_get_vmsd(dev)) {
> >> > vmstate_unregister(dev, qdev_get_vmsd(dev), dev);
> >> > }
> >> > - if (dc->unrealize && local_err == NULL) {
> >> > + if (dc->unrealize) {
> >> > dc->unrealize(dev, &local_err);
> >> > }
> >> > dev->pending_deleted_event = true;
> >> > - }
> >> >
> >> > - if (local_err != NULL) {
> >> > - error_propagate(errp, local_err);
> >> > - return;
> >> > + if (local_err != NULL) {
> >> > + goto fail;
> >> > + }
> >> > }
> >> >
> >> > dev->realized = value;
> >> > + return;
> >> > +
> >> > +set_unrealized_fail:
> >>
> >> child_unrealize_fail
> >>
> >> > + QLIST_FOREACH(bus, &dev->child_bus, sibling) {
> >> > + object_property_set_bool(OBJECT(bus), true, "realized",
> >>
> >> "true" doesn't look right to me. It means if the recursive unrealize
> >> botches it will re-realize all children? The idea behind my comment
> >> last spin was to forcefully unrealize all children regardless of
> >> early-iteration errors. I think set_realized_fail and set
> >> unrealized_fail should be the same logic. You should keep going with
> >> your cleanup on error rather than undo the cleanup already done.
> >>
> > I knew your mean, but forgot to explain in pervious email, sorry for this.
> > When one child unrealized failed, the interface will return error for user,
> > so user think the device existed yet in QEMU, but not hotplugged.
> > If we keep going with cleanup, the state saved by QEMU is not same with
> user's
> > thought.
>
> This is also broken in original code right?
>
Yes. Any child unrealize failed, would return error immediately.
> So there's an argument to say that this case would be a fatal error,
> i.e. the hotplug interface should be &error_abort ing on the unrealize
> path. I don't think Qdev has a chance of being able to guarantee that
> a botched recursive unrealize has no side-effects so hotplug framework
> has no ability for sane recovery in these cases.
>
OK. But actually I more agree with using NULL instead of
using '&error_abort', we can chose to ignore this kind of error, but not abort.
What's your opinion? Thanks. :)
Best regards,
-Gonglei