qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/1] target-i386: Remove icc_bridge parameter


From: Andreas Färber
Subject: Re: [Qemu-devel] [PATCH v2 1/1] target-i386: Remove icc_bridge parameter from cpu_x86_create()
Date: Wed, 11 Mar 2015 12:59:33 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0

Am 11.03.2015 um 12:11 schrieb Eduardo Habkost:
> On Tue, Mar 10, 2015 at 11:43:41PM +0100, Andreas Färber wrote:
>> Am 10.03.2015 um 22:57 schrieb Eduardo Habkost:
>>> Instead of passing icc_bridge from the PC initialization code to
>>> cpu_x86_create(), make the PC initialization code attach the CPU to
>>> icc_bridge.
>>>
>>> The only difference here is that icc_bridge attachment will now be done
>>> after x86_cpu_parse_featurestr() is called. But this shouldn't make any
>>> difference, as property setters shouldn't depend on icc_bridge.
>>>
>>> Signed-off-by: Eduardo Habkost <address@hidden>
>>> ---
>>> Changes v1 -> v2:
>>>  * Keep existing check for NULL icc_bridge and error reporting, instead
>>>    of assing assert(icc_bridge)
>>> ---
>>>  hw/i386/pc.c      | 13 +++++++++++--
>>>  target-i386/cpu.c | 14 ++------------
>>>  target-i386/cpu.h |  3 +--
>>>  3 files changed, 14 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>>> index b5b2aad..a26e0ec 100644
>>> --- a/hw/i386/pc.c
>>> +++ b/hw/i386/pc.c
>>> @@ -992,18 +992,27 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int 
>>> level)
>>>  static X86CPU *pc_new_cpu(const char *cpu_model, int64_t apic_id,
>>>                            DeviceState *icc_bridge, Error **errp)
>>>  {
>>> -    X86CPU *cpu;
>>> +    X86CPU *cpu = NULL;
>>>      Error *local_err = NULL;
>>>  
>>> -    cpu = cpu_x86_create(cpu_model, icc_bridge, &local_err);
>>> +    if (icc_bridge == NULL) {
>>> +        error_setg(&local_err, "Invalid icc-bridge value");
>>> +        goto out;
>>> +    }
>>> +
>>> +    cpu = cpu_x86_create(cpu_model, &local_err);
>>
>> We had previously discussed reference counting. Here I would expect:
> 
> I will try to extend the analysis with ownership of each reference:
> 
>>
>> OBJECT(cpu)->ref == 1
> 
> And the owner of the reference is pc_new_cpu() (cpu variable).
> 
>>
>>>      if (local_err != NULL) {
>>>          error_propagate(errp, local_err);
>>>          return NULL;
>>>      }
>>>  
>>> +    qdev_set_parent_bus(DEVICE(cpu), qdev_get_child_bus(icc_bridge, 
>>> "icc"));
>>
>> OBJECT(cpu)->ref == 2
> 
> And the owners are: pc_new_cpu()/cpu and icc_bridge.
> 
> Now, what if we error out and destroy the CPU after we already added the
> CPU to icc_bridge? Is icc_bridge going to keep pointing to the dead
> object, or is there some bus-detach magic somewhere that will make it
> work?

Lowering the ref count to 0 will trigger object_finalize(), which via
object_property_del_all() triggers unparenting via
object_finalize_child_property(), which in the device case causes
unrealization of the device and unparenting of its owned buses
(qdev.c:device_unparent()).

>>> +    object_unref(OBJECT(cpu));
>>
>> OBJECT(cpu)->ref == 1
> 
> Here pc_new_cpu() is dropping its reference too early! In practice it is
> now borrowing the reference owned by icc_bridge, and I don't think we
> should do that.
> 
> I just kept the object_unref() call here because I didn't want to change
> any behavior when moving code, but I think it doesn't belong here.

Agree. In my patches I placed it after the last usage of cpu in this
function, but actually since it is the return value it would even need
to go into the callers. pc_cpus_init() is currently ignoring the return
value.

For using the CPU as a child<> of a CPU core object that changes in my
series, so I'll fix that.

>>> +
>>>      object_property_set_int(OBJECT(cpu), apic_id, "apic-id", &local_err);
>>>      object_property_set_bool(OBJECT(cpu), true, "realized", &local_err);
>>
>> OBJECT(cpu)->ref == 1 or 2 depending on DeviceClass::realize :)
> 
> If it's 2, it won't be our problem as we don't own the extra reference.
> It's the responsibility of whoever grabbed the extra reference.
> 
> But I assume the property setters above MUST not add any extra reference
> in case they return an error. Correct?

So, the extra reference would be the one added by device_set_realized()
for /machine/unattached parent. That happens as very first thing on
false -> true transition. If an error occurs either in
DeviceClass::realize or later in device_set_realized() then that
reference will remain, but it is re-entrant in only doing so once.

I have refreshing and combining the two competing qom-tree HMP patches
on my radar and am starting to think that reference counting information
may be a third interesting mode of output...

>>>  
>>> +out:
>>>      if (local_err) {
>>>          error_propagate(errp, local_err);
>>>          object_unref(OBJECT(cpu));
>>
> 
> And here we have something that was already broken: X86CPU instance_init
> calls cpu_exec_init(), the CPU is added to the global CPU list without
> increasing reference counting, and the global list will point to a
> destroyed object if we enter the error path.
> 
> In other words, if anything fails after cpu_exec_init() is called, we're
> screwed. In the future it will be on realize, but we probably need to
> move it closer to the end of realize.

Yeah, most of the error handling kind of assumes that when an error
occurs we report the Error* upwards and exit QEMU, with the user
restarting from scratch. With CPU hotplug that is a nasty assumption.

>> object_unref(NULL) looks unusual but is valid.
> 
> Yes. Makes the code simpler. :)
> 
>>
>> Should we change the return NULL to jump here, too, then?
> 
> Yes, the cpu_x86_create() error check could just do a "goto out".

I assume you are planning to apply this patch through your tree? Will
you submit a v3? Otherwise can you tweak when applying and let me know
so I can cherry-pick? Thanks.

>> OBJECT(cpu)->ref == 0 or 1
>>
>> I wonder whether we need another object_unref(OBJECT(cpu)) for the
>> non-error case, either here or in the callers? Out of scope for this
>> patch, of course.
> 
> So, how I see it: if we are returning a reference to the object, now it
> belongs to the caller, and it should be the caller responsibility to
> call object_unref(). So the non-error object_unref() after
> qdev_set_parent_bus() is not supposed to be in pc_new_cpus(), but in the
> callers.

Yeah, agree.

> Either way we choose, we should document it so callers know who
> owns the reference they get.
> 
> Alternatively, we could simply change pc_new_cpu() to _not_ return a
> pointer to the CPU, and make pc_cpus_init() deal with the APIC MMIO
> mapping using some another approach.

No, that won't work for me.

But my question was rather whether changes for the error case will be
needed? If the reference for /machine/unassigned/device[n] remains
around, then our local object_unref() won't unparent/finalize the
device, meaning it stays around. If we do an additional unref then
unparenting would make the ref count go to -1. So we may need to
unparent it explicitly here in this error path? Hard to test.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu,
Graham Norton; HRB 21284 (AG Nürnberg)



reply via email to

[Prev in Thread] Current Thread [Next in Thread]