qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycl


From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem
Date: Wed, 05 Sep 2012 12:52:56 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0

On 09/05/2012 11:19 AM, liu ping fan wrote:
> [...]
>>>
>>>>>> We could externalize the refcounting and push it into device code.  This
>>>>>> means:
>>>>>>
>>>>>>    memory_region_init_io(&s->mem, dev)
>>>>>>
>>>>>>    ...
>>>>>>
>>>>>>    object_ref(dev)
>>>>>>    memory_region_add_subregion(..., &dev->mr)
>>>>>>
>>>>>>    ...
>>>>>>
>>>>>>    memory_region_del_subregion(..., &dev->mr)  // implied flush
>>>>>>    object_unref(dev)
>>>>>>
>>>>> I think "object_ref(dev)" just another style to push
>>>>> MemoryRegionOps::object() to device level.  And I think we can bypass
>>>>> it.   The caller (unplug, pci-reconfig ) of
>>>>> memory_region_del_subregion() ensure the @dev is valid.
>>>>> If the implied flush is implemented in synchronize,  _del_subregion()
>>>>> will guarantee no reader for @dev->mr and @dev exist any longer.
>>>>
>>>> The above code has a deadlock.  memory_region_del_subregion() may be
>>>> called under the device lock (since it may be the result of mmio to the
>>>> device), and if the flush uses synchronized_rcu(), it will wait forever
>>>> for the read-side critical section to complete.
>>>>
>>> But if _del_subregion() just wait for mr-X quiescent period, while
>>> calling in mr-Y's read side critical section, then we can avoid
>>> deadlock.  I saw in pci-mapping, we delete mr-X in mr-Y read side.
>>>
>>>>> So I
>>>>> think we can save both object_ref/unref(dev) for memory system.
>>>>> The only problem is that whether we can implement it as synchronous or
>>>>> not,  is it possible that we can launch a  _del_subregion(mr-X) in
>>>>> mr-X's dispatcher?
>>>>
>>>> Yes.  Real cases exist.
>>>
>>> Oh, I find the sample code, then, the deadlock is unavoidable in this 
>>> method.
>>>>
>>>> What alternatives remain?
>>>>
>>> I think a way out may be async+refcnt
>>>
>> If we consider the relationship of MemoryRegionImpl and device as the
>> one between file and file->private_data  in Linux. Then the creation
>> of impl will object_ref(dev) and when impl->ref=0, it will
>> object_unref(dev)
>> But this is an async model, for those client which need to know the
>> end of flush, we can adopt callback just like call_rcu().
>>
>>
> During this thread, it seems that no matter we adopt refcnt on
> MemoryRegionImpl or not, protecting MemoryRegion::opaque during
> dispatch is still required. 

Right.  So MemoryRegionImpl is pointless.

My current preference is MemoryRegionOps::ref(MemoryRegion *mr) (and a
corresponding unref), which has the following requirements:

- if the refcount is nonzero, MemoryRegion::opaque is safe to use
- if the refcount is nonzero, the MemoryRegion itself is stable.

Later we can change other callbacks to also use mr instead of opaque.
Usually the callback will be able to derive the device object from the
region, only special cases will require
memory_region_opaque(MemoryRegion *).  We can even drop the opaque from
memory_region_init_io() and replace it with memory_region_set_opaque(),
to be used in those special cases.

> It is challenging to substitute
> memory_region_init_io() 's 3rd parameter from void* to Object*, owning
> to the  conflict that life-cycle management need the host of opaque,
> while Object and mr need 1:1 map. So I think, I will move forward on
> the way based on MemoryRegionOps::object(). Right?

As outlined above, I now prefer MemoryRegionOps::ref/unref.

Advantages compared to MemoryRegionOps::object():
 - decoupled from the QOM framework

Disadvantages:
 - more boilerplate code in converted devices

Since we are likely to convert few devices to lockless dispatch, I
believe the tradeoff is justified.  But let's hear Jan and the others.

-- 
error compiling committee.c: too many arguments to function



reply via email to

[Prev in Thread] Current Thread [Next in Thread]