qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycl


From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH 10/10] qdev: fix create in place obj's life cycle problem
Date: Mon, 03 Sep 2012 12:09:31 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0

On 08/29/2012 08:41 PM, Jan Kiszka wrote:
>>>
>>> memory_region_transaction_commit (or calls that trigger this) will have
>>> to wait. That is required as the caller may start fiddling with the
>>> region right afterward.
>> 
>> However, it may be running within an mmio dispatch, so it may wait forever.
> 
> We won't be able to wait for devices that can acquire the same lock we
> hold while waiting, of course. That's why a lock ordering device-lock ->
> BQL (the one we hold over memory region changes) won't be allowed. I was
> wrong on this before: We must not take course-grained locks while
> holding fine-grained ones, only the other way around. Pretty obvious,
> specifically for realtime / low-latency.

So how do we wait?

Detecting that we're in the same thread and allowing reentrancy in that
case in the option, but I can't say I like it much.

>> 
>> Ignoring this, how does it wait? sleep()? or wait for a completion?
> 
> Ideally via completion.

Then you have to manage the life cycle of the completion object.

> 
>> 
>>>>
>>>> Usually a reference count controls the lifetime of the reference counted
>>>> object, what are you suggesting here?
>>>
>>> To synchronize on reference count going to zero. 
>> 
>> Quite unorthodox.  Do you have real life examples?
> 
> synchronize_rcu.

This whole thing started because synchronize_rcu() will deadlock if
called from a read-side critical section.  The fix was taking the
reference, so that the read-side critical section is confined to the
lookup, not the call.  This way the lock ordering is always device->map.

We could schedule a bh or a thread and call synchronize_rcu() from
there, but then the commit is deferred to an unknown time, whereas we
need to guarantee that by the time the guest is reentered, the
transaction is committed.

> 
>> 
>>> Or all readers leaving
>>> the read-side critical sections.
>> 
>> This is rcu.  But again, we may be in an rcu read-side critical section
>> while needing to wait.  In fact this was what I suggested in the first
>> place, until Marcelo pointed out the deadlock, so we came up with the
>> refcount.
> 
> We must avoid that deadlock scenario. I bended my brain around it, and I
> think that is the only answer. We can if we apply clear rules regarding
> locking and BQL-protected services to those devices that will actually
> use fine-grained locking, and there only for those paths that take
> per-device locks. I'm pretty sure we won't get far with an
> "all-or-nothing" model where every device uses a private lock.

An all-or-nothing model is not proposed.  Unconverted devices will take
the BQL instead of the device lock (hopefully the BQL will be taken for
them during dispatch, so no code changes will be needed).

>> 
>> I don't follow.  Synchronous transactions mean you can't
>> synchronize_rcu() or upgrade the lock or wait for the refcount.  That's
>> the source of the problem!
> 
> Our current device models assume synchronous transactions on the memory
> layer, actually on all layers. The proposals I saw so far try to change
> this. But I'm very skeptical that this would succeed, due to the
> conversion effort and due to the fact that it would give us a tricky  to
> use API.

In what way do transactions become asynchronous?

It's true that io callbacks can be called concurrently, but as soon as
they take the device lock, they are serialized.  The only relaxation is
that a callback may be called after a region has been disabled or
removed from view.

-- 
error compiling committee.c: too many arguments to function



reply via email to

[Prev in Thread] Current Thread [Next in Thread]