qemu-devel

Re: [Qemu-devel] [PATCH V3 10/11] vcpu: introduce lockmap


From: Avi Kivity
Subject: Re: [Qemu-devel] [PATCH V3 10/11] vcpu: introduce lockmap
Date: Tue, 11 Sep 2012 15:41:15 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120828 Thunderbird/15.0

On 09/11/2012 03:24 PM, Avi Kivity wrote:
> On 09/11/2012 12:57 PM, Jan Kiszka wrote:
>> On 2012-09-11 11:44, liu ping fan wrote:
>>> On Tue, Sep 11, 2012 at 4:35 PM, Avi Kivity <address@hidden> wrote:
>>>> On 09/11/2012 10:51 AM, Liu Ping Fan wrote:
>>>>> From: Liu Ping Fan <address@hidden>
>>>>>
>>>>> The function call chain can end up recursively acquiring
>>>>> qemu_mutex_lock_iothread. We introduce a lockmap to record the
>>>>> lock depth.
>>>>
>>>> What is the root cause?  io handlers initiating I/O?
>>>>
>>> cpu_physical_memory_rw() can be called in a nested fashion, and the
>>> caller may hold no lock, a device lock, or the big lock.
>>> I think the io-dispatcher faces the same issue even without the big lock.
>>> As to the main loop, I have not considered it carefully, but at least
>>> dma-helper will call cpu_physical_memory_rw() with the big lock held.
>> 
>> That is our core problem: inconsistent invocation of existing services
>> with respect to locking. For portio, I was lucky that there is no nesting
>> and I was able to drop the big lock around all (x86) call sites. But MMIO
>> is way more tricky due to DMA nesting.
> 
> Maybe we need to switch to a continuation style.  Instead of expecting
> cpu_physical_memory_rw() to complete synchronously, it becomes an
> asynchronous call and you provide it with a completion.  That means
> devices which use it are forced to drop the lock in between.  Block and
> network clients will be easy to convert since they already use APIs that
> drop the lock (except for accessing the descriptors).
> 
>> We could try to introduce a different version of cpu_physical_memory_rw,
>> cpu_physical_memory_rw_unlocked. But the problem remains that an MMIO
>> request can trigger the very same access in a nested fashion, and we
>> will have to detect this to avoid locking up QEMU (locking up the guest
>> might be OK).
> 
> An async version of c_p_m_rw() will just cause a completion to bounce
> around, consuming cpu but not deadlocking anything.  If we can keep a
> count of the bounces, we might be able to stall it indefinitely or at
> least ratelimit it.
> 

Another option is to require all users of c_p_m_rw() and related to use
a coroutine or thread.  That makes the programming easier (but still
requires revalidation after the lock is dropped).

-- 
error compiling committee.c: too many arguments to function


