From: Paolo Bonzini
Subject: Re: [Qemu-block] [RFC PATCH 00/11] aio: Introduce handler type to fix nested aio_poll for dataplane
Date: Mon, 27 Jul 2015 15:23:51 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.0.1


On 27/07/2015 08:55, Fam Zheng wrote:
> On Fri, 07/24 09:35, Paolo Bonzini wrote:
>>> That way, aio_context_acquire can be dropped by:
>>>
>>>     /* @pause_owner_thread: a callback which will be called when the
>>>      * _main thread_ wants to operate exclusively on the AioContext,
>>>      * for example with a nested aio_poll().
>>>      * @resume_owner_thread: a callback which will be called when the
>>>      * _main thread_ has finished the exclusive operation.
>>>      */
>>>     AioContext *aio_context_new(AioContextPauseResumeFn *pause_owner_thread,
>>>                                 AioContextPauseResumeFn *resume_owner_thread,
>>>                                 void *opaque,
>>>                                 Error **errp);
>>>
>>>     /* Try to pause the owning thread */
>>>     void aio_context_pause(AioContext *ctx, Error **errp);
>>>
>>>     /* Try to resume the paused owning thread; cannot fail */
>>>     void aio_context_resume(AioContext *ctx);
>>>
>>> Then, in iothread.c:
>>>
>>>     iothread->ctx = aio_context_new(iothread_pause, iothread_resume,
>>>                                     iothread, &local_err);
>>>
>>> Where iothread_pause can block the thread with a QemuCond.
>>
>> Condition variables don't mix well with recursive mutexes...  Once we
>> have bottom halves outside the AioContext lock, however, we could use a
>> separate lock for this condition variable.  That would work.
> 
> Yes, I thought so.
> 
>> I like it, but I still ask myself... what's the difference between
>> aio_context_pause/resume and aio_disable/enable_clients? :)  There is
>> still state, just in the iothread rather than in the AioContext.
> 
> I don't know, maybe this will make aio_context_acquire no longer necessary so
> we get virtio-scsi dataplane fixed?

What you would have is:

1) the main thread calls aio_context_pause/resume, where aio_context_pause
waits for the dataplane thread to pause

2) a paused iothread uses a condition variable or something similar to wait
for the main thread to call aio_context_resume (sketched below)
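Purely as an illustration of that shape (every name below is invented, and
the pause_lock is deliberately a separate plain mutex so the QemuCond never
interacts with the recursive AioContext lock, per the earlier point about
condition variables):

    typedef struct IOThread {
        AioContext *ctx;
        QemuMutex pause_lock;    /* protects the two flags below */
        QemuCond pause_cond;
        bool pause_requested;    /* main thread wants exclusive access */
        bool paused;             /* iothread has acknowledged the pause */
    } IOThread;

    /* Runs in the main thread, from aio_context_pause(). */
    static void iothread_pause(void *opaque)
    {
        IOThread *iothread = opaque;

        qemu_mutex_lock(&iothread->pause_lock);
        iothread->pause_requested = true;
        aio_notify(iothread->ctx);    /* kick the iothread out of aio_poll */
        while (!iothread->paused) {
            qemu_cond_wait(&iothread->pause_cond, &iothread->pause_lock);
        }
        qemu_mutex_unlock(&iothread->pause_lock);
    }

    /* Runs in the iothread's event loop once it sees pause_requested. */
    static void iothread_ack_pause(IOThread *iothread)
    {
        qemu_mutex_lock(&iothread->pause_lock);
        iothread->paused = true;
        qemu_cond_signal(&iothread->pause_cond);
        while (iothread->pause_requested) {
            qemu_cond_wait(&iothread->pause_cond, &iothread->pause_lock);
        }
        iothread->paused = false;
        qemu_mutex_unlock(&iothread->pause_lock);
    }

    /* Runs in the main thread, from aio_context_resume(). */
    static void iothread_resume(void *opaque)
    {
        IOThread *iothread = opaque;

        qemu_mutex_lock(&iothread->pause_lock);
        iothread->pause_requested = false;
        qemu_cond_signal(&iothread->pause_cond);
        qemu_mutex_unlock(&iothread->pause_lock);
    }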

In the end, this is still a lock.  For example in RFifoLock the critical
sections "look like" they are contained within rfifolock_lock and
rfifolock_unlock, but still RFifoLock is a lock and (as far as deadlocks
are concerned) behaves as if there is a single critical section between
rfifolock_lock and rfifolock_unlock.
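In other words, on the main-thread side the pair would be used exactly like
a lock acquisition (sketch; ctx->lock is the RFifoLock embedded in
AioContext today):

    aio_context_pause(ctx, &local_err);  /* ~ rfifolock_lock(&ctx->lock)   */
    /* ... main thread operates exclusively on the AioContext ... */
    aio_context_resume(ctx);             /* ~ rfifolock_unlock(&ctx->lock) */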

So, the bdrv_aio_poll approach works great with the huge critical
sections that we have now.  We're effectively using the AioContext lock
_not_ to protect data, but just to shield the iothread's usage of block
devices from the dataplane and vice versa.

That worked well as an initial step towards thread safety (in fact, the
BQL itself is a single huge lock that protects code).  However, we will
move towards fine-grained critical sections sooner or later; besides
virtio-scsi dataplane, they are also necessary in order to do "real"
multiqueue with multiple iothreads per BDS.

Once you make your locks protect data, things do become more complex
(more locks, more critical sections, etc.), but at the same time they
can be less scary.  You can reason in terms of invariants that hold at
the end of each critical section, and concurrency is not a big deal
anymore, because your new tool (invariants) is both powerful and
independent of concurrency.
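A toy example of that reasoning style (nothing QEMU-specific; all the
names are made up):

    typedef struct Req {
        QLIST_ENTRY(Req) next;
    } Req;

    typedef struct ReqQueue {
        QemuMutex lock;
        QLIST_HEAD(, Req) reqs;
        unsigned in_flight;   /* invariant: equals the length of reqs */
    } ReqQueue;

    static void req_queue_add(ReqQueue *q, Req *r)
    {
        qemu_mutex_lock(&q->lock);
        QLIST_INSERT_HEAD(&q->reqs, r, next);
        q->in_flight++;       /* invariant restored before unlocking */
        qemu_mutex_unlock(&q->lock);
    }

The invariant may be broken inside the critical section, but every path
out of it re-establishes it, so any thread that takes the lock can rely
on it no matter what the other threads are doing.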

You don't have to worry about that now, however.
aio_disable_clients/aio_enable_clients for now can be protected by the
AioContext lock.  Later, it can be protected by whatever fine-grained
lock will protect the list of AioHandlers.
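That is, for now a caller would do something like the following (the
exact signature of aio_disable_clients is whatever the series settles
on; the mask name here is made up):

    aio_context_acquire(ctx);
    aio_disable_clients(ctx, AIO_CLIENT_DATAPLANE);  /* hypothetical mask */
    /* ... nested aio_poll(), drain, etc. ... */
    aio_enable_clients(ctx, AIO_CLIENT_DATAPLANE);
    aio_context_release(ctx);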

Paolo


