Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutin

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutin

From:	Kevin Wolf
Subject:	Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O
Date:	Thu, 27 Jan 2011 10:27:36 +0100
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Fedora/3.0.10-1.fc12 Thunderbird/3.0.10

Am 26.01.2011 18:12, schrieb Anthony Liguori:
> On 01/26/2011 10:38 AM, Avi Kivity wrote:
>> On 01/26/2011 06:28 PM, Anthony Liguori wrote:
>>> On 01/26/2011 10:13 AM, Avi Kivity wrote:
>>>> Serializing against a global mutex has the advantage that it can be 
>>>> treated as a global lock that is decomposed into fine-grained locks.
>>>>
>>>> For example, we can start the code conversion from an explict async 
>>>> model to a threaded sync model by converting the mutex into a 
>>>> shared/exclusive lock.  Operations like read and write take the lock 
>>>> for shared access (and take a fine-grained mutex on the metadata 
>>>> cache entry), while operation like creating a snapshot take the lock 
>>>> for exclusive access.  That doesn't work with freeze/thaw.
>>>
>>> The trouble with this is that you increase the amount of re-entrance 
>>> whereas freeze/thaw doesn't.
>>>
>>> The code from the beginning of the request to where the mutex is 
>>> acquired will be executed for every single request even while 
>>> requests are blocked at the mutex acquisition.
>>
>> It's just a few instructions.
>>
>>>
>>> With freeze/thaw, you freeze the queue and prevent any request from 
>>> starting until you thaw.  You only thaw and return control to allow 
>>> another request to execute when you begin executing an asynchronous 
>>> I/O callback.
>>>
>>
>> What do you actually save?  The longjmp() to the coroutine code, 
>> linking in to the mutex wait queue, and another switch back to the 
>> main coroutine?  Given that we don't expect to block often, it seems 
>> hardly a cost worth optimizing.

And if you cared, you could probably optimize it in the mutex
implementation.

> It's a matter of correctness, not optimization.
> 
> Consider the following example:
> 
> coroutine {
>     l2 = find_l2(offset);
>     // allocates but does not increment max cluster offset
>     l2[l2_offset(offset)] = alloc_cluster();
> 
>     co_mutex_lock(&lock);
>     write_l2(l2);
>     co_mutex_unlock(&lock);
> 
>     l1[l1_offset(offset)] = l2;
> 
>     co_mutex_lock(&lock);
>     write_l1(l1);
>     co_mutex_unlock(&lock);
> 
>     commit_cluster(l2[l2_offset(offset)]);
> }
> 
> This code is incorrect.  The code before the first co_mutex_lock() may 
> be called a dozen times before before anyone finishes this routine.  
> That means the same cluster is reused multiple times.

But this is not a new problem. In qcow2 we have this today:

cluster_offset = alloc_cluster();
bdrv_aio_write()
...
update_l2()

There is already code to handle this case, even though it could probably
be simplified for coroutines. I don't think we want to make the code
worse during the conversion by basically serializing everything.

> This code was correct before we introduced coroutines because we 
> effectively had one big lock around the whole thing.

So what's the lesson to learn? You need to take care when you convert a
big lock into smaller ones?

>>> I think my previous example was wrong, you really want to do:
>>>
>>> qcow2_aio_writev() {
>>>    coroutine {
>>>        freeze();
>>>        sync_io(); // existing qcow2 code
>>>        thaw();
>>>        // existing non I/O code
>>>        bdrv_aio_writev(callback); // no explicit freeze/thaw needed
>>>    }
>>> }
>>>
>>> This is equivalent to our existing code because no new re-entrance is 
>>> introduced.  The only re-entrancy points are in the 
>>> bdrv_aio_{readv,writev} calls.
>>
>> This requires you to know which code is sync, and which code is 
>> async.  My conversion allows you to wrap the code blindly with a 
>> mutex, and have it do the right thing automatically.  This is most 
>> useful where the code can be sync or async depending on data (which is 
>> the case for qcow2).

Well, but in the case of qcow2, you don't want to have a big mutex
around everything. We perfectly know which parts are asynchronous and
which are synchronous, so we'd want to do it finer grained from the
beginning.

Kevin

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O, (continued)
- [Qemu-devel] [RFC][PATCH 01/12] coroutine: Add gtk-vnc coroutines library, Stefan Hajnoczi, 2011/01/22
  - Re: [Qemu-devel] [RFC][PATCH 01/12] coroutine: Add gtk-vnc coroutines library, Avi Kivity, 2011/01/26

Prev by Date: Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O
Next by Date: Re: [Qemu-devel] [RFC][PATCH 05/12] coroutine: Add coroutines
Previous by thread: Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O
Next by thread: Re: [Qemu-devel] [RFC][PATCH 11/12] qcow2: Convert qcow2 to use coroutines for async I/O
Index(es):
- Date
- Thread