qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC for 2.9 1/1] block: add missed aio_context_acquire


From: Denis V. Lunev
Subject: Re: [Qemu-devel] [RFC for 2.9 1/1] block: add missed aio_context_acquire into blk_unref
Date: Mon, 27 Mar 2017 19:17:28 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

On 03/27/2017 06:45 PM, Max Reitz wrote:
> @Subject: Do you mean "PATCH for-2.9?"? Because "RFC" to me means
> "please do not merge". ;-)
>
> I wouldn't mind a change like this to go into 2.9.
I am quite sure that problem is here but unsure about the place.
Here I use term 'RFC' as 'Can we start the discussion'?


> On 27.03.2017 17:35, Denis V. Lunev wrote:
>> Recently we expirience hang with iothreads enabled with the following
>> call trace:
>> Thread 1 (Thread 0x7fa95efebc80 (LWP 177117)):
>> 0  ppoll () from /lib64/libc.so.6
>> 2  qemu_poll_ns () at qemu-timer.c:313
>> 3  aio_poll () at aio-posix.c:457
>> 4  bdrv_flush () at block/io.c:2641
>> 5  bdrv_close () at block.c:2143
>> 6  bdrv_delete () at block.c:2352
>> 7  bdrv_unref () at block.c:3429
>> 8  blk_remove_bs () at block/block-backend.c:427
>> 9  blk_delete () at block/block-backend.c:178
>> 10 blk_unref () at block/block-backend.c:226
>> 11 object_property_del_all () at qom/object.c:399
>> 12 object_finalize () at qom/object.c:461
>> 13 object_unref () at qom/object.c:898
>> 14 object_property_del_child () at qom/object.c:422
>> 15 qmp_marshal_device_del () at qmp-marshal.c:1145
>> 16 handle_qmp_command () at /usr/src/debug/qemu-2.6.0/monitor.c:3929
>>
>> Technically bdrv_flush() stucks in
>>     while (rwco.ret == NOT_DONE) {
>>         aio_poll(aio_context, true);
>>     }
>> but rwco.ret is equal to 0 thus we have missed wakeup. Code investigation
>> reveals that we do not have performed aio_context_acquire() on this call
>> stack.
>>
>> This patch adds missed lock.
>>
>> Signed-off-by: Denis V. Lunev <address@hidden>
>> CC: Kevin Wolf <address@hidden>
>> CC: Max Reitz <address@hidden>
>> CC: Eric Blake <address@hidden>
>> CC: Markus Armbruster <address@hidden>
>> ---
>>  block/block-backend.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index 5742c09..65d5da9 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -273,7 +273,11 @@ void blk_unref(BlockBackend *blk)
>>      if (blk) {
>>          assert(blk->refcnt > 0);
>>          if (!--blk->refcnt) {
>> +            AioContext *ctx = blk_get_aio_context(blk);
>> +
>> +            aio_context_acquire(ctx);
>>              blk_delete(blk);
>> +            aio_context_release(ctx);
> But I don't think this is quite the correct place. The caller of
> blk_unref() should have acquired the AioContext already. As far as I can
> tell in this case that would be originally release_drive() in
> hw/core/qdev-properties-system.c and then blk_detach_dev().
>
> I think the former would be the somehow more correct place but I can
> imagine the latter to be more useful in reality. I'll leave it to you.
>
> (As an alternative, you may of course convince me that this patch is
> indeed correct and should be taken as-is. :-))
ha-ha ;) no, I not going to try to convince you and that is why the
patch was
marked as RFC. I like release_drive() place. This should solve the problem
for me.

Thank you for the quick answer and good proposal.

Den




reply via email to

[Prev in Thread] Current Thread [Next in Thread]