Peng,
In my analysis, the root cause is the aio_context lock: the QEMU main thread does an unnecessary release/acquire,
which is why the IO thread can take a lock it shouldn't hold at this stage.
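
To illustrate the interleaving I mean, here is a minimal standalone sketch using plain pthreads. It is not QEMU code: struct node, io_thread() and run_setup() are hypothetical names, the mutex only plays the role of the aio_context lock, and the pthread_join() inside the window is there purely to force the bad ordering so it reproduces every time.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a filter node; NOT QEMU's BlockDriverState. */
struct node {
    pthread_mutex_t lock;   /* plays the role of the aio_context lock */
    bool inserted;          /* node is already visible to I/O */
    void *opaque;           /* only valid once setup has finished */
    bool saw_null_opaque;   /* recorded by the I/O thread */
};

/* Simulated I/O thread: takes the lock and looks at the node once. */
static void *io_thread(void *arg)
{
    struct node *n = arg;

    pthread_mutex_lock(&n->lock);
    if (n->inserted && n->opaque == NULL) {
        /* A real write path would dereference the NULL pointer here. */
        n->saw_null_opaque = true;
    }
    pthread_mutex_unlock(&n->lock);
    return NULL;
}

/* Returns true if the I/O thread observed the half-initialized node. */
bool run_setup(bool drop_lock_midway)
{
    static int job_state;   /* dummy object for opaque to point at */
    struct node n = { .inserted = false, .opaque = NULL,
                      .saw_null_opaque = false };
    pthread_t io;

    pthread_mutex_init(&n.lock, NULL);
    pthread_mutex_lock(&n.lock);            /* main thread owns the lock */
    pthread_create(&io, NULL, io_thread, &n);

    n.inserted = true;                      /* step 1: node becomes visible */

    if (drop_lock_midway) {
        /* The unnecessary release/acquire: the I/O thread gets in here. */
        pthread_mutex_unlock(&n.lock);
        pthread_join(io, NULL);             /* forces the interleaving */
        pthread_mutex_lock(&n.lock);
    }

    n.opaque = &job_state;                  /* step 2: setup completes */
    pthread_mutex_unlock(&n.lock);

    if (!drop_lock_midway) {
        pthread_join(io, NULL);             /* I/O thread runs only now */
    }
    pthread_mutex_destroy(&n.lock);
    return n.saw_null_opaque;
}
```

In this sketch run_setup(true) reports that the IO thread saw the node with a NULL opaque, while run_setup(false), where the lock is held across both steps, does not.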
Thanks,
Michael
At 2021-02-01 20:44:00, "Vladimir Sementsov-Ogievskiy" <vsementsov@virtuozzo.com> wrote:
>Hi!
>
>01.02.2021 15:07, Peng Liang wrote:
>> Hi,
>>
>> I encountered the problem months ago too. Could we move the creation of
>> the block job (block_job_create) before appending the new bs to
>> mirror_top_bs (bdrv_append) as I wrote in [*]? I found that after
>> bdrv_append, qemu will use mirror_top_bs to do write. And when writing,
>> qemu will use bs->opaque, which may be NULL.
>>
>> [*]
>> http://patchwork.ozlabs.org/project/qemu-devel/patch/20200826131910.1879079-1-liangpeng10@huawei.com/
>>
>
>In this patch you create the job over the original bs, whereas jobs are normally created over the job-filter bs. I don't know whether that is wrong, but it at least requires some research, and the code that removes the filter would probably need to be adjusted somehow. Also, you make bs->opaque non-NULL, but the job is still not fully initialized, so some other problem may occur. So, whether we create the job before or after filter insertion, parallel I/O requests to bs should not interrupt mirror_start_job(). So I think Michael's patch is closer to the real problem that needs fixing.
>
>
>--
>Best regards,
>Vladimir