[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [PATCH] fix: avoid infinite loop when blockjob encounte
From: |
Max Reitz |
Subject: |
Re: [Qemu-block] [PATCH] fix: avoid infinite loop when blockjob encountering failure |
Date: |
Wed, 14 Jun 2017 15:12:40 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0 |
Thanks for your patch! The issue can be reproduced as follows:
$ qemu-img create -f qcow2 -b \
"json:{'driver':'raw','file':{
'driver':'blkdebug','inject-error':[{'event':'write_aio'}],
'image':{'driver':'null-co'}}}" \
overlay.qcow2
$ qemu-io -c 'write 0 64k' overlay.qcow2
$ qemu-img commit overlay.qcow2
While your patch fixes that issue, I still have some comments:
On 2017-06-14 08:22, sochin.jiang wrote:
> From: "sochin.jiang" <address@hidden>
>
> img_commit could fall into infinite loop if it's blockjob
This should be "into an infinite loop" and "its" instead if "it's".
>
This empty line should be omitted.
> fail encountering any I/O error. Try to fix it.
Should be "fails on any I/O error" or "fails on encountering any I/O
error". Also, you're not trying to fix it but let's all hope you really
are fixing it. :-)
(So "Fix it." instead of "Try to fix it.")
>
> Signed-off-by: sochin.jiang <address@hidden>
> ---
> qemu-img.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/qemu-img.c b/qemu-img.c
> index 0ad698d..6ba565d 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -895,8 +895,11 @@ static void run_block_job(BlockJob *job, Error **errp)
> aio_poll(aio_context, true);
> qemu_progress_print(job->len ?
> ((float)job->offset / job->len * 100.f) : 0.0f,
> 0);
> - } while (!job->ready);
> + } while (!job->ready && !job->ret);
I think it would be better to test job->completed instead of job->ret.
>
> + if (job->ret) {
> + return;
> + }
We shouldn't just return here but still do all the deinitialization like
call aio_context_release(). I guess the best would be to just skip the
block_job_complete_sync() call if job->completed is true.
> block_job_complete_sync(job, errp);
> aio_context_release(aio_context);
Then, there are three more issues I found while reviewing this patch:
First, if the block job is completed before block_job_complete_sync() is
called (i.e. if an error occurred), it is automatically freed. This is
bad because this means we'll have some instances of use-after-free here.
Therefore, we need to invoke block_job_ref() before run_block_job() and
block_job_unref() afterwards. (And since these functions are currenctly
static in blockjob.c, we'll have to make them global.)
Secondly, run_block_job() doesn't evaluate job->ret. Therefore it will
report success even if the commit failed (it is expecting
block_job_complete_sync() to put an error into errp, but it will not do
that). So we'll have to do that (manually check job->ret and if it's
negative, put an error message into errp; also, assert that
job->cancelled is false).
Thirdly, we have segfault in bdrv_reopen_prepare() if the image has
non-string options... I'll handle this one.
I can also handle the other two issues, if you'd like me to.
Finally, an iotest would be nice (see my reproducer above). But I can
handle that as well, if you decide not to write one.
Max
signature.asc
Description: OpenPGP digital signature