qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] [PATCH] fix: avoid infinite loop when blockjob encounte


From: Max Reitz
Subject: Re: [Qemu-block] [PATCH] fix: avoid infinite loop when blockjob encountering failure
Date: Wed, 14 Jun 2017 15:12:40 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0

Thanks for your patch! The issue can be reproduced as follows:

$ qemu-img create -f qcow2 -b \
    "json:{'driver':'raw','file':{
        'driver':'blkdebug','inject-error':[{'event':'write_aio'}],
        'image':{'driver':'null-co'}}}" \
     overlay.qcow2
$ qemu-io -c 'write 0 64k' overlay.qcow2
$ qemu-img commit overlay.qcow2

While your patch fixes that issue, I still have some comments:

On 2017-06-14 08:22, sochin.jiang wrote:
> From: "sochin.jiang" <address@hidden>
> 
> img_commit could fall into infinite loop if it's blockjob

This should be "into an infinite loop" and "its" instead if "it's".

> 

This empty line should be omitted.

> fail encountering any I/O error. Try to fix it.

Should be "fails on any I/O error" or "fails on encountering any I/O
error". Also, you're not trying to fix it but let's all hope you really
are fixing it. :-)

(So "Fix it." instead of "Try to fix it.")

> 
> Signed-off-by: sochin.jiang <address@hidden>
> ---
>  qemu-img.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-img.c b/qemu-img.c
> index 0ad698d..6ba565d 100644
> --- a/qemu-img.c
> +++ b/qemu-img.c
> @@ -895,8 +895,11 @@ static void run_block_job(BlockJob *job, Error **errp)
>          aio_poll(aio_context, true);
>          qemu_progress_print(job->len ?
>                              ((float)job->offset / job->len * 100.f) : 0.0f, 
> 0);
> -    } while (!job->ready);
> +    } while (!job->ready && !job->ret);

I think it would be better to test job->completed instead of job->ret.

>  
> +    if (job->ret) {
> +        return;
> +    }

We shouldn't just return here but still do all the deinitialization like
call aio_context_release(). I guess the best would be to just skip the
block_job_complete_sync() call if job->completed is true.

>      block_job_complete_sync(job, errp);
>      aio_context_release(aio_context);

Then, there are three more issues I found while reviewing this patch:

First, if the block job is completed before block_job_complete_sync() is
called (i.e. if an error occurred), it is automatically freed. This is
bad because this means we'll have some instances of use-after-free here.
Therefore, we need to invoke block_job_ref() before run_block_job() and
block_job_unref() afterwards. (And since these functions are currenctly
static in blockjob.c, we'll have to make them global.)

Secondly, run_block_job() doesn't evaluate job->ret. Therefore it will
report success even if the commit failed (it is expecting
block_job_complete_sync() to put an error into errp, but it will not do
that). So we'll have to do that (manually check job->ret and if it's
negative, put an error message into errp; also, assert that
job->cancelled is false).

Thirdly, we have segfault in bdrv_reopen_prepare() if the image has
non-string options... I'll handle this one.

I can also handle the other two issues, if you'd like me to.


Finally, an iotest would be nice (see my reproducer above). But I can
handle that as well, if you decide not to write one.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]