[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 2/3] block: wait for job callback in block_job_c
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-devel] [PATCH 2/3] block: wait for job callback in block_job_cancel_sync |
Date: |
Thu, 19 Apr 2012 11:31:10 +0100 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
On Wed, Apr 18, 2012 at 03:12:02PM +0200, Paolo Bonzini wrote:
> The limitation on not having I/O after cancellation cannot really be
> held. Even streaming has a very small race window where you could
> cancel a job and have it report completion. If this window is hit,
> bdrv_change_backing_file() will yield and possibly cause accesses to
> dangling pointers etc.
>
> So, let's just assume that we cannot know exactly what will happen
> after the coroutine has set busy to false. We can set a very lax
> condition:
>
> - if we cancel the job, the coroutine won't set it to false again
> (and hence will not call co_sleep_ns again).
>
> - block_job_cancel_sync will wait for the coroutine to exit, which
> pretty much ensures no race.
>
> Instead, we put a very strict condition on what to do while busy =
> false. We track the coroutine that executes the job and reenter it
> (thus cancelling a wait for example) before block_job_cancel restarts.
> Thus you cannot really do anything but co_sleep_ns at that time.
>
> This patch also drains the I/O *before* canceling the job, so that
> block_job_cancel is quite sure to find the coroutine in quiescent
> state (busy = false). For mirroring, this means that the job will
> complete itself with a consistent view of the disk.
>
> Signed-off-by: Paolo Bonzini <address@hidden>
> ---
> block.c | 41 +++++++++++++++++++++++++++++++++++++++--
> block/stream.c | 7 +++----
> block_int.h | 17 ++++++++++++-----
> 3 files changed, 54 insertions(+), 11 deletions(-)
>
> diff --git a/block.c b/block.c
> index 9c7d896..48f4414 100644
> --- a/block.c
> +++ b/block.c
> @@ -4217,7 +4217,15 @@ int block_job_set_speed(BlockJob *job, int64_t value)
>
> void block_job_cancel(BlockJob *job)
> {
> + /* Complete all guest I/O before cancelling the job, so that if the
> + * job chooses to complete itself it will do so with a consistent
> + * view of the disk.
> + */
> + bdrv_drain_all();
> job->cancelled = true;
> + if (job->co && !job->busy) {
> + qemu_coroutine_enter(job->co, NULL);
> + }
> }
block_job_cancel() is not supposed to block. Now it will wait however
long it takes to complete in-flight I/O. It has become synchronous and
therefore the point of having a completion/callback is gone.
If there really is no asynchronous solution it would be cleaner to throw
away all the cancellation/completion logic and just do it synchronously.
Stefan