[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 1/1] coroutine-lock: do not touch coroutine afte
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] [PATCH 1/1] coroutine-lock: do not touch coroutine after another one has been entered |
Date: |
Tue, 30 May 2017 11:35:06 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.1.0 |
On 30/05/2017 10:53, Roman Pen wrote:
> The fix is obvious: use temporal queue and do not touch coroutine after
> first qemu_coroutine_enter() is invoked.
>
> The issue is quite rare and happens every ~12 hours on very high IO load
> (building linux kernel with -j512 inside guest) when IO throttling is
> enabled. With the fix applied guest is running ~35 hours and is still
> alive so far.
This is a good one. Thanks! Just a couple notes on English below.
> diff --git a/util/qemu-coroutine-lock.c b/util/qemu-coroutine-lock.c
> index 6328eed26bc6..026f5297f0f9 100644
> --- a/util/qemu-coroutine-lock.c
> +++ b/util/qemu-coroutine-lock.c
> @@ -77,10 +77,23 @@ void coroutine_fn qemu_co_queue_wait(CoQueue *queue,
> CoMutex *mutex)
> void qemu_co_queue_run_restart(Coroutine *co)
> {
> Coroutine *next;
> + QSIMPLEQ_HEAD(, Coroutine) tmp_queue_wakeup =
> + QSIMPLEQ_HEAD_INITIALIZER(tmp_queue_wakeup);
>
> trace_qemu_co_queue_run_restart(co);
> - while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) {
> - QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next);
> +
> + /*
> + * We use temporal queue in order not to touch 'co' after entering
> + * 'next' coroutine. The thing is that 'next' coroutine can resume
> + * current 'co' coroutine and eventually terminate (free) it (see
> + * linux-aio.c: ioq_submit() where qemu_laio_process_completions()
> + * is invoked). The rule of thumb is simple: do not touch coroutine
> + * when you enter another one.
> + */
/* Because "co" has yielded, any coroutine that we wakeup can resume it.
* If this happens and "co" terminates, co->co_queue_wakeup becomes
* invalid memory. Therefore, use a temporary queue and do not touch
* the "co" coroutine as soon as you enter another one.
*/
> + QSIMPLEQ_CONCAT(&tmp_queue_wakeup, &co->co_queue_wakeup);
> +
> + while ((next = QSIMPLEQ_FIRST(&tmp_queue_wakeup))) {
> + QSIMPLEQ_REMOVE_HEAD(&tmp_queue_wakeup, co_queue_next);
> qemu_coroutine_enter(next);
> }
> }
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index 486af9a62275..d020b63742af 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -126,6 +126,13 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine
> *co)
>
> qemu_co_queue_run_restart(co);
>
> + /*
> + * BEWARE: in case of ret == COROUTINE_YIELD here at this point
> + * after qemu_co_queue_run_restart() 'co' can be already
> + * freed by other coroutine, which has entered 'co'. So
> + * be careful and do not touch it.
> + */
/* Beware, if ret == COROUTINE_YIELD and qemu_co_queue_run_restart()
* has started any other coroutine, "co" might have been reentered
* and even freed by now! So be careful and do not touch it.
*/
> switch (ret) {
> case COROUTINE_YIELD:
> return;
>
Paolo