Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug


From: Bin Wu
Subject: Re: [Qemu-devel] [PATCH] fix the co_queue multi-adding bug
Date: Tue, 10 Feb 2015 14:34:47 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 2015/2/9 17:23, Paolo Bonzini wrote:
> 
> 
> On 07/02/2015 10:51, w00214312 wrote:
>> From: Bin Wu <address@hidden>
>>
>> When we tested drive_mirror between different hosts over NBD devices,
>> we found that the qemu process sometimes crashes during the cancel phase.
>> By checking the crash core file, we found the following stack, which
>> indicates a coroutine re-enter error:
> 
> This bug probably can be fixed simply by delaying the setting of
> recv_coroutine.
> 
> What are the symptoms if you only apply your "qemu-coroutine-lock: fix
> co_queue multi-adding bug" patch but not "qemu-coroutine: fix
> qemu_co_queue_run_restart error"?

These two patches solve two different problems:
- "qemu-coroutine-lock: fix co_queue multi-adding bug" fixes the coroutine
re-enter problem, which shows up when we send a cancel command just after
drive_mirror has started.
- "qemu-coroutine: fix qemu_co_queue_run_restart error" fixes a segfault seen
during the drive_mirror phase while two VMs copy large files to each other.
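
For reference, the abort we hit in the first problem comes from the re-enter
guard at the top of qemu_coroutine_enter(). A trimmed sketch (written from
memory of the code of that time, not a verbatim copy; the real function also
handles the return value of qemu_coroutine_switch()) looks like this:

    void qemu_coroutine_enter(Coroutine *co, void *opaque)
    {
        Coroutine *self = qemu_coroutine_self();

        if (co->caller) {
            /* Entering a coroutine that already has a caller -- e.g. once
             * from the NBD reply handler and once more when the send_mutex
             * CoQueue is restarted -- aborts here.  This is the crash we
             * see in the core file. */
            fprintf(stderr, "Co-routine re-entered recursively\n");
            abort();
        }

        co->caller = self;
        co->entry_arg = opaque;
        qemu_coroutine_switch(self, co, COROUTINE_ENTER);
    }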

> 
> Can you try the patch below?  (Compile-tested only).
> 
> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 6e1c97c..23d6a71 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s,
>      QEMUIOVector *qiov, int offset)
>  {
>      AioContext *aio_context;
> -    int rc, ret;
> +    int rc, ret, i;
>  
>      qemu_co_mutex_lock(&s->send_mutex);
> +
> +    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> +        if (s->recv_coroutine[i] == NULL) {
> +            s->recv_coroutine[i] = qemu_coroutine_self();
> +            break;
> +        }
> +    }
> +
> +    assert(i < MAX_NBD_REQUESTS);
> +    request->handle = INDEX_TO_HANDLE(s, i);
>      s->send_coroutine = qemu_coroutine_self();
> +
>      aio_context = bdrv_get_aio_context(s->bs);
>      aio_set_fd_handler(aio_context, s->sock,
>                         nbd_reply_ready, nbd_restart_write, s);
> @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s,
>  static void nbd_coroutine_start(NbdClientSession *s,
>     struct nbd_request *request)
>  {
> -    int i;
> -
>      /* Poor man semaphore.  The free_sema is locked when no other request
>       * can be accepted, and unlocked after receiving one reply.  */
>      if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
> @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
>      }
>      s->in_flight++;
>  
> -    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> -        if (s->recv_coroutine[i] == NULL) {
> -            s->recv_coroutine[i] = qemu_coroutine_self();
> -            break;
> -        }
> -    }
> -
> -    assert(i < MAX_NBD_REQUESTS);
> -    request->handle = INDEX_TO_HANDLE(s, i);
> +    /* s->recv_coroutine[i] is set as soon as we get the send_lock.  */
>  }
>  
>  static void nbd_coroutine_end(NbdClientSession *s,
> 
> 
> 

-- 
Bin Wu



