Re: [Qemu-devel] [PATCH] rdma: Fix cleanup in error paths
From: Juan Quintela
Subject: Re: [Qemu-devel] [PATCH] rdma: Fix cleanup in error paths
Date: Wed, 25 Mar 2015 12:20:09 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux)
Padmanabh Ratnakar <address@hidden> wrote:
> As part of commit e325b49a320b493cc5d69e263751ff716dc458fe,
> order in which resources are destroyed was changed for fixing
> a seg fault. Due to this change, CQ will never get destroyed as
> CQ should be destroyed after QP destruction. Seg fault is caused
> improper cleanup when connection fails. Fixing cleanup after
> connection failure and order in which resources are destroyed
> in qemu_rdma_cleanup() routine.
>
> Signed-off-by: Meghana Cheripady <address@hidden>
> Signed-off-by: Padmanabh Ratnakar <address@hidden>
> ---
> migration/rdma.c | 22 ++++++++--------------
> 1 files changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/migration/rdma.c b/migration/rdma.c
> index e6c3a67..77e3444 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2194,6 +2194,10 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
> }
> }
>
> + if (rdma->qp) {
> + rdma_destroy_qp(rdma->cm_id);
> + rdma->qp = NULL;
> + }
Agreed with this change.
> if (rdma->cq) {
> ibv_destroy_cq(rdma->cq);
> rdma->cq = NULL;
> @@ -2206,18 +2210,14 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
> ibv_dealloc_pd(rdma->pd);
> rdma->pd = NULL;
> }
> - if (rdma->listen_id) {
> - rdma_destroy_id(rdma->listen_id);
> - rdma->listen_id = NULL;
> - }
I am not sure about this one. On the receiving side we have:
create listen_id
cm_id = rdma_accept(listen_id)
So it looks better to do the cleanup in the other order, but note that
with the current code I think this doesn't matter (i.e. we can't call
qemu_rdma_cleanup() from any place other than accept).
> if (rdma->cm_id) {
> - if (rdma->qp) {
> - rdma_destroy_qp(rdma->cm_id);
> - rdma->qp = NULL;
> - }
This is the "companion" of the 1st chunk, and I agree with the change.
> rdma_destroy_id(rdma->cm_id);
> rdma->cm_id = NULL;
> }
> + if (rdma->listen_id) {
> + rdma_destroy_id(rdma->listen_id);
> + rdma->listen_id = NULL;
> + }
This is the companion of the second chunk; I can't understand why it is moved.
> if (rdma->channel) {
> rdma_destroy_event_channel(rdma->channel);
> rdma->channel = NULL;
> @@ -2309,8 +2309,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error
> **errp)
> if (ret) {
> perror("rdma_connect");
> ERROR(errp, "connecting to destination!");
> - rdma_destroy_id(rdma->cm_id);
> - rdma->cm_id = NULL;
> goto err_rdma_source_connect;
> }
These other three are nice: they remove code and make it correct for the
case where the qp has to be removed first.
So, should we drop the listen_id part, or is there a reason for it?
Later, Juan.
> @@ -2319,8 +2317,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error
> **errp)
> perror("rdma_get_cm_event after rdma_connect");
> ERROR(errp, "connecting to destination!");
> rdma_ack_cm_event(cm_event);
> - rdma_destroy_id(rdma->cm_id);
> - rdma->cm_id = NULL;
> goto err_rdma_source_connect;
> }
>
> @@ -2328,8 +2324,6 @@ static int qemu_rdma_connect(RDMAContext *rdma, Error
> **errp)
> perror("rdma_get_cm_event != EVENT_ESTABLISHED after rdma_connect");
> ERROR(errp, "connecting to destination!");
> rdma_ack_cm_event(cm_event);
> - rdma_destroy_id(rdma->cm_id);
> - rdma->cm_id = NULL;
> goto err_rdma_source_connect;
> }
> rdma->connected = true;
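
For reference, the teardown order that the patched qemu_rdma_cleanup() ends up with, as far as the hunks quoted above show it, can be sketched with mock objects. This is only an illustration: struct mock_ctx, release() and mock_cleanup() are hypothetical names, nothing here calls the real verbs or rdmacm API, and resources that the hunks do not show are left out. Each resource is released only if it is still allocated, so the same routine can be reached from an error path that never got as far as creating the QP, and calling it twice does nothing.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the real RDMA objects; only the ordering matters. */
enum { MAX_LOG = 8 };

struct mock_ctx {
    /* Nonzero means "allocated". */
    int qp, cq, pd, cm_id, listen_id, channel;
    /* Names of the resources released so far, in release order. */
    const char *log[MAX_LOG];
    int nlog;
};

/* Release one resource if it is still allocated, and record its name. */
static void release(struct mock_ctx *c, int *res, const char *name)
{
    if (*res) {
        if (c->nlog < MAX_LOG) {
            c->log[c->nlog++] = name;
        }
        *res = 0; /* mirrors "rdma->xxx = NULL" in the patch */
    }
}

/* Same order as the patched cleanup routine quoted above. */
static void mock_cleanup(struct mock_ctx *c)
{
    release(c, &c->qp, "qp");               /* QP goes before the CQ */
    release(c, &c->cq, "cq");
    release(c, &c->pd, "pd");
    release(c, &c->cm_id, "cm_id");
    release(c, &c->listen_id, "listen_id"); /* the part questioned above */
    release(c, &c->channel, "channel");
}
```

With this shape, the connect error paths only need to jump to the common cleanup instead of destroying cm_id themselves, which is what the last three hunks do.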