Re: [Qemu-devel] [Qemu-stable][PATCH] rdma: fix multiple VMs parallel mi
From: Frank Yang
Subject: Re: [Qemu-devel] [Qemu-stable][PATCH] rdma: fix multiple VMs parallel migration
Date: Wed, 04 Sep 2013 11:23:36 +0800
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/20130801 Thunderbird/17.0.8

On 2013-9-3 22:13, Michael R. Hines wrote:
>
> No top-posting, please.
>
> On 09/03/2013 12:20 AM, Frank Yang wrote:
>> Yes, it depends on low-level implementation. During my earlier test,
>> using one CQ to send and receive may cause packet loss with heavy load:
>> the destination thinks it send READY message successfully but the source
>> still waits for it. This situation always happens when the destination polls
>> receive CQE first.
>>
>> So I think using only one CQ may cause packet conflict or something like
>> that, and it should be the driver bug. However, using two CQs fix the problem.
>>
>>
>
> This doesn't seem like a very clear answer ..... are you sure its packet loss?
>
> The queue pairs are supposed to be reliable - I've never experienced a
> situation where packets were simply "dropped" for no reason without
> breaking the connection and putting the QP into an error state.
>
> - Michael
>
The facts are:
1. The destination polls the send completion of the READY message
   successfully. So either the READY message really was sent and the source
   did not receive it, or the destination did not send the READY message at
   all.
2. I tried sending the READY message again by adding some code during the
   migration. The source then receives the READY message successfully, so
   the connection is not broken and the QP works fine.

The packet loss I am talking about does not refer only to loss during
transmission. The message may also never actually have been sent out:
ibv_poll_cq() returns with no error, but the source doesn't receive the
message. From qemu's point of view, the message it sent is lost.
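
One thing that may be worth ruling out when a single CQ carries both send and
receive completions is the poller itself: code that waits for one kind of
completion must not throw away the other kind when it happens to arrive
first. The sketch below is only a model of that bookkeeping. It is not QEMU
code and does not use libibverbs; every name in it (model_wc, model_pending,
model_drain, model_take) is hypothetical and stands in for the real work
completion and polling loop.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a work completion; NOT the libibverbs types. */
enum model_opcode { MODEL_SEND, MODEL_RECV };

struct model_wc {
    enum model_opcode opcode;
    int status;                 /* 0 means success in this model */
};

/* Remembers every completion seen, whichever kind the caller wanted. */
struct model_pending {
    int sends_done;
    int recvs_done;
};

/*
 * Drain one batch taken from a shared completion queue. Every successful
 * entry is recorded, so a caller waiting for a RECV does not silently drop
 * a SEND completion (or the reverse). Returns -1 on the first failed
 * completion, otherwise 0.
 */
static int model_drain(const struct model_wc *batch, size_t n,
                       struct model_pending *p)
{
    for (size_t i = 0; i < n; i++) {
        if (batch[i].status != 0)
            return -1;
        if (batch[i].opcode == MODEL_SEND)
            p->sends_done++;
        else
            p->recvs_done++;
    }
    return 0;
}

/* Consume one recorded completion of the wanted kind; 1 if found, else 0. */
static int model_take(struct model_pending *p, enum model_opcode want)
{
    int *slot = (want == MODEL_SEND) ? &p->sends_done : &p->recvs_done;

    if (*slot == 0)
        return 0;
    (*slot)--;
    return 1;
}
```

With a batch in which the receive completion comes first, followed by the
send completion, model_drain() records both, and a later
model_take(&p, MODEL_SEND) still finds the send. Whether anything like this
applies to the one-CQ case here is an assumption that would have to be
checked against the actual polling code.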