From: Prerna Saxena
Subject: Re: [Qemu-devel] [PULL 3/3] vhost-user: Attempt to fix a race with set_mem_table.
Date: Sun, 14 Aug 2016 09:42:11 +0000
On 14/08/16 8:21 am, "Michael S. Tsirkin" <address@hidden> wrote:
>On Fri, Aug 12, 2016 at 07:16:34AM +0000, Prerna Saxena wrote:
>>
>> On 12/08/16 12:08 pm, "Fam Zheng" <address@hidden> wrote:
>>
>> >On Wed, 08/10 18:30, Michael S. Tsirkin wrote:
>> >> From: Prerna Saxena <address@hidden>
>> >>
>> >> The set_mem_table command currently does not expect a reply. Hence, there is
>> >> no easy way for a remote application to notify QEMU when it has finished
>> >> setting up memory, or if there were errors doing so.
>> >>
>> >> As an example:
>> >> (1) QEMU sends a SET_MEM_TABLE to the backend (e.g., a vhost-user net
>> >> application). SET_MEM_TABLE does not require a reply according to the
>> >> spec.
>> >> (2) QEMU commits the memory to the guest.
>> >> (3) Guest issues an I/O operation over a new memory region which was
>> >> configured in step (1).
>> >> (4) The application has not yet remapped the memory, but it sees the I/O
>> >> request.
>> >> (5) The application cannot satisfy the request because it does not know
>> >> about those GPAs.
>> >>
>> >> While a guaranteed fix would require a protocol extension (committed
>> >> separately), a best-effort workaround for existing applications is to send
>> >> a GET_FEATURES message before completing the vhost_user_set_mem_table()
>> >> call. Since GET_FEATURES requires a reply, an application that processes
>> >> vhost-user messages synchronously would probably have completed the
>> >> SET_MEM_TABLE before replying.
>> >>
>> >> Signed-off-by: Prerna Saxena <address@hidden>
>> >> Reviewed-by: Michael S. Tsirkin <address@hidden>
>> >> Signed-off-by: Michael S. Tsirkin <address@hidden>
>> >
>> >Sporadic hangs are seen with test-vhost-user after this patch:
>> >
>> >https://travis-ci.org/qemu/qemu/builds
>> >
>> >Reverting seems to fix it for me.
>> >
>> >Is this a known problem?
>> >
>> >Fam
>>
>> Hi Fam,
>> Thanks for reporting the sporadic hangs. I had seen 'make check' pass on my
>> CentOS 6 environment, so I missed this.
>> I am setting up the docker test env to reproduce this, but I think I can
>> guess the problem:
>>
>> In tests/vhost-user-test.c:
>>
>> static void chr_read(void *opaque, const uint8_t *buf, int size)
>> {
>>     ..[snip]..
>>
>>     case VHOST_USER_SET_MEM_TABLE:
>>         /* received the mem table */
>>         memcpy(&s->memory, &msg.payload.memory, sizeof(msg.payload.memory));
>>         s->fds_num = qemu_chr_fe_get_msgfds(chr, s->fds,
>>                                             G_N_ELEMENTS(s->fds));
>>
>>         /* signal the test that it can continue */
>>         g_cond_signal(&s->data_cond);
>>         break;
>>     ..[snip]..
>> }
>>
>>
>> The test seems to be marked complete as soon as the mem table is copied.
>> However, patch 3/3 changes the behaviour of the SET_MEM_TABLE command
>> implementation within QEMU: SET_MEM_TABLE now sends out a follow-up
>> GET_FEATURES message, and the call only completes once features are received
>> from the remote application (or the test framework, as is the case here).
>
>Hmm but why does it matter that data_cond is woken up?
Michael, sorry, I didn't quite understand that. Could you please explain?
>
>
>> While the test itself can be modified (do not signal completion until we have
>> sent a follow-up response to GET_FEATURES), I am now wondering if this patch
>> may break existing vhost applications too. If so, reverting it is possibly
>> the better option.
>
>What bothers me is that the new feature might cause the same
>issue once we enable it in the test.
No, it won't. The new feature is a protocol extension, and only takes effect if
it has been negotiated. If it is not negotiated, that part of the code is never
executed.
>
>How about a patch to tests/vhost-user-test.c adding the new
>protocol feature? I would be quite interested to see what
>is going on with it.
Yes, that can be done. But you can see that the protocol extension patch will
not change the behaviour of the _existing_ test.
>
>
>> What confuses me is why it doesn't fail all the time, but only about 20% to
>> 30% of the time, as Fam reports.
>
>And succeeds every time on my systems :(
+1 to that :( I have had no luck reproducing it.
>
>>
>> Thoughts: Michael, Fam, Marc-André?
>>
>> Regards,
>>
Prerna