[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [PATCH v6 00/33] vhost-user reconnect fixes
From: |
marcandre . lureau |
Subject: |
[Qemu-devel] [PATCH v6 00/33] vhost-user reconnect fixes |
Date: |
Wed, 27 Jul 2016 01:14:54 +0400 |
From: Marc-André Lureau <address@hidden>
Hi,
Since 'vhost-user: simple reconnection support' has been merged, it is
possible to disconnect and reconnect a vhost-user backend. However,
many code paths in qemu may trigger assert() when the backend is
disconnected.
There are also code paths that are wrong, see "don't assume opaque is
a fd" patch for an example. Once those patches are reviewed & merged,
they are good candidates for stable too.
Some assert() could simply be replaced by error_report() or silently
fail since they are recoverable cases. Some missing error checks can
also help prevent later issues. The errors are reported up to vhost.c,
as the vhost-user backend alone doesn't handle disconnected state
transparently so far. There are still problematic code paths left
after this series, for example, starting a migration with a
disconnected backend will abort(). It is likely that other problematic
code path exists (vhost_scsi_start failure is fatal, but there are no
vhost-user backend that I know yet).
In many cases, the code assumes get_vhost_net() will be non-NULL after
a succesful connection, so I changed it to stay after a disconnect
until the new connection comes (as suggested by Michael).
Since there is feature checks on reconnection, qemu should wait for
the initial connection feature negotiation to complete. The test added
demonstrates this. Additionally, a regression was found during v2,
which could have been spotted with a multiqueue test, so add a basic
one that would have exhibited the issue.
For convenience, the series is also available on:
https://github.com/elmarco/qemu, branch vhost-user-reconnect
v6:
- fixes after Ilya Maximets review
- add "disconnect on HUP" for cases where peer disconnect doesn't
trigger qemu disconnect event (reconnect won't work)
- add a flags mismatch on reconnect test
v5:
- rebased
- use a VHOST_OPS_DEBUG macro to print vhost_ops errors
- replace assert(foo != NULL) with assert(foo)
- add "RFC: vhost: do not update last avail idx"
v4:
- change notify_migration_done() patch to be VHOST_BACKEND_TYPE_USER
specific, to avoid having to handle the case where the backend
doesn't implement the callback
- change vhost_dev_cleanup() to assert on empty log, instead of
adding a call to vhost_log_put()
- made "keep vhost_net after a disconnection" more robust, got rid of
the acked_features field
- improve commit log, and some patch reorganization for clarity
v3:
- add vhost-user multiqueue test, which would have helped to find the
following fix
- fix waiting on vhost-user connection with multiqueue (found by
Yuanhan Liu)
- merge vhost_user_{read,write}() error checking patches
- add error reporting to vhost_user_read() (similar to
vhost_user_write())
- add a vhost_net_set_backend() wrapper to help with potential crash
- some leak fixes
v2:
- some patch ordering: minor fix, close(fd) fix,
assert/fprintf->error_report, check and return error,
vhost_dev_cleanup() fixes, keep vhost_net after a disconnect, wait
until connection is ready
- merge read/write error checks
- do not rely on link state to check vhost-user init completed
Marc-André Lureau (33):
misc: indentation
vhost-user: minor simplification
vhost-user: disconnect on HUP
vhost: don't assume opaque is a fd, use backend cleanup
vhost: make vhost_log_put() idempotent
vhost: assert the log was cleaned up
vhost: fix cleanup on not fully initialized device
vhost: make vhost_dev_cleanup() idempotent
vhost-net: always call vhost_dev_cleanup() on failure
vhost: fix calling vhost_dev_cleanup() after vhost_dev_init()
vhost: do not assert() on vhost_ops failure
vhost: add missing VHOST_OPS_DEBUG
vhost: use error_report() instead of fprintf(stderr,...)
qemu-char: fix qemu_chr_fe_set_msgfds() crash when disconnected
vhost-user: call set_msgfds unconditionally
vhost-user: check qemu_chr_fe_set_msgfds() return value
vhost-user: check vhost_user_{read,write}() return value
vhost-user: keep vhost_net after a disconnection
vhost-user: add get_vhost_net() assertions
Revert "vhost-net: do not crash if backend is not present"
vhost-net: vhost_migration_done is vhost-user specific
vhost: add assert() to check runtime behaviour
char: add chr_wait_connected callback
char: add and use tcp_chr_wait_connected
vhost-user: wait until backend init is completed
tests: plug some leaks in virtio-net-test
tests: fix vhost-user-test leak
tests: add /vhost-user/connect-fail test
tests: add a simple /vhost-user/multiqueue test
vhost-user: add error report in vhost_user_write()
vhost: add vhost_net_set_backend()
vhost-user-test: add flags mismatch test
RFC: vhost: do not update last avail idx on get_vring_base() failure
hw/net/vhost_net.c | 34 +++-----
hw/virtio/vhost-user.c | 67 ++++++++++-----
hw/virtio/vhost.c | 162 +++++++++++++++++++++++-------------
include/hw/virtio/vhost.h | 4 +
include/sysemu/char.h | 8 ++
net/tap.c | 1 +
net/vhost-user.c | 65 +++++++++------
qemu-char.c | 82 ++++++++++++------
tests/Makefile.include | 2 +-
tests/vhost-user-test.c | 206 +++++++++++++++++++++++++++++++++++++++++++++-
tests/virtio-net-test.c | 12 ++-
11 files changed, 486 insertions(+), 157 deletions(-)
--
2.9.0
- [Qemu-devel] [PATCH v6 00/33] vhost-user reconnect fixes,
marcandre . lureau <=
- [Qemu-devel] [PATCH v6 01/33] misc: indentation, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 02/33] vhost-user: minor simplification, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 03/33] vhost-user: disconnect on HUP, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 04/33] vhost: don't assume opaque is a fd, use backend cleanup, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 05/33] vhost: make vhost_log_put() idempotent, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 06/33] vhost: assert the log was cleaned up, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 07/33] vhost: fix cleanup on not fully initialized device, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 08/33] vhost: make vhost_dev_cleanup() idempotent, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 09/33] vhost-net: always call vhost_dev_cleanup() on failure, marcandre . lureau, 2016/07/26
- [Qemu-devel] [PATCH v6 10/33] vhost: fix calling vhost_dev_cleanup() after vhost_dev_init(), marcandre . lureau, 2016/07/26