qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v6 00/33] vhost-user reconnect fixes


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v6 00/33] vhost-user reconnect fixes
Date: Fri, 29 Jul 2016 06:17:42 +0300

On Wed, Jul 27, 2016 at 01:14:54AM +0400, address@hidden wrote:
> From: Marc-André Lureau <address@hidden>
> 
> Hi,
> 
> Since 'vhost-user: simple reconnection support' has been merged, it is
> possible to disconnect and reconnect a vhost-user backend. However,
> many code paths in qemu may trigger assert() when the backend is
> disconnected.
> 
> There are also code paths that are wrong, see "don't assume opaque is
> a fd" patch for an example. Once those patches are reviewed & merged,
> they are good candidates for stable too.

Thanks, I merged most of these except the new tests.

> Some assert() could simply be replaced by error_report() or silently
> fail since they are recoverable cases. Some missing error checks can
> also help prevent later issues. The errors are reported up to vhost.c,
> as the vhost-user backend alone doesn't handle disconnected state
> transparently so far. There are still problematic code paths left
> after this series, for example, starting a migration with a
> disconnected backend will abort(). It is likely that other problematic
> code path exists (vhost_scsi_start failure is fatal, but there are no
> vhost-user backend that I know yet).
> 
> In many cases, the code assumes get_vhost_net() will be non-NULL after
> a succesful connection, so I changed it to stay after a disconnect
> until the new connection comes (as suggested by Michael).
> 
> Since there is feature checks on reconnection, qemu should wait for
> the initial connection feature negotiation to complete. The test added
> demonstrates this. Additionally, a regression was found during v2,
> which could have been spotted with a multiqueue test, so add a basic
> one that would have exhibited the issue.
> 
> For convenience, the series is also available on:
> https://github.com/elmarco/qemu, branch vhost-user-reconnect
> 
> v6:
> - fixes after Ilya Maximets review
> - add "disconnect on HUP" for cases where peer disconnect doesn't
>   trigger qemu disconnect event (reconnect won't work)
> - add a flags mismatch on reconnect test
> 
> v5:
> - rebased
> - use a VHOST_OPS_DEBUG macro to print vhost_ops errors
> - replace assert(foo != NULL) with assert(foo)
> - add "RFC: vhost: do not update last avail idx"
> 
> v4:
> - change notify_migration_done() patch to be VHOST_BACKEND_TYPE_USER
>   specific, to avoid having to handle the case where the backend
>   doesn't implement the callback
> - change vhost_dev_cleanup() to assert on empty log, instead of
>   adding a call to vhost_log_put()
> - made "keep vhost_net after a disconnection" more robust, got rid of
>   the acked_features field
> - improve commit log, and some patch reorganization for clarity
> 
> v3:
> - add vhost-user multiqueue test, which would have helped to find the
>   following fix
> - fix waiting on vhost-user connection with multiqueue (found by
>   Yuanhan Liu)
> - merge vhost_user_{read,write}() error checking patches
> - add error reporting to vhost_user_read() (similar to
>   vhost_user_write())
> - add a vhost_net_set_backend() wrapper to help with potential crash
> - some leak fixes
> 
> v2:
> - some patch ordering: minor fix, close(fd) fix,
>   assert/fprintf->error_report, check and return error,
>   vhost_dev_cleanup() fixes, keep vhost_net after a disconnect, wait
>   until connection is ready
> - merge read/write error checks
> - do not rely on link state to check vhost-user init completed
> 
> Marc-André Lureau (33):
>   misc: indentation
>   vhost-user: minor simplification
>   vhost-user: disconnect on HUP
>   vhost: don't assume opaque is a fd, use backend cleanup
>   vhost: make vhost_log_put() idempotent
>   vhost: assert the log was cleaned up
>   vhost: fix cleanup on not fully initialized device
>   vhost: make vhost_dev_cleanup() idempotent
>   vhost-net: always call vhost_dev_cleanup() on failure
>   vhost: fix calling vhost_dev_cleanup() after vhost_dev_init()
>   vhost: do not assert() on vhost_ops failure
>   vhost: add missing VHOST_OPS_DEBUG
>   vhost: use error_report() instead of fprintf(stderr,...)
>   qemu-char: fix qemu_chr_fe_set_msgfds() crash when disconnected
>   vhost-user: call set_msgfds unconditionally
>   vhost-user: check qemu_chr_fe_set_msgfds() return value
>   vhost-user: check vhost_user_{read,write}() return value
>   vhost-user: keep vhost_net after a disconnection
>   vhost-user: add get_vhost_net() assertions
>   Revert "vhost-net: do not crash if backend is not present"
>   vhost-net: vhost_migration_done is vhost-user specific
>   vhost: add assert() to check runtime behaviour
>   char: add chr_wait_connected callback
>   char: add and use tcp_chr_wait_connected
>   vhost-user: wait until backend init is completed
>   tests: plug some leaks in virtio-net-test
>   tests: fix vhost-user-test leak
>   tests: add /vhost-user/connect-fail test
>   tests: add a simple /vhost-user/multiqueue test
>   vhost-user: add error report in vhost_user_write()
>   vhost: add vhost_net_set_backend()
>   vhost-user-test: add flags mismatch test
>   RFC: vhost: do not update last avail idx on get_vring_base() failure
> 
>  hw/net/vhost_net.c        |  34 +++-----
>  hw/virtio/vhost-user.c    |  67 ++++++++++-----
>  hw/virtio/vhost.c         | 162 +++++++++++++++++++++++-------------
>  include/hw/virtio/vhost.h |   4 +
>  include/sysemu/char.h     |   8 ++
>  net/tap.c                 |   1 +
>  net/vhost-user.c          |  65 +++++++++------
>  qemu-char.c               |  82 ++++++++++++------
>  tests/Makefile.include    |   2 +-
>  tests/vhost-user-test.c   | 206 
> +++++++++++++++++++++++++++++++++++++++++++++-
>  tests/virtio-net-test.c   |  12 ++-
>  11 files changed, 486 insertions(+), 157 deletions(-)
> 
> -- 
> 2.9.0



reply via email to

[Prev in Thread] Current Thread [Next in Thread]