From: Stefano Garzarella
Subject: Re: [PATCH v8 00/13] vhost-user: support any POSIX system (tested on macOS, FreeBSD, OpenBSD)
Date: Fri, 5 Jul 2024 10:39:33 +0200
On Wed, Jul 03, 2024 at 06:49:30PM GMT, Michael S. Tsirkin wrote:
> On Tue, Jun 18, 2024 at 12:00:30PM +0200, Stefano Garzarella wrote:
>> As discussed with Michael and Markus [1], this version also includes the patch
>> on which v7 depended to simplify the merge in Michael's tree.
>>
>> The series is all reviewed, so if there are no new changes required, I would
>> ask to merge it.
>
> I dropped patches 9 and 10 for now since otherwise make vm-build-freebsd
> fails.
> Pls figure it out and resend just 9 and 10.
I reproduced it locally, but I can't understand why it only happens on
certain architectures; in my case on loongarch64, ppc64, and riscv32:
326/846 qemu:qtest+qtest-loongarch64 / qtest-loongarch64/qos-test
ERROR 116.10s killed by signal 6 SIGABRT
337/846 qemu:qtest+qtest-ppc64 / qtest-ppc64/qos-test
ERROR 115.10s killed by signal 6 SIGABRT
339/846 qemu:qtest+qtest-riscv32 / qtest-riscv32/qos-test
ERROR 107.65s killed by signal 6 SIGABRT
I focused on ppc64, running `gmake --output-sync -j6 check-qtest-ppc64`
in the FreeBSD VM, and it fails every time. In particular, the test that
fails is `vhost-user/reconnect`; with it disabled as follows, the
qos-test tests always pass:
diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index 0fa8951c9f..c3d686f0ee 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -1118,9 +1119,11 @@ static void register_vhost_user_test(void)
                  "virtio-net",
                  test_migrate, &opts);
+#if 0
     opts.before = vhost_user_test_setup_reconnect;
     qos_add_test("vhost-user/reconnect", "virtio-net",
                  test_reconnect, &opts);
+#endif
     opts.before = vhost_user_test_setup_connect_fail;
     qos_add_test("vhost-user/connect-fail", "virtio-net",
Analyzing the test, what happens is that after the disconnection, the
test doesn't receive the VHOST_USER_SET_MEM_TABLE message, so the second
`wait_for_fds()` fails after the 5-second timeout (increasing it doesn't
help) because no fds were received.
diff --git a/tests/qtest/vhost-user-test.c b/tests/qtest/vhost-user-test.c
index 0fa8951c9f..c3d686f0ee 100644
--- a/tests/qtest/vhost-user-test.c
+++ b/tests/qtest/vhost-user-test.c
@@ -976,6 +976,7 @@ static void test_reconnect(void *obj, void *arg, QGuestAllocator *alloc)
     g_source_set_callback(src, reconnect_cb, s, NULL);
     g_source_attach(src, s->context);
     g_source_unref(src);
+    // THIS one is failing
     g_assert(wait_for_fds(s));
     wait_for_rings_started(s, 2);
 }
This is the test log (note: IIUC the QEMU failures happen after the test
exits on the assertion, so it could mean that the chardev reconnected
correctly):
▶ 28/30
/ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/reconnect
- ERROR:../src/tests/qtest/qos-test.c:191:subprocess_run_one_test: child
process
(/ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/reconnect/subprocess
[54991]) failed unexpectedly FAIL
▶ 28/30
ERROR
[28-30/30] 🌒 qemu:qtest+qtest-ppc64 / qtest-ppc64/qmp-cmd-test
[28-30/30] 🌓 qemu:qtest+qtest-ppc64 / qtest-ppc64/migration-test
28/30 qemu:qtest+qtest-ppc64 / qtest-ppc64/qos-test
ERROR 21.53s killed by signal 6 SIGABRT
>>> PYTHON=/usr/home/qemu/qemu-test.OD8v2L/build/pyvenv/bin/python3.9
ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1
G_TEST_DBUS_DAEMON=/usr/home/qemu/qemu-test.OD8v2L/src/tests/dbus-vmstate-daemon.sh
QTEST_QEMU_BINARY=./qemu-system-ppc64 MALLOC_PERTURB_=141 QTEST_QEMU_IMG=./qemu-img
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
/usr/home/qemu/qemu-test.OD8v2L/build/tests/qtest/qos-test --tap -k
―――――――――――――――――――――――――――――――――――――――― ✀
――――――――――――――――――――――――――――――――――――――――
stderr:
Vhost user backend fails to broadcast fake RARP
qemu-system-ppc64: -chardev
socket,id=chr-reconnect,path=/tmp/vhost-test-Z5VMQ2/reconnect.sock,server=on:
info: QEMU waiting for connection on:
disconnected:unix:/tmp/vhost-test-Z5VMQ2/reconnect.sock,server=on
**
ERROR:../src/tests/qtest/vhost-user-test.c:255:wait_for_fds: assertion failed:
(s->fds_num)
qemu-system-ppc64: Failed to set msg fds.
qemu-system-ppc64: vhost VQ 0 ring restore failed: -22: Invalid argument
(22)
qemu-system-ppc64: Failed to set msg fds.
qemu-system-ppc64: vhost_set_vring_endian failed: Invalid argument (22)
qemu-system-ppc64: Failed to set msg fds.
qemu-system-ppc64: vhost VQ 1 ring restore failed: -22: Invalid argument
(22)
qemu-system-ppc64: Failed to set msg fds.
qemu-system-ppc64: vhost_set_vring_endian failed: Invalid argument (22)
qemu-system-ppc64: Failed to write msg. Wrote -1 instead of 12.
qemu-system-ppc64: vhost_backend_init failed: Protocol error
qemu-system-ppc64: failed to init vhost_net for queue 0
**
ERROR:../src/tests/qtest/qos-test.c:191:subprocess_run_one_test: child
process
(/ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/virtio-net-pci/virtio-net/virtio-net-tests/vhost-user/reconnect/subprocess
[54991]) failed unexpectedly
(test program exited with status code -6)
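As a side note on the "killed by signal 6 SIGABRT" lines: a failed
assertion ends in abort(), which raises SIGABRT in the process that hit
it. The following is a minimal, hedged sketch of that mechanism only; it
does not reproduce how qos-test spawns its subprocesses, and
`signal_from_failed_assert()` is a hypothetical helper name:

```c
#undef NDEBUG           /* keep assert() active in this sketch */
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that fails an assertion, then report which signal
 * terminated it (0 if it exited normally, -1 on fork/waitpid error).
 */
static int signal_from_failed_assert(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        volatile int fds_num = 0;   /* stands in for "no fds received" */
        assert(fds_num);            /* fails, so the child calls abort() */
        _exit(0);                   /* not reached */
    }
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On POSIX systems the returned value equals SIGABRT, which is signal
number 6 on the common platforms.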
I would suspect an endianness problem, but it's strange that it only
happens in the reconnect test. Next week I'll try to figure out why this
is systematic only on some architectures. Does anyone have any ideas?
Thanks,
Stefano