From: Max Reitz
Subject: [Qemu-block] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
Date: Thu, 9 Nov 2017 01:48:58 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0

Hi,

More exciting news from the bdrv_drain() front!

I've noticed in the past that iotest 194 sometimes hangs.  I usually run
the tests on tmpfs, but I've just now verified that it happens on my SSD
just as well.

So the reproducer is a plain:

while ./check -raw 194; do :; done

(No difference between raw and qcow2, though.)
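To catch the hang automatically instead of watching a silent terminal, the reproducer loop can be wrapped in coreutils `timeout`. This is just a sketch of that pattern; `CMD` is a stand-in for `./check -raw 194`, and the demo cap and time limit are illustrative values, not from the report:

```shell
# Repeat a reproducer until it fails outright or exceeds a per-run
# time limit.  CMD stands in for "./check -raw 194"; "true" keeps
# this sketch self-contained and fast.
CMD="true"
limit=600   # seconds per run; a hung test 194 never finishes on its own
i=0
while timeout "$limit" $CMD; do
  i=$((i + 1))
  [ "$i" -ge 3 ] && break   # demo cap; drop this line for a real hunt
done
# timeout exits with status 124 when it had to kill the command, which
# is how a hang (as opposed to a plain test failure) shows up here.
echo "stopped after $i successful runs"
```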

And then, after a couple of runs (or a few dozen), it will just hang.
The reason is that the source VM lingers around and doesn't quit
voluntarily -- the test itself was successful, but the VM just can't
exit.

If you force it to exit by killing the VM (e.g. through pkill -11 qemu),
this is the backtrace:

#0  0x00007f7cfc297e06 in ppoll () at /lib64/libc.so.6
#1  0x0000563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2  0x0000563b846bcac9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3  0x0000563b846be711 in aio_poll (address@hidden, blocking=<optimized out>) at util/aio-posix.c:629
#4  0x0000563b8463afa4 in bdrv_drain_recurse (address@hidden, address@hidden) at block/io.c:201
#5  0x0000563b8463baff in bdrv_drain_all_begin () at block/io.c:381
#6  0x0000563b8463bc99 in bdrv_drain_all () at block/io.c:411
#7  0x0000563b8459888b in block_migration_cleanup (opaque=<optimized out>) at migration/block.c:714
#8  0x0000563b845883be in qemu_savevm_state_cleanup () at migration/savevm.c:1251
#9  0x0000563b845811fd in migration_thread (opaque=0x563b856f1da0) at migration/migration.c:2298
#10 0x00007f7cfc56f36d in start_thread () at /lib64/libpthread.so.0
#11 0x00007f7cfc2a3e1f in clone () at /lib64/libc.so.6


And when you make bdrv_drain_all_begin() print what we are trying to
drain, you can see that it's the format node (managed by the "raw"
driver in this case).

So I thought, before I put more time into this, let's ask whether the
test author has any ideas. :-)

Max


