[Qemu-devel] ping RE: question: I found a qemu crash about migration
From: wangjie (P)
Subject: [Qemu-devel] ping RE: question: I found a qemu crash about migration
Date: Thu, 28 Sep 2017 07:38:45 +0000
Ping?
From: wangjie (P)
Sent: Tuesday, September 26, 2017 9:10 PM
To: address@hidden; address@hidden; address@hidden
Cc: wangjie (P) <address@hidden>; fuweiwei (C) <address@hidden>;
address@hidden; address@hidden; address@hidden; address@hidden; Wubin (H)
<address@hidden>
Subject: question: I found a qemu crash about migration
Hi,
When I use qemuMigrationRun to migrate both memory and storage, with some I/O
pressure in the VM and iothreads configured, QEMU crashes with the following
error (I am using the current qemu master branch):
" bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed"
I reviewed the code and examined the core dump with gdb, and I think the
following case can trigger the error:
Case:
migration_thread()
  migration_completion() ----------> last iteration of memory migration
    vm_stop_force_state() ----------> stops the VM and calls bdrv_drain_all(),
        but in the core file the dirty bitmap count of the drive-mirror job
        is not 0 and there are 16 mirror I/Os in flight
    bdrv_inactivate_all() ----------> inactivates the images and sets the
        INACTIVE flag
      -> bdrv_co_do_pwritev() ----------> the mirror I/O handled afterwards
        triggers the assertion `!(bs->open_flags & 0x0800)' and QEMU crashes
As shown above, migration_completion() calls bdrv_inactivate_all() to
inactivate the images while mirror_run is not yet done (it still has dirty
clusters), so the I/O that mirror_run issues afterwards triggers the error: "
bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed".
It seems that memory migration and storage mirroring proceed independently, and
the order in which the two complete is not fixed.
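The ordering problem described above can be sketched as a small stand-alone C model. This is not QEMU code: ModelBDS, model_write(), model_inactivate() and model_completion() are hypothetical names invented for illustration, and 0x0800 is simply the bit quoted in the assertion message, assumed here to be the "inactive" bit. The model returns false at the point where the real write path would hit the assertion. The finish_mirror_first switch only illustrates that the outcome depends on ordering; it is not a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit quoted in the assertion text; assumed to mark an inactivated image. */
#define MODEL_O_INACTIVE 0x0800

/* Hypothetical, heavily simplified stand-in for a block device state. */
typedef struct ModelBDS {
    int open_flags;
    int dirty_chunks;   /* data the mirror job still has to copy */
} ModelBDS;

/* Models one mirror write. Returns false where the real code would assert. */
bool model_write(ModelBDS *bs)
{
    if (bs->open_flags & MODEL_O_INACTIVE) {
        return false;               /* write issued on an inactive image */
    }
    if (bs->dirty_chunks > 0) {
        bs->dirty_chunks--;         /* one dirty chunk copied */
    }
    return true;
}

/* Models inactivation: only sets the flag, does not stop the mirror job. */
void model_inactivate(ModelBDS *bs)
{
    bs->open_flags |= MODEL_O_INACTIVE;
}

/*
 * Replays the reported sequence. If the mirror job is not allowed to finish
 * first, it still has dirty chunks after inactivation and issues a write
 * that fails the flag check.
 */
bool model_completion(ModelBDS *bs, bool finish_mirror_first)
{
    if (finish_mirror_first) {
        while (bs->dirty_chunks > 0) {
            model_write(bs);
        }
    }
    model_inactivate(bs);
    if (bs->dirty_chunks > 0) {
        return model_write(bs);     /* late mirror I/O */
    }
    return true;
}
```

With a ModelBDS that starts with dirty_chunks greater than 0, model_completion(&bs, false) returns false (the modelled assertion), while model_completion(&bs, true) returns true because nothing is left to write after inactivation.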
How can I solve this problem? Should we avoid setting the INACTIVE flag on the
drive-mirror BlockDriverState?
QEMU crash backtrace:
(gdb) bt
#0 0x00007f6b6e2a71d7 in raise () from /usr/lib64/libc.so.6
#1 0x00007f6b6e2a88c8 in abort () from /usr/lib64/libc.so.6
#2 0x00007f6b6e2a0146 in __assert_fail_base () from /usr/lib64/libc.so.6
#3 0x00007f6b6e2a01f2 in __assert_fail () from /usr/lib64/libc.so.6
#4 0x00000000007b9211 in bdrv_co_pwritev (child=<optimized out>,
address@hidden, address@hidden,
address@hidden, flags=0) at block/io.c:1536
#5 0x00000000007a6f02 in blk_co_pwritev (blk=0x2f92750, offset=7034896384,
bytes=65536, qiov=0x7f69cc09b068,
flags=<optimized out>) at block/block_backend.c:851
#6 0x00000000007a6fc1 in blk_aio_write_entry (opaque=0x301dad0) at
block/block_backend.c:1043
#7 0x0000000000835e2a in coroutine_trampoline (i0=<optimized out>,
i1=<optimized out>) at util/coroutine_ucontext.c:79
#8 0x00007f6b6e2b8cf0 in ?? () from /usr/lib64/libc.so.6
#9 0x00007f6a1bcfc780 in ?? ()
#10 0x0000000000000000 in ?? ()
I can also see that mirror_run is not done; the gdb output is as follows:
[cid:image001.png@01D3386F.DBC9FF10]
Source VM QEMU log:
[cid:image002.png@01D3386F.DBC9FF10]

