From: John Snow
Subject: Re: [Qemu-block] [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
Date: Thu, 30 Mar 2017 19:06:08 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 03/29/2017 10:01 PM, Ed Swierk via Qemu-devel wrote:
> Parts of qemu's block code have changed a lot in recent months but are
> not well exercised by current tests.
>
> Subtle bugs have crept in causing assertion failures, hangs and other
> crashes in a variety of situations: immediately on start, on first
> guest activity, on external snapshot create or commit, on qmp quit
> command.
>
> Reproducing these bugs has proved tricky, as each may occur only with
> a specific combination of qemu version, block device type (virtio-blk
> or virtio-scsi) and iothread enabled or not. In some cases the bug
> occurs only after several external snapshot operations. And in some
> cases the bug only manifests when a guest is accessing the block
> device simultaneously.
>
> I've written an iotest (number 176, for now) that attempts to cover
> many of these configurations. Currently it only exercises the external
> snapshot create and commit lifted from iotest 118. The new iotest does
> this repeatedly in each of 16 combinations:
> - no guest / guest
> - virtio-blk / virtio-scsi
> - no iothread / iothread
> - single / repeated external snapshot create+commit
>
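For reference, the four two-way choices listed above multiply out to 2 x 2 x 2 x 2 = 16 cases. A minimal sketch of enumerating that matrix follows; the AXES dictionary, its key names and test_matrix() are hypothetical illustrations and are not taken from iotest 176 itself.

```python
from itertools import product

# Hypothetical labels for the four two-way axes named in the message.
AXES = {
    "guest": ("no guest", "guest"),
    "device": ("virtio-blk", "virtio-scsi"),
    "iothread": ("no iothread", "iothread"),
    "snapshot": ("single", "repeated"),
}


def test_matrix():
    """Return every combination of the four axes as a list of dicts."""
    names = list(AXES)
    return [dict(zip(names, values)) for values in product(*AXES.values())]


cases = test_matrix()
for case in cases:
    # One line per test case, e.g. "guest, virtio-blk, iothread, single"
    print(", ".join(case.values()))
```

Keeping the axes in one table means a new dimension (say, another device type) only needs one extra entry, and the case count follows automatically.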
> I made some minor changes to the test infrastructure so the new iotest
> can deal gracefully with qemu hanging--the test script itself
> shouldn't hang. And in all failure modes the test needs to expose
> enough console output and other information to diagnose the problem.
>
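One generic way to keep a test script from hanging along with the process it drives is to run that process under a timeout and keep whatever output it produced. The sketch below uses only the Python standard library; run_with_timeout() and its status strings are hypothetical and do not reflect the actual changes made to the iotests infrastructure.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv and return (status, output).

    status is "ok", "failed" or "hung"; output is the combined
    stdout/stderr text captured before exit or before the timeout.
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout_s,
            text=True,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run() kills the child before re-raising, so the
        # caller is never left waiting on a hung process.
        out = exc.output or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return "hung", out
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout


# Stand-in commands (a Python one-liner instead of a real qemu binary).
quick = run_with_timeout([sys.executable, "-c", "print('done')"], 30)
stuck = run_with_timeout([sys.executable, "-c", "import time; time.sleep(60)"], 1)
```

Returning the captured output in the "hung" case is what lets the caller report console text for diagnosis instead of only a timeout.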
> The main departure from existing iotests is running a real guest. I
> used buildroot to generate a small (~4 MB) Linux kernel with built-in
> initrd containing a busybox-based userland. After the iotest launches
> qemu, the guest loops writing to the block device, while the test
> performs snapshot operations.
>
> I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
> 2.9.0-rc2. The latter two fail several test cases, all
> iothread-enabled. Only 2.7.1 passes all the cases.
>
> Here is the code for the new iotest (I didn't dare email patches with
> a 4 MB blob):
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
>
> And here is the buildroot I used to generate the guest Linux kernel+initrd:
> https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests
>
> Please check out the code and try the new test--particularly anyone
> who can also help figure out these failures. (Note that since half the
> test cases use an iothread, /dev/kvm must be readable and writable.)
>
> * stable-2.8-staging
> - guest, virtio-blk, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-blk, iothread, repeated snapshot create+commit: hang
> after 1 iteration
> - guest, virtio-scsi, iothread, single snapshot create+commit: hang on
> quit (intermittent)
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
> after 1 iteration
>
> * 2.9.0-rc2
> - guest, virtio-blk, iothread, single snapshot create+commit:
> "include/block/aio.h:457: aio_enable_external: Assertion
> `ctx->external_disable_cnt > 0' failed." after snapshot create
> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
> - no guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
>
Do you mean to say that all of these 2.9.0-rc2 cases produce the same
aio.h assertion?
> --Ed
>