qemu-devel

Re: [Qemu-devel] New iotest repros failures on virtio external snapshot


From: Laszlo Ersek
Subject: Re: [Qemu-devel] New iotest repros failures on virtio external snapshot with iothread
Date: Fri, 31 Mar 2017 01:43:53 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 03/30/17 04:16, Eric Blake wrote:
> On 03/29/2017 09:01 PM, Ed Swierk via Qemu-devel wrote:
>> Parts of qemu's block code have changed a lot in recent months but are
>> not well exercised by current tests.
>>
>> Subtle bugs have crept in causing assertion failures, hangs and other
>> crashes in a variety of situations: immediately on start, on first
>> guest activity, on external snapshot create or commit, on qmp quit
>> command.
>>
>> Reproducing these bugs has proved tricky, as each may occur only with
>> a specific combination of qemu version, block device type (virtio-blk
>> or virtio-scsi) and iothread enabled or not. In some cases the bug
>> occurs only after several external snapshot operations. And in some
>> cases the bug only manifests when a guest is accessing the block
>> device simultaneously.
>>
>> I've written an iotest (number 176, for now) that attempts to cover
> 
> At least one other thread has already proposed a test 176.  It's
> somewhat straightforward to renumber things, but I'm wondering if there
> is some even-more-efficient way of reserving test numbers, perhaps
> through the wiki, since we are finding that test numbers get reserved
> several weeks before actually getting merged into the tree.

UEFI / edk2 solves this problem elegantly by naming everything with
globally unique identifiers, so if you need a new thing, just run
"uuidgen". No coordination required.

In practice it would result in subjects like

[Qemu-devel] [PATCH for-2.9] iotests: Fix test
3dec30b6-f69b-4eb0-8f89-87063433c830

I shall now retreat to my cave.

Laszlo
;)

> 
>> many of these configurations. Currently it only exercises the external
>> snapshot create and commit lifted from iotest 118. The new iotest does
>> this repeatedly in each of 16 combinations:
>> - no guest / guest
>> - virtio-blk / virtio-scsi
>> - no iothread / iothread
>> - single / repeated external snapshot create+commit
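The four two-valued axes above multiply out to 2 x 2 x 2 x 2 = 16 cases. A minimal sketch of enumerating them follows; the `AXES` dictionary and the `all_configurations` helper are hypothetical names chosen for illustration and are not taken from the actual iotest.

```python
from itertools import product

# Axis values mirror the four bullets in the quoted message. The dictionary
# layout and key names are assumptions made for this sketch only.
AXES = {
    "guest": ("no guest", "guest"),
    "device": ("virtio-blk", "virtio-scsi"),
    "iothread": ("no iothread", "iothread"),
    "snapshot": ("single", "repeated"),
}


def all_configurations():
    """Return every combination of the four axes as a list of dicts."""
    names = list(AXES)
    return [dict(zip(names, values)) for values in product(*AXES.values())]


if __name__ == "__main__":
    # Print one line per test case, e.g. "guest, virtio-blk, iothread, single".
    for config in all_configurations():
        print(", ".join(config.values()))
```

Generating the cases from one table, rather than writing them out by hand, makes it harder to miss a combination when another axis is added.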
>>
>> I made some minor changes to the test infrastructure so the new iotest
>> can deal gracefully with qemu hanging--the test script itself
>> shouldn't hang. And in all failure modes the test needs to expose
>> enough console output and other information to diagnose the problem.
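One way to keep a test script from hanging along with the process it launches is to bound the wait and still collect the output. The sketch below uses only the Python standard library; `run_with_timeout` is a hypothetical helper and is not the change made to the iotests infrastructure, which is not shown in this message.

```python
import subprocess


def run_with_timeout(argv, timeout_s):
    """Run argv and return (hung, output).

    If the child does not exit within timeout_s seconds it is killed,
    and whatever it wrote to stdout/stderr so far is still returned so
    the failure can be diagnosed.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured output
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=timeout_s)
        return False, output
    except subprocess.TimeoutExpired:
        proc.kill()
        # A second communicate() call reaps the child and drains the pipe.
        output, _ = proc.communicate()
        return True, output
```

A caller would treat `hung == True` as a test failure and print `output` in the failure report, so the script ends with a diagnosis instead of blocking forever.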
> 
> Some of those changes sound like they are worth posting to the list
> as-is, separate from the actual new test.
> 
>>
>> The main departure from existing iotests is running a real guest. I
>> used buildroot to generate a small (~4 MB) Linux kernel with built-in
>> initrd containing a busybox-based userland. After the iotest launches
>> qemu, the guest loops writing to the block device, while the test
>> performs snapshot operations.
>>
>> I ran the new iotest on 3 qemu versions: 2.7.1, stable-2.8-staging and
>> 2.9.0-rc2. The latter two fail several test cases, all
>> iothread-enabled. Only 2.7.1 passes all the cases.
>>
>> Here is the code for the new iotest (I didn't dare email patches with
>> a 4 MB blob):
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.7
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.8
>> https://github.com/skyportsystems/qemu-1/commits/eswierk-iotests-2.9
>>
>> And here is the buildroot I used to generate the guest Linux kernel+initrd:
>> https://github.com/skyportsystems/buildroot-1/commits/qemu-iotests
>>
>> Please check out the code and try the new test--particularly anyone
>> who can also help figure out these failures. (Note that since half the
>> test cases use an iothread, /dev/kvm must be readable and writable.)
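The /dev/kvm requirement can be checked before starting a run. This is a minimal sketch using `os.access`; the helper name `kvm_usable` is hypothetical, and the check reports permissions for the current user only.

```python
import os


def kvm_usable(path="/dev/kvm"):
    """Return True if path exists and is readable and writable by this user."""
    return os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if kvm_usable():
        print("/dev/kvm is readable and writable")
    else:
        print("/dev/kvm is missing or not accessible; check permissions")
```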
>>
>> * stable-2.8-staging
>> - guest, virtio-blk, iothread, single snapshot create+commit: hang on
>> quit (intermittent)
>> - guest, virtio-blk, iothread, repeated snapshot create+commit: hang
>> after 1 iteration
>> - guest, virtio-scsi, iothread, single snapshot create+commit: hang on
>> quit (intermittent)
>> - guest, virtio-scsi, iothread, repeated snapshot create+commit: hang
>> after 1 iteration
>>
>> * 2.9.0-rc2
>> - guest, virtio-blk, iothread, single snapshot create+commit:
>> "include/block/aio.h:457: aio_enable_external: Assertion
>> `ctx->external_disable_cnt > 0' failed." after snapshot create
> 
> It would be nice if we could get to the root cause and squash that one
> before 2.9.
> 
>> - guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-blk, iothread, repeated snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, single snapshot create+commit: same as above
>> - no guest, virtio-scsi, iothread, repeated snapshot create+commit: same as above
>>
>> --Ed
>>
>>
> 



