Re: [Qemu-devel] [PATCH] queue: fix QSLIST_INSERT_HEAD_ATOMIC race
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [PATCH] queue: fix QSLIST_INSERT_HEAD_ATOMIC race
Date: Thu, 12 Mar 2015 13:34:12 +0000
User-agent: Mutt/1.5.23 (2014-03-12)
On Tue, Mar 10, 2015 at 04:45:57PM +0100, Paolo Bonzini wrote:
> There is a not-so-subtle race in QSLIST_INSERT_HEAD_ATOMIC.
>
> Because atomic_cmpxchg returns the old value instead of a success flag,
> QSLIST_INSERT_HEAD_ATOMIC was checking for success by comparing against
> the second argument to atomic_cmpxchg. Unfortunately, this only works
> if the second argument is a local or thread-local variable.
>
> If it is in memory, it can be subject to common subexpression elimination
> (and then everything's fine) or reloaded after the atomic_cmpxchg,
> depending on the compiler's whims. If the latter happens, the race can
> happen. A thread can sneak in, doing something on elm->field.sle_next
> after the atomic_cmpxchg and before the comparison. This causes a wrong
> failure, and then two threads are using "elm" at the same time. In the
> case discovered by Christian, the sequence was likely something like this:
>
> thread 1 | thread 2
> QSLIST_INSERT_HEAD_ATOMIC |
> atomic_cmpxchg succeeds |
> elm added to list |
> | steal release_pool
> | QSLIST_REMOVE_HEAD
> | elm removed from list
> | ...
> | QSLIST_INSERT_HEAD_ATOMIC
> | (overwrites sle_next)
> spurious failure |
> atomic_cmpxchg succeeds |
> elm added to list again |
> |
> steal release_pool |
> QSLIST_REMOVE_HEAD |
> elm removed again |
>
> The last three steps could be done by a third thread as well.
> A reproducer that failed in a matter of seconds is as follows:
>
> - the guest has 32 VCPUs on a 28 core host (hyperthreading was enabled),
> memory was 16G just to err on the safe side (the host has 64G, but hey
> at least you need no s390)
>
> - the guest has 24 null-aio virtio-blk devices using dataplane
> (-object iothread,id=ioN -drive if=none,id=blkN,driver=null-aio,size=500G
> -device virtio-blk-pci,iothread=ioN,drive=blkN)
>
> - the guest also has a single network interface. It's only doing loopback
> tests so slirp vs. tap and the model doesn't matter.
>
> - the guest is running fio with the following script:
>
> [global]
> rw=randread
> blocksize=16k
> ioengine=libaio
> runtime=10m
> buffered=0
> fallocate=none
> time_based
> iodepth=32
>
> [virtio1a]
> filename=/dev/block/252\:16
>
> [virtio1b]
> filename=/dev/block/252\:16
>
> ...
>
> [virtio24a]
> filename=/dev/block/252\:384
>
> [virtio24b]
> filename=/dev/block/252\:384
>
> [listen1]
> protocol=tcp
> ioengine=net
> port=12345
> listen
> rw=read
> bs=4k
> size=1000g
>
> [connect1]
> protocol=tcp
> hostname=localhost
> ioengine=net
> port=12345
> rw=write
> startdelay=1
> size=1000g
>
> ...
>
> [listen8]
> protocol=tcp
> ioengine=net
> port=12352
> listen
> rw=read
> bs=4k
> size=1000g
>
> [connect8]
> protocol=tcp
> hostname=localhost
> ioengine=net
> port=12352
> rw=write
> startdelay=1
> size=1000g
>
> Moral of the story: I should refrain from writing more clever stuff.
> At least it looks like it is not too clever to be undebuggable.
>
> Reported-by: Christian Borntraeger <address@hidden>
> Fixes: c740ad92d0d958fa785e5d7aa1b67ecaf30a6a54
> Signed-off-by: Paolo Bonzini <address@hidden>
> ---
> include/qemu/queue.h | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
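For readers following along, the faulty pattern is easy to reproduce outside QEMU. The sketch below (hypothetical code, not QEMU's actual macro; it uses the GCC `__sync_val_compare_and_swap` builtin in place of atomic_cmpxchg, which likewise returns the old value) shows the correct idiom: snapshot the expected head into a local variable and compare the CAS result against that local. The buggy macro instead compared against elm->field.sle_next, a memory location the compiler is free to reload after the CAS, which is exactly what opens the race described above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal singly linked list, loosely modeled on QSLIST (a sketch,
 * not QEMU's code). */
struct node {
    struct node *next;
};

struct head {
    struct node *first;
};

static void insert_head_atomic(struct head *h, struct node *elm)
{
    struct node *old, *expected;

    old = __atomic_load_n(&h->first, __ATOMIC_RELAXED);
    do {
        expected = old;        /* local snapshot: safe to compare against */
        elm->next = expected;
        /* Returns the value of h->first before the operation. */
        old = __sync_val_compare_and_swap(&h->first, expected, elm);
    } while (old != expected); /* compare against the LOCAL, never the
                                * list node's next field, which another
                                * thread may rewrite at any moment */
}
```

The key point is that `expected` lives in a register or on the stack, so the success check cannot be broken by another thread touching the node, whereas comparing against elm->next reads shared memory twice and lets a spurious failure (or spurious success) slip in between.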
Reviewed-by: Stefan Hajnoczi <address@hidden>