[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Qemu-devel] [PATCH] virtio: add missing mb() on notification
From: |
Michael S. Tsirkin |
Subject: |
[Qemu-devel] [PATCH] virtio: add missing mb() on notification |
Date: |
Sun, 22 Apr 2012 17:54:47 +0300 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
During normal operation, virtio host first writes a used index
and then checks whether it should interrupt the guest
by reading guest avail flag/used event index values.
Guest does the reverse: writes the index/flag,
then checks the used ring.
The ordering is important: if host avail flag read bypasses the used
index write, we could in effect get this timing:
host avail flag/used event index read
guest enable interrupts: this performs
avail flag/used event index write
guest check used ring: ring is empty
host used index write
This timing results in a lost interrupt: guest will never be
notified about the used ring update.
This has actually been observed in the field, when using qemu-kvm such
that the guest vcpu and qemu io run on different host cpus,
but only seems to trigger on some specific hardware,
and only with userspace virtio: vhost has the necessary smp_mb() in
place to prevent the reordering, so the same workload stalls forever
waiting for an interrupt with vhost=off but works fine with vhost=on.
Insert an smp_mb() barrier operation in userspace virtio to
ensure the correct ordering.
Applying this patch fixed the race condition we have observed.
Tested on x86_64. I checked the code generated by the new macro
for i386 and ppc but didn't run virtio.
Signed-off-by: Michael S. Tsirkin <address@hidden>
---
hw/virtio.c | 2 ++
qemu-barrier.h | 23 ++++++++++++++++++++---
2 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/hw/virtio.c b/hw/virtio.c
index f805790..6449746 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -693,6 +693,8 @@ static bool vring_notify(VirtIODevice *vdev, VirtQueue *vq)
{
uint16_t old, new;
bool v;
+ /* We need to expose used array entries before checking used event. */
+ smp_mb();
/* Always notify when queue is empty (when feature acknowledge) */
if (((vdev->guest_features & (1 << VIRTIO_F_NOTIFY_ON_EMPTY)) &&
!vq->inuse && vring_avail_idx(vq) == vq->last_avail_idx)) {
diff --git a/qemu-barrier.h b/qemu-barrier.h
index c11bb2b..f6722a8 100644
--- a/qemu-barrier.h
+++ b/qemu-barrier.h
@@ -4,7 +4,7 @@
/* Compiler barrier */
#define barrier() asm volatile("" ::: "memory")
-#if defined(__i386__) || defined(__x86_64__)
+#if defined(__i386__)
/*
* Because of the strongly ordered x86 storage model, wmb() is a nop
@@ -13,15 +13,31 @@
* load/stores from C code.
*/
#define smp_wmb() barrier()
+/*
+ * We use GCC builtin if it's available, as that can use
+ * mfence on 32 bit as well, e.g. if built with -march=pentium-m.
+ * However, on i386, there seem to be known bugs as recently as 4.3.
+ * */
+#if defined(_GNUC__) && __GNUC__ >= 4 && __GNUC_MINOR__ >= 4
+#define smp_mb() __sync_synchronize()
+#else
+#define smp_mb() asm volatile("lock; addl $0,0(%%esp) " ::: "memory")
+#endif
+
+#elif defined(__x86_64__)
+
+#define smp_wmb() barrier()
+#define smp_mb() asm volatile("mfence" ::: "memory")
#elif defined(_ARCH_PPC)
/*
- * We use an eieio() for a wmb() on powerpc. This assumes we don't
+ * We use an eieio() for wmb() and mb() on powerpc. This assumes we don't
* need to order cacheable and non-cacheable stores with respect to
* each other
*/
#define smp_wmb() asm volatile("eieio" ::: "memory")
+#define smp_mb() asm volatile("eieio" ::: "memory")
#else
@@ -29,9 +45,10 @@
* For (host) platforms we don't have explicit barrier definitions
* for, we use the gcc __sync_synchronize() primitive to generate a
* full barrier. This should be safe on all platforms, though it may
- * be overkill.
+ * be overkill for wmb().
*/
#define smp_wmb() __sync_synchronize()
+#define smp_mb() __sync_synchronize()
#endif
--
1.7.9.111.gf3fb0
- [Qemu-devel] [PATCH] virtio: add missing mb() on notification,
Michael S. Tsirkin <=