This is somewhat orthogonal to the issue being addressed here, but:
While reading the man page to make sense of this patch, I noticed the following:
> If the futex value does not match val, then the call fails
> immediately with the error EAGAIN.
And qemu_futex_wait does not seem to handle that case. In fact it seems like it would take the default: abort(); code path?
If I've got this right, I'm surprised there aren't spurious abort()s happening, but I suppose QemuEvent and qemu_futex_* are used fairly sparingly and in low-contention areas.
futex(2) - Linux manual page
https://man7.org/linux/man-pages/man2/futex.2.html
> Note that a wake-up can also be caused by common futex usage patterns
> in unrelated code that happened to have previously used the futex
> word's memory location (e.g., typical futex-based implementations of
> Pthreads mutexes can cause this under some conditions). Therefore,
> callers should always conservatively assume that a return value of 0
> can mean a spurious wake-up, and use the futex word's value (i.e.,
> the user-space synchronization scheme) to decide whether to continue
> to block or not.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
In any case, for the specific issue addressed here:
---
include/qemu/futex.h | 9 +++++++++
tests/unit/test-aio-multithread.c | 4 +++-
util/qemu-thread-posix.c | 28 ++++++++++++++++------------
3 files changed, 28 insertions(+), 13 deletions(-)
diff --git a/include/qemu/futex.h b/include/qemu/futex.h
index 91ae88966e12..f57774005330 100644
--- a/include/qemu/futex.h
+++ b/include/qemu/futex.h
@@ -24,6 +24,15 @@ static inline void qemu_futex_wake(void *f, int n)
qemu_futex(f, FUTEX_WAKE, n, NULL, NULL, 0);
}
+/*
+ * Note that a wake-up can also be caused by common futex usage patterns in
+ * unrelated code that happened to have previously used the futex word's
+ * memory location (e.g., typical futex-based implementations of Pthreads
+ * mutexes can cause this under some conditions). Therefore, callers should
+ * always conservatively assume that it is a spurious wake-up, and use the futex
+ * word's value (i.e., the user-space synchronization scheme) to decide whether
+ * to continue to block or not.
+ */
static inline void qemu_futex_wait(void *f, unsigned val)
{
while (qemu_futex(f, FUTEX_WAIT, (int) val, NULL, NULL, 0)) {
diff --git a/tests/unit/test-aio-multithread.c b/tests/unit/test-aio-multithread.c
index 08d4570ccb14..8c2e41545a29 100644
--- a/tests/unit/test-aio-multithread.c
+++ b/tests/unit/test-aio-multithread.c
@@ -305,7 +305,9 @@ static void mcs_mutex_lock(void)
prev = qatomic_xchg(&mutex_head, id);
if (prev != -1) {
qatomic_set(&nodes[prev].next, id);
- qemu_futex_wait(&nodes[id].locked, 1);
+ while (qatomic_read(&nodes[id].locked) == 1) {
+ qemu_futex_wait(&nodes[id].locked, 1);
+ }
}
}
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index b2e26e21205b..eade5311d175 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -428,17 +428,21 @@ void qemu_event_wait(QemuEvent *ev)
assert(ev->initialized);
- /*
- * qemu_event_wait must synchronize with qemu_event_set even if it does
- * not go down the slow path, so this load-acquire is needed that
- * synchronizes with the first memory barrier in qemu_event_set().
- *
- * If we do go down the slow path, there is no requirement at all: we
- * might miss a qemu_event_set() here but ultimately the memory barrier in
- * qemu_futex_wait() will ensure the check is done correctly.
- */
- value = qatomic_load_acquire(&ev->value);
- if (value != EV_SET) {
+ while (true) {
+ /*
+ * qemu_event_wait must synchronize with qemu_event_set even if it does
+ * not go down the slow path, so this load-acquire is needed that
+ * synchronizes with the first memory barrier in qemu_event_set().
+ *
+ * If we do go down the slow path, there is no requirement at all: we
+ * might miss a qemu_event_set() here but ultimately the memory barrier
+ * in qemu_futex_wait() will ensure the check is done correctly.
+ */
+ value = qatomic_load_acquire(&ev->value);
+ if (value == EV_SET) {
+ break;
+ }
+
if (value == EV_FREE) {
/*
* Leave the event reset and tell qemu_event_set that there are
@@ -452,7 +456,7 @@ void qemu_event_wait(QemuEvent *ev)
* like the load above.
*/
if (qatomic_cmpxchg(&ev->value, EV_FREE, EV_BUSY) == EV_SET) {
- return;
+ break;
}
}
--
2.47.1