From: Emilio G. Cota
Subject: Re: [Qemu-devel] MTTCG status updates, benchmark results and KVM forum plans
Date: Mon, 15 Aug 2016 11:46:26 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, Aug 15, 2016 at 11:46:32 +0100, Alex Bennée wrote:
> As far as I'm aware the following work is still ongoing:
> 
> Emilio: cmpxchg atomics
> Alvise: LL/SC modelling

I've been tinkering with an experimental patch to do proper LL/SC. The idea
is to rely on hardware transactional memory so that stores don't have
to be tracked. The trickiest part is the fallback path, for which I'm
trying to (ab)use EXCP_ATOMIC to execute exclusively from the ldrex
all the way to the strex.
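
(For context: in linux-user, EXCP_ATOMIC unwinds to step_atomic(), which
before this patch runs exactly one instruction exclusively, roughly like
this; comments mine:

    static void step_atomic(CPUState *cpu)
    {
        start_exclusive();       /* park all other vCPU threads */

        /* Since we got here, we know that parallel_cpus must be true.  */
        parallel_cpus = false;   /* generate non-parallel code for the step */
        cpu_exec_step(cpu);      /* translate + execute a single insn */
        parallel_cpus = true;

        end_exclusive();         /* let the other vCPU threads resume */
    }

The patch below stretches that exclusive window to cover the whole
ldrex..strex sequence by looping until the strex helper flips
parallel_cpus back to true.)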

To test it, I'm running qht-bench (compiled natively on an aarch64
machine) under aarch64-linux-user. The host is an Intel Skylake, which
has no known TSX bugs.
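
For the record, the reproducer is along these lines (the qht-bench path
is from my setup, and -n is the thread count, if I remember the option
right):

    ./aarch64-linux-user/qemu-aarch64 ~/qht-bench -n 8               # parallel
    taskset -c 0 ./aarch64-linux-user/qemu-aarch64 ~/qht-bench -n 8  # pinned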

However, I'm running into issues that might not be related to the
patch itself.

- On the latest MTTCG+cmpxchg tree (45c11751ed7 a.k.a.
  bennee/mttcg/base-patches-v4-with-cmpxchg-atomics-v2), QEMU loops
  forever without making progress in the instruction stream, even
  with taskset -c 0.
- On the cmpxchg tree (rth's atomic-2 branch [1]), it works more
  reliably, although tb_lock is held around tb_find_fast so parallelism isn't
  very high. Still, it sometimes triggers the assert below.
  - Applying the "remove tb_lock around hot path" patch makes it
    easier to trigger this assert in cpu-exec.c:650 (approx.):
            /* Assert that the compiler does not smash local variables. */
            g_assert(cpu == current_cpu)
    I've also seen the assert immediately after that one trigger, as well
    as the rcu_read_unlock depth assert (see the sketch after this list
    for where these asserts live).
  The asserts are usually triggered when all threads exit (by returning
  NULL) at roughly the same time.
  However, they cannot be triggered with taskset -c 0, which makes me
  suspect that somehow start_exclusive isn't working as intended.
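
For context, those asserts sit right after the sigsetjmp return in
cpu_exec(); paraphrasing from memory, the shape is:

    /* prepare setjmp context for exception handling */
    if (sigsetjmp(cpu->jmp_env, 0) != 0) {
        /* Some compilers wrongly smash all local variables after
         * siglongjmp; with a well-behaved compiler these must hold: */
        g_assert(cpu == current_cpu);
        g_assert(cc == CPU_GET_CLASS(cpu));
        ...
    }

So cpu != current_cpu after the longjmp means either a genuinely
clobbered local or current_cpu changing under our feet, which again
points at the exclusive-work machinery rather than the compiler.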

Any tips would be appreciated! I'll reply with a patch that uses RTM;
the one below takes the fallback path all the way, and is the best way
to reproduce the above.
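
To give an idea of where this is going, the RTM variant of the helpers
would look roughly like this. This is only a sketch: _xbegin/_xend/_xtest
are the RTM intrinsics from <immintrin.h> (built with -mrtm), and
everything else mirrors the fallback helpers in the patch below:

    #include <immintrin.h>

    void HELPER(xbegin)(CPUARMState *env)
    {
        uintptr_t ra = GETPC();

        if (!parallel_cpus) {
            return; /* already running exclusively: no transaction needed */
        }
        if (_xbegin() != _XBEGIN_STARTED) {
            /* transaction aborted: redo ldrex..strex exclusively */
            cpu_loop_exit_atomic(ENV_GET_CPU(env), ra);
        }
    }

    uint64_t HELPER(x_ok)(void)
    {
        /* the store may proceed if we are inside a transaction, or if
         * we are executing exclusively on the fallback path */
        return _xtest() || !parallel_cpus;
    }

    void HELPER(xend)(void)
    {
        if (_xtest()) {
            _xend();              /* commit the transaction */
        } else {
            parallel_cpus = true; /* leave the exclusive fallback */
        }
    }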

Thanks,

                Emilio

[1] https://github.com/rth7680/qemu/commits/atomic-2

From ed6af6eb364e5a36e81d7cc8143c0e9783c50587 Mon Sep 17 00:00:00 2001
From: "Emilio G. Cota" <address@hidden>
Date: Mon, 15 Aug 2016 00:27:42 +0200
Subject: [PATCH] aarch64: use TSX for ldrex/strex (fallback path only)

Signed-off-by: Emilio G. Cota <address@hidden>
---
 linux-user/main.c          |  5 +++--
 target-arm/helper-a64.c    | 23 +++++++++++++++++++++++
 target-arm/helper-a64.h    |  4 ++++
 target-arm/translate-a64.c | 15 +++++++++------
 4 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 9880505..6922faa 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -192,8 +192,9 @@ static void step_atomic(CPUState *cpu)
 
     /* Since we got here, we know that parallel_cpus must be true.  */
     parallel_cpus = false;
-    cpu_exec_step(cpu);
-    parallel_cpus = true;
+    while (!parallel_cpus) {
+        cpu_exec_step(cpu);
+    }
 
     end_exclusive();
 }
diff --git a/target-arm/helper-a64.c b/target-arm/helper-a64.c
index 8ce518b..a97b631 100644
--- a/target-arm/helper-a64.c
+++ b/target-arm/helper-a64.c
@@ -579,3 +579,26 @@ uint64_t HELPER(paired_cmpxchg64_be)(CPUARMState *env, uint64_t addr,
 
     return !success;
 }
+
+void HELPER(xbegin)(CPUARMState *env)
+{
+    uintptr_t ra = GETPC();
+
+    if (parallel_cpus) {
+        cpu_loop_exit_atomic(ENV_GET_CPU(env), ra);
+    }
+}
+
+void HELPER(xend)(void)
+{
+    assert(!parallel_cpus);
+    parallel_cpus = true;
+}
+
+uint64_t HELPER(x_ok)(void)
+{
+    if (!parallel_cpus) {
+        return 1;
+    }
+    return 0;
+}
diff --git a/target-arm/helper-a64.h b/target-arm/helper-a64.h
index dd32000..e7ede43 100644
--- a/target-arm/helper-a64.h
+++ b/target-arm/helper-a64.h
@@ -48,3 +48,7 @@ DEF_HELPER_FLAGS_3(crc32_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
 DEF_HELPER_FLAGS_3(crc32c_64, TCG_CALL_NO_RWG_SE, i64, i64, i64, i32)
 DEF_HELPER_FLAGS_4(paired_cmpxchg64_le, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
 DEF_HELPER_FLAGS_4(paired_cmpxchg64_be, TCG_CALL_NO_WG, i64, env, i64, i64, i64)
+
+DEF_HELPER_1(xbegin, void, env)
+DEF_HELPER_0(x_ok, i64)
+DEF_HELPER_0(xend, void)
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index 450c359..cfcf440 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -1760,6 +1760,8 @@ static void gen_load_exclusive(DisasContext *s, int rt, int rt2,
     TCGv_i64 tmp = tcg_temp_new_i64();
     TCGMemOp be = s->be_data;
 
+    gen_helper_xbegin(cpu_env);
+
     g_assert(size <= 3);
     if (is_pair) {
         TCGv_i64 hitmp = tcg_temp_new_i64();
@@ -1825,6 +1827,9 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2,
     tcg_gen_brcond_i64(TCG_COND_NE, addr, cpu_exclusive_addr, fail_label);
 
     tmp = tcg_temp_new_i64();
+    gen_helper_x_ok(tmp);
+    tcg_gen_brcondi_i64(TCG_COND_EQ, tmp, 0, fail_label);
+
     if (is_pair) {
         if (size == 2) {
             TCGv_i64 val = tcg_temp_new_i64();
@@ -1844,16 +1849,14 @@ static void gen_store_exclusive(DisasContext *s, int rd, int rt, int rt2,
         }
     } else {
         TCGv_i64 val = cpu_reg(s, rt);
-        tcg_gen_atomic_cmpxchg_i64(tmp, addr, cpu_exclusive_val, val,
-                                   get_mem_index(s),
-                                   size | MO_ALIGN | s->be_data);
-        tcg_gen_setcond_i64(TCG_COND_NE, tmp, tmp, cpu_exclusive_val);
+        tcg_gen_qemu_st_i64(val, addr, get_mem_index(s), s->be_data + size);
     }
 
     tcg_temp_free_i64(addr);
-
-    tcg_gen_mov_i64(cpu_reg(s, rd), tmp);
     tcg_temp_free_i64(tmp);
+
+    tcg_gen_movi_i64(cpu_reg(s, rd), 0);
+    gen_helper_xend();
     tcg_gen_br(done_label);
 
     gen_set_label(fail_label);
-- 
2.7.4
