Re: [PATCH v4 5/5] target/riscv: add vector amo operations


From: LIU Zhiwei
Subject: Re: [PATCH v4 5/5] target/riscv: add vector amo operations
Date: Fri, 28 Feb 2020 17:19:39 +0800
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.4.2



On 2020/2/28 13:38, Richard Henderson wrote:
On 2/25/20 2:35 AM, LIU Zhiwei wrote:
+    if (s->sew < 2) {
+        return false;
+    }
This could just as easily be in amo_check?
Yes, it can be done in amo_check.
+
+    if (tb_cflags(s->base.tb) & CF_PARALLEL) {
+#ifdef CONFIG_ATOMIC64
+        fn = fns[0][seq][s->sew - 2];
+#else
+        gen_helper_exit_atomic(cpu_env);
+        s->base.is_jmp = DISAS_NORETURN;
+        return true;
+#endif
Why are you raising exit_atomic without first checking that s->sew == 3?
Yes, it should be

    if (s->sew == 3) {
#ifdef CONFIG_ATOMIC64
        fn = fns[0][seq][1];
#else
        gen_helper_exit_atomic(cpu_env);
        s->base.is_jmp = DISAS_NORETURN;
        return true;
#endif
    } else {
#ifdef TARGET_RISCV64
        fn = fns[0][seq][0];
#else
        fn = fns[0][seq];
#endif
    }

Is it OK?
We can do 32-bit atomic operations always.
Good.

+    } else {
+        fn = fns[1][seq][s->sew - 2];
+    }
+    if (fn == NULL) {
+        return false;
+    }
+
+    return amo_trans(a->rd, a->rs1, a->rs2, data, fn, s);
+}
+
+static bool amo_check(DisasContext *s, arg_rwdvm* a)
+{
+    return (vext_check_isa_ill(s, RVV | RVA) &&
+            (a->wd ? vext_check_overlap_mask(s, a->rd, a->vm) : 1) &&
+            vext_check_reg(s, a->rd, false) &&
+            vext_check_reg(s, a->rs2, false));
+}
I guess the "If SEW is greater than XLEN, an illegal instruction exception is
raised" requirement is currently in the column of NULLs in the !CONFIG_RISCV64
block.  But it might be better to have it explicit and save the column of NULLs.
Maybe add

    (1 << s->sew) <= sizeof(target_ulong) &&

in amo_check?
It makes sense to me to do both sew checks together, whether in amo_check or in
amo_op.
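For illustration, a rough sketch with both checks folded into amo_check
(purely illustrative; whether this lives in amo_check or amo_op is your call):

    static bool amo_check(DisasContext *s, arg_rwdvm* a)
    {
        return (vext_check_isa_ill(s, RVV | RVA) &&
                (a->wd ? vext_check_overlap_mask(s, a->rd, a->vm) : 1) &&
                vext_check_reg(s, a->rd, false) &&
                vext_check_reg(s, a->rs2, false) &&
                (s->sew >= 2) &&                          /* AMO requires SEW >= 32 */
                ((1 << s->sew) <= sizeof(target_ulong))); /* SEW <= XLEN */
    }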

+#define GEN_VEXT_AMO_NOATOMIC_OP(NAME, ETYPE, MTYPE, H, DO_OP, SUF)      \
+static void vext_##NAME##_noatomic_op(void *vs3, target_ulong addr,      \
+        uint32_t wd, uint32_t idx, CPURISCVState *env, uintptr_t retaddr)\
+{                                                                        \
+    ETYPE ret;                                                           \
+    target_ulong tmp;                                                    \
+    int mmu_idx = cpu_mmu_index(env, false);                             \
+    tmp = cpu_ld##SUF##_mmuidx_ra(env, addr, mmu_idx, retaddr);          \
+    ret = DO_OP((ETYPE)(MTYPE)tmp, *((ETYPE *)vs3 + H(idx)));            \
+    cpu_st##SUF##_mmuidx_ra(env, addr, ret, mmu_idx, retaddr);           \
+    if (wd) {                                                            \
+        *((ETYPE *)vs3 + H(idx)) = (target_long)(MTYPE)tmp;              \
The target_long cast is wrong; should be ETYPE.
"If the AMO memory element width is less than SEW, the value returned from memory
 is sign-extended to fill SEW"

So I just use (target_long) to sign-extend. As you see, instructions like

    vamominud

have uint64_t as ETYPE, and (ETYPE)(MTYPE)tmp can't sign-extend the value from memory.
You can use cpu_ldX/stX_data (no mmu_idx or retaddr argument).  There should be
no faults, since you've already checked for read+write.
Good idea.
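Something like this, then (just a sketch with the mmu_idx/retaddr arguments
dropped; the write-back cast is kept as posted since the sign-extension
question above is still open):

    #define GEN_VEXT_AMO_NOATOMIC_OP(NAME, ETYPE, MTYPE, H, DO_OP, SUF)      \
    static void vext_##NAME##_noatomic_op(void *vs3, target_ulong addr,      \
            uint32_t wd, uint32_t idx, CPURISCVState *env)                   \
    {                                                                        \
        ETYPE ret;                                                           \
        target_ulong tmp;                                                    \
        tmp = cpu_ld##SUF##_data(env, addr);                                 \
        ret = DO_OP((ETYPE)(MTYPE)tmp, *((ETYPE *)vs3 + H(idx)));            \
        cpu_st##SUF##_data(env, addr, ret);                                  \
        if (wd) {                                                            \
            *((ETYPE *)vs3 + H(idx)) = (target_long)(MTYPE)tmp;              \
        }                                                                    \
    }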
+/* atomic operation for vector atomic instructions */
+#ifndef CONFIG_USER_ONLY
+#define GEN_VEXT_ATOMIC_OP(NAME, ETYPE, MTYPE, MOFLAG, H, AMO)           \
+static void vext_##NAME##_atomic_op(void *vs3, target_ulong addr,        \
+        uint32_t wd, uint32_t idx, CPURISCVState *env)                   \
+{                                                                        \
+    target_ulong tmp;                                                    \
+    int mem_idx = cpu_mmu_index(env, false);                             \
+    tmp = helper_atomic_##AMO##_le(env, addr, *((ETYPE *)vs3 + H(idx)),  \
+            make_memop_idx(MO_ALIGN | MOFLAG, mem_idx));                 \
+    if (wd) {                                                            \
+        *((ETYPE *)vs3 + H(idx)) = (target_long)(MTYPE)tmp;              \
+    }                                                                    \
+}
+#else
+#define GEN_VEXT_ATOMIC_OP(NAME, ETYPE, MTYPE, MOFLAG, H, AMO)           \
+static void vext_##NAME##_atomic_op(void *vs3, target_ulong addr,        \
+        uint32_t wd, uint32_t idx, CPURISCVState *env)                   \
+{                                                                        \
+    target_ulong tmp;                                                    \
+    tmp = helper_atomic_##AMO##_le(env, addr, *((ETYPE *)vs3 + H(idx))); \
+    if (wd) {                                                            \
+        *((ETYPE *)vs3 + H(idx)) = (target_long)(MTYPE)tmp;              \
+    }                                                                    \
+}
+#endif
This is not right.  It is not legal to call these helpers from another helper
-- they will use the wrong GETPC() and will not unwind properly.
Oh. I didn't notice it.
+static inline void vext_amo_atomic(void *vs3, void *v0, target_ulong base,
+        void *vs2, CPURISCVState *env, uint32_t desc,
+        vext_get_index_addr get_index_addr,
+        vext_amo_atomic_fn atomic_op,
+        vext_ld_clear_elem clear_elem,
+        uint32_t esz, uint32_t msz, uintptr_t ra)
+{
+    uint32_t i;
+    target_long addr;
+    uint32_t wd = vext_wd(desc);
+    uint32_t vm = vext_vm(desc);
+    uint32_t mlen = vext_mlen(desc);
+    uint32_t vlmax = vext_maxsz(desc) / esz;
+
+    for (i = 0; i < env->vl; i++) {
+        if (!vm && !vext_elem_mask(v0, mlen, i)) {
+            continue;
+        }
+        probe_read_access(env, get_index_addr(base, i, vs2), msz, ra);
+        probe_write_access(env, get_index_addr(base, i, vs2), msz, ra);
+    }
You probably need to check for aligned address here too, probably first so an
unaligned fault has priority over an invalid page fault.
OK, I will use cpu_unaligned_access() to check for unaligned accesses.
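Maybe something like this in the probe loop (a rough sketch reusing the locals
of the posted function; I assume cpu_unaligned_access() can be called here
with the usual (cpu, addr, access_type, mmu_idx, retaddr) arguments):

    for (i = 0; i < env->vl; i++) {
        if (!vm && !vext_elem_mask(v0, mlen, i)) {
            continue;
        }
        addr = get_index_addr(base, i, vs2);
        if (addr & (msz - 1)) {
            /* misaligned fault takes priority over a page fault */
            cpu_unaligned_access(env_cpu(env), addr, MMU_DATA_STORE,
                                 cpu_mmu_index(env, false), ra);
        }
        probe_read_access(env, addr, msz, ra);
        probe_write_access(env, addr, msz, ra);
    }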
The missing aligned address check is the only remaining exception that the
helper_atomic_* functions would raise, since you have properly checked for
read+write.  So it might be possible to get away with using the helpers, but I
don't like it.
Do you mean I should write my own helpers to implement the atomic operations?

What do you mean by "but I don't like it"?
But I do think it would be better to write your own helpers for the atomic
paths.  They need not check quite so much, since we have already done the
validation above.  You pretty much only need to use tlb_vaddr_to_host.

If that gets too ugly, we can talk about rearranging
accel/tcg/atomic_template.h so that it could be reused.
Good idea.  Perhaps use tlb_vaddr_to_host instead of atomic_mmu_lookup
to define another macro like GEN_ATOMIC_HELPER?
Alternately, we could simply *always* use the non-atomic helpers, and raise
exit_atomic if PARALLEL.
Yes, that's the simplest way.
However, I'd prefer to try defining something like GEN_ATOMIC_HELPER in vector_helper.c.
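For example, a very rough sketch of one op that such a macro would expand to
(a hypothetical 32-bit add using tlb_vaddr_to_host; the extra retaddr
parameter and the cpu_loop_exit_atomic fallback for the NULL case are my
assumptions, not something from the posted patch):

    static void vext_vamoaddw_atomic_op(void *vs3, target_ulong addr,
            uint32_t wd, uint32_t idx, CPURISCVState *env, uintptr_t ra)
    {
        int mmu_idx = cpu_mmu_index(env, false);
        int32_t *host = tlb_vaddr_to_host(env, addr, MMU_DATA_STORE, mmu_idx);
        int32_t old;

        if (host == NULL) {
            /* MMIO or watchpoint page: bail out to the exclusive,
             * non-parallel path.  (A complete version must take care
             * that already-updated elements are not redone.) */
            cpu_loop_exit_atomic(env_cpu(env), ra);
        }
        old = atomic_fetch_add(host, *((int32_t *)vs3 + H4(idx)));
        if (wd) {
            *((int32_t *)vs3 + H4(idx)) = old;
        }
    }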
+static inline void vext_amo_noatomic(void *vs3, void *v0, target_ulong base,
+        void *vs2, CPURISCVState *env, uint32_t desc,
+        vext_get_index_addr get_index_addr,
+        vext_amo_noatomic_fn noatomic_op,
+        vext_ld_clear_elem clear_elem,
+        uint32_t esz, uint32_t msz, uintptr_t ra)
Without the retaddr argument to the noatomic_fn, as described above,
vext_amo_noatomic and vext_amo_atomic are identical.
Yes.

It's very kind of you to spend so much time reviewing this.
Thanks very much.

Best Regards,
Zhiwei

r~



