DEF_HELPER_3(lmw, void, env, tl, i32)
-DEF_HELPER_3(stmw, void, env, tl, i32)
+DEF_HELPER_FLAGS_3(stmw, TCG_CALL_NO_WG, void, env, tl, i32)
DEF_HELPER_4(lsw, void, env, tl, i32, i32)
DEF_HELPER_5(lswx, void, env, tl, i32, i32, i32)
-DEF_HELPER_4(stsw, void, env, tl, i32, i32)
-DEF_HELPER_3(dcbz, void, env, tl, i32)
-DEF_HELPER_2(icbi, void, env, tl)
+DEF_HELPER_FLAGS_4(stsw, TCG_CALL_NO_WG, void, env, tl, i32, i32)
+DEF_HELPER_FLAGS_3(dcbz, TCG_CALL_NO_WG, void, env, tl, i32)
+DEF_HELPER_FLAGS_2(icbi, TCG_CALL_NO_WG, void, env, tl)
DEF_HELPER_5(lscbx, tl, env, tl, i32, i32, i32)
#if defined(TARGET_PPC64)
If my understanding is right, the above is correct, as none of these
instructions will write to the env, though they can read from it and/
or generate faults.
Sadly I haven't observed any performance improvement as a result in
a few micro-benchmarks I cooked up.