qemu-devel

Re: [Qemu-devel] [RFC PATCH] tcg: Optimize fence instructions


From: Alex Bennée
Subject: Re: [Qemu-devel] [RFC PATCH] tcg: Optimize fence instructions
Date: Tue, 19 Jul 2016 16:56:11 +0100
User-agent: mu4e 0.9.17; emacs 25.0.95.10

Pranith Kumar <address@hidden> writes:

> This patch applies on top of the fence generation patch series.
>
> This commit optimizes fence instructions. Two optimizations are
> currently implemented. These are:
>
> 1. Unnecessary duplicate fence instructions
>
>    If the same fence instruction is detected consecutively, we remove
>    one instance of it.
>
>    ex: mb; mb => mb, strl; strl => strl
>
> 2. Merging weaker fence with subsequent/previous stronger fence
>
>    load-acquire/store-release fence can be combined with a full fence
>    without relaxing the ordering constraint.
>
>    ex: a) ld; ldaq; mb => ld; mb
>        b) mb; strl; st => mb; st

What test cases do you have for this?

Currently the litmus tests don't trigger these optimizations (they
contain exactly the barriers they need), and most programs don't seem
to generate multiple adjacent barriers.
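
As a rough sketch of what a standalone test for the merging rules could
look like, assuming a toy op stream rather than real TCGOps: the ToyOp
enum and toy_optimize_mb() below are hypothetical names, not QEMU code,
and the enum values are not the TCG_BAR_* encodings. It mirrors only the
four cases handled in the patch and does not model insn_start ops.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op kinds for illustration only; these are NOT the
 * TCG_BAR_* values from tcg/tcg.h. */
typedef enum {
    OP_OTHER,   /* any non-barrier op (ld, st, ...) */
    BAR_LDAQ,   /* load-acquire barrier */
    BAR_STRL,   /* store-release barrier */
    BAR_SC      /* full barrier (mb) */
} ToyOp;

/* Apply the patch's four peephole rules in place over ops[0..n)
 * and return the new length of the op stream. */
static size_t toy_optimize_mb(ToyOp *ops, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        ToyOp cur = ops[i];

        /* Only consider merging when both this op and the last
         * emitted op are barriers; any other op resets the state. */
        if (cur != OP_OTHER && out > 0 && ops[out - 1] != OP_OTHER) {
            ToyOp prev = ops[out - 1];

            if (cur == prev || (cur == BAR_STRL && prev == BAR_SC)) {
                /* Duplicate, or weaker strl after a full barrier:
                 * drop the current op. */
                continue;
            }
            if (prev == BAR_LDAQ && (cur == BAR_SC || cur == BAR_STRL)) {
                /* ldaq; mb   => mb
                 * ldaq; strl => mb
                 * The previous op is replaced by one full barrier. */
                ops[out - 1] = BAR_SC;
                continue;
            }
        }
        ops[out++] = cur;
    }
    return out;
}
```

Feeding it { OP_OTHER, BAR_LDAQ, BAR_SC } should leave two ops, the
second being BAR_SC, which corresponds to "ld; ldaq; mb => ld; mb" from
the commit message.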

>
> Signed-off-by: Pranith Kumar <address@hidden>
> ---
>  tcg/optimize.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tcg/tcg.h      |  1 +
>  2 files changed, 60 insertions(+)
>
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index c0d975b..a655829 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -569,6 +569,63 @@ static bool swap_commutative2(TCGArg *p1, TCGArg *p2)
>      return false;
>  }
>
> +/* Eliminate duplicate and unnecessary fence instructions */
> +void tcg_optimize_mb(TCGContext *s)
> +{
> +    int oi, oi_next;
> +    TCGArg prev_op_mb = -1;
> +    TCGOp *prev_op;
> +
> +    for (oi = s->gen_first_op_idx; oi >= 0; oi = oi_next) {
> +        TCGOp *op = &s->gen_op_buf[oi];
> +        TCGArg *args = &s->gen_opparam_buf[op->args];
> +        TCGOpcode opc = op->opc;
> +
> +        switch (opc) {
> +        case INDEX_op_mb:
> +        {
> +            TCGBar curr_mb_type = args[0] & 0xF0;
> +            TCGBar prev_mb_type = prev_op_mb & 0xF0;
> +
> +            if (curr_mb_type == prev_mb_type ||
> +                (curr_mb_type == TCG_BAR_STRL && prev_mb_type == TCG_BAR_SC)) {
> +                /* Remove the current weaker barrier op. The previous
> +                 * barrier is stronger and sufficient.
> +                 * mb; strl => mb; st
> +                 */
> +                tcg_op_remove(s, op);
> +            } else if (curr_mb_type == TCG_BAR_SC &&
> +                       prev_mb_type == TCG_BAR_LDAQ) {
> +                /* Remove the previous weaker barrier op. The current
> +                 * barrier is stronger and sufficient.
> +                 * ldaq; mb => ld; mb
> +                 */
> +                tcg_op_remove(s, prev_op);
> +            } else if (curr_mb_type == TCG_BAR_STRL &&
> +                       prev_mb_type == TCG_BAR_LDAQ) {
> +                /* Consecutive load-acquire and store-release barriers
> +                 * can be merged into one stronger SC barrier
> +                 * ldaq; strl => ld; mb; st
> +                 */
> +                args[0] = (args[0] & 0x0F) | TCG_BAR_SC;
> +                tcg_op_remove(s, prev_op);
> +            }
> +            prev_op_mb = args[0];
> +            prev_op = op;
> +            break;
> +        }
> +        case INDEX_op_insn_start:
> +            break;
> +        default:
> +            prev_op_mb = -1;
> +            prev_op = NULL;
> +            break;
> +        }
> +
> +        oi_next = op->next;
> +    }
> +}
> +
>  /* Propagate constants and copies, fold constant expressions. */
>  void tcg_optimize(TCGContext *s)
>  {
> @@ -1327,4 +1384,6 @@ void tcg_optimize(TCGContext *s)
>              break;
>          }
>      }
> +
> +    tcg_optimize_mb(s);
>  }
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index 60ebc2b..2d7a46d 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -904,6 +904,7 @@ void tcg_gen_callN(TCGContext *s, void *func,
>                     TCGArg ret, int nargs, TCGArg *args);
>
>  void tcg_op_remove(TCGContext *s, TCGOp *op);
> +void tcg_optimize_mb(TCGContext *s);
>  void tcg_optimize(TCGContext *s);
>
>  /* only used for debugging purposes */


--
Alex Bennée


