qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [RFC v2 PATCH] tcg: Optimize fence instructions


From: Pranith Kumar
Subject: Re: [Qemu-devel] [RFC v2 PATCH] tcg: Optimize fence instructions
Date: Wed, 10 Aug 2016 13:58:29 -0400

Hi Richard,

On Wed, Aug 10, 2016 at 6:44 AM, Richard Henderson <address@hidden> wrote:
> On 08/10/2016 04:42 AM, Pranith Kumar wrote:
>>
>> +/* Eliminate duplicate and unnecessary fence instructions */
>> +void tcg_optimize_mb(TCGContext *s)
>> +{
>> +    int oi, oi_next;
>> +    TCGArg prev_op_mb = -1;
>> +    TCGOp *prev_op = NULL;
>> +
>> +    for (oi = s->gen_op_buf[0].next; oi != 0; oi = oi_next) {
>> +        TCGOp *op = &s->gen_op_buf[oi];
>> +        TCGArg *args = &s->gen_opparam_buf[op->args];
>> +        TCGOpcode opc = op->opc;
>> +
>> +        switch (opc) {
>> +        case INDEX_op_mb:
>> +        {
>> +            TCGBar curr_mb_type = args[0] & 0xF0;
>> +            TCGBar prev_mb_type = prev_op_mb & 0xF0;
>> +
>> +            if (curr_mb_type == prev_mb_type ||
>> +                (curr_mb_type == TCG_BAR_STRL && prev_mb_type ==
>> TCG_BAR_SC)) {
>
>
> As I said before, I don't think you should be doing any of this if
> prev_op_mb == -1.  I think that's the first thing you should test within
> INDEX_op_mb.

If prev_op_mb is -1, none of the conditions evaluate to true and we
will fall through. Did you want me to check this as an optimization or
am I missing your point?

>
>> +                /* Remove the current weaker barrier op. The previous
>> +                 * barrier is stronger and sufficient.
>> +                 * mb; strl => mb; st
>> +                 */
>> +                tcg_op_remove(s, op);
>> +                op = prev_op;
>
>
> As I said before, surely you have to merge the low 4 bits as well.

Since we are removing the weaker barrier and leaving the stronger
barrier, I think just updating args[0] to have the stronger barrier
semantics should be enough.

                tcg_op_remove(s, op);
                op = prev_op;
                args[0] = prev_op_mb;
                break;

Is there something I am missing when you say merge?

>
>> +        case INDEX_op_ld8u_i32:
>> +        case INDEX_op_ld8u_i64:
>> +        case INDEX_op_ld8s_i32:
>> +        case INDEX_op_ld8s_i64:
>> +        case INDEX_op_ld16u_i32:
>> +        case INDEX_op_ld16u_i64:
>> +        case INDEX_op_ld16s_i32:
>> +        case INDEX_op_ld16s_i64:
>> +        case INDEX_op_ld_i32:
>> +        case INDEX_op_ld32u_i64:
>> +        case INDEX_op_ld32s_i64:
>> +        case INDEX_op_ld_i64:
>> +        case INDEX_op_st8_i32:
>> +        case INDEX_op_st8_i64:
>> +        case INDEX_op_st16_i32:
>> +        case INDEX_op_st16_i64:
>> +        case INDEX_op_st_i32:
>> +        case INDEX_op_st32_i64:
>> +        case INDEX_op_st_i64:
>
>
> None of these are guest memory operations.  These are host access to the ENV
> structure, primarily for cpu registers that aren't common enough to warrant
> a TCG temporary.
>
> You wanted to be checking for
>
>   INDEX_op_qemu_ld_i32
>   INDEX_op_qemu_ld_i64
>   INDEX_op_qemu_st_i32
>   INDEX_op_qemu_st_i64
>
> which are the guest memory operations.
>

Yes, I got confused. I will fix this.


>> +        case INDEX_op_call:
>> +            prev_op_mb = -1;
>> +            prev_op = NULL;
>> +            break;
>
>
> Probably you could skip the calls to "pure" helpers.  I.e. those with
> TCG_CALL_NO_RWG_SE.  But it's probably not that important and we can leave
> it for another day.
>
>
> r~



-- 
Pranith



reply via email to

[Prev in Thread] Current Thread [Next in Thread]