Re: [Qemu-devel] [PATCH 6/8] target/ppc: Optimize emulation of vclzw ins


From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 6/8] target/ppc: Optimize emulation of vclzw instruction
Date: Thu, 6 Jun 2019 13:34:17 -0500

On 6/6/19 5:15 AM, Stefan Brankovic wrote:
> +    for (i = 0; i < 2; i++) {
> +        if (i == 0) {
> +            /* Get high doubleword element of vB in avr. */
> +            get_avr64(avr, VB, true);
> +        } else {
> +            /* Get low doubleword element of vB in avr. */
> +            get_avr64(avr, VB, false);
> +        }

Better as simply get_avr64(avr, VB, i);

> +        /*
> +         * Perform count for every word element using tcg_gen_clzi_i64.
> +         * Since it counts leading zeros on 64 bit lenght, we have to move
> +         * ith word element to highest 32 bits of tmp, or it with mask(so we get
> +         * all ones in lowest 32 bits), then perform tcg_gen_clzi_i64 and move
> +         * it's result in appropriate word element of result.
> +         */
> +        tcg_gen_shli_i64(tmp, avr, 32);
> +        tcg_gen_or_i64(tmp, tmp, mask);
> +        tcg_gen_clzi_i64(result, tmp, 64);
> +
> +        tcg_gen_or_i64(tmp, avr, mask);
> +        tcg_gen_clzi_i64(tmp, tmp, 64);

s/64/32.

> +        tcg_gen_deposit_i64(result, result, tmp, 32, 32);

That said, it's probably better to treat this as 4 words, not 2 doublewords.

        for (i = 0; i < 4; i++) {
            tcg_gen_ld_i32(tmp, cpu_env, avr_full_offset(VB) + i * 4);
            tcg_gen_clzi_i32(tmp, tmp, 32);
            tcg_gen_st_i32(tmp, cpu_env, avr_full_offset(VT) + i * 4);
        }
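
As a rough sanity check that the word-at-a-time form computes the same thing as
the doubleword form in the quoted patch, here is a small host-side model in
plain C. It is only a sketch: clz32/clz64 stand in for tcg_gen_clzi_i32/i64
with a zero-input result of 32/64, the mask value 0x00000000ffffffff is assumed
from the patch comment ("all ones in lowest 32 bits"), and the names vclzw_dw
and vclzw_words are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for tcg_gen_clzi_i32(ret, arg, 32): returns 32 when arg == 0. */
static uint32_t clz32(uint32_t x)
{
    uint32_t n = 0;
    if (x == 0) {
        return 32;
    }
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Stand-in for tcg_gen_clzi_i64(ret, arg, 64): returns 64 when arg == 0. */
static uint64_t clz64(uint64_t x)
{
    uint64_t n = 0;
    if (x == 0) {
        return 64;
    }
    while (!(x & 0x8000000000000000ull)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Doubleword approach from the quoted patch (mask value is an assumption). */
static uint64_t vclzw_dw(uint64_t avr)
{
    const uint64_t mask = 0x00000000ffffffffull;
    /* Low word moved to the top; the mask caps the count at 32. */
    uint64_t lo = clz64((avr << 32) | mask);
    /* High word is already at the top; same cap via the mask. */
    uint64_t hi = clz64(avr | mask);
    /* Equivalent of deposit(result, result, tmp, 32, 32). */
    return (hi << 32) | (lo & mask);
}

/* Word-at-a-time approach: one 32-bit clz per word element. */
static uint64_t vclzw_words(uint64_t avr)
{
    uint64_t lo = clz32((uint32_t)avr);
    uint64_t hi = clz32((uint32_t)(avr >> 32));
    return (hi << 32) | lo;
}

/* Hypothetical helper: asserts both models agree for one doubleword. */
static void check_doubleword(uint64_t avr)
{
    assert(vclzw_dw(avr) == vclzw_words(avr));
}
```

Calling check_doubleword() on a handful of values (0, all ones, a single bit in
either word) should pass. Because the mask keeps the 64-bit input nonzero, the
zero-input argument of the 64-bit clz never comes into play in this model.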


r~


