From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH 6/8] target/ppc: Optimize emulation of vclzw instruction
Date: Thu, 6 Jun 2019 13:34:17 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0

On 6/6/19 5:15 AM, Stefan Brankovic wrote:
> +    for (i = 0; i < 2; i++) {
> +        if (i == 0) {
> +            /* Get high doubleword element of vB in avr. */
> +            get_avr64(avr, VB, true);
> +        } else {
> +            /* Get low doubleword element of vB in avr. */
> +            get_avr64(avr, VB, false);
> +        }
Better as simply get_avr64(avr, VB, i);
> +        /*
> +         * Perform count for every word element using tcg_gen_clzi_i64.
> +         * Since it counts leading zeros on 64 bit length, we have to move
> +         * ith word element to highest 32 bits of tmp, or it with mask (so we
> +         * get all ones in lowest 32 bits), then perform tcg_gen_clzi_i64 and
> +         * move its result in appropriate word element of result.
> +         */
> +        tcg_gen_shli_i64(tmp, avr, 32);
> +        tcg_gen_or_i64(tmp, tmp, mask);
> +        tcg_gen_clzi_i64(result, tmp, 64);
> +
> +        tcg_gen_or_i64(tmp, avr, mask);
> +        tcg_gen_clzi_i64(tmp, tmp, 64);
s/64/32.
> +        tcg_gen_deposit_i64(result, result, tmp, 32, 32);
That said, it's probably better to treat this as 4 words, not 2 doublewords.
    for (i = 0; i < 4; i++) {
        tcg_gen_ld_i32(tmp, cpu_env, avr_full_offset(VB) + i * 4);
        tcg_gen_clzi_i32(tmp, tmp, 32);
        tcg_gen_st_i32(tmp, cpu_env, avr_full_offset(VT) + i * 4);
    }
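
As a sanity check, the two forms can be compared with a plain C model
outside of QEMU. This is only a sketch, not translator code: every helper
name below is hypothetical, and the mask value 0x00000000ffffffff is an
assumption taken from the patch comment ("all ones in lowest 32 bits").
It models one 64-bit half of the vector and shows that counting per word
gives the same packed result as the shift/or/clz64/deposit sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a 32-bit word; a zero input yields 32. */
static uint32_t clz32_model(uint32_t x)
{
    uint32_t n = 0;

    if (x == 0) {
        return 32;
    }
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Count leading zeros of a 64-bit value; a zero input yields zero_val. */
static uint32_t clz64_model(uint64_t x, uint32_t zero_val)
{
    uint32_t n = 0;

    if (x == 0) {
        return zero_val;
    }
    while (!(x & 0x8000000000000000ull)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Doubleword form: shift or mask each word, then count on 64 bits. */
static uint64_t vclzw_dword_model(uint64_t avr)
{
    const uint64_t mask = 0x00000000ffffffffull;  /* assumed mask value */
    /* The mask keeps both inputs nonzero, so zero_val is never used. */
    uint64_t lo = clz64_model((avr << 32) | mask, 64);
    uint64_t hi = clz64_model(avr | mask, 64);

    /* Models the deposit of hi into bits 32..63 of the result. */
    return (hi << 32) | lo;
}

/* Word form: count each 32-bit element on its own. */
static uint64_t vclzw_word_model(uint64_t avr)
{
    uint64_t lo = clz32_model((uint32_t)avr);
    uint64_t hi = clz32_model((uint32_t)(avr >> 32));

    return (hi << 32) | lo;
}

/* Assert that both models agree for one doubleword. */
static void check_models_agree(uint64_t avr)
{
    assert(vclzw_dword_model(avr) == vclzw_word_model(avr));
}
```

Calling check_models_agree() on a few edge values (all zeros, all ones, a
single bit set in either word) is enough to exercise the zero-input case,
which is where the two forms could plausibly diverge.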
r~