qemu-ppc

Re: [Qemu-ppc] [Qemu-devel] [PATCH 15/17] ppc: store CR registers in 32


From: Richard Henderson
Subject: Re: [Qemu-ppc] [Qemu-devel] [PATCH 15/17] ppc: store CR registers in 32 1-bit registers
Date: Tue, 09 Sep 2014 09:03:43 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.7.0

On 09/04/2014 11:27 AM, Tom Musta wrote:
>> -    tcg_gen_trunc_tl_i32(cpu_crf[crf], cpu_so);
>> +    tcg_gen_trunc_tl_i32(cpu_cr[crf * 4 + CRF_SO], cpu_so);
> 
> This looks correct to me but is causing problems.  The above statement seems 
> to get dropped in the generated asm ... at least on a PPC host:
> 
> IN:
> 0x00000000100005b4:  cmpw    cr3,r30,r29
> 
> OUT: [size=160]
> 0x6041ad30:  lwz     r14,-4(r27)
> 0x6041ad34:  cmpwi   cr7,r14,0
> 0x6041ad38:  bne-    cr7,0x6041adbc
> 0x6041ad3c:  ld      r14,240(r27)   <<< r30
> 0x6041ad40:  ld      r15,232(r27)   <<< r29
> 0x6041ad44:  cmpw    cr7,r14,r15    <<< this is the TCG_COND_LTx code
> 0x6041ad48:  li      r16,1
> 0x6041ad4c:  li      r0,0
> 0x6041ad50:  isel    r16,r16,r0,28
> 0x6041ad54:  stw     r16,576(r27)   <<< store cpu_cr[LT]
> 0x6041ad58:  cmpw    cr7,r14,r15
> 0x6041ad5c:  li      r16,1
> 0x6041ad60:  li      r0,0
> 0x6041ad64:  isel    r16,r16,r0,29
> 0x6041ad68:  stw     r16,580(r27)   <<< store cpu_cr[GT]
> 0x6041ad6c:  cmplw   cr7,r14,r15
> 0x6041ad70:  li      r14,1
> 0x6041ad74:  li      r0,0
> 0x6041ad78:  isel    r14,r14,r0,30
> 0x6041ad7c:  stw     r14,584(r27)   <<< store cpu_cr[EQ]
> 0x6041ad80:  .long 0x0
> 0x6041ad84:  .long 0x0
> 
> Richard:  any ideas or hints on how to proceed?

Check the op dumps and make sure it's there.  If it is, but is getting
discarded somewhere further down the pipeline, then try and get me a testcase.


> This is a very nice cleanup ... but it oversteers just a little.  For some CR 
> logical instructions, the generated code can produce non-zero bits in the i32 
> cr variable in places other than the LSB.
> For example, consider crnand, which produces the following on a PPC host:
> 
> IN:
> 0x0000000010000578:  crnand  4*cr7+so,4*cr7+lt,4*cr7+eq
> 
> OUT: [size=112]
> 0x6041a630:  lwz     r14,-4(r27)
> 0x6041a634:  cmpwi   cr7,r14,0
> 0x6041a638:  bne-    cr7,0x6041a68c
> 0x6041a63c:  lwz     r14,640(r27)
> 0x6041a640:  lwz     r15,648(r27)
> 0x6041a644:  nand    r14,r14,r15
> 0x6041a648:  andi.   r14,r14,1
> 0x6041a64c:  stw     r14,652(r27)
> 0x6041a650:  .long 0x0
> 0x6041a654:  .long 0x0
> 0x6041a658:  .long 0x0
> 0x6041a65c:  .long 0x0
> 
> The host nand operation will always produce an i32 value that has 1s in bits 
> 0-30, since those bits of the inputs are presumably zero.  A brute-force fix 
> would be to add a tcg_gen_andi_i32(D,D,1) to your macro.  But I think this is 
> required only for a subset of the instructions (crnand, crnor, creqv, crorc).

Note that since most hosts don't have nand, the combination

  nand x,y,z
  and  x,x,1

would be better represented with

  and  x,y,z
  xor  x,x,1
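
The equivalence of the two sequences above can be checked with a small standalone C sketch. It assumes each CR bit variable holds only 0 or 1 on input, which is the premise of the patch. The function names are hypothetical and not part of QEMU or TCG.

```c
#include <assert.h>
#include <stdint.h>

/* Plain 32-bit NAND.  For inputs that are 0 or 1 the result also has
 * every bit above the LSB set, which is the problem described above. */
static uint32_t nand_raw(uint32_t a, uint32_t b)
{
    return ~(a & b);
}

/* First sequence: nand, then mask down to the low bit. */
static uint32_t cr_nand_masked(uint32_t a, uint32_t b)
{
    return nand_raw(a, b) & 1u;
}

/* Second sequence: and, then flip the low bit with xor. */
static uint32_t cr_nand_xor(uint32_t a, uint32_t b)
{
    return (a & b) ^ 1u;
}

/* Compare both sequences over the four possible 1-bit input pairs. */
static void check_cr_nand_forms(void)
{
    for (uint32_t a = 0; a <= 1; a++) {
        for (uint32_t b = 0; b <= 1; b++) {
            assert(cr_nand_masked(a, b) == cr_nand_xor(a, b));
            assert(cr_nand_xor(a, b) <= 1u);
        }
    }
}
```

Calling check_cr_nand_forms() from a test harness exercises all four cases.  The second form holds only while the inputs are already 0 or 1; if a stray high bit were present in both inputs, the xor would not clear it.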


r~


