Re: [Qemu-devel] [PATCH 0/7] tcg: conditional set and move opcodes


From: malc
Subject: Re: [Qemu-devel] [PATCH 0/7] tcg: conditional set and move opcodes
Date: Fri, 18 Dec 2009 18:40:26 +0300 (MSK)

On Thu, 17 Dec 2009, Richard Henderson wrote:

> On 12/17/2009 07:32 AM, malc wrote:
> > > These new opcodes are considered "required" by the backend,
> > > because expanding them at the tcg level breaks the basic block.
> > > There might be some way to emulate within tcg internals, but
> > > that doesn't seem worthwhile, as essentially all hosts have
> > > some form of support for these.
> ..
> >   c. Historically things like that were made conditional with
> >      a generic fallback (bswap, neg, not, rot, etc)
> 
> I answered this one above.  A generic fallback would break the
> basic block, which would break TCG's simple register allocation.
> 
> >   b. Documentation for movcond has a typo, t0 is assigned not t1
> 
> Oops.  Will fix.
> 
> >   d. Documentation for setcond2 is missing
> 
> Ah, I see that brcond2 is missing as well; I'll fix that too.
> 
> > It would also be interesting to learn what impact adding those two
> > has on performance, any results?
> 
> Hmph, not as much as I would have liked.  I suppose Intel is getting pretty
> darned good with its branch prediction.  It shaved about 3 minutes off
> 183.equake from what I posted earlier this week; that's something around a 7%
> improvement, assuming it's not just all noise (I haven't run that test enough
> times to see what the variation is).
> 

After fixing a bug (crop was done after reading the cr) I ran some
openssl speed benchmarks and, at least here on an MPC7447A, got a
speed degradation, tiny but consistent. I took a very quick glance at
the generated code, and the first thing I saw was this:

----------------
IN: 
0x40082295:  movzbl (%eax),%eax
0x40082298:  cmp    $0x3d,%al
0x4008229a:  setne  %dl
0x4008229d:  test   %al,%al
0x4008229f:  je     0x400822d2

OP after liveness analysis:
 mov_i32 tmp2,eax
 qemu_ld8u tmp0,tmp2,$0xffffffff
 mov_i32 eax,tmp0
 movi_i32 tmp1,$0x3d
 mov_i32 tmp0,eax
 nopn $0x2,$0x2
 sub_i32 cc_dst,tmp0,tmp1
 movi_i32 tmp13,$0xff
 and_i32 tmp4,cc_dst,tmp13
 movi_i32 tmp13,$0x0
 setcond_i32 tmp0,tmp4,tmp13,ne
 movi_i32 tmp14,$0xff
 and_i32 tmp13,tmp0,tmp14

....

OUT: [size=204]
0x601051b0:  lwz     r14,0(r27)
0x601051b4:  lbzx    r14,0,r14
0x601051b8:  mr      r15,r14
0x601051bc:  addi    r15,r15,-61
0x601051c0:  andi.   r15,r15,255
0x601051c4:  cmpwi   cr6,r15,0
0x601051c8:  crnot   4*cr7+eq,4*cr6+eq
0x601051cc:  mfcr    r0
0x601051d0:  rlwinm  r15,r0,31,31,31
0x601051d4:  andi.   r15,r15,255

...

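So the fact that setcond produces 0/1 was never communicated to tcg,
and the result is still masked with 0xff (the final andi.) even though
that mask cannot change it. Not that I would claim that communicating
this is possible at all...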
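
For illustration only, here is a minimal sketch in C of what
"communicating" the 0/1 range could look like: a pass that tracks, per
temp, which bits may be set, and turns an and with a covering constant
mask into a plain move. Everything in it (the toy_op structure, the
TOY_* opcodes, toy_fold_redundant_and) is hypothetical and is not
QEMU's actual TCG code or API; it only assumes a straight-line op
list like the one in the dump above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy IR; these are NOT QEMU's real TCG structures. */
enum toy_opc { TOY_MOVI, TOY_SETCOND, TOY_AND, TOY_MOV, TOY_OTHER };

struct toy_op {
    enum toy_opc opc;
    int dst;        /* destination temp index */
    int src1, src2; /* source temp indices (ignored by TOY_MOVI) */
    uint32_t imm;   /* constant, used by TOY_MOVI only */
};

#define TOY_NB_TEMPS 16

static bool toy_valid(int t)
{
    return t >= 0 && t < TOY_NB_TEMPS;
}

/*
 * Walk a straight-line op list, tracking for each temp the set of
 * bits that may be 1.  An AND whose constant operand covers all the
 * possibly-set bits of the other operand is rewritten into a MOV.
 * Returns the number of ANDs rewritten.
 */
int toy_fold_redundant_and(struct toy_op *ops, size_t n)
{
    uint32_t maybe_set[TOY_NB_TEMPS];
    bool is_const[TOY_NB_TEMPS];
    int folded = 0;

    for (int t = 0; t < TOY_NB_TEMPS; t++) {
        maybe_set[t] = UINT32_MAX;  /* nothing known yet */
        is_const[t] = false;
    }

    for (size_t i = 0; i < n; i++) {
        struct toy_op *op = &ops[i];
        uint32_t bits = UINT32_MAX;
        bool konst = false;

        assert(toy_valid(op->dst));
        switch (op->opc) {
        case TOY_MOVI:
            bits = op->imm;
            konst = true;
            break;
        case TOY_SETCOND:
            bits = 1;               /* result is always 0 or 1 */
            break;
        case TOY_MOV:
            assert(toy_valid(op->src1));
            bits = maybe_set[op->src1];
            konst = is_const[op->src1];
            break;
        case TOY_AND: {
            assert(toy_valid(op->src1) && toy_valid(op->src2));
            uint32_t a = maybe_set[op->src1];
            uint32_t b = maybe_set[op->src2];

            bits = a & b;
            konst = is_const[op->src1] && is_const[op->src2];
            if (is_const[op->src2] && (a & ~b) == 0) {
                op->opc = TOY_MOV;  /* mask cannot clear anything */
                folded++;
            } else if (is_const[op->src1] && (b & ~a) == 0) {
                op->opc = TOY_MOV;
                op->src1 = op->src2;
                folded++;
            }
            break;
        }
        case TOY_OTHER:
            break;                  /* result unknown */
        }
        /* Update after reading the sources, since dst may alias them. */
        maybe_set[op->dst] = bits;
        is_const[op->dst] = konst;
    }
    return folded;
}
```

Fed the tail of the op list above (from movi_i32 tmp13,$0xff onward),
such a pass would keep the first and_i32, since nothing is known
about cc_dst, and turn the one after setcond_i32 into a move. Whether
something like this fits into tcg as it stands is a separate question.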

[..snip..]

-- 
mailto:address@hidden