This series implements some basic machine-independent optimizations.
They simplify code and allow liveness analysis to do its work better.
Suppose we have the following ARM code:
movw r12, #0xb6db
movt r12, #0xdb6d
In TCG before optimizations we'll have:
movi_i32 tmp8,$0xb6db
mov_i32 r12,tmp8
mov_i32 tmp8,r12
ext16u_i32 tmp8,tmp8
movi_i32 tmp9,$0xdb6d0000
or_i32 tmp8,tmp8,tmp9
mov_i32 r12,tmp8
And after optimizations we'll have this:
movi_i32 r12,$0xdb6db6db
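
To illustrate how the seven ops above can collapse into one, here is a
minimal sketch in Python. It is a toy model and not the code from this
series: the tuple encoding of ops and the names fold_constants and
remove_dead are assumptions made up for the example. It assumes a forward
pass that propagates and folds constants, followed by a backward liveness
pass that treats r12 as live at the end and the tmpN temporaries as dead.

```python
# Toy op stream: (opcode, dest, *sources).
# Integer sources are immediates, string sources are temps or globals.
# This encoding is hypothetical and is not TCG's internal representation.
MASK32 = 0xFFFFFFFF


def fold_constants(ops):
    """Forward pass: propagate known constants and fold ops on them."""
    known = {}  # name -> constant value it currently holds
    out = []
    for opc, dst, *srcs in ops:
        # Substitute sources whose value is already known to be constant.
        vals = [known.get(s, s) if isinstance(s, str) else s for s in srcs]
        res = None
        if all(isinstance(v, int) for v in vals):
            if opc in ("movi_i32", "mov_i32"):
                res = vals[0]
            elif opc == "ext16u_i32":
                res = vals[0] & 0xFFFF
            elif opc == "or_i32":
                res = vals[0] | vals[1]
        if res is not None:
            # The result is a constant: rewrite the op as a movi.
            known[dst] = res & MASK32
            out.append(("movi_i32", dst, known[dst]))
        else:
            # Unknown result: dst no longer holds a known constant.
            known.pop(dst, None)
            out.append((opc, dst, *srcs))
    return out


def remove_dead(ops, live_out):
    """Backward pass: drop ops whose destination is never read later."""
    live = set(live_out)
    kept = []
    for opc, dst, *srcs in reversed(ops):
        if dst not in live:
            continue  # result is overwritten or unused: drop the op
        live.discard(dst)
        live.update(s for s in srcs if isinstance(s, str))
        kept.append((opc, dst, *srcs))
    kept.reverse()
    return kept


# The unoptimized sequence from the example above.
ops = [
    ("movi_i32", "tmp8", 0xB6DB),
    ("mov_i32", "r12", "tmp8"),
    ("mov_i32", "tmp8", "r12"),
    ("ext16u_i32", "tmp8", "tmp8"),
    ("movi_i32", "tmp9", 0xDB6D0000),
    ("or_i32", "tmp8", "tmp8", "tmp9"),
    ("mov_i32", "r12", "tmp8"),
]

# Only the guest register r12 is assumed live after the sequence.
result = remove_dead(fold_constants(ops), live_out={"r12"})
for opc, dst, val in result:
    print(f"{opc} {dst},${val:#x}")  # prints: movi_i32 r12,$0xdb6db6db
```

In this sketch the forward pass turns every op into a movi, which is what
lets the backward pass see that all writes except the last one to r12 are
dead.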
Here are performance evaluation results for the SPEC CPU2000 integer
tests in user-mode emulation on an x86_64 host. Each test was run 5
times on the reference data set. The tables below show the runtime in
seconds for all these runs.