From: Aurelien Jarno
Subject: Re: [Qemu-devel] [PATCH 0/6] Implement constant folding and copy propagation in TCG
Date: Fri, 20 May 2011 21:35:08 +0200
User-agent: Mutt/1.5.20 (2009-06-14)

On Fri, May 20, 2011 at 04:39:27PM +0400, Kirill Batuzov wrote:
> This series implements some basic machine-independent optimizations.  They
> simplify code and allow liveness analysis to do its work better.
> 
> Suppose we have the following ARM code:
> 
>  movw    r12, #0xb6db
>  movt    r12, #0xdb6d
> 
> In TCG before optimizations we'll have:
> 
>  movi_i32 tmp8,$0xb6db
>  mov_i32 r12,tmp8
>  mov_i32 tmp8,r12
>  ext16u_i32 tmp8,tmp8
>  movi_i32 tmp9,$0xdb6d0000
>  or_i32 tmp8,tmp8,tmp9
>  mov_i32 r12,tmp8
> 
> And after optimizations we'll have this:
> 
>  movi_i32 r12,$0xdb6db6db
> 
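
To make the example above concrete, here is a rough Python sketch of
constant folding and copy propagation followed by a simple dead-op sweep.
It is not the code from tcg/optimize.c: the tuple op format and the
optimize() and drop_dead() helpers are hypothetical, and drop_dead() only
stands in for TCG's existing liveness analysis.

```python
MASK32 = 0xFFFFFFFF

# Folding rules for the few ops used in the example (illustrative subset).
FOLD = {
    "mov_i32": lambda a: a,
    "ext16u_i32": lambda a: a & 0xFFFF,
    "or_i32": lambda a, b: a | b,
    "add_i32": lambda a, b: (a + b) & MASK32,
}


def optimize(ops):
    """Constant folding and copy propagation over a straight-line op list.

    Each op is a tuple (name, dst, *args); this layout is an assumption
    made for the sketch, not TCG's real representation.
    """
    const = {}  # temp -> known 32-bit constant
    copy = {}   # temp -> temp it currently mirrors
    out = []

    def kill(dst):
        # dst is about to be overwritten: forget what we knew about it
        # and about any temp recorded as a copy of it.
        const.pop(dst, None)
        copy.pop(dst, None)
        for t in [t for t, s in copy.items() if s == dst]:
            del copy[t]

    for op, dst, *args in ops:
        if op == "movi_i32":
            kill(dst)
            const[dst] = args[0] & MASK32
            out.append((op, dst, const[dst]))
            continue
        srcs = [copy.get(a, a) for a in args]   # copy propagation
        vals = [const.get(s) for s in srcs]
        if op in FOLD and all(v is not None for v in vals):
            value = FOLD[op](*vals) & MASK32     # constant folding
            kill(dst)
            const[dst] = value
            out.append(("movi_i32", dst, value))
        elif op == "mov_i32":
            if srcs[0] == dst:
                continue                         # self-move is a no-op
            kill(dst)
            copy[dst] = srcs[0]
            out.append((op, dst, srcs[0]))
        else:
            kill(dst)
            out.append((op, dst, *srcs))
    return out


def drop_dead(ops, live_out):
    """Backward sweep that drops ops whose result is never used."""
    live = set(live_out)
    kept = []
    for op, dst, *args in reversed(ops):
        if dst not in live:
            continue
        live.discard(dst)
        live.update(a for a in args if isinstance(a, str))
        kept.append((op, dst, *args))
    kept.reverse()
    return kept


block = [
    ("movi_i32", "tmp8", 0xB6DB),
    ("mov_i32", "r12", "tmp8"),
    ("mov_i32", "tmp8", "r12"),
    ("ext16u_i32", "tmp8", "tmp8"),
    ("movi_i32", "tmp9", 0xDB6D0000),
    ("or_i32", "tmp8", "tmp8", "tmp9"),
    ("mov_i32", "r12", "tmp8"),
]
result = drop_dead(optimize(block), live_out={"r12"})
```

With r12 treated as the only value live at the end of the block, the seven
ops collapse to the single movi_i32 r12,$0xdb6db6db shown above. Folding
alone turns every op into a movi; it is the liveness sweep that then removes
the six dead ones.
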
> Here are performance evaluation results on SPEC CPU2000 integer tests in
> user-mode emulation on an x86_64 host.  There were 5 runs of each test on
> the reference data set.  The tables below show runtime in seconds for all
> these runs.

How were the tests run? With linux-user, or by running the executables
inside qemu-system-xxx?

> ARM guest without optimizations:
> Test name       #1       #2       #3       #4       #5    Median
> 164.gzip    1403.612 1403.499 1403.52  1208.55  1403.583 1403.52
> 175.vpr     1237.729 1238.008 1238.019 1176.852 1237.902 1237.902
> 176.gcc      929.511  928.867  929.048  928.927  928.792  928.927
> 181.mcf      196.371  196.335  196.172  197.057  196.196  196.335
> 186.crafty  1547.101 1547.293 1547.133 1547.037 1547.044 1547.101
> 197.parser  3804.336 3804.429 3804.412 3804.45  3804.301 3804.412
> 252.eon     2760.414 2760.45  2473.608 2760.606 2760.216 2760.414
> 253.perlbmk 2557.966 2558.971 2559.731 2479.299 2556.835 2557.966
> 256.bzip2   1296.412 1296.215 1296.63  1296.489 1296.092 1296.412
> 300.twolf   2919.496 2919.444 2919.529 2919.384 2919.404 2919.444
>       
> ARM guest with optimizations:
> Test name       #1       #2       #3       #4       #5    Median    Gain
> 164.gzip    1345.416 1401.741 1377.022 1401.737 1401.773 1401.737   0.13%
> 175.vpr     1116.75  1243.213 1243.32  1243.316 1243.144 1243.213  -0.43%
> 176.gcc      897.045  909.568  850.1    909.65   909.57   909.568   2.08%
> 181.mcf      199.058  198.717  198.28   198.866  197.955  198.717  -1.21%
> 186.crafty  1525.667 1526.663 1525.981 1525.995 1526.164 1525.995   1.36%
> 197.parser  3749.453 3749.522 3749.413 3749.5   3749.484 3749.484   1.44%
> 252.eon     2730.593 2746.525 2746.495 2746.493 2746.62  2746.495   0.50%
> 253.perlbmk 2577.341 2521.057 2578.461 2578.721 2581.313 2578.461  -0.80%
> 256.bzip2   1184.498 1190.116 1294.352 1294.554 1294.637 1294.352   0.16%
> 300.twolf   2894.264 2894.133 2894.398 2894.103 2894.146 2894.146   0.87%
> 
> 
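
The Gain column appears to be the relative reduction in median runtime.
That formula is not stated in the mail; it is inferred from the figures,
and the gain_percent() helper below is a hypothetical name used only for
this check. A small Python sketch, using the 176.gcc rows from the ARM
tables above:

```python
from statistics import median


def gain_percent(baseline_runs, optimized_runs):
    """Relative reduction in median runtime, in percent.

    Assumed formula: (median_baseline - median_optimized) / median_baseline.
    Positive means the optimized build ran faster.
    """
    base = median(baseline_runs)
    opt = median(optimized_runs)
    return (base - opt) / base * 100.0


# 176.gcc, ARM guest, runtimes in seconds copied from the tables above.
gcc_base = [929.511, 928.867, 929.048, 928.927, 928.792]
gcc_opt = [897.045, 909.568, 850.1, 909.65, 909.57]
gcc_gain = gain_percent(gcc_base, gcc_opt)
print(f"176.gcc gain: {gcc_gain:.2f}%")
```

Using the median rather than the mean keeps a single outlier run (such as
the 850.1 s run here) from dominating the comparison.
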
> x86_64 guest without optimizations:
> Test name       #1       #2       #3       #4       #5    Median
> 164.gzip     858.118  858.151  858.09   858.139  858.122  858.122
> 175.vpr      956.361  956.465  956.521  956.438  956.705  956.465
> 176.gcc      647.275  647.465  647.186  647.294  647.268  647.275
> 181.mcf      219.239  221.964  220.244  220.74   220.559  220.559
> 186.crafty  1128.027 1128.071 1128.028 1128.115 1128.123 1128.071
> 197.parser  1815.669 1815.651 1815.669 1815.711 1815.759 1815.669
> 253.perlbmk 1777.143 1777.749 1667.508 1777.051 1778.391 1777.143
> 254.gap     1062.808 1062.758 1062.801 1063.099 1062.859 1062.808
> 255.vortex  1930.693 1930.706 1930.579 1930.7   1930.566 1930.693
> 256.bzip2   1014.566 1014.702 1014.6   1014.274 1014.421 1014.566
> 300.twolf   1342.653 1342.759 1344.092 1342.641 1342.794 1342.759
>      
> x86_64 guest with optimizations:
> Test name       #1       #2       #3       #4       #5    Median    Gain
> 164.gzip     857.485  857.457  857.475  857.509  857.507  857.485   0.07%
> 175.vpr      963.255  962.972  963.27   963.124  963.686  963.255  -0.71%
> 176.gcc      644.123  644.055  644.145  643.818  635.773  644.055   0.50%
> 181.mcf      216.215  217.549  218.744  216.437  217.83   217.549   1.36%
> 186.crafty  1128.873 1128.792 1128.871 1128.816 1128.823 1128.823  -0.07%
> 197.parser  1814.626 1814.503 1814.552 1814.602 1814.748 1814.602   0.06%
> 253.perlbmk 1758.056 1751.963 1753.267 1765.27  1759.828 1758.056   1.07%
> 254.gap     1064.702 1064.712 1064.629 1064.657 1064.645 1064.657  -0.17%
> 255.vortex  1760.638 1936.387 1937.871 1937.471 1760.496 1936.387  -0.29%
> 256.bzip2   1007.658 1007.682 1007.316 1007.982 1007.747 1007.682   0.68%
> 300.twolf   1334.139 1333.791 1333.795 1334.147 1333.732 1333.795   0.67%
> 
> 254.gap and 255.vortex for the ARM guest and 252.eon for the x86_64 guest
> do not work under QEMU for some unrelated reason.
> 
> Kirill Batuzov (6):
>   Add TCG optimizations stub
>   Add copy and constant propagation.
>   Do constant folding for basic arithmetic operations.
>   Do constant folding for boolean operations.
>   Do constant folding for shift operations.
>   Do constant folding for unary operations.
> 
>  Makefile.target |    2 +-
>  tcg/optimize.c  |  539 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tcg/tcg.c       |    6 +
>  tcg/tcg.h       |    3 +
>  4 files changed, 549 insertions(+), 1 deletions(-)
>  create mode 100644 tcg/optimize.c
> 
> -- 
> 1.7.4.1
> 
> 
> 

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
address@hidden                 http://www.aurel32.net


