qemu-devel

From: Blue Swirl
Subject: Re: [Qemu-devel] [PATCH v3 0/6] Implement constant folding and copy propagation in TCG
Date: Sat, 30 Jul 2011 13:52:57 +0300

Thanks, applied all.

On Thu, Jul 7, 2011 at 3:37 PM, Kirill Batuzov <address@hidden> wrote:
> This series implements some basic machine-independent optimizations.  They
> simplify code and allow liveness analysis to do its work better.
>
> Suppose we have the following ARM code:
>
>  movw    r12, #0xb6db
>  movt    r12, #0xdb6d
>
> In TCG before optimizations we'll have:
>
>  movi_i32 tmp8,$0xb6db
>  mov_i32 r12,tmp8
>  mov_i32 tmp8,r12
>  ext16u_i32 tmp8,tmp8
>  movi_i32 tmp9,$0xdb6d0000
>  or_i32 tmp8,tmp8,tmp9
>  mov_i32 r12,tmp8
>
> And after optimizations we'll have this:
>
>  movi_i32 r12,$0xdb6db6db
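
The collapse shown above can be reproduced with a small sketch. It is only an illustration under stated assumptions: the `struct insn` mini-IR, the `fold_constants`, `remove_dead` and `optimize` functions, and the fixed set of three temps are all hypothetical and are not QEMU's TCG data structures or the code in tcg/optimize.c. A forward pass folds every op whose inputs are known constants into a movi, and a backward liveness pass then drops ops whose results are never read, assuming only r12 is visible after the block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mini-IR for illustration; not QEMU's real TCG structures. */
enum op { OP_MOVI, OP_MOV, OP_EXT16U, OP_OR };
enum { TMP8, TMP9, R12, NB_TEMPS };
#define MAX_INSNS 64

struct insn {
    enum op op;
    int dst;
    int src1, src2;  /* temp indices; ignored by OP_MOVI */
    uint32_t imm;    /* used by OP_MOVI only */
};

/* Forward pass: turn every op whose inputs are known constants into movi. */
void fold_constants(struct insn *code, size_t n)
{
    bool is_const[NB_TEMPS] = { false };
    uint32_t val[NB_TEMPS] = { 0 };

    for (size_t i = 0; i < n; i++) {
        struct insn *in = &code[i];
        bool known = false;
        uint32_t v = 0;

        switch (in->op) {
        case OP_MOVI:
            known = true;
            v = in->imm;
            break;
        case OP_MOV:
            known = is_const[in->src1];
            v = val[in->src1];
            break;
        case OP_EXT16U:
            known = is_const[in->src1];
            v = val[in->src1] & 0xffffu;
            break;
        case OP_OR:
            known = is_const[in->src1] && is_const[in->src2];
            v = val[in->src1] | val[in->src2];
            break;
        }
        if (known) {
            in->op = OP_MOVI;
            in->imm = v;
        }
        is_const[in->dst] = known;
        val[in->dst] = v;
    }
}

/* Backward pass: drop ops whose destination is never read afterwards.
 * Assumption: only the guest register R12 is live at the end of the block. */
size_t remove_dead(struct insn *code, size_t n)
{
    bool live[NB_TEMPS] = { false };
    bool keep[MAX_INSNS];
    size_t out = 0;

    assert(n <= MAX_INSNS);
    live[R12] = true;

    for (size_t i = n; i-- > 0; ) {
        const struct insn *in = &code[i];

        keep[i] = live[in->dst];
        if (!keep[i]) {
            continue;
        }
        live[in->dst] = false;          /* defined here, so dead above */
        if (in->op != OP_MOVI) {
            live[in->src1] = true;
        }
        if (in->op == OP_OR) {
            live[in->src2] = true;
        }
    }
    for (size_t i = 0; i < n; i++) {
        if (keep[i]) {
            code[out++] = code[i];
        }
    }
    return out;
}

/* Run both passes and return the number of ops left. */
size_t optimize(struct insn *code, size_t n)
{
    fold_constants(code, n);
    return remove_dead(code, n);
}
```

Feeding the seven-op sequence from the message through `optimize()` leaves one op, a movi of 0xdb6db6db into r12. The folding pass alone does not shorten the code; it only makes the intermediate ops dead so that the liveness pass can remove them.
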
>
> Here are performance evaluation results on SPEC CPU2000 integer tests in
> user-mode emulation on x86_64 host.  There were 5 runs of each test on
> reference data set.  The tables below show runtime in seconds for all these
> runs.
>
> ARM guest without optimizations:
> Test name       #1       #2       #3       #4       #5    Median
> 164.gzip    1408.891 1402.323 1407.623 1404.955 1405.396 1405.396
> 175.vpr     1245.31  1248.758 1247.936 1248.534 1247.534 1247.936
> 176.gcc      912.561  809.481  847.057  912.636  912.544  912.544
> 181.mcf      198.384  197.841  199.127  197.976  197.29   197.976
> 186.crafty  1545.881 1546.051 1546.002 1545.927 1545.945 1545.945
> 197.parser  3779.954 3779.878 3779.79  3779.94  3779.88  3779.88
> 252.eon     2563.168 2776.152 2776.395 2776.577 2776.202 2776.202
> 253.perlbmk 2591.781 2504.078 2507.07  2591.337 2463.401 2507.07
> 256.bzip2   1306.197 1304.639 1184.853 1305.141 1305.606 1305.141
> 300.twolf   2918.984 2918.926 2918.93  2918.97  2918.914 2918.93
>
> ARM guest with optimizations:
> Test name       #1       #2       #3       #4       #5    Median    Gain
> 164.gzip    1401.198 1376.337 1401.117 1401.23  1401.246 1401.198   0.30%
> 175.vpr     1247.964 1151.468 1247.76  1154.419 1242.017 1242.017   0.47%
> 176.gcc      896.882  918.546  918.297  851.465  918.39   918.297  -0.63%
> 181.mcf      198.19   197.399  198.421  198.663  198.312  198.312  -0.17%
> 186.crafty  1520.425 1520.362 1520.477 1520.445 1520.957 1520.445   1.65%
> 197.parser  3770.943 3770.927 3770.578 3771.048 3770.904 3770.927   0.24%
> 252.eon     2752.371 2752.111 2752.005 2752.214 2752.109 2752.111   0.87%
> 253.perlbmk 2577.462 2578.588 2493.567 2578.571 2578.318 2578.318  -2.84%
> 256.bzip2   1296.198 1271.128 1296.044 1296.321 1296.147 1296.147   0.69%
> 300.twolf   2888.984 2889.023 2889.225 2889.039 2889.05  2889.039   1.02%
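
The message does not define the Gain column. The ARM rows are consistent with the relative reduction in median runtime, 100 * (median without optimizations - median with optimizations) / median without optimizations. The sketch below computes that quantity under this assumption; the names `median5` and `gain_percent` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define RUNS 5

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of five runtimes: sort a copy and take the middle element. */
double median5(const double runs[RUNS])
{
    double sorted[RUNS];

    memcpy(sorted, runs, sizeof(sorted));
    qsort(sorted, RUNS, sizeof(sorted[0]), cmp_double);
    return sorted[RUNS / 2];
}

/* Assumed definition of "Gain": percentage of runtime saved relative to
 * the baseline median. Positive means the optimized build is faster. */
double gain_percent(double base_median, double opt_median)
{
    assert(base_median > 0.0);
    return 100.0 * (base_median - opt_median) / base_median;
}
```

For 164.gzip on the ARM guest, the medians are 1405.396 and 1401.198 seconds, which gives a gain of about 0.30%, matching the table.
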
>
>
> x86_64 guest without optimizations:
> Test name       #1       #2       #3       #4       #5    Median
> 164.gzip     857.654  857.646  857.678  798.119  857.675  857.654
> 175.vpr      959.265  959.207  959.185  959.461  959.332  959.265
> 176.gcc      625.722  637.257  646.638  646.614  646.56   646.56
> 181.mcf      221.666  220.194  220.079  219.868  221.5    220.194
> 186.crafty  1129.531 1129.739 1129.573 1129.588 1129.624 1129.588
> 197.parser  1809.517 1809.516 1809.386 1809.477 1809.427 1809.477
> 253.perlbmk 1774.944 1776.046 1769.865 1774.052 1775.236 1774.944
> 254.gap     1061.033 1061.158 1061.064 1061.047 1061.01  1061.047
> 255.vortex  1871.261 1914.144 1914.057 1914.086 1914.127 1914.086
> 256.bzip2    918.916 1011.828 1011.819 1012.11  1011.932 1011.828
> 300.twolf   1332.797 1330.56  1330.687 1330.917 1330.602 1330.687
>
> x86_64 guest with optimizations:
> Test name       #1       #2       #3       #4       #5    Median    Gain
> 164.gzip     806.198  854.159  854.184  854.168  854.187  854.168   0.41%
> 175.vpr      955.905  950.86   955.876  876.397  955.957  955.876   1.82%
> 176.gcc      641.663  640.189  641.57   641.552  641.514  641.552   0.03%
> 181.mcf      217.619  218.627  218.699  217.977  216.955  217.977   1.18%
> 186.crafty  1123.909 1123.852 1123.917 1123.781 1123.805 1123.852   0.51%
> 197.parser  1813.94  1814.643 1815.286 1814.445 1813.72  1814.445  -0.27%
> 253.perlbmk 1791.536 1795.642 1793.0   1797.486 1791.401 1793.0    -1.02%
> 254.gap     1070.605 1070.216 1070.637 1070.168 1070.491 1070.491  -0.89%
> 255.vortex  1918.764 1918.573 1917.411 1918.287 1918.735 1918.573  -0.23%
> 256.bzip2   1017.179 1017.083 1017.283 1016.913 1017.189 1017.179  -0.53%
> 300.twolf   1321.072 1321.109 1321.019 1321.072 1321.004 1321.072   0.72%
>
> 254.gap and 255.vortex on the ARM guest and 252.eon on the x86_64 guest do
> not work under QEMU for some unrelated reason.
>
> Changes:
> v1 -> v2
>  - State and Vals arrays merged to an array of structures.
>  - Added reference counting of temp's copies. This helps to reset temp's state
>   faster in most cases.
>  - Do not do copy propagation through operations with the
>   TCG_OPF_CALL_CLOBBER or TCG_OPF_SIDE_EFFECTS flag.
>  - Split some expression simplifications into an independent switch.
>  - Let the compiler handle signed shifts and sign/zero extends in its
>   implementation-defined way.
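
As an illustration of the last item, folding a signed shift or a sign extension at translation time can be written with plain C casts, leaving the exact result for negative values to the host compiler. This is a sketch only: the helper names are hypothetical and are not taken from tcg/optimize.c. In ISO C, right-shifting a negative signed value and converting an out-of-range value to a signed type are both implementation-defined; GCC and Clang document an arithmetic shift and two's-complement wrapping.

```c
#include <assert.h>
#include <stdint.h>

/* Fold an arithmetic right shift of a 32-bit constant.
 * The result for negative inputs is implementation-defined in ISO C. */
uint32_t fold_sar_i32(uint32_t x, uint32_t count)
{
    assert(count < 32);  /* larger counts are undefined behavior in C */
    return (uint32_t)((int32_t)x >> count);
}

/* Fold an 8-bit sign extension: the casts truncate, then widen. */
uint32_t fold_ext8s_i32(uint32_t x)
{
    return (uint32_t)(int32_t)(int8_t)x;
}

/* Fold a 16-bit zero extension: fully defined, keeps the low 16 bits. */
uint32_t fold_ext16u_i32(uint32_t x)
{
    return (uint32_t)(uint16_t)x;
}
```

The zero extension is the same operation as the ext16u_i32 in the ARM example: applied to 0xdb6db6db it yields 0xb6db.
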
>
> v2 -> v3
>  - Elements of an equivalence class are placed in a doubly linked circular
>   list so it's easier to choose a new representative.
>  - CASE_OP_32_64 macro is used to reduce the number of ifdefs. Checkpatch is not
>   happy about this change but I do not think spaces would be appropriate here.
>  - Some constraints during copy propagation are relaxed.
>  - Functions tcg_opt_gen_mov and tcg_opt_gen_movi are introduced to reduce
>   code duplication.
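
The first v3 item can be pictured with the sketch below, which keeps each class of mutually equal temps in a circular doubly linked list threaded through an array by index. It is a hypothetical illustration, not the structure used in the patch: the `copy_node` type, the `copies_*` functions and the fixed `NB_TEMPS` are assumptions. When a temp is overwritten it is unlinked in constant time, and its successor in the ring is returned as a candidate for the new representative.

```c
#include <assert.h>
#include <stdbool.h>

#define NB_TEMPS 8

/* Hypothetical per-temp links; a temp alone in its class points to itself. */
struct copy_node {
    int prev;
    int next;
};

static struct copy_node temps[NB_TEMPS];

/* Put every temp in its own single-element class. */
void copies_init(void)
{
    for (int i = 0; i < NB_TEMPS; i++) {
        temps[i].prev = temps[i].next = i;
    }
}

/* Record that temp t is now a copy of 'member': insert t after it. */
void copies_join(int member, int t)
{
    assert(temps[t].next == t);  /* t must not belong to another class */
    temps[t].next = temps[member].next;
    temps[t].prev = member;
    temps[temps[member].next].prev = t;
    temps[member].next = t;
}

/* Temp t is being overwritten: unlink it from its class.
 * Returns another member usable as the new representative, or -1 if
 * t was alone. */
int copies_leave(int t)
{
    int succ = temps[t].next;

    if (succ == t) {
        return -1;
    }
    temps[temps[t].prev].next = succ;
    temps[succ].prev = temps[t].prev;
    temps[t].prev = temps[t].next = t;
    return succ;
}

/* Walk the ring starting at a to see whether b is in the same class. */
bool copies_same_class(int a, int b)
{
    for (int i = temps[a].next; i != a; i = temps[i].next) {
        if (i == b) {
            return true;
        }
    }
    return a == b;
}
```

With a singly linked list or a bare "copy of" pointer, removing the representative would require scanning for the remaining members; the ring makes any surviving neighbour immediately available.
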
>
> Kirill Batuzov (6):
>  Add TCG optimizations stub
>  Add copy and constant propagation.
>  Do constant folding for basic arithmetic operations.
>  Do constant folding for boolean operations.
>  Do constant folding for shift operations.
>  Do constant folding for unary operations.
>
>  Makefile.target |    2 +-
>  tcg/optimize.c  |  568 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  tcg/tcg.c       |    6 +
>  tcg/tcg.h       |    3 +
>  4 files changed, 578 insertions(+), 1 deletions(-)
>  create mode 100644 tcg/optimize.c
>
> --
> 1.7.4.1
>
>
>