From: Kirill Batuzov
Subject: [Qemu-devel] [PATCH v2 0/6] Implement constant folding and copy propagation in TCG
Date: Thu, 9 Jun 2011 14:45:38 +0400

This series implements some basic machine-independent optimizations.  They
simplify the generated code and allow liveness analysis to do its work better.

Suppose we have the following ARM code:

 movw    r12, #0xb6db
 movt    r12, #0xdb6d

In TCG before optimizations we'll have:

 movi_i32 tmp8,$0xb6db
 mov_i32 r12,tmp8
 mov_i32 tmp8,r12
 ext16u_i32 tmp8,tmp8
 movi_i32 tmp9,$0xdb6d0000
 or_i32 tmp8,tmp8,tmp9
 mov_i32 r12,tmp8

And after optimizations we'll have this:

 movi_i32 r12,$0xdb6db6db
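
To make the transformation above concrete, here is a minimal, self-contained
sketch on a toy IR.  It is not the code from tcg/optimize.c: the toy_* names,
the instruction layout and the fixed temp set are all hypothetical, copy
propagation of non-constant temps is left out, and a simple backward pass
stands in for TCG's liveness analysis (assuming only r12 is live at the end
of the block).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_OPS 16

/* Hypothetical toy IR, loosely modelled on the ops in the example. */
enum toy_op { TOY_MOVI, TOY_MOV, TOY_EXT16U, TOY_OR };
enum toy_temp { R12, TMP8, TMP9, NB_TOY_TEMPS };

struct toy_insn {
    enum toy_op op;
    int dst, src1, src2;   /* src2 is used by TOY_OR only */
    uint32_t imm;          /* used by TOY_MOVI only */
};

/* Forward pass: track which temps hold known constants and rewrite every
   op whose inputs are all constant into a TOY_MOVI. */
static void toy_fold(struct toy_insn *code, int n)
{
    int known[NB_TOY_TEMPS] = {0};
    uint32_t val[NB_TOY_TEMPS] = {0};

    for (int i = 0; i < n; i++) {
        struct toy_insn *in = &code[i];
        int foldable = 0;
        uint32_t res = 0;

        assert(in->dst >= 0 && in->dst < NB_TOY_TEMPS);
        switch (in->op) {
        case TOY_MOVI:
            foldable = 1;
            res = in->imm;
            break;
        case TOY_MOV:
            foldable = known[in->src1];
            res = val[in->src1];
            break;
        case TOY_EXT16U:
            foldable = known[in->src1];
            res = val[in->src1] & 0xffff;
            break;
        case TOY_OR:
            foldable = known[in->src1] && known[in->src2];
            res = val[in->src1] | val[in->src2];
            break;
        }
        if (foldable) {
            in->op = TOY_MOVI;
            in->imm = res;
        }
        known[in->dst] = foldable;
        val[in->dst] = res;
    }
}

/* Backward pass standing in for liveness analysis.  Ops whose result is
   never read are dropped; survivors are compacted.  Returns the new count. */
static int toy_remove_dead(struct toy_insn *code, int n)
{
    int live[NB_TOY_TEMPS] = {0};
    int keep[TOY_MAX_OPS];
    int out = 0;

    assert(n <= TOY_MAX_OPS);
    live[R12] = 1;                      /* guest register, live at block end */
    for (int i = n - 1; i >= 0; i--) {
        struct toy_insn *in = &code[i];
        keep[i] = live[in->dst];
        if (!keep[i]) {
            continue;
        }
        live[in->dst] = 0;
        if (in->op != TOY_MOVI) {
            live[in->src1] = 1;
        }
        if (in->op == TOY_OR) {
            live[in->src2] = 1;
        }
    }
    for (int i = 0; i < n; i++) {
        if (keep[i]) {
            code[out++] = code[i];
        }
    }
    return out;
}

/* Builds the seven-op block from the example, optimizes it into `code`
   (which must hold at least TOY_MAX_OPS entries), returns the op count. */
static int toy_example(struct toy_insn *code)
{
    const struct toy_insn block[] = {
        { TOY_MOVI,   TMP8, 0,    0,    0xb6db     },
        { TOY_MOV,    R12,  TMP8, 0,    0          },
        { TOY_MOV,    TMP8, R12,  0,    0          },
        { TOY_EXT16U, TMP8, TMP8, 0,    0          },
        { TOY_MOVI,   TMP9, 0,    0,    0xdb6d0000 },
        { TOY_OR,     TMP8, TMP8, TMP9, 0          },
        { TOY_MOV,    R12,  TMP8, 0,    0          },
    };
    int n = (int)(sizeof(block) / sizeof(block[0]));

    for (int i = 0; i < n; i++) {
        code[i] = block[i];
    }
    toy_fold(code, n);
    return toy_remove_dead(code, n);
}
```

In this sketch folding alone turns all seven ops into movi; it is the
backward pass that then discards the six whose results are never read, which
is the sense in which simpler code helps liveness analysis.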

Here are performance evaluation results on SPEC CPU2000 integer tests in
user-mode emulation on x86_64 host.  There were 5 runs of each test on
reference data set.  The tables below show runtime in seconds for all these
runs.

ARM guest without optimizations:
Test name       #1       #2       #3       #4       #5    Median
164.gzip    1402.874 1379.836 1417.294 1417.466 1420.494 1417.294
175.vpr     1246.994 1245.201 1251.247 1250.812 1249.648 1249.648
176.gcc      912.617  912.646  913.649  913.443  913.637  913.443
181.mcf      198.141  198.648  196.275  196.9    198.195  198.141
186.crafty  1546.115 1545.978 1548.002 1547.723 1547.799 1547.723
197.parser  3780.037 3780.017 3773.602 3773.535 3773.579 3773.602
252.eon     2776.173 2776.205 2778.144 2778.119 2778.048 2778.048
253.perlbmk 2592.829 2558.778 2594.292 2594.147 2594.408 2594.147
256.bzip2   1198.577 1306.549 1310.027 1310.033 1311.768 1310.027
300.twolf   2918.948 2919.119 2925.63  2926.117 2925.812 2925.63
      
ARM guest with optimizations:
Test name       #1       #2       #3       #4       #5    Median    Gain
164.gzip    1399.441 1399.356 1416.72  1416.728 1416.728 1416.72   0.04%
175.vpr     1237.045 1143.302 1236.568 1236.503 1236.497 1236.503   1.05%
176.gcc      919.443  919.588  919.675  919.939  906.544  919.588  -0.67%
181.mcf      198.034  198.894  195.263  195.481  195.584  195.584   1.29%
186.crafty  1522.338 1520.968 1521.359 1521.222 1521.355 1521.355   1.70%
197.parser  3787.424 3787.306 3790.889 3791.066 3791.165 3790.889  -0.46%
252.eon     2749.335 2749.254 2750.692 2750.615 2750.678 2750.615   0.99%
253.perlbmk 2479.28  2568.318 2566.599 2566.574 2566.499 2566.574   1.06%
256.bzip2   1297.906 1276.943 1301.607 1301.957 1301.601 1301.601   0.64%
300.twolf   2887.985 2888.23  2882.813 2882.955 2882.533 2882.955   1.46%
 

x86_64 guest without optimizations:
Test name       #1       #2       #3       #4       #5    Median
164.gzip     857.69   857.671  857.661  857.615  857.645  857.661
175.vpr      959.342  959.309  959.274  914.857  959.214  959.274
176.gcc      646.671  646.626  609.978  646.604  646.64   646.626
181.mcf      221.225  221.377  219.661  221.949  220.563  221.225
186.crafty  1129.716 1129.689 1129.636 1129.536 1129.602 1129.636
197.parser  1809.341 1809.494 1809.341 1809.369 1809.256 1809.341
253.perlbmk 1788.619 1679.546 1729.817 1787.017 1785.432 1785.432
254.gap     1061.071 1061.088 1061.072 1061.057 1061.063 1061.071
255.vortex  1914.02  1913.973 1914.048 1742.677 1914.072 1914.02
256.bzip2   1011.95  1011.86  1011.996 1012.023 1012.144 1011.996
300.twolf   1331.837 1330.556 1330.518 1330.554 1330.58  1330.556
      
x86_64 guest with optimizations:
Test name       #1       #2       #3       #4       #5    Median    Gain
164.gzip     863.013  863.008  863.027  863.042  848.468  863.013  -0.62%
175.vpr      970.454  970.685  971.395  970.667  970.68   970.68   -1.19%
176.gcc      644.71   644.698  644.652  636.313  644.711  644.698   0.30%
181.mcf      216.047  219.63   217.556  218.116  219.185  218.116   1.41%
186.crafty  1129.916 1130.078 1129.925 1129.93  1129.893 1129.925  -0.03%
197.parser  1829.2   1829.294 1829.347 1829.381 1829.394 1829.347  -1.11%
253.perlbmk 1769.039 1767.712 1738.613 1769.017 1768.858 1768.858   0.93%
254.gap     1062.494 1062.454 1062.522 1062.407 1062.488 1062.488  -0.13%
255.vortex  1929.135 1928.734 1930.285 1902.448 1928.92  1928.92   -0.78%
256.bzip2   1015.546 1015.64  1015.492 1015.758 1016.62  1015.64   -0.36%
300.twolf   1325.163 1325.249 1325.385 1325.098 1325.116 1325.163   0.41%
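
The Gain column is not defined in the text.  The figures are consistent with
the relative reduction in median runtime, (base - opt) / base, so the sketch
below assumes that definition.  The function names are hypothetical and only
serve to show the arithmetic.

```c
#include <assert.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of five runtimes; sorts a private copy of the input. */
static double median5(const double runs[5])
{
    double s[5];

    for (int i = 0; i < 5; i++) {
        s[i] = runs[i];
    }
    qsort(s, 5, sizeof(s[0]), cmp_double);
    return s[2];
}

/* Assumed definition of Gain: percentage by which the median runtime
   dropped.  Negative values mean the optimized build was slower. */
static double gain_percent(double base_median, double opt_median)
{
    assert(base_median > 0.0);
    return (base_median - opt_median) / base_median * 100.0;
}
```

For 181.mcf on the ARM guest the medians are 198.141 and 195.584 seconds,
which under this definition gives a gain of about 1.29%.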

254.gap and 255.vortex on the ARM guest and 252.eon on the x86_64 guest do
not work under QEMU for some unrelated reason, so they are missing from the
tables above.

Changes:
v1 -> v2
 - State and Vals arrays merged into an array of structures.
 - Added reference counting of temp's copies. This helps to reset temp's state
   faster in most cases.
 - Do not make copy propagation through operations with TCG_OPF_CALL_CLOBBER or
   TCG_OPF_SIDE_EFFECTS flag.
 - Split some expression simplifications into an independent switch.
 - Let the compiler handle signed shifts and sign/zero extends in its
   implementation-defined way.
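
As an illustration of the first two items, here is a rough sketch of per-temp
state kept in one array of structures, with a count of copies on the
original.  It is an assumption about the general shape, not the layout used
in tcg/optimize.c: every sk_* name, the state set and the fixed temp count
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_NB_TEMPS 8

enum sk_state { SK_ANY, SK_CONST, SK_COPY, SK_HAS_COPY };

/* One record per temp instead of separate state and value arrays. */
struct sk_temp_info {
    enum sk_state state;
    uint16_t num_copies;  /* live copies of this temp (SK_HAS_COPY only) */
    uint32_t val;         /* constant value, or index of the original
                             temp when state == SK_COPY */
};

static struct sk_temp_info sk_temps[SK_NB_TEMPS];

/* Forget everything known about temp t, e.g. because it is overwritten. */
static void sk_reset(int t)
{
    struct sk_temp_info *ti;

    assert(t >= 0 && t < SK_NB_TEMPS);
    ti = &sk_temps[t];
    if (ti->state == SK_COPY) {
        /* Fast path: just drop one reference on the original. */
        struct sk_temp_info *orig = &sk_temps[ti->val];
        if (--orig->num_copies == 0) {
            orig->state = SK_ANY;
        }
    } else if (ti->state == SK_HAS_COPY) {
        /* Slow path: invalidate copies, stopping once all are found. */
        for (int i = 0; i < SK_NB_TEMPS && ti->num_copies > 0; i++) {
            if (sk_temps[i].state == SK_COPY
                && sk_temps[i].val == (uint32_t)t) {
                sk_temps[i].state = SK_ANY;
                ti->num_copies--;
            }
        }
    }
    ti->state = SK_ANY;
}

/* Record the effect of "mov dst, src". */
static void sk_set_copy(int dst, int src)
{
    if (sk_temps[src].state == SK_COPY) {
        src = (int)sk_temps[src].val;   /* always point at the original */
    }
    if (dst == src) {
        return;
    }
    sk_reset(dst);
    if (sk_temps[src].state == SK_CONST) {
        sk_temps[dst].state = SK_CONST; /* copying a constant: stay constant */
        sk_temps[dst].val = sk_temps[src].val;
        return;
    }
    sk_temps[dst].state = SK_COPY;
    sk_temps[dst].val = (uint32_t)src;
    sk_temps[src].state = SK_HAS_COPY;
    sk_temps[src].num_copies++;
}
```

With the count in place, a temp whose copies have all been overwritten drops
back to the plain state, so resetting it later needs no scan over the other
temps.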

Kirill Batuzov (6):
  Add TCG optimizations stub
  Add copy and constant propagation.
  Do constant folding for basic arithmetic operations.
  Do constant folding for boolean operations.
  Do constant folding for shift operations.
  Do constant folding for unary operations.

 Makefile.target |    2 +-
 tcg/optimize.c  |  633 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c       |    6 +
 tcg/tcg.h       |    3 +
 4 files changed, 643 insertions(+), 1 deletions(-)
 create mode 100644 tcg/optimize.c

-- 
1.7.4.1
