avr-gcc-list

Re: [avr-gcc-list] '-morder' option with Avr-libc: comparison table


From: Dmitry K.
Subject: Re: [avr-gcc-list] '-morder' option with Avr-libc: comparison table
Date: Mon, 14 Jan 2008 17:16:22 +1000
User-agent: KMail/1.5

On Monday 14 January 2008 10:55, Andy wrote:
> Great work!
>
> morder1 and my order were very close. I think the difference only
> becomes apparent when operands are 4 bytes or longer. morder2 is very
> bad as early assignment of a byte in an odd register will bump
> assignment of  wider operands. So you get a few extra moves. (which are
> more of a problem on At90s8515, than mega)

Hi.
There are a few functions where your order is considerably
better than '-morder1' (although in the summary '-morder1' is
slightly better). Below I include the full reports for both.

> So I would expect floating point only to show difference between my
> order and morder1. Also note, mcall prolog will hide stack usage effect
> on text (push/pop) size- since any number of push/pop >0 has same size.
> (I tend to leave off mcall prolog as saftey check on stack impact)
>
> How did you determine stack usage ?

This is the real stack usage, regardless of whether the
'-mcall-prologues' option is used: it is determined by
simulation.  The algorithm is:
  - fill the stack with some value
  - run the target function
  - find the first corrupted byte in the stack (depth)
  - repeat the above with another fill value (to avoid a
  possible collision between the fill value and written data)
  - link the program with a dummy function ('reti')
  - repeat the above
  - subtract the two depths and add 2 (the stack usage of
  the dummy function)
The resulting stack usage:
  - includes all internally called functions
  - does not include stack for arguments (printf, scanf).
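
The steps above can be modelled on a host machine. What follows is only
a rough sketch in Python, not the simulator code actually used: the
names (STACK_SIZE, make_function, measure_depth, depth_of, stack_usage),
the fill values 0x55/0xAA and the 10-byte harness overhead are all
hypothetical, and taking the larger of the two measured depths is an
assumption about how the second fill value is used.

```python
STACK_SIZE = 256  # hypothetical size of the simulated stack area


def make_function(depth):
    """Stand-in for a linked program whose run dirties `depth` stack bytes."""
    def run(stack):
        # The AVR stack grows downward, so used bytes sit at the top end.
        for i in range(len(stack) - depth, len(stack)):
            stack[i] = (i * 7 + 1) & 0xFF  # arbitrary data left on the stack
    return run


def measure_depth(run, fill):
    """Fill the stack, run the function, locate the first corrupted byte."""
    stack = bytearray([fill]) * STACK_SIZE
    run(stack)
    for i, byte in enumerate(stack):
        if byte != fill:
            return STACK_SIZE - i
    return 0  # nothing was overwritten


def depth_of(run, fills=(0x55, 0xAA)):
    # A written byte may happen to equal one fill value and hide itself;
    # it cannot equal both, so keep the larger depth (assumption).
    return max(measure_depth(run, fill) for fill in fills)


def stack_usage(target, dummy, dummy_usage=2):
    # Subtract the two depths and add the stack usage of the dummy function.
    return depth_of(target) - depth_of(dummy) + dummy_usage


# Hypothetical case: the test harness itself needs 10 bytes, the target
# function 16 bytes, and the dummy function 2 bytes (its return address).
target = make_function(10 + 16)
dummy = make_function(10 + 2)
print(stack_usage(target, dummy))  # prints 16
```

Subtracting the dummy run cancels whatever the harness itself puts on
the stack, so only the target function's own usage remains.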

> -frename-registers is not well described - no idea what it does! I tend
> to use -Os as benchmark - which excludes this.

To be precise, all test cases include '-Os' in the option list;
'-frename-registers' was used in addition.

Regards,
Dmitry.


The full report for the case with '-morder1' added:

AVR:   at90s8515__________________________  atmega8____________________________ 
GCC:   3.3.6 3.4.6 4.0.4 4.1.2 4.2.2 4.3.X  3.3.6 3.4.6 4.0.4 4.1.2 4.2.2 4.3.X 
-------------------------------------------------------------------------------
bsearch("z",s,sizeof(s),1,cmp)
Flash:   276   270   266   266   266   268    214   212   208   208   208   204 
Stack:    16    16    16    16    16    16     16    16    16    16    16    16 
 Time:   534   533   530   530   530   530    327   334   331   331   331   327 
-------------------------------------------------------------------------------
dtostre(1.2345,s,6,0)
Flash:  1000   996  1102  1100  1090  1172    932   928  1018  1004   998  1078 
Stack:    15    15    15    19    19    17     15    15    15    19    19    17 
 Time:  1197  1196  1284  1296  1296  1290   1058  1057  1118  1136  1136  1140 
-------------------------------------------------------------------------------
dtostrf(1.2345,15,6,s)
Flash:  1666  1688  1648  1666  1632  1692   1514  1530  1522  1544  1506  1570 
Stack:    35    38    39    38    37    39     35    38    39    38    37    39 
 Time:  1668  1619  1653  1604  1595  1616   1479  1436  1480  1442  1432  1455 
-------------------------------------------------------------------------------
free(p)
Flash:   550   558   552   552   562   572    498   506   510   510   516   516 
Stack:     4     4     4     4     4     4      4     4     4     4     4     4 
 Time:   220   227   227   227   229   228    200   207   210   210   211   209 
-------------------------------------------------------------------------------
malloc(1)
Flash:   550   558   552   552   562   572    498   506   510   510   516   516 
Stack:     2     4     4     4     4     4      2     4     4     4     4     4 
 Time:   184   191   193   193   195   194    166   173   177   177   178   176 
-------------------------------------------------------------------------------
qsort(s,sizeof(s),1,cmp)
Flash:  1332  1306  1220  1222  1242  1488   1074  1070   994   996  1008  1262 
Stack:    36    36    36    36    38    40     36    36    36    36    38    40 
 Time: 22017 21944 20182 20474 20914 20949  16965 16955 16002 16294 16678 16854 
-------------------------------------------------------------------------------
rand()
Flash:   528   528   498   492   508   498    498   498   478   480   484   456 
Stack:    18    18    18    18    18    18     18    18    18    18    18    18 
 Time:  1493  1493  1484  1484  1488  1484   1491  1491  1482  1482  1484  1475 
-------------------------------------------------------------------------------
realloc((void*)0,1)
Flash:  1180  1190  1156  1140  1172  1172   1052  1060  1052  1044  1056  1052 
Stack:    20    22    20    18    22    20     20    22    20    18    22    20 
 Time:   300   307   301   293   311   304    277   284   280   272   289   279 
-------------------------------------------------------------------------------
sprintf_min(s,"%d",12345)
Flash:  1266  1232  1280  1200  1204  1174   1126  1106  1150  1074  1076  1046 
Stack:    51    51    54    53    59    53     51    51    54    53    59    53 
 Time:  1826  1821  1803  1809  1844  1807   1686  1686  1672  1677  1710  1673 
-------------------------------------------------------------------------------
sprintf(s,"%d",12345)
Flash:  1698  1670  1696  1632  1664  1606   1518  1496  1518  1454  1490  1422 
Stack:    54    54    57    57    58    57     54    54    57    57    58    57 
 Time:  1631  1626  1637  1616  1608  1619   1544  1545  1554  1535  1527  1535 
-------------------------------------------------------------------------------
sprintf_flt(s,"%e",1.2345)
Flash:  3398  3312  3292  3254  3330  3206   3088  3032  2998  2960  3038  2936 
Stack:    61    61    63    64    66    65     61    61    63    64    66    65 
 Time:  2509  2494  2500  2482  2513  2503   2280  2276  2282  2263  2297  2292 
-------------------------------------------------------------------------------
sscanf_min("12345","%d",&i)
Flash:  1456  1444  1474  1474  1482  1500   1336  1328  1342  1344  1352  1366 
Stack:    49    49    53    53    59    55     49    49    53    53    59    55 
 Time:  1679  1657  1625  1628  1623  1690   1388  1372  1344  1347  1341  1397 
-------------------------------------------------------------------------------
sscanf("12345","%d",&i)
Flash:  1796  1768  1890  1872  1832  1874   1638  1614  1688  1674  1652  1674 
Stack:    50    50    54    54    61    56     50    50    54    54    61    56 
 Time:  1713  1701  1699  1694  1739  1761   1430  1419  1422  1417  1451  1473 
-------------------------------------------------------------------------------
sscanf_flt("1.2345","%e",&x)
Flash:  4122  4066  4188  4110  4210  4468   3790  3744  3796  3756  3848  4088 
Stack:   124   124   126   128   140   132    124   124   126   128   140   132 
 Time:  3029  3019  3022  3031  3092  3171   2656  2651  2647  2659  2715  2784 
-------------------------------------------------------------------------------
strtod("1.2345",&p)
Flash:  1534  1516  1666  1622  1622  1902   1434  1420  1526  1496  1496  1760 
Stack:    20    20    20    22    22    20     20    20    20    22    22    20 
 Time:  1238  1234  1271  1268  1266  1294    977   975  1004  1005  1003  1032 
-------------------------------------------------------------------------------
strtol("12345",&p,0)
Flash:   772   802   748   840   896   804    746   766   736   798   800   748 
Stack:    16    16    17    17    25    21     16    20    21    21    21    21 
 Time:   896   918   863   915   999   904    675   692   649   689   700   663 
-------------------------------------------------------------------------------
strtoul("12345",&p,0)
Flash:   742   740   786   752   804   866    716   720   820   794   778   824 
Stack:    16    16    19    19    25    27     16    20    25    25    25    27 
 Time:   889   887   904   886   951  1019    668   673   740   726   726   772 
===============================================================================
Summary
Flash: 23866 23644 24014 23746 24078 24834  21672 21536 21866 21646 21822 22518 
Stack:   587   594   615   620   673   644    587   602   625   630   669   644 
 Time: 43023 42867 41178 41430 42193 42363  35267 35226 34394 34662 35209 35536 


The full report for the case with Andrew Hutchinson's experimental order:

AVR:   at90s8515__________________________  atmega8____________________________ 
GCC:   3.3.6 3.4.6 4.0.4 4.1.2 4.2.2 4.3.X  3.3.6 3.4.6 4.0.4 4.1.2 4.2.2 4.3.X 
-------------------------------------------------------------------------------
bsearch("z",s,sizeof(s),1,cmp)
Flash:   280   274   266   266   266   268    216   214   208   208   208   212 
Stack:    16    16    16    16    16    16     16    16    16    16    16    16 
 Time:   536   535   530   530   530   530    328   335   331   331   331   331 
-------------------------------------------------------------------------------
dtostre(1.2345,s,6,0)
Flash:   998   996  1108  1098  1084  1088    930   928  1024  1002   994   994 
Stack:    15    15    16    19    19    19     15    15    16    19    19    19 
 Time:  1196  1196  1291  1298  1296  1297   1057  1057  1124  1137  1136  1136 
-------------------------------------------------------------------------------
dtostrf(1.2345,15,6,s)
Flash:  1632  1684  1650  1688  1650  1702   1474  1526  1526  1560  1520  1576 
Stack:    37    38    39    38    39    39     37    38    39    38    39    39 
 Time:  1682  1619  1677  1627  1625  1638   1487  1436  1496  1458  1455  1470 
-------------------------------------------------------------------------------
free(p)
Flash:   544   542   556   556   556   576    494   492   512   512   510   524 
Stack:     4     4     4     4     4     6      4     4     4     4     4     6 
 Time:   228   223   227   227   229   238    209   205   210   210   211   218 
-------------------------------------------------------------------------------
malloc(1)
Flash:   544   542   556   556   556   576    494   492   512   512   510   524 
Stack:     4     4     4     4     4     6      4     4     4     4     4     6 
 Time:   194   189   193   193   195   204    176   172   177   177   178   185 
-------------------------------------------------------------------------------
qsort(s,sizeof(s),1,cmp)
Flash:  1360  1328  1228  1230  1250  1538   1092  1084  1002  1004  1016  1284 
Stack:    36    36    36    36    38    40     36    36    36    36    38    40 
 Time: 22627 22558 20782 21074 21514 20454  17570 17558 16602 16894 17278 15982 
-------------------------------------------------------------------------------
rand()
Flash:   528   528   498   492   508   498    498   498   478   480   484   456 
Stack:    18    18    18    18    18    18     18    18    18    18    18    18 
 Time:  1493  1493  1484  1484  1488  1484   1491  1491  1482  1482  1484  1475 
-------------------------------------------------------------------------------
realloc((void*)0,1)
Flash:  1172  1164  1176  1152  1158  1194   1050  1038  1058  1046  1044  1076 
Stack:    22    20    20    20    20    24     22    20    20    20    20    24 
 Time:   310   297   301   301   303   322    287   275   280   280   281   298 
-------------------------------------------------------------------------------
sprintf_min(s,"%d",12345)
Flash:  1274  1240  1284  1204  1204  1174   1130  1110  1152  1076  1076  1046 
Stack:    51    51    54    53    59    53     51    51    54    53    59    53 
 Time:  1838  1833  1813  1819  1844  1807   1692  1692  1677  1682  1710  1673 
-------------------------------------------------------------------------------
sprintf(s,"%d",12345)
Flash:  1706  1678  1700  1636  1664  1606   1522  1500  1520  1456  1490  1422 
Stack:    54    54    57    57    58    57     54    54    57    57    58    57 
 Time:  1643  1638  1647  1626  1608  1619   1550  1551  1559  1540  1527  1535 
-------------------------------------------------------------------------------
sprintf_flt(s,"%e",1.2345)
Flash:  3422  3332  3296  3258  3324  3200   3100  3042  2996  2958  3032  2930 
Stack:    61    61    63    64    66    65     61    61    63    64    66    65 
 Time:  2523  2508  2511  2493  2512  2502   2287  2283  2287  2268  2296  2291 
-------------------------------------------------------------------------------
sscanf_min("12345","%d",&i)
Flash:  1464  1452  1496  1494  1494  1518   1340  1332  1358  1360  1360  1380 
Stack:    49    49    53    53    59    55     49    49    53    53    59    55 
 Time:  1679  1657  1625  1628  1623  1690   1388  1372  1344  1347  1341  1397 
-------------------------------------------------------------------------------
sscanf("12345","%d",&i)
Flash:  1810  1782  1906  1888  1844  1886   1648  1624  1702  1688  1660  1682 
Stack:    50    50    54    54    61    56     50    50    54    54    61    56 
 Time:  1713  1701  1689  1694  1739  1756   1430  1419  1412  1417  1451  1468 
-------------------------------------------------------------------------------
sscanf_flt("1.2345","%e",&x)
Flash:  4152  4090  4222  4134  4222  4492   3810  3760  3822  3774  3858  4110 
Stack:   124   124   130   128   140   132    124   124   130   128   140   132 
 Time:  3031  3021  3041  3035  3094  3173   2658  2653  2670  2663  2717  2788 
-------------------------------------------------------------------------------
strtod("1.2345",&p)
Flash:  1534  1516  1672  1622  1622  1902   1434  1420  1528  1496  1496  1764 
Stack:    20    20    24    22    22    20     20    20    24    22    22    20 
 Time:  1238  1234  1288  1268  1266  1294    977   975  1025  1005  1003  1034 
-------------------------------------------------------------------------------
strtol("12345",&p,0)
Flash:   772   802   748   840   900   804    746   766   736   798   800   748 
Stack:    16    16    17    17    25    21     16    20    21    21    21    21 
 Time:   896   918   863   915   999   904    675   692   649   689   700   663 
-------------------------------------------------------------------------------
strtoul("12345",&p,0)
Flash:   742   740   786   752   800   866    716   720   816   790   774   824 
Stack:    16    16    19    19    25    27     16    20    25    25    25    27 
 Time:   889   887   904   886   951  1019    668   673   740   726   726   772 
===============================================================================
Summary
Flash: 23934 23690 24148 23866 24102 24888  21694 21546 21950 21720 21832 22552 
Stack:   593   592   624   622   673   654    593   600   634   632   669   654 
 Time: 43716 43507 41866 42098 42816 41931  35930 35839 35065 35306 35825 34716 
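
To compare the two Summary blocks column by column, a small script like
the one below may help. It is only a sketch: the numbers are copied from
the Summary rows of the two reports, while the names (order1, andy,
deltas) are made up for illustration. A positive delta means the
experimental order gives the larger value.

```python
# Summary rows copied from the two reports.
# Columns: at90s8515 (GCC 3.3.6 ... 4.3.X), then atmega8 (same GCC versions).
order1 = {
    "Flash": [23866, 23644, 24014, 23746, 24078, 24834,
              21672, 21536, 21866, 21646, 21822, 22518],
    "Stack": [587, 594, 615, 620, 673, 644,
              587, 602, 625, 630, 669, 644],
    "Time":  [43023, 42867, 41178, 41430, 42193, 42363,
              35267, 35226, 34394, 34662, 35209, 35536],
}
andy = {
    "Flash": [23934, 23690, 24148, 23866, 24102, 24888,
              21694, 21546, 21950, 21720, 21832, 22552],
    "Stack": [593, 592, 624, 622, 673, 654,
              593, 600, 634, 632, 669, 654],
    "Time":  [43716, 43507, 41866, 42098, 42816, 41931,
              35930, 35839, 35065, 35306, 35825, 34716],
}


def deltas(metric):
    """Experimental order minus '-morder1', one value per table column."""
    return [a - b for a, b in zip(andy[metric], order1[metric])]


for metric in ("Flash", "Stack", "Time"):
    print(metric, deltas(metric))
```

With these rows the Flash deltas are positive in every column, and the
Time deltas are negative only in the two 4.3.X columns.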

End of list




