From: | Daniel J Sebald |
Subject: | Re: For loop benchmarks between Octave versions |
Date: | Wed, 18 Jun 2014 13:08:32 -0500 |
User-agent: | Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16 |
Rik,

Below are some test results for your benchmark. They pretty much reflect the trends you found, with some discrepancies.
John/Rik,

I've found a new account/permission error in compilation for the latest development release. (See *** in results below.) I've reported a bug on Savannah (#42581).
Comments:

1) Yes, something happened along the way to slow down looping or (more likely) variable (re)assignment. However, it looks like a gradual change up until the GUI release and then a big jump associated with the GUI release. I wouldn't conclude that the slowdown is attributable to the GUI release, though, given the sporadic sampling of versions.
2) I'd say look as far back as 3.5/3.6 because there was a very good improvement in efficiency with version 3.4.3 that was lost already with 3.6.4.
3) There is a loss associated with GUI/non-GUI within the same version. I suspect that is simply because the core is running in a thread and the process must distribute time between the GUI and worker threads.
4) We could possibly devise a pass-fail speed test with this for-loop. If there were some system command or script that could give a reasonably good measure of system speed (e.g., the processor speed or something), one could compare the for-loop result against that. If the ratio is greater than some hardcoded value, it might throw an error. Seems like a lot of effort, though, just to catch an unknowing introduction of some loss in performance.
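A rough sketch of how the pass/fail idea in comment 4 might look: time the for-loop, time a reference workload on the same machine, and compare the ratio against a limit. The reference workload (a repeated vectorized sum), the variable names, and the value of max_ratio are all assumptions made for illustration; a usable limit would have to be calibrated against known-good builds.

```octave
% Sketch only: the reference workload and max_ratio are assumed values.
n = 1000;

% Reference measurement: a vectorized sum over n*n elements, repeated so
% the elapsed time is well above timer resolution. This runs in compiled
% code, so it tracks machine speed rather than interpreter overhead.
x = ones (n * n, 1);
tic;
for k = 1:100
  s = sum (x);
end
t_ref = toc;

% The double for-loop from the benchmark.
a = 1; b = 1;
tic;
for i = 1:n
  for j = 1:n
    a = a + b;
  end
end
t_loop = toc;

% Compare the interpreter-bound time against the machine-bound time.
ratio = t_loop / t_ref;
max_ratio = 100;   % hypothetical limit, not a calibrated value
passed = (ratio <= max_ratio);

if (passed)
  fprintf ("PASS: loop/reference ratio %.1f (limit %.1f)\n", ratio, max_ratio);
else
  fprintf ("FAIL: loop/reference ratio %.1f (limit %.1f)\n", ratio, max_ratio);
end
```

In an automated test the FAIL branch could call error() instead of printing. Because both timings come from the same machine, the ratio should be less sensitive to processor speed than the raw loop time, although that assumption would need checking across systems.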
Dan

Command from Rik:

--- Test Code ---
a = 1; b = 1; tic; for i=1:1000; for j=1:1000; a = a + b; end; end; toc
--- End Test Code ---

Options: CFLAGS=-O2
Compiler: gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC)
Machine: i686, 3.16GHz
Kernel: Linux 2.6.35.14-106.fc14.x86_64

As root:

Version                                            Average
3.2.4               0.870757  0.862346  0.873173 | 0.86876
3.4.3               0.640826  0.647751  0.647613 | 0.64540
3.6.4               0.936848  0.936887  0.959168 | 0.94430
3.8.0               1.07859   1.06831   1.09116  | 1.0794
3.8.1               1.04754   1.02603   1.08625  | 1.0533
gui-release*
  (no-gui-libs)     2.27344   2.27286   2.26936  | 2.2719
  (no-gui)**
dev*
  (no-gui-libs)     2.36133   2.36267   2.36704  | 2.3637
  (no-gui)**

*No desktop account for root.
**Same reason as * because possible plotting required.

As non-root:

Version                                            Average
3.2.4               0.881909  0.879953  0.882921 | 0.88159
3.4.3               0.741799  0.743709  0.724818 | 0.73678
3.6.4               0.912584  0.890552  0.904618 | 0.90258
3.8.0               1.12991   1.12122   1.13149  | 1.1275
3.8.1               1.09307   1.09207   1.09     | 1.0917
gui-release         2.76753   2.75898   2.77501  | 2.7672
  (no-gui-libs)     2.34878   2.3494    2.35117  | 2.3498
  (no-gui)          2.66844   2.66183   2.68225  | 2.6708
dev                 2.75023   2.75967   2.74137  | 2.7504
  (no-gui-libs)***
  (no-gui)          2.68714   2.6861    2.69383  | 2.6890

***Unable to open file from non-user account with error:

/usr/local/src/octave/octave-for_loop_test/build-dev/run-octave --no-gui-libs
/usr/bin/ld: cannot open output file /usr/local/src/octave/octave-for_loop_test/build-dev/src/.libs/5665-lt-octave-cli: Permission denied
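To put numbers on comments 2 and 3, the averages and relative slowdowns can be recomputed from the non-root timings listed above. This is only a sketch of the arithmetic; the variable names are made up for illustration, and the data are copied from the results as posted.

```octave
% Non-root timings in seconds, three runs each, copied from the results.
runs_343    = [0.741799 0.743709 0.724818];   % 3.4.3
runs_381    = [1.09307  1.09207  1.09];       % 3.8.1
runs_gui    = [2.76753  2.75898  2.77501];    % gui-release (GUI)
runs_nolibs = [2.34878  2.3494   2.35117];    % gui-release --no-gui-libs

m343    = mean (runs_343);
m381    = mean (runs_381);
mgui    = mean (runs_gui);
mnolibs = mean (runs_nolibs);

% Slowdown of later versions relative to 3.4.3, the fastest one measured.
slow_381 = m381 / m343;
slow_gui = mgui / m343;

% Extra cost of running with the GUI within the same gui-release build.
gui_overhead_pct = 100 * (mgui / mnolibs - 1);

fprintf ("3.8.1 vs 3.4.3:        %.2fx\n", slow_381);
fprintf ("gui-release vs 3.4.3:  %.2fx\n", slow_gui);
fprintf ("GUI vs --no-gui-libs:  %.1f%% slower\n", gui_overhead_pct);
```

The last figure isolates the GUI/non-GUI difference within one build, separate from the version-to-version trend.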
On 06/06/2014 11:16 AM, Rik wrote:
6/6/14

All,

I followed up one of the recent posts about differing performance between Octave versions. Alas, I think it is quite true. We don't yet have 100% test coverage for behavior, but it seems like another wish list item would be a set of benchmarking tests so that we don't accidentally introduce large performance regressions.

The test code in question is a double for loop. We expect for loops to be inefficient and slow, but at least they shouldn't get any worse than they already are.

--- Test Code ---
a = 1; b = 1; tic; for i=1:1000; for j=1:1000; a = a + b; end; end; toc
--- End Test Code ---

I fixed the CPU frequency before running the tests to avoid differences due to frequency scaling. I experimented a bit with taskset to ensure the Octave process stayed on a single core, but it didn't make a difference so I stopped using it (I would expect this since the Linux kernel also tries for performance reasons not to shift running jobs off of a core).

All of the versions are locally compiled on a Linux machine using roughly the same options of '-O2'.

Results:

Version   3.2.4    3.4.3    3.6.4    3.8.0    3.8.1    gui-release     dev             dev
                                                       (no-gui-libs)   (no-gui-libs)   (gui)
          2.24418  1.71425  2.95948  3.4683   3.15712  6.50268         7.00682         8.09868
          2.24436  1.71648  2.97603  3.43708  3.15531  6.47645         6.93558         8.0038
          2.25401  1.7231   2.96663  3.46252  3.19536  6.46978         6.94941         8.07056
Mean      2.24752  1.71794  2.96738  3.45597  3.16926  6.48297         6.96394         8.05768

There was a doubling in execution time from 3.4.3 to 3.6.4, mostly unchanged from 3.6.4 to 3.8.1, and then another doubling on the gui-release branch.

The gui-release branch and other releases going forward are compiled with atomic refcounting, which is necessary for the GUI but does have a performance penalty. I'm re-compiling the gui-release branch with atomic refcounting disabled to see how much it improves things, but I don't have high hopes. The last time I benchmarked this it was only a 10% difference, rather than the 100% difference that the tests show.

--Rik
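The measurement procedure in the quoted message (several runs of the same loop, then a mean) could be scripted roughly as follows. This is a sketch, not the script that was used: the variable names and the spread figure are assumptions, and nruns = 3 simply mirrors the three result rows. Fixing the CPU frequency or pinning the process with taskset happens outside Octave and is not shown.

```octave
% Sketch: repeat the double for-loop benchmark and report each run plus the mean.
nruns = 3;               % mirrors the three result rows; otherwise arbitrary
t = zeros (1, nruns);

for k = 1:nruns
  a = 1; b = 1;
  tic;
  for i = 1:1000
    for j = 1:1000
      a = a + b;
    end
  end
  t(k) = toc;            % elapsed wall-clock seconds for this run
end

t_mean = mean (t);
% Spread between slowest and fastest run, as a percentage of the mean.
spread_pct = 100 * (max (t) - min (t)) / t_mean;

fprintf ("%g  ", t);
fprintf ("\n");
fprintf ("Mean %g (spread %.1f%%)\n", t_mean, spread_pct);
```

A large spread between runs would suggest the machine was not quiet enough for the means to be comparable across versions.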