From: Michael D. Godfrey
Subject: Re: For loop benchmarks between Octave versions
Date: Tue, 24 Jun 2014 01:15:15 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0
On 06/23/2014 11:05 PM, Daniel J Sebald wrote:
> On 06/23/2014 04:38 PM, Michael D. Godfrey wrote:
>> On 06/23/2014 03:46 PM, Mike Miller wrote:
>>> On Mon, Jun 23, 2014 at 12:35:25 +0100, Carnë Draug wrote:
>>>> On 6 June 2014 17:16, Rik <address@hidden> wrote:
>>>>> 6/6/14
>>>>>
>>>>> All,
>>>>>
>>>>> I followed up one of the recent posts about differing performance
>>>>> between Octave versions. Alas, I think it is quite true. We don't
>>>>> yet have 100% test coverage for behavior, but it seems like another
>>>>> wish-list item would be a set of benchmarking tests so that we don't
>>>>> accidentally introduce large performance regressions.
>>>>>
>>>>> The test code in question is a double for loop. We expect for loops
>>>>> to be inefficient and slow, but at least they shouldn't get any
>>>>> worse than they already are.
>>>>
>>>> Just an idea, but would it be possible to have hydra [1] run such
>>>> checks with each commit?
>>>
>>> We certainly could run some benchmark test with each build (each hydra
>>> build ≠ each hg commit). The first step would be to commit something
>>> to the repo, a benchmark make target maybe?
>>>
>>> Once we have something automated that works, how do we compare it over
>>> time? I don't know that the Octave hydra build is done on the same
>>> (real or virtual) hardware each time, or what other builds or resource
>>> problems it might be contending with at the same time. Even if we
>>> could be sure that the timing output were reliable and comparable and
>>> the hydra build host never changed, is someone going to look at the
>>> logs every day? I only look at them when I get a notification that a
>>> build failed. That doesn't even include test failures.
>>>
>>> -- mike
>>
>> To get this to work in a useful way, at least the following will be
>> needed:
>>
>> 1. Suitable scripts to test performance.
>> 2. A table of machines and OSes that are considered typical or at least
>>    accessible to developers. This table would include hardware/software
>>    configuration and the current "base" performance values.
>> 3. Each test machine should be configured to report its ID when queried
>>    by the performance script. Then it would report the current
>>    performance and its percent of the base value.
>> 4. If the script is run on a new machine, it should offer choices of
>>    known machines for the user to choose the one closest to the new
>>    machine.
>>
>> This is just a start, but having ANY basis for tracking performance is
>> better than the current state.
>>
>> Michael
>
> Don't forget relative performance tests. Look back at the plot/table
> from my previous post on the subject. Comparing the for-loop with the
> instruction "a=a+b" against the for-loop with the instruction "1"
> showed a big jump in relative CPU consumption somewhere along the way.
> The question is whether that number is stable enough across compilers
> and hardware to be useful.
>
> Dan

Of course, an historical log is written with the relevant information
(TIP, date, configuration, etc., ...) so that the history can be viewed.

Michael
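For reference, a minimal sketch of the kind of relative for-loop benchmark
Dan describes: time the double loop whose body is "a = a + b" against the
same loop whose body is just "1" and report the ratio. The loop size N and
the use of cputime are illustrative assumptions, not anything settled in
this thread:

    ## Relative for-loop benchmark sketch (illustrative N, cputime-based).
    N = 1000;

    a = 0;  b = 1;
    t0 = cputime ();
    for i = 1:N
      for j = 1:N
        a = a + b;
      endfor
    endfor
    t_work = cputime () - t0;

    t0 = cputime ();
    for i = 1:N
      for j = 1:N
        1;
      endfor
    endfor
    t_ref = cputime () - t0;

    printf ("a=a+b: %.3f s   '1': %.3f s   ratio: %.2f\n",
            t_work, t_ref, t_work / t_ref);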
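And a rough sketch of the reporting step in Michael's list (items 2 and 3):
identify the machine, look it up in a table of base values, and print the
current result as a percent of base. The uname-based machine ID, the inline
table, and the function name are assumptions, only meant to show the shape
such a script might take:

    ## Hypothetical reporting helper (names and table format are assumed).
    function benchmark_report (t_current)
      u = uname ();
      machine_id = [u.machine " " u.sysname " " u.release];
      ## Stand-in for item 2's table of machines and base times (seconds).
      base = struct ("id", {"x86_64 Linux 3.13"}, "t", {1.00});
      idx = find (strcmp ({base.id}, machine_id));
      if (isempty (idx))
        printf ("unknown machine '%s'; pick the closest known entry (item 4)\n",
                machine_id);
      else
        printf ("%s: %.3f s = %.1f%% of base (%.3f s)\n",
                machine_id, t_current, 100 * t_current / base(idx).t,
                base(idx).t);
      endif
    endfunction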