freepooma-devel

timers and performance measurement under Linux


From: James Crotinger
Subject: timers and performance measurement under Linux
Date: Sun, 5 Aug 2001 21:37:26 -0600

How important is it that the timer used by the benchmark class measure CPU time? I'm not sure how we've even been using this class for parallel codes. In fact, I think that back when we were just testing with threads, we were measuring wall-clock time. But the default implementation of Utilities/Clock.h uses the "clock()" call, which is supposed to measure "elapsed processor time", though I'm not completely sure what that means for a multithreaded program on a multiprocessor. On the SGI we were using the high-performance timers, which I believe access the CPU's hardware performance registers, so I would think that was CPU time as well.

Anyway, the problem is that clock() only has 10 ms granularity under Linux, and that's really crappy if you're trying to measure scaling with problem size. I've written a version that uses gettimeofday, which has a granularity of something like 15 microseconds under Linux (may depend on processor speed?), but this is definitely not a measure of CPU time. For serial programs this just means that you need to test on an unloaded machine for the results to make sense. For parallel programs, it means that you need to interpret the results differently.

I'm using this version and it seems generally useful, so I'd like to check it in, but I was wondering whether the wall-clock versus CPU-time distinction is a problem for anyone. I suppose we could put an accessor on Clock that would let the user find out what type of time is being reported, and Benchmark could examine this and calculate appropriately, if necessary.
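
To make the accessor idea concrete, here is a rough sketch of two interchangeable clocks, one built on clock() and one on gettimeofday, each reporting which kind of time it measures. All of the names (TimeKind, CpuClock, WallClock, kind(), value(), timeOnce) are made up for illustration and are not the actual Utilities/Clock.h or Benchmark interface; the only real calls are std::clock and the POSIX gettimeofday.

```cpp
#include <cassert>
#include <ctime>
#include <sys/time.h>

// Hypothetical names throughout; not the real Utilities/Clock.h interface.
enum class TimeKind { Cpu, Wall };

// Processor time used by the process, as reported by clock().
struct CpuClock {
    static TimeKind kind() { return TimeKind::Cpu; }
    static double value() {
        return static_cast<double>(std::clock()) / CLOCKS_PER_SEC;
    }
};

// Wall-clock time in seconds since the epoch, from gettimeofday().
struct WallClock {
    static TimeKind kind() { return TimeKind::Wall; }
    static double value() {
        timeval tv;
        const int rc = gettimeofday(&tv, nullptr);
        assert(rc == 0);  // gettimeofday returns 0 on success
        (void)rc;         // avoid an unused-variable warning under NDEBUG
        return static_cast<double>(tv.tv_sec) +
               1.0e-6 * static_cast<double>(tv.tv_usec);
    }
};

// Time a single call of work() with the chosen clock; returns seconds.
template <class ClockT, class Work>
double timeOnce(Work work) {
    const double start = ClockT::value();
    work();
    return ClockT::value() - start;
}
```

A benchmark driver could then call something like timeOnce<WallClock>(kernel) and check WallClock::kind() before deciding how to report the number, for example whether it makes sense to sum times across threads.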

As a side comment, I'm using KCC on one of our 1 GHz PIII Linux boxes. ABCTest (which does a simple non-stencil calculation, so this is as fast as it gets) is only getting 45 MFlops for the C version, which is the fastest. This seems pretty pathetic. What have other Linux users seen, and on what types of boxes? Also, what is available on Linux for profiling? Is prof/gprof it? Does KAI sell anything commercial? Does the PIII have any hardware performance monitoring, and is there access to it? Do things inline sufficiently well under Windows to do performance testing there with VTune (allegedly good profiling tools, so it might be OK for the serial stuff, but I don't know if it is capable of "inlining everything")?

  Jim

