octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Profiling Octave


From: Clinton Chee
Subject: Profiling Octave
Date: Thu, 25 Mar 2004 11:27:32 +1100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624 Netscape/7.1

Hi Paul, Ben, others,

From Paul's recent email see below, I get the impression that octave can be compiled with gprof (-pg). However, when I build octave, it compiles successfully (I think) with the -pg but when I run it, I get:

/home/chee/programs/octave-2.1.57$>src/octave
panic: Profiling timer expired -- stopping myself...
Profiling timer expired


The platform I did this on is a Linux Cluser with the gcc as shown below:

/home/chee/programs/octave-2.1.57$>gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/3.2.2/specs
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --host=i386-redhat-linux
Thread model: posix
gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)


What do I need to do to get the -pg switch to work. Note that I can build octave and run it if I didn't include the -pg switch.
Same applies for the -p (profile) switch.

Cheers,
Clinton


Paul Thomas wrote:

Dear All,

Please find enclosed an intermediate progress report on Ben Diedrich's and my attempts to profile octave and to find out why the gcc-3.3 build under Cygwin is so slow. We are making progress in so far as we have profiles for good and bad versions but their interpretation is not obvious! I hope that the interaction with a gcc expert ( a "gnuru"?) will push us in the right direction. As soon as we come to any conclusions, we will post them here.

Paul Thomas

------------------------------------------------------------------------

Subject:
[Bug c++/14563] octave built under Cygwin very slow
From:
"paulthomas2 at wanadoo dot fr" <address@hidden>
Date:
24 Mar 2004 09:52:20 -0000
To:
address@hidden


------- Additional Comments From paulthomas2 at wanadoo dot fr  2004-03-24 
09:52 -------
Subject: Re:  octave built under Cygwin very slow

bangerth at dealii dot org wrote:

------- Additional Comments From bangerth at dealii dot org  2004-03-14 20:33 
-------
One of the things you could do is to compile the octave version you use with both 3.2.3 and 3.3.3, in each case with profiling information (i.e. with -pg) and check the output to see whether there is one single function that now takes significantly longer (and thus moved up in the output of gprof, in the list of functions sorted by execution time). If there should be one such function, it might be worth to compile just this one function (or file) with the other compiler, to verify that it indeed is the problem. This way we could at least localize the problem a little better. This would also be of great value to us, since we have no clue about the octave functions, and it is hard for us to look at profiling output without knowing what all these functions do etc. If you have identified the function that has the slowdown (assuming that it is a single function), then one would rip it out of the program and place it (and whatever else it needs) into a small file where main() simply calls this function a number of times with dummy arguments. This way we would have a simpler and much smaller testcase. W.


Ben Diedrich and I have done as you suggested. It has to be said that neither of us have tried profiling anything, let alone a code as large and as complicated as octave. I did the gcc-3.2 and Ben did the gcc-3.3. Our machines are nearly identical, both with Windows 2000 and with the same version of Cygwin.

We ran a small test programme in octave that shows a factor of approximately 6 difference in execution time, when octave is built with gcc-3.2 (~1.8-2s) or gcc-3.3 (10-12s). When compiled and linked with -pg set, the execution time increased to 9s with gcc-3.2 (Ben, what is the corresponding figure for gcc-3.3?). Switching off the -O2 flags, increased the time by a further 3s but resulted in a much more detailed profile. In the enclosed, we increased the number of loops by a factor of 10, in order to get a reasonable resolution on the less frequently visited functions. So remember that the corresponding wall-clock time is 90s for cycle 2.

I have added the most pertinent part of the graphical output. If you want to see the entirity of the gprof output, I can easily forward it to you for both builds.

To first order, there is an enormous amount of unaccounted time but both builds are more or less identical in their time in octave functions. The only significant difference that I can see is the appearance in the gcc-3.3 version of
                                                <spontaneous>
[7]     22.3    5.62    0.00                 _Unwind_SjLj_Register [7]

                                                <spontaneous>
[9]     12.4    3.11    0.00                 _Unwind_SjLj_Unregister [9]

which together take a significant amout of time. What are these calls and could they be the culprits?

I will be spending a bit of time over the next few days to get a better feel for profiling and its relationship to the execution times without -pg set. Whilst the time spent in various routines does change in correct proportion to the content of the octave code, the absolute magnitudes are way out.

I hope that this helps to give you a clue. Any advice that you can offer on the profiling would be gratefully received.

Paul Thomas
Profiles for octave-2.1.50 built using (i) gcc-3.2 and (ii) gcc-3.3

Paul Thomas and Ben Diedrich   03/23/04

Both run with test programme, entered from octave command line:

a = cputime ; tot = 0 ; x = [1:1e6] ; for i = 1:1e6 ; tot = tot + x(i) ; end ; 
disp(cputime-a)

The individual lines in the graphical output can be identified with features in 
the test programme and the times correspond
well, in proportion, with timing tests done in octave. The absolute times are way out.
granularity: each sample hit covers 4 byte(s) for 0.04% of 25.18 seconds

index % time    self  children    called     name

Built with gcc-3.2
                
[3]     70.7    2.33    6.14       4+16000183 <cycle 2 as a whole> [3]
               0.27    2.15 1000017             tree_index_expression::rvalue(int) 
<cycle 2> [8]
               0.29    1.06 3000033             tree_identifier::rvalue(int) 
<cycle 2> [12]
               0.24    0.58 1000015             tree_simple_assignment::rvalue() 
<cycle 2> [17]
               0.22    0.51 1000022             tree_statement::eval(bool, int, bool) 
<cycle 2> [19]
               0.19    0.47 1000009+4           tree_binary_expression::rvalue() 
<cycle 2> [20]
               0.28    0.24 1000003             
tree_argument_list::convert_to_const_vector(octave_value const*) <cycle 2> [23]
               0.21    0.26 3000032             tree_identifier::rvalue() <cycle 
2> [27]
               0.14    0.23 1000015             tree_simple_assignment::rvalue(int) 
<cycle 2> [32]
               0.17    0.13 1000006             tree_statement_list::eval(bool, int) 
<cycle 2> [36]
               0.09    0.17 1000005             make_value_list(tree_argument_list*, 
string_vector const&, octave_value const*) <cycle 2> [40]
               0.15    0.09 1000014             tree_index_expression::rvalue() 
<cycle 2> [45]
               0.02    0.19 1000000             
tree_simple_for_command::do_for_loop_once(octave_lvalue&, octave_value const&, bool&) 
<cycle 2> [47]
               0.06    0.06       1             tree_simple_for_command::eval() 
<cycle 2> [64]
               0.00    0.00       2             
octave_user_function::do_multi_index_op(int, octave_value_list const&) <cycle 
2> [184]
               0.00    0.00       3             octave_value::do_multi_index_op(int, 
octave_value_list const&) <cycle 2> [208]
               0.00    0.00       2             
tree_parameter_list::convert_to_const_vector(tree_va_return_list*) <cycle 2> 
[242]
               0.00    0.00       2             tree_if_command::eval() <cycle 
2> [1079]
               0.00    0.00       2             tree_if_command_list::eval() 
<cycle 2> [1084]
               0.00    0.00       2             tree_if_clause::eval() <cycle 
2> [1077]
---

Built with gcc-3.3

[3]     42.7    3.07    7.68       3+16000180 <cycle 2 as a whole> [3]
               0.36    2.66 1000017             tree_index_expression::rvalue(int) 
<cycle 2> [10]
               0.36    1.27 3000032             tree_identifier::rvalue(int) 
<cycle 2> [14]
               0.41    0.88 1000015             tree_simple_assignment::rvalue() 
<cycle 2> [17]
               0.25    0.66 1000021             tree_statement::eval(bool, int, bool) 
<cycle 2> [21]
               0.13    0.64 1000009+4           tree_binary_expression::rvalue() 
<cycle 2> [25]
               0.26    0.31 1000015             tree_simple_assignment::rvalue(int) 
<cycle 2> [29]
               0.29    0.28 1000003             
tree_argument_list::convert_to_const_vector(octave_value const*) <cycle 2> [30]
               0.27    0.29 3000032             tree_identifier::rvalue() <cycle 
2> [31]
               0.30    0.13 1000005             tree_statement_list::eval(bool, int) 
<cycle 2> [35]
               0.17    0.17 1000005             make_value_list(tree_argument_list*, 
string_vector const&, octave_value const*) <cycle 2> [44]
               0.05    0.24 1000000             
tree_simple_for_command::do_for_loop_once(octave_lvalue&, octave_value const&, bool&) 
<cycle 2> [49]
               0.10    0.10 1000014             tree_index_expression::rvalue() 
<cycle 2> [63]
               0.12    0.06       1             tree_simple_for_command::eval() 
<cycle 2> [69]
               0.00    0.00       2             
octave_user_function::do_multi_index_op(int, octave_value_list const&) <cycle 
2> [186]
               0.00    0.00       2             
tree_parameter_list::convert_to_const_vector(tree_va_return_list*) <cycle 2> 
[233]
               0.00    0.00       2             octave_value::do_multi_index_op(int, 
octave_value_list const&) <cycle 2> [1058]
               0.00    0.00       2             tree_if_command::eval() <cycle 
2> [1068]
               0.00    0.00       2             tree_if_command_list::eval() 
<cycle 2> [1074]
               0.00    0.00       2             tree_if_clause::eval() <cycle 
2> [1066]


Lines that do not appear in the version built with gcc-3.2

                                                <spontaneous>
[7]     22.3    5.62    0.00                 _Unwind_SjLj_Register [7]

                                                <spontaneous>
[9]     12.4    3.11    0.00                 _Unwind_SjLj_Unregister [9]

                                                <spontaneous>
[11]    10.2    2.56    0.00                 operator new(unsigned int) [11]



--


----------------------------------------------------------------------------
Clinton Chee - PhD
Computational Scientist
High Performance Computing Support Unit
University of New South Wales
SYDNEY 2052 AUSTRALIA chee at parallel stop hpc stop unsw stop edu stop au
Tel: 61 2 9385 6915 Mob: 61 4 0205 8983
----------------------------------------------------------------------------




reply via email to

[Prev in Thread] Current Thread [Next in Thread]