octave-maintainers

Re: Using OpenMP in Octave


From: Kai Habel
Subject: Re: Using OpenMP in Octave
Date: Mon, 29 Mar 2010 23:01:53 +0200

-------- Original Message --------
> Date: Mon, 29 Mar 2010 11:50:21 +0200
> From: Jaroslav Hajek <address@hidden>
> To: "Søren Hauberg" <address@hidden>
> CC: octave maintainers mailing list <address@hidden>
> Subject: Re: Using OpenMP in Octave

> On Mon, Mar 29, 2010 at 7:56 AM, Søren Hauberg <address@hidden> wrote:
> > On Mon, 29 Mar 2010 at 01:37 +0200, David Bateman wrote:
> >> I've had a short discussion with Jaroslav and John off list about
> >> implementing OpenMP multi-threading in Octave and want to bring it back
> >> to the list. The use of OpenMP with Octave 3.4 will still be too
> >> experimental, so if we include the code now I propose to make it off
> >> by default. The changeset I committed on Saturday adds the autoconf
> >> code to probe for OpenMP support, but only if the "--enable-openmp"
> >> configure option is used. Currently it only probes for OpenMP support
> >> for gcc and msvc (though the msvc code is untested).
> >
> > Interesting.
> >
> >> However, the value 1000 is arbitrary and a little benchmarking is
> >> needed. I attach a first experimental changeset for those who want to
> >> experiment. Configured with "--enable-openmp" and a recent tip, this code
> >> successfully ran through "make check", but I don't know if the choice of
> >> array size to switch between single and multithread code is optimal.
> >>
> >> A couple of interesting tests might be
> >>
> >> n  = 300; a = ones(n,n,n);
> >> tic; sum(a,1); toc
> >> tic; sum(a,2); toc
> >> tic; sum(a,3); toc
> >> n = 999; a = (1+1i)*ones (n,n); tic; a = real(a); toc
> >> n = 1001; a = (1+1i)*ones (n,n); tic; a = real(a); toc
> >>
> >> before and after the change. Unfortunately I'm developing on an Atom and
> >> so I won't personally see much gain from this multi-threading.
> >
> > I tried your changeset and ran the above test with no noticeable
> > difference in speed (0.0114329 seconds for n = 999 and 0.013583 seconds
> > for n = 1001) on a dual core laptop.
> >
> > I tried to increase n to 10000 and still saw no noticeable difference
> > between using my ordinary 3.3.51+ installation and the OpenMP version.
> > Do I need to do anything to activate OpenMP when running Octave?
> >
> > Søren
> >
> >
> 
> Hi Søren, try this one instead. I started from David's patch,
> simplified some things and encapsulated the tuning constants (maximum
> number of threads and minimum size limit) into get/set functions. I
> removed the reduction parallelizations for the time being (those were
> not quite correct).
> 
> Here's a benchmark:
> 
> n = 5e6;
> a = rand (n, 1);
> b = rand (n, 1);
> disp ("operations");
> tic; for i = 1:10, -a ; endfor; toc
> tic; for i = 1:10, a+b ; endfor; toc
> tic; for i = 1:10, a-b ; endfor; toc
> tic; for i = 1:10, a.*b ; endfor; toc
> tic; for i = 1:10, a./b ; endfor; toc
> tic; for i = 1:10, a+=b ; endfor; toc
> tic; for i = 1:10, a.*=b ; endfor; toc
> 
> disp ("mappers");
> tic; for i = 1:10, exp(a); endfor; toc
> tic; for i = 1:10, sin(a); endfor; toc
> tic; for i = 1:10, erf(a); endfor; toc
> tic; for i = 1:10, erfinv(a); endfor; toc
> 
> On my Core 2 Duo (g++ -O3 -march=native),
> with a recent tip, I get:
> 
> address@hidden:~/devel/octave/main> octave -q ttmt.m
> operations
> Elapsed time is 0.279794 seconds.
> Elapsed time is 0.348097 seconds.
> Elapsed time is 0.352154 seconds.
> Elapsed time is 0.348534 seconds.
> Elapsed time is 0.371341 seconds.
> Elapsed time is 0.222977 seconds.
> Elapsed time is 0.221841 seconds.
> mappers
> Elapsed time is 1.40801 seconds.
> Elapsed time is 1.34872 seconds.
> Elapsed time is 1.50267 seconds.
> Elapsed time is 3.00263 seconds.
> 
> 
> With the new patch, I get:
> 
> address@hidden:~/devel/octave/main> ./run-octave -q ttmt.m
> operations
> Elapsed time is 0.222325 seconds.
> Elapsed time is 0.292047 seconds.
> Elapsed time is 0.293674 seconds.
> Elapsed time is 0.291939 seconds.
> Elapsed time is 0.287067 seconds.
> Elapsed time is 0.227352 seconds.
> Elapsed time is 0.215154 seconds.
> mappers
> Elapsed time is 0.774295 seconds.
> Elapsed time is 0.736404 seconds.
> Elapsed time is 0.817736 seconds.
> Elapsed time is 1.62565 seconds.
> 
[cut]
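
For readers following the quoted discussion: the two tuning constants mentioned above (a maximum number of threads and a minimum array size below which the code stays single-threaded) could be wired into an OpenMP loop roughly as sketched here. This is only an illustration under stated assumptions, not the code from either patch. The namespace mt_sketch, the function names, and the default values are all hypothetical; 1000 is used as the default size limit only because that value is mentioned in David's message.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is NOT Octave's actual API.
namespace mt_sketch {

// Tuning constants hidden behind get/set functions.
static int max_threads = 2;            // assumed default
static std::size_t min_size = 1000;    // assumed default (arbitrary)

int get_max_threads() { return max_threads; }
void set_max_threads(int n) { if (n > 0) max_threads = n; }  // ignore n <= 0

std::size_t get_min_size() { return min_size; }
void set_min_size(std::size_t n) { min_size = n; }

// Elementwise exp. The loop is run in parallel only when the input has
// at least min_size elements; smaller inputs stay on a single thread.
// If the compiler is not given an OpenMP flag (e.g. -fopenmp for gcc),
// the pragma is ignored and the loop simply runs serially.
void map_exp(const std::vector<double>& in, std::vector<double>& out) {
  const long n = static_cast<long>(in.size());
  out.resize(in.size());
  #pragma omp parallel for if (in.size() >= min_size) num_threads(max_threads)
  for (long i = 0; i < n; ++i)
    out[i] = std::exp(in[i]);
}

}  // namespace mt_sketch
```

With something like this, benchmarking different thresholds would only require calling mt_sketch::set_min_size with another value before rerunning the timing loop, with no rebuild needed.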

Here is another data point for a quad-core CPU:

...
model name      : AMD Phenom(tm) 9500 Quad-Core Processor
stepping        : 2
cpu MHz         : 2200.000
cache size      : 512 KB
...

current tip:
octave:21> mt_bench
operations
Elapsed time is 0.419734 seconds.
Elapsed time is 0.496845 seconds.
Elapsed time is 0.496757 seconds.
Elapsed time is 0.497891 seconds.
Elapsed time is 0.615082 seconds.
Elapsed time is 0.211215 seconds.
Elapsed time is 0.211108 seconds.
mappers
Elapsed time is 2.34408 seconds.
Elapsed time is 2.0489 seconds.
Elapsed time is 2.16343 seconds.
Elapsed time is 4.96898 seconds.

Jaroslav's patch:
octave:6> mt_bench
operations
Elapsed time is 0.30868 seconds.
Elapsed time is 0.235097 seconds.
Elapsed time is 0.240763 seconds.
Elapsed time is 0.25971 seconds.
Elapsed time is 0.255006 seconds.
Elapsed time is 0.168414 seconds.
Elapsed time is 0.147303 seconds.
mappers
Elapsed time is 0.652316 seconds.
Elapsed time is 0.602726 seconds.
Elapsed time is 0.646873 seconds.
Elapsed time is 1.21081 seconds.
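
To make the two runs above easier to compare, the speedup per operation can be computed as the time on the current tip divided by the time with the patch. The short C++ sketch below does this for the four mapper timings (exp, sin, erf, erfinv), which are copied from the quad-core output above. The function and variable names are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Speedup of each entry: baseline time divided by patched time.
// Only the common prefix is used if the two lists differ in length.
std::vector<double> speedups(const std::vector<double>& before,
                             const std::vector<double>& after) {
  std::vector<double> result;
  const std::size_t n =
      before.size() < after.size() ? before.size() : after.size();
  for (std::size_t i = 0; i < n; ++i)
    result.push_back(before[i] / after[i]);
  return result;
}

// Mapper timings in seconds (exp, sin, erf, erfinv) from the runs above.
const std::vector<double> tip_mappers   = {2.34408, 2.0489, 2.16343, 4.96898};
const std::vector<double> patch_mappers = {0.652316, 0.602726, 0.646873, 1.21081};
```

Calling speedups(tip_mappers, patch_mappers) gives a factor of about 3.6 for exp, for example. The same function can be applied to the seven timings in the "operations" group.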

Kai

