Re: NaN-toolbox much faster now

octave-maintainers

From:	Jaroslav Hajek
Subject:	Re: NaN-toolbox much faster now
Date:	Sat, 14 Mar 2009 11:39:18 +0100

On Thu, Mar 12, 2009 at 5:13 PM, Alois Schlögl <address@hidden> wrote:
> The following improvements have been included in the NaN-toolbox.
>
> - sumskipnan_mex.mex has been optimized for speed (minimizing cache
> misses, reducing loop overhead)
>
> - a flag is set if any NaN occurs in the data. The flag can be checked
> (and reset) with the function FLAG_NANS_OCCURED(). This enables flexible
> control over checks for NaN. (You can check after every call, or only at
> the end of your script.)
>
> - the performance of var, std, and meansq has been improved.
>
> A performance comparison between the NaN-toolbox and the corresponding
> standard Octave functions (see script below) shows the following results
> (time in [s]):
>
>
> with NaN-tb     w/o NaN-tb      ratio
>    0.25884    3.56726   13.78183       mean(x,1)/nanmean(x,1)
>    0.36784    3.32899    9.05020       mean(x,2)/nanmean(x,2)
>    0.30019    6.62467   22.06789       std(x,0,1)
>    0.40114    2.23262    5.56561       std(x,0,2)
>    0.28681    6.40276   22.32407       var(x,0,1)
>    0.40269    2.18056    5.41505       var(x,0,2)
>    0.28175    4.05612   14.39598       meansq(x,1)
>    0.40703    4.19346   10.30248       meansq(x,2)
>    0.25930    0.19884    0.76683       sumskipnan(x,1)/sum(x,1)
>    0.30624    0.24179    0.78955       sumskipnan(x,2)/sum(x,2)
>
>
> A performance improvement by factors as high as 22 can be seen, and
> sumskipnan() is only about 25% slower than sum().
>
> Of course, sumskipnan could also improve the speed of functions like
> nanmean, nanstd, etc. Maybe you want to consider including sumskipnan in
> standard octave.
>
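
For readers unfamiliar with the toolbox, here is a minimal pure-Octave
sketch of what a NaN-skipping sum computes. The name sumskipnan_sketch
and its two outputs (sum and count of valid elements) are assumptions
made for illustration; the real sumskipnan is a MEX file whose internals
are not shown in this thread.

```octave
1;  # mark this file as a script so the function can be defined inline

# Hypothetical stand-in for a NaN-skipping sum (not the toolbox code).
function [s, n] = sumskipnan_sketch (x, dim)
  valid = ! isnan (x);     # true where the element is a number
  x(! valid) = 0;          # NaNs contribute nothing to the sum
  s = sum (x, dim);        # sum of the non-NaN elements
  n = sum (valid, dim);    # number of non-NaN elements
endfunction

x = [1 NaN 3; 4 5 NaN];
[s, n] = sumskipnan_sketch (x, 1);
m = s ./ n;                # NaN-skipping column means
```

Returning the count alongside the sum is what lets a NaN-skipping mean
be formed with a single division afterwards.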

I repeated your experiment using current Octave tip (-O3
-march=native, Core 2 Duo @ 2.83GHz):

tic-toc time [s]

with NaN-tb   w/o NaN-tb      ratio
   0.108911     0.090389    0.82993   mean(x,1)
   0.132629     0.091657    0.69108   mean(x,2)
   0.114568     0.915853    7.99397   std(x,0,1)
   0.163950     0.955799    5.82982   std(x,0,2)
   0.112384     0.883821    7.86431   var(x,0,1)
   0.163973     0.921007    5.61683   var(x,0,2)
   0.112379     0.110276    0.98129   meansq(x,1)
   0.163682     0.114233    0.69790   meansq(x,2)
   0.096581     0.082247    0.85159   sum(skipnan)(x,1)
   0.101545     0.089742    0.88376   sum(skipnan)(x,2)

cputime [s]

with NaN-tb   w/o NaN-tb      ratio
   0.108007     0.088005    0.81481   mean(x,1)
   0.136008     0.088005    0.64706   mean(x,2)
   0.112007     0.900056    8.03571   std(x,0,1)
   0.164011     0.956060    5.82924   std(x,0,2)
   0.112007     0.884055    7.89285   var(x,0,1)
   0.164010     0.924058    5.63416   var(x,0,2)
   0.116007     0.092006    0.79311   meansq(x,1)
   0.160010     0.116007    0.72500   meansq(x,2)
   0.100006     0.080005    0.80000   sum(skipnan)(x,1)
   0.100007     0.092006    0.92000   sum(skipnan)(x,2)

It can be seen that the penalty for skipping NaNs is mostly within
20-30%, smaller for column-oriented reductions.
The speed-up factors of roughly 5.6 to 8 for std and var are caused by
the single-sweep computation done in sumskipnan.
This becomes apparent when less random data are supplied and the
NaN toolbox reverts to a backup algorithm (which is what Octave always
does) - relative error on the order of 10^-4:

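tic-toc time [s]

with NaN-tb   w/o NaN-tb      ratio
   0.108613     0.089788    0.82668   mean(x,1)
   0.132721     0.089979    0.67796   mean(x,2)
   1.362765     0.876386    0.64309   std(x,0,1)
   1.500724     0.914380    0.60929   std(x,0,2)
   1.366353     0.880742    0.64459   var(x,0,1)
   1.499243     0.913636    0.60940   var(x,0,2)
   0.115758     0.094084    0.81277   meansq(x,1)
   0.163625     0.091950    0.56196   meansq(x,2)
   0.097873     0.082200    0.83986   sum(skipnan)(x,1)
   0.102086     0.089619    0.87788   sum(skipnan)(x,2)

cputime [s]

with NaN-tb   w/o NaN-tb      ratio
   0.108007     0.092006    0.85185   mean(x,1)
   0.132008     0.088005    0.66666   mean(x,2)
   1.364085     0.876055    0.64223   std(x,0,1)
   1.500094     0.916057    0.61067   std(x,0,2)
   1.368086     0.880055    0.64327   var(x,0,1)
   1.500093     0.916057    0.61067   var(x,0,2)
   0.116007     0.092006    0.79311   meansq(x,1)
   0.164011     0.092006    0.56097   meansq(x,2)
   0.096006     0.084005    0.87500   sum(skipnan)(x,1)
   0.104006     0.088005    0.84615   sum(skipnan)(x,2)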

Here the std/var computations are slowed down by some 35-45%. This is
less favorable, though certainly no disaster.
I think the Octave statistics subcommunity should discuss what they
would appreciate most. Is anyone depending on the speed of std/var?
Opinions about skipping NaNs? Given Octave's NA support, it may be
better to just skip NAs, like R does.
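
To illustrate why a single-sweep variance might need a fallback on
nearly constant data, here is a minimal sketch. It assumes the textbook
sum-of-squares formula for the single sweep; the algorithm the toolbox
actually uses is not shown in this message, so treat this only as an
illustration of the numerical issue, on a much smaller vector than the
benchmark.

```octave
randn ("state", 123);
x = 1 + randn (1e5, 1) * 1e-4;   # nearly constant data, as in the second run
n = numel (x);

# Single sweep: sum and sum of squares can be accumulated together.
s1 = sum (x);
s2 = sum (x .^ 2);
v_single = (s2 - s1 ^ 2 / n) / (n - 1);

# Two passes: compute the mean first, then sum the squared deviations.
m = s1 / n;
v_two = sum ((x - m) .^ 2) / (n - 1);

# Relative disagreement between the two estimates.
rel_diff = abs (v_single - v_two) / v_two
```

The single-sweep form subtracts two numbers near 1e5 to obtain a result
near 1e-3, so several significant digits cancel; the two-pass form
avoids this at the cost of reading the data twice.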

There were also suggestions to move the statistics functions
completely out of Octave. Personally, I'd vote to retain just the
stuff from statistics/base, because I sometimes use functions thereof
despite not being a statistician.
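
On the NA suggestion earlier in this message, a short sketch of how
Octave distinguishes NA from an ordinary NaN, using the built-in isna
and isnan. The variable names are illustrative only, and zeroing the NA
entries is just one assumed way of skipping them.

```octave
x = [1 NA 3 NaN];
a = isnan (x);     # NA is flagged here too
b = isna (x);      # only the NA element is flagged

y = x;
y(isna (y)) = 0;   # skip NAs only
s = sum (y);       # the genuine NaN still propagates into the result
```

Skipping only NA keeps a NaN produced by an invalid operation visible
in the result instead of silently dropping it.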

regards

-- 
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

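n = 8e3;
randn("state", 123);
# first run: fully random data
#x = randn(n);
# second run: nearly constant data (relative spread 1e-4)
x = 1 + randn(n) * 1e-4;
# k=1: with the NaN toolbox loaded; k=2: without it
#k=1;
k=2;
# reload timings from the previous run, if any
if (exist("data", "file"))
  load data
endif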

t=cputime();tic; m = mean(x,1); T(k,1)=toc;V(k,1)=cputime()-t;
t=cputime();tic; m = mean(x,2); T(k,2)=toc;V(k,2)=cputime()-t;

t=cputime();tic; m = std(x,0,1); T(k,3)=toc;V(k,3)=cputime()-t;
t=cputime();tic; m = std(x,0,2); T(k,4)=toc;V(k,4)=cputime()-t;

t=cputime();tic; m = var(x,0,1); T(k,5)=toc;V(k,5)=cputime()-t;
t=cputime();tic; m = var(x,0,2); T(k,6)=toc;V(k,6)=cputime()-t;

t=cputime();tic; m = meansq(x,1); T(k,7)=toc;V(k,7)=cputime()-t;
t=cputime();tic; m = meansq(x,2); T(k,8)=toc;V(k,8)=cputime()-t;

if (k == 1)
  t=cputime();tic; m = sumskipnan(x,1); T(k,9)=toc;V(k,9)=cputime()-t;
  t=cputime();tic; m = sumskipnan(x,2); T(k,10)=toc;V(k,10)=cputime()-t;
else
  t=cputime();tic; m = sum(x,1); T(k,9)=toc;V(k,9)=cputime()-t;
  t=cputime();tic; m = sum(x,2); T(k,10)=toc;V(k,10)=cputime()-t;
endif

save data T V
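
The repeated timing lines in the script could be folded into a small
helper. This is only a sketch: time_call is a hypothetical name, not
part of Octave or the NaN toolbox, and the matrix here is much smaller
than in the benchmark above.

```octave
1;  # mark this file as a script so the function can be defined inline

# Hypothetical helper: time one call with both tic/toc and cputime.
function [wall, cpu] = time_call (f, varargin)
  t = cputime ();
  tic;
  feval (f, varargin{:});   # result is discarded, only the time matters
  wall = toc;
  cpu = cputime () - t;
endfunction

x = randn (200);
# Each row: function handle, cell array of its arguments.
calls = {@mean, {x, 1}; @std, {x, 0, 1}; @var, {x, 0, 2}};
T = zeros (1, rows (calls));
V = zeros (1, rows (calls));
for i = 1:rows (calls)
  [T(i), V(i)] = time_call (calls{i, 1}, calls{i, 2}{:});
endfor
```

Keeping the list of calls in one cell array makes it harder for the
column indices of T and V to drift out of step with the functions
being timed.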
