octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

FYI: nth_element


From: Jaroslav Hajek
Subject: FYI: nth_element
Date: Thu, 15 Oct 2009 09:15:21 +0200

hi all,

Octave has just got a new function: nth_element:
http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1

the docstring:

 -- Built-in Function:  nth_element (X, N)
 -- Built-in Function:  nth_element (X, N, DIM)
     Select the n-th smallest element of a vector, using the ordering
     defined by `sort'.  In other words, the result is equivalent to
     `sort(X)(N)'.  N can also be a contiguous range, either ascending
     `l:u' or descending `u:-1:l', in which case a range of elements is
     returned.  If X is an array, `nth_element' operates along the
     dimension defined by DIM, or the first non-singleton dimension if
     DIM is not given.

     nth_element encapsulates the C++ STL algorithms nth_element and
     partial_sort.  On average, the complexity of the operation is
     O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'.  This
     function is intended for cases where the ratio K/M is small;
     otherwise, it may be better to use `sort'.

     See also: sort, min, max

In short, it allows extracting a small portion of sort(X) without
actually doing the full sort.
This is sometimes useful for statistics when computing quantiles and
the like... statisticians can surely tell better.

Maybe it can be used to boost some existing functions. As an example,
it can be readily used to speed up the "median" function:
http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e

the following benchmark demonstrates the speed-up:

for n = [2000, 2001]
  a = rand (n);
  disp (n);
  tic; median (a(:)); toc
  tic; median (a, 1); toc
  tic; median (a, 2); toc
endfor

using a recent tip, Core 2 Duo @ 2.83 GHz, g++ -O3 -march=native, I get:

 2000
Elapsed time is 0.62825 seconds.
Elapsed time is 0.34634 seconds.
Elapsed time is 0.40085 seconds.
 2001
Elapsed time is 0.62713 seconds.
Elapsed time is 0.34236 seconds.
Elapsed time is 0.39419 seconds.

with the new patch, I get

 2000
Elapsed time is 0.061557 seconds.
Elapsed time is 0.049 seconds.
Elapsed time is 0.074331 seconds.
 2001
Elapsed time is 0.047512 seconds.
Elapsed time is 0.047687 seconds.
Elapsed time is 0.075987 seconds.

Contributing tests would be warmly welcome.
There's a number of corner cases, esp. if NaNs come into play, so I
expect bugs to be still there.

enjoy!

-- 
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


reply via email to

[Prev in Thread] Current Thread [Next in Thread]