[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
FYI: nth_element
From: |
Jaroslav Hajek |
Subject: |
FYI: nth_element |
Date: |
Thu, 15 Oct 2009 09:15:21 +0200 |
hi all,
Octave has just got a new function: nth_element:
http://hg.savannah.gnu.org/hgweb/octave/rev/aea3a3a950e1
the docstring:
-- Built-in Function: nth_element (X, N)
-- Built-in Function: nth_element (X, N, DIM)
Select the n-th smallest element of a vector, using the ordering
defined by `sort'. In other words, the result is equivalent to
`sort(X)(N)'. N can also be a contiguous range, either ascending
`l:u' or descending `u:-1:l', in which case a range of elements is
returned. If X is an array, `nth_element' operates along the
dimension defined by DIM, or the first non-singleton dimension if
DIM is not given.
nth_element encapsulates the C++ STL algorithms nth_element and
partial_sort. On average, the complexity of the operation is
O(M*log(K)), where `M = size(X, DIM)' and `K = length (N)'. This
function is intended for cases where the ratio K/M is small;
otherwise, it may be better to use `sort'.
See also: sort, min, max
In short, it allows extracting a small portion of sort(X) without
actually doing the full sort.
This is sometimes useful for statistics when computing quantiles and
the like... statisticians can surely tell better.
Maybe it can be used to boost some existing functions. As an example,
it can be readily used to speed up the "median" function:
http://hg.savannah.gnu.org/hgweb/octave/rev/b7b89061bd0e
the following benchmark demonstrates the speed-up:
for n = [2000, 2001]
a = rand (n);
disp (n);
tic; median (a(:)); toc
tic; median (a, 1); toc
tic; median (a, 2); toc
endfor
using a recent tip, Core 2 Duo @ 2.83 GHz, g++ -O3 -march=native, I get:
2000
Elapsed time is 0.62825 seconds.
Elapsed time is 0.34634 seconds.
Elapsed time is 0.40085 seconds.
2001
Elapsed time is 0.62713 seconds.
Elapsed time is 0.34236 seconds.
Elapsed time is 0.39419 seconds.
with the new patch, I get
2000
Elapsed time is 0.061557 seconds.
Elapsed time is 0.049 seconds.
Elapsed time is 0.074331 seconds.
2001
Elapsed time is 0.047512 seconds.
Elapsed time is 0.047687 seconds.
Elapsed time is 0.075987 seconds.
Contributing tests would be warmly welcome.
There's a number of corner cases, esp. if NaNs come into play, so I
expect bugs to be still there.
enjoy!
--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
- FYI: nth_element,
Jaroslav Hajek <=
Re: FYI: nth_element, Jordi GutiƩrrez Hermoso, 2009/10/17