octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [changeset] histc


From: Jaroslav Hajek
Subject: Re: [changeset] histc
Date: Mon, 9 Mar 2009 13:04:13 +0100

On Sun, Mar 8, 2009 at 7:49 PM, John W. Eaton <address@hidden> wrote:
> On  8-Mar-2009, Jaroslav Hajek wrote:
>
> | I don't think so. After all, 3.2 is mainly in John's hands, so if he's
> | OK with it, it's fine. I wonder, John, would you be fine with me
> | optimizing accumarray and improving histc as well.
>
> It's not a problem to add an isolated function like this now.  No
> other parts of Octave depend on it, so I don't see how adding it can
> cause a regression.  Also, it is a .m file, so if there is a problem
> with it that is not discovered until after the release, it can be
> fixed even by those people who can't easily rebuild Octave.
>
> So do whatever you like about it.  But for those people who need the
> function, it is better to have a slow version than not have it at all.
>
> jwe
>

OK, so I optimized accumarray for the default summation case, and
optimized histc using lookup & accumarray.

Short benchmark for accumarray:
m = 1e2;
ns = [1e5 1e6 1e7];

for n = ns
  idx = ceil (m*rand (n, 1));
  vals = rand (n, 1);
  disp (n);
  tic; A = accumarray (idx, vals); toc
endfor

with recent tip, I get:
 100000
Elapsed time is 0.03557897 seconds.
 1000000
Elapsed time is 0.28006887 seconds.
 10000000
Elapsed time is 3.13704205 seconds.

with the optimized accumarray, I get:
 100000
Elapsed time is 0.004490137 seconds.
 1000000
Elapsed time is 0.021013975 seconds.
 10000000
Elapsed time is 0.199796915 seconds.

as expected, the speed-up factor increases with problem size: 7.9, 13.3, 15.7.
(it's O(N*log(N)) vs. O(N)).

a short benchmark for histc (classifying 4 millions normal random
numbers into equidistant bins):

n = 4e6;
randn ("state", 1)
data = randn (n, 1);
for ns = [1 2 3 4 5 20 100 500]
  bins = linspace (-3, 3, ns);
  disp (ns);
  tic; cnt = histc (data, bins); toc
endfor


with the previous implementation:

address@hidden:~/devel/octave/patches> ./run-octave -q tt3.m
 1
Elapsed time is 0.036 seconds.
 2
Elapsed time is 0.12 seconds.
 3
Elapsed time is 0.22 seconds.
 4
Elapsed time is 0.29 seconds.
 5
Elapsed time is 0.38 seconds.
 20
Elapsed time is 1.67 seconds.
 100
Elapsed time is 8.505 seconds.
 500
Elapsed time is 42.66 seconds.

with the new implementation:

address@hidden:~/devel/octave/patches> ./run-octave -q tt3.m
 1
Elapsed time is 0.037 seconds.
 2
Elapsed time is 0.12 seconds.
 3
Elapsed time is 0.22 seconds.
 4
Elapsed time is 0.3 seconds.
 5
Elapsed time is 0.29 seconds.
 20
Elapsed time is 0.355 seconds.
 100
Elapsed time is 0.4199 seconds.
 500
Elapsed time is 0.4822 seconds.


Again you see that the speed-up increases dramatically with number of
bins. This is O(M*N) vs. O(M*log(N) + N). (M data size, N number of
bins). Actually for very small bin counts (< 4) the lookup approach is
somewhat slower, so I used a split-up strategy that reverts to the
Soren's implementation if bins <= 3.

regards

-- 
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz



reply via email to

[Prev in Thread] Current Thread [Next in Thread]