octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: FYI: smarter indexing by logical masks


From: Jaroslav Hajek
Subject: Re: FYI: smarter indexing by logical masks
Date: Sat, 28 Nov 2009 07:23:58 +0100



On Sat, Nov 28, 2009 at 6:26 AM, Judd Storrs <address@hidden> wrote:
This is pretty cool!

Is it possible to somehow pre-compute/convert the mask? I've often
used find() to pre-convert a mask into index form because I think that
improves performance when the mask is used multiple times (but it may
have been a habit that I picked up when I was working primarily in IDL
where this is part of the collective folk wisdom). e.g. things along
these lines:

mask = find(a > 0) ;
a(mask) = b(mask) ;


Not really. Let me explain how this works (not in 3.2.x though). When a matrix is first used in index _expression_, the internal conversion to index_vector is, if successful, cached for subsequent uses. This optimizes usages like
c(mask) = a(mask) + b(mask)
because the conversion of mask has only to be done once. But beware that this is still evaluated by pieces, i.e.

aa = a(mask);
bb = b(mask);
cc = aa + bb;
c(mask) = cc;

This explains why find is *apparently* faster in your benchmark, because find creates the result with the index vector *already cached*. Try putting the first tic; prior to
mask = , and you will see. Also, in the last (full) section you've missed an assignment to mask:
"a > 0.0;"  should be "mask = a > 0.0;"

Here's what I get with a corrected benchmark (the 4e7 case is too much for my home laptop):

n = 100000
sparse mask:        1st: 0.001931, 2nd: 0.000562
sparse mask(find):  1st: 0.001640, 2nd: 0.000611
sparse mask(where): 1st: 0.002379, 2nd: 0.000562
dense mask:         1st: 0.004646, 2nd: 0.003817
dense mask(find):   1st: 0.005676, 2nd: 0.003681
dense mask(where):  1st: 0.005042, 2nd: 0.003642
full mask:          1st: 0.002669, 2nd: 0.001587
full mask(find):    1st: 0.006236, 2nd: 0.004371
full mask(where):   1st: 0.005657, 2nd: 0.003588
n = 1000000
sparse mask:        1st: 0.017749, 2nd: 0.009287
sparse mask(find):  1st: 0.018799, 2nd: 0.008243
sparse mask(where): 1st: 0.016918, 2nd: 0.007967
dense mask:         1st: 0.044704, 2nd: 0.038776
dense mask(find):   1st: 0.058628, 2nd: 0.039231
dense mask(where):  1st: 0.054648, 2nd: 0.035887
full mask:          1st: 0.028402, 2nd: 0.017976
full mask(find):    1st: 0.057978, 2nd: 0.042992
full mask(where):   1st: 0.057311, 2nd: 0.038398
n = 10000000
sparse mask:        1st: 0.178092, 2nd: 0.092343
sparse mask(find):  1st: 0.182628, 2nd: 0.079860
sparse mask(where): 1st: 0.165752, 2nd: 0.079588
dense mask:         1st: 0.442568, 2nd: 0.392207
dense mask(find):   1st: 0.587378, 2nd: 0.398332
dense mask(where):  1st: 0.601536, 2nd: 0.397812
full mask:          1st: 0.256911, 2nd: 0.192204
full mask(find):    1st: 0.625493, 2nd: 0.435221
full mask(where):   1st: 0.639314, 2nd: 0.434564

so you can see find is actually slower. Not by much because the time is drowned in the other work; there's the mask creation, two index expressions, one addition and one indexed assignment. It's mainly visible in the dense and full mask cases. In fact, "find" not only *always* converts to index vector, it also creates a *double* array corresponding to the result (to form a valid octave_value). Hence, in development version, using raw logical masks is more efficient in (almost?) all cases.
 
I've been playing with the new code a bit and I think the find() trick
defeats the new optimization? I tried to create a simple where.oct
that converts its input into the new smart mask but I think I've
missed something

#include <octave/oct.h>

DEFUN_DLD(where, args, ,
         "Convert mask to index")
{
 idx_vector idx = args(0).index_vector() ;
 return octave_value(idx);
}

Spoiler alert: something's wrong here because 'where' is slower than
using the raw boolean matrix. Here's the modified benchmark code I've
been using:


Yeah. Regarding this, I'm not yet fully sure. My rationale was that when you assign an index_vector to octave_value,
you expect to get a numeric array, not a logical one. So, if the idx_vector is actually a mask, it is converted to an index array. Hence, your where function actually simulates "find" (its simplest case). Giving it a second thought, maybe it should be possible to return a logical array with cached index; still, I believe one should never create a numeric array with a cached logical mask. That would be a bit weird. Hence, "find" will need to do the conversion.
 
here's the corrected benchmark:

for n = [1e5 1e6 1e7]
 printf ("n = %d\n", n);
 a = rand(n,1);
 b = rand(n,1);
 c = zeros(n,1);

 tic; mask = a > 0.9;
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("sparse mask:        1st: %f, 2nd: %f\n", t1, t2);
 tic; mask = find(a > 0.9);
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("sparse mask(find):  1st: %f, 2nd: %f\n", t1, t2);
 tic; mask = where(a > 0.9);
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("sparse mask(where): 1st: %f, 2nd: %f\n", t1, t2);

 tic; mask = a > 0.1;
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("dense mask:         1st: %f, 2nd: %f\n", t1, t2);
 tic; mask = find(a > 0.1);
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("dense mask(find):   1st: %f, 2nd: %f\n", t1, t2);
 tic; mask = where(a > 0.1);
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("dense mask(where):  1st: %f, 2nd: %f\n", t1, t2);

 tic; mask = a > 0.0;
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("full mask:          1st: %f, 2nd: %f\n", t1, t2);
 tic; mask = find(a > 0.0);
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("full mask(find):    1st: %f, 2nd: %f\n", t1, t2);
 tic; mask = where(a > 0.0);
 c(mask) = a(mask) + b(mask); t1 = toc;
 tic; c(mask) = a(mask) + b(mask); t2 = toc;
 fprintf ("full mask(where):   1st: %f, 2nd: %f\n", t1, t2);

endfor

best regards

--
RNDr. Jaroslav Hajek
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

reply via email to

[Prev in Thread] Current Thread [Next in Thread]