[Octave-bug-tracker] [bug #62495] [octave forge] (statistics) pdist 'cos
From: Michael Leitner
Subject: [Octave-bug-tracker] [bug #62495] [octave forge] (statistics) pdist 'cosine' metric - internal expansion causes out of memory error
Date: Fri, 20 May 2022 17:38:48 -0400 (EDT)
Follow-up Comment #2, bug #62495 (project octave):
Yes, there is a more efficient version: replace the cosine block by
++
w = sqrt (sumsq (X, 1));
y = 1 - ((X'*X)./(w.*w'))(((1:rows(x))'>(1:rows(x)))(:))';
--
This version doesn't even need Xi and Yi. If you would rather make the
indexing more explicit, use instead
++
w = sqrt (sumsq (X, 1));
y = 1 - ((X'*X)./(w.*w'))(sub2ind(rows(x)([1 1]),Xi,Yi))';
--
The point is that "summing the pointwise products of rows" is in effect just
matrix multiplication. Admittedly, the expanded version at first sight has
only half as much to compute, because only one triangle of X'*X is needed.
But I seem to remember that the Octave interpreter recognizes the pattern
X'*X and uses a more efficient BLAS routine for it, which is why I didn't
write x*X.
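
As a rough check of the claim above, the following sketch compares the two
formulations on a small made-up matrix. It assumes that the existing cosine
block sums the pointwise products X(:,Xi) .* X(:,Yi) and that Xi, Yi come
from nchoosek; the data and the names y_expanded, y_gram, G, mask and
max_diff are only for illustration and are not taken from pdist.m.

```octave
x = [1 2 3; 4 5 6; 7 8 10; 2 0 1];   # 4 observations, 3 variables (made-up data)
n = rows (x);
X = x';                               # one observation per column
order = nchoosek (1:n, 2);            # all pairs (i,j) with i < j
Xi = order(:,1);
Yi = order(:,2);
w = sqrt (sumsq (X, 1));              # Euclidean norm of each observation

## expanded form (assumed): builds two matrices of size p-by-(n*(n-1)/2)
y_expanded = 1 - sum (X(:,Xi) .* X(:,Yi), 1) ./ (w(Xi) .* w(Yi));

## matrix-product form: one n-by-n matrix, then pick its strict lower triangle
G = (X' * X) ./ (w .* w');
mask = (1:n)' > (1:n);
y_gram = 1 - G(mask(:))';

max_diff = max (abs (y_expanded - y_gram));
```

Both results should agree up to rounding, and the column-major order of the
lower triangle matches the pair order produced by nchoosek because X'*X is
symmetric.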
Of course, most of the other distances have the same problem. Unfortunately,
"summing the square or some other function of the pointwise differences of
rows" is not a primitive BLAS routine, so I do not see how these can be
optimized. If the second dimension of x is large (as in the OP's problem, and
only then can memory become a problem), you can of course just loop over the
columns of x and treat them sequentially. But this would need heuristic tests
to decide when a loop is indicated, and would lead to much more complicated
code.
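
For illustration only, such a column loop might look like the sketch below
for the Euclidean distance. It is not a proposed patch: the data are made up,
the names acc, y_loop, y_ref and max_diff are hypothetical, and the heuristic
for when to switch to the loop is left out entirely.

```octave
x = [1 2 3; 4 5 6; 7 8 10; 2 0 1];   # made-up data: 4 observations, 3 variables
n = rows (x);
order = nchoosek (1:n, 2);            # all pairs (i,j) with i < j
Xi = order(:,1);
Yi = order(:,2);

## accumulate squared differences one column of x at a time, so that each
## temporary has only n*(n-1)/2 elements instead of p times as many
acc = zeros (1, rows (order));
for k = 1:columns (x)
  d = x(Xi,k) - x(Yi,k);
  acc += (d .^ 2)';
endfor
y_loop = sqrt (acc);

## reference: fully expanded computation over all columns at once
y_ref = sqrt (sumsq (x(Xi,:) - x(Yi,:), 2))';
max_diff = max (abs (y_loop - y_ref));
```

The loop trades one large temporary for columns(x) small ones, which only
pays off when the second dimension of x is large.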
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?62495>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/