octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: performance problem with unique on sparse arrays


From: Jaroslav Hajek
Subject: Re: performance problem with unique on sparse arrays
Date: Tue, 2 Mar 2010 16:15:47 +0100

On Mon, Mar 1, 2010 at 9:43 PM, David Bateman <address@hidden> wrote:
> Jaroslav Hajek wrote:
>>
>> It would be nice to let the Sparse indexing benefit from similar
>> improvements as the Array indexing, and also clean up the interface in
>> a similar manner. Since sparse indexing is only 1D or 2D, reductions
>> are not needed here, but the innermost loop specializations would
>> likely make the code both cleaner and more efficient.
>> I'd like to do some work here after 3.4, but I would appreciate
>> assistance, because the coding will probably be more difficult (except
>> for the lack of n-d case, which will make it easier).
>>
>
> I understand the sparse indexing pretty well and would be happy to help out.
> The existing index, maybe_delete_elements and assign methods of the
> Sparse<T> class were written by taking the Array<T> methods from version
> 2.1.63 (at least I think it was about then) and then just converting them
> ad-hoc to Sparse versions that did exactly the same thing.  I suppose the
> same process needs to be done with a recent version of your Array<T>
> indexing changes and will probably be easier as these functions are now much
> simpler. A nice thing to add if we do that  is to make
>
> sm = spalloc (r, c, n);
> for i = 1 : r
>  for j = 1 : c
>    y = foo (i, j);
>    if (y)
>      sm(i, j) = y;
>    endif
>  endfor
> endfor
>
> not reallocate sm on every call to the Sparse<T>::assign method as it
> currently does.

Something like this works (very limited) for dense arrays, i.e.
a = []
for i = 1:n
 a(i) = i^2;
endfor

will only resize the array approximately k-times, k ~ log2 (min (1000,
n)) + max (n/1000-1, 0).
I think the feature has limited usage, though. It's far better to use
vectorized code if possible, not just because of the reallocations.

Maybe a sparse matrix could be initially stored as a map from indices
to values, allowing efficient in-place updates, and converted to the
canonical CSC form later when needed.

> Fixing the indexing such that something like
>
> sm = sprandn (1e6,1e6,1e-6); sm(sm<1)= 0;
>
> does cause a memory overflow with sparse logical indexing would be really
> nice as well.
>

I think we dicussed this as well. This would require the sparse matrix
to have a "default value" field.


I had some ideas about sparse matrices as well. For instance, a sparse
matrix could look like this:

template <class T>
class Sparse
{
  SparsityPattern sp;
  Array<T> values;
};

Where SparsityPattern is a reference-counted class that only stores
the ridx and cidx components. The advantage of this approach is that
stuff like abs(M), 2*M, sqrt(M) or real(M) where M is a sparse Matrix,
can be handled very efficiently by simply forwarding the operation to
the array part.

Maybe this could also avoid some code duplication.


-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz



reply via email to

[Prev in Thread] Current Thread [Next in Thread]