Re: performance problem with unique on sparse arrays

octave-maintainers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: performance problem with unique on sparse arrays

From:	Jaroslav Hajek
Subject:	Re: performance problem with unique on sparse arrays
Date:	Mon, 1 Mar 2010 10:16:08 +0100

On Mon, Mar 1, 2010 at 4:14 AM, David Bateman <address@hidden> wrote:
> John W. Eaton wrote:
>>
>> The unique function performs poorly on sparse arrays:
>>
>>  octave:1> x = sprand (1e5, 1, 0.3);
>>  octave:2> tic; unique (x); toc  Elapsed time is 10.8679 seconds.
>>  octave:3> tic; unique (full (x)); toc
>>  Elapsed time is 0.036339 seconds.
>>
>> I "fixed" this problem with the following patch so that if unique is not
>> operating on rows and we don't need the index vectors, it just
>> converts the nonzero elements to a full vector instead of working on
>> the sparse array directly:
>>
>>  http://hg.savannah.gnu.org/hgweb/octave/rev/1f11fabfa349
>>
>> The real performance problem in unique appears to be the following two
>> operations:
>>
>>  match = (y(1:n-1) == y(2:n));
>>  y(idx) = [];
>>
>> It would be good to find a way to improve these, then maybe my quick
>> fix to unique could be removed and performance would also be improved
>> for the case when index values are requested.
>>
>> Comments?
>>
>> jwe
>>
>>
>
> sub-dividing your example a bit with the example
>
> octave:1> n=1e5;d=0.01;y=floor(10*sprand(n,1,d));tic; x1 = y(1:n-1); toc;
> tic; x2 = y(2:n); toc; tic; match = (x1 == x2);toc; tic; idx= find(match);
> toc; tic; y(idx) = [];toc
> Elapsed time is 1.09358 seconds.
> Elapsed time is 1.06628 seconds.
> Elapsed time is 0.003949 seconds.
> Elapsed time is 0.011372 seconds.
> Elapsed time is 0.291622 seconds.
>
> it seems that is that the "Sparse<T>::index (const idx_vector&, int) const"
> method is slow but the maybe_delete_elements method is a bit slow as well.
> Note that Jaroslav accelerated the Array<T>::index methods in 3.2.x by
> reworking the idx_vector class to do the indexing. The Sparse<T> class
> didn't get the same treatment.
>

It would be nice to let the Sparse indexing benefit from similar
improvements as the Array indexing, and also clean up the interface in
a similar manner. Since sparse indexing is only 1D or 2D, reductions
are not needed here, but the innermost loop specializations would
likely make the code both cleaner and more efficient.
I'd like to do some work here after 3.4, but I would appreciate
assistance, because the coding will probably be more difficult (except
for the lack of n-d case, which will make it easier).

regards

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz

[Prev in Thread]

Current Thread

[Next in Thread]

Re: performance problem with unique on sparse arrays, Jaroslav Hajek <=
- Re: performance problem with unique on sparse arrays, David Bateman, 2010/03/01
  - Re: performance problem with unique on sparse arrays, Jaroslav Hajek, 2010/03/02
  - Re: performance problem with unique on sparse arrays, Jaroslav Hajek, 2010/03/19
    - Re: performance problem with unique on sparse arrays, David Bateman, 2010/03/20
- Re: performance problem with unique on sparse arrays, David Bateman, 2010/03/01
  - Re: performance problem with unique on sparse arrays, David Bateman, 2010/03/01
    - Re: performance problem with unique on sparse arrays, David Bateman, 2010/03/01
    - Re: performance problem with unique on sparse arrays, John W. Eaton, 2010/03/01
    - Re: performance problem with unique on sparse arrays, David Bateman, 2010/03/01

Prev by Date: Re: C++ complex class use in str2double.cc
Next by Date: Re: C++ complex class use in str2double.cc
Previous by thread: Re: C++ complex class use in str2double.cc
Next by thread: Re: performance problem with unique on sparse arrays
Index(es):
- Date
- Thread