octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

FYI: optimized sparse matrix assembly


From: Jaroslav Hajek
Subject: FYI: optimized sparse matrix assembly
Date: Wed, 31 Mar 2010 10:39:31 +0200

hi all,

I contributed a new implementation of sparse matrix assembly from
indices, i.e. sparse (i,j,x) and sparse (i,j,x,m,n).

Summary: the old ctors taking Array<octave_idx_type> and Array<double>
are gone; instead the polymorphic idx_vector is used.
The algorithm is specialized for sparse column vectors and for scalar
assemblies (x is scalar).
Rather than one big sort, the general algorithm uses a two-pass bucket
sort by columns first, then performs stable subsorts on index pairs.
The new algorithm has complexity O(nc + nnz + nzr*log(nzr)), where nzr
is the maximum number of nonzero entries in a column.

A benchmark script follows:

m = 10000;
n = 10000;
a = sprand (m, n, 0.01);
b = sprand (m, n, 0.01);
[ia, ja, va] = find (a);
[ib, jb, vb] = find (b);
idx = randperm (length (va) + length (vb));
i = [ia; ib](idx); j = [ja; jb](idx); v = [va; vb](idx);
clear a b ia ib ja jb va vb
disp ("assembly");
tic; sparse (i, j, v, m, n); toc
tic; sparse (i, j, 1, m, n); toc
tic; sparse (i, 1, v, m, n); toc
tic; sparse (i, 1, 1, m, n); toc
disp ("assembly with sorted columns");
[j, idx] = sort (j);
i = i(idx);
v = v(idx);
tic; sparse (i, j, v, m, n); toc
tic; sparse (i, j, 1, m, n); toc
tic; sparse (i, 1, v, m, n); toc
tic; sparse (i, 1, 1, m, n); toc
disp ("fully sorted assembly");
[ji, idx] = sortrows ([j, i]);
j = ji(:,1);
i = ji(:,2);
v = v(idx);
tic; sparse (i, j, v, m, n); toc
tic; sparse (i, j, 1, m, n); toc
i = sort (i);
tic; sparse (i, 1, v, m, n); toc
tic; sparse (i, 1, 1, m, n); toc


with a recent tip, I get:

address@hidden:~/devel/octave/main> octave -q ttsa.m
assembly
Elapsed time is 1.01363 seconds.
Elapsed time is 0.95161 seconds.
Elapsed time is 0.696647 seconds.
Elapsed time is 0.639646 seconds.
assembly with sorted columns
Elapsed time is 0.287234 seconds.
Elapsed time is 0.274021 seconds.
Elapsed time is 0.700802 seconds.
Elapsed time is 0.646648 seconds.
fully sorted assembly
Elapsed time is 0.0910211 seconds.
Elapsed time is 0.0856769 seconds.
Elapsed time is 0.0671172 seconds.
Elapsed time is 0.060761 seconds.

with the new patch:

address@hidden:~/devel/octave/main> ./run-octave -q ttsa.m
assembly
Elapsed time is 0.16398 seconds.
Elapsed time is 0.104801 seconds.
Elapsed time is 0.0794449 seconds.
Elapsed time is 0.0155661 seconds.
assembly with sorted columns
Elapsed time is 0.116946 seconds.
Elapsed time is 0.0918679 seconds.
Elapsed time is 0.080018 seconds.
Elapsed time is 0.0154951 seconds.
fully sorted assembly
Elapsed time is 0.067704 seconds.
Elapsed time is 0.037581 seconds.
Elapsed time is 0.0223641 seconds.
Elapsed time is 0.0157969 seconds.


this also closes somewhat the gap between sparse and dense accumarray
sum, as described here:
http://n4.nabble.com/FYI-new-accumarray-optimizations-td1653219.html

enjoy

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


reply via email to

[Prev in Thread] Current Thread [Next in Thread]