octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Backslash inversion vs inv-function


From: John W. Eaton
Subject: Backslash inversion vs inv-function
Date: Sat, 19 Jan 2008 12:11:54 -0500

On 18-Jan-2008, Rolf Fabian wrote:

| Because of its underlying algorithm, I expected that 
| inversion of a (nonsingular) square matrix should
| be faster using backslash operator  'y2 = x\eye(n)'
| instead of Octave's  builtin 'y1 = inv(x)'  function.
| 
| Thus I conducted  dimension n - depending tests
| using random matrices input in order to sched
| some light on this issue.
| 
| octave-3.0.0.exe:48> n=5; x=randn(n); tic; y1 = inv(x); toc,
|                                       tic; y2 = x\eye(n); toc, 
| max(max(abs(y1-y2)))
| 
|              Elapsed time is sec.
|    n         inv(x)          x\eye(n)   Speed Gain Fac   max(max(abs(result
| difference)))
|    5         0.05531     0.000258        214              1.1102e-016
|   10         0.06063     0.000313       194              4.4409e-016
|   25         0.05705     0.000705         81              6.9944e-015
|   50         0.05889     0.002197         27              1.2212e-015
|  100        0.07195     0.01153          6.2              1.8952e-015
|  150        0.08721     0.03425          2.6              2.2649e-014
|  200        0.1271       0.06117         2.1               3.9413e-015
|  300        0.2892       0.1991           1.5               8.5820e-014
|  400        0.6255       0.609             1.03             8.9484e-014
|  500        1.176         1.263             0.93            2.2163e-014
|  600        2.05           2.197            0.93             7.0499e-015
|  700        3.228         3.458            0.93             2.4092e-014
|  800        4.855         5.137            0.95             2.2243e-014
|  900        6.873         7.272            0.95             3.0781e-014
| 1000       9.506         9.988            0.95             2.8283e-014
| 
| I've done this on a relative old laptop 
| (Windows XP, SP2, Intel III Mobile CPU 933 MHz 512 MB RAM)
| so do not wonder about absolute performance times.
| 
| >From the the table above it can be seen that the results confirm my
| expectations.
| Up to about 400x400 sized matrices the speed gain (column 4) can be
| enormous.
| Consequently replacing 'inv(x)' in existing code by 'x\eye(n)' might be an
| option and should be considered.

Are you using ATLAS or some other tuned BLAS?  If so, is it
appropriate for your hardware?

For timing purposes, you should use cputime, not tic/toc, since tic/toc
measures wall-clock time and can be affected by other processes on
your system.  Also, it's probably best to run timing tests a few times
and do some averaging.

In any case, for the following script:

  puts ("   n      inv(x)      x\\I      err\n");
  ntest = 5;
  for n = [100,150,200,300,400,500,600,700,800,900,1000,1500,2000]
    x = randn (n);
    I = eye (n);
    tinv = tldiv = 0;
    for i = 1:ntest
      t = cputime ();
      y1 = inv (x);
      tinv += cputime () - t;
      t = cputime ();
      y2 = x\I;
      tldiv += cputime () - t;
    endfor
    tinv /= ntest;
    tldiv /= ntest;
    err = max (abs (y1 - y2)(:));
    printf ("%4d  %9.5f  %9.5f  %e\n", n, tinv, tldiv, err);
  endfor

here are the results I see:

     n      inv(x)      x\I      err
   100    0.00400    0.00480  1.110223e-14
   150    0.01040    0.01040  4.378442e-15
   200    0.01680    0.02160  1.043610e-14
   300    0.04480    0.05600  8.604228e-15
   400    0.09361    0.11041  8.160139e-14
   500    0.16081    0.18881  4.285461e-14
   600    0.25522    0.30162  2.622902e-14
   700    0.38242    0.45283  1.847134e-14
   800    0.53203    0.64564  2.242651e-13
   900    0.72405    0.87205  6.783463e-14
  1000    0.96006    1.13767  5.022718e-14
  1500    2.80978    3.38981  2.424727e-13
  2000    6.29719    7.45967  6.985107e-13

Note that I've also taken the calculation of eye(n) out of the loop
and x\I is still generally slower than inv(x) on my system (AMD64,
and Debian's atlas3-base package, version 3.6.0-20.6).

I guess I don't see a problem here.

jwe



reply via email to

[Prev in Thread] Current Thread [Next in Thread]