Re: [OctDev] Question on performance, coding style and competitive softw

octave-maintainers

From:	Alois Schlögl
Subject:	Re: [OctDev] Question on performance, coding style and competitive software
Date:	Thu, 23 Apr 2009 11:18:54 +0200
User-agent:	Thunderbird 2.0.0.21 (X11/20090318)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jaroslav Hajek wrote:
> On Wed, Apr 22, 2009 at 4:18 PM, Alois Schlögl <address@hidden> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> David Bateman wrote:
>>> Alois Schlögl wrote:
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>>
>>>> As some of you might know, my other pet project besides Octave, is
>>>> BioSig http://biosig.sf.net. BioSig is designed in such a way that it
>>>> can be used with both Matlab and Octave. Mostly for performance reasons,
>>>> we cannot abandon support for Matlab [1,2]. Octave is a viable
>>>> alternative in case the computational performance is not important. In
>>>> order to decide on the future strategy of BioSig, I hope to get answers
>>>> on the following questions:
>>>>
>>>> 1) Core development of Octave:
>>>> At the meeting of Octave developers in 2006, the issue was raised
>>>> that Octave is about 4 to 5 times slower than Matlab [1]. (I
>>>> repeated the tests just recently; the results are attached below and
>>>> show differences of up to a factor of 13, on average ~5.) This issue
>>>> is most relevant for large computational jobs, where it makes a
>>>> difference whether a specific task takes 1 day or 5 days. Is anyone
>>>> working to address this problem? Is there any hope that the performance
>>>> penalty becomes smaller or will go away within a reasonable amount of time?
>>>>
>>> It's hard to tell what the source of your speed issues is. The flippant
>>> response would be that with a JIT in octave then yes we could be as
>>> fast, we just need someone to write it. I suspect something will be done
>>> here in the future. The recent changes of John to have an evaluator
>>> class and his statement of adding a profiler in Octave 3.4 mean that the
>>> machinery needed to add a JIT will be in place.
>>
>> Good to know that someone is working on this. However, as far as I
>> understand, it's currently not possible to estimate when the performance
>> penalty is expected to be nullified.
>>
> 
> Agreed. And I am more pessimistic than David about JIT in near future,
> unless someone gets funding for that (maybe via GSoC or something).
> 
>>> However, looking at your wackerman, it's not clear to me that it's your
>>> for-loop that is taking all of the time in Octave. If it is, have you
>>> considered rewriting
>>>
>>> for k = 1:size(m2,1),
>>>  if all(finite(m2(k,:))),
>>>    L = eig(reshape(m2(k,:), [K,K]));
>>>    L = L./sum(L);
>>>    if all(L>0)
>>>      OMEGA(k) = -sum(L.*log(L));
>>>    end;
>>>  end;
>>> end;
>>>
>>> with something like
>>>
>>> rows_m2 = size(m2, 1);
>>> m3 = permute (reshape (m2, [rows_m2, K, K]), [2, 3, 1]);
>>> idx = all (finite (m2), 2);
>>> t = cellfun (@(x) eig(x), mat2cell (m3 (:, :, idx), K, K, ones(1, sum(idx))),
>>>             'UniformOutput', false);
>>> t = cellfun (@(x) - sum (x .* log (x)),
>>>        cellfun (@(x) x ./ sum(x), t, 'UniformOutput', false));
>>> t(iscomplex(t)) = NaN;
>>> OMEGA(idx) = t;
>>>
>>> The code above is of course untested. But in the latest tip that should
>>> be much faster for Octave as Jaroslav optimized cellfun recently
>>
>> Using Jaroslav's code and some modifications (diag of 300000 element
>> vector was just too large)
> 
> In 3.1.5x this is no longer an issue, because diagonal matrices are
> optimized. In 3.0.x, I think you can use "dmult" to do the row
> scaling. Or the outer product trick you do below, but the diag
> expression is both more readable and faster in 3.1.5x. Sorry, I just
> tend to think in terms of the development version :)


Thanks for the solution. The problem was with Matlab. You might ask why
I bother you with this; it's just that I do not want to ignore Matlab users.
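
To make the two scaling variants from this exchange concrete, here is a
small sketch in the Matlab-compatible subset. The exact diag expression is
not quoted in this message, so the form t * diag(1 ./ s) below is an
assumption, and the sizes K and N are made up for illustration.

```octave
% Column normalization of a K-by-N matrix, two ways (illustrative sizes).
K = 3;
N = 5;
t = rand(K, N) + 1;             % strictly positive test data

s = sum(t);                     % 1-by-N row vector of column sums

% Outer-product trick: expand s to K-by-N, then divide element-wise.
a = t ./ (ones(K, 1) * s);

% Assumed form of the diag expression: multiply by an N-by-N diagonal matrix.
% With a full (non-optimized) diag, this needs N^2 elements of storage.
b = t * diag(1 ./ s);

% Both variants agree up to rounding, and every column now sums to 1.
max_diff = max(abs(a(:) - b(:)));
col_sums = sum(a);
```

For N = 300000, a full N-by-N double matrix would need roughly 720 GB,
which is why the diag form is only practical where diagonal matrices are
stored specially.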

> 
>>        rows_m2 = size(m2, 1);
>>        m3 = permute (reshape (m2, [rows_m2, K, K]), [2, 3, 1]);
>>        idx = all (isfinite (m2), 2);
>>        t = cellfun (@eig, mat2cell (m3 (:, :, idx), K, K, ones(1, sum(idx))), ...
>>                     'UniformOutput', false);
>>        t = [t{:}];
>>        idx2 = all(t>0);
>>        t = t(:,idx2) ./ [ones(K,1) * sum(t(:,idx2))];
>>        t = -sum (t .* log (t));   % negated to match -sum(L.*log(L)) in the loop
>>        idx = find(idx);
>>        OMEGA(idx(idx2)) = t;
>>
>> the run time in Octave drops from 82.9 to 15.2 s. Thanks.
>> (The program slowed down in Matlab from 13.0 to 66.15 s, though.)
> 
> That's quite surprising. Are you sure you didn't leave the "diag"
> expression in the Matlab test? I don't see why it should get that
> slower...
> 
>> I'm not sure how this technique can be used for the other functions
>> (aar, findclassifier).
> 
> Maybe a different technique will work, I haven't yet looked. There are
> of course also codes that can't be vectorized.
> 
>> Memory usage is also an issue.
> 
> Not that much, I hope. Also note 3.1.5x does manage memory more
> efficiently, apparently even more efficiently than Matlab. Anyway
> Octave (and Matlab) is not really a good tool for memory-critical
> applications, because the COW mechanism is very ill-suited for such
> applications. You definitely want references or pointers if you need
> to keep memory low.
> 
>>>> 2) Coding style:
>>>> Octave understands a superset of commands compared to matlab, and it
>>>> seems the current policy is to enforce the "octave style" and make the
>>>> use of toolboxes incompatible for a use with Matlab. Is not it sensible
>>>> to write platform-neutral applications ? Specifically, is not it in our
>>>> own interest (wider usage make the code more robust) that matlab users
>>>> are not "forced" to buy additional toolboxes but can use open source
>>>> toolboxes e.g. from octave-forge?
>>>>
>>> I'd personally consider that up to the toolboxes author. Using texinfo
>>> in the help string makes the Octave help string "nicer"....  I however
>>> don't think a policy should be made that toolboxes on octave-forge
>>> should be matlab compatible..
>>>
>>
>> I know it's up to the toolbox authors. I'm not sure that every author is
>> aware of this. In case someone wants to modify some functions from
>> octave-forge/main for the use with matlab, and make it available to
>> others, what is the proper procedure for this (a) if he is the original
>> author and the function is already in octave-forge/main (b) if he wants
>> to modify an existing function from some other author ?
> 
> If he wants to keep that function in the package, then (obviously) he
> should follow the package's policy (determined by author or
> maintainer). If he just wants to share it on his own, then he should
> feel free to do any changes he wishes, as long as he honors the GPL.
> 
>> The texinfo is the minor problem, because the function is still usable
>> even if the documentation is not properly displayed.
>> The main issues are the incompatible syntax like
>> - - comments: # vs. %
>> - - end vs. endif-endfor-endwhile-endfunction etc.,
>> - - single quote  vs. double quote
>> - - negation operator: ! vs ~
>> which make it impossible to use most octave toolboxes in Matlab
>>
>> BTW, what are the arguments in favor of using octave-only coding style ?
>>
> 
> comments: # is much more common. % is, AFAIK, recognized only by
> Octave and other Matlabish software and TeX.
> Also, on UNIX # allows to use the #! mechanism and thus make
> executable octave scripts.

There are all kinds of comment markers (//, /* */), and because shell and
Octave scripts are two different things, this is important.

Cases using the shebang mechanism would certainly need some attention.
However, within all m-files at octave-forge, only
octave-forge/main/info-theory/doc/info-theory.m
is using the shebang mechanism.

> 
> specific end blocks: they catch typing errors more easily, and the
> code is more easily parsable for both humans and computers. I also
> consider it an extremely bad idea that "end" is likewise used in index
> expressions. I think Cleve Moler (or whoever designed it) must have
> been drinking that night.


The idea of the end-operator is also used in other languages (Python,
etc.), so I guess it's not completely insane. After some reluctance, I
found the end-operator very useful.
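
For readers unfamiliar with the construct under discussion, here is a
minimal illustration of end inside index expressions, written in the common
Octave/Matlab subset. The variable names are arbitrary.

```octave
% "end" inside an index expression stands for the last index
% along the dimension in which it appears.
x = [10 20 30 40];
last = x(end);          % last element of the vector
tail = x(end-1:end);    % last two elements

A = magic(3);           % [8 1 6; 3 5 7; 4 9 2]
corner = A(end, end);   % bottom-right element
```

The closest Python analogue is negative indexing, e.g. x[-1] for the last
element.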

> 
> quotes: again, double quotes are somewhat more standard, in particular
> in the C-derived world. more importantly, ""s allow things like \n,
> \t.

Octave does not claim to be compatible with C but with Matlab. \n and \t
can also be used with single quotes in Octave as well as in Matlab.
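
A small check of that claim, assuming the escapes are used in a
printf-style format string, where the function itself expands them:

```octave
% Single-quoted format string: sprintf expands \t and \n itself,
% so this behaves the same way in Octave and Matlab.
s = sprintf('a\tb\n');
n_formatted = length(s);        % 4 characters: a, TAB, b, LF

% Outside a format string, a single-quoted literal keeps the backslashes.
raw = 'a\tb\n';
n_raw = length(raw);            % 6 characters
```

In Octave, a double-quoted literal expands the escapes when the string is
parsed rather than when it is formatted, which is the difference at issue
here.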

> 
> negation - this is purely syntactic sugar, AFAIK, again for
> compatibility with the C world.

This "syntactic sugar" is part of the issue and could be easily avoided.
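
As a sketch of what avoiding it looks like in practice, the following
fragment stays within the subset that both interpreters accept; the comments
name the Octave-only spellings it avoids. The data are made up for
illustration.

```octave
% Portable subset: intended to run unchanged in Octave and Matlab.
x = [1 0 3];
msg = 'ok';              % single quotes instead of double quotes
n = 0;
if ~isempty(x)           % '~' for negation instead of '!'
  n = sum(x ~= 0);       % '~=' instead of '!='
end                      % plain 'end' instead of 'endif'
```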

> 
>>>> 3) Scope of Octave and Octave-Forge:
>>>> Open source software has its own merit, but sometimes also other factors
>>>> (e.g. additional costs in hardware, energy supply and cooling systems,
>>>> energy efficiency = "green computing") need to be considered. Given the
>>>> fact that octave-core is currently slower for some tasks, it is worth
>>>> considering to use proprietary mat-engine. The question is whether
>>>> Octave and Octave-forge should provide support of toolboxes for matlab
>>>> users too, or whether these users should go somewhere else? What do you
>>>> think ?
>>>>
>>> I'm not sure how this point differs from your second point. Again, to me
>>> it's up to the toolbox/package author to decide whether they want
>>> matlab compatibility or not. If a toolbox is compatible I see no issue
>>> sending matlab users to octave-forge for code..
>>>
>> Yes, the question is closely related to the previous one. Of course, if
>> the toolbox is compatible to matlab, there is no problem for the matlab
>> users. Unfortunately, most toolboxes (all in Octave and
>> octave-forge/main and most of octave-forge/extra) are using the
>> octave-only coding style.
>>
>> This seems to suggest that a fork is necessary in order to make the
>> toolboxes applicable for matlab users. Is there an alternative ?
>>
> 
> You can try to explain to the developers why making the packages
> Matlab-compatible is worth their effort.
> Maybe you'll succeed, at least with some of them.
> 

It would be nice if developers aiming at compatibility between Octave
and Matlab could feel at home here.


I looked also at David's suggestion to use oct2mat.

line 188:     gsub("[\\]$","...");
caused this error:
awk: /home/schloegl/matlab/oct2mat/oct2mat: line 188: regular expression
compile failed (bad class -- [], [^] or [)

When I removed the line, the problem was gone. Does anyone have a proper
substitute for this line?



Cheers,
  Alois




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknwMnsACgkQzSlbmAlvEIgMmgCgtXIBaHM0YjYO+VR1e1gg2dPI
taoAoJ6vGoq9zKmFIGqjnr2sWWfHDZoh
=Lace
-----END PGP SIGNATURE-----