octave-patch-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-patch-tracker] [patch #8426] strings package: speeding up the ed


From: Carnë Draug
Subject: [Octave-patch-tracker] [patch #8426] strings package: speeding up the edit distance
Date: Fri, 08 Aug 2014 13:19:18 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0 Iceweasel/30.0

Follow-up Comment #6, patch #8426 (project octave):

Seems I mangled the verbatim syntax so I'll paste the part that got cut out
again:

* "(nargin >= 3 && length (weights) < 3)" deserves an error message on its
own. Also, do not use length it will return true for a matrix size [2 3].
Finally, shouldn't an empty weigths be allowed? The documentation says "For
the special case that @var{weights} is empty"


function [dist, L] = editdistance (str1, str2, weights = [1 0 1], modus = 0)
  if (nargin < 2 || nargin > 4)
    print_usage ();
  endif

  if (nargin > 2 && (numel (weights) != 3 && ! isempty (weigths))
    error ("editdistance: WEIGTHS must be a 3 element vector.")
  endif


* I can't figure out "(nargin >= 2 || weights(1) == weights(3) && weights(1)
== 1 && weights(2) == 0)" ? This will always return true because this function
is always called with at least 2 arguments. But I remove it, then the rest is
the same as `weigths = [1 0 1]`. What should this be doing? From the
documentation, it would seem it should be:


  saveMemory = nargout < 2;
  if (modus == 0 && saveMemory && isempty (weights)) 
    dist = berghel_roach (str1,str2);
    return
  endif


* while the algorithm seems like it's not really requires a char, it does
require the input to be a vector. You should check for that. For the same
reason mentioned above, do not use length in:


  L1 = length (str1) + 1;
  L2 = length (str2) + 1;



* use more whitespace please. For example, add a space between operators
(including "="), and after commas in a comma separated list

* can you avoid the nested sub-function? I can see that it can modify the
sparse matrix when it's called so this will avoid copying it for write. But
nested functions in Octave and Matlab have weird scoping rules and can lead to
weird bugs later on (and I'm not sure if we're completely Matlab compatible on
that respect). I guess that since is less prone to problems so it should be
fine if you can't find a way that would not impact performance.

* finally, if you clone the repository and submit a changeset, it's easier to
review and you'll keep authorship of the changes:


 hg clone http://hg.code.sf.net/p/octave/strings octave-forge-strings
 cd octave-forge-strings
 ## make your changes
 hg commit -m
 ## enter a commit message following commit message guidelines
http://wiki.octave.org/Commit_message_guidelines
 ## probably something like:
 ##
 ## editdistance: use Berghel and Roach algorithm for performance (patch
#8426)
 ## 
 ## * editdistance.m: blah blah blah multi-line comment with deeper details,
with
 ## each line not taking more than 80 characters.
 hg export -o ../speedup-editdistance.diff tip



    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/patch/?8426>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]