octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Octave-bug-tracker] [bug #36372] Improved ranks.m included:


From: Dave Goel
Subject: Re: [Octave-bug-tracker] [bug #36372] Improved ranks.m included:
Date: Wed, 02 May 2012 23:53:12 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.2 (gnu/linux)

Jordi Gutiérrez Hermoso <address@hidden> writes:
>
> In your case, I would greatly appreciate if you could explain a little in your
> commit message about what changed. Also, you may remove the part of the patch
> that says "this may not become a part of GNU Octave" and associated text. Part
> of that text should become a unit test (look at the bottom of the file for the
> syntax for writing tests).

Sure thing. 

"Use a new algorithm that uses vectorized code." may probably suffice as
a commit message, but here's a walkthrough of my algo.:

Consider x=[11 14 12 12 15]. Consider a *sorted* vector z=sort(x), whose
values are then [11 12 12 14 15].  So, I start with z's ordinal ranks
which are always just the indexes, that is, 1:5. But, because of
conflicts, we want the second and third ranks treated specially. So, we
detect such cases and treat them. The treated ranks are now [1 2.5 2.5 4
5] instead. But, these ranks correspond to z, not to x. The final step
is to make them correspond to x by simply reversing the sorting process.

The concept isn't too hard, as above; the tedious part is the
vectorization of the rank-treatment mentioned above. While the above
vectorization is by itself challenging enough, some more tedium derives
from allowing x to be a matrix rather than a vector.


> Follow-up Comment #2, bug #36372 (project octave):
>
> Oops, I meant to refer you to this part of the Octave manual:
>
> http://www.gnu.org/software/octave/doc/interpreter/Basics-of-Generating-a-Changeset.html
>
> Also, some instructions on how to write the commit message:
>
> http://www.octave.org/wiki/index.php?title=Commit_message_guidelines

Dear Jordi

Thanks very much for the above helpful links re: this issue and for
chatting with me @ IRC as well. I just saw your latest messages there.

Unfortunately, though I had apt-get build-dep octave.* in order to build
octave, I failed at the very hg checkout process itself. Researching the
problem, I realized that's because my mercurial is < 1.9, which is not
compatible with git subdirectories.

It looks like a lot of my software is outdated, and that the only way
for me to contribute in the desired fashion would be to wait till
Debian's stable (I know I kinda live in the past by staunchly sticking
with stable.) gets a new release.

Since I don't have anything other than a Debian stable, I will probably
have to give up on this for now. :(  Once the debian migration happens,
I will be glad to check in again.
----


In the meanwhile, I have gone ahead and made a few changes to the file,
which I am including below.

(1) Per your suggestion, I have checked that all the 13 tests at the end
of the file work.

(2) With some tweaks, they even work with octave 3.2.4's assert which
had some minor bugs.

(3) In the interest of matlab compatibility, it seemed wrong to
introduce a third argument. What if matlab comes in and introduces their
own, *different* third argument some day?  So, I allowed the user who
wants to specify a ranking-scheme by optionally passing a structure
instead of matrix.

This new file is included here.:


> > > ====================================================

## Copyright (C) 1995-2012 Kurt Hornik Copyright (C) 2012 Dave Goel
##
## This file is part of Octave.
##
## Octave is free software; you can redistribute it and/or modify it
## under the terms of the GNU General Public License as published by the
## Free Software Foundation; either version 3 of the License, or (at
## your option) any later version.
##
## Octave is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
## for more details.
##
## You should have received a copy of the GNU General Public License
## along with Octave; see the file COPYING.  If not, see
## <http://www.gnu.org/licenses/>.
##
## -*- texinfo -*-
##
## @deftypefn {Function File} {} ranks (@var{x}, @var{dim}) Return the
## ranks of @var{x} along the first non-singleton dimension adjust for
## ties. If the optional argument @var{dim} is given, operate along this
## dimension. @var{X} can optionally be a structure. In that case, its
## field named x is interpreted as the matrix, and its field named rtype
## is interpreted as the rank-type. The field RTYPE then donates the
## type of ranking: 0 or "fractional" for fractional, which is the
## default, 1 or "competition" for competiton ranking, 2 or "modified"
## for modified competition ranking, 3 or "ordinal" for ordinal ranking
## and 4 or "dense" for dense ranking. When using the third argument,
## you can supply us '' for @var{dim} to have it auto-computed. @end
## deftypefn
##
## This function returns a result identical to GNU Octave's built-in
## ranks.m but is much faster (For example, 1.4s vs. ''no result in
## 33mins'' when the input was a=round(10*rand(100123,100));
## Additionally, we also handle several ranking schemes. This function
## has been submitted to GNU Octave's mailing list and may become part
## of GNU Octave.
##
##
## Authors: KH <address@hidden>, Dave Goel <address@hidden>
## Description: Compute ranks


function y = ranksmy (x, dim)

  if (nargin < 1);
    print_usage ();
  endif

  if isstruct(x);
    rtype=0;
    if isfield(x,"rtype");
      rtype=x.rtype;
      if isempty(rtype);
        rtype=0;
      endif
    endif
    x=x.x;
  else
    rtype=0;
  endif


  nd = ndims (x);
  sz = size (x);
  if (nargin!=2);
    ## Find the first non-singleton dimension.
    dim  = 1;
    while (dim < nd + 1 && sz(dim) == 1)
      dim = dim + 1;
    endwhile
    if (dim > nd)
      dim = 1;
    endif
  else
    if (! (isscalar (dim) && dim == round (dim))
        && dim > 0
        && dim < (nd + 1))
      error ("ranks: dim must be an integer and valid dimension");
    endif
  endif

  if (sz(dim) == 1)
    y = ones(sz);
  else
    ## The algorithm works only on dim = 1, so permute if necesary.
    if (dim != 1)
      perm = [1 : nd];
      perm(1) = dim;
      perm(dim) = 1;
      x = permute (x, perm);
    endif
    sz=size(x);
    [sx ids]=sort(x); ## sx is sorted x.
    
    lin=cumsum(ones(size(x)),1);  ## A linearly increasing array.

    switch rtype;
      case {0,"fractional"};
        lin=(_competition(lin,sx,sz)+_modified(lin,sx,sz))/2;
      case {1,"competition"};
        lin=_competition(lin,sx,sz);
      case {2,"modified"};
        lin=_modified(lin,sx,sz);
      case {3,"ordinal"};
        ## lin=lin;
      otherwise 
        rtype
        error("Illegal value of rtype specified.");
    endswitch
    y=NaN(size(lin));
    ## If input was a vector, we could have simply done this: 
    ## y(ids)=lin;
    y(_ids(ids,sz))=lin;

    if (dim != 1)
      y = permute (y, perm);
    endif
  endif
endfunction

function idf=_ids(ids,sz);
  oo=ones(sz);
  allids=[{ids-1}];
  nn=numel(sz);
  for ii=2:nn;
    allids=[allids,{cumsum(oo,ii)-1}];
  endfor
  idf=allids{end};
  for jj=(nn-1):-1:1;
    idf=idf*sz(jj)+allids{jj};
  endfor
  idf+=1;
endfunction


function linnew=_competition (lin,sx,sz)
  ## Stop increasing lin when sx does not increase. Same as before
  ## otherwise.
  infvec = -Inf * ones ([1, sz(2 : end)]);
  fnewp= find(diff([infvec;sx]));
  linnew=zeros(size(lin));
  linnew(fnewp)=lin(fnewp);
  linnew=cummax(linnew);
endfunction


function linnew=_modified (lin,sx,sz)
  ## Traverse lin backwards. Stop decreasing it when sx doesn't
  ## decrease.
  infvec = Inf * ones ([1, sz(2 : end)]);
  fnewp= find(diff([sx;infvec]));
  linnew=Inf(size(lin));
  linnew(fnewp)=lin(fnewp);
  linnew=flipdim(cummin(flipdim(linnew,1)),1);
endfunction 


%!assert (ranksmy (1:2:10), [1:5]'')
%!assert (ranks (10:-2:1), [5:-1:1]'')
%!assert (ranks ([2, 1, 2, 4]), [2.5, 1, 2.5, 4])
%!assert (ranks (ones (1, 5)), 3*ones (1, 5)'')
%!assert (ranks (1e6*ones (1, 5)), 3*ones (1, 5)'')
%!assert (ranks (rand (1, 5), 1), ones (1, 5))

%% Test input validation
%!error ranks ()
%!error ranks (1, 2, 3)
%!error ranks ({1, 2})
%!error ranks (['A'; 'B'])
%!error ranks (1, 1.5)
%!error ranks (1, 0)
%!error ranks (1, 3)



> > > ====================================================

Sincerely, and best
Dave
-- 










reply via email to

[Prev in Thread] Current Thread [Next in Thread]