octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [OctDev] optimized lookup


From: Jaroslav Hajek
Subject: Re: [OctDev] optimized lookup
Date: Tue, 19 Feb 2008 11:22:45 +0100

On Feb 19, 2008 10:19 AM, John W. Eaton <address@hidden> wrote:
> On 19-Feb-2008, Jaroslav Hajek wrote:
>
> | OK, I've removed the function from Octave-Forge. What do you think
> | about including
> | this function into Octave (i.e. replacing the current m-file version?)
> |
> | My primary motivation for optimizing this was the TODO comment in the
> | current lookup.m,
> | that actually suggests writing a such an implementation, pointing out
> | the worse asymptotic
> | complexity and real-life performance. It even seems there had been a
> | binary search m-file implementation before, but was later rejected for
> | being too slow.
>
> As David said, we also have to consider whether the speedup is worth
> the extra maintenence cost (our experience shows that many fewer
> people who use Octave can effectively contribute fixes to C++ code).
>
> In any case, if it is to be included a number of style issues need to
> be addressed so that it is more consistent with the rest of Octave
> (see below for some comments).  Although these changes may seem
> trivial, I think it is important to keep the style consistent so that
> the whole of Octave is easier to read.
>
> | Copyright (C) 2008 VZLU Prague a.s., Czech Republic
>
> Why not Copyright Jaroslav Hajek?  Does your employer require the
> above statement?  Do they claim ownership of the code you write?  If
> so, do you have permission from them to release the code under the
> terms of the GPL?
>

OK, that's an issue. My employer allows me to contribute to open
source projects as a part of my work under a research grant. Since I'm
using Octave for my work, I wanted to give something in return by
contributing to it. It is, however, desirable that the results are
"provable" in some way to the grant donator (government) should he ask
for such proof, and I'm resolving this by assigning the copyright to
my employer. My understanding is that it does not matter, because the
GPL ensures all necessary legal aspects to keep the source free. Yes,
I am explicitly permitted to release under GPL. If there is really a
problem, feel free to change this to my name, or even your name - or I
can do it.

> | #include <oct.h>
>
> In code that is part of Octave, you need to conditionally include
> <config.h> and then only the subset of header files that are really
> needed.  We don't want every file depending on (nearly) all the header
> files in the Octave source tree.  The typical organization of header
> files is
>
>   <config.h>
>
>   standard C/C++ headers
>
>   liboctave headers
>
>   octave headers
>
> In each grouping, the headers are typically listed alphabetically.
>

OK

> | // use this to select an appropriate output index type
> |
> | #define RETURN_REAL_INDICES
> | #ifdef RETURN_REAL_INDICES
> | typedef double ret_idx_type;
> | typedef NDArray ret_idx_array;
> | #elif USE_64_BIT_IDX_T
> | typedef octave_uint64 ret_idx_type;
> | typedef uint64NDArray ret_idx_array;
> | #else
> | typedef octave_uint32 ret_idx_type;
> | typedef uint32NDArray ret_idx_array;
> | #endif
>
> I'm not sure whether much is gained by returning ints, and in any
> case, you should just use octave_idx_type so you don't have to check
> bit sizes.
>

I've discussed this with D.B. yesterday. What is gained is small
memory and possibly performance improvements, what is lost is strict
compatibility (backwards and Matlab). I reckon that returning
double-valued indices in both Octave and Matlab dates back to the
times when there were no integer types. In any case, I guess it's wise
to write new code in
such a way that it is easier to change if some day you decide to
change sort et al. to return integer indices (for instance, if Matlab
undergoes such a change).

> | // simple function templates & specialization is used
> | // to avoid writing two versions of the algorithm
> |
> | // generic comparison
> | template<bool rev>
> | inline bool dcomp(const double& a,const double& b);
> | template<bool rev>
> | inline bool dcompe(const double& a,const double& b);
>
> Please use a space before the "(" that begins a function parameter
> list and after commas.  Also, we usually separate function definitions
> and declarations by a blank line so it is easier to see where one ends
> and the next begins.
>

It seems that Octave's, on the whole, adheres to GNU coding standards, which
I've found yesterday on GNU site.
I'll reformat the code according to these, as well as all directions
you give below.

> |   if (!nvals) return;
>
> Please write this as
>
>   if (! nvals)
>     return;
>
> |   const double *first, *last;
> |   size_t cur;
> |   // the trivial case of empty array
> |   if (begin == end)
> |     {
> |       for(;nvals;--nvals) *(idx++) = 0;
> |       return;
> |     }
> |   // initialize
> |   last = end;
> |   first = begin;
>
> Please declare varibles with their first use if possible.  It might
> also be good to provide an initial value for cur above.  So the code
> above would be
>
>   size_t cur = 0;
>   // the trivial case of empty array
>   if (begin == end)
>     {
>       for (; nvals; --nvals)
>         *(idx++) = 0;
>       return;
>     }
>
>   // initialize
>   const double *last = end;
>   const double *first = begin;
>
>
> | bin_search:
> | store_cur:
> |           if (dcompe<rev>(*last,*vals)) goto search_forwards;
> |         if (dcomp<rev>(*vals,*first)) goto search_backwards;
> |           if (dcompe<rev>(*last,*vals)) goto search_forwards;
> |           if (dcomp<rev>(*vals,*first)) goto search_backwards;
> | search_forwards:
> | search_backwards:
> |     if (!(--dist))
> |       goto store_cur; // a shortcut
> |     else
> |       goto bin_search;
>
> This algorithm seems quite complex with the labels and backward
> branching.  Is this the cleanest way it could be written?  I imagine
> some difficulty for someone later who might need to modify it or fix a
> bug.

I tried, but it looked more messy to me. Perhaps you'll be more successful.
In the end, it seemed the cleanest to separate the algorithm into the
four labelled
blocks and just branch between them accordingly.

>
> | DEFUN_DLD(lookup,args,nargout,"\
> | -*- texinfo -*- \n\
> | @deftypefn {Function File} address@hidden =} lookup (@var{table}, @var{y}) 
> \n\
>
> Please write
>
> DEFUN_DLD (lookup, args, nargout,
>   "-*- texinfo -*-\n\
> @deftypefn {Function File} address@hidden =} lookup (@var{table}, @var{y})\n\
>
> There is no need for spaces before the newline characters.
>
> | If complex values are supplied, their magnitudes are used. \n\
> | @end deftypefn \n")
>
> The last newline character here doesn't do any harm, but it is not
> necessary, so we don't usually include it.
>
> |   int nargin = args.length();
> |
> |   if (nargin != 2 || nargout > 1)
> |     {
> |       print_usage();
> |       return octave_value_list();
> |     }
>
> In most cases, we try to have a single return from functions, so the
> preferred style for argument checking is
>
>   if (arguments_are_ok)
>     {
>       // do the real work.
>     }
>   else
>     print_usage ();

Really? After quick scanning through the functions in
src/DLD-FUNCTIONS/, it seems to me that multiple error returns are
quite common. But I suppose it can be done at the cost of a lot more
statement nesting, or perhaps using boolean variables. Which one is
preferable?

>
> |   if (atable.ndims() > 2 ||
> |       (atable.rows() > 1 && atable.columns() > 1))
>
> When logical expressions span multiple lines, please write
>
>   if (atable.ndims() > 2
>       || (atable.rows() > 1 && atable.columns() > 1))
>
> |       error("table should be a vector");
>
> Please begin error messages with the name of the function.  For
> example,
>

Shame on me, I've been told this before.

>   error ("lookup: table should be a vector");
>
> |       return octave_value_list();
> |     }
> |   NDArray ay;
> |   if (y.is_real_type())
> |     ay = y.array_value();
> |   else if (y.is_complex_type())
> |     ay = y.complex_array_value().abs();
> |   else
> |     {
> |       error("y is not numeric type");
> |     }
> |
> |   ret_idx_array idx(y.dims());
>
> You shouldn't continue after calling error because it doesn't throw an
> exception.
>

a return is missing, of course.

> |   seq_lookup(atable.data(),(size_t)atable.numel(),ay.data(),
> |       (size_t)ay.numel(),idx.fortran_vec());
>
> Casts are discouraged, but if you must use them, please use C++ style
> casts since they are easier to find.

OK

>
> |   return octave_value_list(octave_value(idx));
>
> There is a constructor which will automatically convert an
> octave_value object to an octave_value_list object, so the above can
> be written as
>
>   return octave_value (idx);

Nice, I didn't realize that. This also means that I can use plain return
instead of return octave_value_list() or return retval; (with retval empty).
Except I shouldn't use multiple returns, of course


>
> jwe
>



-- 
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz


reply via email to

[Prev in Thread] Current Thread [Next in Thread]