From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #50561] 350X slower code for eps() written in C++ rather than an m-file
Date: Thu, 16 Mar 2017 14:11:20 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:51.0) Gecko/20100101 Firefox/51.0

Follow-up Comment #4, bug #50561 (project octave):

That "retval = epsval" was definitely redundant.  A fairly good optimizing
compiler would take that out of the loop.

But there is more than that here.  In particular, this line:


  else if (val < std::numeric_limits<double>::min ())
      epsval(i) = pow (2.0, -1074e0);


I don't know how often that case is hit.  But pow(2.0,-1074e0) is a
constant.  A really good optimizing compiler would take that outside the loop,
but I doubt a fairly general compiler like gcc on Linux would do that.  (Only
something like a C compiler for embedded chips would get that picky.)

Note that in your script, that is what is effectively done:


  retval(idx2) = pow2 (-1074);


The pow2 routine is called once, and then that value is assigned to all the
appropriate elements.

So, in the C++ loop, the preference is to compute pow (2.0, -1074e0) prior
to the loop, store it in a local variable, and then re-use that variable.  Or
even better, try to make it a static variable so that the value of
pow(2.0,-1074e0) resides in memory and using it is simply a very fast machine
load instruction.  The compiler might not like

static double smallpow = pow(2.0,-1074e0);

however, so you might need to be creative in how to generate that value at
compile time.
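
A minimal sketch of hoisting that constant, assuming IEEE 754 doubles.  The
function name eps_array and the use of std::vector are hypothetical stand-ins
for the Octave array code, and the NaN/Inf handling of the real routine is
left out.  On such platforms std::numeric_limits<double>::denorm_min () is a
constexpr equal to 2^-1074, so it is one possible way to get the value at
compile time without calling pow at all:

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the eps() loop; not the actual Octave source.
// Assumes IEEE 754 doubles and finite inputs (NaN/Inf handling omitted).
std::vector<double> eps_array (const std::vector<double>& x)
{
  // Constant hoisted out of the loop.  denorm_min () is constexpr and is
  // 2^-1074 for IEEE 754 doubles, the same value as pow (2.0, -1074e0).
  constexpr double smallpow = std::numeric_limits<double>::denorm_min ();

  std::vector<double> epsval (x.size ());
  for (std::size_t i = 0; i < x.size (); i++)
    {
      const double val = std::fabs (x[i]);
      if (val < std::numeric_limits<double>::min ())
        epsval[i] = smallpow;   // re-use the constant, no function call
      else
        {
          int expon;
          std::frexp (val, &expon);
          epsval[i] = std::pow (2.0, static_cast<double> (expon - 53));
        }
    }
  return epsval;
}
```

Calling eps_array on a vector holding 1.0 and 0.0 should give the machine
epsilon for the first element and the smallest denormal for the second.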

Also, about these pow2() and log2() functions:  You may want to check where
they are coming from.  They may be noticeably faster implementations than the
general routines:

pow (2.0, -1074e0);
octave::math::frexp (val, &expon);
std::pow (2.0,static_cast<double> (expon - 53));

For example, power-of-two computations can probably be done rather quickly in
integer math first, with a single-instruction bit shift.  (Above, there is a
cast from integer to double, then two doubles as operands, and I'm sure
general pow has much more work to do than simple bit shifts.)  If so, try
substituting pow() with that routine specific to base-2 logarithmic
computations.
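
One base-2 routine that is in the C++ standard library is std::ldexp, which
computes x * 2^n from an integer exponent.  A minimal sketch, assuming it is
acceptable to swap it in for the pow call (pow2_int and eps_normal are
hypothetical helper names, and whether this is actually faster on a given
platform would have to be measured):

```cpp
#include <cmath>

// Hypothetical helper: 2^n for an integer n without the general pow ().
// std::ldexp (x, n) returns x * 2^n; the exponent stays an int, so there
// is no int-to-double cast.
inline double pow2_int (int n)
{
  return std::ldexp (1.0, n);
}

// Spacing of doubles around a normal, finite, nonzero val, mirroring
// std::pow (2.0, static_cast<double> (expon - 53)).
inline double eps_normal (double val)
{
  int expon;
  std::frexp (std::fabs (val), &expon);
  return pow2_int (expon - 53);
}
```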

Another possible little trick (whose performance probably depends on the
nature of the data) would be to retain "expon" and "pow2(expon)" then reuse...
hold on, why even do that?  Take a look at this particular instruction:

std::pow (2.0,static_cast<double> (expon - 53));

The left argument is fixed (2.0, i.e., 2), and I believe expon is an integer
value with a fairly limited range, -128 to 128, am I right?  If so, the above
line is ultimately a fairly short look-up table such that

std::pow (2.0,static_cast<double> (expon - 53));

becomes

pow2lut[expon - 53 + 128];

or whatever the proper offsets depending on the possible range of expon.

One could generate the LUT at startup, the first time this routine is called,
or during compilation by writing out the LUT


double pow2lut[256] = {
  123456677123412423e-1234,
  988876765443332222e-1231,
...
};


where the numbers could come from Octave, generated using full format
precision.  The latter is a little dodgy for a platform portable program,
probably only something I'd do for highly-controlled embedded code.  I think
generating such a LUT the first time the routine is used though is fine.
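
A minimal sketch of the first-use variant, assuming IEEE 754 doubles.  One
caveat on the range: for normal doubles frexp returns exponents from -1021 to
1024 rather than -128 to 128, so the table here has 2046 entries (about 16 kB)
instead of 256.  The names pow2_lut, expon_min, expon_max and lut_size are
hypothetical, and std::ldexp is used only to fill the table:

```cpp
#include <array>
#include <cmath>
#include <limits>

// Assumed exponent range: for normal doubles, frexp yields expon in
// [min_exponent, max_exponent], i.e. [-1021, 1024] for IEEE 754.
constexpr int expon_min = std::numeric_limits<double>::min_exponent;
constexpr int expon_max = std::numeric_limits<double>::max_exponent;
constexpr int lut_size = expon_max - expon_min + 1;

// Hypothetical lookup replacing
//   std::pow (2.0, static_cast<double> (expon - 53));
// expon must lie in [expon_min, expon_max]; no bounds check is done.
inline double pow2_lut (int expon)
{
  // Function-local static: filled once, on the first call.  Since C++11
  // this initialization is thread-safe.
  static const std::array<double, lut_size> lut = [] {
    std::array<double, lut_size> t {};
    for (int e = expon_min; e <= expon_max; e++)
      t[e - expon_min] = std::ldexp (1.0, e - 53);
    return t;
  } ();

  return lut[expon - expon_min];
}
```

Because the table is built by the running program, it does not depend on
decimal constants pasted in from another platform.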

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?50561>




