octave-maintainers

Re: xelem vs. operator ()


From: Julien Bect
Subject: Re: xelem vs. operator ()
Date: Fri, 2 Sep 2016 09:57:04 +0200

On 01/09/2016 at 17:42, Rik wrote:
On 09/01/2016 03:38 AM, address@hidden wrote:
> Subject: A question about NDArray, xelem and operator ()
> From: Julien Bect <address@hidden>
> Date: 09/01/2016 01:40 AM
> To: octave-maintainers <address@hidden>
>
> Hi all,
>
> While I was working on the GSL package the following question arose:
>
> what is the difference between using A(i) and A.xelem(i) when I need to read a value from an array A in an oct-file?
>
> (A is an NDArray and i is of type octave_idx_type)
>
> @++
> Julien

Julien,

Generally you should use the operator syntax A(i) within an oct-file to access an element.  The xelem() method is for raw access to the underlying data with absolutely no checking whatsoever: no bounds checking for indices outside the size of the Array, and no check for a reference count > 1 (i.e., that someone else holds a shared copy of this data).  This is okay in the Octave core when we want higher performance AND can guarantee that it is safe to use.  Dynamically linked oct-files have great potential for destabilizing Octave, since we have no control over how careful the programmer is with memory references, etc.
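To make the distinction concrete, here is a minimal standalone sketch of a reference-counted, copy-on-write array. ToyArray is a hypothetical class written for this illustration; it is not Octave's Array<T>, and only the method names mirror the ones discussed above. It assumes the behaviour described in the paragraph: the checked accessor validates the index and detaches from other owners before returning a writable reference, while xelem() does neither.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy model of a reference-counted, copy-on-write array.
// NOT Octave's Array<T>; the names mirror it for illustration only.
class ToyArray
{
public:
  explicit ToyArray (std::size_t n, double fill = 0.0)
    : m_rep (std::make_shared<std::vector<double>> (n, fill))
  { }

  // Copying shares the storage, so the reference count goes up.
  ToyArray (const ToyArray&) = default;
  ToyArray& operator = (const ToyArray&) = default;

  std::size_t numel () const { return m_rep->size (); }

  // Checked read: sharing is harmless here, so no copy is made.
  double operator () (std::size_t i) const
  {
    assert (i < numel ());
    return (*m_rep)[i];
  }

  // Checked write access: validate the index, then detach from any
  // other owner before handing out a mutable reference.
  double& operator () (std::size_t i)
  {
    assert (i < numel ());
    make_unique ();
    return (*m_rep)[i];
  }

  // Raw access: no bounds check and no reference-count check.
  double& xelem (std::size_t i) { return (*m_rep)[i]; }

  bool is_shared () const { return m_rep.use_count () > 1; }

private:
  void make_unique ()
  {
    if (m_rep.use_count () > 1)
      m_rep = std::make_shared<std::vector<double>> (*m_rep);
  }

  std::shared_ptr<std::vector<double>> m_rep;
};
```

In this model, writing through operator () on a copy leaves the original untouched, whereas writing through xelem () on a shared copy silently changes every array that shares the storage. That is the kind of surprise the warning about oct-files refers to.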

If you intend to iterate over every element in an array and possibly change it, then there is a slight performance hit (~10%) to using a straight for loop:

for (octave_idx_type i = 0; i < x.numel (); i++)
  x(i) += 1;

In that case, you are better off getting a pointer to the actual storage with data() and working on that.  But optimization should always come after code correctness.  The code above is easy to understand, and most of the time a 10% overhead is not the bottleneck.
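As a rough sketch of the pointer-based variant, the loop body can be written against a raw pointer and a length that are both fetched once, before the loop. The helper name add_to_all is made up for this example. The assumption is that, in an oct-file, the writable pointer would come from a call such as x.fortran_vec () (the accessor used in the map() code quoted in this message) and the length from x.numel ().

```cpp
#include <cassert>
#include <cstddef>

// Add `delta` to each of the `n` values starting at `p`.
// Hypothetical helper: in an oct-file, `p` and `n` would be obtained
// once from the array (pointer to its storage, element count) so that
// no per-element accessor call is made inside the loop.
inline void
add_to_all (double *p, std::ptrdiff_t n, double delta)
{
  assert (p != nullptr || n == 0);

  for (std::ptrdiff_t i = 0; i < n; i++)
    p[i] += delta;
}
```

The trade-off is the one described above: the pointer version avoids the per-element accessor overhead, but nothing stops the loop from running past the end of the storage if `n` is wrong.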

For an example of the optimized approach, see the map() function in Array.h which calls a supplied function (such as cos, sin, etc.) on every element.  This has an additional optimization of working on a stride of 4 elements each time in order to limit the overhead of calling octave_quit () for every single element in the array.

//! Apply function fcn to each element of the Array<T>.  This function
//! is optimized with a manually unrolled loop.
template <typename U, typename F>
Array<U>
map (F fcn) const
{
  octave_idx_type len = numel ();

  const T *m = data ();

  Array<U> result (dims ());
  U *p = result.fortran_vec ();

  octave_idx_type i;
  for (i = 0; i < len - 3; i += 4)
    {
      octave_quit ();

      p[i] = fcn (m[i]);
      p[i+1] = fcn (m[i+1]);
      p[i+2] = fcn (m[i+2]);
      p[i+3] = fcn (m[i+3]);
    }

  octave_quit ();

  for (; i < len; i++)
    p[i] = fcn (m[i]);

  return result;
}
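The same blocking pattern can be tried outside Octave. The sketch below is a standalone adaptation that uses std::vector instead of Array<T>. The names unrolled_map and CheckCounter are hypothetical, and CheckCounter is only a stand-in for octave_quit () that counts how often the check runs. It is meant to show the stride-of-4 structure, not to reproduce the library code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for octave_quit (): counts how often it is called.
struct CheckCounter
{
  std::size_t calls = 0;
  void operator () () { ++calls; }
};

// Apply fcn to every element of `in`, running the interruption check
// once per block of four elements instead of once per element.
template <typename U, typename T, typename F, typename Check>
std::vector<U>
unrolled_map (const std::vector<T>& in, F fcn, Check& check)
{
  const std::ptrdiff_t len = static_cast<std::ptrdiff_t> (in.size ());
  const T *m = in.data ();

  std::vector<U> result (in.size ());
  U *p = result.data ();

  // Signed index: len - 3 must be able to go negative for short inputs.
  std::ptrdiff_t i;
  for (i = 0; i < len - 3; i += 4)
    {
      check ();

      p[i] = fcn (m[i]);
      p[i+1] = fcn (m[i+1]);
      p[i+2] = fcn (m[i+2]);
      p[i+3] = fcn (m[i+3]);
    }

  check ();

  // Remaining 0 to 3 elements that did not fill a whole block.
  for (; i < len; i++)
    p[i] = fcn (m[i]);

  return result;
}
```

For a 10-element input the check runs three times (two full blocks plus the one call before the tail loop) rather than ten. Note that an unsigned index type would make `len - 3` wrap around for inputs shorter than three elements, which is why the sketch keeps a signed type, as octave_idx_type does in the original.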

Thank you, Rik, for the explanation.  (Remark: perhaps some of this could go into the doxygen documentation somewhere?)

For the gsl package I have chosen the optimized version, since the code in question is a generic wrapper (the overhead might be negligible for some heavy special functions, but not so negligible for faster ones).

The use case is very similar to the map function above, so I don't think it is "risky".

@++
Julien


