octave-maintainers

Re: Replace OCTAVE_LOCAL_BUFFER implementation with std::unique_ptr?


From: Daniel J Sebald
Subject: Re: Replace OCTAVE_LOCAL_BUFFER implementation with std::unique_ptr?
Date: Fri, 21 Jul 2017 11:34:47 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

On 07/21/2017 10:43 AM, Daniel J Sebald wrote:
On 07/21/2017 10:17 AM, Rik wrote:
On 07/21/2017 01:24 AM, Daniel J Sebald wrote:
On 07/20/2017 11:31 PM, Rik wrote:
On 07/20/2017 04:39 PM, John W. Eaton wrote:
[snip]
But this change should only be made in cases where the size really is a constant.

There were only a few instances of static buffers which I changed to use local variables in this cset (http://hg.savannah.gnu.org/hgweb/octave/rev/cda0614beaec).

Rik, at the bottom of this changeset is

+      octave_idx_type ii = 0;
+      octave_idx_type jj;
       for (jj = 0; jj < (nc - 8 + 1); jj += 8)
         {
           for (ii = 0; ii < (nr - 8 + 1); ii += 8)

[snip]

Just curious about the use of a function pointer and call, and how efficient that is.  For example, complex conjugation can probably be done in just a few instruction cycles, whereas repeatedly pushing a variable onto the stack and jumping to a function can consume much more.  Or does the template construct deal with all this, i.e., is the T *fcn() treated more like an inline?

There are two different cases here,

  if (nr >= 8 && nc >= 8)

Wouldn't the former case still work if nr or nc were less than 8?

That's an interesting motivation, to avoid cache jumping.  Wouldn't this
idea still apply if say, nc = 3 and nr = 1000000?

This 8x8 construct has got me thinking a bit.  It's an interesting idea, very parallel-processing oriented in a way, but I'm curious how efficient it is.  There is still big spacing in this loop

              for (octave_idx_type j = jj, k = 0, idxj = jj * nr;
                   j < jj + 8; j++, idxj += nr)

if nr is extremely large. Isn't it normally the case that memory is packed along one dimension versus the other, e.g.,

matrix-index : memory-index
(0, 0)       :       0
(1, 0)       :       1
(2, 0)       :       2
...
(999999, 0)  :  999999
(0, 1)       : 1000000
(1, 1)       : 1000001
(2, 1)       : 1000002
...
etc.

In other words, we want to make sure the inner index always runs in the direction in which memory is packed contiguously.  So, rather than a block that spans 8 rows and 8 columns, don't we want a block that spans 64 rows (or fewer) and just one column?  That way we are more likely to stay within cache memory.  Am I thinking about this correctly?

Another thought is that rather than placing that function operation within the loop (if that fcn is in fact compiled as a stack/jump/return sequence):

                  result.xelem (j + idxi) = fcn (buf[k]);

maybe it would make more sense to first copy the matrix as a transpose, and then call a function taking NR and NC as arguments so that the called function can do a simple double loop through the whole array with a register-based equivalent of fcn, whether that is complex conjugation or whatnot.

Dan


