octave-maintainers

Re: 3D versus 2D Indexing and the Speed Thereof


From: John W. Eaton
Subject: Re: 3D versus 2D Indexing and the Speed Thereof
Date: Mon, 9 Apr 2007 20:01:18 -0400

On  6-Apr-2007, Luis F. Ortiz wrote:

| 1)  One of the methods patched is assign2(). It has the following
| signature:
| 
|         template <class LT, class RT>
|         int
|         assign2 (Array<LT>& lhs, const Array<RT>& rhs, const LT& rfv)
| 
| This seems to me to be an attempt to support type conversions during
| the assignment.  But the code I wrote only works for the case where
| RT and LT are the same type.  What is the right way to handle this?
| Can it be done at runtime/compiletime?  Is this ever instantiated
| with LT != RT?

I thought about this a bit more and came up with the following as a
possible solution.

Your copy_strips function is:

  template <class T>
  void
  Array<T>::copy_strips (const Array<T>& source,
                         octave_idx_type dest_offset,
                         octave_idx_type source_offset,
                         octave_idx_type element_count,
                         octave_idx_type block_count,
                         octave_idx_type source_stride,
                         octave_idx_type dest_stride)
  {
    T *raw_source, *raw_dest;

    // First do one element to force the copy-on-write
    elem(dest_offset) = source.elem (source_offset);
    raw_source = &(source.rep->data[source_offset]);
    raw_dest = &(rep->data[dest_offset] );

    for (octave_idx_type i = 0; i < block_count; i++)
     {
       memcpy (raw_dest, raw_source, sizeof(T)*element_count);
       raw_source += source_stride;
       raw_dest += dest_stride;
     }
  }

I think it could be done with something like this (pushing the
actual work down to the Array<T>::ArrayRep level):

  template <class T> class Array
  {
    ...

    class ArrayRep
    {
      ...

      // Generic mixed-type copy-strips function:
      template <class U>
      void
      copy_strips (const Array<U>& source,
                   octave_idx_type dest_offset,
                   octave_idx_type source_offset,
                   octave_idx_type element_count,
                   octave_idx_type block_count,
                   octave_idx_type source_stride,
                   octave_idx_type dest_stride)
      {
        make_unique ();

        const U *source_data = source.data ();
        const U *raw_source = &source_data[source_offset];
        T *raw_dest = &data[dest_offset];

        for (octave_idx_type i = 0; i < block_count; i++)
         {
           for (octave_idx_type j = 0; j < element_count; j++)
             raw_dest[j] = raw_source[j];

           raw_source += source_stride;
           raw_dest += dest_stride;
         }
      }

      // Non-template overload of the mixed-type copy-strips function
      // for the case of LHS type == RHS type (only really necessary if
      // it actually makes things faster to use memcpy).  Overload
      // resolution prefers it to the template when the types match:
      void
      copy_strips (const Array<T>& source,
                   octave_idx_type dest_offset,
                   octave_idx_type source_offset,
                   octave_idx_type element_count,
                   octave_idx_type block_count,
                   octave_idx_type source_stride,
                   octave_idx_type dest_stride)
      {
        make_unique ();

        const T *source_data = source.data ();
        const T *raw_source = &source_data[source_offset];
        T *raw_dest = &data[dest_offset];

        for (octave_idx_type i = 0; i < block_count; i++)
         {
           memcpy (raw_dest, raw_source, sizeof(T)*element_count);

           raw_source += source_stride;
           raw_dest += dest_stride;
         }
      }

      ...

    }; /* class ArrayRep */

    ...

    template <class U>
    void
    copy_strips (const Array<U>& source,
                 octave_idx_type dest_offset,
                 octave_idx_type source_offset,
                 octave_idx_type element_count,
                 octave_idx_type block_count,
                 octave_idx_type source_stride,
                 octave_idx_type dest_stride)
    {
      rep->copy_strips (source, dest_offset, source_offset,
                        element_count, block_count, source_stride,
                        dest_stride);
    }

    ...

  }; /* class Array */


I haven't actually tried this code yet with the Array class, but I
think it should work.  Here is a very simple and complete example of
the same kind of thing that appears to work correctly for me with g++
3.4 and 4.1:

  #include <iostream>

  template <class T> struct foo
  {
    struct foo_rep
    {
      template <class U>
      void doit (U) { std::cerr << "mixed" << std::endl; }

      void doit (T) { std::cerr << "same" << std::endl; }
    };

    foo (void) : rep (new foo_rep ()) { }

    template <class U>
    void doit (U x) { rep->doit (x); }

    foo_rep *rep;
  };

  int
  main (void)
  {
    double y = 0;
    int z = 0;

    foo<double> x;
    x.doit (y);
    x.doit (z);

    return 0;
  }
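
Applying the same overload trick to the strip copying itself might
look something like the sketch below.  It is only an illustration
under some assumptions: strip_buffer is a hypothetical stand-in for
Array<T>::ArrayRep backed by std::vector, the last_path member exists
only to show which version was chosen, and there is no copy-on-write
handling at all.  The caller is assumed to pass offsets, counts, and
strides that stay within both buffers.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for Array<T>::ArrayRep, backed by std::vector.
// The names strip_buffer and last_path are illustrative only.
template <class T>
struct strip_buffer
{
  std::vector<T> data;
  int last_path;   // 0 = nothing copied yet, 1 = generic loop, 2 = memcpy

  explicit strip_buffer (std::size_t n) : data (n), last_path (0) { }

  // Generic mixed-type version: converts element by element.
  template <class U>
  void
  copy_strips (const std::vector<U>& source,
               std::size_t dest_offset, std::size_t source_offset,
               std::size_t element_count, std::size_t block_count,
               std::size_t source_stride, std::size_t dest_stride)
  {
    last_path = 1;

    for (std::size_t i = 0; i < block_count; i++)
      {
        std::size_t s = source_offset + i * source_stride;
        std::size_t d = dest_offset + i * dest_stride;

        for (std::size_t j = 0; j < element_count; j++)
          data[d + j] = static_cast<T> (source[s + j]);
      }
  }

  // Non-template overload, chosen when the source element type is T.
  // memcpy is only safe here if T is trivially copyable.
  void
  copy_strips (const std::vector<T>& source,
               std::size_t dest_offset, std::size_t source_offset,
               std::size_t element_count, std::size_t block_count,
               std::size_t source_stride, std::size_t dest_stride)
  {
    last_path = 2;

    if (element_count == 0)
      return;

    for (std::size_t i = 0; i < block_count; i++)
      {
        std::size_t s = source_offset + i * source_stride;
        std::size_t d = dest_offset + i * dest_stride;

        std::memcpy (&data[d], &source[s], sizeof (T) * element_count);
      }
  }
};
```

With a strip_buffer<double>, passing a std::vector<int> source can
only match the template, while passing a std::vector<double> source
matches both and the non-template overload wins, the same way the
"same" case is picked in the foo example.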


jwe

