octave-maintainers

Re: octave "Most Wanted" feature


From: robert bristow-johnson
Subject: Re: octave "Most Wanted" feature
Date: Mon, 27 Nov 2006 20:55:17 -0500

> ----- Original Message -----
> From: "Tom Holroyd (NIH/NIMH) [E]" <address@hidden>
> To: "robert bristow-johnson" <address@hidden>
> Subject: Re: octave "Most Wanted" feature
> Date: Mon, 27 Nov 2006 10:11:04 -0500
> 
> 
> The correct technical term is Index Origin.

That's good to know.  I have heard multiple terms for it (and had used "Index 
Base" instead), but "Index Origin" sounds like the best term.

> What would be involved in allowing both??


> Could you allow,
> 
>       > octave_index_origin (0);
> or
>       > octave_index_origin (1);      (the default)

or even 

        > octave_index_origin (-256);

But there are multiple dimensions (often two) in the array (or "matrix" as 
MATLAB and Octave like to call it, but for 3+ dimensions, I don't think 
"matrix" is the best term for a MATLAB/Octave variable) and there should be an 
index_origin property for each dimension.  I really think that the syntax of 
this would be closest to the existing size() and reshape() functions found in 
both MATLAB and Octave:

________________

 -- Built-in Function:  size(A, N)
 -- Built-in Function:  size(A)
     Return the number of rows and columns of A.

     With one input argument and one output argument, the result is
     returned in a row vector.  If there are multiple output arguments,
     the number of rows is assigned to the first, and the number of
     columns to the second, etc.  For example,

          size([1, 2; 3, 4; 5, 6])
               => [ 3, 2 ]

          [nr, nc] = size([1, 2; 3, 4; 5, 6])
               => nr = 3
               => nc = 2

     If given a second argument, `size' will return the size of the
     corresponding dimension.  For example

          size([1, 2; 3, 4; 5, 6], 2)
               => 2

     returns the number of columns in the given matrix.



 -- Function File:  reshape(A, M, N, ...)
 -- Function File:  reshape(A, SIZ)
     Return a matrix with the given dimensions whose elements are taken
     from the matrix A.  The elements of the matrix are accessed in
     column-major order (like Fortran arrays are stored).

     For example,

          reshape([1, 2, 3, 4], 2, 2)
               =>  1  3
                   2  4

     Note that the total number of elements in the original matrix must
     match the total number of elements in the new matrix.

     A single dimension of the return matrix can be unknown and is
     flagged by an empty argument.

________________

Note that (provided we don't violate the restrictions in reshape()) after 
executing

     siz1 = size( reshape(A, siz2) );

the vectors siz1 and siz2 are equal.  Likewise, after executing

     A = reshape(A, size(B));

A and B have the same shape.  size() and reshape() are related in this way.

But in this proposed case with the origin of the different array dimensions, 
the functions (and restrictions) would be a little different.  Since "base" 
might not be as good as "origin" for this parameter, let's call these two 
proposed functions:

       origin(A, N)
       origin(A)

and

       reorigin(A, io1, io2, ...)
       reorigin(A, ORIG)

So origin(A) would return a row vector (just as size(A) does) whose elements 
are the Index Origins of each dimension.  And reorigin(A, io1, io2, io3, ...) 
would assign io1 to the Index Origin of dimension 1 (the row index), io2 to the 
Index Origin of dimension 2 (the column index), io3 to the Index Origin of 
dimension 3 (if it exists), etc.  So after

       orig1 = origin( reorigin(A, orig2) );

the row vectors orig1 and orig2 would be equal, and after

       A = reorigin(A, origin(B));

the origins of the corresponding dimensions of A and B would be the same.
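
The round-trip property above can be sketched in C.  This is only an 
illustration under stated assumptions: the `array_header` struct, the 
`MAX_DIMS` limit, and the C signatures of `origin` and `reorigin` are 
hypothetical, and the sketch assumes that changing origins never touches the 
sizes or the data.

```c
#include <assert.h>
#include <string.h>

#define MAX_DIMS 8   /* arbitrary limit chosen for this sketch */

/* Hypothetical, stripped-down array header: shape and origins only. */
typedef struct {
    int  num_dimensions;
    long size[MAX_DIMS];          /* number of elements per dimension */
    long index_origin[MAX_DIMS];  /* first valid index per dimension  */
} array_header;

/* origin(A): copy the Index Origin of every dimension into out[]. */
void origin(const array_header *a, long *out)
{
    memcpy(out, a->index_origin,
           (size_t)a->num_dimensions * sizeof(long));
}

/* reorigin(A, ORIG): return a copy of the header with new origins.
   Assumption: the sizes are left untouched, so no element-count
   restriction like the one in reshape() is needed here. */
array_header reorigin(array_header a, const long *new_origin)
{
    assert(a.num_dimensions >= 1 && a.num_dimensions <= MAX_DIMS);
    memcpy(a.index_origin, new_origin,
           (size_t)a.num_dimensions * sizeof(long));
    return a;
}
```

With a 3-by-2 header whose origins are {1, 1}, calling reorigin with {0, -256} 
and then origin on the result gives back {0, -256}, while the sizes stay at 
{3, 2}.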

Does this (plus the previous post) explain the function clearly?


> or would this require rewriting Octave?

Yes, it requires rewriting a few parts deep inside of Octave (but in such a way 
as to preserve backward compatibility, which we all should consider to be 
super-important).  I tried to explain this before.  Here it is again, but I'll 
try to focus on what the change would be.  I do not know where to even begin to 
find this in the Octave code, but imagining that Octave is written in C or C++, 
a single Octave variable would have a structure definition that might look 
something like this, with similar fields:


enum octave_class {text, real, complex};
// I don't wanna cloud the issue by considering other classes.

typedef struct
 {
 void* data;     // pointer to actual array data
 char* name;     // pointer to the variable's name
 enum octave_class type; // class of Octave variable (real, complex,...)
 int num_dimensions;     // number of array dimensions, >= 2
 long* size; // points to a vector with the number of rows, columns, etc.
 } octave_variable;

// Storage the pointers refer to (pseudocode, not compilable C):
//
//   char name[32];              // Octave names are unique to 31 chars
//   long size[num_dimensions];
//
//   with N = size[0]*size[1]*...*size[num_dimensions-1]:
//
//   type == text:     char   data[N];
//   type == real:     double data[N];
//   type == complex:  double data[2][N];

When an element, A(n,k), of a 2 dimensional Octave array A is accessed, first n 
and k are confirmed to be integer valued (not a problem in C), then confirmed to 
be positive and less than or equal to size[0] (or size(A,1) in Octave) and 
size[1] (or size(A,2) in Octave), respectively.  If those constraints are 
satisfied, the value of that element is accessed as:

data[(k-1)*size[0] + (n-1)];

For a 3 dimensional array, A(m,n,k), it would be the same but now:

data[((k-1)*size[1] + (n-1))*size[0] + (m-1)];
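
The two expressions above, together with the range checks, can be written as 
small C helpers.  This is a minimal sketch, not Octave's actual code: the 
function names `offset_2d` and `offset_3d` and the convention of returning -1 
for an out-of-range index are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of A(n,k) in a column-major 2-D array with 1-based indices.
   size[0] is the number of rows, size[1] the number of columns.
   Returns -1 if either index is out of range. */
long offset_2d(const long size[2], long n, long k)
{
    assert(size != NULL);
    if (n < 1 || n > size[0] || k < 1 || k > size[1])
        return -1;
    return (k - 1) * size[0] + (n - 1);
}

/* Offset of A(m,n,k) in a column-major 3-D array with 1-based indices.
   Returns -1 if any index is out of range. */
long offset_3d(const long size[3], long m, long n, long k)
{
    assert(size != NULL);
    if (m < 1 || m > size[0] ||
        n < 1 || n > size[1] ||
        k < 1 || k > size[2])
        return -1;
    return ((k - 1) * size[1] + (n - 1)) * size[0] + (m - 1);
}
```

For a 3-by-2 array, A(1,1) maps to offset 0 and A(3,2) to offset 5, the last of 
the six elements.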

Now how Octave needs to be modified is this: A new field, a vector of the same 
length as the size[] vector (one element for each dimension of the Octave 
variable) is added to the structure as:

typedef struct
 {
 void* data;     // pointer to actual array data
 char* name;     // pointer to the variable's name
 enum octave_class type; // class of Octave variable (real, complex,...)
 int num_dimensions;     // number of array dimensions, >= 2
 long* size; // points to a vector with the number of rows, columns, etc.
 long* index_origin; // points to a vector with the index origins
 } octave_variable;

// Storage the pointers refer to (pseudocode, not compilable C):
//
//   char name[32];                      // Octave names are unique to 31 chars
//   long size[num_dimensions];
//   long index_origin[num_dimensions];  // the Index Origin for each dim


And each element of index_origin[] is initialized to the number 1 when a new 
Octave variable is created.  Then those constants "1" that are subtracted out 
of m, n, and k to point to the actual data need to be replaced with 
index_origin[0], index_origin[1], and index_origin[2], respectively.  (The 
range check changes accordingly: the index for dimension d must lie between 
index_origin[d] and index_origin[d] + size[d] - 1.)  For a 2-dim array, it's

data[(k-index_origin[1])*size[0] + (n-index_origin[0])];

For a 3 dimensional array, A(m,n,k):

data[((k-index_origin[2])*size[1] + (n-index_origin[1]))*size[0] + 
(m-index_origin[0])];
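
The 2-D and 3-D expressions above follow one pattern, which can be sketched for 
any number of dimensions.  This is an illustrative C helper under stated 
assumptions, not code from Octave: the name `offset_with_origin`, the idx[] 
argument holding one index per dimension, and the -1 return for an out-of-range 
index are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Column-major offset of the element addressed by idx[], where
   dimension d holds size[d] elements numbered from index_origin[d].
   Returns -1 if any index falls outside its dimension. */
long offset_with_origin(int num_dimensions, const long *size,
                        const long *index_origin, const long *idx)
{
    assert(num_dimensions >= 1 && size && index_origin && idx);
    long offset = 0;
    /* Accumulate from the last dimension down to the first, which
       reproduces ((k-o2)*size[1] + (n-o1))*size[0] + (m-o0) in 3-D. */
    for (int d = num_dimensions - 1; d >= 0; --d) {
        long rel = idx[d] - index_origin[d];  /* zero-based position */
        if (rel < 0 || rel >= size[d])
            return -1;
        offset = offset * size[d] + rel;
    }
    return offset;
}
```

With every origin set to 1 this gives the same offsets as the existing 1-based 
formulas, which is the backward-compatibility property the proposal depends on. 
With origins {0, -256} on a 3-by-2 array, the element at (0, -256) maps to 
offset 0 and the element at (2, -255) to offset 5.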


Then, in the other post, I tried to spell out for the salient cases (except for 
Matrix Division using / or \) how certain common Octave operations (Matrix 
addition, Matrix multiplication, Matrix concatenation, functions returning 
indices like max(), min(), find(), elementary functions of an array, etc.) 
would be extended so that they would still work exactly the same for 1-Origin 
arrays, yet still have meaning for arrays with a different Index Origin.  I 
didn't mention it, but I would also right away create (with different names so 
as to not break backward compatibility) new functions for fft() (call it 
"dft()") and conv() (call it "convolve()") so that zero-Origin or 
negative-Origin arrays can be used in a way that is compatible with all of the 
textbooks and published literature (no one does the DFT or discrete convolution 
with the MATLAB indexing convention, which is why I know it is just wrong).



--

r b-j                  address@hidden

"Imagination is more important than knowledge."



