Re: octave "Most Wanted" feature
From: robert bristow-johnson
Subject: Re: octave "Most Wanted" feature
Date: Mon, 27 Nov 2006 20:55:17 -0500
> ----- Original Message -----
> From: "Tom Holroyd (NIH/NIMH) [E]" <address@hidden>
> To: "robert bristow-johnson" <address@hidden>
> Subject: Re: octave "Most Wanted" feature
> Date: Mon, 27 Nov 2006 10:11:04 -0500
>
>
> The correct technical term is Index Origin.
That's good to know. I have heard multiple terms for it (and had used "Index
Base" instead), but "Index Origin" sounds like the best term.
> What would be involved in allowing both??
> Could you allow,
>
> > octave_index_origin (0);
> or
> > octave_index_origin (1); (the default)
or even
> octave_index_origin (-256);
But there are multiple dimensions (often two) in the array (or "matrix" as
MATLAB and Octave like to call it, but for 3+ dimensions, I don't think
"matrix" is the best term for a MATLAB/Octave variable) and there should be an
index_origin property for each dimension. I really think that the syntax of
this would be closest to the existing size() and reshape() functions found in
both MATLAB and Octave:
________________
-- Built-in Function: size(A, N)
-- Built-in Function: size(A)
Return the number of rows and columns of A.
With one input argument and one output argument, the result is
returned in a row vector. If there are multiple output arguments,
the number of rows is assigned to the first, and the number of
columns to the second, etc. For example,
size([1, 2; 3, 4; 5, 6])
=> [ 3, 2 ]
[nr, nc] = size([1, 2; 3, 4; 5, 6])
=> nr = 3
=> nc = 2
If given a second argument, `size' will return the size of the
corresponding dimension. For example
size([1, 2; 3, 4; 5, 6], 2)
=> 2
returns the number of columns in the given matrix.
-- Function File: reshape(A, M, N, ...)
-- Function File: reshape(A, SIZ)
Return a matrix with the given dimensions whose elements are taken
from the matrix A. The elements of the matrix are accessed in
column-major order (like Fortran arrays are stored).
For example,
reshape([1, 2, 3, 4], 2, 2)
=> 1 3
2 4
Note that the total number of elements in the original matrix must
match the total number of elements in the new matrix.
A single dimension of the return matrix can be unknown and is
flagged by an empty argument.
________________
Note that (provided we don't violate the restrictions in reshape()), after
executing
siz1 = size( reshape(A, siz2) );
the vectors siz1 and siz2 are equal. Likewise, after executing
A = reshape(A, size(B));
A and B have the same shape. size() and reshape() are related in this way.
But in this proposed case with the origin of the different array dimensions,
the functions (and restrictions) would be a little different. Since "base"
might not be as good as "origin" for this parameter, let's call these two
proposed functions:
origin(A, N)
origin(A)
and
reorigin(A, io1, io2, ...)
reorigin(A, ORIG)
So origin(A) would return a row vector (as size(A) does) whose elements are the
Index Origins of each dimension, in the same order as size(). And
reorigin(A, io1, io2, io3, ...) would assign io1 to the Index Origin of
dimension 1 (the row index), io2 to the Index Origin of dimension 2 (the column
index), io3 to the Index Origin of dimension 3 (if it exists), etc. So after
executing
orig1 = origin( reorigin(A, orig2) );
the row vectors orig1 and orig2 would be equal, and after executing
A = reorigin(A, origin(B));
the origins of the corresponding dimensions of A and B would be the same.
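To make the intended semantics concrete, here is a minimal C sketch of the two proposed functions acting on an array header. Everything in it is hypothetical: the array_header type, the MAX_DIMS limit, and the split of origin(A, N) into a separate origin_of_dim() are assumptions for illustration, not anything that exists in Octave.

```c
#include <assert.h>
#include <string.h>

#define MAX_DIMS 8  /* arbitrary limit chosen for this sketch */

/* Hypothetical array header: only the fields needed to show the idea. */
typedef struct {
    int  num_dimensions;
    long size[MAX_DIMS];          /* extent of each dimension */
    long index_origin[MAX_DIMS];  /* first valid subscript of each dimension */
} array_header;

/* origin(A): copy the per-dimension Index Origins into out[]. */
void origin(const array_header *a, long *out)
{
    memcpy(out, a->index_origin, (size_t)a->num_dimensions * sizeof(long));
}

/* origin(A, N): origin of dimension n, counted from 1 as in size(A, N). */
long origin_of_dim(const array_header *a, int n)
{
    assert(n >= 1 && n <= a->num_dimensions);
    return a->index_origin[n - 1];
}

/* reorigin(A, ORIG): return a copy of the header with new origins.
   The sizes (and, in a full implementation, the data) are left untouched. */
array_header reorigin(array_header a, const long *new_origin)
{
    for (int d = 0; d < a.num_dimensions; d++)
        a.index_origin[d] = new_origin[d];
    return a;
}
```

With this model, calling origin() on the result of reorigin(A, orig2) hands back orig2 unchanged, which is the same round-trip relationship that size() and reshape() have.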
Does this (plus the previous post) clearly explain the function?
> or would this require rewriting Octave?
Yes, it requires rewriting a few parts deep inside of Octave (but in such a way
as to preserve backward compatibility, which we all should consider to be
super-important). I tried to explain this before. Here it is again, but I'll
try to focus on what the change would be. I do not know where to even begin to
find this in the Octave code, but imagining that Octave is written in C or C++,
a single Octave variable would have a structure definition that might look
something like this, with similar fields:
enum octave_class {text, real, complex};
    // I don't wanna cloud the issue considering other classes.
typedef struct
{
    void* data;              // pointer to actual array data
    char* name;              // pointer to the variable's name
    enum octave_class type;  // class of Octave variable (real, complex,...)
    int num_dimensions;      // number of array dimensions >= 2
    long* size;              // points to a vector with the number of rows, columns, etc.
} octave_variable;
name[32];                    // Octave names are unique to 31 chars
size[num_dimensions];
if (type == text)
{
    char data[size[0]*size[1]*...*size[num_dimensions-1]];
}
else if (type == real)
{
    double data[size[0]*size[1]*...*size[num_dimensions-1]];
}
else if (type == complex)
{
    double data[2][size[0]*size[1]*...*size[num_dimensions-1]];
}
When an element, A(n,k), of a 2-dimensional Octave array A is accessed, first n
and k are confirmed to be integer-valued (not a problem in C), then confirmed to
be positive and less than or equal to size[0] (or size(A,1) in Octave) and
size[1] (or size(A,2) in Octave), respectively. If those constraints are
satisfied, the value of that element is accessed as:
data[(k-1)*size[0] + (n-1)];
For a 3 dimensional array, A(m,n,k), it would be the same but now:
data[((k-1)*size[1] + (n-1))*size[0] + (m-1)];
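The 2- and 3-dimensional expressions above follow one pattern that extends to any number of dimensions. Here is a minimal C sketch of it, bounds check included. The function name offset_1based and the -1 error return are assumptions made for illustration; this is not Octave's actual indexing code.

```c
#include <assert.h>

/* Hypothetical helper: column-major offset of the 1-based subscript
   idx[0..ndims-1] into an array whose extents are size[0..ndims-1].
   Returns -1 if any subscript is outside 1..size[d]. */
long offset_1based(const long *size, int ndims, const long *idx)
{
    assert(ndims >= 1);
    long off = 0;
    /* Accumulate from the last dimension down to the first, so that the
       first subscript varies fastest (Fortran / column-major order). */
    for (int d = ndims - 1; d >= 0; d--) {
        if (idx[d] < 1 || idx[d] > size[d])
            return -1;
        off = off * size[d] + (idx[d] - 1);
    }
    return off;
}
```

For a 2x3x4 array, A(2,1,3) gives ((3-1)*3 + (1-1))*2 + (2-1) = 13, which matches the 3-dimensional expression above with m = 2, n = 1, k = 3.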
Now how Octave needs to be modified is this: A new field, a vector of the same
length as the size[] vector (one element for each dimension of the Octave
variable) is added to the structure as:
typedef struct
{
    void* data;              // pointer to actual array data
    char* name;              // pointer to the variable's name
    enum octave_class type;  // class of Octave variable (real, complex,...)
    int num_dimensions;      // number of array dimensions >= 2
    long* size;              // points to a vector with the number of rows, columns, etc.
    long* index_origin;      // points to a vector with the index_origins
} octave_variable;
name[32];                    // Octave names are unique to 31 chars
size[num_dimensions];
index_origin[num_dimensions]; // the Index Origins for each dim
And each element of index_origin[] is initialized to the number 1 when a new
Octave variable is created. Then those constants "1" that are subtracted out
of m, n, and k to point to the actual data need to be replaced with
index_origin[0], index_origin[1], and index_origin[2], respectively. For a
2-dimensional array, A(n,k), it's
data[(k-index_origin[1])*size[0] + (n-index_origin[0])];
For a 3 dimensional array, A(m,n,k):
data[((k-index_origin[2])*size[1] + (n-index_origin[1]))*size[0] +
(m-index_origin[0])];
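Generalized to any number of dimensions, the modified lookup could be sketched in C as follows. The array_shape type and the function name offset_with_origin are hypothetical, loosely modeled on the struct above, and the range check (subscripts from index_origin[d] through index_origin[d] + size[d] - 1) is my assumption about how the existing bounds test would carry over.

```c
#include <assert.h>

/* Hypothetical shape descriptor, modeled on the struct sketched above. */
typedef struct {
    int         num_dimensions;
    const long *size;          /* extent of each dimension */
    const long *index_origin;  /* first valid subscript of each dimension */
} array_shape;

/* Column-major offset of subscript idx[]; returns -1 when out of range. */
long offset_with_origin(const array_shape *a, const long *idx)
{
    assert(a->num_dimensions >= 1);
    long off = 0;
    for (int d = a->num_dimensions - 1; d >= 0; d--) {
        /* Position relative to the origin replaces the old (idx - 1). */
        long rel = idx[d] - a->index_origin[d];
        if (rel < 0 || rel >= a->size[d])
            return -1;
        off = off * a->size[d] + rel;
    }
    return off;
}
```

When every index_origin[d] is 1, rel is just idx[d] - 1 and the result is identical to today's 1-based lookup, which is the backward-compatibility property the proposal depends on.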
Then, in the other post, I tried to spell out, for the salient cases (except
for Matrix Division using / or \), how certain common Octave operations (Matrix
addition, Matrix multiplication, Matrix concatenation, functions returning
indices like max(), min(), find(), elementary functions of an array, etc.)
would be extended so that they would still work exactly the same for 1-Origin
arrays, yet still have meaning for arrays with a different Index Origin. I
didn't mention it, but I would also right away create (with different names so
as to not break backward compatibility) new functions for fft() (call it
"dft()") and conv() (call it "convolve()") so that zero-Origin or
negative-Origin arrays can be used in a way that is compatible with all of the
textbooks and published literature (no one does the DFT or discrete convolution
with the MATLAB indexing convention, which is why I know it is just wrong).
--
r b-j address@hidden
"Imagination is more important than knowledge."