octave-maintainers

Re: Octave Interpreter


From: Stefan Seefeld
Subject: Re: Octave Interpreter
Date: Mon, 06 Oct 2014 08:27:10 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.1.1

Hi Mathew,

[I'm moving this back to the list; I assume you just accidentally hit
'reply' instead of 'reply-all'.]


On 2014-10-06 00:07, Bipin Mathew wrote:
> Hello Richard, Stefan, et al.
>
>     Just to get us thinking about what implementing a distributed
> computing engine for Octave would take, let's consider a simple example
> and discuss what might be required. To get things rolling, let's look at
>
> A = fft(X)
>
> where X is an instance of a dmat class (i.e. a "distributed matrix"),
> in the sense that its contents are distributed across several servers.
> For the moment, I suppose we can consider fft to be an overloaded
> function specific to the dmat type. Moreover, just for concreteness and
> simplicity, let's think of X as a 2-D matrix of 200 rows by 10 columns.

OK.
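To make the dispatch concrete, here is a minimal sketch of what such a class could look like in Octave. Everything here (the dmat name, the blocks field, the placeholder fft method) is hypothetical, not an existing package:

```octave
## Hypothetical skeleton of a distributed-matrix class.  The block
## bookkeeping and the remote-execution layer are placeholders.
classdef dmat
  properties
    dims;    # overall size, e.g. [200 10]
    blocks;  # per-block index ranges and server locations
  endproperties
  methods
    function obj = dmat (dims, blocks)
      obj.dims = dims;
      obj.blocks = blocks;
    endfunction
    function y = fft (x)
      ## Overloaded for dmat arguments: would dispatch column-wise
      ## FFTs to the servers holding each block, then combine.
      error ("dmat: distributed fft not implemented yet");
    endfunction
  endmethods
endclassdef
```

With such a class in place, `A = fft (X)` would resolve to the dmat method whenever X is a dmat, which is exactly the overloading the example assumes.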

>
> For a distributed implementation of fft, I reckon the first thing the
> top level fft function would need to know is "Where is X?". This could
> be gotten from a distributed data store of some kind, but suppose in
> the end we get something like this.
>
> X = { num_dimensions = 2
>     [1,100]  ,[1,10], {server1/path_a1;server2/path_b1;...serverN/path_z1}
>     [101,200],[1,10], {server1/path_a2;server2/path_b2;...serverN/path_z2}
>     [1,100]  ,[11,20],{server1/path_a3;server2/path_b3;...serverN/path_z3}
>     [101,200],[11,20],{server1/path_a4;server2/path_b4;...serverN/path_z4}
> }

OK.
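That metadata could be held on the controller as a plain Octave struct; a possible shape (all field names hypothetical) is:

```octave
## One possible in-memory form of the block metadata above: a struct
## array with one entry per block.
X.num_dimensions = 2;
X.blocks(1) = struct ("rows", [1 100],   "cols", [1 10], ...
                      "locs", {{"server1/path_a1", "server2/path_b1"}});
X.blocks(2) = struct ("rows", [101 200], "cols", [1 10], ...
                      "locs", {{"server1/path_a2", "server2/path_b2"}});
## ... blocks 3 and 4 follow the same pattern for columns [11,20].
```

A lookup such as "which servers hold rows 1-100 of columns 1-10?" is then a simple scan over `X.blocks`, comparing the requested ranges against the `rows`/`cols` fields.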

>
> The first row, for example, says that all the elements within the
> 1st-100th positions of the first dimension and the 1st-10th positions
> of the second dimension can be found at the server/path locations
> {server1/path_a1;server2/path_b1;...serverN/path_z1}
> (NFS?). Instead of NFS paths, I also imagine we could have identifiers
> for locations in the memory of a slave process spawned on each server.

Yes. I really think we should use standard APIs for this, rather than
inventing our own (API, transport protocol, etc.).
For example, it seems the existing MPI bindings I was just pointed to
would be a great starting point.
(Note that the pMatlab package itself also is layered over MPI Matlab
bindings.)
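As a first step with those bindings, something like the following smoke test would confirm they work. The function names follow the MPI standard, but the exact Octave signatures (in particular how the communicator handle is obtained) are an assumption here and need to be checked against the actual package:

```octave
## Smoke test for the Octave MPI bindings: each rank reports its
## identity.  MPI_COMM_WORLD is assumed to be available as a handle;
## the real package may expose the communicator differently.
MPI_Init ();
comm = MPI_COMM_WORLD;
rank = MPI_Comm_rank (comm);
nprc = MPI_Comm_size (comm);
printf ("engine %d of %d is up\n", rank, nprc);
MPI_Finalize ();
```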

>
> The top level fft function would then spawn slave octave processes on
> the corresponding servers (using a job handler?). The slaves would
> then load from disk and read into memory the file they were
> assigned (the file would be local to that slave; we should move
> computation, not data) and do a localized computation. In this case
> each slave does an FFT along each of the columns that are local to it
> and keeps that "sub"-fft in its memory.

Where these additional processes are spawned is an interesting question.
I doubt that an operation such as fft() would be the right place to do
it, though. For a proof-of-concept, I think we may as well consider the
MPI bindings, and just spawn octave itself via mpirun. Once this
mechanism is working, we could think of ways to delay the sub-process
spawning, to let users only run the normal single-node octave (acting as
a "controller"), and then spawn "engine" processes later from a new
"parallel_init()" function.
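For the proof-of-concept, the launch step would then be nothing more than an mpirun invocation, with rank 0 acting as the controller and the remaining ranks as engines (the script path below is a placeholder):

```
# Proof-of-concept launch: every rank runs the same script; the script
# branches on MPI_Comm_rank to decide controller vs. engine role.
mpirun -np 4 octave-cli -q /path/to/engine_script.m
```

The later "parallel_init()" approach would replace this up-front mpirun with on-demand spawning from within a normal interactive session.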

>
> The top-level fft function would then ask each slave process for its
> "sub"-fft and do the necessary computation locally to get the FFT of
> the complete vector, OR push the necessary multiplier matrix down to
> the slaves and have each slave multiply its sub-fft matrix by the
> multiplier matrix provided by the top-level fft function, thereby
> producing a distributed matrix that represents the answer to the
> original FFT.

Yes, though I think reinventing such an API is not a good idea. Rather,
we should build on existing know-how, such as MPI.
In particular, the pMatlab package
(http://www.ll.mit.edu/mission/cybersec/softwaretools/pmatlab/pmatlab.html)
is layered over a Matlab MPI API
(http://www.ll.mit.edu/mission/cybersec/softwaretools/matlabmpi/matlabmpi.html),
so if we can manage to support an MPI API similar to that with the
Octave MPI package, we may get pMatlab support for free.
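Incidentally, the "sub-FFT plus multiplier matrix" combine step from the example has a well-known concrete form when the rows are distributed cyclically (even/odd indices) rather than in contiguous blocks: it is the radix-2 butterfly, with the twiddle factors playing the role of the multiplier matrix. A single-node Octave check, for a toy length-8 vector:

```octave
## Each "slave" computes a half-length FFT of its (cyclically
## distributed) samples; the "controller" combines the two sub-FFTs
## with twiddle factors -- the multiplier matrix of the discussion.
N = 8;
x = randn (N, 1);
E = fft (x(1:2:end));                  # slave 1: even-indexed samples
O = fft (x(2:2:end));                  # slave 2: odd-indexed samples
w = exp (-2i * pi * (0:N/2 - 1)' / N); # twiddle factors
X = [E + w .* O; E - w .* O];          # controller combine step
assert (norm (X - fft (x)) < 1e-12);
```

A contiguous split like the [1,100]/[101,200] layout in the example needs a different (decimation-in-frequency style) combine, but the controller/slave division of labor is the same.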

In other words, I see two main areas that need work:

* Assess the state of the Octave MPI bindings (and add any required
functionality that's not yet covered)

* Consider ways to spawn sub-processes and set up MPI communicators
interactively


    Stefan

-- 

      ...ich hab' noch einen Koffer in Berlin...



