
Re: [Bug-apl] apl computer


From: Christian Robert
Subject: Re: [Bug-apl] apl computer
Date: Fri, 26 Aug 2016 21:47:46 -0400
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

Well, I have seen a couple of cases where parallelism could have been very
useful.

Something like:

      {SolveSudoku ⍵}⍤2 ⊣ Cube_of_9x9x1000_sudoku

      {SolveSudoku ⍵}¨ big_array_of_enclosed_9x9_sudoku

but I don't want the ⍤ (rank) or ¨ (foreach) operators doing parallelism
by default, so I thought of introducing a new operator (in the spirit of
⍨ in (3÷⍨1), which reverses the arguments) that would say "I know what
I'm doing; run the function to the left in as many threads as I have
CPUs available".


Just an idea I had a year or two ago when testing parallelism in GNU APL.
Very few operators can really gain from parallelism; ⍤ (rank) and
¨ (foreach) are two of them.
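
To make the intent concrete, here is a minimal sketch of the dispatch
such an operator could do. This is not GNU APL source; it is just C++
(the language the interpreter is written in), and the name parallel_each
and the work-splitting scheme are made up for the example:

    #include <cstddef>
    #include <functional>
    #include <thread>
    #include <vector>

    // Sketch only: apply fn to every item of items, one worker thread
    // per available CPU -- the "run the function to the left in as
    // many threads as I have CPUs" semantics of the proposed operator.
    template <typename T>
    void parallel_each(std::vector<T>& items,
                       const std::function<void(T&)>& fn)
    {
        unsigned n = std::thread::hardware_concurrency();
        if (n == 0) n = 1;                  // fall back to one thread
        std::vector<std::thread> workers;
        for (unsigned w = 0; w < n; ++w)
            workers.emplace_back([&items, &fn, n, w] {
                // strided split: worker w handles items w, w+n, ...
                for (std::size_t i = w; i < items.size(); i += n)
                    fn(items[i]);
            });
        for (std::thread& t : workers)
            t.join();
    }

The point is that each sudoku is an independent slice, so no dependency
analysis is needed: applying the operator is the programmer's assertion
that the calls do not interact.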

Comments?

Xtian.



On 2016-08-26 16:57, Xiao-Yong Jin wrote:
Of course.  Simple scalar functions would be the worst to parallelize.
We always need a large amount of work per thread for it to be worthwhile.
Something like the following might be a good place to start:
   ⌹⍤2 X
where (LargeNumber = ×/¯2↓⍴X) ∧ 2<⍴⍴X.  To support general functions
instead of builtins, the code would need to analyze the data
dependencies, which I believe is impossible in APL because of ⍎ (it
cannot be done without executing it), but supporting a subset seems good
enough.  Look at Data Parallel Haskell and you will see that it is a
really hard problem.
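
The "large enough to bother" test above translates almost one-for-one
into the interpreter's terms.  A hedged sketch, again in C++ with
made-up names (worth_parallelizing, threshold), reading the condition
as "the number of matrices, ×/¯2↓⍴X, is at least LargeNumber":

    #include <cstddef>
    #include <vector>

    // Sketch only: decide whether ⌹⍤2 X is worth running in parallel.
    // shape holds ⍴X; the last two axes are the matrix dimensions.
    bool worth_parallelizing(const std::vector<std::size_t>& shape,
                             std::size_t threshold)
    {
        if (shape.size() <= 2)              // 2<⍴⍴X must hold
            return false;
        std::size_t n_matrices = 1;         // ×/¯2↓⍴X
        for (std::size_t i = 0; i + 2 < shape.size(); ++i)
            n_matrices *= shape[i];
        return n_matrices >= threshold;     // "LargeNumber" many slices
    }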

I wish there were an array language that could hide all the details of
the AVX/OMP/CUDA/MPI stuff.

There are some efforts to bring SIMD-style execution to array languages,
but in my opinion these are just not enough, as our central processors
get more and more fast cores/threads.  (There was once a time when APL
was wonderful at utilizing the available computing power, but that time
is long gone.)

On Aug 26, 2016, at 2:34 PM, Juergen Sauermann <address@hidden> wrote:

Hi,

maybe, maybe not.

Our earlier measurements on an 80-core machine indicated that the way
the cores connect to the memories seems to determine the parallel
performance that can be achieved in GNU APL.

One can easily prove that for sufficiently large APL arrays (of
arbitrary rank), all scalar APL functions must scale linearly with N on
an N-processor PRAM.

See http://link.springer.com/chapter/10.1007%2F978-3-642-61012-7_11 (in German).
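
(A one-line version of the argument, under the PRAM assumption of
uniform memory access: a scalar function touches each of the M array
elements independently, so N processors each handle ⌈M÷N⌉ of them,
giving

    time(M,N) ≈ (⌈M÷N⌉) × t_per_element
    speedup = time(M,1) ÷ time(M,N) → N     for M ≫ N

which is the claimed linear scaling.)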

One can also show that on the 80-core machine we used, the performance
stops increasing at a fairly low N (around 10), even for very large
arrays.

This means that either the GNU APL parallel implementation of scalar
functions is wrong, or that the 80-core machine we used is,
performance-wise, nowhere near the PRAM model.  My best guess is the
latter.
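
One way to test that guess independently of APL is to run a plain
memory-bound loop, with the same access pattern as a scalar dyadic
function such as A+B, at a growing thread count.  The following sketch
is not from our measurements, and all names and sizes in it are
illustrative; if this loop also stops scaling near N = 10 on that
machine, the memory system, and not GNU APL, is the limit:

    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <thread>
    #include <vector>

    int main()
    {
        const std::size_t M = std::size_t(1) << 26;  // 67M doubles each
        std::vector<double> a(M, 1.0), b(M, 2.0);

        unsigned max_n = std::thread::hardware_concurrency();
        if (max_n == 0) max_n = 1;
        for (unsigned n = 1; n <= max_n; n *= 2)
        {
            auto t0 = std::chrono::steady_clock::now();
            std::vector<std::thread> workers;
            for (unsigned w = 0; w < n; ++w)
                workers.emplace_back([&a, &b, M, n, w] {
                    // one contiguous chunk per thread, like A+B
                    // evaluated on a slice of the array
                    std::size_t lo = w * (M / n);
                    std::size_t hi = (w + 1 == n) ? M : lo + M / n;
                    for (std::size_t i = lo; i < hi; ++i)
                        a[i] += b[i];
                });
            for (std::thread& t : workers)
                t.join();
            double secs = std::chrono::duration<double>(
                std::chrono::steady_clock::now() - t0).count();
            std::printf("%2u threads: %.3f s\n", n, secs);
        }
        return 0;
    }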

/// Jürgen


On 08/26/2016 08:41 PM, Xiao-Yong Jin wrote:
On Aug 26, 2016, at 1:12 PM, address@hidden wrote:

finally, a computer just perfect for GNU APL:


http://thehackernews.com/2016/08/powerful-multicore-processor.html
Now is the perfect time to invest your time and effort in improving the
parallel efficiency of GNU APL.
