Re: [Bug-apl] first shot at parallel APL


From: Juergen Sauermann
Subject: Re: [Bug-apl] first shot at parallel APL
Date: Fri, 26 Sep 2014 14:04:15 +0200

Hi Elias,

If you are using a recent SVN revision, you need to set the threshold (the vector length above
which parallel execution is performed):

      (⍳4) ∘.time 10⋆⍳7
0 0 1 3 29 254 2593
0 0 1 2 25 252 2618
0 0 1 2 26 258 2682
0 0 1 2 26 263 2866
     
      )COPY 5 FILE_IO
loading )DUMP file /usr/local/lib/apl/wslib5/FILE_IO.apl...

      1 FIO∆set_dyadic_threshold  '⋆'  
⍝ returns the previous threshold for dyadic ⋆
8070450532247928832

      (⍳4) ∘.time 10⋆⍳7
0 0 0 2 30 250 2590
0 0 0 1 15 149 1580
0 0 0 1 11 113 1225
0 3 0 0 12 103 1120
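
To put the last column of the second table (10⋆7 elements) into perspective, the timings can be turned into speedup and per-core efficiency figures. This is only a small sketch: the variable name T is chosen here for illustration, and the default ⎕IO←1 is assumed.

```apl
      T ← 2590 1580 1225 1120   ⍝ ms for 10⋆7 elements on 1, 2, 3 and 4 cores
      T[1] ÷ T                  ⍝ speedup relative to 1 core
      (T[1] ÷ T) ÷ ⍳4           ⍝ efficiency: speedup divided by core count
```

The speedups come out at roughly 1.6, 2.1 and 2.3 for 2, 3 and 4 cores, so the efficiency drops from about 0.82 to about 0.58 as cores are added.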

I am currently working on a benchmark workspace that determines the optimal thresholds
for the different scalar functions (those thresholds will become the future defaults). Right
now the default thresholds are so high that execution is always sequential.

/// Jürgen


On 09/26/2014 07:22 AM, Elias Mårtenson wrote:
I've tested this code, and I don't see much of an improvement as I increase the core count:

Given the following function:

    ∇Z ← NCPU time LEN;T;X;tmp
      ⎕SYL[26;2] ← NCPU
      X ← LEN⍴2J2
      T ← ⎕TS
      tmp ← X⋆X
      Z←1 1 1 24 60 60 1000⊥⎕TS - T
    ∇
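
The last line of the function converts the difference of two ⎕TS timestamps (year, month, day, hour, minute, second, millisecond) into milliseconds with a mixed-radix decode. A short sketch with hand-made differences, not taken from a real run, shows how that works:

```apl
      ⍝ a difference of 2 seconds and 593 milliseconds
      1 1 1 24 60 60 1000 ⊥ 0 0 0 0 0 2 593
2593
      ⍝ crossing a minute boundary: +1 minute and ¯58 seconds is still 2000 ms
      1 1 1 24 60 60 1000 ⊥ 0 0 0 0 1 ¯58 0
2000
```

Negative components are harmless because decode simply weights and sums them. Since the radix for months and years is 1, the result is only meaningful when both timestamps fall within the same month, which is fine for short benchmarks like this one.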

I'm running this command on my 8-core workstation:

      (⍳8) ∘.time 10⋆⍳7
0 0 0 2 19 188 2139
0 0 1 2 19 189 2147
0 0 1 2 19 210 2256
0 0 0 2 19 194 2427
0 0 0 3 28 284 3581
0 0 0 3 27 280 3510
0 0 0 3 27 284 3754
0 0 0 3 27 279 3637

Regards,
Elias

On 26 September 2014 13:05, Elias Mårtenson <address@hidden> wrote:
Thanks, I have merged the necessary changes.

Regards,
Elias

On 22 September 2014 23:50, Juergen Sauermann <address@hidden> wrote:
Hi,

I have finished a first shot at parallel (i.e. multicore) GNU APL: SVN 480.

This version computes all scalar functions in parallel if the ravel length of the result exceeds 100.
This can make the computation of small (but still > 100) vectors slower than if they were computed sequentially.
Therefore parallel execution is not yet the default. To enable it:

    ./configure
    make parallel
    make
    sudo make install


The current version uses some Linux-specific features, which will be ported to other platforms later on (if possible).
./configure is supposed to detect whether these features are available.

Some simple benchmarks are promising:

      X←1000000⍴2J2   ⍝ 1 million complex numbers
     
      ⎕SYL[26;2]←1   ⍝ 1 core
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
246
     
      ⎕SYL[26;2]←2   ⍝ 2 cores
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
136
     
      ⎕SYL[26;2]←3   ⍝ 3 cores
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
102
     
      ⎕SYL[26;2]←4   ⍝ 4 cores
      T←⎕TS ◊ ⊣X⋆X ◊ 1 1 1 24 60 60 1000⊥⎕TS - T
91

The next step will be to find the break-even points of all scalar functions, so that parallel execution is
only done when it promises some speedup.
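
A rough way to look for such a break-even point by hand is to time the same operation with 1 core and with several cores over a range of lengths and keep the lengths at which the multi-core run was faster. The sketch below assumes the dyadic time function posted elsewhere in this thread (core count on the left, vector length on the right, result in milliseconds); the names LENS and M are made up for illustration, and the choice of 4 cores is arbitrary.

```apl
      LENS ← 10⋆⍳7              ⍝ vector lengths to probe
      M ← 1 4 ∘.time LENS       ⍝ row 1: 1 core, row 2: 4 cores (ms)
      (M[2;] < M[1;]) / LENS    ⍝ lengths at which 4 cores were faster
```

With millisecond resolution the short lengths are dominated by noise, so in practice each measurement would have to be repeated and averaged before the comparison says much.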

Elias, the PointerCell constructor now takes one more argument. I have updated emacs-mode and sql accordingly;
you may want to sync back.

/// Jürgen