From: David Lamkins
Subject: Re: [Bug-apl] ScalarBenchmark for inner and outer products
Date: Fri, 17 Oct 2014 10:33:52 -0700
Hi David,
you may see a non-zero startup cost even though an operation shows cost 0.
This is because startup cost is averaged over all monadic or all dyadic operations.
The zero startup cost on the products is most likely due to a reorg of the counter numbers.
I forgot to update ScalarBenchmark.apl; fixed in SVN 489.
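The averaging effect described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the operation names and cycle counts below are made up, and `group_average` is a hypothetical helper, not something taken from ScalarBenchmark.apl.

```python
# Hypothetical per-operation startup costs in cycles. The names and
# numbers are invented for illustration only.
dyadic_startup = {
    "A + B": 400,
    "A × B": 800,
    "A +.× B": 0,   # statistics counter not matched, so all samples were 0
    "A ∘.× B": 0,
}


def group_average(costs):
    """Mean startup cost over every operation in the group."""
    return sum(costs.values()) / len(costs)


average = group_average(dyadic_startup)

# Assumption: the summary reports the group average for each operation,
# including those whose own measurements were all 0.
summary = {op: average for op in dyadic_startup}
```

Under that assumption, an operation whose own samples are all zero still shows the non-zero group average in the summary.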
In general, the OP and STAT columns in ScalarBenchmark.apl should match the ]PSTAT command, e.g. if:
]pstat 38
╔═════════════════╦════════════╤══════════╤══════════╤══════════╤══════════╗
║ A f.g B ║ 0 │ 0 │ 0 │ 0 │ 0 ║
╚═════════════════╩════════════╧══════════╧══════════╧══════════╧══════════╝
then the STAT number for f.g should be 38 in ScalarBenchmark.apl.
/// Jürgen
On 10/17/2014 05:58 PM, David B. Lamkins wrote:
I'm seeing zero start-up costs for inner and outer products when running ScalarBenchmark.apl.
===================== Mat1_IRC +.× Mat1_IRC ===============================
Benchmarking start-up cost for Mat1_IRC +.× Mat1_IRC ...
Length Sequ Cycles Para Cycles Linear Sequ Linear Para
====== =========== =========== =========== ===========
    25           0           0           0           0
    25           0           0           0           0
    25           0           0           0           0
    25           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     4           0           0           0           0
     4           0           0           0           0
     4           0           0           0           0
     1           0           0           0           0
regression line sequential: 0 + 0×N cycles
regression line parallel: 0 + 0×N cycles
===================== Vec1_IRC ∘.× Vec1_IRC ===============================
Benchmarking start-up cost for Vec1_IRC ∘.× Vec1_IRC ...
Length Sequ Cycles Para Cycles Linear Sequ Linear Para
====== =========== =========== =========== ===========
    25           0           0           0           0
    25           0           0           0           0
    25           0           0           0           0
    25           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
    16           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     9           0           0           0           0
     4           0           0           0           0
     4           0           0           0           0
     4           0           0           0           0
     1           0           0           0           0
regression line sequential: 0 + 0×N cycles
regression line parallel: 0 + 0×N cycles
But then in the summary section -- just above ]PSTAT -- I see:
-------------- Mat1_IRC +.× Mat1_IRC --------------
average sequential startup cost: 359 cycles
average parallel startup cost: 832 cycles
per item cost sequential: 0 cycles
per item cost parallel: 0 cycles
parallel break-even length: not reached
-------------- Vec1_IRC ∘.× Vec1_IRC --------------
average sequential startup cost: 359 cycles
average parallel startup cost: 832 cycles
per item cost sequential: 0 cycles
per item cost parallel: 0 cycles
parallel break-even length: not reached
Here the startup costs are nonzero, but the per-item costs are all zero.
This doesn't look right... Or am I missing something?
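For reference, the quantities in the output can be related with a small Python sketch. It assumes that the "regression line" is an ordinary least-squares fit of cycles = a + b×N and that the break-even length is the point where the sequential and parallel lines cross. Both are assumptions about what ScalarBenchmark.apl does, and `fit_line` and `break_even` are hypothetical names.

```python
def fit_line(lengths, cycles):
    """Ordinary least-squares fit of cycles = a + b*N; returns (a, b)."""
    n = len(lengths)
    mean_x = sum(lengths) / n
    mean_y = sum(cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in lengths)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lengths, cycles))
    b = sxy / sxx            # per-item cost (slope)
    a = mean_y - b * mean_x  # startup cost (intercept)
    return a, b


def break_even(seq, par):
    """Length N where the sequential and parallel lines cross.

    Returns None when the parallel per-item cost is not lower than the
    sequential one, i.e. the parallel line never catches up.
    """
    (a_seq, b_seq), (a_par, b_par) = seq, par
    if b_par >= b_seq:
        return None
    return (a_par - a_seq) / (b_seq - b_par)


# The vector lengths from the benchmark run, with every sample at 0 cycles.
lengths = [25] * 4 + [16] * 7 + [9] * 5 + [4] * 3 + [1]
zero_fit = fit_line(lengths, [0] * len(lengths))
```

With all samples at zero the fit has zero intercept and zero slope, and with both per-item costs at zero `break_even` returns None, which is consistent with "not reached" in the summary.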
In case it might shed some additional light, here's the final section of the ]PSTAT output. The rest looks reasonable except for epsilon-underbar, which reports all zeroes.
╔═════════════════╦══════════════════════╤══════════╤══════════╤════════════════╤═════════════╗
║ Function        ║                      │        N │   ⌀ VLEN │       ⌀ cycles │    cyc÷VLEN ║
╟─────────────────╫──────────────────────┼──────────┼──────────┼────────────────┼─────────────╢
║ f B overhead    ║ 18446744003448130869 │      283 │     1993 │ 34818579233229 │ 17466187239 ║
║ A f B overhead  ║ 18446743954621671206 │     1114 │       84 │  1447585256996 │ 17221844259 ║
║ scalar B        ║            130198460 │      283 │     3873 │         460065 │         118 ║
║ A scalar B      ║             91680403 │     1114 │      949 │          82298 │          86 ║
║ clone B         ║         233950109373 │ 75391125 │      131 │           3103 │          23 ║
║ A f.g B         ║         911702656227 │    40046 │      163 │       22766385 │      139671 ║
║ A ∘.g B         ║           9809803882 │      121 │  1000000 │       81072759 │          81 ║
║ A ⍴ B           ║                 9071 │        3 │       27 │           3023 │         111 ║
║ PrintBuffer(B)  ║            135760049 │     1168 │       25 │         116232 │        4649 ║
╚═════════════════╩══════════════════════╧══════════╧══════════╧════════════════╧═════════════╝