|
From: | Dr . Jürgen Sauermann |
Subject: | Re: Incorrect results with unique of 20 or more elements |
Date: | Wed, 29 Jan 2020 19:55:50 +0100 |
User-agent: | Mozilla/5.0 (X11; Linux i686; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 |
Hi, interesting talk, definitely worthwhile watching. I could not understand it completely since I am not a very good APL programmer, but it was interesting to see that I am not the only one that did monadic ∪ wrong. I have done a major rework of the algorithm and I believe that the new one gives the same result as the ISO algorithm, but is at the same time more efficient. SVN 1231. Best Regards, Jürgen On 1/29/20 10:20 AM, Jay Foad wrote:
For ideas on how to make Unique behave sanely in the presence of tolerant comparison, see: https://www.youtube.com/watch?v=fPWky9IOG40 Jay. On Tue, 28 Jan 2020 at 20:13, Dr. Jürgen Sauermann <mail@jürgen-sauermann.de> wrote:Hi Kacper, I am aware of the non-transitivity of = when ⎕CT ≠ 0. However, the algorithm used in GNU APL is essentially the same as in the ISO standard. The result is sometimes puzzling. The difference between GNU APL and ISO is that the ISO algorithm has, in the worst case, quadratic runtime, while GNU APL first sorts the items, which brings down the execution time to n log(n). However, sorting takes only place when number of items is large (20 or more). That's why you see a difference (only) between 19 and 20: the underlying algorithm are then different (and due to the non-transitivity of = give the observed different results. I could lower the split point of the two algorithms from 20 to 2 but that would result in a somewhat lower performance for short vectors. IMHO, given the non-transitivity of = (which is the root cause for the observed behaviour), is the performance penalty for maintaining the exact order of almost identical values not worth its effort. Best regards, Jürgen Sauermann On 1/28/20 6:10 PM, Kacper Gutowski wrote:Actually, while the algorithm used for 20≥⍴ works well with characters or integers (once you fix the direction of inequality), I don't think it's actually correct at all for non-zero ⎕CT because tolerant equality is not transitive. Consider this: X←1+0 1 2 5 4 3×(⎕CT←1E¯9)÷2 ∪19↑X 1 1.000000001 1.000000002 0 ∪20↑X 1.000000001 1.000000002 0 Both expressions should give the same result. -k |
[Prev in Thread] | Current Thread | [Next in Thread] |