[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Incorrect results with unique of 20 or more elements
From: |
Dr . Jürgen Sauermann |
Subject: |
Re: Incorrect results with unique of 20 or more elements |
Date: |
Tue, 28 Jan 2020 21:13:13 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux i686; rv:60.0) Gecko/20100101 Thunderbird/60.6.1 |
Hi Kacper,
I am aware of the non-transitivity of = when ⎕CT ≠ 0. However, the
algorithm used in GNU APL is essentially the same as in the ISO
standard. The result is sometimes puzzling.
The difference between GNU APL and ISO is that the ISO
algorithm has, in the worst case, quadratic runtime, while
GNU APL first sorts the items, which brings down the execution
time to n log(n).
However, sorting takes only place when number of items is large (20 or
more). That's why you see a difference (only) between 19 and 20: the
underlying algorithm are then different (and due to the non-transitivity
of = give the observed different results.
I could lower the split point of the two algorithms from 20 to 2 but
that would result in a somewhat lower performance for short vectors.
IMHO, given the non-transitivity of = (which is the root cause for the
observed behaviour), is the performance penalty for maintaining the
exact order of almost identical values not worth its effort.
Best regards,
Jürgen Sauermann
On 1/28/20 6:10 PM, Kacper Gutowski wrote:
> Actually, while the algorithm used for 20≥⍴ works well with characters
> or integers (once you fix the direction of inequality), I don't think
> it's actually correct at all for non-zero ⎕CT because tolerant
> equality is not transitive.
> Consider this:
> X←1+0 1 2 5 4 3×(⎕CT←1E¯9)÷2
> ∪19↑X
> 1 1.000000001 1.000000002 0
> ∪20↑X
> 1.000000001 1.000000002 0
>
> Both expressions should give the same result.
> -k
>
>