bug-coreutils

bug#35939: version sort is incorrect with hyphen-minus


From: Ian Jackson
Subject: bug#35939: version sort is incorrect with hyphen-minus
Date: Thu, 27 Jun 2019 11:25:01 +0100

Vincent Lefevre writes ("Re: bug#35939: version sort is incorrect with 
hyphen-minus"):
> On 2019-06-26 18:40:50 -0700, Paul Eggert wrote:
> > Perhaps the coreutils manual could be improved to make this all clearer, and
> > perhaps it should refer to the Debian manual if it doesn't already.
> 
> In this case, there should be a new ordering option to provide
> true numeric sort with strings mixing non-negative integers and
> characters.

I think the Debian algorithm is such an algorithm, but it has a
wrinkle which you are not expecting.  Here is the specification:
  https://www.debian.org/doc/debian-policy/ch-controlfields.html#version

Note in particular
  | The lexical comparison is a comparison of ASCII values modified so
  | that all the letters sort earlier than all the non-letters and so
  | that a tilde sorts before anything, even the end of a part

So in the Debian algorithm, `-' sorts after `a'.  I specified this
rule.  I did it mainly because of versions like `1.0beta3', which is
probably a prerelease of `1.0' and therefore earlier than `1.0.3'.
So `b' has to sort before `.' and my rule seemed the simplest one to
achieve that.  (The version comparison algorithm is a tradeoff between
complexity and breadth of support for people's then-existing
practices.)  Nowadays Debian invariably writes `1.0~beta3' but when I
invented this scheme I did not include the (invaluable) `~' feature.
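
To make the rule above concrete, here is a minimal Python sketch of the
comparison as the Debian policy text describes it, applied to the
upstream part only (no epoch, no Debian revision).  The function names
are made up for illustration; this is not dpkg's code, and it assumes
ASCII input.

```python
import re

def order(ch):
    """Weight of one character within a non-digit part ("" means end of part)."""
    if ch == "":
        return 0
    if ch == "~":
        return -1                      # tilde sorts before everything, even the end
    if "a" <= ch <= "z" or "A" <= ch <= "Z":
        return ord(ch)                 # letters keep their ASCII value
    return ord(ch) + 256               # non-letters sort after all letters

def split_prefix(pattern, s):
    """Split s into (longest prefix matching pattern, remainder)."""
    m = re.match(pattern, s)
    return s[:m.end()], s[m.end():]

def compare_nondigit(x, y):
    """Compare two non-digit parts; the shorter one is padded with 'end of part'."""
    for k in range(max(len(x), len(y))):
        ox = order(x[k]) if k < len(x) else order("")
        oy = order(y[k]) if k < len(y) else order("")
        if ox != oy:
            return -1 if ox < oy else 1
    return 0

def compare_upstream(a, b):
    """Return -1, 0 or 1 as a sorts before, equal to, or after b."""
    while a or b:
        # Alternate: non-digit part first, then digit part.
        pa, a = split_prefix(r"[^0-9]*", a)
        pb, b = split_prefix(r"[^0-9]*", b)
        result = compare_nondigit(pa, pb)
        if result:
            return result
        da, a = split_prefix(r"[0-9]*", a)
        db, b = split_prefix(r"[0-9]*", b)
        na, nb = int(da or "0"), int(db or "0")   # an empty digit part counts as 0
        if na != nb:
            return -1 if na < nb else 1
    return 0
```

With this sketch, compare_upstream("1.0beta3", "1.0.3") and
compare_upstream("a", "-") both return -1, and
compare_upstream("1.0~beta3", "1.0") returns -1 as well.  Without the
tilde, compare_upstream("1.0beta3", "1.0") returns 1: the letter rule
only places `1.0beta3' before `1.0.3', not before `1.0' itself.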

When this is extended to UTF-8, presumably the ordering should be an
ordering of unicode scalar values, with the rule about letters
interpreted as referring to anything which Unicode considers a letter.
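
One possible reading of that extension, as a hedged Python sketch: the
non-digit parts are compared by Unicode scalar value, with "letter"
taken to mean whatever Python's str.isalpha() accepts (general category
L*).  The names unicode_order and compare_nondigit_unicode are
hypothetical, and the offset 0x110000 is just one convenient way to push
every non-letter past every letter.

```python
def unicode_order(ch):
    """Weight of one character in a non-digit part ("" means end of part)."""
    if ch == "":
        return 0
    if ch == "~":
        return -1                      # tilde still sorts before the end of a part
    if ch.isalpha():
        return ord(ch)                 # any Unicode letter, by scalar value
    return ord(ch) + 0x110000          # non-letters after every possible letter

def compare_nondigit_unicode(x, y):
    """Return -1, 0 or 1 for two non-digit parts under the extended rule."""
    for k in range(max(len(x), len(y))):
        ox = unicode_order(x[k]) if k < len(x) else unicode_order("")
        oy = unicode_order(y[k]) if k < len(y) else unicode_order("")
        if ox != oy:
            return -1 if ox < oy else 1
    return 0
```

Under this reading a non-ASCII letter such as "\u00e9" sorts before
`-' and `.', just as `a' does in the ASCII case.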

If you want to test the Debian algorithm and have access to a copy of
dpkg, you can append -1 to both strings to be the "Debian revision",
and prepend "1:" to be the "epoch", and then the middle part should be
compared the same way as sort -V etc.
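
The wrapping described above can be scripted.  This is a small sketch
that builds the dpkg --compare-versions command line with the "1:"
epoch and "-1" revision added, and runs it only if dpkg is found on
PATH.  The helper names are invented for illustration.

```python
import shutil
import subprocess

def wrap_for_dpkg(s):
    """Turn a bare string into a full Debian version: epoch 1, revision 1."""
    return "1:" + s + "-1"

def dpkg_argv(a, op, b):
    """Command line asking dpkg whether 'a op b' holds (op: lt le eq ne ge gt)."""
    return ["dpkg", "--compare-versions", wrap_for_dpkg(a), op, wrap_for_dpkg(b)]

def dpkg_says(a, op, b):
    """True or False according to dpkg's exit status; None if dpkg is not installed."""
    if shutil.which("dpkg") is None:
        return None
    # dpkg exits with status 0 when the relation holds.
    return subprocess.run(dpkg_argv(a, op, b)).returncode == 0
```

For example, dpkg_argv("1.0beta3", "lt", "1.0.3") yields the arguments
for `dpkg --compare-versions 1:1.0beta3-1 lt 1:1.0.3-1'.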

Vincent, what is your use case for a comparison algorithm which is
like the Debian one but which sorts letters after punctuation?

Ian.

-- 
Ian Jackson <address@hidden>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.




