bug#35939: version sort is incorrect with hyphen-minus
From: Assaf Gordon
Subject: bug#35939: version sort is incorrect with hyphen-minus
Date: Wed, 26 Jun 2019 12:25:26 -0600
User-agent: Mutt/1.11.4 (2019-03-13)
(Adding Ian Jackson for dpkg/debian-version details)
Hello,
On Tue, May 28, 2019 at 02:53:39AM +0200, Vincent Lefevre wrote:
> With GNU coreutils 8.30 under Debian/unstable, I get:
>
> $ LC_ALL=C ls
> ab-cd abb abe
> $ LC_ALL=C ls -v
> abb abe ab-cd
>
> The hyphen-minus character should still be regarded as being less
> than the letters (there are no digits, so both are expected to be
> equivalent). The GNU coreutils manual says:
>
[...]
Thanks for the report and the clear details.
To summarize:
"ls -v" and "sort -V" (coreutils' version sort) behave differently from
other implementations with regard to the minus character:
$ printf "%s\n" abb ab-cd | sort -V
abb
ab-cd
$ v1="abb"
$ v2="ab-cd"
$ dpkg --compare-versions "$v1" lt "$v2" && printf "$v1\n$v2\n" \
    || printf "$v2\n$v1\n"
ab-cd
abb
If I understand correctly, the reason is that in Debian's version
comparison algorithm [1], the minus character has a special meaning: it
separates the "upstream version" part from the "debian revision" part.
In Debian's implementation [2], a version string is first split into three
parts (epoch, upstream version, debian revision), using ":" as the epoch
delimiter and "-" as the revision delimiter. Only then are the three parts
compared, each separately [3].
[1] https://www.debian.org/doc/debian-policy/ch-controlfields.html#version
[2] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/parsehelp.c#n191
[3] https://git.dpkg.org/cgit/dpkg/dpkg.git/tree/lib/dpkg/version.c#n140
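To make the splitting step concrete, here is a minimal Python sketch of it.
This is not dpkg's code: the function name split_debian_version is
hypothetical, and the sketch assumes that the epoch ends at the first ":",
that the revision starts at the last "-", and that a missing epoch can be
shown as "0" and a missing revision as an empty string. It does no
validation at all.

```python
def split_debian_version(version):
    """Split a version string into (epoch, upstream, revision).

    Illustrative only: assumes the first ':' ends the epoch and the
    last '-' starts the revision; performs no validation.
    """
    epoch = "0"  # assumed default when no epoch is given
    if ":" in version:
        epoch, version = version.split(":", 1)
    revision = ""  # assumed placeholder when no revision is given
    if "-" in version:
        version, revision = version.rsplit("-", 1)
    return epoch, version, revision


print(split_debian_version("ab-cd"))        # ('0', 'ab', 'cd')
print(split_debian_version("abb"))          # ('0', 'abb', '')
print(split_debian_version("1:2.0-rc1-3"))  # ('1', '2.0-rc1', '3')
```

With this split, the upstream part of "ab-cd" is just "ab", so the "-cd"
suffix never takes part in the upstream comparison.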
On the other hand, coreutils' implementation (from gnulib [4]) does not
break the version string into three parts: it treats the entire string as
a single "upstream version" part.
The rules for sorting the "upstream version" string say:
"... The lexical comparison is a comparison of ASCII values modified so
that all the letters sort earlier than all the non-letters and so that a
tilde sorts before anything" (from [1])
[4] https://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/filevercmp.c
Therefore, dpkg first separates "ab" from "cd", then compares "ab" to
"abb", and "ab" comes first.
coreutils compares "ab-cd" to "abb" (or technically, just "ab-" to
"abb"), and because "letters sort earlier than all the non-letters",
"abb" comes first.
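The two outcomes can be reproduced with a small Python sketch of the
lexical rule quoted above. It is a simplification, not the gnulib or dpkg
code: the names order and compare_nondigit are hypothetical, digit runs are
not compared numerically (the example strings contain no digits), and the
end of a string is assumed to weigh 0, so it sorts after "~" but before any
letter.

```python
def order(ch):
    """Sort weight of one character; '' stands for end of string."""
    if ch == "":
        return 0                 # assumed: end of string weighs 0
    if ch == "~":
        return -1                # tilde sorts before anything
    if ch.isascii() and ch.isalpha():
        return ord(ch)           # letters keep their ASCII order
    return ord(ch) + 256         # non-letters sort after all letters


def compare_nondigit(a, b):
    """Return <0, 0 or >0. Digits are NOT handled specially here."""
    for i in range(max(len(a), len(b))):
        ac = order(a[i:i + 1])   # slicing yields '' past the end
        bc = order(b[i:i + 1])
        if ac != bc:
            return ac - bc
    return 0


# Whole-string comparison (the coreutils-like case): "-" outweighs "b".
print(compare_nondigit("ab-cd", "abb") > 0)   # True: "abb" sorts first

# Upstream part only (the dpkg-like case): "ab" is a prefix of "abb".
upstream = "ab-cd".rsplit("-", 1)[0]
print(compare_nondigit(upstream, "abb") < 0)  # True: "ab-cd" sorts first
```

The only difference between the two calls is whether the "-cd" part is
stripped before the comparison, which is the whole disagreement.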
I hope this helps explain the differences (I also hope this explanation is
correct, and I invite others to chime in).
regards,
- assaf