bug-coreutils

bug#18540: Sorting bug?


From: Göran Uddeborg
Subject: bug#18540: Sorting bug?
Date: Tue, 23 Sep 2014 22:24:45 +0200

I discovered a behaviour of "sort" that looks like a bug to me.  When
one key in the input is an initial part of another key, the shorter
key is sorted first if the key is all there is on the line.  But if
there are other fields too, not included in the key, the order
changes.  That is true even with the --stable flag, so "sort" seems to
consider the order of the keys different in the two cases.

I sort in a non-C locale (sv_SE.utf8, actually), but en_US.utf8 behaves
the same, so I illustrate using that.

First case, the key is all there is on the line.  The shorter line
gets sorted earlier, regardless of input order:

    address@hidden Hämtat]$ { echo 'binutils x86_64'; echo 'binutils-x86_64-linux-gnu x86_64'; } | LANG=en_US.utf8 sort --stable --debug --key=1,1 --field-separator=!
    sort: using ‘en_US.utf8’ sorting rules
    binutils x86_64
    _______________
    binutils-x86_64-linux-gnu x86_64
    ________________________________
    address@hidden Hämtat]$ { echo 'binutils-x86_64-linux-gnu x86_64'; echo 'binutils x86_64'; } | LANG=en_US.utf8 sort --stable --debug --key=1,1 --field-separator=!
    sort: using ‘en_US.utf8’ sorting rules
    binutils x86_64
    _______________
    binutils-x86_64-linux-gnu x86_64
    ________________________________



Second case, the input lines contain a second field.  Now the line with
the longer key gets sorted earlier, regardless of input order:

    address@hidden Hämtat]$ { echo 'binutils x86_64!new'; echo 'binutils-x86_64-linux-gnu x86_64!new'; } | LANG=en_US.utf8 sort --stable --debug --key=1,1 --field-separator=!
    sort: using ‘en_US.utf8’ sorting rules
    binutils-x86_64-linux-gnu x86_64!new
    ________________________________
    binutils x86_64!new
    _______________
    address@hidden Hämtat]$ { echo 'binutils-x86_64-linux-gnu x86_64!new'; echo 'binutils x86_64!new'; } | LANG=en_US.utf8 sort --stable --debug --key=1,1 --field-separator=!
    sort: using ‘en_US.utf8’ sorting rules
    binutils-x86_64-linux-gnu x86_64!new
    ________________________________
    binutils x86_64!new
    _______________
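
For comparison, here is a minimal sketch that runs the same two inputs
under the C locale, where keys are compared byte by byte.  The space
(0x20) sorts before the hyphen (0x2D) there, so the shorter key should
come first both with and without the second field.  The helper name
first_key is made up for this illustration, and the short options
-s, -t and -k stand in for --stable, --field-separator and --key.

```shell
# first_key is a hypothetical helper, not part of coreutils.
# $1 = locale to sort in
# $2 = text appended after the key ('' for none, '!new' for a second field)
# Prints the key of whichever line sorts first.
first_key() {
    {
        printf '%s\n' "binutils-x86_64-linux-gnu x86_64$2"
        printf '%s\n' "binutils x86_64$2"
    } | LC_ALL="$1" sort -s -t '!' -k 1,1 | head -n 1 | cut -d '!' -f 1
}

first_key C ''       # key is the whole line; prints: binutils x86_64
first_key C '!new'   # key followed by a second field; prints: binutils x86_64
```

Passing en_US.utf8 instead of C should repeat the two cases shown
above, provided that locale is installed on the system.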


I can't see any reason for this.  Is it me not understanding sorting,
or is it actually a bug?




