bug-coreutils

bug#6327: sort fails on some UTF-8 input


From: Pádraig Brady
Subject: bug#6327: sort fails on some UTF-8 input
Date: Wed, 02 Jun 2010 16:31:52 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3

On 02/06/10 05:51, River Tarnell wrote:
> I'm using coreutils 8.5 on Solaris 10.
> 
> GNU 'sort' fails to sort some input, while Solaris 'sort' handles it
> correctly:
> 
> willow% /opt/ts/gnu/bin/sort sort_test.txt 
> /opt/ts/gnu/bin/sort: string comparison failed: Illegal byte sequence
> /opt/ts/gnu/bin/sort: Set LC_ALL='C' to work around the problem.
> /opt/ts/gnu/bin/sort: The strings compared were
> `\360\222\203\276\360\222\205\226' and
> `\360\222\200\255\360\222\213\253\360\222\213\253\360\222\200\255'.
> willow% /usr/bin/sort sort_test.txt 
> 𒃾𒅖
> 𒀭𒋫𒋫𒀭
> willow% 
> 
> I've attached the example file sort_test.txt.

I'm not sure what those characters are, but they're valid UTF-8,
and my Linux system here has no trouble sorting them.
Note that we just use strcoll() to do the comparison.
Which strcoll() are you linking against?

cheers,
Pádraig.




