bug#6327: sort fails on some UTF-8 input
From: Pádraig Brady
Subject: bug#6327: sort fails on some UTF-8 input
Date: Wed, 02 Jun 2010 16:31:52 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100227 Thunderbird/3.0.3

On 02/06/10 05:51, River Tarnell wrote:
> I'm using coreutils 8.5 on Solaris 10.
>
> GNU 'sort' fails to sort some input, while Solaris 'sort' handles it
> correctly:
>
> willow% /opt/ts/gnu/bin/sort sort_test.txt
> /opt/ts/gnu/bin/sort: string comparison failed: Illegal byte sequence
> /opt/ts/gnu/bin/sort: Set LC_ALL='C' to work around the problem.
> /opt/ts/gnu/bin/sort: The strings compared were
> `\360\222\203\276\360\222\205\226' and
> `\360\222\200\255\360\222\213\253\360\222\213\253\360\222\200\255'.
> willow% /usr/bin/sort sort_test.txt
> 𒃾𒅖
> 𒀭𒋫𒋫𒀭
> willow%
>
> I've attached the example file sort_test.txt.

I'm not sure what those characters are, but they're valid UTF-8,
and my Linux system here has no issue with sorting them.
Note that we just use strcoll() to do the comparison.
Which strcoll() are you linking against?
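
For reference, here is a minimal sketch of how a strcoll() failure like the
one reported above can be detected. strcoll() has no error return value, so
the caller clears errno before the call and inspects it afterwards. The
names collate_checked() and report_comparison() are hypothetical helpers
made up for this illustration; this is not the actual code in sort.

```c
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compare a and b with strcoll() under the current
   LC_COLLATE locale.  Stores the comparison result in *result and
   returns 0 on success, or the errno value left behind by strcoll().  */
int collate_checked(const char *a, const char *b, int *result)
{
    errno = 0;                    /* strcoll() cannot signal errors otherwise */
    *result = strcoll(a, b);
    return errno;
}

/* Hypothetical helper: run one comparison and print either the result
   or the error text.  Returns 0 on success, -1 on failure.  */
int report_comparison(const char *a, const char *b)
{
    int cmp;
    int err = collate_checked(a, b, &cmp);

    if (err != 0) {
        fprintf(stderr, "string comparison failed: %s\n", strerror(err));
        return -1;
    }
    printf("strcoll() returned %d\n", cmp);
    return 0;
}
```

Calling setlocale(LC_ALL, "") first and then passing the two byte strings
from the report to report_comparison() should show whether the C library's
strcoll() itself rejects them in the locale in use.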
cheers,
Pádraig.