bug-coreutils

Re: Re-2: uniq works not correct


From: Jim Meyering
Subject: Re: Re-2: uniq works not correct
Date: Tue, 09 Sep 2008 07:11:11 +0200

Eric Blake <address@hidden> wrote:
> [Please keep the list posted on replies, so that others may chime in]
>
> According to address@hidden on 9/8/2008 1:27 PM:
>> Hello
>>
>> Thanks for your answer.
>>
>> My first mail had a file "test" attached.
>>
>> To demonstrate the problem, I passed the same file to sort twice.
>> There is no way around this.
>>
>> type the following command:
>>
>> sort test test | uniq -u |wc -l
>>
>> The result should be equal to "0"
>
> And I got that result, because I did 'export LC_ALL=C' (bash notation; or
> 'setenv LC_ALL C' for csh notation) before running the experiment.  I
> could not reproduce your failure, which is almost certainly due to your
> current locale settings.
>
> But in looking closer at your report, I noticed that uniq uses xmemcoll (a
> wrapper around strcoll) rather than strcmp when determining whether lines
> are equal.  So, it looks like uniq is SUPPOSED to recognize lines with
> different byte contents but equal collation values as identical, but that
> it failed to do so in your case.  It would be very informative for us to
> know which locale you were running when you saw unexpected results;
> perhaps there is a bug after all, where sort's use of xmemcoll and uniq's
> use of xmemcoll are not lining up, to the point where uniq is not properly
> filtering lines that sort treated as identical.  Please show us the output
> of running 'locale' on your SUSE11.0 box.

Thanks for investigating.
FYI, using the latest coreutils from git, running the following
shows that this particular combination of sort, uniq, and input data
produces no output for any of the 688 locales installed on my
Debian unstable system or the 702 installed on my rawhide system:

  for i in $(locale -a); do
    (export LC_ALL=$i; sort test test | uniq -u | grep . && echo "$i"); done

Same result with the 52 locales on an OpenSolaris system.
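
For anyone who wants to repeat the check without the reporter's attachment,
here is a minimal sketch of the same idea. The helper name count_unpaired
and the four-line sample file are assumptions made up for illustration;
they are not part of coreutils or of the original report. The helper runs
sort and uniq under one LC_ALL value and prints how many lines uniq -u
let through.

```shell
#!/bin/sh
# Sketch only: count_unpaired is a hypothetical helper, not a coreutils tool.
# It prints how many lines survive "sort FILE FILE | uniq -u" in one locale.
# Every line is fed to sort twice, so a non-zero count means uniq did not
# pair up lines that sort placed next to each other.
count_unpaired() {
  # $1 = locale name, $2 = input file
  # The subshell keeps the LC_ALL change from leaking into the caller.
  (LC_ALL=$1; export LC_ALL; sort "$2" "$2" | uniq -u | wc -l | tr -d ' ')
}

# Made-up sample data standing in for the reporter's "test" file.
sample=$(mktemp)
printf 'b\na\nB\nA\n' > "$sample"

# Only C and POSIX are checked here because they exist everywhere.
for loc in C POSIX; do
  echo "$loc: $(count_unpaired "$loc" "$sample")"
done

rm -f "$sample"
```

To sweep every installed locale as in the loop above, replace the list
"C POSIX" with $(locale -a) and point the helper at the real input file.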
