coreutils

Re: Sort option for Posix-locale simple comparisons


From: Ray Dillinger
Subject: Re: Sort option for Posix-locale simple comparisons
Date: Mon, 08 Apr 2013 09:46:02 -0700
User-agent: Mozilla/5.0 (X11; Linux i686; rv:10.0.12) Gecko/20130116 Icedove/10.0.12

On 04/08/2013 08:00 AM, Eric Blake wrote:
> On 04/08/2013 01:27 AM, Ray Dillinger wrote:
>> It turns out that 'sort' is grabbing locale information now and
>> doing a locale-aware sort.
>
> Yes, this behavior has been required by POSIX for more than
> 20 years,

Really.  Hm.  It wasn't that long ago.  Oh, wait, I know
what this is.  We weren't using locales other than the 'C'
locale on our servers until we needed a UTF-8 locale to
handle non-English text, so we made that change three
and a half years ago.  Okay, at least now I know when it
broke and how much archived data has to be reprocessed.
... Yikes...  That's going to be about a solid 300 days of
CPU time by the time the reprocessing is done and the
data miner gets through it. Figure all-night runs on about
half our server cluster for about a month before we'll be
caught up, plus a hard pull on our offsite backups.

>> There is a workaround; one can set the locale to 'C' or 'POSIX'
>> directly in a script (or at the shell prompt) and then set it
>> back after calling 'sort'.

> That is not just a workaround, but the POSIX-mandated
> way to get sane sorting results. Script writers have been
> doing this for years.
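
For reference, a minimal sketch of the locale override discussed above, assuming a POSIX shell. The helper name sort_bytes is hypothetical and used only for illustration. Putting the LC_ALL=C assignment in front of the command limits it to that one 'sort' process, so the script's own locale is never changed and does not have to be set back afterwards.

```shell
# Hypothetical helper (the name is not from the thread): run sort with
# the C locale so lines are compared byte by byte.
sort_bytes() {
    # The assignment prefix applies only to this one sort process;
    # the caller's locale settings are left untouched.
    LC_ALL=C sort "$@"
}

# In byte order, uppercase ASCII letters sort before lowercase ones.
printf 'b\nA\na\nB\n' | sort_bytes
# prints A, B, a, b (one per line)
```

Under a typical UTF-8 locale the same input usually comes out with the cases interleaved, which is the difference in behavior this thread is about.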

Sigh.  Well, I'm going to reiterate that it's ugly, moves a lot of
unnecessary boilerplate into scripts, can fail in too many
ways, and can cause secondary failures.  Nice to see you
acknowledging it as "sane results" though.

Still, if your minds are made up about not doing this, I guess
the other points you make about procedure are not relevant
to this issue.  Good to know for other stuff later though.

Thanks for considering it.

Ray