Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?


From: Linda Walsh
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Mon, 21 May 2012 14:02:59 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.24) Gecko/20100228 Lightning/0.9 Thunderbird/2.0.0.24 Mnenhy/0.7.6.666



Eric Blake wrote:
> On 05/21/2012 01:51 PM, Linda Walsh wrote:
>> POSIX is not supposed to be prescriptive -- but **descriptive**...
>>
>> I can't think of anywhere that a-z or A-Z would have included letters
>> from the opposite case... so how did POSIX come to *prescribe* that this
>> be the case... since I can't see that as being descriptive.
>
> POSIX 1992 was the culprit that prescribed that [A-Z] must be in
> collation order across all locales,.....

> realizing that range expressions had unusual semantics as a result of
> the 1992 specification, POSIX 2001 quickly

---
[quickly?!  9 years later?!   *cough*]

> reverted things, but by then,
> the cat was out of the bag.  POSIX 2001 had to continue to allow
> existing implementations, by stating that range expressions in anything
> but the C locale are explicitly undefined.

---------------------


        Explicitly undefined?   Or locale dependent?

        I.e. Unicode does specify an ordering, so if your locale is set
to a UTF-8 character encoding, then the order is explicitly defined.  Leaving
it undefined would seem to conflict with Unicode -- and any implementation
claiming to be Unicode compatible MUST use Unicode ordering when the locale's
character set is defined to be Unicode.

        Using Unicode ordering doesn't conflict with POSIX, as POSIX doesn't
define an order for such ranges -- but a different standard (Unicode) does
specify one.  So for those using UTF-8, shouldn't that have made the order
randomization 'moot'?
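
To make the two orderings under discussion concrete, here is a minimal Python sketch. It does not query any real locale: COLLATION is a hypothetical sequence modelled on the subject line (a<A<b<B<...<z<Z), and both helper functions are names made up for this illustration. It only shows what a range such as a-z would cover if endpoints are looked up in that sequence, compared with plain code point order.

```python
import string

# Hypothetical collation sequence modelled on the subject line:
# a < A < b < B < ... < z < Z.  An assumption for illustration only,
# not read from any installed locale.
COLLATION = "".join(
    lo + up for lo, up in zip(string.ascii_lowercase, string.ascii_uppercase)
)


def range_by_collation(start, end, sequence=COLLATION):
    """Characters from start to end (inclusive) in the given sequence."""
    i, j = sequence.index(start), sequence.index(end)
    return sequence[i:j + 1]


def range_by_codepoint(start, end):
    """Characters from start to end (inclusive) in code point order."""
    return "".join(chr(c) for c in range(ord(start), ord(end) + 1))


if __name__ == "__main__":
    az = range_by_collation("a", "z")
    print("a-z by collation: ", az)
    print("a-z by code point:", range_by_codepoint("a", "z"))
    print("contains 'Y':", "Y" in az, " contains 'Z':", "Z" in az)
```

Under this assumed sequence, a-z picks up A through Y but not Z, and A-Z picks up b through z but not a, whereas code point order keeps the two cases apart.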



