bug-bash
Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?


From: Eric Blake
Subject: Re: locale specific ordering in EN_US -- why is a<A<b<B<y<Y<z<Z?
Date: Mon, 21 May 2012 15:14:06 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120430 Thunderbird/12.0.1

On 05/21/2012 03:02 PM, Linda Walsh wrote:

>> the cat was out of the bag.  POSIX 2001 had to continue to allow
>> existing implementations, by stating that range expressions in anything
>> but the C locale are explicitly undefined.
> Explicitly undefined?  Or locale dependent?

POSIX explicitly leaves ranges undefined in all but the C locale.  _Other
standards_, such as Unicode, are free to add range requirements on top
of what POSIX requires, but alas, Unicode collation order does NOT
currently specify anything about regular expression or glob range
matching, so it is out of scope for Unicode to say what [A-Z] expands to.

> 
>     I.e. Unicode does specify ordering, so if your locale is set
> to UTF-8 character encoding, then it is explicitly defined.  This would
> seem to be in conflict with Unicode -- and any implementation claiming
> to be Unicode compatible MUST use Unicode ordering when the local character
> set is defined to be Unicode.

Unicode may specify collation ordering, but it does NOT specify regular
expression range ordering.

> 
>     This doesn't conflict with POSIX, as POSIX doesn't define an order
> for such -- but a different standard (Unicode) does specify one.  So
> for those using UTF-8, shouldn't that have made the order randomization
> 'moot'?

Wishing doesn't make it so.  The fact is that regular expression ranges
are currently unspecified in all but the C locale; the RRI (Rational
Range Interpretation) project is attempting to make it sane across all
locales within the scope of GNU
programs, but it takes time to write and approve the patches necessary
to get to that point.
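
To make the difference concrete, here is a minimal Python sketch of the two
interpretations of a range like [a-z].  It does not call any real locale or
regex library: `interleaved_key` is a hypothetical stand-in for a collation
that sorts a<A<b<B<...<z<Z (the order from the subject line), and the helper
names are made up for illustration.  Real locales and real implementations
may well behave differently; that is exactly the "undefined" part.

```python
import string


def codepoint_range(lo, hi, alphabet):
    """C-locale style: a range is every character whose code point
    lies between the two endpoints."""
    return [c for c in alphabet if lo <= c <= hi]


def interleaved_key(c):
    """Hypothetical collation key giving a < A < b < B < ... < z < Z.
    An assumption for illustration, not any real locale's table."""
    return (c.lower(), c.isupper())


def collated_range(lo, hi, alphabet, key=interleaved_key):
    """Collation style: a range is every character that sorts
    between the two endpoints under the given key."""
    return [c for c in alphabet if key(lo) <= key(c) <= key(hi)]


letters = string.ascii_letters  # 'abc...zABC...Z'

c_range = "".join(codepoint_range("a", "z", letters))
loc_range = "".join(collated_range("a", "z", letters))

print("code point order:", c_range)
print("collation order: ", loc_range)
```

Under the code point reading, [a-z] is just the 26 lowercase letters.  Under
the interleaved reading it also picks up A through Y, but not Z, because Z
sorts after z.  Either result is permitted once you leave the C locale.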

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
