Re: LC_COLLATE in the C locale

bug-gnulib

From:	Paul Eggert
Subject:	Re: LC_COLLATE in the C locale
Date:	Wed, 18 Dec 2019 08:27:02 -0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2

On 12/18/19 2:29 AM, Bruno Haible wrote:
> Hi Paul,
> 
>> I do have a qualm in that coreutils (and I assume others) interpret
>> !hard_locale (LC_COLLATE) as meaning that the locale is unibyte and uses
>> native byte comparison.
> Isn't this warranted by section "LC_COLLATE Category in the POSIX Locale" in
> <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap07.html> ?

I don't see where that section requires unibyte.

>> As I recall on some platforms (macOS maybe?), the C locale uses
>> UTF-8 so this interpretation isn't correct.
> UTF-8 has the nice property that byte-per-byte comparison and
> codepoint-per-codepoint comparison are equivalent.

True, so the code that assumes strcmp == strcoll should work. But I think some
code specifically assumes unibyte. Presumably that code should also check
MB_CUR_MAX, which should be enough in practice (even though it doesn't suffice
in theory).
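
To make the distinction concrete, here is a minimal sketch of keeping the two questions separate: whether collation is plain byte order, and whether the encoding is unibyte. It uses only standard C. The helper names locale_is_unibyte and compare_strings are hypothetical, and the simple_collation flag merely stands in for whatever test the caller already uses (for example, gnulib's !hard_locale (LC_COLLATE)); none of this is taken from coreutils or gnulib.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true if the current locale reports a
   single-byte encoding.  MB_CUR_MAX follows LC_CTYPE, not LC_COLLATE,
   so this is a separate question from how strings collate.  */
static bool
locale_is_unibyte (void)
{
  return MB_CUR_MAX == 1;
}

/* Hypothetical helper: compare A and B with strcmp when the caller has
   already determined that collation is plain byte order
   (SIMPLE_COLLATION), and with strcoll otherwise.  */
static int
compare_strings (char const *a, char const *b, bool simple_collation)
{
  assert (a && b);
  return simple_collation ? strcmp (a, b) : strcoll (a, b);
}
```

Code that only needs ordering would consult the collation test alone, while code that indexes or truncates strings byte by byte would also call locale_is_unibyte () after setlocale has been called.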
