Re: GNU sed version 4.2.1: on OS X, C locale gets aliased to UTF-8

bug-gnulib

From:	Pádraig Brady
Subject:	Re: GNU sed version 4.2.1: on OS X, C locale gets aliased to UTF-8
Date:	Thu, 07 Jun 2012 15:27:31 +0100
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0

On 06/07/2012 03:13 PM, Paolo Bonzini wrote:
> Il 07/06/2012 14:50, Eric Blake ha scritto:
>>>> The fix could be to have two different locale_charset() functions,
>>>> one that returns "US-ASCII" and another one that returns "UTF-8".
>>>> The first one to be used when MB_CUR_MAX and mbrtowc() are used as
>>>> well, the second one to be used by gettext(). But the separation
>>>> line between the two cases is not yet clear to me. Any insights?
> 
> The separation line is what you wrote: whether you'll use the text
> simply for presentation, or whether you'll process it before.  But
> alternatively, we might try a variant of what Eric has suggested...
> 
>> On OS X, can we wrap MB_CUR_MAX to pretend to be 1 when in the "C"
>> locale, to match what cygwin did in distinguishing between 'C' and
>> 'C.UTF-8'?
> 
> ... which is to wrap MB_CUR_MAX and pretend that it is 3.

How do Mac OS X native command line tools select between
UTF-8 and byte processing (C in the traditional sense)?
I'd find it surprising if they didn't support a "C" mode,
for performance or functional reasons.
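
For reference, here is a minimal sketch of how a program could ask the
C library directly which mode applies, using only setlocale(),
MB_CUR_MAX and nl_langinfo(CODESET). The helper names are hypothetical
(they are not part of gnulib or sed), and the sketch deliberately
bypasses gnulib's locale_charset() and any alias table it consults,
so it shows only what the platform itself reports:

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: switch LC_CTYPE to locale_name and return
   MB_CUR_MAX, or 0 if the locale cannot be selected.
   Note that this changes the process-wide LC_CTYPE setting. */
size_t max_bytes_per_char(const char *locale_name)
{
    assert(locale_name != NULL);
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return 0;
    return (size_t) MB_CUR_MAX;
}

/* Hypothetical helper: copy the codeset name that the C library
   reports for locale_name into buf.  Returns false if the locale
   cannot be selected or the name does not fit in buflen bytes. */
bool codeset_name(const char *locale_name, char *buf, size_t buflen)
{
    assert(locale_name != NULL && buf != NULL);
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return false;
    const char *codeset = nl_langinfo(CODESET);
    if (strlen(codeset) >= buflen)
        return false;
    strcpy(buf, codeset);
    return true;
}

/* Byte-oriented ("traditional C") processing is safe when every
   character occupies exactly one byte in the selected locale. */
bool use_byte_processing(const char *locale_name)
{
    return max_bytes_per_char(locale_name) == 1;
}
```

A tool could call use_byte_processing("C") once at startup and take the
fast byte path when it returns true; the codeset string itself differs
between platforms, so it is better printed for diagnosis than compared.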

Max Horn states that Terminal.app on Mac OS X has an option
"Set LANG environment variable" which is _enabled by default_.
So wouldn't that handle the gettext issue?
That would allow handling US-ASCII as "normal",
and not mapping it to UTF-8.

cheers,
Pádraig.
