qemu-devel

Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only

From:	Eric Blake
Subject:	Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
Date:	Mon, 21 Dec 2015 10:49:27 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 12/18/2015 12:55 PM, Markus Armbruster wrote:
> Alberto Garcia <address@hidden> writes:
> 
>>>>> We do however have translations for a few simple strings for the GTK+
>>>>> menu items, so in order to run QEMU using the C locale, and yet have a
>>>>> translated UI let's use setlocale() for LC_MESSAGES only.
>>>>>
>>>> Not sure why I noticed it only now and if it's related to any recent
>>>> package upgrade on my side (using RHEL 7), but I noticed that
>>>> non-ASCII characters in the GTK UI strings are broken for me and git
>>>> bisect pointed to this commit.
>>>
>>> I guess we need to set LC_CTYPE too.
>>
>> That affects functions in ctype.h (isalpha(), islower(), isupper(), ...)
>> I guess that's safe?

Gnulib introduces functions named c_isalpha(), c_islower(), and so
forth, which behave identically regardless of the current locale,
precisely because locale-dependent definitions of which byte sequences
form a valid character can cause undesirable behavior.  I don't know if
glib does the same, but it does indeed have the potential to affect us,
in at least util/id.c:id_wellformed().  It would be weird to let the
user's choice of locale determine which ids they can create.
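
To make the concern concrete, here is a minimal sketch of locale-independent
classification in the spirit of gnulib's c_isalpha().  The names
ascii_isalpha(), ascii_isdigit() and ascii_id_wellformed() are hypothetical,
and the ID rules (a leading letter, then letters, digits, '-', '.', '_') are
assumed for illustration; this is not a copy of util/id.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * ASCII-only classifiers: the result never depends on setlocale().
 * Assumes an ASCII-compatible execution character set, where the
 * letters 'A'..'Z' and 'a'..'z' are contiguous.
 */
static bool ascii_isalpha(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static bool ascii_isdigit(unsigned char c)
{
    return c >= '0' && c <= '9';
}

/*
 * Hypothetical ID check (rules assumed for illustration): the ID must
 * start with an ASCII letter and continue with ASCII letters, digits,
 * '-', '.' or '_'.  Bytes >= 0x80 are always rejected, whatever the
 * user's LC_CTYPE says about them.
 */
static bool ascii_id_wellformed(const char *id)
{
    assert(id != NULL);

    if (!ascii_isalpha((unsigned char)id[0])) {
        return false;
    }
    for (const char *p = id + 1; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;

        if (!ascii_isalpha(c) && !ascii_isdigit(c) &&
            c != '-' && c != '.' && c != '_') {
            return false;
        }
    }
    return true;
}
```

With a check along these lines, the set of accepted IDs stays the same no
matter which locale the user runs under, which is the property that plain
isalpha() cannot promise once LC_CTYPE is set.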

> 
> If we're guessing, then I guess it isn't.  But we shouldn't be guessing.
> 
> "LC_CTYPE affects the behavior of the character handling functions and
> the multibyte and wide character functions."
> 
> I doubt there's much use for the latter in QEMU itself, but in
> libraries, all bets are off.  I guess this is what actually screws up
> GTK.
> 
> We do use the former.  LC_CTYPE set to some sufficiently funky locale is
> bound to upset these uses.
> 
> In short: nope, we can't just set LC_CTYPE, at least not without further
> analysis.

In fact, if LC_CTYPE and LC_COLLATE are incompatible, then strcoll() has
undefined behavior.  GNU coreutils warns:

    Unless otherwise specified, all comparisons use the character
    collating sequence specified by the ‘LC_COLLATE’ locale.(1)
    [...]
    (1) If you use a non-POSIX locale (e.g., by setting ‘LC_ALL’ to
    ‘en_US’), then ‘sort’ may produce output that is sorted differently than
    you’re accustomed to.  In that case, set the ‘LC_ALL’ environment
    variable to ‘C’.  Note that setting only ‘LC_COLLATE’ has two problems.
    First, it is ineffective if ‘LC_ALL’ is also set.  Second, it has
    undefined behavior if ‘LC_CTYPE’ (or ‘LANG’, if ‘LC_CTYPE’ is unset) is
    set to an incompatible value.  For example, you get undefined behavior
    if ‘LC_CTYPE’ is ‘ja_JP.PCK’ but ‘LC_COLLATE’ is ‘en_US.UTF-8’.
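
For reference, a small sketch of what collation looks like while
LC_COLLATE is left untouched.  A C program starts in the "C" locale, and
on implementations such as glibc strcoll() then orders ASCII strings by
byte value, the same way strcmp() does.  The helper name collate_sign()
is hypothetical.

```c
#include <string.h>

/*
 * Hypothetical helper: reduce strcoll() to -1, 0 or 1 so results can be
 * compared across platforms.  With LC_COLLATE still at its startup
 * value ("C"), uppercase ASCII sorts before lowercase ASCII.
 */
static int collate_sign(const char *a, const char *b)
{
    int r = strcoll(a, b);

    return (r > 0) - (r < 0);
}
```

Under a locale such as en_US.UTF-8 the same call may order "B" after
"a", which is the kind of difference the coreutils note is describing.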

Off-hand, we are specifically NOT calling setlocale() for the categories
that we want to leave in the C locale, so we don't have to worry about
LC_ALL throwing us off.  And I'm hard-pressed to think of an example
where LC_COLLATE=C while LC_CTYPE names a multibyte locale will cause
unusual sorting artifacts (the case that coreutils warns against is
when two incompatible multibyte character sets are involved, whereas
our case is a multibyte character set for display but a unibyte set
for collation).  But it is indeed a can of worms that requires
careful analysis.
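
As a rough illustration of the approach under discussion (not the actual
patch), the sketch below adopts the user's environment for LC_MESSAGES only
and then queries the other categories to confirm they are still "C".  The
helper names init_messages_locale() and category_is_c() are hypothetical;
LC_MESSAGES is a POSIX category rather than an ISO C one.

```c
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: take LC_MESSAGES from the environment and leave
 * every other category alone.  Returns the resulting locale name, or
 * NULL if the environment names a locale that is not installed.
 */
static const char *init_messages_locale(void)
{
    return setlocale(LC_MESSAGES, "");
}

/*
 * Hypothetical helper: report whether a category is still in the
 * C/POSIX locale.  Passing NULL to setlocale() only queries the
 * current setting; it does not change it.
 */
static bool category_is_c(int category)
{
    const char *cur = setlocale(category, NULL);

    return cur != NULL &&
           (strcmp(cur, "C") == 0 || strcmp(cur, "POSIX") == 0);
}
```

Because no category other than LC_MESSAGES is ever passed to setlocale()
with "", an LC_ALL setting in the environment only influences message
translation here; LC_CTYPE and LC_COLLATE keep their startup value.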

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
