Re: regcomp gnulib - glibc sync bears fruit
From: Bruno Haible
Subject: Re: regcomp gnulib - glibc sync bears fruit
Date: Tue, 5 Jan 2010 21:14:10 +0100 (MET)
Hi Paolo,
> Before proceeding, however, I'm curious whether using nl_langinfo
> (CODESET) is less precise than locale_charset on some platform. Bruno?
Here's my reply to Jim from yesterday. For some reason it was apparently
not distributed to the mailing list.
Hi Jim,
> @@ -893,7 +896,9 @@ init_dfa (re_dfa_t *dfa, size_t pat_len)
>    dfa->map_notascii = (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MAP_TO_NONASCII)
>                         != 0);
>  #else
> -  if (strcmp (locale_charset (), "UTF-8") == 0)
> +  codeset_name = nl_langinfo (CODESET);
> +  if (strcasecmp (codeset_name, "UTF-8") == 0
> +      || strcasecmp (codeset_name, "UTF8") == 0)
>      dfa->is_utf8 = 1;
>
>    /* We check exhaustively in the loop below if this charset is a
This patch is not wrong: It takes care of the fact that the result
of nl_langinfo(CODESET) can be in upper case or in lower case,
depending on the system, and that on HP-UX, "utf8" is returned
(see lib/config.charset).
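
For illustration, the check in the patch can be sketched as a standalone helper. This is only a minimal sketch: the function names codeset_name_is_utf8 and current_locale_is_utf8 are hypothetical and appear neither in the patch nor in gnulib. It assumes a POSIX system that provides nl_langinfo and strcasecmp.

```c
#include <langinfo.h>
#include <stdbool.h>
#include <strings.h>

/* Hypothetical helper (not part of the patch): return true if NAME
   denotes UTF-8.  The comparison ignores case, and the spelling
   without a hyphen is accepted as well, which covers the "utf8"
   value returned on HP-UX.  */
bool
codeset_name_is_utf8 (const char *name)
{
  return (strcasecmp (name, "UTF-8") == 0
          || strcasecmp (name, "UTF8") == 0);
}

/* Hypothetical helper: return true if the current LC_CTYPE locale
   uses UTF-8.  The caller is assumed to have called
   setlocale (LC_ALL, "") beforehand; otherwise the "C" locale is
   still in effect.  */
bool
current_locale_is_utf8 (void)
{
  return codeset_name_is_utf8 (nl_langinfo (CODESET));
}
```

Splitting the string comparison from the nl_langinfo call is a choice made here only so that the comparison can be exercised without depending on the locale of the machine.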
Nevertheless, I would neither apply nor recommend it, because
the nl_langinfo module may include a lot more stuff in the future:
- It may include real localizations of the values, instead of returning
English dummy values. I have already written the converter from
glibc locale data to PO files that can be read by the nl_langinfo
replacement.
- It may include an emulation of the NL_LOCALE_NAME(category)
macro, which has worked since glibc 2.11.1. This emulation would rely on
the 'localename' module.
When all you need is nl_langinfo(CODESET), the full-blown 'nl_langinfo'
module is too heavyweight. 'localcharset' is not a POSIX API, but it is
the better fit given the gnulib module structure.
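
To illustrate why the original code could get away with a plain strcmp against "UTF-8": locale_charset returns a canonical charset name, so the caller does not need to know the per-platform spellings. The following is a rough sketch of that idea only, not gnulib's actual implementation. The alias table, the struct, and the names canonical_charset and charset_is_utf8 are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical alias entry: a platform-specific spelling and the
   canonical name it maps to.  */
struct charset_alias
{
  const char *alias;
  const char *canonical;
};

/* Hypothetical table with just enough entries for this example.  */
static const struct charset_alias alias_table[] =
  {
    { "utf8", "UTF-8" },
    { "utf-8", "UTF-8" }
  };

/* Return the canonical name for NAME, or NAME itself if the table
   has no entry for it.  Aliases are matched ignoring case.  */
const char *
canonical_charset (const char *name)
{
  size_t i;

  for (i = 0; i < sizeof alias_table / sizeof alias_table[0]; i++)
    if (strcasecmp (name, alias_table[i].alias) == 0)
      return alias_table[i].canonical;
  return name;
}

/* Once the name is canonical, an exact comparison is sufficient.  */
bool
charset_is_utf8 (const char *name)
{
  return strcmp (canonical_charset (name), "UTF-8") == 0;
}
```

With the normalization done in one place, callers such as init_dfa would only ever compare against the single canonical spelling.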
Bruno