bug-coreutils

Re: check-AUTHORS fails because of non ansi characters


From: Jim Meyering
Subject: Re: check-AUTHORS fails because of non ansi characters
Date: Sat, 21 Jun 2008 10:34:02 +0200

Eric Blake <address@hidden> wrote:
> According to Jim Meyering on 6/20/2008 1:09 PM:
> |> 58c58
> |> < ptx: Fran?ois Pinard
> |> ---
> |>> ptx: François Pinard
>
> In my email, this is rendering as one vs. two characters.  I suspect it
> might be a locale issue - perhaps Jim is using a UTF-8 locale, and Michael
> is using a Latin-1 encoding?  The proper_name_utf8 module renders in
> UTF-8, but then passes the result through iconv to transliterate into your
> locale.

The problem is probably that his system lacks the en_US.UTF-8 locale,
which is used by that check-AUTHORS rule.
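
A quick way to check for that is sketched below.  The helper names
normalize_locale and locale_in_list are made up for illustration, and
the sketch assumes only that "locale -a" prints installed locales one
per line, possibly spelling the codeset as "utf8" rather than "UTF-8".

```shell
# Normalize a locale name so "en_US.UTF-8" and "en_US.utf8" compare equal:
# lowercase everything and drop hyphens.
normalize_locale() {
  printf '%s\n' "$1" | tr 'A-Z' 'a-z' | sed 's/-//g'
}

# Succeed if the locale named by $1 appears in the list read from stdin.
locale_in_list() {
  want=$(normalize_locale "$1")
  while IFS= read -r name; do
    test "$(normalize_locale "$name")" = "$want" && return 0
  done
  return 1
}

# Typical use:
#   locale -a | locale_in_list en_US.UTF-8 || echo 'en_US.UTF-8 is missing'
```

Reading the list from stdin rather than calling "locale -a" inside the
function keeps the comparison testable on any system.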

Here's a change I'm considering.  It's easy in the sense that it's merely
using an existing m4 macro, gt_LOCALE_FR_UTF8, but has the drawback
of depending on a locale that is less likely to be installed than the
English one.

However, people who run "make distcheck" are required to
have plenty of tools that regular "make check" users need
not have, so that's ok.

One twist was that on my system, the French translation of "F. Pinard"
was identical to the original, so even with LC_ALL=$(LOCALE_FR_UTF8),
"make check-AUTHORS" was still failing, due to the way proper_name_utf8
works (returning the translation if it contains the original ASCII
string).

Another potential glitch: I deliberately use plain "perl" here rather
than $(PERL), so the rule fails when perl is missing.  With tests that
are run only via "make distcheck", I can afford to take such liberties.
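
For reference, here is a rough shell sketch of the per-program
transformation that check-AUTHORS applies to --version output, using
the English "Written by" form.  The function name authors_line and the
sample author names in the usage comment are hypothetical, and the
sketch handles only an author list that fits on one line (the real rule
first joins wrapped lines with perl).

```shell
# Turn the "Written by A, B, and C." line of --version output (on stdin)
# into "PROG: A, B, C", the form compared against the AUTHORS file.
# $1 is the program name; it must not contain "/" since it is spliced
# into a sed replacement.
authors_line() {
  sed -n '/Written by /{ s//'"$1"': /; s/,* and /, /; s/\.$//; p; }'
}

# Typical use:
#   printf 'Written by A. Author, B. Author, and C. Author.\n' \
#     | authors_line foo
```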

From be49c621dd38e21336b86f62342d1fe54f3e2882 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sat, 21 Jun 2008 09:57:10 +0200
Subject: [PATCH] avoid "make check-AUTHORS" failure on systems lacking en_US.UTF-8
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

* configure.ac: Use gt_LOCALE_FR_UTF8.
* src/Makefile.am (check-AUTHORS): Skip the test if we don't
have a French UTF8 locale.
* src/ptx.c (AUTHORS): Spell François' name differently.
Use octal rather than equivalent hexadecimal for c-cedilla,
to be consistent with other uses of proper_name_utf8.
---
 configure.ac    |    2 ++
 src/Makefile.am |   42 ++++++++++++++++++++++++++----------------
 src/ptx.c       |    2 +-
 3 files changed, 29 insertions(+), 17 deletions(-)

diff --git a/configure.ac b/configure.ac
index ac93e1c..766b76d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -336,6 +336,8 @@ AM_GNU_GETTEXT_VERSION([0.15])

 # For a test of uniq: it uses the $LOCALE_FR envvar.
 gt_LOCALE_FR
+# For the check-AUTHORS test.
+gt_LOCALE_FR_UTF8

 AC_CONFIG_FILES(
   Makefile
diff --git a/src/Makefile.am b/src/Makefile.am
index 342fc09..f962907 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -349,22 +349,32 @@ au_dotdot = authors-dotdot
 au_actual = authors-actual
 .PHONY: check-AUTHORS
 check-AUTHORS: $(all_programs)
-       rm -f $(au_actual) $(au_dotdot)
-       for i in `ls $(all_programs) | sed -e 's,$(EXEEXT)$$,,' \
-           | $(ASSORT) -u`; do                         \
-         test "$$i" = '[' && continue;                 \
-         exe=$$i;                                      \
-         if test "$$i" = install; then                 \
-           exe=ginstall;                               \
-         elif test "$$i" = test; then                  \
-           exe='[';                                    \
-         fi;                                           \
-         LC_ALL=en_US.UTF-8 ./$$exe --version                  \
-           | perl -0 -pi -e 's/,\n/, /gm'              \
-           |sed -n '/Written by /{ s//'"$$i"': /; s/,* and /, /; s/\.$$//; p; }'; \
-       done > $(au_actual)
-       sed -n '/^[^ ][^ ]*:/p' $(top_srcdir)/AUTHORS > $(au_dotdot)
-       diff $(au_actual) $(au_dotdot) && rm -f $(au_actual) $(au_dotdot)
+       if test '$(LOCALE_FR_UTF8)' = none; then                \
+         echo '$@: skipping this test' 1>&2;                   \
+         echo 'your system lacks a french UTF8 locale' 1>&2;   \
+         echo 'consider installing e.g., fr_FR.UTF-8' 1>&2;    \
+       else                                                    \
+         rm -f $(au_actual) $(au_dotdot);                      \
+         for i in `ls $(all_programs) | sed -e 's,$(EXEEXT)$$,,' \
+             | $(ASSORT) -u`; do                               \
+           test "$$i" = '[' && continue;                       \
+           exe=$$i;                                            \
+           if test "$$i" = install; then                       \
+             exe=ginstall;                                     \
+           elif test "$$i" = test; then                        \
+             exe='[';                                          \
+           fi;                                                 \
+           LC_ALL=$(LOCALE_FR_UTF8) ./$$exe --version          \
+             | perl -0 -pe 's/,\n/, /gm'                       \
+             | perl -ne '/.*crit par / or next;'               \
+               -e 's//'"$$i"': /;'                             \
+               -e 's/,* et /, /; s/\.$$//; print';             \
+         done > $(au_actual);                                  \
+         sed -n '/^[^ ][^ ]*:/p' $(top_srcdir)/AUTHORS         \
+           > $(au_dotdot);                                     \
+         diff -u $(au_actual) $(au_dotdot)                     \
+           && rm -f $(au_actual) $(au_dotdot) || exit 1;       \
+       fi

 # Make sure we don't define any S_IS* macros in src/*.c files.
 # Not a big deal, but they're already defined via system.h.
diff --git a/src/ptx.c b/src/ptx.c
index 827d22e..d3c1c63 100644
--- a/src/ptx.c
+++ b/src/ptx.c
@@ -37,7 +37,7 @@
 /* TRANSLATORS: Please translate "F. Pinard" to "François Pinard"
    if "ç" (c-with-cedilla) is available in the translation's character
    set and encoding.  */
-#define AUTHORS proper_name_utf8 ("F. Pinard", "Fran\xc3\xa7ois Pinard")
+#define AUTHORS proper_name_utf8 ("Franc,ois Pinard", "Fran\303\247ois Pinard")

 /* Number of possible characters in a byte.  */
 #define CHAR_SET_SIZE 256
--
1.5.6.rc3.23.gc3bdd
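
To see that the octal spelling adopted in src/ptx.c denotes the same
two bytes as the hexadecimal one it replaces (0xc3 0xa7, the UTF-8
encoding of c-cedilla), something like the following sketch works.
hex_bytes is a hypothetical helper, and the sketch assumes a POSIX
printf and od.

```shell
# Print the bytes of the string $1 as space-separated lowercase hex.
hex_bytes() {
  printf '%s' "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# POSIX printf understands \ddd octal escapes in its format string,
# so this builds the same bytes as the C literal "Fran\303\247ois".
name=$(printf 'Fran\303\247ois')
hex_bytes "$name"
```

The c3 a7 pair in the output is the pair the old "\xc3\xa7" escape
produced.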
