remove strcasecmp from regex code; use localcharset instead
From: Paul Eggert
Subject: remove strcasecmp from regex code; use localcharset instead
Date: Wed, 14 Feb 2007 16:17:40 -0800
User-agent: Gnus/5.1008 (Gnus v5.10.8) Emacs/21.4 (gnu/linux)
I installed this. It adds a dependency of regex on localcharset,
which I'd rather avoid, but the alternatives seemed worse to me.
2007-02-14 Paul Eggert <address@hidden>
Fix regex code so it doesn't rely on strcasecmp.
* lib/regex_internal.h: Include <langinfo.h> only if _LIBC is defined.
Otherwise, include gnulib's langinfo.h.
* lib/regcomp.c (init_dfa): Don't use strcasecmp, as it can have
undesirable behavior in non-C locales. Instead, rely on locale_charset.
* m4/regex.m4 (gl_PREREQ_REGEX): Don't require AM_LANGINFO_CODESET.
* modules/regex (FILES): Remove m4/codeset.m4.
(Depends-on): Add localcharset. Remove strcase.
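
For readers wondering what "undesirable behavior in non-C locales" can look like: strcasecmp folds case according to the current locale, and in some locales (a Turkish one is the usual example) 'I' and 'i' are not treated as a case pair, so a name such as "utf-8" and "UTF-8" comparison is fine but other ASCII names may stop matching. The patch avoids the problem by comparing the result of locale_charset () with plain strcmp. The sketch below shows a different, locale-independent approach for illustration only; ascii_tolower, ascii_strcasecmp and charset_is_utf8 are hypothetical names that do not appear in the patch, and the code assumes an ASCII-compatible execution character set.

```c
/* Hypothetical helper, not part of the patch: fold an ASCII uppercase
   letter to lowercase without consulting the current locale.
   Assumes 'A'..'Z' are contiguous, as in ASCII.  */
static int
ascii_tolower (int c)
{
  return ('A' <= c && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Compare S1 and S2, ignoring case for ASCII letters only.
   Return a negative, zero, or positive value, like strcmp.  */
static int
ascii_strcasecmp (const char *s1, const char *s2)
{
  const unsigned char *p1 = (const unsigned char *) s1;
  const unsigned char *p2 = (const unsigned char *) s2;
  int c1, c2;

  do
    {
      c1 = ascii_tolower (*p1++);
      c2 = ascii_tolower (*p2++);
    }
  while (c1 != '\0' && c1 == c2);

  return c1 - c2;
}

/* Return 1 if NAME is one of the two spellings the old code accepted
   ("UTF-8" or "UTF8", in any ASCII case), else 0.  */
static int
charset_is_utf8 (const char *name)
{
  return (ascii_strcasecmp (name, "UTF-8") == 0
          || ascii_strcasecmp (name, "UTF8") == 0);
}
```

With this sketch, charset_is_utf8 ("utf8") and charset_is_utf8 ("Utf-8") both return 1 regardless of the locale in effect, while charset_is_utf8 ("ISO-8859-9") returns 0. The patch itself needs no such helper, since it compares against the single spelling "UTF-8" only.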
Index: lib/regex_internal.h
===================================================================
RCS file: /cvsroot/gnulib/gnulib/lib/regex_internal.h,v
retrieving revision 1.31
diff -u -p -r1.31 regex_internal.h
--- lib/regex_internal.h 2 Feb 2007 22:15:44 -0000 1.31
+++ lib/regex_internal.h 15 Feb 2007 00:16:51 -0000
@@ -27,8 +27,10 @@
#include <stdlib.h>
#include <string.h>
-#if defined HAVE_LANGINFO_H || defined HAVE_LANGINFO_CODESET || defined _LIBC
+#ifdef _LIBC
# include <langinfo.h>
+#else
+# include "localcharset.h"
#endif
#if defined HAVE_LOCALE_H || defined _LIBC
# include <locale.h>
Index: lib/regcomp.c
===================================================================
RCS file: /cvsroot/gnulib/gnulib/lib/regcomp.c,v
retrieving revision 1.26
diff -u -p -r1.26 regcomp.c
--- lib/regcomp.c 5 Feb 2007 15:38:59 -0000 1.26
+++ lib/regcomp.c 15 Feb 2007 00:16:52 -0000
@@ -829,9 +829,6 @@ static reg_errcode_t
init_dfa (re_dfa_t *dfa, size_t pat_len)
{
__re_size_t table_size;
-#ifndef _LIBC
- char *codeset_name;
-#endif
#ifdef RE_ENABLE_I18N
size_t max_i18n_object_size = MAX (sizeof (wchar_t), sizeof (wctype_t));
#else
@@ -875,22 +872,7 @@ init_dfa (re_dfa_t *dfa, size_t pat_len)
dfa->map_notascii = (_NL_CURRENT_WORD (LC_CTYPE, _NL_CTYPE_MAP_TO_NONASCII)
!= 0);
#else
-# ifdef HAVE_LANGINFO_CODESET
- codeset_name = nl_langinfo (CODESET);
-# else
- codeset_name = getenv ("LC_ALL");
- if (codeset_name == NULL || codeset_name[0] == '\0')
- codeset_name = getenv ("LC_CTYPE");
- if (codeset_name == NULL || codeset_name[0] == '\0')
- codeset_name = getenv ("LANG");
- if (codeset_name == NULL)
- codeset_name = "";
- else if (strchr (codeset_name, '.') != NULL)
- codeset_name = strchr (codeset_name, '.') + 1;
-# endif
-
- if (strcasecmp (codeset_name, "UTF-8") == 0
- || strcasecmp (codeset_name, "UTF8") == 0)
+ if (strcmp (locale_charset (), "UTF-8") == 0)
dfa->is_utf8 = 1;
/* We check exhaustively in the loop below if this charset is a
Index: m4/regex.m4
===================================================================
RCS file: /cvsroot/gnulib/gnulib/m4/regex.m4,v
retrieving revision 1.62
diff -u -p -r1.62 regex.m4
--- m4/regex.m4 6 Feb 2007 07:02:59 -0000 1.62
+++ m4/regex.m4 15 Feb 2007 00:16:52 -0000
@@ -1,4 +1,4 @@
-#serial 44
+#serial 45
# Copyright (C) 1996, 1997, 1998, 1999, 2000, 2001, 2003, 2004, 2005,
# 2006, 2007 Free Software Foundation, Inc.
@@ -203,7 +203,6 @@ AC_DEFUN([gl_PREREQ_REGEX],
[
AC_REQUIRE([AC_GNU_SOURCE])
AC_REQUIRE([AC_C_RESTRICT])
- AC_REQUIRE([AM_LANGINFO_CODESET])
AC_CHECK_FUNCS_ONCE([iswctype mbrtowc wcrtomb wcscoll])
AC_CHECK_DECLS([isblank], [], [], [#include <ctype.h>])
])
Index: modules/regex
===================================================================
RCS file: /cvsroot/gnulib/gnulib/modules/regex,v
retrieving revision 1.23
diff -u -p -r1.23 regex
--- modules/regex 8 Feb 2007 23:34:28 -0000 1.23
+++ modules/regex 15 Feb 2007 00:16:52 -0000
@@ -8,17 +8,16 @@ lib/regex_internal.c
lib/regex_internal.h
lib/regexec.c
lib/regcomp.c
-m4/codeset.m4
m4/regex.m4
Depends-on:
alloca
extensions
gettext-h
+localcharset
malloc
stdbool
stdint
-strcase
ssize_t
wchar
wctype