bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 02/14] Fix character encoding aliases for OS/2


From: Daiki Ueno
Subject: Re: [PATCH 02/14] Fix character encoding aliases for OS/2
Date: Thu, 25 Dec 2014 16:54:52 +0900
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.0.50 (gnu/linux)

Thanks for the update.  I'm attaching a slightly modified version of the
patch, with typo and style fixes (e.g. adding two spaces after a period,
according to GCS, etc).

KO Myung-Hun <address@hidden> writes:

> I used CODEPAGE and COUNTRY parts of OS/2 command references.

Could you provide the title and the corresponding section number of the
manual?  For example, for VMS, we have the following comment:

      /* The list of encodings is taken from the OpenVMS 7.3-1 documentation
         "Compaq C Run-Time Library Reference Manual for OpenVMS systems"
         section 10.7 "Handling Different Character Sets".  */

> And the differences of others is because cp915, cp921 and cp1381 are not
> supported by libiconv. In these cases, I picked up code pages from
> config.charset.

I see.

> Finally, I changed a code page for bg_BG from cp1251 to cp855 which is
> etc code page.

Is this a desired change?  It seems those codepages are largely
incompatible.  I changed this to ISO-8859-5.

By the way, it's tempting to call DosQueryCp if a charset is omitted
from the locale name, to avoid maintaining the default mapping
ourselves.  I'd rather not do that for now, but is it feasible?

Regards,
--
Daiki Ueno
>From b7c0c0f267a3b00d65bc7845ba486298a12628de Mon Sep 17 00:00:00 2001
From: KO Myung-Hun <address@hidden>
Date: Thu, 23 Feb 2012 22:37:21 +0900
Subject: [PATCH] localcharset: embed charset aliases for OS/2

Embed the contents of charset.alias file to avoid the troubles finding a
separate file, like Windows.

* lib/config.charset: Remove os2* from case "$os" in
* lib/localcharset.c (get_charset_aliases): Use embedded encoding
aliases on OS/2.
---
 lib/config.charset |  4 +--
 lib/localcharset.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/lib/config.charset b/lib/config.charset
index 4e4c7ed..3e6c88f 100644
--- a/lib/config.charset
+++ b/lib/config.charset
@@ -348,12 +348,10 @@ case "$os" in
     #echo "sun_eu_greek ?" # what is this?
     echo "UTF-8 UTF-8"
     ;;
-  freebsd* | os2*)
+  freebsd*)
     # FreeBSD 4.2 doesn't have nl_langinfo(CODESET); therefore
     # localcharset.c falls back to using the full locale name
     # from the environment variables.
-    # Likewise for OS/2. OS/2 has XFree86 just like FreeBSD. Just
-    # reuse FreeBSD's locale data for OS/2.
     echo "C ASCII"
     echo "US-ASCII ASCII"
     for l in la_LN lt_LN; do
diff --git a/lib/localcharset.c b/lib/localcharset.c
index 1c17af0..5147862 100644
--- a/lib/localcharset.c
+++ b/lib/localcharset.c
@@ -128,7 +128,7 @@ get_charset_aliases (void)
   cp = charset_aliases;
   if (cp == NULL)
     {
-#if !(defined DARWIN7 || defined VMS || defined WINDOWS_NATIVE || defined 
__CYGWIN__)
+#if !(defined DARWIN7 || defined VMS || defined WINDOWS_NATIVE || defined 
__CYGWIN__ || defined OS2)
       const char *dir;
       const char *base = "charset.alias";
       char *file_name;
@@ -342,6 +342,74 @@ get_charset_aliases (void)
            "CP54936" "\0" "GB18030" "\0"
            "CP65001" "\0" "UTF-8" "\0";
 # endif
+# if defined OS2
+      /* To avoid the troubles of installing a separate file in the same
+         directory as the DLL and of retrieving the DLL's directory at
+         runtime, simply inline the aliases here.  */
+
+      /* On OS/2, a charset is usually omitted from the locale name
+         (for example, users set LANG just to ko_KR for Korean), thus
+         we define the default mapping from a locale name to charset.
+         The list of encodings is taken from the OS/2 documentation
+         "FIXME",
+         see also <http://www.borgendale.com/locale.htm>.
+         Since OS/2 codepages CP915, CP921, and CP1361 are not defined
+         in libiconv, map them to their equivalents.  */
+                                            /*  Pri,  Alt,  Etc */
+      cp = "ar_AA" "\0" "CP864" "\0"        /*  864,  850,  437 */
+           "bg_BG" "\0" "ISO-8859-5" "\0"   /*  915,  850,  855 */
+           "ca_ES" "\0" "CP850" "\0"        /*                  */
+           "cs_SZ" "\0" "CP852" "\0"        /*  852,  850       */
+           "da_DK" "\0" "CP850" "\0"        /*  850,  865, 1004 */
+           "de_AT" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "de_CH" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "de_DE" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "el_GR" "\0" "CP869" "\0"        /*  869,  850,  813 */
+           "en_AU" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "en_CA" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "en_GB" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "en_IE" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "en_NZ" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "en_US" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "en_ZA" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "es_ES" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "es_LA" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "et_EE" "\0" "CP922" "\0"        /*  922,  850       */
+           "fi_FI" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "fr_BE" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "fr_CA" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "fr_CH" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "fr_FR" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "hr_HR" "\0" "CP852" "\0"        /*  852,  850       */
+           "hu_HU" "\0" "CP852" "\0"        /*  852,  850, 1004 */
+           "is_IS" "\0" "CP850" "\0"        /*  850,  861, 1004 */
+           "it_CH" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "it_IT" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "iw_IL" "\0" "CP862" "\0"        /*  862,  850,  437 */
+           "ja_JP" "\0" "CP943" "\0"        /*  943,  850,  942 */
+           "ko_KR" "\0" "CP949" "\0"        /*  949,  850,  944 */
+           "lt_LT" "\0" "ISO-8859-13" "\0"  /*  921,  850       */
+           "lv_LV" "\0" "ISO-8859-13" "\0"  /*  921,  850       */
+           "mk_MK" "\0" "CP855" "\0"        /*  855,  850,  915 */
+           "nl_BE" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "nl_NL" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "no_NO" "\0" "CP850" "\0"        /*  850,  865, 1004 */
+           "pl_PL" "\0" "CP852" "\0"        /*  852,  850       */
+           "pt_BR" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "pt_PT" "\0" "CP850" "\0"        /*  850,  860, 1004 */
+           "ro_RO" "\0" "CP852" "\0"        /*  852,  850, 1004 */
+           "ru_RU" "\0" "CP866" "\0"        /*  866,  850,  915 */
+           "sh_BA" "\0" "CP852" "\0"        /*  852,  850       */
+           "sk_SK" "\0" "CP852" "\0"        /*  852,  850       */
+           "sl_SI" "\0" "CP852" "\0"        /*  852,  850       */
+           "sq_AL" "\0" "CP850" "\0"        /*  850,  437       */
+           "sr_SP" "\0" "CP855" "\0"        /*  855,  850,  915 */
+           "sv_SE" "\0" "CP850" "\0"        /*  850,  437, 1004 */
+           "th_TH" "\0" "CP874" "\0"        /*  874,  850       */
+           "tr_TR" "\0" "CP857" "\0"        /*  857,  850, 1004 */
+           "zh_CN" "\0" "GB2312" "\0"       /* 1381,  850,  946 */
+           "zh_TW" "\0" "CP950" "\0";       /*  950,  850,  948 */
+# endif
 #endif
 
       charset_aliases = cp;
@@ -530,7 +598,7 @@ locale_charset (void)
             }
         }
 
-      /* Resolve through the charset.alias file.  */
+      /* Resolve through aliases.  */
       codeset = locale;
     }
   else
-- 
2.1.0


reply via email to

[Prev in Thread] Current Thread [Next in Thread]