From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] difference in iconv for EBCDIC SBCS conversion from z/OS OS-provided iconv
Date: Sun, 02 Apr 2023 23:40:42 +0200

Hi Mike,
> > * Why was the bug report that wanted glibc's IBM1047 mapping table changed
> > closed as "NOT A BUG"?
> > https://bugzilla.redhat.com/show_bug.cgi?id=170072
>
> Based on Eric and Jakub's discussion, I would agree with Jakub that
> unfortunately we seem to have 2 'standards' here which are incompatible
> and it would be good for the user community if we supported both.
This bug report references http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf
from 2003. In the newest edition, at
https://www.unicode.org/versions/Unicode15.0.0/ch05.pdf ,
there are more details:
  * The "newline function" is represented by
      - U+000A (= LF) on Unix,
      - U+0085 (= NEL) on EBCDIC-based OSes,
      - U+000D (= CR) on MacOS 9 and earlier.
    [Table 5-2]
  * In EBCDIC-based OSes other than z/OS (you listed them in
    <https://lists.gnu.org/archive/html/bug-gnu-libiconv/2023-04/msg00002.html>)
    this newline function is represented by EBCDIC 0x15.
    [Table 5-1]
  * Likewise "text files on z/OS traditionally use NEL for the newline
    function."
    [page 210]
  * But in the z/OS Unix System Services U+000A maps to EBCDIC 0x15. [Table 5-1]
    "That mapping arises from the use of the LF character for the newline
    function in C programs and in Unix environments"
    [page 210]
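
As a rough illustration of the two conventions in the list above, here is
a minimal Python sketch. It models only LF and NEL, not the full IBM-1047
table. The names FORMAL and ZOS_UNIX are made up for this example, and the
z/OS entry for U+0085 (0x25, the slot that LF vacates) is an assumption; the
text quoted above only states where U+000A goes.

```python
# Sketch only: two ways of placing LF and NEL in an EBCDIC code page.
# FORMAL and ZOS_UNIX are illustrative names, not part of any library.

# Unicode code point -> EBCDIC byte, as on EBCDIC-based OSes other than z/OS.
FORMAL = {0x000A: 0x25, 0x0085: 0x15}

# z/OS Unix System Services: U+000A maps to 0x15.
# Assumption: U+0085 moves to 0x25 so that the mapping stays one-to-one.
ZOS_UNIX = {0x000A: 0x15, 0x0085: 0x25}


def invert(table):
    """Return the EBCDIC byte -> Unicode code point direction of a table."""
    return {ebcdic: ucs for ucs, ebcdic in table.items()}


def decode_0x15(table):
    """Unicode code point that EBCDIC 0x15 decodes to under the given table."""
    return invert(table)[0x15]
```

Under these tables, EBCDIC 0x15 decodes to U+0085 with FORMAL and to U+000A
with ZOS_UNIX, which is the disagreement at issue.
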
So, for z/OS users, it appears to depend on whether they are working more with
"traditional" programs or more with "C programs and Unix environments".
> > * Why did msbrown write "Note that "line feed" is 0x25 in EBCDIC/IBM-1047,
> > but the C language '\n' is 0x15 (EBCDIC "new line")." ?
> > https://www.austingroupbugs.net/view.php?id=251
>
> This is the crux of the situation. A huge number of tools are either
> written in C/C++ or the tools are built with other tools written in C/C++
> and the '\n' in all the code is 0x15. So choosing a different value for a
> file means that none of those tools work. In particular, if you iconv a
> file and try to use 'less' it won't work because it won't 'see' the
> newlines.
I see. It sounds like there was a standard at some point; then the
tools written in C made up a different de-facto standard, and now the
original standard is less relevant (but no new formal standard was issued).
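
To make the 'less' problem concrete, here is a small Python sketch of a
tool that looks for '\n' as byte 0x15, which is what the quoted text says
C code on z/OS does. The function name count_lines is hypothetical; the
byte strings are "hello" plus a terminator in IBM1047, with the 0x25 variant
matching what glibc iconv produces for "hello\n".

```python
# Sketch only: how a C-style tool on z/OS would look for line terminators.
C_NEWLINE_ON_ZOS = 0x15  # value of '\n' in C code there, per the quoted text


def count_lines(data, newline=C_NEWLINE_ON_ZOS):
    """Count terminators the way a tool scanning for '\\n' would."""
    return data.count(bytes([newline]))


HELLO = bytes([0x88, 0x85, 0x93, 0x93, 0x96])  # "hello" in IBM1047

# "hello\n" converted with U+000A -> 0x25 (the formal mapping).
converted_formal = HELLO + bytes([0x25])

# The same text with U+000A -> 0x15 (the z/OS Unix convention).
converted_zos = HELLO + bytes([0x15])
```

Here count_lines(converted_formal) finds no terminator at all, while
count_lines(converted_zos) finds one; a pager built this way would treat
the first file as a single unterminated line.
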
> I would think on z/OS for our UNIX System Services customers we could
> 'compile in' this value (like PCRE) which would be my preference over an
> environment variable.
I prefer an environment variable, for these reasons:
1) Given the text that I've cited above, it looks like some "traditional"
   files on z/OS need U+0085 to map to EBCDIC 0x15. Therefore, for the same
   user on the same OS, sometimes one way is desired, sometimes the other.
2) Compatibility with glibc iconv and recode 3.7.x; both map U+0085 to
   EBCDIC 0x15:
     $ echo hello | iconv -f ASCII -t IBM1047 | hd
     000000 88 85 93 93 96 25                                .....%
     $ printf 'hello\u0085' | iconv -f UTF-8 -t IBM1047 | hd
     000000 88 85 93 93 96 15                                ......
     $ echo hello | recode ASCII..IBM1047 | hd
     000000 88 85 93 93 96 25                                .....%
     $ printf 'hello\u0085' | recode UTF-8..IBM1047 | hd
     000000 88 85 93 93 96 15                                ......
3) It is more sensible to _deviate_ from a formal standard by setting
an environment variable, than it is to _adhere_ to a formal standard
by setting an environment variable. (Remember the days of the
POSIX_ME_HARDER environment variable? :-) )
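
A minimal Python sketch of what such a switch could look like, under
stated assumptions: the variable name EXAMPLE_EBCDIC_ZOS_UNIX is made up for
this illustration and is not an existing libiconv setting, and the z/OS
value for U+0085 (0x25) is assumed rather than taken from the text above.
The default branch follows the formal mapping, so the deviation is what
requires the variable.

```python
import os

# Hypothetical variable name, invented for this sketch only.
HYPOTHETICAL_ENV_VAR = "EXAMPLE_EBCDIC_ZOS_UNIX"


def lf_nel_bytes(environ=None):
    """Return (EBCDIC byte for U+000A, EBCDIC byte for U+0085)."""
    env = os.environ if environ is None else environ
    if env.get(HYPOTHETICAL_ENV_VAR):
        # Deviation: z/OS Unix convention. NEL -> 0x25 is an assumption.
        return 0x15, 0x25
    # Default: formal mapping, matching the glibc iconv and recode output.
    return 0x25, 0x15


def encode_terminator(code_point, environ=None):
    """Encode U+000A or U+0085 under the convention selected by the environment."""
    lf, nel = lf_nel_bytes(environ)
    return {0x000A: lf, 0x0085: nel}[code_point]
```

Passing a plain dict as environ, for example
encode_terminator(0x000A, {"EXAMPLE_EBCDIC_ZOS_UNIX": "1"}), lets the same
user get either behaviour per invocation without rebuilding anything.
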
Bruno