From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] difference in iconv for EBCDIC SBCS conversion from z/OS OS-provided iconv
Date: Sun, 02 Apr 2023 12:11:42 +0200

[Re-adding the mailing list in CC. Please keep the mailing list in CC.]

Mike Fulton wrote:
> > https://en.wikipedia.org/wiki/EBCDIC#Definitions_of_non-ASCII_EBCDIC_controls
> > is not conclusive:
> >   "Line break. Default mapping (0085) matches ISO/IEC 6429's NEL.
> >    Mappings sometimes swapped with Line Feed (EBCDIC 0x25) in accordance
> >    with UNIX line breaking convention."
> >
> I agree it's confusing.
> Here's a reasonably good IBM doc on what the EBCDIC code pages want for NL:
> https://www.ibm.com/docs/en/zos/2.1.0?topic=server-different-end-line-characters-in-text-files
> 
> We definitely need EBCDIC 0x15 to map to ASCII 0x0A. Here is the 2 line
> file:
> Hello
> World
> 
> dumped out in hex:
> 
> FULTONM@ZOSCAN2B bash ~> hexdump hw.ibm1047.txt
> 000000 c8859393 9615e696 99938415
> 00000c
> 
> FULTONM@ZOSCAN2B bash ~> hexdump hw.iso8859-1.txt
> 000000 48656c6c 6f0a576f 726c640a
> 00000c

If it were that simple, and EBCDIC 0x15 always had to map to U+000A, then

  * Why does this documentation, also from z/OS 2.1.0, say that EBCDIC 0x25
    maps to U+000A?
    https://www.ibm.com/docs/en/zos/2.1.0?topic=acif-fileformat

  * Why has glibc/localedata/charmaps/IBM1047, which carries a
    "source: IBM Character Data Representation Architecture" annotation,
    mapped 0x25 to U+000A since its initial revision in 1997?

  * Why does Wikipedia say "sometimes swapped"?
    https://en.wikipedia.org/wiki/EBCDIC#Definitions_of_non-ASCII_EBCDIC_controls

  * Why was the bug report that wanted glibc's IBM1047 mapping table changed
    closed as "NOT A BUG"?
    https://bugzilla.redhat.com/show_bug.cgi?id=170072

  * Why does PCRE have two configure options --enable-ebcdic and
    --enable-ebcdic-nl25 ?
    https://opensource.apple.com/source/pcre/pcre-12/pcre/configure.ac.auto.html

  * Why did msbrown write "Note that "line feed" is 0x25 in EBCDIC/IBM-1047, but
    the C language '\n' is 0x15 (EBCDIC "new line")." ?
    https://www.austingroupbugs.net/view.php?id=251

I'll keep doing what glibc does, in this respect.

But you can certainly, in the libiconv version that you build for z/OS, map
0x15 and 0x25 the other way around, and then get feedback from your users
about it. Just make sure, please, that you modify the "iconv --version"
output, so that when users report a bug, it's clear whether it's the
original libiconv or a modified one.
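
To make the two conventions concrete, here is a minimal sketch in C. It is
not libiconv code: the function names and the "swapped" flag are
hypothetical, and the default branch assumes the mapping described above
(0x25 to U+000A, with 0x15 going to U+0085 as in the Wikipedia passage).
The sample bytes are the IBM-1047 hexdump quoted earlier.

```c
#include <stddef.h>

/* "Hello\nWorld\n" as dumped from hw.ibm1047.txt in the quoted message. */
static const unsigned char hello_world_ibm1047[] = {
    0xc8, 0x85, 0x93, 0x93, 0x96, 0x15,
    0xe6, 0x96, 0x99, 0x93, 0x84, 0x15
};

/* Hypothetical helper: map the two EBCDIC line-break control bytes to a
   Unicode code point. swapped == 0 follows the glibc-style table;
   swapped != 0 exchanges the two targets. Returns -1 for any other byte. */
static int ebcdic_linebreak_to_ucs(unsigned char byte, int swapped)
{
    switch (byte) {
    case 0x15: return swapped ? 0x000A : 0x0085;
    case 0x25: return swapped ? 0x0085 : 0x000A;
    default:   return -1;
    }
}

/* Count how many bytes of buf would be converted to U+000A. */
static size_t count_line_feeds(const unsigned char *buf, size_t len,
                               int swapped)
{
    size_t i, n = 0;
    for (i = 0; i < len; i++)
        if (ebcdic_linebreak_to_ucs(buf[i], swapped) == 0x000A)
            n++;
    return n;
}
```

For the sample, count_line_feeds() gives 0 with the unswapped table and 2
with the swapped one, which is the difference being discussed here.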

Given these contradictory statements, a better solution for you may be to
make the behaviour depend on an environment variable. I'm told that some
environment variables are needed on z/OS anyway, see
https://lists.gnu.org/archive/html/bug-gnulib/2019-11/msg00036.html .
So one more environment variable should be acceptable to your users.
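
A rough sketch of how such a switch could look, for the Unicode-to-EBCDIC
direction. Everything here is an assumption made for illustration: the
variable name EBCDIC_NL_SWAP is invented, the rule "any non-empty value
enables the swap" is only one possible choice, and none of the function
names exist in libiconv.

```c
#include <stdlib.h>

/* Hypothetical environment variable name, used only in this sketch. */
#define EBCDIC_NL_SWAP_ENV "EBCDIC_NL_SWAP"

/* Interpret the variable's value: unset or empty keeps the default
   mapping, any non-empty value requests the swapped one. */
static int swap_enabled_for(const char *value)
{
    return value != NULL && value[0] != '\0';
}

/* EBCDIC byte to emit for U+000A: 0x25 in the unswapped table,
   0x15 when the swap is requested. */
static unsigned char ebcdic_byte_for_lf(int swapped)
{
    return swapped ? 0x15 : 0x25;
}

/* Consult the environment on every call; a real converter would
   probably read the variable once and cache the result. */
static unsigned char ebcdic_byte_for_lf_from_env(void)
{
    return ebcdic_byte_for_lf(swap_enabled_for(getenv(EBCDIC_NL_SWAP_ENV)));
}
```

Keeping the value check separate from the getenv() call makes the policy
easy to test without touching the process environment.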

Bruno





