Re: ripping out EBCDIC (cp1047)/preparing for UTF-8 input
From: Dave Kemper
Subject: Re: ripping out EBCDIC (cp1047)/preparing for UTF-8 input
Date: Tue, 14 May 2024 17:29:16 -0500
On Tue, May 14, 2024 at 8:53 AM G. Branden Robinson
<g.branden.robinson@gmail.com> wrote:
> I aim to drop EBCDIC a.k.a. code page (CCSID) 1047 support from
> groff 1.24.
No objection to this.
> The idea is, for 1.24, to get everybody migrating to pure ASCII input
> documents (as might be generated by preconv(1)) by the time GNU troff
> sees them.
I don't strongly object, but I wonder about the advisability of
requiring preconv on a wide swath of documents that didn't previously
require it while Savannah #59442 (preconv vs soelim) and #65108
(handling encoding of filenames) are unresolved.
This set of intertwined problems doesn't go away even when groff
accepts UTF-8 natively: files included via .so still might use Latin-1
-- especially any files dating from before 2026 (or whenever 1.24
comes out), where that was the native input encoding -- and the
underlying file system might use a different encoding for filenames.
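
To make the mixed-encoding case concrete, here is a minimal Python
sketch of the guess a tool would have to make for each included file.
It is only an illustration under stated assumptions: the function name
decode_roff_input is hypothetical, and this is not how groff, preconv,
or soelim actually decide on an encoding.

```python
def decode_roff_input(data: bytes) -> tuple[str, str]:
    """Decode bytes as UTF-8 if valid, otherwise fall back to Latin-1.

    Hypothetical helper for illustration; returns (text, encoding_used).
    """
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte value to a code point, so this cannot fail.
        return data.decode("latin-1"), "latin-1"


if __name__ == "__main__":
    for raw in ("café".encode("utf-8"), "café".encode("latin-1")):
        text, encoding = decode_roff_input(raw)
        print(encoding, text)
```

The fallback is a heuristic: a Latin-1 file whose high bytes happen to
form valid UTF-8 sequences would be misread, and none of this says
anything about the encoding of the filename given to .so.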
Are support for EBCDIC and for Latin-1 tightly enough coupled in the
code that it's unnecessarily complex to remove the former while
retaining the latter?