Re: ripping out EBCDIC (cp1047)/preparing for UTF-8 input

groff

From:	Dave Kemper
Subject:	Re: ripping out EBCDIC (cp1047)/preparing for UTF-8 input
Date:	Tue, 14 May 2024 17:29:16 -0500

On Tue, May 14, 2024 at 8:53 AM G. Branden Robinson
<g.branden.robinson@gmail.com> wrote:
> I aim to drop EBCDIC a.k.a. code page (CCSID) 1047 support from
> groff 1.24.

No objection to this.

> The idea is, for 1.24, to get everybody migrating to pure ASCII input
> documents (as might be generated by preconv(1)) by the time GNU troff
> sees them.

I don't strongly object, but I wonder about the advisability of
requiring preconv on a wide swath of documents that didn't previously
require it while Savannah #59442 (preconv vs soelim) and #65108
(handling encoding of filenames) are unresolved.

This set of intertwined problems doesn't go away even when groff
accepts UTF-8 natively: files included via .so still might use Latin-1
-- especially any files dating from before 2026 (or whenever 1.24
comes out), where that was the native input encoding -- and the
underlying file system might use a different encoding for filenames.
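
To illustrate why mixed encodings are awkward to handle, here is a
minimal Python sketch of the kind of guess a tool would have to make
about each included file. It is not how groff, preconv, or soelim
work; the function name guess_encoding is hypothetical. Every byte
sequence is valid Latin-1, so "latin-1" here only means "not valid
UTF-8", not a positive identification.

```python
def guess_encoding(data: bytes) -> str:
    """Guess whether raw bytes are ASCII, UTF-8, or (by fallback) Latin-1.

    Hypothetical helper for illustration only; not part of groff.
    """
    try:
        # Pure 7-bit input is valid in all three encodings.
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        # Strict UTF-8 decoding rejects most Latin-1 text with high bytes.
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Any byte sequence decodes as Latin-1, so this is only a fallback.
        return "latin-1"
```

A Latin-1 file whose high bytes happen to form valid UTF-8 sequences
would be misreported as UTF-8, which is one reason a heuristic like
this cannot replace an explicit encoding declaration.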

Are support for EBCDIC and for Latin-1 tightly enough coupled in the
code that it's unnecessarily complex to remove the former while
retaining the latter?
