groff
From: G. Branden Robinson
Subject: Re: UTF-8 in grout and a performance regression (was: synchronous and asynchronous grout)
Date: Thu, 19 Dec 2024 13:43:01 -0600

At 2024-12-19T20:23:56+0100, onf wrote:
> On Thu Dec 19, 2024 at 7:15 PM CET, G. Branden Robinson wrote:
> > At 2024-12-19T17:20:09+0000, Deri wrote:
> > > I don't mind, I just thought you considered readability of the grout
> > > file important [1],
> >
> > I do.  ISO 646/ASCII is readable practically everywhere.  There remain
> > places where UTF-8, at least when exercising code points greater than
> > U+007F, is not, and even if the processing stream supports UTF-8
> > perfectly, lack of font coverage can make UTF-8 unreadable again.
> 
> Although looking up Unicode codepoint numbers is arguably better
> than seeing gibberish, neither is a particularly good form to work
> with. Your reasoning sounds like "making it perfect for most people
> would make it horrible for a small minority, so let's rather keep it
> bad for everyone".

I think you're exhibiting a form of the base rate fallacy here.

The number of people who read GNU troff output ("grout"), whether with
their eyeballs or with a program they've written, is *tiny*.

I think a lot of people, even when troubleshooting, fail to consider
grout as an inspection site in the first place.  They try to reason from
groff input to whatever is emitted by the output driver.  And that
works, often enough.

I therefore consider your claim overstated.

One of the people who _does_ examine grout, though, is Deri, which is
why I said, in a part of my message you didn't quote:

>> It could be a setting in the DESC file so you don't need to change
>> output drivers in one go.
>
> I'd be fine with that, for the same reason I mused about a "caveman
> mode" where the "tcommand" DESC directive is ignored and we fall back
> to fully synchronous 'c', 'C', and 'h' sequences.

...and which happens to also be responsive to one of your subsequent
points:

> I understand that groff cannot treat it this way on input due to
> composing/decomposing characters etc., but I feel like it would've
> been possible on output if groff hadn't abandoned the c & h commands,
> no?

...where this "abandonment" would not only be configurable (as it always
has been via "DESC" file editing) but, as I conceived of it, dynamically
alterable (via a troff(1) command-line option or an environment variable).
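
To make the fallback concrete, here is a minimal sketch of what a fully
synchronous rendering of one word might look like: each glyph gets its
own 'c' (ordinary character) or 'C' (named special character) command,
followed by an 'h' command that advances the drawing position.  The
function name and the glyph widths are hypothetical, chosen only for
illustration; the sketch also ignores the decomposition of composite
characters and is not how troff itself is implemented.

```python
def synchronous_grout(word, widths):
    """Expand one word into per-glyph commands, one command per line.

    Assumption: `widths` maps each character to its advance width in
    basic units.  Real widths come from the device's font description
    files; the values used below are made up.
    """
    lines = []
    for ch in word:
        if ord(ch) < 0x80:
            # Ordinary character: 'c' followed by the character itself.
            lines.append("c" + ch)
        else:
            # Non-ASCII code point: 'C' followed by a uXXXX glyph name.
            lines.append("Cu%04X" % ord(ch))
        # 'c' and 'C' leave the drawing position alone, so move explicitly.
        lines.append("h%d" % widths[ch])
    return lines


if __name__ == "__main__":
    hypothetical_widths = {"f": 24, "o": 24}
    print("\n".join(synchronous_grout("foo", hypothetical_widths)))
```

With "tcommand" in effect, the same word would instead be emitted as a
single 't' command, which is more compact but leaves the per-glyph
positions implicit.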

> I prefer the approach of most other Unix tools which treat UTF-8
> transparently as "data".

When you elide my conciliatory remarks and points of agreement to make
me appear argumentative, I am pressured to draw the same conclusion
about you, and with better evidence.

Regards,
Branden
