Re: Differences in `ne` and `bp` line-breaking behavior
From: G. Branden Robinson
Subject: Re: Differences in `ne` and `bp` line-breaking behavior
Date: Sun, 1 Dec 2024 20:26:59 -0600
Hi onf,
At 2024-12-02T02:14:56+0100, onf wrote:
> No changes to groff are, strictly speaking, necessary.
> Your planned changes to .ad aren't necessary by the same logic either.
> But you desire them anyway because they make groff's behavior more
> reasonable. The same desire has motivated my proposal.
You omit part of the logic. I see `ad` _actually get used_. In man
pages. Frequently, it is used incorrectly. You can tell this because
the request is either a no-op where it is used, or has a perverse
effect, like frustrating the reader's deconfiguration of adjustment,
making some paragraphs in the man page have ragged right margins and the
others, not. These are obviously unintentional errors.[1] I've
submitted patches to projects like ncurses and bash (that got accepted,
possibly because I included explanations) to correct the misuse.
So part of my motivation for reforming/revising adjustment management is
that I _see_ people mis-applying the existing language feature. They
don't get it. When enough people don't get it, that suggests that the
problem is more with the language than with its users.
By contrast, your sample size for `ne` misuse is one--yourself.
Formerly two, before I came to understand the request.
Moreover, I don't think `ne` will ever see uptake among man page
authors--for a couple of good reasons. One is that man pages are
rendered at widely disparate line lengths. Also, there are subtle
annoyances involving changes in line height due to headings being set a
larger type size (but only in troff mode), and the height of tables
(horizontal rules and horizontal box edges in tbl(1) each take up an
additional vee, but only in nroff mode). Much of the time, a guess
one makes about how many vees are needed to avoid stranded lines in a
man page is going to be wrong. Further, many man page authors care only
about terminal output, and are accustomed to "continuous rendering"[2] where
the problem of stranded lines never rears its head in the first place.
I don't think most man page authors are going to develop motivation to
exercise this feature whether it behaves as you want, or not.
And that's okay, because a superior solution IMO is to (1) support
keeps, which I think can be done straightforwardly and adopted without
much effort or subtlety of understanding by man page authors and (2)
more ambitiously, reëngineer our man(7) (and mdoc(7)?) paragraphing
macros under the hood to automatically format paragraphs into a
diversion, and avoid stranded lines with internal logic. This will pay
big rewards--the document author won't even need to think about
it[3]--but will require some cleverness in implementation. Doug McIlroy
put me on the scent of one approach, "self-renewing input traps", but I
haven't set aside the time to explore this, nor seen it done by anyone
else (a foreboding sign).
But, those things are for groff 1.25 at the earliest. Though if (1) is
as easy as I think it is, maybe I could still get it in for 1.24...
> > Formatter requests are primitive things. Most requests don't also
> > perform breaks. Only a handful do. Those that do support the
> > no-break control character to _schedule_ a change to formatter state
> > at the next break, when that happens for some other reason.
>
> Please tell me how that is the case here:
> $ nroff << EOF | sed -E 's/^/./'
> .pl 3
> .fi
> One two three
> four five six
> 'bp
> seven eight nine
> .br
> eleven twelve.
> EOF
> output:
> .
> .
> .
> .One two three four five six seven eight nine
> .eleven twelve.
> .
>
> It's obvious here that 'bp breaks page IMMEDIATELY, not when the .br
Eh?
If it broke the page "IMMEDIATELY", you'd get this:
One two three four five six
<page break>
seven eight nine eleven twelve.
But don't take my word for it. Ask "groff -a".
$ printf 'Hello, world!\n.bp\n.c2 @\none two three\nfour five six\n@bp\nseven eight nine\n.br\nten eleven twelve\n' | groff -a
<beginning of page>
Hello, world!
<beginning of page>
<beginning of page>
one two three four five six seven eight nine
ten eleven twelve
Now let's try it with the regular control character.
$ printf 'Hello, world!\n.bp\none two three\nfour five six\n.bp\nseven eight nine\n.br\nten eleven twelve\n' | groff -a
<beginning of page>
Hello, world!
<beginning of page>
one two three four five six
<beginning of page>
seven eight nine
ten eleven twelve
> It's obvious here that 'bp breaks page IMMEDIATELY,
It doesn't. It _schedules_ (or enqueues) a page break to occur when the
next (line) break does, causing the line to be set on the next page.
As with the phrase "prior to output", you and I seem to have strongly
divergent interpretations of certain phrases.
> The difference between .bp and 'bp is that one also breaks the pending
> input line and the other does not, NOT that the latter schedules page
> break after next line break.
This statement is unintelligible to me. You appear to be contradicting
the Aristotelian law of identity.
The `bp` request, regardless of one's choice of control character,
_always_ schedules a page break at the next line break. Invoked with
the ordinary control character, it also _causes_ a break, and the rest
follows as a consequence.
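
The scheduling rule just described can be sketched as a toy model in Python. This is not groff's implementation: the function name `paginate` and the counter of queued page breaks are assumptions made for illustration, line length and traps are ignored, and `'` stands in for the no-break control character (the transcript above used `.c2 @` to select `@` instead). Under those assumptions it reproduces the page contents of both "groff -a" listings above.

```python
def paginate(source):
    """Toy fill-mode model: return a list of pages, each a list of output lines."""
    pages = [[]]              # page 1 starts out empty
    pending = []              # words of the partially collected output line
    queued_page_breaks = 0    # page breaks waiting for the next line break

    def line_break():
        nonlocal queued_page_breaks
        # Queued page breaks take effect first, so the pending
        # line is set on the new page.
        for _ in range(queued_page_breaks):
            pages.append([])
        queued_page_breaks = 0
        if pending:
            pages[-1].append(" ".join(pending))
            pending.clear()

    for line in source.splitlines():
        if line[:1] in (".", "'"):
            breaks_line = line[0] == "."   # "'" is the no-break control character
            fields = line[1:].split()
            request = fields[0] if fields else ""
            if request == "bp":
                if breaks_line:
                    line_break()           # .bp also causes a line break
                queued_page_breaks += 1    # both forms queue a page break
            elif request == "br" and breaks_line:
                line_break()
            # Any other request is ignored by this toy model.
        else:
            pending.extend(line.split())   # text line: keep filling

    line_break()                           # end of input breaks the line too
    return pages


no_break = paginate(
    "Hello, world!\n.bp\none two three\nfour five six\n'bp\n"
    "seven eight nine\n.br\nten eleven twelve\n"
)
regular = paginate(
    "Hello, world!\n.bp\none two three\nfour five six\n.bp\n"
    "seven eight nine\n.br\nten eleven twelve\n"
)
for number, page in enumerate(no_break, start=1):
    print(number, page)
```

In the no-break case the second page comes out empty and the whole filled line lands on the third, because the queued page breaks are applied only when `.br` finally breaks the line.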
> My proposal seeks to make `ne`'s behavior consistent with that of
> `bp`.
Looking over the request list, why not have `cu` or `ev` imply breaks as
well? Or `hy`? Why not `ft` and `ps`? What's the limiting principle?
> Please explain how, given the following descriptions from groff(7):
> .bp Break page and start a new one.
> .ne d Break page if distance to next page location trap is less
> than distance d (default scaling unit v).
>
> is it not reasonable to expect these two lines to behave similarly:
> .ne 3v
> .if \n[.t]u<3v .bp
>
> ... because they don't.
Maybe one of the *roff veterans on the list can challenge your
frustrated intuition here. I don't seem to be making any headway.
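
For readers following along, here is a toy Python model of why the two quoted lines can differ. It rests on an assumption taken from the premise of this thread rather than from groff's source: `ne` moves to the next page without breaking the pending output line, whereas `.bp` breaks that line first. The class `ToyFormatter` and its method names are hypothetical, and vertical space is counted in whole lines.

```python
class ToyFormatter:
    """Toy model; vertical positions are counted in whole lines (vees)."""

    def __init__(self, page_length):
        self.page_length = page_length
        self.pages = [[]]
        self.pending = []     # words of the partially collected output line

    def remaining(self):
        # Stands in for the .t register: lines left on the current page.
        return self.page_length - len(self.pages[-1])

    def text(self, words):
        self.pending.extend(words.split())

    def br(self):
        # Break the line: set any pending words, starting a page if full.
        if self.pending:
            if self.remaining() == 0:
                self.pages.append([])
            self.pages[-1].append(" ".join(self.pending))
            self.pending.clear()

    def ne(self, need):
        # Assumed behavior: advance to the next page, no line break.
        if self.remaining() < need:
            self.pages.append([])

    def bp_if_less(self, need):
        # Models ".if \n[.t]u<3v .bp": break the line, then the page.
        if self.remaining() < need:
            self.br()
            self.pages.append([])


def demo(use_ne):
    f = ToyFormatter(page_length=4)
    f.text("first line")
    f.br()
    f.text("second line")
    f.br()
    f.text("pending words")       # two lines remain on the page here
    if use_ne:
        f.ne(3)
    else:
        f.bp_if_less(3)
    f.text("after the request")
    f.br()
    return f.pages


print(demo(use_ne=True))
print(demo(use_ne=False))
```

With `ne`, the pending words travel to the new page and are filled together with the text that follows; with the conditional `.bp`, they are set on the old page first.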
> > > Note that this change would break compatibility with other troff
> > > implementations. However, it would be easy to fix any documents
> > > which rely on the current behavior by substituting[2] any .ne
> > > for 'ne, which, as pointed out above, behaves exactly like .ne
> > > in other troff implementations.
> > >
> > > I invite anyone who disagrees with this proposal to raise any
> > > objections they might have, either here or on the bug tracker.
> >
> > I don't exactly object, but I'm pretty deeply uncertain about it.
> >
> > And we'd need to retain traditional handling of `ne` for AT&T
> > compatibility mode, anyway.
>
> The 'anyway' sounds like you think I am suggesting we remove the
> current behavior entirely.
You're reading a lot into the word "anyway". In this context it is
synonymous with "besides", or "in any {event,case}". Please dial back
your attempts at mind-reading.
> As I explained in the part you quoted above, that is not the case.
> `ne` would still behave as it does now if called with the no-break
> control character (i.e. 'ne).
The disruptive aspect is that anyone relying on the existing behavior
(and not rendering in compatibility mode) would have to alter their
documents to make the appropriate change.
That's not forbidden. We can cross Rubicons. The "NEWS" file documents
these. But a change needs to pay its freight. As far as I know you're
the only person in the world who's ever been this upset by the behavior
of `ne`. (You can't include me despite my documented confusion with the
request, because I did not undertake a similar reform immediately
myself, as I would have if I were certain I was right. Instead I decided
I didn't sufficiently understand what was going on with the formatter.
And that was true: when I first started trying to solve stranded line
problems in groff's corpus of man pages, this issue was entangled with
the changed-type-size-impact-on-line-height and
non-zero-height-of-horizontal-rules-in-nroff-mode issues I mentioned above.)
> I see no reason to change the behavior in compatibility mode, I just
> forgot about it. We can simply add this to the proposal:
> 3. `ne` does not break line in compatibility mode
>
> To summarize, `ne` would break line if
> a) in compatibility mode
> b) not in compatibility mode and called with the regular control
> character
I'm not convinced. I reiterate:
"Part of my motivation for reforming/revising adjustment management is
that I see people mis-applying the existing language feature."
Show me exhibits of people besides yourself making the same mistake with
`ne`.
Regards,
Branden
[1] Man page authors are notorious cargo cultists. Here's one of my
favorite examples.
https://lists.gnu.org/archive/html/groff/2019-03/msg00032.html
John Gardner's hypothesis is the best I've seen for that case.
[2] since groff 1.17, 3 May 2001, thanks to Werner Lemberg
[3] I predict that some would mis-describe this behavior as
"implementing the Knuth-Plass paragraphing algorithm", because it's
a lot easier to tell that K-P doesn't strand single lines of a
multi-line paragraph than it is to tell that it adjusts interword
spacing in an esthetically superior way. In fact, that would be a
great way to measure this frequently voiced complaint about groff,
except for the facts that (1) most people who voice such complaints
about groff simply repeat things they've heard or read, rather than
testing the claim for themselves and (2) most people who have voiced
such complaints in the past, never stop making them even after
the underlying problem is fixed, no matter how much time passes.
Unix has always had a cadre of users who resemble sports fans much
more than scientists or engineers, to the nonstop detriment of its
culture.