mandoc(1) and diversions (was: Proposed: change `pm` request argument semantics)
From: G. Branden Robinson
Subject: mandoc(1) and diversions (was: Proposed: change `pm` request argument semantics)
Date: Fri, 18 Aug 2023 00:14:58 -0500
At 2023-08-18T03:24:53+0200, Ingo Schwarze wrote:
> > Understanding *roff a little better 6 years later, I can more easily
> > imagine ways to run AT&T troff out of memory on a PDP-11.
> > Ultra-long diversions would be one way,[1]
> > [1] Nobody _except_ mandoc(1) seems to handle this well. Credit
> > where it's due. https://savannah.gnu.org/bugs/?64229
>
> Praise is usually nice to have, but i must admit this particular
> praise surprises me on more than one level. :-)
And your reply did so for me!
> https://man.openbsd.org/roff.7#di says:
>
> di divname
> Begin a diversion. Currently unsupported. [by mandoc(1)]
Hah! You'll soon see why I didn't anticipate that.
> I'm not completely convinced not supporting a particular request
> at all amounts to "handling it well".
>
> Besides,
>
> $ time { printf '.di foo\n.nf\n'; yes abcdefghijklm; } | mandoc
> mandoc: Cannot allocate memory
> 0m07.61s real 0m05.67s user 0m01.81s system
>
> i.e. infinite input crashes mandoc - admittedly via err(3) after
> malloc(3) returns NULL, which is relatively controlled, but
> still a crash.
No kidding! But check this out.
$ dpkg -l mandoc|tail -n 1
ii mandoc 1.14.6-1 amd64 BSD manpage compiler toolset
$ { printf '.di foo\n.nf\n'; yes abcdefghijklm; } | mandoc; echo $?
0
$ time { printf '.di foo\n.nf\n'; yes abcdefghijklm; } | mandoc
real 0m1.161s
user 0m0.022s
sys 0m2.217s
I'm afraid I got a misleading impression of mandoc(1)'s performance
here. ;-)
But something is clearly hinky in the mandoc build for Debian.
> But GNU troff isn't actually *that* much worse:
>
> $ time { printf '.di foo\n.nf\n'; yes abcdefghijklm; } | troff
> Abort trap (core dumped)
> 0m24.72s real 0m04.43s user 0m03.82s system
And it's doing actual work, by contrast. ;-)
> Exiting via abort(3) is also a relatively controlled way of dying.
Yeah, I'd prefer that's what happened on GNU/Linux.
> This downside merely follows from the choice of the implementation
> language C++, which suffers from ill-designed, very messy error
> handling in general.
Well, in this case, maybe the libstdc++ vendor.
> I'm not sure why you see a SIGKILL getting thrown at the troff process
> on your machine - but i *suspect* that may have nothing to do with GNU
> troff either and may be an implementation detail of whatever operating
> system, C++ compiler, and C++ standard library you are using.
My thoughts as well.
> Sure, on first sight, an explicit abort(3) being called on the C
> library level *might* look slightly safer than SIGKILL flying around -
> then again, i'm not really sure it makes a difference. Whether that
> actually is a security risk depends on many details you did not
> disclose. Quite possible it isn't.
I think the problem is simple. Nothing checks for the integer
wraparound in the `int` that stores the vertical drawing position in
basic units. On a "page" in GNU troff, you'll hit the implicit page
trap before that can happen,[1] but in a diversion, the vertical drawing
position can wrap from INT_MAX to INT_MIN without anyone noticing.
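
A minimal sketch of the kind of check whose absence is suspected above.
This is not taken from the GNU troff sources: the function name
advance_vpos and its parameters are hypothetical, and it only assumes
that a vertical position in basic units is kept in a plain `int`.  In C
and C++, overflowing a signed `int` is undefined behavior (a wrap to
INT_MIN is merely what is commonly observed), so the test has to happen
before the addition, not after it.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: advance a vertical position `pos` (in basic
 * units) by `delta`.  Stores the sum in *result and returns true if it
 * fits in an int; otherwise leaves *result untouched and returns false.
 */
bool advance_vpos(int pos, int delta, int *result)
{
    /* Moving down the page: the sum would exceed INT_MAX. */
    if (delta > 0 && pos > INT_MAX - delta)
        return false;
    /* Moving up the page: the sum would fall below INT_MIN. */
    if (delta < 0 && pos < INT_MIN - delta)
        return false;
    *result = pos + delta;
    return true;
}
```

A caller that gets `false` back could then emit a diagnostic and abandon
the diversion instead of continuing with a nonsensical position.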
In a language with real data types, like Ada, Clark would have
foreclosed this failure mode in 1990 or so.
But I haven't yet proven my suspicions, so we'll see. I doubt that
future opportunities to compare Ada favorably to C++ will dry up.
> > I'll say it before Ingo does: mandoc(1) (as I understand it) _does_
> > build a syntax tree for the entire document before producing output,
> > which enables some of the nice features that it has.
>
> Correct.
>
> However, before Alejandro gets carried away with enthusiasm, let
> me emphasize that it does the opposite of what Alejandro is asking
> for: He wants all the man(7) macros converted to roff(7) requests.
> Instead, mandoc *removes* all roff requests from the document such
> that it gets a pure man(7) syntax tree with (almost) no roff left in
> it - still making sure most of those roff requests take effect before
> being removed.
Ahhhh. I think you reimplemented "environments" in the *roff sense. ;-)
> Sound impossible, almost paradoxical? Yes it does,
> but it works surprisingly well all the same. See this 12-year-old
> presentation,
>
> https://www.openbsd.org/papers/bsdcan11-mandoc-openbsd.html
>
> in particular page 12 "No way around some low-level roff requests."
> and page 13 "Desperation lead to success: Paradigmatic switch"
I see. You turned troff upside down, making formatter requests into
"macros" for man(7) and mdoc(7) (plus, quasi-environmental stuff).
Diabolical. 3:-)
Regards,
Branden