Proposed: QS/QE macros for quotation in man(7)
From: G. Branden Robinson
Subject: Proposed: QS/QE macros for quotation in man(7)
Date: Mon, 16 Dec 2024 08:28:21 -0600
Hi Dave & Alex,
I guess I'd better firm this up a bit.
Synopsis:
.QS
    Begin quotation. An opening quotation mark is formatted. The
    line is not broken.
.QE
    End quotation. A closing quotation mark is formatted. The line
    is not broken.
At 2024-12-15T23:30:27-0600, Dave Kemper wrote:
> On Sat, Dec 14, 2024 at 12:02 PM G. Branden Robinson
> <g.branden.robinson@gmail.com> wrote:
> > This is likely due for a cleaned up re-proposal under the new
> > names `QS`/`QE` as suggested by Doug McIlroy.
>
> Man pages using these proposed macros--which, since the macros don't
> exist yet, will be man pages edited in 2025 or later--will surely
> never be formatted by a roff that's limited to two-character
> identifiers. Do other man-parsing tools expect all macros to be two
> characters or less?
I don't feel I have sufficiently broad knowledge of what man-parsing
tools are out there besides *roffs and mandoc(1). I _think_, based on
parallel behavior, that Michael Kerrisk uses lexgrog(1) from man-db to
extract the summary-description (the body text of the "Name" section)
from pages. Thomas Dickey maintains one of the several tools named
"man2html" that has existed over the years.[1]
Plan 9 troff is out there, retains the old AT&T/DWB troff limitation of
two-character names, and in fact implemented the newest groff man(7)
macro, `MR`, before we did.[2] And they are limited to two characters.
That said, I have suspicions that Plan 9 from User Space users don't use
its troff to render non-Plan-9 man pages.
> Or, as the man language is selectively expanded, can its new macros be
> given human-meaningful names?
If I were starting a new man macro package from scratch, I would
certainly do this. Since I'm not, I find it difficult to locate much
value in a macro language that's only _partly_ human-readable. On top
of the existing crypticness, we'd be adding inconsistency.
At 2024-12-16T11:19:32+0100, Alejandro Colomar wrote:
> Hi Branden, Dave,
>
> On Sun, Dec 15, 2024 at 11:30:27PM -0600, Dave Kemper wrote:
> > On Sat, Dec 14, 2024 at 12:02 PM G. Branden Robinson
> > <g.branden.robinson@gmail.com> wrote:
> > > This is likely due for a cleaned up re-proposal under the new
> > > names `QS`/`QE` as suggested by Doug McIlroy.
>
> I still don't know what to expect of those macros. Could you please
> send some examples of what you have in mind?
Here are some examples of where bash(1), as of this year, uses its new
page-local `Q` macro.
----
.TP
.B \-\-dump\-po\-strings
Equivalent to \fB\-D\fP, but the output is in the GNU \fIgettext\fP
.Q po
(portable object) file format.
----
When the shell is in posix mode, it does not recognize
\fBtime\fP as a reserved word if the next token begins with a
.Q \- .
----
The element with index 0 is the name of any currently-executing
shell function.
The bottom-most element (the one with the highest index) is
.Q main .
----
Any numeric argument given to a \fBreadline\fP command that was defined using
.Q "bind \-x"
(see
.SM
.B "SHELL BUILTIN COMMANDS"
below)
when it was invoked.
----
Here's how I'd write these with QS/QE (ignoring other style preferences
of mine).
----
.TP
.B \-\-dump\-po\-strings
Equivalent to \fB\-D\fP, but the output is in the GNU \fIgettext\fP
.QS
po
.QE
(portable object) file format.
----
When the shell is in posix mode, it does not recognize
\fBtime\fP as a reserved word if the next token begins with a
.QS
\-\c
.QE
\&.
----
The element with index 0 is the name of any currently-executing
shell function.
The bottom-most element (the one with the highest index) is
.QS
main\c
.QE
\&.
----
Any numeric argument given to a \fBreadline\fP command that was defined using
.QS
bind \-x
.QE
(see
.SM
.B "SHELL BUILTIN COMMANDS"
below)
when it was invoked.
----
I _can_ foresee some objections.
1. "Oh no! I have to learn how to use `\c`!"
Yeah. Macros like `BR` and `IR` are able to conceal the necessity
of that escape sequence from the man page author because they format
their arguments. Under the hood, `BR` for example could be
implemented (rudimentarily) like this.
.de BR
.nr Of \\n(.f \" "old font"
.ft B
\\$1\c
.ft R
\\$2
.ft \\n(Of
..
QS/QE don't format their arguments because they don't take any.
That in turn is because *roffs ignore macros they don't recognize,
and so the text of their arguments is lost--it doesn't format. If
we want to avoid doing violence to man pages that get formatted on
old systems or with old formatters that don't know QS/QE, using `\c`
more often is part of the price.
Increasing the use of `\c` also increases the pressure on me to go
do something about po4a.[3]
2. "Oh no! I have to remember to use `\&` before `.` at the start of
an input line!"
You _already_ have to remember that. But QS/QE might stimulate more
occasions for recollecting it.
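The fallback behavior behind both objections can be seen by reading one of the examples above the way a formatter without QS/QE would. This is only a sketch of the input-level effect, relying on the point made above that *roffs ignore macros they don't recognize; it says nothing about how QS/QE themselves would be implemented.

```roff
.\" Input as the man page author would write it under the proposal.
The bottom-most element is
.QS
main\c
.QE
\&.
.\" A formatter with no QS or QE definition skips those two lines,
.\" so it effectively processes only this:
.\"   The bottom-most element is
.\"   main\c
.\"   \&.
.\" and formats "The bottom-most element is main." with no quotation
.\" marks.  The \c joins "main" to the period with no space between,
.\" and the \& keeps the leading dot from starting a control line.
```

No text is lost in the fallback case; only the quotation marks are missing.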
To address some of the use cases in bash(1), as I mentioned earlier, it
is necessary[4] for `QS` to support a Boolean argument to suppress
hyphenation of the first word in the quotation.
bash(1) today:
.TP 8
.B ignoreeof
The effect is as if the shell command
.QN "IGNOREEOF=10"
had been executed
(see
.B "Shell Variables"
.ie \n(zZ=1 in \fIbash\fP(1)).
.el above).
Under my proposal:
.TP 8
.B ignoreeof
The effect is as if the shell command
.QS 1
IGNOREEOF=10
.QE
had been executed
(see
.B "Shell Variables"
.ie \n(zZ=1 in \fIbash\fP(1)).
.el above).
> I think for consistency sticking to the short format is a good thing,
> unless at some point we find we need longer ones (but that would be a
> good point for saying we have enough macros in man(7) that the
> language is too fat).
I agree. mdoc(7) illustrates the distance one can carry a two-letter
macro lexicon--a perhaps inadvisably long way.
Possible further enhancements
=============================
A. Have `QS` accept second and third arguments specifying the quotation
characters to use. This is like mdoc(7)'s `Eo`/`Ec`, and would make
QS and QE more general inline enclosure/bracketing macros.
Press
.QS 1 < >
Enter
.QE
to continue.
I'm leaning away from this, though. (1) It doesn't seem quite in
keeping with man(7)'s philosophy in a way I struggle to articulate
(maybe Doug can help). (2) It trades away the advantage of not
losing text (apart from the quotation marks themselves, which
historically man page authors don't bother to enter in their
literal forms in the first place because they aren't sure how).
Here, the characters that get dropped are not quotation marks per
se, but '<' and '>', which people have been able to type and get
formatted without trouble forever, unlike “ ” ‘ ’. (In indulgence
of "power users'" petulant rebellion against good typography,
distributors hack up "man.local" to make it harder still to get ‘
and ’.[5]) And (3) people might get carried away.
.SY tbl
.QS 0 [ ]
.B \-C
.QE
.QS 0 [ ]
.I file
\&.\|.\|.
.QE
.YS
I don't think that's an improvement on the status quo.
.SY tbl
.RB [ \-C ]
.RI [ file\~ .\|.\|.]
.YS
B1. Alternate double and single quotation marks with the parity of the
nesting level.
.QS
I mean,
if 10 years from now,
when you are doing something quick and dirty,
you suddenly visualize that I am looking over your shoulders and say
to yourself
.QS
Dijkstra would not have liked this\c
.QE
,
well,
that would be enough immortality for me.
.QE
Rendering on a typesetter or UTF-8 terminal:
“I mean, if 10 years from now, when you are doing something quick
and dirty, you suddenly visualize that I am looking over your
shoulders and say to yourself ‘Dijkstra would not have liked this’,
well, that would be enough immortality for me.”
Rendering on an ASCII or Latin-1 terminal:
"I mean, if 10 years from now, when you are doing something quick
and dirty, you suddenly visualize that I am looking over your
shoulders and say to yourself 'Dijkstra would not have liked this',
well, that would be enough immortality for me."
B2. It would be trivial to support the British, who use the wrong
quotation marks^W^W^W^W^Wdrive on the wrong side of the
road^W^W^W^W^W^W^W^Whave a different quotation mark convention. A
documented rendering configuration register, akin to `LL`, `IN`, and
`PO`, could invert the sense of nesting parity. I imagine this
would be another matter handled in "man.local". In fact, since in
groff we have `\V`, it could even be made sensitive to the locale
settings of the process environment. I don't think I'd bother in
the stock configuration (in part because I'm not an expert on which
territories prefer the "other" quotation convention as strongly as
Fleet Street does). Distributions are more likely to have the
relevant expertise.
But this, too, may not be worth messing with, as even if the
quotation marks are correct according to one's training/biases,
everybody has to read nonstandard English spellings in man pages
written on the other side of the ocean anyway, and that's probably
more jarring. Yet after 45 years of man(7), users have expressed
little demand for man pages to offer bifurcated en_GB and
en_US localized versions. I've never seen anyone throw a fit about
this, and I've seen the Phoronix forums.
I really could go either way on this aspect of the macros.
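As a rough illustration of the locale-sensitive variant in B2, a "man.local" fragment might look like the following. The register name `QP` is invented here purely for illustration (the proposal does not name one), and testing only LANG against a single value is a simplification; `\V[LANG]` is groff's syntax for interpolating an environment variable.

```roff
.\" Hypothetical man.local fragment.  "QP" is a made-up register name
.\" standing in for whatever the quotation-parity register would be
.\" called.  \V[LANG] interpolates the LANG environment variable
.\" (a groff extension).
.if '\V[LANG]'en_GB.UTF-8' .nr QP 1
```

A distributor would presumably want to consult more than LANG and more than one locale name; the point is only that the decision can be made at formatting time.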
Regards,
Branden
[1] https://invisible-island.net/scripts/man2html.html
[2] They initially called it `IM` and changed it to `MR` at my request.
https://github.com/9fans/plan9port/commits/master/tmac
[3] I distracted myself with a vision of a grand solution that would
solve a whole bunch of problems at once. But that will take time to
design (let alone deploy). A spot fix for po4a's non-interpretation
of `\c` is important, too.
https://github.com/mquinson/po4a/issues/527
[4] Not _strictly_ necessary, but I am loath to encourage man page
authors to experiment with `nh` and `hy` requests; that can only end
in tears.
[5] "With Debian (and other distributors...) capitulating to pressure to
override the meanings of these input characters once again, a cost
is imposed on correctly composed pages that historically rendered
well: whereas `foo' formerly reliably appeared as ‘foo’ everywhere
directional single quotes were supported (and as 'foo' where they
were not), now `foo' appears as `foo', making the page ugly and
wrong. (I know of no UTF-8 font for a terminal emulator that
renders these glyphs as the ASCII standard ANSI X3.4-1968 depicts
them; see <https://ia800800.us.archive.org/35/items/\
enf-ascii-1968-1970/Image070917151315.pdf>. The inconsistency of
the unlettered, selectively ASCII-championing revanchist stance is
nearly as frustrating as its ignorance.)"
https://gitlab.com/procps-ng/procps/-/merge_requests/213/diffs?commit_id=a3ac4b667929320d4c8012435d63a9d1dd538a8d