RE: Rendering the em dash on the terminal


From: Jeff Conrad
Subject: RE: Rendering the em dash on the terminal
Date: Mon, 26 Aug 2024 16:41:47 -0700

> From: groff-bounces+jeff_conrad=msn.com@gnu.org On Behalf Of Dave Kemper
> Sent: Saturday, 24 August, 2024 12:33 PM

> The new logic is this:
> 
> .ie '\?\*[.T]\?'\?utf8\?' .char \[em] \[em]\[em]
> .el                       .char \[em] --
> 

Aesthetics
==========
> The motivation is given in the commit log: making \[em] look "more
> like a true em dash, taking up two character cells."

Dunno if taking up two character cells makes it “look more like a
true em dash,” though it may be more aesthetically pleasing than two
hyphens.

Dash List
---------
There are situations in which I’m not sure what gives the best aesthetics.
For example, with mm’s DL (dash list) macro, I might prefer

 —— First item
 —— Next item

to

 -- First item
 -- Next item

Neither is great; far better might be

 — First item
 — Next item

But there may be no easy way to get there from here.

Clarity
=======
> An em dash in any monospace font is hard to distinguish from a
> hyphen and other dash-like glyphs.

Agree.  And I think _clarity must trump aesthetics_.  A single em
dash is not obviously seen as such.  And unlike the case with an en
dash (probably seen as a hyphen by most folks anyway, even in
typeset material, which is why most newspapers seldom use it), the
distinction between an em dash and a hyphen is important.

Sometimes the distinction is important even with an en dash.  A
reasonable rule is that recognition should fail gracefully.  An
example might be Oakland’s “Anti Police-Terror Project.”
Properly, “anti” is a prefix and needs a hyphen, but it’s more
complicated when it modifies a compound.  Chicago style would use
“Anti–Police Terror Project”; suffice it to say that the failure
here is less than graceful.

Any approach that has an em dash take up two character cells
might lead to confusion in a few instances.

Two-Em Dash
-----------
A two-em dash is often used to indicate omissions: from the
Chicago Manual of Style (18th ed.), § 6.99,

    Admiral N—— and Lady R—— were among the guests

Some folks use a single em dash here, which would look the same
as above.  But actually using two em dashes would give

    Admiral N———— and Lady R———— were among the guests

which isn’t so good.

Three-Em Dash
-------------
A three-em dash is commonly used in a bibliography to indicate
the same author(s) as the previous entry, e.g.,

    Chaudhuri, Amit. Odysseus Abroad. Alfred A. Knopf, 2015.
    ———. A Strange and Sublime Address. Minerva, 1992.

Input in the normal manner would give

    Chaudhuri, Amit. Odysseus Abroad. Alfred A. Knopf, 2015.
    ——————. A Strange and Sublime Address. Minerva, 1992.

which seems kinda long. But perhaps it’s just me.

I suppose a workaround might be terminal-specific characters like
‘2m’ and ‘3m’.  I long had these as strings, more for ease of
entry than for handling different devices.  In this case, though,
it’s not clear how these characters would be handled so that there
are clear distinctions among ‘em’, ‘2m’, and ‘3m’.  And if the
typographical convention of ‘--’ were to prevail for ‘em’, I’m
not sure how it would apply to ‘2m’ and ‘3m’.
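
As a rough sketch of what such characters might look like (not a
proposal): the names ‘2m’ and ‘3m’ come from the paragraph above, but
every definition below is an assumption made only for illustration,
including the choice to extend the ‘--’ convention to four and six
hyphens on terminals.  It does not settle how to keep the three
lengths distinct if ‘em’ itself occupies two cells.

```groff
.\" Hypothetical definitions; nothing like this ships with groff.
.ie t \{\
.\" Typesetter devices: build the longer dashes from real em dashes.
.  char \[2m] \[em]\[em]
.  char \[3m] \[em]\[em]\[em]
.\}
.el \{\
.\" Terminals: assume the typewriter convention scales up, so that
.\" an em is 2 hyphens, a two-em is 4, and a three-em is 6.
.  char \[2m] ----
.  char \[3m] ------
.\}
```

Input such as “Admiral N\[2m] and Lady R\[2m]” would then pick up
whichever definition matches the output device.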

Comments
========
> My first concern is that this motivation is communicated only in the
> commit log, leaving a bit of a head-scratcher to anyone merely reading
> the code.  If this logic is kept, its motive should be commented in
> the code.

This seems reasonable.  Most folks can probably figure this out
after a bit of head scratching, but it would be nice to spare
them the trouble.
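
For instance, the quoted logic might carry a comment along these
lines.  The two requests are copied from the quoted message; the
comment wording is only an assumption, paraphrasing the commit-log
text quoted earlier.

```groff
.\" On UTF-8 terminals, make \[em] look more like a true em dash by
.\" having it take up two character cells; on other terminals, use
.\" two hyphens.
.ie '\?\*[.T]\?'\?utf8\?' .char \[em] \[em]\[em]
.el                       .char \[em] --
```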

Typographic Convention
======================
> Two em dashes in a row is part of no typographic convention.

Agree.  But the ‘--’ convention comes from manuscript preparation
in typewriter days; I wonder how many younger users are even
aware of it.

Copy and Paste
==============
> This will paste very poorly into any text field that uses a
> proportional font.

How often would someone copy and paste from man(1) output?  And I
think the goodness or badness would depend on the target; if the
target is text, it might look a bit strange because the ‘——’
sequence isn’t common.  If the target is something destined for
output in proportional type, I’m not sure ‘--’ is much better.
The only proper sequence in that case is a single em dash, but as
we all seem to agree, this isn’t great for output to a monospace
terminal.

Full disclosure: I format my man pages as PDF, so I may not be
the best person to comment on the appearance of output to a
monospace device.

Searches
========
> It interferes with greps and other searches: most readers
> seeing two hyphen-like characters in a row in a monospace font
> will conclude that they are in fact two hyphens, the
> longstanding convention, rather than two em dashes.

Would it?  I’d probably never think to search for ‘——’, but I
don’t often search for ‘--’, either, because it’s almost always
context dependent.  Conceivably, I might search for an em dash
that either precedes or follows a specific text, but such a
search would work with ‘——’.

Don’t throw stones ...
======================
I make these comments having done things in years past that would
make ‘——’ look pretty benign.  In the mid-1980s, we used Elan’s
eroff (basically, AT&T version 2 troff); unfortunately, the
downloadable HP fonts we long used had the HP Roman 8 character
set, which didn’t include em or en dashes or many other
characters.  Two hyphens in typeset output looked pretty crummy,
so I came up with

.ds EM \%\^\v'-.43m'_\h'-\w'_'u/2u'_\h'-3u*\w'_'u/2u'\h'1m'\h'-\w'_'u'_\v'.43m'\^

(we used mm, so “\*(EM” was the standard way to insert an em
dash).  To my knowledge, we never had a problem with this getting
hyphenated; apparently eroff would not break a sequence with an
unclosed vertical motion.  The leading ‘\%’ was added for good
measure; I can’t remember proving whether it actually helped.
The leading and trailing thin spaces were an aesthetic
enhancement.
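
For anyone trying to read that string, here is the same definition
split into pieces with the ‘as’ (append to string) request, one step
per piece.  The split and the comments are an annotation added for
explanation, not how the original was written; the pieces concatenate
to exactly the string above.

```groff
.\" Hyphenation control, narrow space, then move up .43m.
.ds EM \%\^\v'-.43m'_
.\" The first underscore is now set.  Back up half its width and
.\" set a second one that overlaps it.
.as EM \h'-\w'_'u/2u'_
.\" Return to the start (3/2 of an underscore width back), go
.\" forward 1m, back up one underscore width, and set a third
.\" underscore so that the rule ends exactly 1m from where it began.
.as EM \h'-3u*\w'_'u/2u'\h'1m'\h'-\w'_'u'_
.\" Move back down to the baseline and add a trailing narrow space.
.as EM \v'.43m'\^
```

Troff evaluates numeric expressions strictly left to right, so
‘-3u*\w'_'u/2u’ is minus three times the underscore width, divided by
two.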

There are simpler ways to do this, depending on what you assume
and how fussy you are.

From Acrobat, the string above copies as ‘__’--hardly ideal.  But
copy and paste wasn’t an issue with output from a printer.

Convention, Again
=================
> But even if the aesthetic concern in monospace-land is given more
> weight, two em dashes in a row is a less preferable substitution than
> the longstanding convention of two hyphens.

This certainly is one with which many of us are familiar, but
again, I wonder if this is true for many younger users, such as
some TeX aficionados who use ‘---’.

So ultimately, I dunno.  For the most common usages, ‘——’ may be
aesthetically preferable to ‘--’.  But in some less common
situations, this may confuse more than enhance.  I think it’s
worth hearing what others think.



