bug-groff
Re: minor hyphenation issue


From: Dave Kemper
Subject: Re: minor hyphenation issue
Date: Tue, 18 Apr 2017 17:41:09 -0500

So, if I understand the situation correctly, groff gets its
hyphenation information from TeX.  TeX isn't accommodating any English
words with non-ASCII characters because of its hyphenation algorithm's
limitations, and Werner is reluctant to have groff accommodate them
because of the maintenance complexity of modifying or augmenting the
TeX rules.  Is this a fair summation?

Can TeX's list of patterns be expanded to include letters with
diacritics without breaking TeX's English hyphenation algorithm?  That
is, if Latin-9 characters are included, will the algorithm simply
ignore them, or fail?

On 4/12/17, Werner LEMBERG <address@hidden> wrote:
> The very issue is rather that *users* are not accomodated to select an
> input and/or font encoding while typesetting US English texts.

Probably true in general.  However, those English users who write
about résumés or Blue Öyster Cult -- and who care enough to get
details correct -- will either learn how to produce Latin-1 characters
(which groff accepts), or learn the escape sequences in groff (and I
presume TeX has an equivalent mechanism) that allow these characters
to be represented with ASCII input.

The user can, of course, use .hw to correctly break the occasional
such word in predominantly ASCII English text.  However, it's far from
intuitive that such accommodation is the user's responsibility, when
all other hyphenation Just Works without the user having to think
about it.  It would be nice if these sorts of words worked out of the
box.
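
For illustration, such an exception might look like the sketch below.
It assumes Latin-1 input and that groff already treats é as a letter
for hyphenation purposes (neither is confirmed here for every setup);
the break points are the dictionary ones, supplied by hand.

```groff
.\" Hypothetical document fragment; assumes Latin-1 input encoding.
.\" Declare the permitted break points for the accented word by hand.
.hw ré-su-mé
The applicant attached a résumé to the letter.
```

As far as I understand it, the exception applies only to the spelling
listed, so the unaccented verb "resume" should keep whatever breaks
the patterns give it.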

Side note: groff does, I observe, correctly break "öyster" (which is
technically not even a real English word) but not "résumé" (which is
not only a real word, but needs the accents to distinguish it from the
unrelated word "resume").  I assume this is because no hyphenation
point of öyster is adjacent to the non-ASCII letter.


