bug-gnu-emacs

bug#16292: 24.3.50; info docs now contain single straight quotes instead of `'


From: Eli Zaretskii
Subject: bug#16292: 24.3.50; info docs now contain single straight quotes instead of `'
Date: Tue, 31 Dec 2013 10:27:54 +0200

> Date: Mon, 30 Dec 2013 21:58:31 -0800
> From: Paul Eggert <eggert@cs.ucla.edu>
> CC: 16292@debbugs.gnu.org, grfz@gmx.de
> 
> Eli Zaretskii wrote:
> 
> > (names of people) so using Latin-1 doesn't hamper users'
> > ability to read the manual in any way
> 
> Most of the non-ASCII words are people's names, but many are not,
> and often ASCIIfying these would hurt the manual.
> These include symbols (e.g., "¬"), examples of encoding
> ("@samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}"),
> calendars ("Bahá'í"), the names of GNU programs ("真 Gnus"),
> and configuration examples ("écrit" in email configuration).

I don't think we care about the encoding of a handful of words, as long
as the bulk of the manual, including markup and quotes, is legible.  I
only mentioned Latin-1 because it seemed to cover most of the
non-ASCII characters.  But I don't insist on it.  Neither do I insist
on a single-byte encoding of those few words and names; in particular,
UTF-8 will do -- but only for the non-ASCII text in the manuals.

> Nowadays, on GNUish and POSIXish systems in the Emacs target
> audience, there's more usage of UTF-8 than of Latin-1.  On
> Ubuntu and Fedora, for example, the default locale for US
> English is en_US.utf8.  Hence, converting info files to
> Latin-1 would hurt standalone info users in the typical
> setup on GNUish and POSIXish platforms.

It hurts them in a very small number of places, most or all of which
don't affect in any way the ability of the reader to read and
understand the presented material.

As I said above, I won't object to having the non-ASCII words encoded
in UTF-8, as long as it doesn't affect the (single and double) quote
characters, and any other characters/strings (like '#' and '=>') we
use for describing the Emacs and Lisp features.

The problem here is that @documentencoding is virulent when you use
UTF-8: it affects the quotes, not just non-ASCII text in the Texinfo
sources.  This is unlike any other value of @documentencoding.  And
that is the only problem that bothers me, and IMO should bother us
all.

Perhaps a possible solution would be to customize OPEN_QUOTE_SYMBOL
and CLOSE_QUOTE_SYMBOL (although I'm not sure it affects double
quotes), or edit the Info files with Sed to replace Unicode quote
characters with some ASCII characters.  The rest of the non-ASCII text
can be left intact, in UTF-8.
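
As a rough illustration of the second idea (post-processing the Info
files), here is a minimal sketch in Python rather than Sed.  The function
name asciify_quotes and the particular ASCII replacements are assumptions
made for this example, not something settled in this thread.  It maps
only the four Unicode quote characters and leaves all other non-ASCII
text untouched:

```python
# Sketch only: replace Unicode quote characters with ASCII ones,
# leaving every other non-ASCII character as it is.
# The mapping below is an assumption chosen for illustration.
QUOTE_MAP = {
    "\u2018": "`",   # LEFT SINGLE QUOTATION MARK  -> grave accent
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK -> apostrophe
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK  -> straight double quote
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK -> straight double quote
}
_TABLE = str.maketrans(QUOTE_MAP)


def asciify_quotes(text: str) -> str:
    """Return TEXT with Unicode quotes replaced by ASCII equivalents."""
    return text.translate(_TABLE)


def asciify_quotes_in_file(path: str) -> None:
    """Rewrite the UTF-8 file at PATH in place (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(asciify_quotes(text))
```

One caveat this sketch does not address: each replaced quote shrinks
from three bytes to one, so any byte offsets recorded in an Info file's
tag table would presumably no longer match exactly after the rewrite.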

> Perhaps Microsoft Windows users are different, and typically
> use Latin-1 or some other unibyte encoding.

This has nothing to do with Windows; I first hit the problem on a
GNU/Linux machine that was configured with a non-UTF-8 locale.  The
reason I haven't seen the problem since last March is that I still use
makeinfo from Texinfo 4.13, which doesn't affect the quote characters
when @documentencoding of UTF-8 is specified.  So the Info files I
produce when I build Emacs don't suffer from this misfeature.




