bug-texinfo

Re: fix quotes for @samp


From: Karl Berry
Subject: Re: fix quotes for @samp
Date: Tue, 2 Aug 2005 00:56:32 +0200

Regarding HTML quotes, &lsquo; and &rsquo; are surely better in theory,
but I seem to remember that I tried putting them in not too long ago and
got a lot of complaints.  But I guess we can try it again, maybe things
are better.  I see that we're doing &ldquo; and &rdquo; now, so what the
heck.  I installed it.

Actually, we could go even further.  Your patches change the output for
@samp, but how about making `...' in straight text output
&lsquo;...&rsquo;?

Regarding quotes in Info output, I really don't think changing the
default output is worth the inevitable trouble.  I also don't think the
output should depend on the locale of the person who runs makeinfo; Info
output gets shipped around, installed on different computers, etc.
Instead, it seems to me that the right thing to do is to make the output
respect @documentencoding, and support utf-8 there.  Of course this is
not exactly trivial, but it is the path we have started down.  E.g.,
makeinfo --enable-encoding on a document with @documentencoding
ISO-8859-1 will produce 8-bit Info files.

Regarding quotes in general, I expect you will be unhappy to hear that
after the latest discussion on gnu-prog-discuss (a couple of months ago
by now), I spent a fair amount of effort discussing it with rms, Bruno,
Paul Eggert, and others.  The consensus was that we had to stick with
`...' for default (C locale) output, because nothing else was widely
enough supported, even today, but that using better quote characters in
other locales was ok, even desirable.  See the patch for standards.texi
below, which I plan to install in the near future.

Best,
karl


***************
*** 2971,2972 ****
--- 2977,3032 ----
  
+ 
+ @node Character set
+ @section Character set
+ @cindex character set
+ @cindex encodings
+ @cindex ASCII characters
+ @cindex non-ASCII characters
+ 
+ Sticking to the ASCII character set (plain text, 7-bit characters) is
+ preferred in GNU source code comments, text documents, and other
+ contexts, unless there is good reason to do something else because of
+ the application domain.  For example, if source code deals with the
+ French Revolutionary calendar, it is OK if its literal strings contain
+ accented characters in month names like ``Flor@'eal''.  Also, it is OK
+ to use non-ASCII characters to represent proper names of contributors in
+ change logs (@pxref{Change Logs}).
+ 
+ If you need to use non-ASCII characters, you should normally stick with
+ one encoding, as one cannot in general mix encodings reliably.
+ 
+ 
+ @node Quote characters
+ @section Quote characters
+ @cindex quote characters
+ 
+ In the C locale, GNU programs should stick to plain ASCII for quotation
+ characters in messages to users: preferably 0x60 (@samp{`}) for left
+ quotes and 0x27 (@samp{'}) for right quotes.  It is ok, but not
+ required, to use locale-specific quotes in other locales.
+ 
+ The @uref{http://www.gnu.org/software/gnulib/, Gnulib} @code{quote} and
+ @code{quotearg} modules provide a reasonably straightforward way to
+ support locale-specific quote characters, as well as taking care of
+ other issues, such as quoting a filename that itself contains a quote
+ character.  See the Gnulib documentation for usage details.
+ 
+ In any case, the documentation for your program should clearly specify
+ how it does quoting, if different than the preferred method of @samp{`}
+ and @samp{'}.  This is especially important if the output of your
+ program is ever likely to be parsed by another program.
+ 
+ Quotation characters are a difficult area in the computing world at this
+ time: there are no true left or right quote characters in ASCII, or even
+ Latin1; the @samp{`} character we use was standardized as a grave
+ accent.  Moreover, Latin1 is still not universally usable.
+ 
+ Unicode contains the unambiguous quote characters required, and its
+ common encoding UTF-8 is upward compatible with address@hidden  However,
+ Unicode and UTF-8 are not universally well-supported, either. 
+ 
+ This may change over the next few years, and then we will revisit
+ this.
+ 
+ 
  @node Mmap
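
As an illustration of the quoting rule in the patch above, here is a
minimal C sketch, not taken from the patch or from Gnulib.  The names
choose_quotes and format_quoted are hypothetical, and using the locale's
codeset to decide when non-ASCII quotes are safe is an assumption made
for the example.  It emits `...' in the C locale and Unicode single
quotes only when the current locale reports UTF-8.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: a left/right quote pair as byte strings.  */
struct quote_pair
{
  const char *left;
  const char *right;
};

/* Pick quote characters for the current locale.  Default to ASCII
   0x60 and 0x27; switch to U+2018/U+2019 (encoded as UTF-8) only if
   the locale's codeset is reported as "UTF-8".  This codeset test is
   an assumption of the sketch, not a rule from the standards text.  */
static struct quote_pair
choose_quotes (void)
{
  struct quote_pair q = { "`", "'" };
  const char *codeset = nl_langinfo (CODESET);

  if (codeset != NULL && strcmp (codeset, "UTF-8") == 0)
    {
      q.left = "\xe2\x80\x98";   /* U+2018 LEFT SINGLE QUOTATION MARK */
      q.right = "\xe2\x80\x99";  /* U+2019 RIGHT SINGLE QUOTATION MARK */
    }
  return q;
}

/* Write ARG surrounded by the chosen quotes into BUF (at most SIZE
   bytes, including the terminating NUL).  Returns what snprintf
   returns.  Quote characters inside ARG are not escaped here.  */
static int
format_quoted (char *buf, size_t size, const char *arg)
{
  struct quote_pair q = choose_quotes ();
  return snprintf (buf, size, "%s%s%s", q.left, arg, q.right);
}
```

A program that never calls setlocale runs in the C locale and so gets
the ASCII pair; after setlocale (LC_ALL, "") in a UTF-8 locale the same
call would produce the Unicode pair.  The sketch deliberately ignores
the harder cases the patch mentions, such as an argument that itself
contains a quote character; the Gnulib quote and quotearg modules are
the place to look for those.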



