[Groff] Re: man page encoding
From: Bruno Haible
Subject: [Groff] Re: man page encoding
Date: Wed, 6 Jul 2005 13:50:31 +0200
User-agent: KMail/1.5
Hi Andries,
Thanks for the details.
> (2) You say: `The goal is that "groff -T... -mandoc" on any man page works,
> without need to specify the encoding as an argument to groff'.
>
> (2A) This will work in simple cases, where input encoding and output
> encoding and system character set are equal.
> ...
> /usr/bin/groff -Tnippon -mandocj
The input encoding and the output encoding are often different, for
example when a user in a ja_JP.UTF-8 locale views a man page in EUC-JP
encoding. The output device is -Tutf8 in this case.
The problem with "-Tnippon" is that it ties the choice of output device
to the input encoding: one must select a particular output device just
to cope with input in EUC-JP.
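To illustrate why input and output encoding are separate concerns, here is a minimal Python sketch of the recoding step. The function name recode is hypothetical and not part of groff; it only assumes that the input encoding is known by some means and that the output encoding comes from the user's locale.

```python
def recode(page_bytes, input_encoding, output_encoding):
    """Convert raw man page bytes from one encoding to another.

    Hypothetical helper, for illustration only: the input encoding
    belongs to the file, the output encoding belongs to the viewer.
    """
    text = page_bytes.decode(input_encoding)
    return text.encode(output_encoding)


# Three kanji ("nihongo") stored in EUC-JP, viewed in a UTF-8 locale.
euc_jp_page = "\u65e5\u672c\u8a9e".encode("euc_jp")
utf8_output = recode(euc_jp_page, "EUC-JP", "UTF-8")
```

With this separation, the same EUC-JP file can be shown in any locale whose encoding can represent its characters; nothing about the file has to match the terminal.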
> (3A) man.conf contains the default invocation, like
> /usr/bin/nroff -Tlatin1 -mandoc
This is bad: The encoding of the output should be determined by the
user's current locale, not hardcoded in a configuration file.
Get rid of this line in man.conf!
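A rough Python sketch of what "determined by the user's current locale" could look like follows. The mapping table and the function names are assumptions made for illustration; a real implementation would more likely ask the C library for the codeset instead of parsing the locale name, and would cover more devices.

```python
# Hypothetical table: locale codeset (lowercased) -> groff output device.
DEVICE_BY_CODESET = {
    "utf-8": "utf8",
    "utf8": "utf8",
    "iso-8859-1": "latin1",
    "iso8859-1": "latin1",
}


def device_for_locale(locale_name):
    """Return a groff -T device for a locale name such as 'ja_JP.UTF-8'."""
    # Drop an optional '@modifier', then take the codeset after the '.'.
    base = locale_name.split("@", 1)[0]
    codeset = base.split(".", 1)[1] if "." in base else ""
    # Fall back to plain ASCII when the codeset is missing or unknown.
    return DEVICE_BY_CODESET.get(codeset.lower(), "ascii")


def groff_command(locale_name, page):
    """Build the command line instead of reading it from man.conf."""
    return ["groff", "-T" + device_for_locale(locale_name), "-mandoc", page]
```

For a user in ja_JP.UTF-8 this yields "groff -Tutf8 -mandoc ls.1", with no output encoding stored in any configuration file.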
> (2B) Maybe this does not have to work - the requirement is that "man ls"
> works, not that "groff [options] ls.1" works.
No, the goal is really that "groff [options] ls.1" works. When a
translator or man page author wants to view a man page, s/he should
be able to do so without installing the file in particular directories.
> (3C) The iconv hack mentioned earlier today used a charset file
> in the directory to indicate the character set of all man pages in that
> directory.
That's bad, because the meaning of the file changes depending on which
directory it sits in. "groff [options] ls.1" needs to work without
referring to other files in the same directory.
> (4) Yes, character set information in a man page would be desirable.
> But it is bad to require it.
Why? HTML requires it. XML requires it. We require it in PO files, and there
it's a life saver. Emacs requires it in many files, in order to display the
file correctly.
> Putting the info on the first line of the file is a bad idea.
> Many things want to be on the first line.
> (The .so directive, the 't and 'e directives, etc.)
When there's a .so directive, you don't need to specify the encoding.
When there are 't and 'e directives, the comment with -*- coding -*-
can come after them, without disturbing groff's determination of the
preprocessors to be run.
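As a sketch of how such a tag could be picked up, the Python below scans the first lines of a page for an Emacs-style -*- coding -*- comment. The function name, the two-line limit, and the decision to look only at roff comment lines are assumptions for illustration, not a description of what groff does.

```python
import re

# Emacs-style file variable, e.g.  -*- coding: EUC-JP -*-
CODING_TAG = re.compile(r"-\*-.*?\bcoding:\s*([A-Za-z0-9._-]+).*?-\*-")

# The two roff comment introducers:  .\"  and  '\"
ROFF_COMMENT_PREFIXES = ('.\\"', "'\\\"")


def declared_encoding(page_text, max_lines=2):
    """Return the encoding named in a coding tag, or None if absent.

    Only comment lines among the first max_lines lines are examined,
    so the tag may share a line with preprocessor letters or follow it.
    """
    for line in page_text.splitlines()[:max_lines]:
        if not line.startswith(ROFF_COMMENT_PREFIXES):
            continue
        match = CODING_TAG.search(line)
        if match:
            return match.group(1)
    return None
```

A page that consists of a .so directive simply yields None here, which matches the point above that such a page needs no encoding declaration of its own.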
> (-) In short: the system-wide convention (you would choose UTF-8
> but I know people who would choose KOI-8) we have already, it is (3A).
Sorry, this needs to go away. Hardcoding output encodings in a configuration
file is a no-no.
> The man program (and/or groff) can react to the user's locale settings.
Yes, that's the way to go.
> Since almost all translations are produced by national translation teams
> working via the Montreal translation robot, the rules are rather uniform,
> and it will not be very difficult to introduce new rules.
Thanks, then let's go for the proposed
.\" t -*- coding: EUC-JP -*-
syntax.
Bruno