Re: po / pot file integration: documentation
From: Ralf Wildenhues
Subject: Re: po / pot file integration: documentation
Date: Tue, 17 Aug 2010 20:39:24 +0200
User-agent: Mutt/1.5.20 (2010-04-22)
Hi Bruno,
* Bruno Haible wrote on Sun, Aug 15, 2010 at 10:24:58PM CEST:
> In order to test our common understanding, and to prepare for the unit tests,
> here is a proposed patch to the documentation. Feel free to put it on a
> git branch.
You can do it yourself, when we've hashed out all questions. Or I can
do it for you, if you prefer.
I've created a new branch 'pot-primary' based off of the 'maint' branch.
I don't expect it to be merged into branch-1.11, and if we need
features only in master but not in maint, then it should be fine to
merge master into 'pot-primary' at that point. Other than that, I'm
fine with a branch policy such that the 'pot-primary' branch may be
rewound in case that proves necessary, but of course we should aim for
that not to be needed.
Nits and questions to the patch below.
> 2010-08-15 Bruno Haible <address@hidden>
>
> Document the handling of POT and PO files.
s/handling/intended &/ ?
> * doc/automake.texi (Public Macros): Document AM_POT_TOOLS.
> (Private Macros): Add AM_NLS.
> (Internationalization): New chapter.
> --- doc/automake.texi.orig Sun Aug 15 22:15:54 2010
> +++ doc/automake.texi Sun Aug 15 22:13:56 2010
> @@ -277,6 +279,32 @@
>
> * Built Sources Example:: Several ways to handle built sources.
>
> +Internationalization
> +
> +* POT Files:: Declaring a POT file
> +* Parametrizing xgettext:: Declaring how a POT file gets created
My dictionary accepts 'parametrizing', but I wonder whether
'parameterizing' is the more common spelling. Since this ends up in the
section URLs and command-line usage of the stand-alone info reader, it's
a good idea to fix this beforehand and to make it not too complicated.
Hmm, can we think of a simpler word with the same meaning?
> +* Message catalog translations::Declaring the translations
> +* Message catalog installation::Declaring how message catalogs get installed
> +* Other POT files details:: Rarely used settings for POT files
Does info cope with no whitespace after ::? I'm wondering whether
increasing indentation here (and thereby messing up alignment with
previous menus) would be nicer?
> +
> +Parametrizing xgettext
> +* A POT file's _SOURCES:: Declaring the source files.
Let's drop the trailing period in menus; automake.texi has only a couple of
them.
> +* A POT file's _COPYRIGHT_HOLDER:: The message catalogs' copyright holder
> +* A POT file's _MSGID_BUGS_ADDRESS:: Allowing translators to report bugs
BUGS_ADDRESS or BUG_ADDRESS? Oh well, I guess that was already set in
stone years ago; it slightly conflicts with current Autoconf usage of
the term.
> +* Additional xgettext options:: Fine tuning of xgettext
I think it's "Fine-tuning".
> +* Multiple xgettext invocations:: Advanced uses of xgettext
> +
> +Message catalog installation
> +
> +* A POT file's _CATALOGFORMAT:: Formats of compiled message catalogs
> +* A POT file's _LOCALE_CATEGORIES:: Using different locale categories
> +
> +Other POT files details
> +
> +* A POT file's _USE_MSGCTXT:: Declaring whether msgctxt is used
> +* A POT file's _MSGMERGE_OPTIONS:: Specifying options for msgmerge
> +
> Other GNU Tools
>
> * Emacs Lisp:: Emacs Lisp
> @@ -3861,6 +3889,19 @@
> @command{configure} to explicitly set the correct path (if you're sure
> you have an @command{emacs} that supports Emacs Lisp).
>
> +@item AM_POT_TOOLS
> +@acindex AM_POT_TOOLS
> +
> +The @code{AM_POT_TOOLS} macro performs tests for the @code{POTS} primary.
> +See @pxref{Programming Languages, , Programming Languages, gettext,
@pxref prints a lower-case "see" in the PDF output; here it would be best
to remove the initial "See" and use @xref.
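For illustration, the reworded sentence might read as follows. This is only a sketch of the suggestion above, not final patch text; @xref produces the capitalized "See" itself and must be followed by a comma or period:

```texinfo
The @code{AM_POT_TOOLS} macro performs tests for the @code{POTS} primary.
@xref{Programming Languages, , Programming Languages, gettext,
GNU gettext tools}, for a list of programming languages that support
localization through PO files.
```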
> +GNU gettext tools} for a list of programming languages that support
> +localization through PO files.
> +
> +The @code{AM_POT_TOOLS} macro determines whether internationalization
> +should be used.
Adding (@pxref{Internationalization}) would be good here.
> If so, it sets the @code{USE_NLS} variable to @samp{yes},
> +otherwise to @samp{no}. It also determines the right values for Makefile
> +variables used by the Makefile rules for the POT and PO files.
> +
> @item AM_PROG_AS
> @acindex AM_PROG_AS
> @vindex CCAS
> @@ -4030,6 +4071,11 @@
> @code{include} statements. This macro is automatically invoked when
> needed; there should be no need to invoke it manually.
>
> +@item AM_NLS
> +This macro determines whether internationalization should be used. If
> +so, it sets the @code{USE_NLS} variable to @samp{yes}, otherwise to
> +@samp{no}.
> +
> @item AM_PROG_INSTALL_STRIP
> This is used to find a version of @code{install} that can be used to
> strip a program at installation time. This macro is automatically
> @@ -7352,6 +7398,421 @@
> is converting @file{.h} files into @file{.c} files.
>
>
> +@node Internationalization
> +@chapter Internationalization
> +
> +An internationalized program is a program that can communicate with the
> +user in his native language, provided that some translation work has been
Let's avoid 'his', so how about 'its users in their native language'?
> +done. An internationalized program can be localized to a certain
> +language by translating the messages by which the program communicates
> +with the user. These message are grouped in files called ``message
> +catalogs''.
> +
> +The process of localization consists of three steps:
> +
> +@enumerate
> +@item
> +The developers send a file with English messages and (optionally)
> +some translation hints for the translators.
> +@item
> +The translators translate these files to their respective language.
> +@item
> +The translators send back the translated message catalogs, and the
> +developers integrate them in the package.
> +@end enumerate
> +
> +Automake supports the message catalog format of GNU Gettext
> +(@pxref{Top, , Introduction, gettext, GNU gettext tools}). When the GNU
> +Gettext tools are in use:
> +
> address@hidden
> address@hidden
> +The file that contains the English messages and translation hints is
> +called a POT file (Portable Object Template) and usually named
> address@hidden@var{domain}.pot}. The @var{domain} is a identifier for the
@address@hidden
an identifier
> +translation domain. Different packages must use different @var{domain}s.
> +Therefore, usually, the @var{domain} is the same as the package name.
I think this is slightly more common: Therefore, the @var{domain} is
usually the same ...
> address@hidden
> +The file that contains the translation produced by a translation team
> +is called a PO file (Portable Object) and usually named
> address@hidden@address@hidden or
> address@hidden@address@hidden@var{CC}.po}. Here @var{ll} is the
Again, @file, two instances.
Here, ...
> +language code identifier according to ISO 639, and the optional @var{CC}
> +is the country code identifier according to ISO 3166. The optional
> address@hidden is only needed to distinguish different dialects of the same
> +language, such as @code{pt_BR} (Brazilian Portuguese), which is slightly
> +different from @code{pt} (Portuguese).
> address@hidden itemize
> +
> address@hidden
> +* POT Files:: Declaring a POT file
> +* Parametrizing xgettext:: Declaring how a POT file gets created
> +* Message catalog translations::Declaring the translations
> +* Message catalog installation::Declaring how message catalogs get installed
> +* Other POT files details:: Rarely used settings for POT files
> address@hidden menu
> +
> address@hidden POT Files
> address@hidden POT Files
> +
> +In Automake, a message catalog template (POT file) is a primary.
I think the Automake lingo is that _POTS is the primary; see the
"amhello Explained" node.
> The
> +Makefile rules generated by Automake also handle all the PO files that
> +belong to the POT file.
> +
> +For execution, the PO files get compiled to ``compiled message
> +catalogs''. These a usally @code{.mo} files (MO = Machine Object).
usually
@file{.mo}
> +They get installed in an appropriate subdirectory of @code{$(localedir)}.
> +
> +The declaration of an installable message catalog domain therefore looks
> +like this:
> +
> +@smallexample
> +locale_POTS = po/maude.pot
> +@end smallexample
> +
> +This declaration causes Automake to emit Makefile rules that create
> +the @file{po/maude.pot} file from some of the sources, and that manage
> +translated message catalogs @file{po/address@hidden@var{CC}.po} in the
> +same directory.
> +
> address@hidden AM_POT_TOOLS
> +When you use the @code{POTS} primary, you need to invoke
> address@hidden directly or indirectly from @file{configure.ac}.
> +
> address@hidden Parametrizing xgettext
> address@hidden Parametrizing xgettext
> +
> +The POT file gets created through an invocation of the @code{xgettext}
> +program
> +(@pxref{xgettext Invocation, , xgettext, gettext, GNU gettext tools}).
> +
> address@hidden
> +* A POT file's _SOURCES:: Declaring the source files.
> +* A POT file's _COPYRIGHT_HOLDER:: The message catalogs' copyright holder
> +* A POT file's _MSGID_BUGS_ADDRESS:: Allowing translators to report bugs
> +* Additional xgettext options:: Fine tuning of xgettext
> +* Multiple xgettext invocations:: Advanced uses of xgettext
> address@hidden menu
> +
> address@hidden A POT file's _SOURCES
> address@hidden A POT file's _SOURCES
> +
> address@hidden scans a set of source files. The default set of sources
> +is the union of all _SOURCES variables in the same @code{Makefile.am},
> +except for @code{BUILT_SOURCES}. You can also specify the set of sources
> +explicitly, through a _SOURCES variable and optionally a
> +_SOURCES_EXCLUDE variable. Both variables may include wildcards.
> +
> +This example lists the source files explicitly:
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_SOURCES = src/maude.c src/util.c
> address@hidden smallexample
> +This example uses wildcards. @code{**} stands for recursion across all
> +subdirectories. But note that only files known to Automake will be
> +selected. If you have other, unrelated files in the file system, they
> +will not be selected, even if they match the wildcard.
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_SOURCES = src/**/*.c
> +po_maude_pot_SOURCES_EXCLUDE = src/version.c
> address@hidden smallexample
Can we omit this for now? It can still be added at a later point in
time, maybe together with its implementation. Wildcards are
problematic in portable makefiles (the git Automake manual has more
details than the latest stable) and we are still thinking about ways
to let Automake provide some kind of file name expansion for users
(at automake run time).
> +When a C file is automatically generated by a tool, like @code{flex} or
> address@hidden, that doesn't introduce translatable strings by itself,
> +it is recommended to list in _SOURCES the real source file (ending in
> address@hidden in the case of @code{flex}, or in @file{.y} in the case of
> address@hidden), not the generated C file.
> +
> address@hidden A POT file's _COPYRIGHT_HOLDER
> address@hidden A POT file's _COPYRIGHT_HOLDER
> +
> +Every POT file needs to carry a copyright notice designating the
> +copyright holder, so that the translators know for whom they are doing
> +the translation. The copyright holder is declared like this:
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_COPYRIGHT_HOLDER = Yoyodyne, Inc.
> address@hidden smallexample
> +
> +The value of this variable is the copyright holder that gets inserted
> +into the header of the POT file. Set this to the copyright holder of the
> +surrounding package. (Note that the msgid strings, extracted from the
@samp{msgid} ? Is it worth explaining this term here?
> +package's sources, belong to the copyright holder of the package.)
s/ / /
> +Translators are expected to transfer the copyright for their translations
> +to this person or entity, or to disclaim their copyright. The empty
> +string stands for the public domain; in this case the translators are
> +expected to disclaim their copyright.
> +
> address@hidden A POT file's _MSGID_BUGS_ADDRESS
> address@hidden A POT file's _MSGID_BUGS_ADDRESS
> +
> +It is important that translators can report problems that they find
> +in the English messages to be translated. A bug reporting address can
> +be declared like this:
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_MSGID_BUGS_ADDRESS = bug-maude@@yoyodyne.com
There was an example domain for email addresses, but it wasn't
example.com IIRC. Using some arbitrary other one is not nice for spam,
so in case you don't know, let's use the Automake bug address instead.
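A sketch of the example with that substitution, assuming the public Automake bug list address bug-automake@gnu.org (the @ is doubled because a literal @ must be written as @@ in Texinfo):

```texinfo
@smallexample
locale_POTS = po/maude.pot
po_maude_pot_MSGID_BUGS_ADDRESS = bug-automake@@gnu.org
@end smallexample
```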
> address@hidden smallexample
> +
> +The value of this variable is the email address or URL to which the
> +translators shall report bugs in the untranslated strings:
> +
> address@hidden
> address@hidden
> +Strings which are not entire sentences, see the maintainer guidelines in
> +the GNU gettext documentation,
> address@hidden Strings, , Preparing Translatable Strings, gettext,
> +GNU gettext tools}.
> address@hidden
> +Strings which use unclear terms or require additional context to be
> +understood.
> address@hidden
> +Strings which make invalid assumptions about notation of date, time or
> +money.
> address@hidden
> +Pluralisation problems.
So far most of the manual has been using US English spelling, let's keep
it that way.
> address@hidden
> +Incorrect English spelling.
sorry for the unintended pun. ;-)
> address@hidden
> +Incorrect formatting.
> address@hidden itemize
> +
> +It can be your email address, or a mailing list address where translators
> +can write to without being subscribed, or the URL of a web page through
> +which the translators can contact you.
> +
> +You don't need to specify the bug reporting address here if you have
> +already done so through the third argument of @code{AC_INIT}, see
> address@hidden configure, , Initializing @code{configure}, autoconf,
This is where s/see @xref/@pxref/ would make sense. :-)
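One possible shape for that sentence, as a sketch only. The node name "Initializing configure" is assumed from the Autoconf manual, since the archive has obscured the start of the quoted line, and the parenthesized placement is a stylistic choice rather than something the patch prescribes:

```texinfo
You don't need to specify the bug reporting address here if you have
already done so through the third argument of @code{AC_INIT}
(@pxref{Initializing configure, , Initializing @code{configure}, autoconf,
The Autoconf Manual}).
```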
> +The Autoconf Manual}.
> +
> address@hidden Additional xgettext options
> address@hidden Additional @code{xgettext} options
> +
> +Additional command-line options for the @code{xgettext} invocation can be
> +specified through the _XGETTEXT_OPTIONS variable. For example:
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_XGETTEXT_OPTIONS = \
> + --keyword=_ --flag=_:1:pass-c-format \
> + --keyword=N_ --flag=N_:1:pass-c-format
> address@hidden smallexample
> +
> address@hidden AM_XGETTEXT_OPTION
> +Additional command-line options for the @code{xgettext} invocation can
Is it worth starting with "Alternatively, additional ..." to avoid
sounding like a typo/repetition?
> +also be specified in @file{configure.ac} or in Autoconf macros, through
> +the @code{AM_XGETTEXT_OPTION} macro.
> +
> address@hidden AM_XGETTEXT_OPTION (@var{option})
> +
> +The @code{AM_XGETTEXT_OPTION} macro registers a command-line option to be
> +used in the invocations of @code{xgettext} for POT files.
> +
> +For example, if you have a source file that defines a function
> address@hidden whose fifth argument is a format string, you can use
> address@hidden
> +AM_XGETTEXT_OPTION([--flag=error_at_line:5:c-format])
> address@hidden smallexample
> address@hidden
> +to instruct @code{xgettext} to mark all translatable strings in
> address@hidden invocations that occur as fifth argument to this function
> +as @samp{c-format}.
> +
> +See @pxref{xgettext Invocation, , xgettext, gettext, GNU gettext tools}
s/See @pxref/@xref/ as above.
> +for the list of options that @code{xgettext} accepts.
> address@hidden defmac
> +
> address@hidden Multiple xgettext invocations
> address@hidden Multiple @code{xgettext} invocations
> +
> +If you need to construct a POT file from multiple @code{xgettext}
> +invocations, for example because your package has source code in
> +different programming languages and you need different options for
> +each language, the easiest way to achieve this is through intermediate
> +POT files that get combined into the final POT file:
> +
> address@hidden
> +locale_POTS = maude.pot
> +noinst_POTS = maude1.pot maude2.pot
> +maude_pot_SOURCES = maude1.pot maude2.pot
> +maude1_pot_SOURCES = maude.c
> +maude2_pot_SOURCES = maude.lisp
> address@hidden smallexample
> +
> address@hidden Message catalog translations
> address@hidden Message catalog translations
> +
> +In the simplest case, you (the developer) receive translations through a
> +translation project, such as the Translation Project
> address@hidden://translationproject.org/}, the KDE localization project
> address@hidden://i18n.kde.org/}, or the GNOME localization project
> address@hidden://l10n.gnome.org/}, and include them in your release tarball.
> +In this case, you should declare the set of translations in a _LINGUAS
> +variable, like this:
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_LINGUAS = de fr de_AT nl
> address@hidden smallexample
> +
> +This variable should contain the list of languages for which a
> +translation is present, using the notation @var{ll} or @address@hidden
> +
> +Alternatively, you can also have the translation downloaded from a
> +translation project when the package is built. This has the advantage
> +that translations that were complete after the package was released can
> +be included. It has two drawbacks, however:
> +
> address@hidden
> address@hidden
> +No translations can be installed if the build machine does not have a
> +connection to the Internet.
> address@hidden
> +No translations can be installed if the GNU gettext tools are not
> +installed on the build machine.
> address@hidden itemize
Can't we have the cake and eat it too? As in: ship translations but at
configure time also try to download newer ones? That would avoid both
of these downsides, no?
> +To enable this mechanism, specify a _TP_URL variable or a _TP_RSYNC_URI
> +variable, or both. Example:
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_TP_URL = http://translationproject.org/latest/
> +po_maude_pot_TP_RSYNC_URI = translationproject.org::tp/latest/
> address@hidden smallexample
What is a _TP_RSYNC_URI? This strikes me as quite a specialized name;
can't we find some more general method?
Hmm, maybe this is best decided when we see an actual or proposed
implementation for this.
> +Currently, only the Translation Project is supported in this way.
> +
> address@hidden Message catalog installation
> address@hidden Message catalog installation
> +
> +Normally, when you want translated message catalogs to be installed, you
> +specify the POT file in the @code{locale_POTS} variable. If you don't
> +want them installed (for example, if they are part of a test suite only),
> +you specify the POT file in the @code{noinst_POTS} variable.
> +
> address@hidden
> +* A POT file's _CATALOGFORMAT:: Formats of compiled message catalogs
> +* A POT file's _LOCALE_CATEGORIES:: Using different locale categories
> address@hidden menu
> +
> address@hidden A POT file's _CATALOGFORMAT
> address@hidden A POT file's _CATALOGFORMAT
> +
> +Before installation or distribution, message catalogs need to be compiled
> +to ``compiled message catalogs''. The format of these compiled message
> +catalogs depends on the programming language or base runtime libraries
> +that your package is using. Several formats are supported:
> +
> address@hidden @code
> address@hidden mo
> +This is the GNU @code{.mo} format. It is used for C, C++, and many other
@file{.mo}
> +programming languages.
> address@hidden qm
> +This is the message catalog format of the Qt library
> address@hidden://qt.nokia.com/}.
> address@hidden properties
> +This is the Java @code{.properties} format. For details, see
@file
> address@hidden, , Java, gettext, GNU gettext tools}.
@ref, or s/see @xref/@pxref/
> address@hidden class
> +This is the Java @code{.class} format. For details, see
@file
> address@hidden, , Java, gettext, GNU gettext tools}.
see above
> address@hidden resources.dll
> +This is the message catalog format of C#.
> address@hidden msg
> +This is the message catalog format of Tcl.
> address@hidden table
> +
> +For example:
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_CATALOGFORMAT = msg
> address@hidden smallexample
> +
> +If you don't specify this variable, the default is @code{mo}.
> +
> address@hidden A POT file's _LOCALE_CATEGORIES
> address@hidden A POT file's _LOCALE_CATEGORIES
> +
> +A locale consists of multiple functional areas called ``locale
> +categories''. The locale category for message output is called
> address@hidden, the locale category for date and time formatting is
> +called @code{LC_TIME}, and so on.
> +(@pxref{Aspects, , Aspects in Native Language Support, gettext,
> +GNU gettext tools}). Message catalogs are usually used only with
> +the @code{LC_MESSAGES} locale category, but they can rarely also
> +be useful in other categories.
they can sometimes be useful in other categories
> This set of categories can be specified
> +through the _LOCALE_CATEGORIES variable. For example, GNU coreutils uses
> +its message catalogs for the @code{LC_MESSAGES} and @code{LC_TIME}
> +locale categories:
> +
> address@hidden
> +po_coreutils_pot_LOCALE_CATEGORIES = LC_MESSAGES LC_TIME
> address@hidden smallexample
> +
> +This variable allows a coreutils program, when run in an environment
> +where the @code{LC_TIME} locale category is bound to a different locale
> +than the @code{LC_MESSAGE} locale category, to properly obey the
> address@hidden setting.
> +
> +If not specified, this variable defaults to @code{LC_MESSAGES} only.
> +
> address@hidden Other POT files details
> address@hidden Other POT files details
> +
> +There are two more variables associated with a POT file.
> +
> address@hidden
> +* A POT file's _USE_MSGCTXT:: Declaring whether msgctxt is used
> +* A POT file's _MSGMERGE_OPTIONS:: Specifying options for msgmerge
> address@hidden menu
> +
> address@hidden A POT file's _USE_MSGCTXT
> address@hidden A POT file's _USE_MSGCTXT
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_USE_MSGCTXT = no
> address@hidden smallexample
> +
> +This tells whether the POT file contains messages with an @code{msgctxt}
> +context. Possible values are ``yes'' and ``no''. Set this to yes if the
@samp{yes} and @samp{no}
> +package uses functions taking also a message context, like the
> address@hidden function, or if in _XGETTEXT_OPTIONS you define keywords
> +with a context argument. If set to yes, GNU gettext tools older than
> +gettext 0.15 will not be considered. The default is @code{yes}.
> +
> address@hidden A POT file's _MSGMERGE_OPTIONS
> address@hidden A POT file's _MSGMERGE_OPTIONS
> +
> address@hidden
> +locale_POTS = po/maude.pot
> +po_maude_pot_MSGMERGE_OPTIONS =
> address@hidden smallexample
> +
> +These options get passed to @code{msgmerge}.
@command{msgmerge}
> +Useful options are in particular:
> +
> address@hidden @code
> address@hidden --previous
> +to keep previous msgids of translated messages,
> address@hidden --quiet
> +to reduce the verbosity.
> address@hidden table
> +
> +If this variable contains @code{--previous}, GNU gettext tools older than
> +gettext 0.16 will not be considered. The default is @code{--previous}.
> +
> +
> @node Other GNU Tools
> @chapter Other GNU Tools
The chapter would be even better with some @cindex entries for the
concepts and @vindex for variables, primaries, etc.
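As a rough sketch of what such entries could look like at the top of one node: the section level and the exact index entry texts here are hypothetical, chosen only to illustrate the idea, and would need to follow whatever conventions the rest of automake.texi uses.

```texinfo
@node POT Files
@section POT Files
@cindex POT files
@cindex message catalog templates
@vindex POTS
@vindex locale_POTS
@vindex noinst_POTS
```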
With some of the semantics, I don't yet see how the implementation would
work, but it's no problem to fix those things when we get to them.
Especially the moving of Autoconf macros from gettext to Automake will
be tricky to implement without transition ugliness ...
Thanks!
Ralf