bug-gnulib

Re: [bug-gnulib] address@hidden: GNU Coding Standards, internatialisatio


From: Bruno Haible
Subject: Re: [bug-gnulib] address@hidden: GNU Coding Standards, internatialisation and plurals]
Date: Tue, 23 May 2006 14:21:05 +0200
User-agent: KMail/1.5

Hello,

Michael Thayer writes:

> This has the problem that not all languages treat singular and plural
> the same way as English.  For example, Arabic uses singular, dual and
> plural rather than just singular and plural.  Russian uses a different
> case depending on whether the number ends in 1, in 2-4 or in 5-9, 0 or
> 11-19.  And I believe many languages treat zero as singular (although
> it is probably better to have a sentence like "No files processed" for
> zero).  Newer versions of Gettext take these things into account.

You are entirely right. This has been fixed in gettext and glibc since
1999 or 2000, but we neglected to update standards.texi.
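
To make the rule Michael describes concrete, here is a minimal sketch of the
Russian case as a plain C function. It follows the expression that Russian
message catalogs conventionally declare in their Plural-Forms header; the
function names are hypothetical and exist only for illustration, since
gettext evaluates the catalog's expression itself.

```c
#include <assert.h>

/* Hypothetical helper: index of the Russian plural form for N.
     0: N ends in 1, except when it ends in 11
     1: N ends in 2-4, except when it ends in 12-14
     2: everything else (0, 5-9, 11-14, ...)  */
int
russian_plural_index (unsigned long n)
{
  if (n % 10 == 1 && n % 100 != 11)
    return 0;
  if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20))
    return 1;
  return 2;
}

/* A few spot checks of the three classes.  */
void
check_russian_plural_index (void)
{
  assert (russian_plural_index (1) == 0);
  assert (russian_plural_index (21) == 0);
  assert (russian_plural_index (3) == 1);
  assert (russian_plural_index (11) == 2);
  assert (russian_plural_index (0) == 2);
}
```

An English-style test such as nfiles != 1 cannot express this three-way
split, which is why the choice of form has to be left to the catalog.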

> Another thing worth mentioning is that it is better to limit strings
> to be translated to one number argument per sentence unit (i.e.
> "Searched %d directories.  Found %d files"  or "Searched %d
> directories and found %d files" rather than "Found %d files in %d
> directories") as some languages may prefer to express the arguments in
> reverse order ("In %d directories found %d files"), which printf would
> probably not take kindly to.

printf has support for swapping argument order, through the "%2$d ... %1$d"
syntax. This is probably out of scope for standards.texi; it is explained
in gettext.texi.
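
As an illustration of the numbered-argument syntax, here is a small sketch
that formats Michael's two word orders from the same argument list. It
assumes a printf implementation with POSIX positional arguments; the
function names and the buffer-based interface are hypothetical, chosen only
to keep the example easy to check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Original English order: files first, directories second.  */
int
format_found_english (char *buf, size_t size, int nfiles, int ndirs)
{
  return snprintf (buf, size, "Found %1$d files in %2$d directories",
                   nfiles, ndirs);
}

/* Reordered sentence: the argument list is unchanged, only the
   positions named in the format string are swapped.  */
int
format_found_reordered (char *buf, size_t size, int nfiles, int ndirs)
{
  return snprintf (buf, size, "In %2$d directories found %1$d files",
                   nfiles, ndirs);
}
```

In a real program the translator would supply the reordered string through
the message catalog, and the calling code would stay the same.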

Bruce Korb writes:
> Actually, in reasonably recent versions of printf, this is handled
> via "... %2$d ... %1$d...".  That syntax was added specifically for
> this issue.  printf libraries that do not support this are now very
> old.

Well, NetBSD and Woe32 still ship with such libraries. But the <libintl.h>
that comes with GNU gettext works around this.

Paul Eggert writes:
> It might be helpful to recommend diagnostics that don't use singular
> or plural at all, e.g.:
> 
>   printf ("Files processed: %d", nfiles);
> 
> This is equally awkward in almost all languages (:-), and bypasses the
> singular/plural/etc. mess.

ngettext solves this. The only case where such a rewriting is useful is
when the %d is a text field or number spinner where the user can enter
0, 1, or any integer, and the label surrounding the text field is static
(isn't updated each time the user changes the value).
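
For reference, a minimal sketch of an ngettext call, assuming a system whose
<libintl.h> declares ngettext (glibc does; elsewhere linking with -lintl may
be needed). The helper name format_processed is hypothetical. Note the
argument order: singular msgid first, plural msgid second, then the count
that selects the form.

```c
#include <assert.h>
#include <libintl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the "files processed" message for NFILES
   into BUF.  NFILES is assumed to be non-negative.  */
int
format_processed (char *buf, size_t size, int nfiles)
{
  /* With no catalog installed, ngettext falls back to the English rule:
     the first string when the count is 1, the second otherwise.  */
  const char *format = ngettext ("%d file processed", "%d files processed",
                                 (unsigned long) nfiles);
  return snprintf (buf, size, format, nfiles);
}
```

With a translated catalog in place, the same call returns whichever of the
language's plural forms matches nfiles, so the calling code does not change.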

Karl,

Here is a patch.


2006-05-22  Bruno Haible  <address@hidden>

        * standards.texi (Internationalization): Change the example with
        plurals to use the ngettext function, and move it to the end. Add
        a new example showing the need for entire sentences.

*** standards.texi.bak  2006-04-10 00:54:34.000000000 +0200
--- standards.texi      2006-05-23 04:26:39.000000000 +0200
***************
*** 2952,2958 ****
  name} for the package.  The text domain name is used to separate the
  translations for this package from the translations for other packages.
  Normally, the text domain name should be the same as the name of the
! package---for example, @samp{fileutils} for the GNU file utilities.
  
  @cindex message text, and internationalization
  To enable gettext to work well, avoid writing code that makes
--- 2952,2958 ----
  name} for the package.  The text domain name is used to separate the
  translations for this package from the translations for other packages.
  Normally, the text domain name should be the same as the name of the
! package---for example, @samp{coreutils} for the GNU core utilities.
  
  @cindex message text, and internationalization
  To enable gettext to work well, avoid writing code that makes
***************
*** 2965,3008 ****
  Here is an example of what not to do:
  
  @example
! printf ("%d file%s processed", nfiles,
!         nfiles != 1 ? "s" : "");
  @end example
  
! @noindent
! The problem with that example is that it assumes that plurals are made
! by adding `s'.  If you apply gettext to the format string, like this,
  
  @example
! printf (gettext ("%d file%s processed"), nfiles,
!         nfiles != 1 ? "s" : "");
  @end example
  
  @noindent
! the message can use different words, but it will still be forced to use
! `s' for the plural.  Here is a better way:
! 
! @example
! printf ((nfiles != 1 ? "%d files processed"
!          : "%d file processed"),
!         nfiles);
! @end example
  
! @noindent
! This way, you can apply gettext to each of the two strings
! independently:
  
  @example
! printf ((nfiles != 1 ? gettext ("%d files processed")
!          : gettext ("%d file processed")),
!         nfiles);
  @end example
  
- @noindent
- This can be any method of forming the plural of the word for ``file'', and
- also handles languages that require agreement in the word for
- ``processed''.
- 
  A similar problem appears at the level of sentence structure with this
  code:
  
--- 2965,2994 ----
  Here is an example of what not to do:
  
  @example
! printf ("%s is full", capacity > 5000000 ? "disk" : "floppy disk");
  @end example
  
! If you apply gettext to all strings, like this,
  
  @example
! printf (gettext ("%s is full"),
!         capacity > 5000000 ? gettext ("disk") : gettext ("floppy disk"));
  @end example
  
  @noindent
! the translator will hardly know that "disk" and "floppy disk" are meant to
! be substituted in the other string.  Worse, in some languages (like French)
! the construction will not work: the translation of the word "full" depends
! on the gender of the first part of the sentence, which happens to differ
! between "disk" and "floppy disk".
  
! Complete sentences can be translated without problems:
  
  @example
! printf (capacity > 5000000 ? gettext ("disk is full")
!         : gettext ("floppy disk is full"));
  @end example
  
  A similar problem appears at the level of sentence structure with this
  code:
  
***************
*** 3024,3029 ****
--- 3010,3052 ----
          : "#  Implicit rule search has not been done.\n");
  @end example
  
+ Another example is this one:
+ 
+ @example
+ printf ("%d file%s processed", nfiles,
+         nfiles != 1 ? "s" : "");
+ @end example
+ 
+ @noindent
+ The problem with this example is that it assumes that plurals are made
+ by adding `s'.  If you apply gettext to the format string, like this,
+ 
+ @example
+ printf (gettext ("%d file%s processed"), nfiles,
+         nfiles != 1 ? "s" : "");
+ @end example
+ 
+ @noindent
+ the message can use different words, but it will still be forced to use
+ `s' for the plural.  Here is a better way, with gettext being applied to
+ the two strings independently:
+ 
+ @example
+ printf ((nfiles != 1 ? gettext ("%d files processed")
+          : gettext ("%d file processed")),
+         nfiles);
+ @end example
+ 
+ @noindent
+ But this still doesn't work for languages like Polish, which has three
+ plural forms: one for nfiles == 1, one for nfiles == 2, 3, 4, 22, 23, 24, ...
+ and one for the rest.  The GNU @code{ngettext} function solves this problem:
+ 
+ @example
+ printf (ngettext ("%d file processed", "%d files processed", nfiles),
+         nfiles);
+ @end example
+ 
  
  @node Character Set
  @section Character Set




