[bug-gettext] fewer open() calls done by gettext()
From: Bruno Haible
Subject: [bug-gettext] fewer open() calls done by gettext()
Date: Tue, 17 Jan 2012 00:17:26 +0100
User-agent: KMail/4.7.4 (Linux/3.1.0-1.2-desktop; KDE/4.7.4; x86_64; ; )

Hi Ulrich,

Since the beginning, gettext()'s lookup of message catalogs has
searched the paths

  $LOCALEDIR/$ll_$CC.$CHARSET/LC_MESSAGES/$domain.mo
  $LOCALEDIR/$ll_$CC/LC_MESSAGES/$domain.mo
  $LOCALEDIR/$ll.$CHARSET/LC_MESSAGES/$domain.mo
  $LOCALEDIR/$ll/LC_MESSAGES/$domain.mo

if the locale is specified as $ll_$CC.$CHARSET.
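
For illustration, the current lookup order can be modelled as follows. This is only a simplified sketch, not the code in intl/l10nflist.c: the names normalize_codeset, current_candidates, MAX_CANDIDATES and PATH_LEN are hypothetical, and the codeset normalization is assumed to do nothing more than drop non-alphanumeric characters and lowercase the rest.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_CANDIDATES 6
#define PATH_LEN 256

/* Hypothetical helper: simplified codeset normalization that keeps only
   alphanumeric characters and lowercases them ("UTF-8" -> "utf8").  */
static void
normalize_codeset (const char *in, char *out, size_t outsize)
{
  size_t j = 0;
  for (size_t i = 0; in[i] != '\0' && j + 1 < outsize; i++)
    if (isalnum ((unsigned char) in[i]))
      out[j++] = (char) tolower ((unsigned char) in[i]);
  out[j] = '\0';
}

/* Hypothetical model of the current search order.  Fills PATHS with the
   candidate file names and returns how many were generated.  */
static int
current_candidates (const char *localedir, const char *ll, const char *cc,
                    const char *charset, const char *domain,
                    char paths[MAX_CANDIDATES][PATH_LEN])
{
  char norm[64];
  char lang[2][64];
  int n = 0;

  normalize_codeset (charset, norm, sizeof norm);
  snprintf (lang[0], sizeof lang[0], "%s_%s", ll, cc);  /* e.g. "fr_FR" */
  snprintf (lang[1], sizeof lang[1], "%s", ll);         /* e.g. "fr" */

  for (int i = 0; i < 2; i++)
    {
      /* Charset as given by the user.  */
      snprintf (paths[n++], PATH_LEN, "%s/%s.%s/LC_MESSAGES/%s.mo",
                localedir, lang[i], charset, domain);
      /* Normalized charset, only if it differs from the given one.  */
      if (strcmp (norm, charset) != 0)
        snprintf (paths[n++], PATH_LEN, "%s/%s.%s/LC_MESSAGES/%s.mo",
                  localedir, lang[i], norm, domain);
      /* No charset at all.  */
      snprintf (paths[n++], PATH_LEN, "%s/%s/LC_MESSAGES/%s.mo",
                localedir, lang[i], domain);
    }
  assert (n <= MAX_CANDIDATES);
  return n;
}
```

Under these assumptions, current_candidates ("/tmp/.", "fr", "FR", "UTF-8", "prog", paths) produces six names, because the charset is tried both as "UTF-8" and as "utf8".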
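
In a typical program (attached below), this leads to 6 system calls,
because each path containing $CHARSET is tried both with the charset as
given (UTF-8) and in normalized form (utf8). The .mo file is usually
only found at the last of these 6 calls:
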
$ strace ./prog 2>&1 | grep ^open | grep prog.mo
open("/tmp/./fr_FR.UTF-8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr_FR.utf8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr_FR/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr.UTF-8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr.utf8/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)

I suggest reducing this to 2 calls:

$ strace ./prog 2>&1 | grep ^open | grep prog.mo
open("/tmp/./fr_FR/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
open("/tmp/./fr/LC_MESSAGES/prog.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
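
As a rough sketch of what the reduced lookup amounts to: the charset part of the locale name is simply ignored when building the candidate file names. The function name proposed_candidates and the constant PATH_LEN are hypothetical; this is not the actual patch to intl/l10nflist.c.

```c
#include <assert.h>
#include <stdio.h>

#define PATH_LEN 256

/* Hypothetical model of the proposed search order: only $ll_$CC and $ll
   are tried, regardless of the charset of the locale.  Fills PATHS and
   returns the number of candidates (always 2).  */
static int
proposed_candidates (const char *localedir, const char *ll, const char *cc,
                     const char *domain, char paths[2][PATH_LEN])
{
  int len;

  len = snprintf (paths[0], PATH_LEN, "%s/%s_%s/LC_MESSAGES/%s.mo",
                  localedir, ll, cc, domain);
  assert (len >= 0 && len < PATH_LEN);  /* no truncation */

  len = snprintf (paths[1], PATH_LEN, "%s/%s/LC_MESSAGES/%s.mo",
                  localedir, ll, domain);
  assert (len >= 0 && len < PATH_LEN);  /* no truncation */

  return 2;
}
```

With localedir "/tmp/.", ll "fr", cc "FR" and domain "prog", this yields exactly the two file names shown in the strace output above.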
Rationale:
The use case for storing different .mo files in
  fr/LC_MESSAGES/prog.mo and fr.UTF-8/LC_MESSAGES/prog.mo
or
  fr/LC_MESSAGES/prog.mo and fr.ISO-8859-1/LC_MESSAGES/prog.mo
or
  fr.UTF-8/LC_MESSAGES/prog.mo and fr.ISO-8859-1/LC_MESSAGES/prog.mo
is when translators want to use different kinds of characters
(quotation marks and the like), i.e. have one PO file for the UTF-8
locale and a different PO file for the more restricted character set.
Another case was when Japanese users did not trust the conversion
between JISX character sets and Unicode and therefore wanted to
maintain a separate PO file for EUC-JP.

But
1. Translators never did this.
2. In the future, translators will need it even less than in the past.
   Nowadays most PO files (even Japanese ones) are submitted in UTF-8
   encoding, and most users are in UTF-8 locales. It will therefore
   no longer make sense to have a PO file specialized for a
   non-Unicode locale charset.

Do you think this optimization is worth doing?
If this is OK with you, I can prepare the patch for intl/l10nflist.c
(of course, taking care not to modify the behaviour of locale/findlocale.c).

Bruno
How to reproduce:
$ gcc -Wall prog.c -o prog
$ strace ./prog 2>&1 | grep ^open | grep prog.mo
============================== prog.c ================================
#include <libintl.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

int main ()
{
  int n = 2;

  setenv ("LC_ALL", "fr_FR.UTF-8", 1);
  if (setlocale (LC_ALL, "") == NULL)
    /* Couldn't set locale. */
    exit (77);

  textdomain ("prog");
  bindtextdomain ("prog", ".");

  printf (gettext ("'Your command, please?', asked the waiter."));
  printf ("\n");
  printf (ngettext ("a piece of cake", "%d pieces of cake", n), n);
  printf ("\n");
  printf (gettext ("%s is replaced by %s."), "FF", "EUR");
  printf ("\n");
  exit (0);
}
======================================================================