Re: [sharutils] unwieldy msgids, unnecessary reformatting
From: Bruce Korb
Subject: Re: [sharutils] unwieldy msgids, unnecessary reformatting
Date: Sun, 13 Jan 2013 14:07:15 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130105 Thunderbird/17.0.2
Hi Benno,
On 01/13/13 13:05, Benno Schulenberg wrote:
>
> (I've reincluded the list in the CC.)
I omitted the entire list because I'm expecting a somewhat boring
discussion. I'd be more interested in translator feedback, because
you-all are more directly affected.
> On Thu, Jan 10, 2013, at 20:47, Bruce Korb wrote:
>> With short fragments, it is easier to translate, but these short
>> fragments get woven into usage text in ways that are disparaged
>> by the docs I've read for i18n text.
The "short fragments" are the long option names and the short
(~40 character) description, e.g. here is the real source that
describes the "level-of-compression" option:
> flag = {
> name = level-of-compression;
> value = g;
> arg-type = number;
> arg-name = LEVEL;
> arg-range = '1->9';
> arg-default = 9;
> descrip = 'pass @file{LEVEL} for compression';
> doc = <<- _EODoc_
> Some compression programs allow for a level of compression. The
> default is @code{9}, but this option allows you to specify something
> else. This value is used by @command{gzip}, @command{bzip2} and
> @command{xz}, but not @command{compress}.
> _EODoc_;
> };
The option line in the long usage text appears as:
-g, --level-of-compression=num pass LEVEL for compression
That string does not appear anywhere in the source.
It gets pulled together and formatted from the "g", "level-of-compression",
"number", "1->9" and "pass @file{LEVEL} for compression" strings.
So in order to make something that is translatable, I build
a program, have it emit the help, capture that help text and
compile it into the final program. Those strings plus the "doc"
string also show up in man pages and texi docs.
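To illustrate why that usage line exists nowhere in the source, here is a minimal sketch of weaving one together from its pieces. This is not the AutoOpts code: format_option_line is a hypothetical helper, the "num" abbreviation and the markup-free description are passed in by hand, and the column spacing is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of AutoOpts): glue the flag character,
 * long option name, argument placeholder and short description into
 * one usage line.  Returns what snprintf returns.
 */
int format_option_line(char *buf, size_t size, char flag,
                       const char *name, const char *arg,
                       const char *descrip)
{
    assert(buf != NULL && size > 0);
    /* Assumed layout: two leading spaces, one space before the text. */
    return snprintf(buf, size, "  -%c, --%s=%s %s",
                    flag, name, arg, descrip);
}
```

Called with 'g', "level-of-compression", "num" and "pass LEVEL for compression", this produces the line shown above. A translator who sees only the fragments has no way to know how they will line up once they are glued together.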
>> Specifically, little bits
>> of the usage are emitted with the expectation that a consistent
>> amount of horizontal space is used. That works IFF the source
>> language is the display language.
>
> When a certain indentation needs to be maintained, this is the
> responsibility of the translator. Half of the time I use a slightly
> different indentation than the original, and use it consistently.
>
>> My solution for this woven text problem is to build a version
>> without a combined usage text, print the help with bit-at-a-time
>> text and suck that output into a combined usage string and
>> rebuild. In the rebuild, the short strings will never be
>> used.
>
> In the source code I see for example:
>
> static char const shar_opt_strs[10449] =" [[[enormous string]]] ";
That is intermediary source. I obviously do not hand edit a 10K string.
> In my opinion this is madness... If you want to add a space
> or a word somewhere, you have to figure out and change fifty
> indexes by hand... !
At the top of that file, you will see:
> /* -*- buffer-read-only: t -*- vi: set ro:
> *
> * DO NOT EDIT THIS FILE (shar-opts.c)
> *
> * It has been AutoGen-ed January 11, 2013 at 11:39:24 AM by AutoGen 5.17.2pre7
so if you want to add a space, do not do it in that file.
> Is it AutoOpts that requires that the help text be provided as a
> single huge character array?
AutoOpts only requires the strings associated with each option and
the program as a whole. On the theory that gluing all these strings
together would be untranslatable and/or sometimes not yield an
aesthetically pleasing help string, I provided a way of overriding
the computation of the usage text by providing _as an alternative_
the entire usage text as a single string. What I am proposing here
is emitting this long usage a paragraph at a time.
>> Since there is only one source for both texts, getting this to
>> work depends upon coming up with a paragraph splitting algorithm
>> that would split out an exactly matching paragraph. A desirable
>> goal, but might not be easy to do. Please suggest an algorithm
>> while I try to puzzle one out, too. (e.g. separate on every
>> double newline and every line starting with white space.
>> Does that yield something more usable?)
>
> If the help text needs to be a single huge string, why not "add"
> (concatenate) many small gettexttized strings?
The pieces of the help text are derived from too many sources.
Gluing together little strings is strongly discouraged for
translatable text. Therefore, I am suggesting the splitting up
of the monster string according to a well defined algorithm.
viz. start a new "paragraph" whenever a non-empty line is
preceded by two line breaks or a non-empty line starts with
a few space characters. I *think* that yields something wieldy.
I could also split them one string per line. That is likely
somewhat easier for me, but seems like it would make the
translation task a bit harder. e.g. there would be no guarantee
that every line would be unique and the same line of text might
translate differently in different contexts. I do think splitting
on "paragraphs" would make the translation effort easier, but I
would take whatever suggestion you make.
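The paragraph rule described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from AutoGen or sharutils: split_paragraphs is a hypothetical name, "a few space characters" is simplified to "any leading space or tab", and the start of the text is treated as if it followed a blank line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: record the offset of every line that starts a
 * new "paragraph".  A paragraph starts at a non-empty line that either
 * follows an empty line (two line breaks in a row) or begins with
 * white space.  Returns the number of paragraphs found; at most
 * max_starts offsets are stored.
 */
size_t split_paragraphs(const char *text, size_t *starts, size_t max_starts)
{
    size_t count = 0;
    int prev_blank = 1;        /* start of text counts as a break */
    const char *line = text;

    assert(text != NULL && starts != NULL);

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = (eol != NULL) ? (size_t)(eol - line) : strlen(line);
        int blank = (len == 0);

        if (!blank && (prev_blank || line[0] == ' ' || line[0] == '\t')) {
            if (count < max_starts)
                starts[count] = (size_t)(line - text);
            count++;
        }
        prev_blank = blank;
        if (eol == NULL)
            break;             /* last line had no trailing newline */
        line = eol + 1;
    }
    return count;
}
```

With this rule a heading line such as "Options:" becomes its own string and each indented option line becomes another. Whether that granularity is the right one for translators is exactly the open question.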
> hugestring = _("shar (GNU sharutils) 4.13.3\n")
> + _("Copyright (C) 1994-2013 Free Software Foundation, Inc., all rights reserved.\n")
> + _("This is free software. It is licensed for use, modification and\n")
> + ...
>
> I have no idea how to actually do this, but it should be possible,
> and then let the program itself work out what the lengths of all
> these substrings are (if you actually need the indexes).
The indexes are a relatively unimportant implementation detail.
In order to produce libraries that minimize the number of fixups
required at load time, I produced some functions that assemble
massive text strings and #define-d values that reference that huge
table. I could also emit static global strings that go by the
name used in the #define. That eliminates all the offset stuff,
but then the link/loader has more fixup work to do.
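For readers unfamiliar with the trick, here is a minimal sketch of the "one big table plus #define-d offsets" layout. All names (demo_opt_strs, the DEMO_* macros, demo_str) are made up for illustration and the offsets were counted by hand; the generated shar-opts.c uses its own names and computes the offsets mechanically.

```c
#include <assert.h>
#include <stddef.h>

/* One NUL-separated table holding every string (hypothetical contents). */
static char const demo_opt_strs[] =
    "level-of-compression\0"          /* offset 0, 20 chars plus NUL */
    "pass LEVEL for compression";     /* offset 21 */

/* Hand-counted offsets into the table; a generator would emit these. */
#define DEMO_NAME_OFF  0
#define DEMO_DESC_OFF  21

/* Hypothetical accessor: turn an offset into a pointer into the table. */
static char const *demo_str(size_t off)
{
    assert(off < sizeof demo_opt_strs);
    return demo_opt_strs + off;
}
```

The table itself contains no pointers, so in position-independent code it typically needs no per-string fixup at load time. The alternative, an array of `char const *` pointing at separately named strings, generally costs one relocation per pointer. The price of the table is the offset bookkeeping, which is why it is only practical in generated code.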