Re: [Groff] Latin-2 woes...
From: Ted Harding
Subject: Re: [Groff] Latin-2 woes...
Date: Thu, 12 Oct 2000 10:10:00 +0100 (BST)
On 11-Oct-00 Lukasz Wiechec wrote:
>> Nevertheless, the general solution is to write a proper input encoding
>> file to map Latin-2 to glyph names; something like
>>
>> .char £ \[/L]
>> .char ³ \[/l]
>> ...
>
> I don't quite follow. '.char' isn't a groff macro, is it ? (4 letters
> ?)
1. '.char' is indeed part of groff; strictly speaking it is a built-in
request rather than a macro (groff names can be any length),
but it will not work in "compatibility" mode since traditional
troff names can be at most two letters long.
.char \[name] string
defines a character whose name is "name" and which is constructed
by formatting "string". You then use it by putting '\[name]' where
you want it in the input text.
If a character with name "name" already exists, then it is replaced
by the new definition. Any character can be treated in this way:
for instance, you can redefine the ordinary English character "a",
as I do for Cyrillic:
.de Cyrillic
.ft AntCy
.ftr Cy AntCy
.char \(yu \N'192'
.char a \N'193'
.............
2. The groff command '.rchar \[name]' removes the definition. This
gives rise to one of two situations.
a) The character-name "name" can be found in the font file for
the current font or a currently-searchable Special font. In
this case the character with that name will be used as though
the ".char" definition had never been given.
b) Otherwise, the character-name "name" will disappear altogether.
Note in particular that if you give one definition
'.char \[name] ... ' and then another definition
'.char \[name] ... ', followed later by '.rchar \[name]', then
the first definition is gone too.
Character definitions do not "stack", and if you want the first
one again then you must redefine it.
Therefore, for instance, if (using my .Cyrillic which defines Cyrillic
characters as above, and my ./Cyrillic which undoes all the definitions)
I write
.Cyrillic
Vladimir Putin
./Cyrillic
Vladimir Putin
the first "Vladimir Putin" will print in Cyrillic, and the second will
print in English, since all these characters can still be found in the
standard fonts. On the other hand,
.Cyrillic
\[Ch]e\[ch]ni\[ya]
./Cyrillic
\[Ch]e\[ch]ni\[ya]
will first print the Cyrillic version of the name which is written
"Chechniya" in English, and then print "eni" in English, since \[Ch],
\[ch] and \[ya] have been removed and are not in the standard font files,
while e, n and i can still be found.
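The two cases under point 2 can be tried with a few lines of input. This is only a sketch: "demo" is a hypothetical character name, chosen on the assumption that no font file defines it, and the redefinition of "a" is purely for illustration.

```groff
.\" Case b): "demo" is a hypothetical name found in no font file.
.char \[demo] first
.char \[demo] second
\[demo]
.\" Formats "second": the later definition replaced the earlier one.
.rchar \[demo]
\[demo]
.\" "demo" is now undefined, so nothing is printed for it
.\" (groff normally warns about the missing character).
.\"
.\" Case a): "a" exists in the current font.
.char a A
banana
.\" Formats "bAnAnA" while the definition is active.
.rchar a
banana
.\" Formats "banana" again, using the glyph from the font.
```

Note that the first definition of \[demo] does not come back after '.rchar'; it would have to be given again.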
Werner's suggestion amounts to the following.
In the standard PostScript fonts, there are characters with PostScript
names "Lslash" and "lslash": the glyphs for these form part of standard
PostScript and so do not need separate definition.
If you look in one of the devps font files (say .../groff/font/devps/TR)
you will find
....
/l 278,683 2 0234 lslash
....
/L 611,662 2 0237 Lslash
....
so these characters have groff names \[/l] and \[/L] already defined
in the font files: therefore ".char anything \[/L]" means that whenever
a character named "anything" occurs in the input stream, the string
"\[/L]" is used instead. Therefore this will print as intended.
In the Latin-1 encoding (which is what groff recognises), the "£" sign
occurs in the same position as "Lslash" in Latin-2. Therefore, when
groff sees the input-byte corresponding to "£" in Latin-1, it consults
the ".char" definition and replaces it with "\[/L]". This does not involve
any builtin "knowledge" of Latin-2 encoding: it is the user who has
supplied this in the ".char" definition:
.char £ \[/L]
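As a sketch, the two definitions quoted at the top could be kept in a small file of their own and read in before the text. The file name "mylatin2.roff" is hypothetical, and the file is assumed to be saved as 8-bit text, so that each character shown below is a single byte (0xA3 and 0xB3 respectively).

```groff
.\" mylatin2.roff (hypothetical file name), saved as 8-bit text.
.\" Byte 0xA3: pound sign in Latin-1, Lslash in Latin-2.
.char £ \[/L]
.\" Byte 0xB3: superscript three in Latin-1, lslash in Latin-2.
.char ³ \[/l]
```

A Latin-2 document would then begin with '.so mylatin2.roff'. Further characters are handled in the same way, one '.char' line per input byte, provided a glyph or a construction for each is available.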
Unfortunately for Poles and others, the ready-made glyphs in the
standard PostScript fonts do not cover all the needed possibilities,
and you will then have to define your characters with strings which
cause them to be directly constructed. For this, fortunately the
standard PostScript fonts contain glyphs for a variety of accents.
For instance, the ms macros allow a character to be used as an
"accent-under" for the preceding character, in particular "ogonek"
(PostScript name) for which the groff name is "\[ho]" ("hook"):
ho 333,0,165 0 0230 ogonek
So you can define the string \*[ogonek] which performs the
correct placement of this accent under the preceding character by
.acc*under-def ogonek \[ho]
.char \[o,] o\*[ogonek]
and then use \[o,] whenever you need "o-ogonek".
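Put together as input, and assuming the ms macros are loaded (for example with "groff -ms -Tps") so that '.acc*under-def' is available, this might look like the sketch below. The name \[a,] is an assumption added here by analogy with \[o,]; neither name is predefined.

```groff
.\" Sketch only: needs the ms macros for .acc*under-def.
.acc*under-def ogonek \[ho]
.\" \[o,] as above; \[a,] is a hypothetical name built the same way.
.char \[o,] o\*[ogonek]
.char \[a,] a\*[ogonek]
.LP
An o-ogonek: \[o,], and an a-ogonek: \[a,].
```

How well the accent sits under each letter depends on the font, so the result is worth checking by eye.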
I had been meaning to pick up on this topic earlier, giving a
complete repertoire of Latin-2 translations, but I have been too
taken up with other things recently to finish it. However, I will
do it later.
Meanwhile, I hope the above at least points the way.
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <address@hidden>
Fax-to-email: +44 (0)870 284 7749
Date: 12-Oct-00 Time: 10:10:00
------------------------------ XFMail ------------------------------