Re: [Groff] Bullets in manual pages and -K groff option


From: Alexander E. Patrakov
Subject: Re: [Groff] Bullets in manual pages and -K groff option
Date: Thu, 26 Jan 2006 21:35:57 +0500
User-agent: Debian Thunderbird 1.0.2 (X11/20051002)

Bruno Haible wrote:

>> As for the "iconv" program from glibc, the situation is worse. I have
>> prepared a patch against Glibc-2.3.6 (attached) that transliterates the
>> offending characters produced by Groff into their ASCII equivalents if
>> there is no other suitable fallback. You can try it without
>> rebuilding glibc by applying it to the installed copy of the
>> "translit_neutral" file (in /usr/share/i18n/locales) and rebuilding all
>> locales with localedef. The patch works in all locales except "C" (see
>> below) ... Is this patch the right solution?
>
> The BULLET, PRIME and DOT/ELLIPSIS parts are probably acceptable.
>
> The ACUTE ACCENT part looks wrong.

But libiconv also transliterates it to "'" :)

>  1. An acute accent is not a quoting character. Anyone using an acute
>     accent for quoting is abusing this character.

Agreed, Groff should be fixed. It should probably also use Unicode bullets (not middle dots) for bullets.

>  2. U+0027 is an apostrophe, a small vertical line, that doesn't change
>     when mirrored left<->right.

That's still better than a question mark or nothing.

> When you submit a patch for "translit_neutral", you also need to make
> the corresponding changes to locale/C-translit.h.in.

Corrected patch attached. The parts that you disagreed with are commented out.

> I would split this into two different patches, simply to increase the
> chances of having at least one of them accepted. - As I said above,
> transliterating ACUTE ACCENT to APOSTROPHE is simply wrong.

I will split the ACUTE ACCENT part into a separate patch as soon as you comment on the behaviour of libiconv in this case.

The revised text of the bug report:

Subject: Transliterate quotes and bullets in all locales.
Component: localedata
Description:
The iconv function from libiconv performs some useful transliterations (e.g., replacing quote-like characters with their ASCII equivalents and the middle dot with an ASCII dot) in all locales. The iconv implementation in Glibc doesn't always do this. This deficiency is going to hurt future Groff users, as described in [link to this thread]. Attached is a patch that implements the needed transliteration rules. The ACUTE ACCENT part has been commented out because Bruno Haible thinks it should not be done this way, but I disagree with him because APOSTROPHE is better than a question mark as a replacement for ACUTE ACCENT. Implement this as you wish, but note that Groff does abuse ACUTE ACCENT, and without the commented-out parts iconv replaces it with a question mark in some locales.
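
To make the disagreement concrete, here is a rough Python sketch of the effect on ASCII output. It does not call iconv or glibc. The table is a hand-copied subset of the single-character rules from the patch, and the names `ASCII_FALLBACKS` and `to_ascii` are made up for this illustration. The "?" replacement mimics what the report describes for characters that have no rule.

```python
# Hypothetical subset of the ASCII rules proposed in the patch
# (illustration only; this is not glibc's actual table).
ASCII_FALLBACKS = {
    "\u00b7": ".",    # MIDDLE DOT
    "\u2032": "'",    # PRIME
    "\u2219": "o",    # BULLET OPERATOR
    "\u22c5": ".",    # DOT OPERATOR
    "\u22ef": "...",  # MIDLINE HORIZONTAL ELLIPSIS
}


def to_ascii(text, rules):
    """Replace each non-ASCII character by its rule, or by '?' if none exists."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            out.append(rules.get(ch, "?"))
    return "".join(out)


# A bullet, a word "quoted" with ACUTE ACCENT, and a midline ellipsis.
sample = "\u2219 item \u00b4quoted\u00b4 \u22ef"

without_rules = to_ascii(sample, {})
with_rules = to_ascii(sample, ASCII_FALLBACKS)
# Same table plus the commented-out ACUTE ACCENT rule.
with_acute = to_ascii(sample, {**ASCII_FALLBACKS, "\u00b4": "'"})
```

With the ACUTE ACCENT rule left out, the bullet and ellipsis come through but the accents still turn into question marks. That is the case the last sentence of the report warns about.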

--
Alexander E. Patrakov
Submitted By: Alexander E. Patrakov
Date: 2006-01-26
Initial Package Version: 2.3.6
Upstream Status: Discussing
Origin: Alexander E. Patrakov
Description: Transliterates some characters (e.g., ones created by groff -Tutf8)
into their ASCII approximations.
--- glibc-2.3.6/locale/C-translit.h.in  2002-04-20 13:16:46.000000000 +0600
+++ glibc-2.3.6/locale/C-translit.h.in  2006-01-26 19:50:35.000000000 +0500
@@ -25,7 +25,9 @@
 "\x00ab"       "<<"    /* <U00AB> LEFT-POINTING DOUBLE ANGLE QUOTATION MARK */
 "\x00ad"       "-"     /* <U00AD> SOFT HYPHEN */
 "\x00ae"       "(R)"   /* <U00AE> REGISTERED SIGN */
+/* "\x00b4"    "'" */  /* <U00B4> ACUTE ACCENT */
 "\x00b5"       "u"     /* <U00B5> MICRO SIGN */
+"\x00b7"       "."     /* <U00B7> MIDDLE DOT */
 "\x00b8"       ","     /* <U00B8> CEDILLA */
 "\x00bb"       ">>"    /* <U00BB> RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK */
 "\x00bc"       " 1/4 " /* <U00BC> VULGAR FRACTION ONE QUARTER */
@@ -52,9 +54,12 @@
 "\x01f1"       "DZ"    /* <U01F1> LATIN CAPITAL LETTER DZ */
 "\x01f2"       "Dz"    /* <U01F2> LATIN CAPITAL LETTER D WITH SMALL LETTER Z */
 "\x01f3"       "dz"    /* <U01F3> LATIN SMALL LETTER DZ */
+"\x02b9"       "'"     /* <U02B9> MODIFIER LETTER PRIME */
+"\x02ba"       "''"    /* <U02BA> MODIFIER LETTER DOUBLE PRIME */
 "\x02bc"       "'"     /* <U02BC> MODIFIER LETTER APOSTROPHE */
 "\x02c6"       "^"     /* <U02C6> MODIFIER LETTER CIRCUMFLEX ACCENT */
 "\x02c8"       "'"     /* <U02C8> MODIFIER LETTER VERTICAL LINE */
+/* "\x02ca"    "'" */  /* <U02CA> MODIFIER LETTER ACUTE ACCENT */
 "\x02cb"       "`"     /* <U02CB> MODIFIER LETTER GRAVE ACCENT */
 "\x02cd"       "_"     /* <U02CD> MODIFIER LETTER LOW MACRON */
 "\x02d0"       ":"     /* <U02D0> MODIFIER LETTER TRIANGULAR COLON */
@@ -88,6 +93,9 @@
 "\x2025"       ".."    /* <U2025> TWO DOT LEADER */
 "\x2026"       "..."   /* <U2026> HORIZONTAL ELLIPSIS */
 "\x202f"       " "     /* <U202F> NARROW NO-BREAK SPACE */
+"\x2032"       "'"     /* <U2032> PRIME */
+"\x2033"       "''"    /* <U2033> DOUBLE PRIME */
+"\x2034"       "'''"   /* <U2034> TRIPLE PRIME */
 "\x2035"       "`"     /* <U2035> REVERSED PRIME */
 "\x2036"       "``"    /* <U2036> REVERSED DOUBLE PRIME */
 "\x2037"       "```"   /* <U2037> REVERSED TRIPLE PRIME */
@@ -199,6 +207,7 @@
 "\x2215"       "/"     /* <U2215> DIVISION SLASH */
 "\x2216"       "\\"    /* <U2216> SET MINUS */
 "\x2217"       "*"     /* <U2217> ASTERISK OPERATOR */
+"\x2219"       "o"     /* <U2219> BULLET OPERATOR */
 "\x2223"       "|"     /* <U2223> DIVIDES */
 "\x2236"       ":"     /* <U2236> RATIO */
 "\x223c"       "~"     /* <U223C> TILDE OPERATOR */
@@ -206,8 +215,10 @@
 "\x2265"       ">="    /* <U2265> GREATER-THAN OR EQUAL TO */
 "\x226a"       "<<"    /* <U226A> MUCH LESS-THAN */
 "\x226b"       ">>"    /* <U226B> MUCH GREATER-THAN */
+"\x22c5"       "."     /* <U22C5> DOT OPERATOR */
 "\x22d8"       "<<<"   /* <U22D8> VERY MUCH LESS-THAN */
 "\x22d9"       ">>>"   /* <U22D9> VERY MUCH GREATER-THAN */
+"\x22ef"       "..."   /* <U22EF> MIDLINE HORIZONTAL ELLIPSIS */
 "\x2400"       "NUL"   /* <U2400> SYMBOL FOR NULL */
 "\x2401"       "SOH"   /* <U2401> SYMBOL FOR START OF HEADING */
 "\x2402"       "STX"   /* <U2402> SYMBOL FOR START OF TEXT */
--- glibc-2.3.6/localedata/locales/translit_neutral     2002-04-20 13:14:27.000000000 +0600
+++ glibc-2.3.6/localedata/locales/translit_neutral     2006-01-26 19:39:01.000000000 +0500
@@ -26,6 +26,10 @@
 <U00AD> <U002D>
 % REGISTERED SIGN
 <U00AE> "<U0028><U0052><U0029>"
+% ACUTE ACCENT
+% <U00B4> <U0027>
+% MIDDLE DOT
+<U00B7> <U002E>
 % CEDILLA
 <U00B8> <U002C>
 % RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
@@ -39,9 +43,9 @@
 % LATIN SMALL LETTER AE
 <U00E6> "<U0061><U0065>"
 % MODIFIER LETTER PRIME
-<U02B9> <U2032>;<U00B4>
+<U02B9> <U2032>;<U00B4>;<U0027>
 % MODIFIER LETTER DOUBLE PRIME
-<U02BA> <U2033>;"<U00B4><U00B4>"
+<U02BA> <U2033>;"<U00B4><U00B4>";"<U0027><U0027>"
 % MODIFIER LETTER TURNED COMMA
 <U02BB> <U2018>
 % MODIFIER LETTER APOSTROPHE
@@ -56,6 +60,7 @@
 <U02C9> <U00AF>
 % MODIFIER LETTER ACUTE ACCENT
 <U02CA> <U00B4>
+% <U02CA> <U00B4>;<U0027>
 % MODIFIER LETTER GRAVE ACCENT
 <U02CB> <U0060>
 % MODIFIER LETTER LOW MACRON
@@ -101,11 +106,11 @@
 % NARROW NO-BREAK SPACE
 <U202F> <U00A0>;<U0020>
 % PRIME
-<U2032> <U00B4>
+<U2032> <U00B4>;<U0027>
 % DOUBLE PRIME
-<U2033> "<U2032><U2032>";"<U00B4><U00B4>"
+<U2033> "<U2032><U2032>";"<U00B4><U00B4>";"<U0027><U0027>"
 % TRIPLE PRIME
-<U2034> "<U2032><U2032><U2032>";"<U00B4><U00B4><U00B4>"
+<U2034> "<U2032><U2032><U2032>";"<U00B4><U00B4><U00B4>";"<U0027><U0027><U0027>"
 % REVERSED PRIME
 <U2035> <U0060>
 % REVERSED DOUBLE PRIME
@@ -155,7 +160,7 @@
 % ASTERISK OPERATOR
 <U2217> <U002A>
 % BULLET OPERATOR
-<U2219> <U2022>;<U00B7>
+<U2219> <U2022>;<U00B7>;<U006F>
 % DIVIDES
 <U2223> <U007C>
 % RATIO
@@ -171,13 +176,13 @@
 % MUCH GREATER-THAN
 <U226B> "<U003E><U003E>"
 % DOT OPERATOR
-<U22C5> <U00B7>
+<U22C5> <U00B7>;<U002E>
 % VERY MUCH LESS-THAN
 <U22D8> "<U003C><U003C><U003C>"
 % VERY MUCH GREATER-THAN
 <U22D9> "<U003E><U003E><U003E>"
 % MIDLINE HORIZONTAL ELLIPSIS
-<U22EF> "<U00B7><U00B7><U00B7>"
+<U22EF> "<U00B7><U00B7><U00B7>";"<U002E><U002E><U002E>"
 % SYMBOL FOR NULL
 <U2400> "<U004E><U0055><U004C>"
 % SYMBOL FOR START OF HEADING
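
The translit_neutral hunks above mostly append an ASCII string to an existing semicolon-separated list of alternatives. As I understand that format, the alternatives are tried from left to right and the first one representable in the target charset wins. The following Python sketch models that lookup under this assumption. It is not how glibc implements it; `RULES`, `encodable` and `transliterate` are names invented here, and Python's codecs stand in for the locale charsets.

```python
# Alternatives copied from the patched translit_neutral entries,
# in the order they appear (leftmost is tried first).
RULES = {
    "\u2219": ["\u2022", "\u00b7", "o"],       # BULLET OPERATOR
    "\u2032": ["\u00b4", "'"],                  # PRIME
    "\u22c5": ["\u00b7", "."],                  # DOT OPERATOR
    "\u22ef": ["\u00b7\u00b7\u00b7", "..."],    # MIDLINE HORIZONTAL ELLIPSIS
}


def encodable(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, charset):
    """Keep representable characters; otherwise use the first alternative that fits."""
    out = []
    for ch in text:
        if encodable(ch, charset):
            out.append(ch)
            continue
        for candidate in RULES.get(ch, []):
            if encodable(candidate, charset):
                out.append(candidate)
                break
        else:
            # No rule, or no alternative fits: fall back to a question mark.
            out.append("?")
    return "".join(out)
```

For example, `transliterate("\u2219", "latin-1")` stops at MIDDLE DOT because Latin-1 contains it, while `transliterate("\u2219", "ascii")` only succeeds thanks to the appended "o". Without that last alternative, the ASCII case would fall through to "?".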
