groff-commit
[groff] 25/25: [troff]: Implement new `phw` request.


From: G. Branden Robinson
Subject: [groff] 25/25: [troff]: Implement new `phw` request.
Date: Sat, 4 Nov 2023 01:02:58 -0400 (EDT)

gbranden pushed a commit to branch master
in repository groff.

commit 0b40885e71810ac068ae73b4448cbdd8a64dd777
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Fri Nov 3 20:32:48 2023 -0500

    [troff]: Implement new `phw` request.
    
    * src/roff/troff/env.cpp (print_hyphenation_exceptions): Add.
      (init_hyphen_requests): Wire up `phw` request name to
      `print_hyphenation_exceptions()`.
    
    * doc/groff.texi (Manipulating Hyphenation, Debugging):
    * man/groff.7.man (Request short reference):
    * man/groff_diff.7.man (New requests):
    * NEWS: Document it.
    
    Inspired by a debugging process (that ultimately involved input
    character encoding confusion) on the groff mailing list, raised by
    Walter Alejandro Iglesias.  See
    <https://lists.gnu.org/archive/html/groff/2023-09/msg00032.html> and
    <https://lists.gnu.org/archive/html/groff/2023-10/msg00008.html> and
    follow-ups.  In my opinion it should have been easier to ask the
    formatter where it thought a hyphenation exception's hyphenation points
    were.
    
    Illustration:
    
    $ printf '.hw foo-bar\n.phw\n' | ./build/test-groff 2>&1 | grep -C3 foo
    -cohen  *
    micro-organ-ism         *
    -duane  *
    foo-bar
    -engle  *
    micro-organ-isms        *
    -engel  *
    
    ANNOUNCE: Acknowledge Walter Alejandro Iglesias.
---
 ANNOUNCE               |  1 +
 ChangeLog              | 23 +++++++++++++++++++++++
 NEWS                   |  3 +++
 doc/groff.texi         | 17 ++++++++++++++++-
 man/groff.7.man        | 20 ++++++++++++++++++++
 man/groff_diff.7.man   | 21 +++++++++++++++++++++
 src/roff/troff/env.cpp | 37 +++++++++++++++++++++++++++++++++++++
 7 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/ANNOUNCE b/ANNOUNCE
index 7c47278e3..9c7294263 100644
--- a/ANNOUNCE
+++ b/ANNOUNCE
@@ -192,4 +192,5 @@ Phil Chadwick
 Ralph Corderoy
 Thérèse Godefroy
 Thorsten Glaser
+Walter Alejandro Iglesias
 наб
diff --git a/ChangeLog b/ChangeLog
index 2f0740e03..dea574c20 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,26 @@
+2023-11-03  G. Branden Robinson <g.branden.robinson@gmail.com>
+
+       [troff]: Implement new `phw` request.
+
+       * src/roff/troff/env.cpp (print_hyphenation_exceptions): Add.
+       (init_hyphen_requests): Wire up `phw` request name to
+       `print_hyphenation_exceptions()`.
+
+       * doc/groff.texi (Manipulating Hyphenation, Debugging):
+       * man/groff.7.man (Request short reference):
+       * man/groff_diff.7.man (New requests):
+       * NEWS: Document it.
+
+       Inspired by a debugging process (that ultimately involved
+       input character encoding confusion) on the groff mailing list,
+       raised by Walter Alejandro Iglesias.  See
+       <https://lists.gnu.org/archive/html/groff/2023-09/\
+       msg00032.html> and
+       <https://lists.gnu.org/archive/html/groff/2023-10/\
+       msg00008.html> and follow-ups.  In my opinion it should have
+       been easier to ask the formatter where it thought a hyphenation
+       exception's hyphenation points were.
+
 2023-11-03  G. Branden Robinson <g.branden.robinson@gmail.com>
 
        * src/roff/troff/reg.cpp (alter_format): Slightly refactor.
diff --git a/NEWS b/NEWS
index 8bc2081ed..65dc8617e 100644
--- a/NEWS
+++ b/NEWS
@@ -26,6 +26,9 @@ o The output now reports unbreakable spaces (those produced with the
 o A new read-only, string-valued register, `.trap`, interpolates the
   name of the next vertical position trap that will be sprung.
 
+o A new request, `phw`, reports to the standard error stream the current
+  list of hyphenation exceptions.
+
 eqn
 ---
 
diff --git a/doc/groff.texi b/doc/groff.texi
index c3ce61b89..974f244d9 100644
--- a/doc/groff.texi
+++ b/doc/groff.texi
@@ -8548,7 +8548,7 @@ even possible is an unsolved problem in computer science:@:
 unusual words found in technical literature.  We can instruct GNU
 @code{troff} how to hyphenate specific words if the need arises.
 
-@c TODO: Add requests `phw`, `rhw`?
+@c TODO: Add request `rhw`?
 @cindex hyphenation exceptions
 @Defreq {hw, word @dots{}}
 Define each @dfn{hyphenation exception} @var{word} with each hyphen `-'
@@ -8579,6 +8579,9 @@ below) and environment (@pxref{Environments}); invoking the @code{hw}
 request in the absence of a hyphenation language is an error.
 
 The request is ignored if there are no parameters.
+
+You can obtain a report of hyphenation exceptions on the standard error
+stream with the @code{phw} request.  @xref{Debugging}.
 @endDefreq
 
 These are known as hyphenation @slanted{exceptions} in the expectation
@@ -16924,6 +16927,18 @@ Report the state of the current environment followed by that of all
 other environments to the standard error stream.
 @endDefreq
 
+@Defreq {phw, }
+@cindex dumping hyphenation exceptions (@code{phw})
+@cindex hyphenation exceptions, dumping (@code{phw})
+@cindex exceptions, hyphenation, dumping (@code{phw})
+Report, to the standard error stream, the list of hyphenation
+exceptions.  Each hyphenation point is marked with @samp{-}.  Words that
+will not be hyphenated at all are prefixed with @samp{-}.  Those to
+which the hyphenation mode applies (meaning those defined in a
+hyphenation pattern file rather than with the @code{hw} request) are
+suffixed with a tab and asterisk (@code{*}).
+@endDefreq
+
 @Defreq {pm, }
 @cindex dumping symbol table (@code{pm})
 @cindex symbol table, dumping (@code{pm})
diff --git a/man/groff.7.man b/man/groff.7.man
index 2b1ffb6ba..45c76833f 100644
--- a/man/groff.7.man
+++ b/man/groff.7.man
@@ -3796,6 +3796,26 @@ Report the state of the current environment followed by that of all
 other environments to the standard error stream.
 .
 .TPx
+.REQ .phw
+Report,
+to the standard error stream,
+the list of hyphenation exceptions.
+.
+Each hyphenation point is marked with
+.RB \[lq] \- \[rq].
+.
+Words that will not be hyphenated at all are prefixed with
+.RB \[lq] \- \[rq].
+.
+Those to which the hyphenation mode applies
+(meaning those defined in a hyphenation pattern file rather than with
+the
+.B hw
+request)
+are suffixed with a tab and asterisk
+.RB ( * ).
+.
+.TPx
 .REQ .pi "program"
 Pipe output to
 .I program
diff --git a/man/groff_diff.7.man b/man/groff_diff.7.man
index 943ee0e35..556da1b39 100644
--- a/man/groff_diff.7.man
+++ b/man/groff_diff.7.man
@@ -3027,6 +3027,27 @@ other environments to the standard error stream.
 .
 .
 .TP
+.B .phw
+Report,
+to the standard error stream,
+the list of hyphenation exceptions.
+.
+Each hyphenation point is marked with
+.RB \[lq] \- \[rq].
+.
+Words that will not be hyphenated at all are prefixed with
+.RB \[lq] \- \[rq].
+.
+Those to which the hyphenation mode applies
+(meaning those defined in a hyphenation pattern file rather than with
+the
+.B hw
+request)
+are suffixed with a tab and asterisk
+.RB ( * ).
+.
+.
+.TP
 .B .pnr
 Write the names and values of all currently defined registers to the
 standard error stream.
diff --git a/src/roff/troff/env.cpp b/src/roff/troff/env.cpp
index 30a6d82a7..133e843d8 100644
--- a/src/roff/troff/env.cpp
+++ b/src/roff/troff/env.cpp
@@ -3666,6 +3666,42 @@ static void add_hyphenation_exceptions()
   skip_line();
 }
 
+static void print_hyphenation_exceptions()
+{
+  dictionary_iterator iter(current_language->exceptions);
+  symbol entry;
+  unsigned char *hypoint;
+  // Pathologically, we could have a hyphenation point after every
+  // character in a word except the last.  The word may have a trailing
+  // space; see `hyphen_trie::read_patterns_file()`.
+  const size_t bufsz = WORD_MAX * 2;
+  char wordbuf[bufsz];
+  while(iter.get(&entry, reinterpret_cast<void **>(&hypoint))) {
+    assert(!entry.is_null());
+    assert(hypoint != 0 /* nullptr */);
+    string word = entry.contents();
+    (void) memset(wordbuf, '\0', bufsz);
+    size_t i = 0, j = 0, len = word.length();
+    bool is_mode_independent = false;
+    while (i < len) {
+      if ((hypoint != 0 /* nullptr */) && (*hypoint == i)) {
+       wordbuf[j++] = '-';
+       hypoint++;
+      }
+      if (word[i] == ' ') {
+       assert(i == (len - 1));
+       is_mode_independent = true;
+      }
+      wordbuf[j++] = word[i++];
+    }
+    errprint("%1", wordbuf);
+    if (is_mode_independent)
+      errprint("\t*");
+    errprint("\n");
+  }
+  skip_line();
+}
+
 struct trie_node {
   char c;
   trie_node *down;
@@ -4159,6 +4195,7 @@ const char *hyphenation_language_reg::get_string()
 void init_hyphen_requests()
 {
   init_request("hw", add_hyphenation_exceptions);
+  init_request("phw", print_hyphenation_exceptions);
   init_request("hla", select_hyphenation_language);
   init_request("hpf", hyphenation_patterns_file);
   init_request("hpfa", hyphenation_patterns_file_append);
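
The report format documented above (a `-` at each hyphenation point, a leading `-` for words that are never hyphenated, and a tab plus `*` for entries that came from a hyphenation pattern file) can be illustrated outside of troff.  What follows is a minimal standalone sketch, not groff code: the function name `format_exception`, the use of `std::vector` for the hyphenation points, and the choice to drop the trailing marker space before appending the suffix are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of groff): render one hyphenation
// exception in the style that the `phw` documentation describes.
//
// `word` is the stored entry; as an assumption borrowed from the
// comment in the patch, a trailing space marks an entry that was
// read from a hyphenation pattern file.
// `points` holds, in ascending order, the character indices before
// which a hyphen is permitted.
std::string format_exception(const std::string &word,
                             const std::vector<std::size_t> &points)
{
  std::string text = word;
  bool from_pattern_file = false;
  if (!text.empty() && text.back() == ' ') {
    from_pattern_file = true;
    text.pop_back(); // this sketch does not echo the marker space
  }
  std::string out;
  // A word with no hyphenation points is reported with a leading '-'.
  if (points.empty())
    out += '-';
  std::size_t next = 0;
  for (std::size_t i = 0; i < text.length(); i++) {
    // Bounds-check the list of points before comparing against it.
    if (next < points.size() && points[next] == i) {
      out += '-';
      next++;
    }
    out += text[i];
  }
  if (from_pattern_file)
    out += "\t*";
  return out;
}
```

Under these assumptions, `format_exception("foobar", {3})` yields `foo-bar`, which corresponds to the `foo-bar` line in the commit message's illustration, and an entry with a trailing space and no points, such as `"cohen "`, yields `-cohen` followed by a tab and `*`.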


