m4-patches

argv_ref patch 20: make m4wrap FIFO


From: Eric Blake
Subject: argv_ref patch 20: make m4wrap FIFO
Date: Tue, 18 Mar 2008 20:23:04 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.12) Gecko/20080213 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Another round of patches.  This fixes m4wrap to obey POSIX (and match both
Solaris and BSD m4 behavior) with regard to multiple m4wraps being
processed in FIFO order.  I've tested autoconf with this change in
behavior, and fortunately we updated it a while ago to be order-agnostic;
however, I'm afraid that there may be other m4 clients out there that were
(non-portably) depending on GNU's LIFO behavior.  Hence, the manual
includes a portable formula for restoring LIFO behavior, even on non-GNU
m4 implementations.
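
To make the ordering difference concrete, here is a minimal C sketch.  It
is only a toy model: struct wrap_list, wrap_save and wrap_replay are
hypothetical names invented for this illustration, not anything in the m4
sources.  Saved text is replayed either in call order (FIFO, as POSIX
requires) or in reverse call order (LIFO, as older GNU m4 did).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of saved m4wrap text; not the m4 data structure.  */
#define WRAP_MAX 8

struct wrap_list
{
  const char *text[WRAP_MAX];	/* saved strings, in call order */
  size_t count;			/* number of saved strings */
};

/* Remember one piece of wrapped text, in the order of the calls.  */
static void
wrap_save (struct wrap_list *w, const char *text)
{
  assert (w->count < WRAP_MAX);
  w->text[w->count++] = text;
}

/* Concatenate the saved text into OUT (of size OUTSIZE).  If FIFO is
   nonzero, use call order (POSIX); otherwise use reverse call order
   (the older GNU m4 behavior).  */
static void
wrap_replay (const struct wrap_list *w, int fifo, char *out, size_t outsize)
{
  size_t i;

  assert (outsize > 0);
  out[0] = '\0';
  for (i = 0; i < w->count; i++)
    {
      const char *s = w->text[fifo ? i : w->count - 1 - i];
      assert (strlen (out) + strlen (s) < outsize);
      strcat (out, s);
    }
}
```

Saving "1", "2", "3" and replaying yields "123" in FIFO mode and "321" in
LIFO mode, which is the visible difference a client depending on the old
order would notice.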

The patch reuses the plumbing introduced for back-referenced text to be
rescanned in FIFO order; however, it needed one more piece, which was the
ability to store location changes within the FIFO chain.  There should be
no memory impact from this patch, and m4wrap might be slightly slower.  On
the other hand, this patch has a noticeable speedup (1-2%) overall, due to
an optimization I noticed while working on supporting locations in the
input engine.  Before the argv_ref series, next_char would consume a
builtin, so you had to call peek_input, check if it was a builtin, and if
not, call next_char to consume it.  But as of a couple of patches ago,
next_char no longer consumes a builtin token: it keeps returning the same
token, behaving like peek_input, until you call init_macro_token to
advance past it.  As a result, it is now safe to skip peek_input and call
next_char directly at the start of next_token, and thus avoid a lot of
ungetc and other function calls.
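
The saving described above can be illustrated with a small C model.  This
is a sketch under stated assumptions, not the m4 input engine: every name
here (toy_input, toy_next, toy_peek, toy_skip_builtin and the two scan
functions) is hypothetical, a peek is modeled as a read followed by an
unget, and a '\001' byte stands in for a builtin placeholder that the read
function reports without consuming.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names are the m4 API.  */
enum { TOY_EOF = 256, TOY_BUILTIN = 257 };

struct toy_input
{
  const char *s;	/* remaining input; '\001' marks a builtin */
  unsigned calls;	/* count of low-level read/unget operations */
};

/* Read one character.  Ordinary characters are consumed; a builtin
   placeholder is reported but NOT consumed.  */
static int
toy_next (struct toy_input *in)
{
  in->calls++;
  if (*in->s == '\0')
    return TOY_EOF;
  if (*in->s == '\001')
    return TOY_BUILTIN;
  return (unsigned char) *in->s++;
}

/* Peek is modeled as a read plus an unget, so it costs two operations.  */
static int
toy_peek (struct toy_input *in)
{
  int ch = toy_next (in);
  in->calls++;			/* the unget */
  if (ch < TOY_EOF)
    in->s--;			/* put the character back */
  return ch;
}

/* Explicitly step past a builtin placeholder.  */
static void
toy_skip_builtin (struct toy_input *in)
{
  assert (*in->s == '\001');
  in->s++;
}

/* Peek-first pattern: peek, test for a builtin, then consume.
   Returns the number of items seen before end of input.  */
static unsigned
scan_peek_first (struct toy_input *in)
{
  unsigned n = 0;
  for (;;)
    {
      int ch = toy_peek (in);
      if (ch == TOY_EOF)
	return n;
      if (ch == TOY_BUILTIN)
	toy_skip_builtin (in);
      else
	toy_next (in);
      n++;
    }
}

/* Consume-first pattern: read directly, since a builtin is never
   consumed by the read itself.  */
static unsigned
scan_consume_first (struct toy_input *in)
{
  unsigned n = 0;
  for (;;)
    {
      int ch = toy_next (in);
      if (ch == TOY_EOF)
	return n;
      if (ch == TOY_BUILTIN)
	toy_skip_builtin (in);
      n++;
    }
}
```

Both scans see the same items, but in this model the consume-first loop
performs one low-level operation per item where the peek-first loop
performs up to three.  The real 1-2% figure comes from measurement, not
from this toy count.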

2008-03-19  Eric Blake  <address@hidden>

        Stage 20: make m4wrap obey POSIX fifo ordering.
        Improve input engine to support location changes within symbol
        chains, then convert m4wrap to always build symbol chain.  Also,
        avoid wasted peek at start of next_token, for fewer ungetc calls.
        Memory impact: none.
        Speed impact: noticeable improvement, from fewer function calls.
        * src/m4.h (enum token_chain_type): Add CHAIN_LOC.
        (struct token_chain): Add u_l member.
        (wrap_args): New prototype.
        * src/input.c (push_wrapup_init, push_wrapup_finish): Rewrite to
        guarantee a FIFO chain in the wrapup stack.
        (pop_input, peek_input, next_char_1): Support location link.
        (next_char): Add parameter.
        (init_macro_token, init_argv_token): Require user to consume empty
        input.
        (skip_line, match_input): Adjust callers.
        (next_token): Always consume first character.
        * src/macro.c (arg_text): Tighten assertion.
        (wrap_args): New method.
        * src/builtin.c (m4_m4wrap): Use it.
        (define_macro): Issue warning when ignoring builtin token during
        macro definition.
        * doc/m4.texinfo (M4wrap, Location, Incompatibilities)
        (Improved m4wrap): Adjust examples to corrected behavior.
        * NEWS: Document this fix.
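
One way to picture the location link mentioned in the ChangeLog above is
the following C sketch.  It is a simplified illustration only: the types
toy_link and toy_reader and the function toy_read are hypothetical, and
the real token_chain has more link types and fields.  Text links carry
wrapped text; a location link carries no text and only updates the file
and line that the reader reports for whatever follows it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplification of a token chain; the field names are
   illustrative and are not the m4 definitions.  */
enum toy_link_type { TOY_LINK_TEXT, TOY_LINK_LOC };

struct toy_link
{
  struct toy_link *next;
  enum toy_link_type type;
  const char *text;	/* TOY_LINK_TEXT: text to rescan */
  const char *file;	/* TOY_LINK_LOC: file to report from here on */
  int line;		/* TOY_LINK_LOC: line to report from here on */
};

struct toy_reader
{
  struct toy_link *chain;	/* next link to process, in FIFO order */
  const char *file;		/* current file for diagnostics */
  int line;			/* current line for diagnostics */
};

/* Return the next piece of text in FIFO order, or NULL when the chain
   is exhausted.  Location links update the reader's file and line and
   are otherwise invisible to the caller.  */
static const char *
toy_read (struct toy_reader *r)
{
  while (r->chain)
    {
      struct toy_link *l = r->chain;
      r->chain = l->next;
      if (l->type == TOY_LINK_LOC)
	{
	  r->file = l->file;
	  r->line = l->line;
	  continue;
	}
      return l->text;
    }
  return NULL;
}
```

With a chain of the form location, text, location, text, each piece of
text is returned with the file and line of the call that saved it, so
diagnostics during wrapped text can point at the originating m4wrap.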

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkfgeQcACgkQ84KuGfSFAYCqhwCghaDh2YZGvOYljvlI1jj5yDXv
otIAnRHpXDM88212ZmvVjO9289U/1kpJ
=RZlo
-----END PGP SIGNATURE-----
From 1761b0d68f12c701abfdcf0a36d955f787849e3c Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Mon, 17 Mar 2008 16:03:57 -0600
Subject: [PATCH] Stage 20a: reduce unget's in input engine.

* m4/input.c (struct input_funcs): Alter read_func prototype.
(next_char, file_read, builtin_read, string_read, composite_read):
Add allow_argv parameter.
(init_builtin_token, init_argv_symbol): Require all prior input to
be consumed.
(m4_skip_line, match_input, consume_syntax): Adjust callers.
(m4__next_token): Consume first byte without peek.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog  |   15 ++++++++
 m4/input.c |  114 +++++++++++++++++++++++++++++++++--------------------------
 2 files changed, 79 insertions(+), 50 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index d7580a1..8f58981 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,20 @@
 2008-03-17  Eric Blake  <address@hidden>
 
+       Stage 20a: reduce unget's in input engine.
+       Now that out-of-range input placeholders like CHAR_BUILTIN are
+       consumed outside of next_char, next_token should always consume
+       rather than peek at the first character.  Fewer peeks results in
+       less ungetc overhead.
+       Memory impact: none.
+       Speed impact: noticeable improvement, from fewer function calls.
+       * m4/input.c (struct input_funcs): Alter read_func prototype.
+       (next_char, file_read, builtin_read, string_read, composite_read):
+       Add allow_argv parameter.
+       (init_builtin_token, init_argv_symbol): Require all prior input to
+       be consumed.
+       (m4_skip_line, match_input, consume_syntax): Adjust callers.
+       (m4__next_token): Consume first byte without peek.
+
        Update for fresh bootstrap.
        * ltdl/m4/gnulib-cache.m4: Updated copyright from upstream.
 
diff --git a/m4/input.c b/m4/input.c
index 7d27dad..71b9014 100644
--- a/m4/input.c
+++ b/m4/input.c
@@ -93,20 +93,24 @@
    between input blocks must update the context accordingly.  */
 
 static int     file_peek               (m4_input_block *, m4 *, bool);
-static int     file_read               (m4_input_block *, m4 *, bool, bool);
+static int     file_read               (m4_input_block *, m4 *, bool, bool,
+                                        bool);
 static void    file_unget              (m4_input_block *, int);
 static bool    file_clean              (m4_input_block *, m4 *, bool);
 static void    file_print              (m4_input_block *, m4 *, m4_obstack *);
 static int     builtin_peek            (m4_input_block *, m4 *, bool);
-static int     builtin_read            (m4_input_block *, m4 *, bool, bool);
+static int     builtin_read            (m4_input_block *, m4 *, bool, bool,
+                                        bool);
 static void    builtin_unget           (m4_input_block *, int);
 static void    builtin_print           (m4_input_block *, m4 *, m4_obstack *);
 static int     string_peek             (m4_input_block *, m4 *, bool);
-static int     string_read             (m4_input_block *, m4 *, bool, bool);
+static int     string_read             (m4_input_block *, m4 *, bool, bool,
+                                        bool);
 static void    string_unget            (m4_input_block *, int);
 static void    string_print            (m4_input_block *, m4 *, m4_obstack *);
 static int     composite_peek          (m4_input_block *, m4 *, bool);
-static int     composite_read          (m4_input_block *, m4 *, bool, bool);
+static int     composite_read          (m4_input_block *, m4 *, bool, bool,
+                                        bool);
 static void    composite_unget         (m4_input_block *, int);
 static bool    composite_clean         (m4_input_block *, m4 *, bool);
 static void    composite_print         (m4_input_block *, m4 *, m4_obstack *);
@@ -115,7 +119,7 @@ static      void    init_builtin_token      (m4 *, m4_symbol_value *);
 static void    append_quote_token      (m4 *, m4_obstack *,
                                         m4_symbol_value *);
 static bool    match_input             (m4 *, const char *, bool);
-static int     next_char               (m4 *, bool, bool);
+static int     next_char               (m4 *, bool, bool, bool);
 static int     peek_char               (m4 *, bool);
 static bool    pop_input               (m4 *, bool);
 static void    unget_input             (int);
@@ -138,9 +142,11 @@ struct input_funcs
 
   /* Read input, return an unsigned char, CHAR_BUILTIN if it is a
      builtin, or CHAR_RETRY if none available.  If ALLOW_QUOTE, then
-     CHAR_QUOTE may be returned.  If SAFE, then do not alter the
-     current file or line.  */
-  int  (*read_func)    (m4_input_block *, m4 *, bool allow_quote, bool safe);
+     CHAR_QUOTE may be returned.  If ALLOW_ARGV, then CHAR_ARGV may be
+     returned.  If SAFE, then do not alter the current file or
+     line.  */
+  int  (*read_func)    (m4_input_block *, m4 *, bool allow_quote,
+                        bool allow_argv, bool safe);
 
   /* Unread a single unsigned character or CHAR_BUILTIN, must be the
      same character previously read by read_func.  */
@@ -268,7 +274,7 @@ file_peek (m4_input_block *me, m4 *context M4_GNUC_UNUSED,
 
 static int
 file_read (m4_input_block *me, m4 *context, bool allow_quote M4_GNUC_UNUSED,
-          bool safe M4_GNUC_UNUSED)
+          bool allow_argv M4_GNUC_UNUSED, bool safe M4_GNUC_UNUSED)
 {
   int ch;
 
@@ -394,7 +400,8 @@ builtin_peek (m4_input_block *me, m4 *context M4_GNUC_UNUSED,
 
 static int
 builtin_read (m4_input_block *me, m4 *context M4_GNUC_UNUSED,
-             bool allow_quote M4_GNUC_UNUSED, bool safe M4_GNUC_UNUSED)
+             bool allow_quote M4_GNUC_UNUSED, bool allow_argv M4_GNUC_UNUSED,
+             bool safe M4_GNUC_UNUSED)
 {
   /* Not consumed here - wait until init_builtin_token.  */
   return me->u.builtin ? CHAR_BUILTIN : CHAR_RETRY;
@@ -453,7 +460,8 @@ string_peek (m4_input_block *me, m4 *context M4_GNUC_UNUSED,
 
 static int
 string_read (m4_input_block *me, m4 *context M4_GNUC_UNUSED,
-            bool allow_quote M4_GNUC_UNUSED, bool safe M4_GNUC_UNUSED)
+            bool allow_quote M4_GNUC_UNUSED, bool allow_argv M4_GNUC_UNUSED,
+            bool safe M4_GNUC_UNUSED)
 {
   if (!me->u.u_s.len)
     return CHAR_RETRY;
@@ -757,9 +765,11 @@ composite_peek (m4_input_block *me, m4 *context, bool allow_argv)
 }
 
 static int
-composite_read (m4_input_block *me, m4 *context, bool allow_quote, bool safe)
+composite_read (m4_input_block *me, m4 *context, bool allow_quote,
+               bool allow_argv, bool safe)
 {
   m4__symbol_chain *chain = me->u.u_c.chain;
+  size_t argc;
   while (chain)
     {
       if (allow_quote && chain->quote_age == m4__quote_age (M4SYNTAX))
@@ -782,7 +792,8 @@ composite_read (m4_input_block *me, m4 *context, bool allow_quote, bool safe)
            return CHAR_BUILTIN;
          break;
        case M4__CHAIN_ARGV:
-         if (chain->u.u_a.index == m4_arg_argc (chain->u.u_a.argv))
+         argc = m4_arg_argc (chain->u.u_a.argv);
+         if (chain->u.u_a.index == argc)
            {
              m4__arg_adjust_refcount (context, chain->u.u_a.argv, false);
              break;
@@ -792,6 +803,11 @@ composite_read (m4_input_block *me, m4 *context, bool allow_quote, bool safe)
              chain->u.u_a.comma = false;
              return ','; /* FIXME - support M4_SYNTAX_COMMA.  */
            }
+         /* Only return a reference if the quoting is correct and the
+            reference has more than one argument left.  */
+         if (allow_argv && chain->quote_age == m4__quote_age (M4SYNTAX)
+             && chain->u.u_a.quotes && chain->u.u_a.index + 1 < argc)
+           return CHAR_ARGV;
          /* Rather than directly parse argv here, we push another
             input block containing the next unparsed argument from
             argv.  */
@@ -804,7 +820,7 @@ composite_read (m4_input_block *me, m4 *context, bool allow_quote, bool safe)
          chain->u.u_a.index++;
          chain->u.u_a.comma = true;
          m4_push_string_finish ();
-         return next_char (context, allow_quote, !safe);
+         return next_char (context, allow_quote, allow_argv, !safe);
        default:
          assert (!"composite_read");
          abort ();
@@ -1085,9 +1101,6 @@ m4_pop_wrapup (m4 *context)
 static void
 init_builtin_token (m4 *context, m4_symbol_value *token)
 {
-  int ch = next_char (context, false, true);
-  assert (ch == CHAR_BUILTIN);
-
   if (isp->funcs == &builtin_funcs)
     {
       assert (isp->u.builtin);
@@ -1159,11 +1172,10 @@ init_argv_symbol (m4 *context, m4_obstack *obs, m4_symbol_value *value)
 {
   m4__symbol_chain *src_chain;
   m4__symbol_chain *chain;
-  int ch = next_char (context, true, true);
+  int ch;
   const m4_string_pair *comments = m4_get_syntax_comments (M4SYNTAX);
 
-  assert (ch == CHAR_QUOTE && value->type == M4_SYMBOL_VOID
-         && isp->funcs == &composite_funcs
+  assert (value->type == M4_SYMBOL_VOID && isp->funcs == &composite_funcs
          && isp->u.u_c.chain->type == M4__CHAIN_ARGV
          && obs && obstack_object_size (obs) == 0);
 
@@ -1201,7 +1213,7 @@ init_argv_symbol (m4 *context, m4_obstack *obs, m4_symbol_value *value)
          || (!m4_has_syntax (M4SYNTAX, *comments->str1,
                              M4_SYNTAX_COMMA | M4_SYNTAX_CLOSE)
              && *comments->str1 != *src_chain->u.u_a.quotes->str1));
-  ch = peek_char (context, false);
+  ch = peek_char (context, true);
   if (!m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_COMMA | M4_SYNTAX_CLOSE))
     {
       isp->u.u_c.chain = src_chain;
@@ -1217,10 +1229,12 @@ init_argv_symbol (m4 *context, m4_obstack *obs, m4_symbol_value *value)
    next_char () is used to read and advance the input to the next
    character.  If ALLOW_QUOTE, and the current input matches the
    current quote age, return CHAR_QUOTE and leave consumption of data
-   for append_quote_token.  If RETRY, then avoid returning CHAR_RETRY
-   by popping input.  */
+   for append_quote_token; otherwise, if ALLOW_ARGV, and the current
+   input matches an argv reference with the correct quoting, return
+   CHAR_ARGV and leave consumption of data for init_argv_symbol.  If
+   RETRY, then avoid returning CHAR_RETRY by popping input.  */
 static int
-next_char (m4 *context, bool allow_quote, bool retry)
+next_char (m4 *context, bool allow_quote, bool allow_argv, bool retry)
 {
   int ch;
 
@@ -1240,7 +1254,8 @@ next_char (m4 *context, bool allow_quote, bool retry)
        }
 
       assert (isp->funcs->read_func);
-      while (((ch = isp->funcs->read_func (isp, context, allow_quote, !retry))
+      while (((ch = isp->funcs->read_func (isp, context, allow_quote,
+                                          allow_argv, !retry))
              != CHAR_RETRY)
             || !retry)
        {
@@ -1273,7 +1288,7 @@ peek_char (m4 *context, bool allow_argv)
       if (ch != CHAR_RETRY)
        {
 /*       if (IS_IGNORE (ch)) */
-/*         return next_char (context, false, true); */
+/*         return next_char (context, false, true, true); */
          return ch;
        }
 
@@ -1283,7 +1298,7 @@ peek_char (m4 *context, bool allow_argv)
 
 /* The function unget_input () puts back a character on the input
    stack, using an existing input_block if possible.  This is not safe
-   to call except immediately after next_char(context, allow, false).  */
+   to call except immediately after next_char(context, aq, aa, false).  */
 static void
 unget_input (int ch)
 {
@@ -1301,7 +1316,8 @@ m4_skip_line (m4 *context, const char *name)
   const char *file = m4_get_current_file (context);
   int line = m4_get_current_line (context);
 
-  while ((ch = next_char (context, false, true)) != CHAR_EOF && ch != '\n')
+  while ((ch = next_char (context, false, false, true)) != CHAR_EOF
+        && ch != '\n')
     ;
   if (ch == CHAR_EOF)
     /* current_file changed; use the previous value we cached.  */
@@ -1346,14 +1362,14 @@ match_input (m4 *context, const char *s, bool consume)
   if (s[1] == '\0')
     {
       if (consume)
-       next_char (context, false, true);
+       next_char (context, false, false, true);
       return true;                     /* short match */
     }
 
-  next_char (context, false, true);
+  next_char (context, false, false, true);
   for (n = 1, t = s++; (ch = peek_char (context, false)) == to_uchar (*s++); )
     {
-      next_char (context, false, true);
+      next_char (context, false, false, true);
       n++;
       if (*s == '\0')          /* long match */
        {
@@ -1391,29 +1407,29 @@ static bool
 consume_syntax (m4 *context, m4_obstack *obs, unsigned int syntax)
 {
   int ch;
-  bool allow_quote = m4__safe_quotes (M4SYNTAX);
+  bool allow = m4__safe_quotes (M4SYNTAX);
   assert (syntax);
   while (1)
     {
       /* It is safe to call next_char without first checking
         peek_char, except at input source boundaries, which we detect
         by CHAR_RETRY.  We exploit the fact that CHAR_EOF,
-        CHAR_BUILTIN, and CHAR_QUOTE do not satisfy any syntax
-        categories.  */
-      while ((ch = next_char (context, allow_quote, false)) != CHAR_RETRY
+        CHAR_BUILTIN, CHAR_QUOTE, and CHAR_ARGV do not satisfy any
+        syntax categories.  */
+      while ((ch = next_char (context, allow, allow, false)) != CHAR_RETRY
             && m4_has_syntax (M4SYNTAX, ch, syntax))
        {
          assert (ch < CHAR_EOF);
          obstack_1grow (obs, ch);
        }
-      if (ch == CHAR_RETRY || ch == CHAR_QUOTE)
+      if (ch == CHAR_RETRY || ch == CHAR_QUOTE || ch == CHAR_ARGV)
        {
          ch = peek_char (context, false);
          if (m4_has_syntax (M4SYNTAX, ch, syntax))
            {
              assert (ch < CHAR_EOF);
              obstack_1grow (obs, ch);
-             next_char (context, false, true);
+             next_char (context, false, false, true);
              continue;
            }
          return ch == CHAR_EOF;
@@ -1499,15 +1515,14 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
   do {
     obstack_free (&token_stack, token_bottom);
 
-    /* Must consume an input character, but not until CHAR_BUILTIN is
-       handled.  */
-    ch = peek_char (context, allow_argv && m4__quote_age (M4SYNTAX));
+    /* Must consume an input character.  */
+    ch = next_char (context, false, allow_argv && m4__quote_age (M4SYNTAX),
+                   true);
     if (ch == CHAR_EOF)                        /* EOF */
       {
 #ifdef DEBUG_INPUT
        xfprintf (stderr, "next_token -> EOF\n");
 #endif
-       next_char (context, false, true);
        return M4_TOKEN_EOF;
       }
 
@@ -1528,15 +1543,13 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
        return M4_TOKEN_ARGV;
       }
 
-    /* Consume character we already peeked at.  */
-    next_char (context, false, true);
     file = m4_get_current_file (context);
     *line = m4_get_current_line (context);
 
     if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_ESCAPE))
       {                                        /* ESCAPED WORD */
        obstack_1grow (&token_stack, ch);
-       if ((ch = next_char (context, false, true)) < CHAR_EOF)
+       if ((ch = next_char (context, false, false, true)) < CHAR_EOF)
          {
            obstack_1grow (&token_stack, ch);
            if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_ALPHA))
@@ -1564,7 +1577,8 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
        type = M4_TOKEN_STRING;
        while (1)
          {
-           ch = next_char (context, obs && m4__quote_age (M4SYNTAX), true);
+           ch = next_char (context, obs && m4__quote_age (M4SYNTAX), false,
+                           true);
            if (ch == CHAR_EOF)
              m4_error_at_line (context, EXIT_FAILURE, 0, file, *line, caller,
                                _("end of file in string"));
@@ -1581,7 +1595,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                    ch = peek_char (context, false);
                    if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_RQUOTE))
                      {
-                       ch = next_char (context, false, true);
+                       ch = next_char (context, false, false, true);
 #ifdef DEBUG_INPUT
                        m4_print_token (context, "next_token", M4_TOKEN_MACDEF,
                                        token);
@@ -1623,7 +1637,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
        assert (!m4__quote_age (M4SYNTAX));
        while (1)
          {
-           ch = next_char (context, false, true);
+           ch = next_char (context, false, false, true);
            if (ch == CHAR_EOF)
              m4_error_at_line (context, EXIT_FAILURE, 0, file, *line, caller,
                                _("end of file in string"));
@@ -1641,7 +1655,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                    if (MATCH (context, ch, context->syntax->quote.str2,
                               false))
                      {
-                       ch = next_char (context, false, true);
+                       ch = next_char (context, false, false, true);
                        MATCH (context, ch, context->syntax->quote.str2, true);
 #ifdef DEBUG_INPUT
                        m4_print_token (context, "next_token", M4_TOKEN_MACDEF,
@@ -1681,7 +1695,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
        obstack_1grow (obs_safe, ch);
        while (1)
          {
-           ch = next_char (context, false, true);
+           ch = next_char (context, false, false, true);
            if (ch == CHAR_EOF)
              m4_error_at_line (context, EXIT_FAILURE, 0, file, *line, caller,
                                _("end of file in comment"));
@@ -1713,7 +1727,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                      context->syntax->comm.len1);
        while (1)
          {
-           ch = next_char (context, false, true);
+           ch = next_char (context, false, false, true);
            if (ch == CHAR_EOF)
              m4_error_at_line (context, EXIT_FAILURE, 0, file, *line, caller,
                                _("end of file in comment"));
-- 
1.5.4


From 6608fa6d084d320401f049b259adcf6b383eaa43 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Tue, 18 Mar 2008 14:00:39 -0600
Subject: [PATCH] Stage 20b: make m4wrap obey POSIX fifo ordering.

* m4/m4module.h (m4_wrap_args): Add prototype.
* m4/m4private.h (enum m4__symbol_chain_type): Add M4__CHAIN_LOC.
(struct m4__symbol_chain): Add struct u_l.
* m4/input.c (m4_push_wrapup_init, m4_push_wrapup_finish): Use
new link type.
(composite_peek, composite_read, composite_clean): Handle location
link.
* m4/macro.c (m4_wrap_args): New function.
* modules/m4.c (m4wrap): Use it.
* doc/m4.texinfo (M4wrap): Sync with branch and POSIX.
(Extensions): Document extension of multiple arguments.
(Location, Improved m4wrap): Adjust example to match FIFO order.
* tests/builtins.at (wrap): Likewise.
* NEWS: Document this change.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog         |   22 +++++++++++++++++
 NEWS              |    6 +++-
 doc/m4.texinfo    |   62 +++++++++++++++++++++++++++++++++++-------------
 m4/input.c        |   65 ++++++++++++++++++++++++++++++++------------------
 m4/m4module.h     |    2 +-
 m4/m4private.h    |   13 +++++++---
 m4/macro.c        |   68 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 modules/m4.c      |    8 +-----
 tests/builtins.at |    2 +-
 9 files changed, 193 insertions(+), 55 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8f58981..9d3c860 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,25 @@
+2008-03-18  Eric Blake  <address@hidden>
+
+       Stage 20b: make m4wrap obey POSIX fifo ordering.
+       Improve input engine to support location changes within symbol
+       chains, then convert m4wrap to always build symbol chain.
+       Memory impact: none.
+       Speed impact: slight penalty, from more m4wrap bookkeeping.
+       * m4/m4module.h (m4_wrap_args): Add prototype.
+       * m4/m4private.h (enum m4__symbol_chain_type): Add M4__CHAIN_LOC.
+       (struct m4__symbol_chain): Add struct u_l.
+       * m4/input.c (m4_push_wrapup_init, m4_push_wrapup_finish): Use
+       new link type.
+       (composite_peek, composite_read, composite_clean): Handle location
+       link.
+       * m4/macro.c (m4_wrap_args): New function.
+       * modules/m4.c (m4wrap): Use it.
+       * doc/m4.texinfo (M4wrap): Sync with branch and POSIX.
+       (Extensions): Document extension of multiple arguments.
+       (Location, Improved m4wrap): Adjust example to match FIFO order.
+       * tests/builtins.at (wrap): Likewise.
+       * NEWS: Document this change.
+
 2008-03-17  Eric Blake  <address@hidden>
 
        Stage 20a: reduce unget's in input engine.
diff --git a/NEWS b/NEWS
index 6e2fa40..eea5287 100644
--- a/NEWS
+++ b/NEWS
@@ -91,8 +91,6 @@ promoted to 2.0.
   - FIXME: POSIX recommends using ${10} instead of $10 for the tenth
   positional argument.  We should deprecate $10.
 
- - FIXME: `m4wrap' semantics need an update to FIFO.
-
 ** Removed builtins
 
 *** The experimental `epatsubst' and `eregexp' builtins have been removed
@@ -216,6 +214,10 @@ promoted to 2.0.
 ** Fix regression introduced in 1.4.10b where using `builtin' or `indir'
    to perform nested `shift' calls triggered an assertion failure.
 
+** Fix the `m4wrap' builtin to accumulate wrapped text in FIFO order, as
+   required by POSIX.  The manual mentions a way to restore the LIFO order
+   present in earlier GNU M4 versions.
+
 ** Enhance the `ifdef', `ifelse', and `shift' builtins, as well as all
    user macros, to transparently handle builtin tokens generated by `defn'.
 
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 175d923..6e836f6 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -5204,10 +5204,18 @@ normal input has been exhausted.  This feature is normally used to
 initiate cleanup actions before normal exit, e.g., deleting temporary
 files.
 
-@deffn {Builtin (m4)} m4wrap (@var{string}, @dots{})
 To save input text, use the builtin @code{m4wrap}:
-which stores @var{string} and the rest of the arguments in a safe place,
-to be reread when end of input is reached.
+
+@deffn {Builtin (m4)} m4wrap (@var{string}, @dots{})
+Stores @var{string} in a safe place, to be reread when end of input is
+reached.  As a @acronym{GNU} extension, additional arguments are
+concatenated with a space to the @var{string}.
+
+Successive invocations of @code{m4wrap} accumulate saved text in
+first-in, first-out order, as required by @acronym{POSIX}.
+
+The expansion of @code{m4wrap} is void.
+The macro @code{m4wrap} is recognized only with parameters.
 @end deffn
 
 @example
@@ -5225,16 +5233,27 @@ This is the first and last normal input line.
 The saved input is only reread when the end of normal input is seen, and
 not if @code{m4exit} is used to exit @code{m4}.
 
-@c FIXME: this contradicts POSIX, which requires that "If the
-@c m4wrap macro is used multiple times, the arguments specified
-@c shall be processed in the order in which the m4wrap macros were
-@c processed."
-It is safe to call @code{m4wrap} from saved text, but then the order in
-which the saved text is reread is undefined.  If @code{m4wrap} is not used
-recursively, the saved pieces of text are reread in the opposite order
-in which they were saved (LIFO---last in, first out).
+It is safe to call @code{m4wrap} from wrapped text, where all the
+recursively wrapped text is deferred until the current wrapped text is
+exhausted.  As of M4 1.4.11, when @code{m4wrap} is not used recursively,
+the saved pieces of text are reread in the same order in which they were
+saved (FIFO---first in, first out), as required by @acronym{POSIX}.
+
+@example
+m4wrap(`1
+')
+@result{}
+m4wrap(`2', `3
+')
+@result{}
+^D
+@result{}1
+@result{}2 3
+@end example
 
-It is possible to emulate @acronym{POSIX} behavior even
+However, earlier versions had reverse ordering (LIFO---last in, first
+out), as this behavior is more like the semantics of the C function
+@code{atexit}.  It is possible to emulate @acronym{POSIX} behavior even
 with older versions of @acronym{GNU} M4 by including the file
 @file{m4-@value{VERSION}/@/examples/@/wrapfifo.m4} from the
 distribution:
@@ -5310,13 +5329,13 @@ Invocations of @code{m4wrap} at the same recursion level are
 concatenated and rescanned as usual:
 
 @example
-define(`aa', `AA
+define(`ab', `AB
 ')
 @result{}
-m4wrap(`a')m4wrap(`a')
+m4wrap(`a')m4wrap(`b')
 @result{}
 ^D
-@result{}AA
+@result{}AB
 @end example
 
 @noindent
@@ -7778,9 +7797,9 @@ m4wrap(`__line__
 ')
 @result{}
 ^D
address@hidden
 @result{}6
 @result{}6
address@hidden
 @end example
 
 The @address@hidden macro behaves like @samp{$0} in shell
@@ -8287,6 +8306,15 @@ once, but @acronym{GNU} @code{m4} correctly handles multiple instances
 of @samp{-} on the command line.
 
 @item
+@acronym{POSIX} requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
+(first-in, first-out) order, and most other implementations obey this.
+However, versions of @acronym{GNU} @code{m4} earlier than 1.4.11 used
+LIFO order.  Furthermore, @acronym{POSIX} states that only the first
+argument to @code{m4wrap} is saved for later evaluation, but
+@acronym{GNU} @code{m4} saves and processes all arguments, with output
+separated by spaces.
+
+@item
 @acronym{POSIX} states that builtins that require arguments, but are
 called without arguments, have undefined behavior.  Traditional
 implementations simply behave as though empty strings had been passed.
@@ -8943,8 +8971,8 @@ builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
 ')
 @result{}
 ^D
-@result{}bar:-a-a,b-2-
 @result{}m4wrap0:---0-
+@result{}bar:-a-a,b-2-
 @end example
 
 Additionally, the computation of @code{_m4wrap_level} and creation of
diff --git a/m4/input.c b/m4/input.c
index 71b9014..a7f1da9 100644
--- a/m4/input.c
+++ b/m4/input.c
@@ -204,11 +204,7 @@ static m4_obstack token_stack;
 /* Obstack for storing input file names.  */
 static m4_obstack file_names;
 
-/* Wrapup input stack.
-
-   FIXME - m4wrap should be FIFO, which implies a queue, not a stack.
-   While fixing this, m4wrap should also remember what the current
-   file and line are for each chunk of wrapped text.  */
+/* Wrapup input stack.  */
 static m4_obstack *wrapup_stack;
 
 /* Current stack, from input or wrapup.  */
@@ -755,6 +751,8 @@ composite_peek (m4_input_block *me, m4 *context, bool allow_argv)
          chain->u.u_a.comma = true;
          m4_push_string_finish ();
          return peek_char (context, allow_argv);
+       case M4__CHAIN_LOC:
+         break;
        default:
          assert (!"composite_peek");
          abort ();
@@ -821,6 +819,12 @@ composite_read (m4_input_block *me, m4 *context, bool allow_quote,
          chain->u.u_a.comma = true;
          m4_push_string_finish ();
          return next_char (context, allow_quote, allow_argv, !safe);
+       case M4__CHAIN_LOC:
+         me->file = chain->u.u_l.file;
+         me->line = chain->u.u_l.line;
+         input_change = true;
+         me->u.u_c.chain = chain->next;
+         return next_char (context, allow_quote, allow_argv, !safe);
        default:
          assert (!"composite_read");
          abort ();
@@ -885,6 +889,8 @@ composite_clean (m4_input_block *me, m4 *context, bool cleanup)
            }
          m4__arg_adjust_refcount (context, chain->u.u_a.argv, false);
          break;
+       case M4__CHAIN_LOC:
+         return false;
        default:
          assert (!"composite_clean");
          abort ();
@@ -1001,14 +1007,36 @@ m4_obstack *
 m4_push_wrapup_init (m4 *context)
 {
   m4_input_block *i;
+  m4__symbol_chain *chain;
 
-  i = (m4_input_block *) obstack_alloc (wrapup_stack, sizeof *i);
-  i->prev = wsp;
-
-  i->funcs = &string_funcs;
-  i->file = m4_get_current_file (context);
-  i->line = m4_get_current_line (context);
-  wsp = i;
+  assert (obstack_object_size (wrapup_stack) == 0);
+  if (wsp)
+    {
+      i = wsp;
+      assert (i->funcs == &composite_funcs && i->u.u_c.end
+             && i->u.u_c.end->type != M4__CHAIN_LOC);
+    }
+  else
+    {
+      i = (m4_input_block *) obstack_alloc (wrapup_stack, sizeof *i);
+      i->prev = wsp;
+      i->funcs = &composite_funcs;
+      i->file = m4_get_current_file (context);
+      i->line = m4_get_current_line (context);
+      i->u.u_c.chain = i->u.u_c.end = NULL;
+      wsp = i;
+    }
+  chain = (m4__symbol_chain *) obstack_alloc (wrapup_stack, sizeof *chain);
+  if (i->u.u_c.end)
+    i->u.u_c.end->next = chain;
+  else
+    i->u.u_c.chain = chain;
+  i->u.u_c.end = chain;
+  chain->next = NULL;
+  chain->type = M4__CHAIN_LOC;
+  chain->quote_age = 0;
+  chain->u.u_l.file = m4_get_current_file (context);
+  chain->u.u_l.line = m4_get_current_line (context);
   return wrapup_stack;
 }
 
@@ -1016,17 +1044,8 @@ m4_push_wrapup_init (m4 *context)
 void
 m4_push_wrapup_finish (void)
 {
-  m4_input_block *i = wsp;
-  if (obstack_object_size (wrapup_stack) == 0)
-    {
-      wsp = i->prev;
-      obstack_free (wrapup_stack, i);
-    }
-  else
-    {
-      i->u.u_s.len = obstack_object_size (wrapup_stack);
-      i->u.u_s.str = (char *) obstack_finish (wrapup_stack);
-    }
+  m4__make_text_link (wrapup_stack, &wsp->u.u_c.chain, &wsp->u.u_c.end);
+  assert (wsp->u.u_c.end->type != M4__CHAIN_LOC);
 }
 
 
diff --git a/m4/m4module.h b/m4/m4module.h
index 6cbe185..357baca 100644
--- a/m4/m4module.h
+++ b/m4/m4module.h
@@ -1,5 +1,4 @@
 /* GNU m4 -- A simple macro processor
-
    Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 1999, 2000, 2003,
    2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
 
@@ -374,6 +373,7 @@ extern void m4_push_arg             (m4 *, m4_obstack *, m4_macro_args *,
                                         size_t);
 extern void    m4_push_args            (m4 *, m4_obstack *, m4_macro_args *,
                                         bool, bool);
+extern void    m4_wrap_args            (m4 *, m4_macro_args *);
 
 
 /* --- RUNTIME DEBUGGING --- */
diff --git a/m4/m4private.h b/m4/m4private.h
index 5ff7c95..86f18e8 100644
--- a/m4/m4private.h
+++ b/m4/m4private.h
@@ -1,7 +1,6 @@
 /* GNU m4 -- A simple macro processor
-
-   Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 1998, 1999, 2004, 2005,
-   2006, 2007, 2008 Free Software Foundation, Inc.
+   Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 1998, 1999, 2004,
+   2005, 2006, 2007, 2008 Free Software Foundation, Inc.
 
    This file is part of GNU M4.
 
@@ -212,7 +211,8 @@ enum m4__symbol_chain_type
 {
   M4__CHAIN_STR,       /* Link contains a string, u.u_s is valid.  */
   M4__CHAIN_FUNC,      /* Link contains builtin token, u.builtin is valid.  */
-  M4__CHAIN_ARGV       /* Link contains a $@ reference, u.u_a is valid.  */
+  M4__CHAIN_ARGV,      /* Link contains a $@ reference, u.u_a is valid.  */
+  M4__CHAIN_LOC                /* Link contains m4wrap location, u.u_l is valid.  */
 };
 
 /* Composite symbols are built of a linked list of chain objects.  */
@@ -240,6 +240,11 @@ struct m4__symbol_chain
       bool_bitfield has_func : 1;      /* True if argv includes func.  */
       const m4_string_pair *quotes;    /* NULL for $*, quotes for $@.  */
     } u_a;                     /* M4__CHAIN_ARGV.  */
+    struct
+    {
+      const char *file;        /* File where subsequent links originate.  */
+      int line;                /* Line where subsequent links originate.  */
+    } u_l;                     /* M4__CHAIN_LOC.  */
   } u;
 };
 
diff --git a/m4/macro.c b/m4/macro.c
index d03f551..6d1976d 100644
--- a/m4/macro.c
+++ b/m4/macro.c
@@ -1670,6 +1670,74 @@ m4_push_args (m4 *context, m4_obstack *obs, m4_macro_args *argv, bool skip,
     arg_mark (argv);
 }
 
+/* Push arguments from ARGV onto the wrap stack for later rescanning.
+   If GNU extensions are disabled, only the first argument is pushed;
+   otherwise, all arguments are pushed and separated with a space.  */
+void
+m4_wrap_args (m4 *context, m4_macro_args *argv)
+{
+  size_t i;
+  m4_obstack *obs;
+  m4_symbol_value *value;
+  m4__symbol_chain *chain;
+  size_t limit = m4_get_posixly_correct_opt (context) ? 2 : argv->argc;
+
+  if (limit == 2 && m4_arg_empty (argv, 1))
+    return;
+
+  obs = m4_push_wrapup_init (context);
+  for (i = 1; i < limit; i++)
+    {
+      if (i != 1)
+       obstack_1grow (obs, ' ');
+      value = m4_arg_symbol (argv, i);
+      switch (value->type)
+       {
+       case M4_SYMBOL_TEXT:
+         obstack_grow (obs, m4_get_symbol_value_text (value),
+                       m4_get_symbol_value_len (value));
+         break;
+       case M4_SYMBOL_FUNC:
+         /* TODO allow builtins.  */
+         assert (false);
+         break;
+       case M4_SYMBOL_COMP:
+         chain = value->u.u_c.chain;
+         while (chain)
+           {
+             switch (chain->type)
+               {
+               case M4__CHAIN_STR:
+                 obstack_grow (obs, chain->u.u_s.str, chain->u.u_s.len);
+                 break;
+               case M4__CHAIN_FUNC:
+                 /* TODO allow builtins.  */
+                 assert (false);
+                 break;
+               case M4__CHAIN_ARGV:
+                 m4_arg_print (context, obs, chain->u.u_a.argv,
+                               chain->u.u_a.index,
+                               m4__quote_cache (M4SYNTAX, NULL,
+                                                chain->quote_age,
+                                                chain->u.u_a.quotes),
+                               chain->u.u_a.flatten, NULL, NULL, false,
+                               false);
+                 break;
+               default:
+                 assert (!"m4_wrap_args");
+                 abort ();
+               }
+             chain = chain->next;
+           }
+         break;
+       default:
+         assert (!"m4_wrap_args");
+         abort ();
+       }
+    }
+  m4_push_wrapup_finish ();
+}
+
 
 /* Define these last, so that earlier uses can benefit from the macros
    in m4private.h.  */
diff --git a/modules/m4.c b/modules/m4.c
index 359839b..02ac090 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -833,13 +833,7 @@ M4BUILTIN_HANDLER (m4exit)
    version only the first.  */
 M4BUILTIN_HANDLER (m4wrap)
 {
-  obs = m4_push_wrapup_init (context);
-  if (m4_get_posixly_correct_opt (context))
-    obstack_grow (obs, M4ARG (1), M4ARGLEN (1));
-  else
-    /* TODO allow pushing builtins.  */
-    m4_arg_print (context, obs, argv, 1, NULL, true, " ", NULL, false, false);
-  m4_push_wrapup_finish ();
+  m4_wrap_args (context, argv);
 }
 
 /* Enable tracing of all specified macros, or all, if none is specified.
diff --git a/tests/builtins.at b/tests/builtins.at
index 08c881b..34143a1 100644
--- a/tests/builtins.at
+++ b/tests/builtins.at
@@ -1227,8 +1227,8 @@ No. 33: The End.
 AT_CHECK_M4([wrap.m4], 0,
 [[
 No. 33: The End.
-Wrapper no. 2
 Wrapper no. 1
+Wrapper no. 2
 Wrapper no. 3
 Wrapper no. 4
 ]])
-- 
1.5.4
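
For readers skimming the input.c hunks above, here is a minimal standalone sketch of the queue idea, under simplifying assumptions: each m4wrap call appends a location link followed by a text link to the tail of one chain, and the reader updates its notion of file and line whenever it walks past a location link.  The names `wrap_queue`, `wrap_push` and `wrap_pop` are hypothetical and use malloc rather than obstacks; the patch itself works on m4__symbol_chain links inside a composite input block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified link kinds; the patch uses M4__CHAIN_LOC
   and M4__CHAIN_STR inside m4__symbol_chain.  */
enum link_type { LINK_LOC, LINK_STR };

struct link
{
  struct link *next;
  enum link_type type;
  const char *file;     /* LINK_LOC: file of the wrapping call.  */
  int line;             /* LINK_LOC: line of the wrapping call.  */
  const char *text;     /* LINK_STR: wrapped text (not copied).  */
};

struct wrap_queue
{
  struct link *head;    /* First link to rescan.  */
  struct link *tail;    /* Last link, so appending is O(1).  */
};

/* Append L to the tail of Q, which is what makes the order FIFO.  */
static void
append (struct wrap_queue *q, struct link *l)
{
  assert (q && l);
  l->next = NULL;
  if (q->tail)
    q->tail->next = l;
  else
    q->head = l;
  q->tail = l;
}

/* Queue TEXT, remembering FILE and LINE of the call.  Return 0 on
   success, -1 if memory is exhausted.  */
int
wrap_push (struct wrap_queue *q, const char *file, int line,
           const char *text)
{
  struct link *loc = calloc (1, sizeof *loc);
  struct link *str = calloc (1, sizeof *str);
  if (!loc || !str)
    {
      free (loc);
      free (str);
      return -1;
    }
  loc->type = LINK_LOC;
  loc->file = file;
  loc->line = line;
  str->type = LINK_STR;
  str->text = text;
  append (q, loc);
  append (q, str);
  return 0;
}

/* Return the next queued text, or NULL once the queue is empty.
   Location links passed on the way update *FILE and *LINE.  */
const char *
wrap_pop (struct wrap_queue *q, const char **file, int *line)
{
  while (q->head)
    {
      struct link *l = q->head;
      enum link_type type = l->type;
      const char *text = l->text;
      q->head = l->next;
      if (!q->head)
        q->tail = NULL;
      if (type == LINK_LOC)
        {
          *file = l->file;
          *line = l->line;
        }
      free (l);
      if (type == LINK_STR)
        return text;
    }
  return NULL;
}
```

Pushing "one" from a.m4:3 and then "two" from b.m4:7 should pop "one" first with the location reported as a.m4:3, then "two" with b.m4:7.  A LIFO stack would reverse the two.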

From ca0ae275710ba45df36fd076b72605345293b54b Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Mon, 3 Dec 2007 11:53:45 -0700
Subject: [PATCH] Stage 20: make m4wrap obey POSIX fifo ordering.

* src/m4.h (enum token_chain_type): Add CHAIN_LOC.
(struct token_chain): Add u_l member.
(wrap_args): New prototype.
* src/input.c (push_wrapup_init, push_wrapup_finish): Rewrite to
guarantee a FIFO chain in the wrapup stack.
(pop_input, peek_input, next_char_1): Support location link.
(next_char): Add parameter.
(init_macro_token, init_argv_token): Require user to consume empty
input.
(skip_line, match_input): Adjust callers.
(next_token): Always consume first character.
* src/macro.c (arg_text): Tighten assertion.
(wrap_args): New method.
* src/builtin.c (m4_m4wrap): Use it.
(define_macro): Issue warning when ignoring builtin token during
macro definition.
* doc/m4.texinfo (M4wrap, Location, Incompatibilities)
(Improved m4wrap): Adjust examples to corrected behavior.
* NEWS: Document this fix.

(cherry picked from commit f7f45337fa1bfba9512841e8d3d2251359944681)

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog      |   28 ++++++++++++
 NEWS           |    4 ++
 doc/m4.texinfo |   49 ++++++++++++++--------
 src/builtin.c  |   26 ++++++-----
 src/input.c    |  126 +++++++++++++++++++++++++++++++++++--------------------
 src/m4.h       |   10 ++++-
 src/macro.c    |   68 ++++++++++++++++++++++++++++--
 7 files changed, 230 insertions(+), 81 deletions(-)
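
The argument handling that wrap_args takes over from m4_m4wrap can be illustrated with a small standalone sketch: in POSIX mode only the first argument is kept, otherwise all arguments are joined with a single space.  This is an illustration under assumptions, not the code in the patch: `join_wrap_args` is a hypothetical helper that works on plain C strings and a fixed buffer, whereas the real function grows an obstack and also walks composite token chains.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper.  ARGV[0] is the macro name, as in m4, so the
   user arguments are ARGV[1] .. ARGV[ARGC - 1].  Join them into OUT
   (of size OUT_SIZE) and return the resulting length, or (size_t) -1
   if OUT is too small.  If POSIX is true, keep only ARGV[1].  */
size_t
join_wrap_args (size_t argc, const char *const *argv, bool posix,
                char *out, size_t out_size)
{
  size_t limit = (posix && argc > 2) ? 2 : argc;
  size_t used = 0;
  size_t i;

  assert (argv && out);
  if (out_size == 0)
    return (size_t) -1;
  out[0] = '\0';
  for (i = 1; i < limit; i++)
    {
      size_t len = strlen (argv[i]);
      size_t sep = (i != 1);    /* One space before all but the first.  */
      if (used + sep + len + 1 > out_size)
        return (size_t) -1;
      if (sep)
        out[used++] = ' ';
      memcpy (out + used, argv[i], len);
      used += len;
    }
  out[used] = '\0';
  return used;
}
```

With the arguments of the manual example, m4wrap(`2', `3\n'), this produces "2 3\n" in GNU mode and "2" in POSIX mode.  A returned length of 0 corresponds to the case where wrap_args returns early without queuing anything.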

diff --git a/ChangeLog b/ChangeLog
index 3c4e2ad..49f9b80 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,31 @@
+2008-03-19  Eric Blake  <address@hidden>
+
+       Stage 20: make m4wrap obey POSIX fifo ordering.
+       Improve input engine to support location changes within symbol
+       chains, then convert m4wrap to always build symbol chain.  Also,
+       avoid wasted peek at start of next_token, for fewer ungetc calls.
+       Memory impact: none.
+       Speed impact: noticeable improvement, from fewer function calls.
+       * src/m4.h (enum token_chain_type): Add CHAIN_LOC.
+       (struct token_chain): Add u_l member.
+       (wrap_args): New prototype.
+       * src/input.c (push_wrapup_init, push_wrapup_finish): Rewrite to
+       guarantee a FIFO chain in the wrapup stack.
+       (pop_input, peek_input, next_char_1): Support location link.
+       (next_char): Add parameter.
+       (init_macro_token, init_argv_token): Require user to consume empty
+       input.
+       (skip_line, match_input): Adjust callers.
+       (next_token): Always consume first character.
+       * src/macro.c (arg_text): Tighten assertion.
+       (wrap_args): New method.
+       * src/builtin.c (m4_m4wrap): Use it.
+       (define_macro): Issue warning when ignoring builtin token during
+       macro definition.
+       * doc/m4.texinfo (M4wrap, Location, Incompatibilities)
+       (Improved m4wrap): Adjust examples to corrected behavior.
+       * NEWS: Document this fix.
+
 2008-03-17  Eric Blake  <address@hidden>
 
        Update for fresh bootstrap.
diff --git a/NEWS b/NEWS
index 7dc3aba..53f7282 100644
--- a/NEWS
+++ b/NEWS
@@ -8,6 +8,10 @@ Foundation, Inc.
 ** Fix regression introduced in 1.4.10b where using `builtin' or `indir'
    to perform nested `shift' calls triggered an assertion failure.
 
+** Fix the `m4wrap' builtin to accumulate wrapped text in FIFO order, as
+   required by POSIX.  The manual mentions a way to restore the LIFO order
+   present in earlier GNU M4 versions.
+
 ** Enhance the `ifdef', `ifelse', and `shift' builtins, as well as all
    user macros, to transparently handle builtin tokens generated by `defn'.
 
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index f0fbb96..67f1765 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -4582,6 +4582,9 @@ Stores @var{string} in a safe place, to be reread when end of input is
 reached.  As a @acronym{GNU} extension, additional arguments are
 concatenated with a space to the @var{string}.
 
+Successive invocations of @code{m4wrap} accumulate saved text in
+first-in, first-out order, as required by @acronym{POSIX}.
+
 The expansion of @code{m4wrap} is void.
 The macro @code{m4wrap} is recognized only with parameters.
 @end deffn
@@ -4601,18 +4604,27 @@ This is the first and last normal input line.
 The saved input is only reread when the end of normal input is seen, and
 not if @code{m4exit} is used to exit @code{m4}.
 
-@c FIXME: this contradicts POSIX, which requires that "If the
-@c m4wrap macro is used multiple times, the arguments specified
-@c shall be processed in the order in which the m4wrap macros were
-@c processed."
-It is safe to call @code{m4wrap} from saved text, but then the order in
-which the saved text is reread is undefined.  If @code{m4wrap} is not used
-recursively, the saved pieces of text are reread in the opposite order
-in which they were saved (LIFO---last in, first out).  However, this
-behavior is likely to change in a future release, to match
-@acronym{POSIX}, so you should not depend on this order.
-
-It is possible to emulate @acronym{POSIX} behavior even
+It is safe to call @code{m4wrap} from wrapped text, where all the
+recursively wrapped text is deferred until the current wrapped text is
+exhausted.  As of M4 1.4.11, when @code{m4wrap} is not used recursively,
+the saved pieces of text are reread in the same order in which they were
+saved (FIFO---first in, first out), as required by @acronym{POSIX}.
+
+@example
+m4wrap(`1
+')
+@result{}
+m4wrap(`2', `3
+')
+@result{}
+^D
+@result{}1
+@result{}2 3
+@end example
+
+However, earlier versions had reverse ordering (LIFO---last in, first
+out), as this behavior is more like the semantics of the C function
address@hidden  It is possible to emulate @acronym{POSIX} behavior even
 with older versions of @acronym{GNU} M4 by including the file
 @address@hidden/@/examples/@/wrapfifo.m4} from the
 distribution:
@@ -4688,13 +4700,13 @@ Invocations of @code{m4wrap} at the same recursion level are
 concatenated and rescanned as usual:
 
 @example
-define(`aa', `AA
+define(`ab', `AB
 ')
 @result{}
-m4wrap(`a')m4wrap(`a')
+m4wrap(`a')m4wrap(`b')
 @result{}
 ^D
-@result{}AA
+@result{}AB
 @end example
 
 @noindent
@@ -6613,9 +6625,9 @@ m4wrap(`__line__
 ')
 @result{}
 ^D
address@hidden
 @result{}6
 @result{}6
address@hidden
 @end example
 
 The @address@hidden macro behaves like @samp{$0} in shell
@@ -7057,7 +7069,8 @@ of @samp{-} on the command line.
 
 @item
 @acronym{POSIX} requires @code{m4wrap} (@pxref{M4wrap}) to act in FIFO
-(first-in, first-out) order, but @acronym{GNU} @code{m4} currently uses
+(first-in, first-out) order, and most other implementations obey this.
+However, versions of @acronym{GNU} @code{m4} earlier than 1.4.11 used
 LIFO order.  Furthermore, @acronym{POSIX} states that only the first
 argument to @code{m4wrap} is saved for later evaluation, but
 @acronym{GNU} @code{m4} saves and processes all arguments, with output
@@ -7745,8 +7758,8 @@ builtin(`m4wrap', ``'define(`bar', ``$0:'-$1-$*-$#-')bar(`a', `b')
 ')
 @result{}
 ^D
-@result{}bar:-a-a,b-2-
 @result{}m4wrap0:---0-
+@result{}bar:-a-a,b-2-
 @end example
 
 Additionally, the computation of @code{_m4wrap_level} and creation of
diff --git a/src/builtin.c b/src/builtin.c
index a441c4c..b5541cf 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -667,7 +667,14 @@ define_macro (int argc, macro_arguments *argv, symbol_lookup mode)
 
   switch (arg_type (argv, 2))
     {
+    case TOKEN_COMP:
+      m4_warn (0, me, _("cannot concatenate builtins"));
+      /* TODO fall through instead.  */
+      break;
+
     case TOKEN_TEXT:
+      /* TODO flatten TOKEN_COMP value, or support concatenation of
+        builtins in definitions.  */
       define_user_macro (ARG (1), ARG_LEN (1), ARG (2), mode);
       break;
 
@@ -1608,25 +1615,20 @@ m4_m4exit (struct obstack *obs, int argc, macro_arguments *argv)
   exit (exit_code);
 }
 
-/*-------------------------------------------------------------------------.
-| Save the argument text until EOF has been seen, allowing for user       |
-| specified cleanup action.  GNU version saves all arguments, the standard |
-| version only the first.                                                 |
-`-------------------------------------------------------------------------*/
+/*-----------------------------------------------------------------.
+| Save the argument text in FIFO order until EOF has been seen,    |
+| allowing for user specified cleanup action.  Extra arguments are |
+| saved when not in POSIX mode.                                    |
+`-----------------------------------------------------------------*/
 
 static void
 m4_m4wrap (struct obstack *obs, int argc, macro_arguments *argv)
 {
   if (bad_argc (ARG (0), argc, 1, -1))
     return;
-  obs = push_wrapup_init ();
-  if (no_gnu_extensions)
-    obstack_grow (obs, ARG (1), ARG_LEN (1));
-  else
-    /* TODO - allow builtins, rather than always flattening.  */
-    arg_print (obs, argv, 1, NULL, true, " ", NULL, false);
-  push_wrapup_finish ();
+  wrap_args (argv);
 }
+
 
 /* Enable tracing of all specified macros, or all, if none is specified.
    Tracing is disabled by default, when a macro is defined.  This can be
diff --git a/src/input.c b/src/input.c
index b8784d0..86db704 100644
--- a/src/input.c
+++ b/src/input.c
@@ -529,12 +529,36 @@ struct obstack *
 push_wrapup_init (void)
 {
   input_block *i;
-  i = (input_block *) obstack_alloc (wrapup_stack, sizeof *i);
-  i->prev = wsp;
-  i->type = INPUT_STRING;
-  i->file = current_file;
-  i->line = current_line;
-  wsp = i;
+  token_chain *chain;
+
+  assert (obstack_object_size (wrapup_stack) == 0);
+  if (wsp)
+    {
+      i = wsp;
+      assert (i->type == INPUT_CHAIN && i->u.u_c.end
+             && i->u.u_c.end->type != CHAIN_LOC);
+    }
+  else
+    {
+      i = (input_block *) obstack_alloc (wrapup_stack, sizeof *i);
+      i->prev = wsp;
+      i->file = current_file;
+      i->line = current_line;
+      i->type = INPUT_CHAIN;
+      i->u.u_c.chain = i->u.u_c.end = NULL;
+      wsp = i;
+    }
+  chain = (token_chain *) obstack_alloc (wrapup_stack, sizeof *chain);
+  if (i->u.u_c.end)
+    i->u.u_c.end->next = chain;
+  else
+    i->u.u_c.chain = chain;
+  i->u.u_c.end = chain;
+  chain->next = NULL;
+  chain->type = CHAIN_LOC;
+  chain->quote_age = 0;
+  chain->u.u_l.file = current_file;
+  chain->u.u_l.line = current_line;
   return wrapup_stack;
 }
 
@@ -545,17 +569,7 @@ push_wrapup_init (void)
 void
 push_wrapup_finish (void)
 {
-  input_block *i = wsp;
-  if (obstack_object_size (wrapup_stack) == 0)
-    {
-      wsp = i->prev;
-      obstack_free (wrapup_stack, i);
-    }
-  else
-    {
-      i->u.u_s.len = obstack_object_size (wrapup_stack);
-      i->u.u_s.str = (char *) obstack_finish (wrapup_stack);
-    }
+  make_text_link (wrapup_stack, &wsp->u.u_c.chain, &wsp->u.u_c.end);
 }
 
 
@@ -610,6 +624,8 @@ pop_input (bool cleanup)
                return false;
              arg_adjust_refcount (chain->u.u_a.argv, false);
              break;
+           case CHAIN_LOC:
+             return false;
            default:
              assert (!"pop_input");
              abort ();
@@ -837,6 +853,8 @@ peek_input (bool allow_argv)
                  chain->u.u_a.comma = true;
                  push_string_finish ();
                  return peek_input (allow_argv);
+               case CHAIN_LOC:
+                 break;
                default:
                  assert (!"peek_input");
                  abort ();
@@ -863,16 +881,18 @@ peek_input (bool allow_argv)
 | string, so factor that out into a macro for speed.  If             |
 | ALLOW_QUOTE, and the current input matches the current quote age,  |
 | return CHAR_QUOTE and leave consumption of data for                |
-| append_quote_token.                                                |
+| append_quote_token; otherwise, if ALLOW_ARGV and the current input |
+| matches an argv reference with the correct quoting, return         |
+| CHAR_ARGV and leave consumption of data for init_argv_token.       |
 `-------------------------------------------------------------------*/
 
-#define next_char(AQ)                                                  \
+#define next_char(AQ, AA)                                              \
   (isp && isp->type == INPUT_STRING && isp->u.u_s.len && !input_change \
    ? (isp->u.u_s.len--, to_uchar (*isp->u.u_s.str++))                  \
-   : next_char_1 (AQ))
+   : next_char_1 (AQ, AA))
 
 static int
-next_char_1 (bool allow_quote)
+next_char_1 (bool allow_quote, bool allow_argv)
 {
   int ch;
   token_chain *chain;
@@ -929,6 +949,7 @@ next_char_1 (bool allow_quote)
          chain = isp->u.u_c.chain;
          while (chain)
            {
+             unsigned int argc;
              if (allow_quote && chain->quote_age == current_quote_age)
                return CHAR_QUOTE;
              switch (chain->type)
@@ -949,7 +970,8 @@ next_char_1 (bool allow_quote)
                    return CHAR_MACRO;
                  break;
                case CHAIN_ARGV:
-                 if (chain->u.u_a.index == arg_argc (chain->u.u_a.argv))
+                 argc = arg_argc (chain->u.u_a.argv);
+                 if (chain->u.u_a.index == argc)
                    {
                      arg_adjust_refcount (chain->u.u_a.argv, false);
                      break;
@@ -959,6 +981,12 @@ next_char_1 (bool allow_quote)
                      chain->u.u_a.comma = false;
                      return ',';
                    }
+                 /* Only return a reference if the quoting is correct
+                    and the reference has more than one argument
+                    left.  */
+                 if (allow_argv && chain->quote_age == current_quote_age
+                     && chain->u.u_a.quotes && chain->u.u_a.index + 1 < argc)
+                   return CHAR_ARGV;
                  /* Rather than directly parse argv here, we push
                     another input block containing the next unparsed
                     argument from argv.  */
@@ -970,7 +998,13 @@ next_char_1 (bool allow_quote)
                  chain->u.u_a.index++;
                  chain->u.u_a.comma = true;
                  push_string_finish ();
-                 return next_char_1 (allow_quote);
+                 return next_char_1 (allow_quote, allow_argv);
+               case CHAIN_LOC:
+                 isp->file = chain->u.u_l.file;
+                 isp->line = chain->u.u_l.line;
+                 input_change = true;
+                 isp->u.u_c.chain = chain->next;
+                 return next_char_1 (allow_quote, allow_argv);
                default:
                  assert (!"next_char_1");
                  abort ();
@@ -1002,7 +1036,7 @@ skip_line (const char *name)
   const char *file = current_file;
   int line = current_line;
 
-  while ((ch = next_char (false)) != CHAR_EOF && ch != '\n')
+  while ((ch = next_char (false, false)) != CHAR_EOF && ch != '\n')
     ;
   if (ch == CHAR_EOF)
     /* current_file changed to "" if we see CHAR_EOF, use the
@@ -1028,25 +1062,28 @@ skip_line (const char *name)
 static void
 init_macro_token (token_data *td)
 {
-  int ch = next_char (false);
-  assert (ch == CHAR_MACRO);
-  if (td)
-    TOKEN_DATA_TYPE (td) = TOKEN_FUNC;
+  token_chain *chain;
+
   if (isp->type == INPUT_MACRO)
     {
       assert (isp->u.func);
       if (td)
-       TOKEN_DATA_FUNC (td) = isp->u.func;
+       {
+         TOKEN_DATA_TYPE (td) = TOKEN_FUNC;
+         TOKEN_DATA_FUNC (td) = isp->u.func;
+       }
       isp->u.func = NULL;
     }
   else
     {
-      token_chain *chain;
       assert (isp->type == INPUT_CHAIN);
       chain = isp->u.u_c.chain;
       assert (!chain->quote_age && chain->type == CHAIN_FUNC && chain->u.func);
       if (td)
-       TOKEN_DATA_FUNC (td) = chain->u.func;
+       {
+         TOKEN_DATA_TYPE (td) = TOKEN_FUNC;
+         TOKEN_DATA_FUNC (td) = chain->u.func;
+       }
       chain->u.func = NULL;
     }
 }
@@ -1108,9 +1145,9 @@ init_argv_token (struct obstack *obs, token_data *td)
 {
   token_chain *src_chain;
   token_chain *chain;
-  int ch = next_char (true);
+  int ch;
 
-  assert (ch == CHAR_QUOTE && TOKEN_DATA_TYPE (td) == TOKEN_VOID
+  assert (TOKEN_DATA_TYPE (td) == TOKEN_VOID
          && isp->type == INPUT_CHAIN && isp->u.u_c.chain->type == CHAIN_ARGV
          && obs && obstack_object_size (obs) == 0);
 
@@ -1146,7 +1183,7 @@ init_argv_token (struct obstack *obs, token_data *td)
      decreased once the final element is parsed.  */
   assert (*curr_comm.str1 != ',' && *curr_comm.str1 != ')'
          && *curr_comm.str1 != *curr_quote.str1);
-  ch = peek_input (false);
+  ch = peek_input (true);
   if (ch != ',' && ch != ')')
     {
       isp->u.u_c.chain = src_chain;
@@ -1181,14 +1218,14 @@ match_input (const char *s, bool consume)
   if (s[1] == '\0')
     {
       if (consume)
-       next_char (false);
+       next_char (false, false);
       return true;                     /* short match */
     }
 
-  next_char (false);
+  next_char (false, false);
   for (n = 1, t = s++; (ch = peek_input (false)) == to_uchar (*s++); )
     {
-      next_char (false);
+      next_char (false, false);
       n++;
       if (*s == '\0')          /* long match */
        {
@@ -1556,15 +1593,13 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
   if (!line)
     line = &dummy;
 
-  /* Can't consume character until after CHAR_MACRO is handled.  */
   TOKEN_DATA_TYPE (td) = TOKEN_VOID;
-  ch = peek_input (allow_argv && current_quote_age);
+  ch = next_char (false, allow_argv && current_quote_age);
   if (ch == CHAR_EOF)
     {
 #ifdef DEBUG_INPUT
       xfprintf (stderr, "next_token -> EOF\n");
 #endif /* DEBUG_INPUT */
-      next_char (false);
       return TOKEN_EOF;
     }
   if (ch == CHAR_MACRO)
@@ -1588,7 +1623,6 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
       return TOKEN_ARGV;
     }
 
-  next_char (false); /* Consume character we already peeked at.  */
   file = current_file;
   *line = current_line;
   if (MATCH (ch, curr_comm.str1, true))
@@ -1598,7 +1632,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
       obstack_grow (obs_td, curr_comm.str1, curr_comm.len1);
       while (1)
        {
-         ch = next_char (false);
+         ch = next_char (false, false);
          if (ch == CHAR_EOF)
            /* Current_file changed to "" if we see CHAR_EOF, use the
               previous value we stored earlier.  */
@@ -1629,7 +1663,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
             && (isalnum (ch) || ch == '_'))
        {
          obstack_1grow (&token_stack, ch);
-         next_char (false);
+         next_char (false, false);
        }
       type = TOKEN_WORD;
     }
@@ -1652,7 +1686,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
              obstack_blank (&token_stack, -1);
              break;
            }
-         next_char (false);
+         next_char (false, false);
        }
 
       obstack_1grow (&token_stack, '\0');
@@ -1697,7 +1731,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
       type = TOKEN_STRING;
       while (1)
        {
-         ch = next_char (obs != NULL && current_quote_age);
+         ch = next_char (obs != NULL && current_quote_age, false);
          if (ch == CHAR_EOF)
            /* Current_file changed to "" if we see CHAR_EOF, use
               the previous value we stored earlier.  */
@@ -1721,7 +1755,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
                      xfprintf (stderr, "next_token -> MACDEF (%s)\n",
                                bp->name);
 #endif
-                     ch = next_char (false);
+                     ch = next_char (false, false);
                      MATCH (ch, curr_quote.str2, true);
                      return TOKEN_MACDEF;
                    }
diff --git a/src/m4.h b/src/m4.h
index ef45359..3e7fc76 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -283,7 +283,8 @@ enum token_chain_type
 {
   CHAIN_STR,   /* Link contains a string, u.u_s is valid.  */
   CHAIN_FUNC,  /* Builtin function definition, u.func is valid.  */
-  CHAIN_ARGV   /* Link contains a $@ reference, u.u_a is valid.  */
+  CHAIN_ARGV,  /* Link contains a $@ reference, u.u_a is valid.  */
+  CHAIN_LOC    /* Link contains location of m4wrap, u.u_l is valid.  */
 };
 
 /* Composite tokens are built of a linked list of chains.  Each link
@@ -315,6 +316,12 @@ struct token_chain
          const string_pair *quotes;    /* NULL for $*, quotes for $@.  */
        }
       u_a;
+      struct
+       {
+         const char *file;     /* File where subsequent links originate.  */
+         int line;             /* Line where subsequent links originate.  */
+       }
+      u_l;
     }
   u;
 };
@@ -508,6 +515,7 @@ void push_arg (struct obstack *, macro_arguments *, unsigned int);
 void push_arg_quote (struct obstack *, macro_arguments *, unsigned int,
                     const string_pair *);
 void push_args (struct obstack *, macro_arguments *, bool, bool);
+void wrap_args (macro_arguments *);
 
 /* Grab the text at argv index I.  Assumes macro_argument *argv is in
    scope, and aborts if the argument is not text.  */
diff --git a/src/macro.c b/src/macro.c
index f794d86..6a6a90c 100644
--- a/src/macro.c
+++ b/src/macro.c
@@ -946,11 +946,8 @@ arg_text (macro_arguments *argv, unsigned int index)
            case CHAIN_STR:
              obstack_grow (obs, chain->u.u_s.str, chain->u.u_s.len);
              break;
-           case CHAIN_FUNC:
-             /* TODO concatenate builtins.  */
-             assert (!"implemented");
-             abort ();
            case CHAIN_ARGV:
+             assert (!chain->u.u_a.has_func || argv->flatten);
              arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
                         quote_cache (NULL, chain->quote_age,
                                      chain->u.u_a.quotes),
@@ -1515,3 +1512,66 @@ push_args (struct obstack *obs, macro_arguments *argv, bool skip, bool quote)
   if (push_token (token, -1, argv->inuse))
     arg_mark (argv);
 }
+
+/* Push arguments from ARGV, which can include builtins, onto the wrap
+   stack for later rescanning.  If GNU extensions are disabled, only
+   the first argument is pushed; otherwise, all arguments are pushed
+   and separated with a space.  */
+void
+wrap_args (macro_arguments *argv)
+{
+  int i;
+  struct obstack *obs;
+  token_data *token;
+  token_chain *chain;
+
+  if ((argv->argc == 2 || no_gnu_extensions) && arg_empty (argv, 1))
+    return;
+
+  obs = push_wrapup_init ();
+  for (i = 1; i < (no_gnu_extensions ? 2 : argv->argc); i++)
+    {
+      if (i != 1)
+       obstack_1grow (obs, ' ');
+      token = arg_token (argv, i, NULL, false);
+      switch (TOKEN_DATA_TYPE (token))
+       {
+       case TOKEN_TEXT:
+         obstack_grow (obs, TOKEN_DATA_TEXT (token), TOKEN_DATA_LEN (token));
+         break;
+       case TOKEN_FUNC:
+         /* TODO allow builtins through m4wrap.  */
+         assert (false);
+       case TOKEN_COMP:
+         chain = token->u.u_c.chain;
+         while (chain)
+           {
+             switch (chain->type)
+               {
+               case CHAIN_STR:
+                 obstack_grow (obs, chain->u.u_s.str, chain->u.u_s.len);
+                 break;
+               case CHAIN_FUNC:
+                 /* TODO allow builtins through m4wrap.  */
+                 assert (false);
+                 break;
+               case CHAIN_ARGV:
+                 arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
+                            quote_cache (NULL, chain->quote_age,
+                                         chain->u.u_a.quotes),
+                            chain->u.u_a.flatten, NULL, NULL, false);
+                 break;
+               default:
+                 assert (!"wrap_args");
+                 abort ();
+               }
+             chain = chain->next;
+           }
+         break;
+       default:
+         assert (!"wrap_args");
+         abort ();
+       }
+    }
+  push_wrapup_finish ();
+}
-- 
1.5.4

