Changes to m4/doc/m4.texinfo,v
From: Eric Blake
Subject: Changes to m4/doc/m4.texinfo,v
Date: Sat, 03 Feb 2007 23:45:44 +0000
CVSROOT: /sources/m4
Module name: m4
Changes by: Eric Blake <ericb> 07/02/03 23:45:44
Index: doc/m4.texinfo
===================================================================
RCS file: /sources/m4/m4/doc/m4.texinfo,v
retrieving revision 1.94
retrieving revision 1.95
diff -u -b -r1.94 -r1.95
--- doc/m4.texinfo 23 Jan 2007 14:28:22 -0000 1.94
+++ doc/m4.texinfo 3 Feb 2007 23:45:43 -0000 1.95
@@ -464,7 +464,8 @@
@error{}and an error message
@end example
-The sequence @samp{^D} in an example indicates the end of the input file.
+The sequence @samp{^D} in an example indicates the end of the input
+file. The sequence @samp{@key{NL}} refers to the newline character.
The majority of these examples are self-contained, and you can run them
with similar results. In fact, the testsuite that is bundled in the
@acronym{GNU} M4 package consists in part of the examples
@@ -1142,9 +1143,11 @@
call will be read and parsed into tokens again.
@code{m4} expands a macro as soon as possible. If it finds a macro call
-when collecting the arguments to another, it will expand the second
-call first. For a running example, examine how @code{m4} handles this
-input:
+when collecting the arguments to another, it will expand the second call
+first. This process continues until there are no more macro calls to
+expand and all the input has been consumed.
+
+For a running example, examine how @code{m4} handles this input:
@comment ignore
@example
@@ -1179,11 +1182,134 @@
@result{}Result is 32768
@end example
-The order in which @code{m4} expands the macros can be explored using
-the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
+As a more complicated example, we will contrast an actual code example
+from the Gnulib project@footnote{Derived from a patch in
+@uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-01/@/msg00389.html},
+and a followup patch in
+@uref{http://lists.gnu.org/archive/html/bug-gnulib/@/2007-02/@/msg00000.html}},
+showing both a buggy approach and the desired results. The user desires
+to output a shell assignment statement that takes its argument and turns
+it into a shell variable by converting it to uppercase and prepending a
+prefix. The original attempt looks like this:
+
+@example
+changequote([,])dnl
+define([gl_STRING_MODULE_INDICATOR],
+  [
+    dnl comment
+    GNULIB_]translit([$1],[a-z],[A-Z])[=1
+  ])dnl
+  gl_STRING_MODULE_INDICATOR([strcase])
+@result{}  @w{ }
+@result{}        GNULIB_strcase=1
+@result{}  @w{ }
+@end example
+
+Oops -- the argument did not get capitalized. And although the manual
+is not able to easily show it, both lines that appear empty actually
+contain two trailing spaces. By stepping through the parse, it is easy
+to see what happened. First, @code{m4} sees the token
+@samp{changequote}, which it recognizes as a macro, followed by
+@samp{(}, @samp{[}, @samp{,}, @samp{]}, and @samp{)} to form the
+argument list. The macro expands to the empty string, but changes the
+quoting characters to something more useful for generating shell code
+(unbalanced @samp{`} and @samp{'} appear all the time in shell scripts,
+but unbalanced @samp{[]} tend to be rare). Also in the first line,
+@code{m4} sees the token @samp{dnl}, which it recognizes as a builtin
+macro that consumes the rest of the line, resulting in no output for
+that line.
+
+The second line starts a macro definition. @code{m4} sees the token
+@samp{define}, which it recognizes as a macro, followed by a @samp{(},
+@samp{[gl_STRING_MODULE_INDICATOR]}, and @samp{,}. Because an unquoted
+comma was encountered, the first argument is known to be the expansion
+of the single-quoted string token, or @samp{gl_STRING_MODULE_INDICATOR}.
+Next, @code{m4} sees @samp{@key{NL}}, @samp{ }, and @samp{ }, but this
+whitespace is discarded as part of argument collection. Then comes a
+rather lengthy single-quoted string token, @samp{[@key{NL}@ @ @ @ dnl
+comment@key{NL}@ @ @ @ GNULIB_]}. This is followed by the token
+@samp{translit}, which @code{m4} recognizes as a macro name, so a nested
+macro expansion has started.
+
+The arguments to @code{translit} are found by the tokens @samp{(},
+@samp{[$1]}, @samp{,}, @samp{[a-z]}, @samp{,}, @samp{[A-Z]}, and finally
+@samp{)}. All three string arguments are expanded (or in other words,
+the quotes are stripped), and since neither @samp{$} nor @samp{1} needs
+capitalization, the result of the macro is @samp{$1}. This expansion is
+rescanned, resulting in the two literal characters @samp{$} and
+@samp{1}.
+
+Scanning of the outer macro resumes, and picks up with
+@samp{[=1@key{NL}@ @ ]}, and finally @samp{)}. The collected pieces of
+expanded text are concatenated, with the end result that the macro
+@samp{gl_STRING_MODULE_INDICATOR} is now defined to be the sequence
+@samp{@key{NL}@ @ @ @ dnl comment@key{NL}@ @ @ @ GNULIB_$1=1@key{NL}@ @ }.
+Once again, @samp{dnl} is recognized and avoids a newline in the output.
+
+The final line is then parsed, beginning with @samp{ } and @samp{ }
+that are output literally. Then @samp{gl_STRING_MODULE_INDICATOR} is
+recognized as a macro name, with an argument list of @samp{(},
+@samp{[strcase]}, and @samp{)}. Since the definition of the macro
+contains the sequence @samp{$1}, that sequence is replaced with the
+argument @samp{strcase} prior to starting the rescan. The rescan sees
+@samp{@key{NL}} and four spaces, which are output literally, then
+@samp{dnl}, which discards the text @samp{ comment@key{NL}}. Next
+come four more spaces, also output literally, and the token
+@samp{GNULIB_strcase}, which resulted from the earlier parameter
+substitution. Since that is not a macro name, it is output literally,
+followed by the literal tokens @samp{=}, @samp{1}, @samp{@key{NL}}, and
+two more spaces. Finally, the original @samp{@key{NL}} seen after the
+macro invocation is scanned and output literally.
+
+Now for a corrected approach. This rearranges the use of newlines and
+whitespace so that less whitespace is output (which, although harmless
+to shell scripts, can be visually unappealing), and fixes the quoting
+issues so that the capitalization occurs when the macro
+@samp{gl_STRING_MODULE_INDICATOR} is invoked, rather than when it is
+defined.
+
+@example
+changequote([,])dnl
+define([gl_STRING_MODULE_INDICATOR],
+  [dnl comment
+  GNULIB_[]translit([$1], [a-z], [A-Z])=1dnl
+])dnl
+  gl_STRING_MODULE_INDICATOR([strcase])
+@result{}    GNULIB_STRCASE=1
+@end example
+
+The parsing of the first line is unchanged. The second line sees the
+name of the macro to define, then sees the discarded @samp{@key{NL}}
+and two spaces, as before. But this time, the next token is
+@samp{[dnl comment@key{NL}@ @ GNULIB_[]translit([$1], [a-z],
+[A-Z])=1dnl@key{NL}]}, which includes nested quotes, followed by
+@samp{)} to end the macro definition and @samp{dnl} to skip the
+newline. No early expansion of @code{translit} occurs, so the entire
+string becomes the definition of the macro.
+
+The final line is then parsed, beginning with two spaces that are
+output literally, and an invocation of
+@samp{gl_STRING_MODULE_INDICATOR} with the argument @samp{strcase}.
+Again, the @samp{$1} in the macro definition is substituted prior to
+rescanning. Rescanning first encounters @samp{dnl}, and discards
+@samp{ comment@key{NL}}. Then two spaces are output literally. Next
+comes the token @samp{GNULIB_}, but that is not a macro, so it is
+output literally. The token @samp{[]} is an empty string, so it does
+not affect output. Then the token @samp{translit} is encountered.
+
+This time, the arguments to @code{translit} are parsed as @samp{(},
+@samp{[strcase]}, @samp{,}, @samp{ }, @samp{[a-z]}, @samp{,}, @samp{ },
+@samp{[A-Z]}, and @samp{)}. The two spaces are discarded, and
+@code{translit} produces the desired result @samp{STRCASE}. This is
+rescanned, but since it is not a macro name, it is output literally.
+Then the scanner sees @samp{=} and @samp{1}, which are output
+literally, followed by @samp{dnl} which discards the rest of the
+definition of @code{gl_STRING_MODULE_INDICATOR}. The newline at the
+end of output is the literal @samp{@key{NL}} that appeared after the
+invocation of the macro.
-This process continues until there are no more macro calls to expand and
-all the input has been consumed.
+The order in which @code{m4} expands the macros can be further explored
+using the trace facilities of @acronym{GNU} @code{m4} (@pxref{Trace}).
@node Regular expression syntax
@section How @code{m4} interprets regular expressions
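
The difference between the two gl_STRING_MODULE_INDICATOR definitions in
the hunk above comes down to whether translit runs before or after $1 is
substituted. A rough Python model of just that ordering (not m4 itself;
the helper names translit and substitute are illustrative stand-ins, and
the a-z / A-Z ranges are assumed to be already expanded) might look like
this:

```python
import string

LOWER, UPPER = string.ascii_lowercase, string.ascii_uppercase


def translit(text, src, dst):
    """Stand-in for m4's translit, for equal-length character sets only."""
    return text.translate(str.maketrans(src, dst))


def substitute(body, arg):
    """Stand-in for parameter substitution: replace $1 with the argument."""
    return body.replace("$1", arg)


# Buggy ordering: translit is unquoted inside define's argument list, so it
# runs at definition time on the literal text "$1", which has no lowercase
# letters. The stored body is therefore just "GNULIB_$1=1".
buggy_body = "GNULIB_" + translit("$1", LOWER, UPPER) + "=1"
buggy = substitute(buggy_body, "strcase")


def fixed(arg):
    """Fixed ordering: $1 is substituted first, then translit runs on rescan."""
    return "GNULIB_" + translit(substitute("$1", arg), LOWER, UPPER) + "=1"


print(buggy)             # GNULIB_strcase=1
print(fixed("strcase"))  # GNULIB_STRCASE=1
```

The model ignores whitespace, dnl, and quoting entirely; it only shows
why the early call to translit is a no-op.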
@@ -1524,14 +1650,37 @@
foo(`() (() (')
@end example
-It is, however, in certain cases necessary or convenient to leave out
-quotes for some arguments, and there is nothing wrong in doing it. It
-just makes life a bit harder, if you are not careful. For consistency,
-this manual follows the rule of thumb that each layer of parentheses
-introduces another layer of single quoting, except when showing the
-consequences of quoting rules. This is done even when the quoted string
-cannot be a macro, such as with integers when you have not changed the
-syntax via @code{changesyntax} (@pxref{Changesyntax}).
+It is, however, in certain cases necessary (because nested expansion
+must occur to create the arguments for the outer macro) or convenient
+(because it uses fewer characters) to leave out quotes for some
+arguments, and there is nothing wrong in doing it. It just makes life a
+bit harder, if you are not careful to follow a consistent quoting style.
+For consistency, this manual follows the rule of thumb that each layer
+of parentheses introduces another layer of single quoting, except when
+showing the consequences of quoting rules. This is done even when the
+quoted string cannot be a macro, such as with integers when you have not
+changed the syntax via @code{changesyntax} (@pxref{Changesyntax}).
+
+The rule of thumb of one level of quoting per level of parentheses has a
+nice property: when a macro name appears inside parentheses, you can
+determine when it will be expanded. If it is not quoted, it will be
+expanded prior to the outer macro, so that its expansion becomes the
+argument. If it is single-quoted, it will be expanded after the outer
+macro. And if it is double-quoted, it will be used as literal text
+instead of a macro name.
+
+@example
+define(`active', `ACT, IVE')
+@result{}
+define(`show', `$1 $1')
+@result{}
+show(active)
+@result{}ACT ACT
+show(`active')
+@result{}ACT, IVE ACT, IVE
+show(``active'')
+@result{}active active
+@end example
@node Macro expansion
@section Macro expansion
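
The three show(...) calls in the last hunk can be reproduced with a toy
rescanner. The following is only a sketch in Python under heavy
assumptions: the names Input, scan, m4 and MACROS are made up for
illustration, the macro table is hard-coded instead of built by define,
and builtins, changequote, comments, $0, $# and $@ are all left out. It
models just three things: quote stripping, argument collection with
nested expansion, and rescanning of the expansion.

```python
import re

# Toy macro table mirroring the example; real m4 builds this with define().
MACROS = {"active": "ACT, IVE", "show": "$1 $1"}
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


class Input:
    """Remaining unread input; expansions are pushed back on the front."""

    def __init__(self, text):
        self.rest = text

    def quoted(self):
        """Consume a `...' string and return it minus one level of quotes."""
        depth = 0
        for i, ch in enumerate(self.rest):
            if ch == "`":
                depth += 1
            elif ch == "'":
                depth -= 1
            if depth == 0:
                text, self.rest = self.rest[1:i], self.rest[i + 1:]
                return text
        raise ValueError("unterminated quoted string")


def scan(inp, collecting=False):
    """Scan input; return the argument list, or [output] at top level."""
    out, depth = [""], 0
    if collecting:
        inp.rest = inp.rest.lstrip()  # leading unquoted whitespace is dropped
    while inp.rest:
        ch, word = inp.rest[0], WORD.match(inp.rest)
        if ch == "`":
            out[-1] += inp.quoted()  # quoted text is not looked up as a macro
        elif word:
            name, inp.rest = word.group(), inp.rest[word.end():]
            if name not in MACROS:
                out[-1] += name
                continue
            args = []
            if inp.rest.startswith("("):
                inp.rest = inp.rest[1:]
                args = scan(inp, collecting=True)
            body = re.sub(
                r"\$([1-9])",
                lambda m: args[int(m.group(1)) - 1]
                if int(m.group(1)) <= len(args) else "",
                MACROS[name],
            )
            inp.rest = body + inp.rest  # push the expansion back for rescanning
        elif collecting and ch == ")" and depth == 0:
            inp.rest = inp.rest[1:]
            return out
        elif collecting and ch == "," and depth == 0:
            inp.rest = inp.rest[1:].lstrip()
            out.append("")
        else:
            if collecting and ch in "()":
                depth += 1 if ch == "(" else -1
            out[-1] += ch
            inp.rest = inp.rest[1:]
    return out


def m4(text):
    """Expand text with the toy rules above and return the output string."""
    return scan(Input(text))[0]


for call in ("show(active)", "show(`active')", "show(``active'')"):
    print(call, "->", m4(call))
```

With no quotes, active is expanded while the arguments of show are still
being collected, so its comma splits the text into two arguments and
only ACT reaches $1. With one level of quotes the name survives argument
collection and is expanded on rescan; with two levels it survives the
rescan as well and is output as plain text.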