from Gary V. Vaughan
The experimental `changeword' feature never took off, and has no obvious
advantages over `changesyntax' to compensate for the enormous speed
penalty it carries:

* configure.in (ENABLE_CHANGEWORD): Removed.
* m4/m4module.h (m4_set_word_regexp): Removed.
* m4/m4private.h (m4_token_data): Removed original_text field.
* m4/utility.c (m4_token_data_orig_text): Removed.
* m4/input.c: Removed all conditional ENABLE_CHANGEWORD code.
* m4/macro.c: Ditto.
* src/main.c: Ditto.
* modules/Makefile.am (changeword.la): Removed.
* modules/changeword.c: File removed.
* doc/m4.texinfo: References to changeword and --word-regexp removed.
* po/POTFILES.in: modules/changeword.c removed.
* tests/atlocal.in (ENABLE_CHANGEWORD): Removed.
* tests/builtins.at (changeword): Test removed.

Index: configure.in
===================================================================
RCS file: /cvsroot/m4/m4/configure.in,v
retrieving revision 1.18
diff -b -u -r1.18 configure.in
--- configure.in 2001/08/27 08:02:19 1.18
+++ configure.in 2001/08/30 21:41:55
@@ -97,18 +97,6 @@
 M4_AC_SYS_STACKOVF
-AC_MSG_CHECKING([[if changeword is wanted]])
-AC_ARG_ENABLE([[changeword]],
-[ --enable-changeword enable -W and changeword() builtin],
-[if test "$enableval" = yes; then
-  AC_MSG_RESULT(yes)
-  AC_DEFINE(ENABLE_CHANGEWORD, 1,
-  [Define to 1 if the changeword(REGEXP) functionality is required.])
-else
-  AC_MSG_RESULT(no)
-fi], [AC_MSG_RESULT(no)])
-AC_SUBST([ENABLE_CHANGEWORD], $enable_changeword)
-
 AC_MSG_CHECKING(for modules to preload)
 m4_pattern_allow([^m4_default_preload$])
 m4_default_preload="m4 traditional gnu"
Index: doc/m4.texinfo
===================================================================
RCS file: /cvsroot/m4/m4/doc/m4.texinfo,v
retrieving revision 1.8
diff -b -u -r1.8 m4.texinfo
--- doc/m4.texinfo 2001/08/17 12:06:17 1.8
+++ doc/m4.texinfo 2001/08/30 21:42:20
@@ -177,7 +177,6 @@
 * Changequote:: Changing the quote characters
 * Changecom:: Changing the comment delimiters
 * Changesyntax:: Changing
the lexical structure of the input
-* Changeword:: Changing the lexical structure of words
 * M4wrap:: Saving input until end of input
 File inclusion
@@ -360,12 +359,6 @@
 @samp{m4_define} instead of @samp{define}, and @samp{m4___file__} instead
 of @samp{__file__}.
address@hidden -W @var{REGEXP}
address@hidden address@hidden
-Use an alternative syntax for macro names. This experimental
-option might not be present on all GNU @code{m4} installations.
-(@pxref{Changeword} and @ref{Experiments}).
-
 @item -M @var{DIRECTORY}
 @itemx address@hidden
 Specify an alternate @var{DIRECTORY} to search for modules. This option
@@ -824,11 +817,6 @@
 whatsoever on user defined macros. For example, with this option, one has
 to write @code{m4_dnl} and even @code{m4_m4exit}.
-If your version of GNU @code{m4} has the @code{changeword} feature
-compiled in, you have far more flexibility in specifying the
-syntax of macro names, both builtin and user-defined. @xref{Changeword},
-for more information on this experimental feature.
-
 Of course, the simplest way to prevent a name from being interpreted as a
 call to an existing macro is to quote it. The remainder of this section
 studies a little more deeply how quoting affects macro
@@ -1844,7 +1832,6 @@
 * Changequote:: Changing the quote characters
 * Changecom:: Changing the comment delimiters
 * Changesyntax:: Changing the lexical structure of the input
-* Changeword:: Changing the lexical structure of words
 * M4wrap:: Saving input until end of input
 @end menu
@@ -2300,126 +2287,6 @@
address@hidden Changeword
address@hidden Changing the lexical structure of words
-
address@hidden lexical structure of words
address@hidden words, lexical structure of
address@hidden
-The macro @code{changeword} and all associated functionality is
-experimental (@pxref{Experiments}). It is only available if the
address@hidden option was given to @code{configure}, at GNU
address@hidden installation time. The functionality might change or even go
-away in the future.
@emph{Do not rely on it}. Please direct your
-comments about it the same way you would do for bugs.
address@hidden quotation
-
address@hidden {Builtin (changeword)} changeword (@var{regexp})
-A file being processed by @code{m4} is split into quoted strings, words
-(potential macro names) and simple tokens (any other single character).
-Initially a word is defined by the following regular expression:
-
address@hidden ignore
address@hidden
-[_a-zA-Z][_a-zA-Z0-9]*
address@hidden example
-
-Using @code{changeword}, you can change this regular expression.
-
-The expansion of @code{changeword} is void.
address@hidden deffn
-
-Relaxing @code{m4}'s lexical rules might be useful (for example) if you
-wanted to apply translations to a file of numbers:
-
address@hidden ignore
address@hidden
-changeword(`[_a-zA-Z0-9]+')
-define(`1', `0')
-1
address@hidden
-0
address@hidden
address@hidden example
-
-The syntax for regular expressions is the same as in GNU Emacs.
address@hidden, , Syntax of Regular Expressions, emacs, The GNU Emacs
-Manual}.
-
-Tightening the lexical rules is less useful, because it will generally
-make some of the builtins unavailable. You could use it to prevent
-accidental call of builtins, for example:
-
address@hidden ignore
address@hidden
-define(`_indir', defn(`indir'))
-changeword(`_[_a-zA-Z0-9]*')
-esyscmd(`foo')
-_indir(`esyscmd', `ls')
address@hidden example
-
-Because @code{m4} constructs its words a character at a time, there
-is a restriction on the regular expressions that may be passed to
address@hidden This is that if your regular expression accepts
address@hidden, it must also accept @samp{f} and @samp{fo}.
-
address@hidden has another function. If the regular expression
-supplied contains any subexpressions in parentheses, then text outside
-the first of these is discarded before symbol lookup.
So:
-
address@hidden ignore
address@hidden
-changecom(`/*', `*/')
-changeword(`#\([_a-zA-Z0-9]*\)')
-#esyscmd(`ls')
address@hidden example
-
address@hidden now requires a @samp{#} mark at the beginning of every
-macro invocation, so one can use @code{m4} to preprocess shell
-scripts without getting @code{shift} commands swallowed, and plain
-text without losing various common words.
-
address@hidden's macro substitution is based on text, while @TeX{}'s is based
-on tokens. @code{changeword} can throw this difference into relief. For
-example, here is the same idea represented in @TeX{} and @code{m4}.
-First, the @TeX{} version:
-
address@hidden ignore
address@hidden
address@hidden@address@hidden@}
-\catcode`\@@=0
-\catcode`\\=12
-@@a
address@hidden@@bye
address@hidden example
-
address@hidden
-Then, the @code{m4} version:
-
address@hidden ignore
address@hidden
-define(`a', `errprint(`Hello')')
-changeword(`@@\([_a-zA-Z0-9]*\)')
-@@a
address@hidden(Hello)
address@hidden example
-
-In the @TeX{} example, the first line defines a macro @code{a} to
-print the message @samp{Hello}. The second line defines @key{@@} to
-be usable instead of @key{\} as an escape character. The third line
-defines @key{\} to be a normal printing character, not an escape.
-The fourth line invokes the macro @code{a}. So, when @TeX{} is run
-on this file, it displays the message @samp{Hello}.
-
-When the @code{m4} example is passed through @code{m4}, it outputs
address@hidden(Hello)}. The reason for this is that @TeX{} does
-lexical analysis of macro definition when the macro is @emph{defined}.
address@hidden just stores the text, postponing the lexical analysis until
-the macro is @emph{used}.
-
-You should note that using @code{changeword} will slow @code{m4} down
-by a factor of about seven.
-
 @node M4wrap
 @section Saving input
@@ -2921,17 +2788,6 @@
 Expands to the empty string.
 @end deffn
address@hidden changeword
-This module provides the implementation for the experimental
address@hidden feature. Use of this builtin requires special compile
-time support from the GNU @code{m4} binary and will generate an error if
-that support was not compiled in for some reason. @xref{Changeword},
-for more details. The module also defines the following macro:
-
address@hidden {Macro (changeword)} __changeword__
-Expands to the empty string.
address@hidden deffn
-
 @end table
@@ -3690,7 +3546,7 @@
 Some care is necessary because not every effort has been made for
 this to work in all cases. In particular, the trace attribute of
-macros is not handled, nor the current setting of @code{changeword}.
+macros is not handled.
 Also, interactions for some options of @code{m4} being used in one call
 and not for the next, have not been fully analyzed yet. On the other
 end, you may be confident that stacks of @code{pushdef}'ed definitions
@@ -3964,22 +3820,6 @@
 The implementation does not seem to slow down @code{m4}, more likely
 the contrary.
-
address@hidden Changeword
-
-An experimental feature, which would improve @code{m4} usefulness,
-allows for changing the syntax for what is a @dfn{word} in @code{m4}.
-You should use:
-
address@hidden ignore
address@hidden
-./configure --enable-changeword
address@hidden example
-
address@hidden
-if you want this feature compiled in. The current implementation
-slows down @code{m4} considerably and is hardly acceptable. So, it
-might go away, do not count on it yet.
 @section Multiple precision arithmetic
Index: m4/input.c
===================================================================
RCS file: /cvsroot/m4/m4/m4/input.c,v
retrieving revision 1.7
diff -b -u -r1.7 input.c
--- m4/input.c 2001/08/20 20:25:25 1.7
+++ m4/input.c 2001/08/30 21:42:24
@@ -137,10 +137,6 @@
 that a string is parsed equally whether there is a $ or not. The
 character $ is used by convention in user macros.
*/
-#ifdef ENABLE_CHANGEWORD
-# include "regex.h"
-#endif
-
 static void check_use_macro_escape (void);
 static int file_peek (void);
 static int file_read (void);
@@ -261,19 +257,8 @@
 /* TRUE iff some character has M4_SYNTAX_ESCAPE */
 static boolean use_macro_escape;
-#ifdef ENABLE_CHANGEWORD
-
-#define DEFAULT_WORD_REGEXP "[_a-zA-Z][_a-zA-Z0-9]*"
-
-static char *word_start;
-static struct re_pattern_buffer word_regexp;
-static int default_word_regexp;
-static struct re_registers regs;
-#endif /* ENABLE_CHANGEWORD */
-
-
 /* push_file () pushes an input file on the input stack, saving the
 current file name and line number. If next is non-NULL, this push
 invalidates a call to push_string_init (), whose storage are
@@ -806,13 +791,6 @@
 single_comments = TRUE;
 use_macro_escape = FALSE;
-
-#ifdef ENABLE_CHANGEWORD
-  if (user_word_regexp)
-    m4_set_word_regexp (user_word_regexp);
-  else
-    m4_set_word_regexp (DEFAULT_WORD_REGEXP);
-#endif
 }
 void
@@ -1005,49 +983,6 @@
 check_use_macro_escape();
 }
-#ifdef ENABLE_CHANGEWORD
-
-void
-m4_set_word_regexp (const char *regexp)
-{
-  int i;
-  char test[2];
-  const char *msg;
-  static struct re_pattern_buffer new_regexp;
-
-  if (!strcmp (regexp, DEFAULT_WORD_REGEXP))
-    {
-      default_word_regexp = TRUE;
-      return;
-    }
-
-  msg = re_compile_pattern (regexp, strlen (regexp), &new_regexp);
-
-  if (msg != NULL)
-    {
-      M4ERROR ((warning_status, 0,
-                _("Bad regular expression `%s': %s"), regexp, msg));
-      return;
-    }
-
-  default_word_regexp = FALSE;
-
-  word_regexp = new_regexp;
-
-  if (word_start == NULL)
-    word_start = xmalloc (256);
-
-  word_start[0] = '\0';
-  test[1] = '\0';
-  for (i = 1; i < 256; i++)
-    {
-      test[0] = i;
-      if (re_search (&word_regexp, test, 1, 0, 0, &regs) >= 0)
-        strcat (word_start, test);
-    }
-}
-
-#endif /* ENABLE_CHANGEWORD */
 /* Parse and return a single token from the input stream. A token can
A token can @@ -1067,10 +1002,6 @@ int ch; int quote_level; m4_token_t type; -#ifdef ENABLE_CHANGEWORD - int startpos; - char *orig_text = 0; -#endif do { obstack_free (&token_stack, token_bottom); @@ -1144,11 +1075,7 @@ type = M4_TOKEN_SIMPLE; /* escape before eof */ } } - else if ( -#ifdef ENABLE_CHANGEWORD - default_word_regexp && -#endif - (M4_IS_ALPHA (ch))) + else if (M4_IS_ALPHA (ch)) { obstack_1grow (&token_stack, ch); while ((ch = next_char ()) != CHAR_EOF && (M4_IS_ALNUM(ch))) @@ -1160,46 +1087,6 @@ type = use_macro_escape ? M4_TOKEN_STRING : M4_TOKEN_WORD; } - -#ifdef ENABLE_CHANGEWORD - - else if (!default_word_regexp && strchr (word_start, ch)) - { - obstack_1grow (&token_stack, ch); - while (1) - { - ch = m4_peek_input (); - if (ch == CHAR_EOF) - break; - obstack_1grow (&token_stack, ch); - startpos = re_search (&word_regexp, obstack_base (&token_stack), - obstack_object_size (&token_stack), 0, 0, - ®s); - if (startpos != 0 || - regs.end [0] != obstack_object_size (&token_stack)) - { - *(((char *) obstack_base (&token_stack) - + obstack_object_size (&token_stack)) - 1) = '\0'; - break; - } - next_char (); - } - - obstack_1grow (&token_stack, '\0'); - orig_text = obstack_finish (&token_stack); - - if (regs.start[1] != -1) - obstack_grow (&token_stack,orig_text + regs.start[1], - regs.end[1] - regs.start[1]); - else - obstack_grow (&token_stack, orig_text,regs.end[0]); - - type = M4_TOKEN_WORD; - } - -#endif /* ENABLE_CHANGEWORD */ - - else if (M4_IS_LQUOTE(ch)) /* QUOTED STRING, SINGLE QUOTES */ { quote_level = 1; @@ -1303,12 +1190,6 @@ M4_TOKEN_DATA_TYPE (td) = M4_TOKEN_TEXT; M4_TOKEN_DATA_TEXT (td) = obstack_finish (&token_stack); - -#ifdef ENABLE_CHANGEWORD - if (orig_text == NULL) - orig_text = M4_TOKEN_DATA_TEXT (td); - M4_TOKEN_DATA_ORIG_TEXT (td) = orig_text; -#endif #ifdef DEBUG_INPUT print_token("next_token", type, td); Index: m4/m4module.h =================================================================== RCS file: 
/cvsroot/m4/m4/m4/m4module.h,v
retrieving revision 1.13
diff -b -u -r1.13 m4module.h
--- m4/m4module.h 2001/08/20 20:25:25 1.13
+++ m4/m4module.h 2001/08/30 21:42:25
@@ -137,7 +137,6 @@
 extern m4_token_data_t m4_token_data_type (m4_token_data*);
 extern char *m4_token_data_text (m4_token_data*);
-extern char *m4_token_data_orig_text (m4_token_data*);
 extern m4_builtin_func *m4_token_data_func (m4_token_data*);
 extern boolean m4_token_data_func_traced (m4_token_data*);
@@ -180,7 +179,6 @@
 int warning_status; /* -E */
 int nesting_limit; /* -L */
 int discard_comments; /* -c */
-const char *user_word_regexp; /* -W */
 /* left and right quote, begin and end comment */
 m4_string lquote;
@@ -395,9 +393,6 @@
 extern void m4_set_quotes (const char *, const char *);
 extern void m4_set_comment (const char *, const char *);
 extern void m4_set_syntax (char, const unsigned char *);
-#ifdef ENABLE_CHANGEWORD
-extern void m4_set_word_regexp (const char *);
-#endif
 int m4_current_diversion;
 int m4_output_current_line;
Index: m4/m4private.h
===================================================================
RCS file: /cvsroot/m4/m4/m4/m4private.h,v
retrieving revision 1.4
diff -b -u -r1.4 m4private.h
--- m4/m4private.h 2001/08/16 22:21:30 1.4
+++ m4/m4private.h 2001/08/30 21:42:25
@@ -39,9 +39,6 @@
 union {
 struct {
 char *text;
-#ifdef ENABLE_CHANGEWORD
-      char *original_text;
-#endif
 } u_t;
 struct {
 m4_builtin_func *func;
@@ -53,9 +50,6 @@
 #define M4_TOKEN_DATA_TYPE(Td) ((Td)->type)
 #define M4_TOKEN_DATA_HANDLE(Td) ((Td)->handle)
 #define M4_TOKEN_DATA_TEXT(Td) ((Td)->u.u_t.text)
-#ifdef ENABLE_CHANGEWORD
-# define M4_TOKEN_DATA_ORIG_TEXT(Td) ((Td)->u.u_t.original_text)
-#endif
 #define M4_TOKEN_DATA_FUNC(Td) ((Td)->u.u_f.func)
 #define M4_TOKEN_DATA_FUNC_TRACED(Td) ((Td)->u.u_f.traced)
Index: m4/macro.c
===================================================================
RCS file: /cvsroot/m4/m4/m4/macro.c,v
retrieving revision 1.6
diff -b -u -r1.6 macro.c
--- m4/macro.c 2001/08/20
20:25:25 1.6
+++ m4/macro.c 2001/08/30 21:42:26
@@ -76,13 +76,8 @@
 && SYMBOL_BLIND_NO_ARGS (symbol)
 && !M4_IS_OPEN(m4_peek_input ())))
 {
-#ifdef ENABLE_CHANGEWORD
-      m4_shipout_text (obs, M4_TOKEN_DATA_ORIG_TEXT (td),
-                       strlen (M4_TOKEN_DATA_ORIG_TEXT (td)));
-#else
 m4_shipout_text (obs, M4_TOKEN_DATA_TEXT (td),
 strlen (M4_TOKEN_DATA_TEXT (td)));
-#endif
 }
 else
 expand_macro (symbol);
Index: m4/utility.c
===================================================================
RCS file: /cvsroot/m4/m4/m4/utility.c,v
retrieving revision 1.7
diff -b -u -r1.7 utility.c
--- m4/utility.c 2001/08/20 20:25:25 1.7
+++ m4/utility.c 2001/08/30 21:42:26
@@ -62,9 +62,6 @@
 /* Artificial limit for expansion_level in macro.c. */
 int nesting_limit = 250;
-/* User provided regexp for describing m4 words. */
-const char *user_word_regexp = NULL;
-
 /* If nonzero, comments are discarded in the token parser. */
 int discard_comments = 0;
@@ -96,16 +93,6 @@
 return M4_TOKEN_DATA_TEXT(name);
 }
-char *
-m4_token_data_orig_text (m4_token_data *name)
-{
-#ifdef ENABLE_CHANGEWORD
-  return M4_TOKEN_DATA_ORIG_TEXT(name);
-#else
-  return NULL;
-#endif
-}
-
 m4_builtin_func *
 m4_token_data_func (m4_token_data *name)
 {
Index: modules/Makefile.am
===================================================================
RCS file: /cvsroot/m4/m4/modules/Makefile.am,v
retrieving revision 1.11
diff -b -u -r1.11 Makefile.am
--- modules/Makefile.am 2001/08/27 07:49:30 1.11
+++ modules/Makefile.am 2001/08/30 21:42:28
@@ -31,12 +31,9 @@
 LIBS = $(top_builddir)/m4/libm4.la
 LDFLAGS = -no-undefined
-pkglibexec_LTLIBRARIES = changeword.la gnu.la load.la m4.la \
+pkglibexec_LTLIBRARIES = gnu.la load.la m4.la \
 mpeval.la traditional.la perl.la \
 modtest.la shadow.la stdlib.la time.la
-
-changeword_la_SOURCES = changeword.c
-changeword_la_LDFLAGS = -module
 gnu_la_SOURCES = gnu.c
 EXTRA_gnu_la_SOURCES = format.c
Index: modules/changeword.c
===================================================================
RCS file: changeword.c
diff -N changeword.c
--- /tmp/cvsY22d1C Thu Aug 30 14:42:29 2001
+++ /dev/null Sat Apr 14 17:46:23 2001
@@ -1,95 +0,0 @@
-/* GNU m4 -- A simple macro processor
-   Copyright 2000 Free Software Foundation, Inc.
-
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2 of the License, or
-   (at your option) any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
-   02111-1307 USA
-*/
-
-#if HAVE_CONFIG_H
-# include