m4-discuss
index [was: m4 regex usage]


From: Eric Blake-1
Subject: index [was: m4 regex usage]
Date: Mon, 1 Oct 2007 14:39:42 -0700 (PDT)

> 2007-09-29  Eric Blake  <address@hidden>
> 
>       Optimize for Autoconf usage pattern.
>       * src/builtin.c (m4_regexp, m4_patsubst): Handle empty regex
>       faster.

This one is less noticeable, but now that Autoconf is doing more
single-character index searches, we might as well cater to that, too.
It's appalling how many platforms still provide O(n*m) strstr
implementations, especially when you consider that Knuth-Morris-Pratt
was published in the 1970s; but even on platforms with O(n+m) strstr,
strchr tends to be faster for the special case of a one-byte needle.

From: Eric Blake <address@hidden>
Date: Mon, 1 Oct 2007 14:30:25 -0600
Subject: [PATCH] Another Autoconf usage pattern optimization.

* src/builtin.c (m4_index): Optimize search for one byte.
* doc/m4.texinfo (Index macro, Regexp, Patsubst): Test the new
code paths.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog      |    7 +++++++
 doc/m4.texinfo |   21 ++++++++++++++++++---
 src/builtin.c  |   19 +++++++++++++++----
 3 files changed, 40 insertions(+), 7 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index f29b557..396a64f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2007-10-01  Eric Blake  <address@hidden>
+
+       Another Autoconf usage pattern optimization.
+       * src/builtin.c (m4_index): Optimize search for one byte.
+       * doc/m4.texinfo (Index macro, Regexp, Patsubst): Test the new
+       code paths.
+
 2007-09-29  Eric Blake  <address@hidden>
 
        Optimize for Autoconf usage pattern.
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index a24d36e..6c76b7b 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -4610,12 +4610,17 @@ index(`gnus, gnats, and armadillos', `dag')
 @result{}-1
 @end example
 
-Omitting @var{substring} evokes a warning, but still produces output.
+Omitting @var{substring} evokes a warning, but still produces output;
+contrast this with an empty @var{substring}.
 
 @example
 index(`abc')
 @error{}m4:stdin:1: Warning: too few arguments to builtin `index'
 @result{}0
+index(`abc', `')
+@result{}0
+index(`abc', `b')
+@result{}1
 @end example
 
 @node Regexp
@@ -4688,12 +4693,17 @@ regexp(`abc', `\(\(d\)?\)\(c\)', `\1\2\3\4\5\6')
 @result{}c
 @end example
 
-Omitting @var{regexp} evokes a warning, but still produces output.
+Omitting @var{regexp} evokes a warning, but still produces output;
+contrast this with an empty @var{regexp} argument.
 
 @example
 regexp(`abc')
 @error{}m4:stdin:1: Warning: too few arguments to builtin `regexp'
 @result{}0
+regexp(`abc', `')
+@result{}0
+regexp(`abc', `', `def')
+@result{}def
 @end example
 
 @node Substr
@@ -4904,12 +4914,17 @@ patreg(`aba abb 121', `\(.\)\(.\)\1', `\2\1\2')
 @result{}bab
 @end example
 
-Omitting @var{regexp} evokes a warning, but still produces output.
+Omitting @var{regexp} evokes a warning, but still produces output;
+contrast this with an empty @var{regexp} argument.
 
 @example
 patsubst(`abc')
 @error{}m4:stdin:1: Warning: too few arguments to builtin `patsubst'
 @result{}abc
+patsubst(`abc', `')
+@result{}abc
+patsubst(`abc', `', `-')
+@result{}-a-b-c-
 @end example
 
 @node Format
diff --git a/src/builtin.c b/src/builtin.c
index 65f4585..8974b1a 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -1677,8 +1677,9 @@ static void
 m4_index (struct obstack *obs, int argc, token_data **argv)
 {
   const char *haystack;
-  const char *result;
-  int retval;
+  const char *needle;
+  const char *result = NULL;
+  int retval = -1;
 
   if (bad_argc (argv[0], argc, 3, 3))
     {
@@ -1689,8 +1690,18 @@ m4_index (struct obstack *obs, int argc, token_data **argv)
     }
 
   haystack = ARG (1);
-  result = strstr (haystack, ARG (2));
-  retval = result ? result - haystack : -1;
+  needle = ARG (2);
+
+  /* Optimize searching for the empty string (always 0) and one byte
+     (strchr tends to be more efficient than strstr).  */
+  if (!needle[0])
+    retval = 0;
+  else if (!needle[1])
+    result = strchr (haystack, *needle);
+  else
+    result = strstr (haystack, needle);
+  if (result)
+    retval = result - haystack;
 
   shipout_int (obs, retval);
 }
-- 
1.5.3.2
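
For anyone who wants to try the dispatch outside of m4, here is a
minimal standalone sketch of the same three-way split (empty needle,
one-byte needle, general case).  The name index_of and its long return
type are hypothetical and chosen only for illustration; the real code
lives in m4_index and reports its result through shipout_int.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of m4: return the byte offset of the
   first occurrence of NEEDLE in HAYSTACK, or -1 if there is none.  */
long
index_of (const char *haystack, const char *needle)
{
  const char *result;

  if (!needle[0])
    return 0;                   /* Empty needle always matches at 0.  */
  if (!needle[1])
    result = strchr (haystack, needle[0]);  /* One byte: strchr.  */
  else
    result = strstr (haystack, needle);     /* General case.  */

  return result ? (long) (result - haystack) : -1;
}
```

With this sketch, index_of ("abc", "") gives 0, index_of ("abc", "b")
gives 1, and a needle that does not occur gives -1, which matches what
the index builtin prints for the corresponding doc examples.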


-- 
View this message in context: 
http://www.nabble.com/Re%3A-Multi-Line-Definitions-tf4540504.html#a12988525
Sent from the Gnu - M4 - Discuss mailing list archive at Nabble.com.




