bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

please keep 'memmem' module simple


From: Simon Josefsson
Subject: please keep 'memmem' module simple
Date: Tue, 08 Jan 2008 15:47:20 +0100
User-agent: Gnus/5.110007 (No Gnus v0.7) Emacs/22.1 (gnu/linux)

I reviewed the memmem situation for gnutls, and the recent changes are
somewhat large.  Memmem now pulls in:

! lgl/m4/eealloc.m4
! lgl/m4/malloca.m4
! lgl/m4/memchr.m4
! lgl/m4/memcmp.m4
! lgl/m4/memmem.m4
! lgl/malloca.c
! lgl/malloca.h
! lgl/malloca.valgrind
! lgl/memchr.c
! lgl/memcmp.c
! lgl/memmem.c

What's worse is that the memmem module actually gets compiled on glibc
systems, causing the size increase on all users on our primary platform.

The 'memmem' function in glibc isn't documented to be in linear speed,
and the old complexity hasn't shown itself to be a bottleneck in gnutls.
The current gnulib implementation thus seems to offer something more
than just a replacement function from glibc.

Thus, I think it is appropriate to rename the current efficient 'memmem'
gnulib module to 'memmem-fast' and revert to the glibc implementation in
the gnulib 'memmem' module.

I don't want to belittle the work that went into the memmem
optimization, it seems to be very fine work.  I just think that gnulib
users should have a choice of whether they need a basic memmem or an
optimized memmem.

Thoughts?

The patch below may be a bit difficult to parse, but the idea is to
rename the current memmem module to memmem-fast and to re-add the old
memmem function.  And fixing the description fields for the modules, and
the maintainer field (who wants to be gnulib maintainer of the fast
version?  I don't think it makes sense for it to read either glibc or
me).

/Simon

diff --git a/lib/memmem.c b/lib/memmem.c
index 5aa704c..e882222 100644
--- a/lib/memmem.c
+++ b/lib/memmem.c
@@ -1,4 +1,4 @@
-/* Copyright (C) 1991,92,93,94,96,97,98,2000,2004,2007,2008 Free Software 
Foundation, Inc.
+/* Copyright (C) 1991,92,93,94,96,97,98,2000,2004,2007 Free Software 
Foundation, Inc.
    This file is part of the GNU C Library.
 
    This program is free software; you can redistribute it and/or modify
@@ -21,149 +21,24 @@
 
 #include <stddef.h>
 #include <string.h>
-#include <stdbool.h>
-
-#include "malloca.h"
 
 #ifndef _LIBC
 # define __builtin_expect(expr, val)   (expr)
 #endif
 
-/* Knuth-Morris-Pratt algorithm.
-   See http://en.wikipedia.org/wiki/Knuth-Morris-Pratt_algorithm
-   Return a boolean indicating success.  */
-
-static bool
-knuth_morris_pratt (const unsigned char *haystack,
-                    const unsigned char *last_haystack,
-                    const unsigned char *needle, size_t m,
-                    const unsigned char **resultp)
-{
-  /* Allocate the table.  */
-  size_t *table = (size_t *) nmalloca (m, sizeof (size_t));
-  if (table == NULL)
-    return false;
-  /* Fill the table.
-     For 0 < i < m:
-       0 < table[i] <= i is defined such that
-       forall 0 < x < table[i]: needle[x..i-1] != needle[0..i-1-x],
-       and table[i] is as large as possible with this property.
-     This implies:
-     1) For 0 < i < m:
-          If table[i] < i,
-          needle[table[i]..i-1] = needle[0..i-1-table[i]].
-     2) For 0 < i < m:
-          rhaystack[0..i-1] == needle[0..i-1]
-          and exists h, i <= h < m: rhaystack[h] != needle[h]
-          implies
-          forall 0 <= x < table[i]: rhaystack[x..x+m-1] != needle[0..m-1].
-     table[0] remains uninitialized.  */
-  {
-    size_t i, j;
-
-    /* i = 1: Nothing to verify for x = 0.  */
-    table[1] = 1;
-    j = 0;
-
-    for (i = 2; i < m; i++)
-      {
-       /* Here: j = i-1 - table[i-1].
-          The inequality needle[x..i-1] != needle[0..i-1-x] is known to hold
-          for x < table[i-1], by induction.
-          Furthermore, if j>0: needle[i-1-j..i-2] = needle[0..j-1].  */
-       unsigned char b = needle[i - 1];
-
-       for (;;)
-         {
-           /* Invariants: The inequality needle[x..i-1] != needle[0..i-1-x]
-              is known to hold for x < i-1-j.
-              Furthermore, if j>0: needle[i-1-j..i-2] = needle[0..j-1].  */
-           if (b == needle[j])
-             {
-               /* Set table[i] := i-1-j.  */
-               table[i] = i - ++j;
-               break;
-             }
-           /* The inequality needle[x..i-1] != needle[0..i-1-x] also holds
-              for x = i-1-j, because
-                needle[i-1] != needle[j] = needle[i-1-x].  */
-           if (j == 0)
-             {
-               /* The inequality holds for all possible x.  */
-               table[i] = i;
-               break;
-             }
-           /* The inequality needle[x..i-1] != needle[0..i-1-x] also holds
-              for i-1-j < x < i-1-j+table[j], because for these x:
-                needle[x..i-2]
-                = needle[x-(i-1-j)..j-1]
-                != needle[0..j-1-(x-(i-1-j))]  (by definition of table[j])
-                   = needle[0..i-2-x],
-              hence needle[x..i-1] != needle[0..i-1-x].
-              Furthermore
-                needle[i-1-j+table[j]..i-2]
-                = needle[table[j]..j-1]
-                = needle[0..j-1-table[j]]  (by definition of table[j]).  */
-           j = j - table[j];
-         }
-       /* Here: j = i - table[i].  */
-      }
-  }
-
-  /* Search, using the table to accelerate the processing.  */
-  {
-    size_t j;
-    const unsigned char *rhaystack;
-    const unsigned char *phaystack;
-
-    *resultp = NULL;
-    j = 0;
-    rhaystack = haystack;
-    phaystack = haystack;
-    /* Invariant: phaystack = rhaystack + j.  */
-    while (phaystack != last_haystack)
-      if (needle[j] == *phaystack)
-       {
-         j++;
-         phaystack++;
-         if (j == m)
-           {
-             /* The entire needle has been found.  */
-             *resultp = rhaystack;
-             break;
-           }
-       }
-      else if (j > 0)
-       {
-         /* Found a match of needle[0..j-1], mismatch at needle[j].  */
-         rhaystack += table[j];
-         j -= table[j];
-       }
-      else
-       {
-         /* Found a mismatch at needle[0] already.  */
-         rhaystack++;
-         phaystack++;
-       }
-  }
+#undef memmem
 
-  freea (table);
-  return true;
-}
-
-/* Return the first occurrence of NEEDLE in HAYSTACK.  Return HAYSTACK
-   if NEEDLE_LEN is 0, otherwise NULL if NEEDLE is not found in
-   HAYSTACK.  */
+/* Return the first occurrence of NEEDLE in HAYSTACK.  */
 void *
-memmem (const void *haystack_start, size_t haystack_len,
-       const void *needle_start, size_t needle_len)
+memmem (haystack, haystack_len, needle, needle_len)
+     const void *haystack;
+     size_t haystack_len;
+     const void *needle;
+     size_t needle_len;
 {
-  /* Abstract memory is considered to be an array of 'unsigned char' values,
-     not an array of 'char' values.  See ISO C 99 section 6.2.6.1.  */
-  const unsigned char *haystack = (const unsigned char *) haystack_start;
-  const unsigned char *needle = (const unsigned char *) needle_start;
-  const unsigned char *last_haystack = haystack + haystack_len;
-  const unsigned char *last_needle = needle + needle_len;
+  const char *begin;
+  const char *const last_possible
+    = (const char *) haystack + haystack_len - needle_len;
 
   if (needle_len == 0)
     /* The first occurrence of the empty string is deemed to occur at
@@ -175,82 +50,12 @@ memmem (const void *haystack_start, size_t haystack_len,
   if (__builtin_expect (haystack_len < needle_len, 0))
     return NULL;
 
-  /* Use optimizations in memchr when possible.  */
-  if (__builtin_expect (needle_len == 1, 0))
-    return memchr (haystack, *needle, haystack_len);
-
-  /* Minimizing the worst-case complexity:
-     Let n = haystack_len, m = needle_len.
-     The naïve algorithm is O(n*m) worst-case.
-     The Knuth-Morris-Pratt algorithm is O(n) worst-case but it needs a
-     memory allocation.
-     To achieve linear complexity and yet amortize the cost of the
-     memory allocation, we activate the Knuth-Morris-Pratt algorithm
-     only once the naïve algorithm has already run for some time; more
-     precisely, when
-       - the outer loop count is >= 10,
-       - the average number of comparisons per outer loop is >= 5,
-       - the total number of comparisons is >= m.
-     But we try it only once.  If the memory allocation attempt failed,
-     we don't retry it.  */
-  {
-    bool try_kmp = true;
-    size_t outer_loop_count = 0;
-    size_t comparison_count = 0;
-
-    /* Speed up the following searches of needle by caching its first
-       byte.  */
-    unsigned char b = *needle++;
-
-    for (;; haystack++)
-      {
-       if (haystack == last_haystack)
-         /* No match.  */
-         return NULL;
-
-       /* See whether it's advisable to use an asymptotically faster
-          algorithm.  */
-       if (try_kmp
-           && outer_loop_count >= 10
-           && comparison_count >= 5 * outer_loop_count)
-         {
-           /* See if needle + comparison_count now reaches the end of
-              needle.  */
-           if (comparison_count >= needle_len)
-             {
-               /* Try the Knuth-Morris-Pratt algorithm.  */
-               const unsigned char *result;
-               if (knuth_morris_pratt (haystack, last_haystack,
-                                       needle - 1, needle_len, &result))
-                 return (void *) result;
-               try_kmp = false;
-             }
-         }
-
-       outer_loop_count++;
-       comparison_count++;
-       if (*haystack == b)
-         /* The first byte matches.  */
-         {
-           const unsigned char *rhaystack = haystack + 1;
-           const unsigned char *rneedle = needle;
-
-           for (;; rhaystack++, rneedle++)
-             {
-               if (rneedle == last_needle)
-                 /* Found a match.  */
-                 return (void *) haystack;
-               if (rhaystack == last_haystack)
-                 /* No match.  */
-                 return NULL;
-               comparison_count++;
-               if (*rhaystack != *rneedle)
-                 /* Nothing in this round.  */
-                 break;
-             }
-         }
-      }
-  }
+  for (begin = (const char *) haystack; begin <= last_possible; ++begin)
+    if (begin[0] == ((const char *) needle)[0] &&
+       !memcmp ((const void *) &begin[1],
+                (const void *) ((const char *) needle + 1),
+                needle_len - 1))
+      return (void *) begin;
 
   return NULL;
 }
diff --git a/m4/memmem-fast.m4 b/m4/memmem-fast.m4
index 9767354..2a764c2 100644
--- a/m4/memmem-fast.m4
+++ b/m4/memmem-fast.m4
@@ -4,7 +4,7 @@ dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
 dnl with or without modifications, as long as this notice is preserved.
 
-AC_DEFUN([gl_FUNC_MEMMEM],
+AC_DEFUN([gl_FUNC_MEMMEM_FAST],
 [
   dnl Persuade glibc <string.h> to declare memmem().
   AC_REQUIRE([AC_USE_SYSTEM_EXTENSIONS])
@@ -50,4 +50,4 @@ AC_DEFUN([gl_FUNC_MEMMEM],
 ])
 
 # Prerequisites of lib/memmem.c.
-AC_DEFUN([gl_PREREQ_MEMMEM], [:])
+AC_DEFUN([gl_PREREQ_MEMMEM_FAST], [:])
diff --git a/m4/memmem.m4 b/m4/memmem.m4
index 9767354..e6a3da1 100644
--- a/m4/memmem.m4
+++ b/m4/memmem.m4
@@ -1,5 +1,5 @@
-# memmem.m4 serial 7
-dnl Copyright (C) 2002, 2003, 2004, 2007, 2008 Free Software Foundation, Inc.
+# memmem.m4 serial 5
+dnl Copyright (C) 2002, 2003, 2004, 2007 Free Software Foundation, Inc.
 dnl This file is free software; the Free Software Foundation
 dnl gives unlimited permission to copy and/or distribute it,
 dnl with or without modifications, as long as this notice is preserved.
@@ -14,37 +14,6 @@ AC_DEFUN([gl_FUNC_MEMMEM],
   AC_CHECK_DECLS_ONCE(memmem)
   if test $ac_cv_have_decl_memmem = no; then
     HAVE_DECL_MEMMEM=0
-  else
-    AC_CACHE_CHECK([whether memmem works in linear time],
-      [gl_cv_func_memmem_works],
-      [AC_RUN_IFELSE([AC_LANG_PROGRAM([
-#include <string.h> /* for memmem */
-#include <stdlib.h> /* for malloc */
-#include <unistd.h> /* for alarm */
-], [[size_t m = 1000000;
-    char *haystack = (char *) malloc (2 * m + 1);
-    char *needle = (char *) malloc (m + 1);
-    void *result = 0;
-    /* Failure to compile this test due to missing alarm is okay,
-       since all such platforms (mingw) also lack memmem.  */
-    alarm (5);
-    if (haystack && needle)
-      {
-       memset (haystack, 'A', 2 * m);
-       haystack[2 * m] = 'B';
-       memset (needle, 'A', m);
-       needle[m] = 'B';
-       result = memmem (haystack, 2 * m + 1, needle, m + 1);
-      }
-    return !result || !memmem ("a", 1, 0, 0);]])],
-       [gl_cv_func_memmem_works=yes], [gl_cv_func_memmem_works=no],
-       [dnl pessimistically assume the worst, since even glibc 2.6.1
-        dnl has quadratic complexity in its memmem
-        gl_cv_func_memmem_works=no])])
-    if test $gl_cv_func_memmem_works = no; then
-      REPLACE_MEMMEM=1
-      AC_LIBOBJ([memmem])
-    fi
   fi
   gl_PREREQ_MEMMEM
 ])
diff --git a/modules/memmem b/modules/memmem
index 738cd6a..c51ee30 100644
--- a/modules/memmem
+++ b/modules/memmem
@@ -8,10 +8,6 @@ m4/memmem.m4
 Depends-on:
 extensions
 string
-stdbool
-malloca
-memchr
-memcmp
 
 configure.ac:
 gl_FUNC_MEMMEM
diff --git a/modules/memmem-fast b/modules/memmem-fast
index 738cd6a..dea243b 100644
--- a/modules/memmem-fast
+++ b/modules/memmem-fast
@@ -1,5 +1,5 @@
 Description:
-memmem() function: locate first substring in a buffer.
+memmem() function: locate first substring in a buffer in an efficient manner.
 
 Files:
 lib/memmem.c
@@ -14,7 +14,7 @@ memchr
 memcmp
 
 configure.ac:
-gl_FUNC_MEMMEM
+gl_FUNC_MEMMEM_FAST
 gl_STRING_MODULE_INDICATOR([memmem])
 
 Makefile.am:
@@ -26,4 +26,4 @@ License:
 LGPLv2+
 
 Maintainer:
-libc, Simon Josefsson
+?




reply via email to

[Prev in Thread] Current Thread [Next in Thread]