m4-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: poor m4 hash performance


From: Eric Blake
Subject: Re: poor m4 hash performance
Date: Fri, 28 Jul 2006 00:20:08 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)

Eric Blake <ebb9 <at> byu.net> writes:

> > That algorithm could be improved.  M4 should do all its hash
> > computation using size_t, not a mixture of int and size_t.  And it
> > shouldn't slow itself down to do case-folding, since M4 is
> > case-sensitive.  And it shouldn't assume that int is 32 bits wide.
> 
> Agreed (I had already noticed that using a signed hash was wrong, but had
> not picked up on the case-folding).  I will go ahead and commit your patch.
> 

Ported to head as follows:

2006-07-27  Paul Eggert  <address@hidden>  (tiny change)

        * m4/hash.c (m4_hash_string_hash): Don't case-fold in the hash
        function. Shift by 7, not 3, for consistency with
        gnulib/lib/hash.c. Don't assume hash word is 32 bits.

Index: m4/hash.c
===================================================================
RCS file: /sources/m4/m4/m4/hash.c,v
retrieving revision 1.17
diff -u -p -r1.17 hash.c
--- m4/hash.c   1 May 2005 11:10:05 -0000       1.17
+++ m4/hash.c   28 Jul 2006 00:17:45 -0000
@@ -1,5 +1,5 @@
 /* GNU m4 -- A simple macro processor
-   Copyright (C) 2001 Free Software Foundation, Inc.
+   Copyright (C) 2001, 2006 Free Software Foundation, Inc.
    Written by Gary V. Vaughan <address@hidden>
 
    This program is free software; you can redistribute it and/or modify
@@ -160,7 +160,7 @@ bucket_delete (m4_hash *hash, size_t i)
   --HASH_LENGTH (hash);
 
   NODE_NEXT (node)     = free_list;
-  free_list            = BUCKET_NTH (hash, i);
+  free_list            = BUCKET_NTH (hash, i);
   BUCKET_NTH (hash, i) = 0;
 }
 
@@ -544,21 +544,17 @@ m4_hash_apply (m4_hash *hash, m4_hash_ap
 /* Using a string as the hash key is common enough that we provide
    implementations here for use in client hash table routines.  */
 
-/* Return a hash value for a string, from GNU Emacs.  */
+/* Return a hash value for a string, consistent with gnulib's hash
+   module.  */
 size_t
 m4_hash_string_hash (const void *key)
 {
-  int val = 0;
+  size_t val = 0;
   const char *ptr = (const char *) key;
   char ch;
 
   while ((ch = *ptr++) != '\0')
-    {
-      if (ch >= 0140)
-       ch -= 40;
-      val = ((val << 3) + (val >> 28) + ch);
-    };
-  val = (val < 0) ? -val : val;
+    val = (val << 7) + (val >> (sizeof (val) * CHAR_BIT - 7)) + ch;
   return val;
 }
 






reply via email to

[Prev in Thread] Current Thread [Next in Thread]