guile-user
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Does anyone have a better scm_string_hash ?


From: Ludovic Courtès
Subject: Re: Does anyone have a better scm_string_hash ?
Date: Fri, 14 Nov 2003 16:51:55 +0100
User-agent: Mutt/1.5.4i [Guile enabled]

You might want to look at
http://srfi.schemers.org/srfi-13/mail-archive/msg00112.html .

Basically, the idea there is that a hash key for string cn..c0 is:
     h = (c0 + c1*37 + c2*37^2 + ...) % hash_size.
I used that and it *seems* to work quite well.

This can be written as follows:

  k = 0;
  while (*string)
    {
      k  = (k * 37) + (*string);
      k %= hash_size;
      string++;
    }
  return k;

Cheers,
Ludovic.

Today, 15 minutes, 40 seconds ago, Roland Orre wrote:
> I noticed that our data mining software run very very slow with a
> new data base. I localized the problem to scm_string_hash.
> 
> A hash table in this case was loaded with 14166 strings. I have
> a function which creates a reasonable sized hash table, in this
> case the hash table size was 8209.
> 
> 13856 of these strings were hashed to the same index= 1067.
> 303 strings got index = 8061.
> 2 strings got the index = 754.
> 8201 entries were empty.
> 
> We are running guile 1.6 but I checked the scm_string_hash from a recent
> 1.7 CVS also and the function in hash.c there is identical.
> 
> I added a few of the symbols hashing to 1067 below. One can of course
> argue that the symbols in this case should be hashed as numbers.
> Anyway, does anyone have any hint or have a better string hash function?
> 
>       Best regards
>       Roland Orre
> 
> 
> A few strings hashing to entry 1067 for hash table length 8209:
> "01632001" "01627301" "01626801" "01626601" "01626501" "01626401"
> "01626301" "01626101" "01626001" "01625901" "01625801" "01625701"
> "01625401" "01625301" "01625101" "01625001" "01624801" "01624601"
> "01624501" "01624401" "01624301" "01624101" "01624001" "01623901"
> "01623801" "01623701" "01623601" "01623501" "01623401" "01623201"
> "01623101" "01622901" "01622801" "01622701" "01622601" "01622401"
> 
> 
> 
> 
> _______________________________________________
> Guile-user mailing list
> address@hidden
> http://mail.gnu.org/mailman/listinfo/guile-user




reply via email to

[Prev in Thread] Current Thread [Next in Thread]