chicken-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Chicken-users] Re: Better algorithm for growing hash tables


From: Ed Watkeys
Subject: Re: [Chicken-users] Re: Better algorithm for growing hash tables
Date: Wed, 3 Aug 2005 00:23:22 -0400


On Aug 2, 2005, at 5:23 PM, Toby Butzon wrote:

Oops, I missed the "well designed hash functions" part. If your hash
function is well designed, it shouldn't make any difference. The safety in "just to be safe" seems to be against poorly designed hash functions. :)

Yeah, I was just writing some code in response to the paper. I implemented the hash function from The Practice of Programming and ran it on the example variable names the author used:

; Requires SRFI-1.

(define (my-hash nhash s)
  (reduce
   (lambda (ch hash)
     (remainder (+ ch (* hash 31)) nhash))
   0
   (map char->integer (string->list s))))

(map (lambda (s) (my-hash 128 s))
     (list "a1" "b1" "d1" "X1" "Z1")) ==> (112 15 77 89 23)

If I reduce the size of the table down to four entries, this function produces only two collisions. Not bad, given that there is one more datum than there are table entries.

The author's laughably bad hash function returned 97 for each string. Allegedly, Knuth wrote the following hash function in TAoCP (not in Scheme of course; presumably in his whacked pseudo-assembly language):

(define (dumb-hash nhash s)
  (remainder
   (* 2654435761
      (reduce
       (lambda (ch hash)
         (+ ch (* hash 256)))
       0
       (map char->integer (string->list s))))
   nhash))

(map (lambda (s) (dumb-hash 128 s))
     (list "a1" "b1" "d1" "X1" "Z1")) ==> (97.0 97.0 97.0 97.0 97.0)

A freshman should be able to spend a minute looking at this function and realize how bad it is -- even without seeing a sample run.

When we're building systems, we often think of hash tables as black boxes that "just work" and invariably work in O(n) time. Probably not a good assumption.

Ed

--
Transmogrify, LLC * <http://xmog.com/>





reply via email to

[Prev in Thread] Current Thread [Next in Thread]