[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[hurd] 59/61: libihash: do not use an integer hash function by default
From: |
Samuel Thibault |
Subject: |
[hurd] 59/61: libihash: do not use an integer hash function by default |
Date: |
Tue, 27 May 2014 08:32:15 +0000 |
This is an automated email from the git hooks/post-receive script.
sthibault pushed a commit to branch upstream
in repository hurd.
commit 5b039a12bf5cfc9c65b8e169ed4503e306f971f3
Author: Justus Winter <address@hidden>
Date: Mon May 26 12:18:08 2014 +0200
libihash: do not use an integer hash function by default
Recently libihash was changed to use an integer hash function on the
keys in an attempt to reduce the rate of collisions (2d898893), which
has long been assumed to be high.
Richard Braun was kind enough to run some benchmarks. He observed:
"1/ Using an extremely simple microbenchmark [1] that merely inserts
keys, either random integers or sequential ones to match the way port
names are managed, it seems that the previous code, despite its
apparent flaws, did quite well.
[1] http://darnassus.sceen.net/gitweb/rbraun/ihtest.git
Using an integer hashing function actually reduces performance on the
sequential integer test case. It makes sense because, considering a
set of consecutive integers starting from 0, and a hash table that
always has more slots than items, a modulo is a perfect hash
function. Even when taking into account that only names for receive
rights are normally managed with libihash, i.e. that keys aren't
actually sequential, they are almost all equally distributed, leading
to very few collisions.
Therefore, as a third option, I've removed the hashing function,
leaving only a fast modulo (an AND) and this variant provided the best
raw results.
2/ I've also built hurd packages multiple times and got average build
times with each variant (previous, new, new without hash function) and
here are the results I obtained respectively : 52m59s, 52m31s, 52m22s."
Do not use the integer hash function on the keys by default.
* libihash/ihash.c (murmur3_mix32): Remove now unused function.
(find_index): Use the fast division method to derive the index.
(add_one): Likewise. Also, update the comment to reflect the current
hashing method.
---
libihash/ihash.c | 22 ++++------------------
1 file changed, 4 insertions(+), 18 deletions(-)
diff --git a/libihash/ihash.c b/libihash/ihash.c
index 4d9cc18..fa29257 100644
--- a/libihash/ihash.c
+++ b/libihash/ihash.c
@@ -32,19 +32,6 @@
#include "ihash.h"
-/* This is the integer finalizer from MurmurHash3. */
-static inline uint32_t
-murmur3_mix32 (uint32_t h, unsigned int bits)
-{
- h ^= h >> 16;
- h *= 0x85ebca6b;
- h ^= h >> 13;
- h *= 0xc2b2ae35;
- h ^= h >> 16;
-
- return h >> (32 - bits);
-}
-
/* Return 1 if the slot with the index IDX in the hash table HT is
empty, and 0 otherwise. */
static inline int
@@ -74,7 +61,7 @@ find_index (hurd_ihash_t ht, hurd_ihash_key_t key)
unsigned int up_idx;
unsigned int mask = ht->size - 1;
- idx = murmur3_mix32 (key, __builtin_ctzl (ht->size));
+ idx = key & mask;
if (ht->items[idx].value == _HURD_IHASH_EMPTY || ht->items[idx].key == key)
return idx;
@@ -205,20 +192,19 @@ hurd_ihash_set_max_load (hurd_ihash_t ht, unsigned int
max_load)
found. The arguments are identical to hurd_ihash_add.
We are using open address hashing. As the hash function we use the
- division method with quadratic probe. This is guaranteed to try
- all slots in the hash table if the prime number is 3 mod 4. */
+ division method with linear probe. */
static inline int
add_one (hurd_ihash_t ht, hurd_ihash_key_t key, hurd_ihash_value_t value)
{
unsigned int idx;
unsigned int first_free;
+ unsigned int mask = ht->size - 1;
- idx = murmur3_mix32 (key, __builtin_ctzl (ht->size));
+ idx = key & mask;
first_free = idx;
if (ht->items[idx].value != _HURD_IHASH_EMPTY && ht->items[idx].key != key)
{
- unsigned int mask = ht->size - 1;
unsigned int up_idx = idx;
do
--
Alioth's /usr/local/bin/git-commit-notice on
/srv/git.debian.org/git/pkg-hurd/hurd.git
- [hurd] 57/61: trans/mtab: fix initialization, (continued)
- [hurd] 57/61: trans/mtab: fix initialization, Samuel Thibault, 2014/05/27
- [hurd] 55/61: libpager: drop unused fields from struct pager, Samuel Thibault, 2014/05/27
- [hurd] 49/61: libdiskfs: lock-less reference counting for peropen objects, Samuel Thibault, 2014/05/27
- [hurd] 30/61: ext2fs: use two distinct pager buckets for the disk and file pager, Samuel Thibault, 2014/05/27
- [hurd] 45/61: include: add lock-less reference counting primitives, Samuel Thibault, 2014/05/27
- [hurd] 58/61: libdiskfs: fix node leak in the name cache, Samuel Thibault, 2014/05/27
- [hurd] 51/61: pfinet: add missing include, Samuel Thibault, 2014/05/27
- [hurd] 38/61: libihash: use fast binary scaling to determine the load, Samuel Thibault, 2014/05/27
- [hurd] 53/61: Avoid compiler warning about empty bodies, Samuel Thibault, 2014/05/27
- [hurd] 60/61: libtrivfs: lock-less reference counting for trivfs_peropen objects, Samuel Thibault, 2014/05/27
- [hurd] 59/61: libihash: do not use an integer hash function by default,
Samuel Thibault <=
- [hurd] 15/61: Include the MIG-generated server header files, Samuel Thibault, 2014/05/27