gnugo-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[gnugo-devel] hash collisions


From: Douglas Ridgway
Subject: [gnugo-devel] hash collisions
Date: Mon, 14 Jun 2004 17:59:02 -0600 (MDT)

On Sat, 15 May 2004, Arend Bayer wrote:

> What would be more useful than discussing the quality of the pseudo random
> numbers generator is measuring the quality of the fixed seed for the
> Zobrist hashing. It is set in init_gnugo() in engine/interface.c.
> 
> Testing the quality of the seed should probably done by computing the
> shortest combinations of hash values that yield zero when XOR'd.

If you write the hash values in a matrix X_ij, where i runs over the
digits 1..MIN_HASHBITS and j runs over colors and board points 1..2*361,
then the hashed value of a position is \sum_j X_ij a_j mod 2, where a_j
are the occupancies of the board points. In other words the Zobrist hash
is a linear transformation, and we are interested in the null space
defined by X a = 0 mod 2. It's easy to work out what this null space is
(Gaussian elimination), but because the null space is large
(2*361-hashbits dimensions), it's not so easy to figure out what's the
smallest element in it. I just looked at the smallest basis vector, but 
there could be some linear combination of basis vectors which is shorter.

For MIN_HASHBITS = 32, there are several nullspace basis vectors of size 
10. Here's one:

(;GM[1]FF[4]SZ[19]AB[na][pa][ra]AW[ea][fa][ja][ka][ma][oa][dc])

which looks like this:

showboard
=
   A B C D E F G H J K L M N O P Q R S T
19 . . . . O O . . . O O . O X O X . X . 19
18 . . . . . . . . . . . . . . . . . . . 18
17 . . . O . . . . . . . . . . . . . . . 17
16 . . . + . . . . . + . . . . . + . . . 16
15 . . . . . . . . . . . . . . . . . . . 15
14 . . . . . . . . . . . . . . . . . . . 14
13 . . . . . . . . . . . . . . . . . . . 13
12 . . . . . . . . . . . . . . . . . . . 12
11 . . . . . . . . . . . . . . . . . . . 11     
10 . . . + . . . . . + . . . . . + . . . 10     
 9 . . . . . . . . . . . . . . . . . . . 9
 8 . . . . . . . . . . . . . . . . . . . 8
 7 . . . . . . . . . . . . . . . . . . . 7
 6 . . . . . . . . . . . . . . . . . . . 6
 5 . . . . . . . . . . . . . . . . . . . 5
 4 . . . + . . . . . + . . . . . + . . . 4
 3 . . . . . . . . . . . . . . . . . . . 3
 2 . . . . . . . . . . . . . . . . . . . 2
 1 . . . . . . . . . . . . . . . . . . . 1
   A B C D E F G H J K L M N O P Q R S T

This hashes to 0.

Informal testing of other random matrices of the same size showed that
this is pretty good:  8 is typical, and 6 was the lowest I saw. I didn't 
work out positions for MIN_HASHBITS = 64, but similar tests of random 
matrices showed 20 as a typical low collision size, and 30 some for 96.

Other notes:

HASHVALUE_PRINT_FORMAT in hash.h should probably define a field length.

Random numbers are good, but may not be absolutely ideal for Zobrist
hashing. For example, consider 0: it's as likely to appear for the hash
value of a point as any other number, but it's a disaster for Zobrist
hashing, as that point would then not ever contribute to the hash, i.e. a
collision of length one. One could imagine using the freedom in picking
nonrandom values to avoid this and other short collisions, or to ensure
that collisions are distributed in some implausible way. Random is simple,
though.

In a web search I found that some chess programmers use fairly short
hashes, and accept a certain level of collisions. Even though this does
cause misevaluations due to false cache hits, it's a fairly small fraction
of nodes which will effect the evaluation of the root node, and can be a
win overall.

doug.
address@hidden






reply via email to

[Prev in Thread] Current Thread [Next in Thread]