gnugo-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [gnugo-devel] hash_1_28.4: revision of hash_1_28.3


From: Arend Bayer
Subject: Re: [gnugo-devel] hash_1_28.4: revision of hash_1_28.3
Date: Sun, 24 Mar 2002 22:02:07 +0100 (CET)

On Sun, 24 Mar 2002, Inge Wallin wrote:

> Arend wrote:
> > I first considered setting the minimum number of hash bits to 96. But I did
> > not get a single mistake/assertion failure when running "make second_batch"
> > with 32 bit hashing. So this suggests that we would get far less than one
> > mistake per game with 32 bit hashing (note that a hash collision does not
> > at all imply a wrong read result lookup yet: the routine, the string in
> > question and stackp have to agree as well). So with 64 bits, we should get
> > less than one mistake per 10^10 games (if we trust the random generator).
> 
> According to a paper that I recently read, the probability of an error
> in a hash table like this can be expressed as:
> 
>                2
>               N
>               --
>               2M
>         1 - e
> 
> (or 1-exp(N^2 / 2M) for those of you with a non-fixed width font)
> 
> where
>   N = the number of positions entered into the hash table during a game
>   M = the number of unique hash keys.
> 
> I will let you do the calculation since I don't know what N might be
> today.  M is, of course, 2^32, 2^64 or 2^96, whatever you choose.
> Does this formula agree with your calculation above?

The number of hash collision is indeed easy to compute. I think for GNU Go
the formula should be 1 - exp( - k * N^2 / M ) \approx k * N^2 / 2M,
where k is the number of moves per game, N is the average number of
positions entered per move, t the average number of hash positions stored,
and M as you say.
I think N is less than 4*10^5, k is 2*10^2. k*N^2/2 is hence  < 10^13

M is 4* 10^9 for 32 bits and 10^19 for 64 bits.

This means we get less than one 64bit hash collision in 10^6 games.

However, it is not possible to guess how likely it is that a wrong read
result lookup is once we have a hash collision, and also you cannot tell
how often this will cause a change in the genmove decision.

Arend




reply via email to

[Prev in Thread] Current Thread [Next in Thread]