[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Bug-gnubg] Feature request: Rollout frequency based on std. errors
From: |
Christopher Yep |
Subject: |
[Bug-gnubg] Feature request: Rollout frequency based on std. errors |
Date: |
Mon, 31 Aug 2009 05:05:58 -0400 |
I don't know how difficult this would be to program, but one possible
enhancement (which the user can optionally select) for rolling out
positions is to rollout moves (candidate plays) with large standard errors
more frequently than moves with small standard errors.
For example, a user may want to rollout four moves (A, B, C, and D). Moves
A and B may usually lead to complex (future) positions which gnubg has a
hard time evaluating, while moves C and D may usually lead to simple
positions (e.g. holding games, races, etc.) which gnubg can easily
evaluate. So, variance reduction is much less effective for moves A and B
than for moves C and D. It may be the case that the standard errors of
moves A and B are about 4 times as large as the standard errors of moves C
and D (for an N-game rollout of each move).
Instead of rolling out all four moves N times, this new feature would
rollout A and B approximately 16 (i.e. 4*4) times more frequently than C
and D, resulting in approximately equal standard errors for each move at
the conclusion of the rollout (and even during the rollout, based on the
algorithms below).
For a single thread, the algorithm works as follows (for the example of
rolling out 4 moves):
1. Rollout A, B, C, and D two times each.
2. Check the standard errors of each move. Rollout one additional game of
the move with the highest standard error.
3. Repeat step 2 until the rollout is complete.
Note that "set rollout trials" could correspond to (1) the number of times
to rollout move A (the first play selected), (2) the average number of
rolled out games per play, (3) the maximum number of games to rollout each
play, or something else. To avoid the possibility of unexpected long
rollouts (i.e. if move A has extremely low standard errors), I suggest
choosing (2) or (3).
For multiple threads, the algorithm is similar: when a thread needs a new
game to rollout, assign it to rollout the move which currently has the
highest standard error.
If all of the above results in too much overhead (I doubt this is the case,
but I suppose it's possible), then the single-thread algorithm could be
carried out in batches of 36 games:
1. Rollout A, B, C, and D 36 times each.
2. Check the standard errors of each move. Rollout 36 additional games of
the move with the highest standard error.
3. Repeat step 2 until the rollout is complete.
(The multiple-thread algorithm would have to be tweaked a little, but the
idea is essentially the same.)
Chris
- [Bug-gnubg] Feature request: Rollout frequency based on std. errors,
Christopher Yep <=