Re: [Bug-gnubg] Confidence intervals from rollouts


From: Douglas Zare
Subject: Re: [Bug-gnubg] Confidence intervals from rollouts
Date: Thu, 5 Sep 2002 13:36:57 -0400
User-agent: Internet Messaging Program (IMP) 3.1

Quoting David Montgomery <address@hidden>:

> Nis Jorgensen wrote:
> > David Montgomery wrote:
> > > Consider a single position, for which we have 3 rollout
> > > samples A, B, and C.  The idea of rotating the first ply
> > > or two is that the variance of the *difference* between
> > > two plays should be reduced, since one aspect of the
> > > luck has been eliminated.
> >
> > The idea of rotating the first ply or two should be to reduce the
> > difference between the "true" value of a position and the results of
> > rollouts.
> 
> I agree with you.  I misspoke.

It would be good to have a rollout whose point is to determine how much better 
doubling is than redoubling, rather than to determine the absolute equities of 
each. That's what one does by hand if one plays the position as a prop with two 
cubes, though of course the checker play can depend on the cube position. One 
doesn't necessarily get an estimate of the absolute equity. For example, one 
can call it a wash when the smaller cube is redoubled and accepted. It also 
makes sense to have rollouts whose purpose is to determine the difference 
between plays, without determining the equity of either one.
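
As a sketch of what a difference-only rollout could look like (a toy model,
not gnubg's rollout code; play_game below is a hypothetical stand-in for
playing one trial under a given cube treatment with given dice), one would
keep only the per-trial difference and report its mean and spread:

    import random
    import statistics

    def play_game(cube_treatment, shared_luck, rng):
        # Hypothetical stand-in for one trial: the shared luck of the dice,
        # plus a small edge for doubling, plus whatever the shared dice do
        # not determine (modeled here as independent noise).
        edge = 0.05 if cube_treatment == "double" else 0.0
        return shared_luck + edge + rng.gauss(0.0, 0.1)

    def difference_rollout(trials=1296, seed=0):
        # Play each trial once per cube treatment with the same dice (the
        # shared_luck term) and record only the difference.  The shared
        # luck cancels, so we learn how much better doubling is without
        # learning either absolute equity.
        rng = random.Random(seed)
        diffs = []
        for _ in range(trials):
            shared_luck = rng.gauss(0.0, 1.0)
            diffs.append(play_game("double", shared_luck, rng)
                         - play_game("redouble", shared_luck, rng))
        mean = statistics.mean(diffs)
        stderr = statistics.stdev(diffs) / trials ** 0.5
        return mean, stderr

The reported standard error then reflects only what the shared dice do not
determine, which is exactly the point of comparing the two treatments head
to head.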

> > The variance (or rather standard error) is just a _measure_ of how much we
> > trust the result, and reducing the value is not a goal in itself.

Ok, I think it would be useful to clarify the definitions we are encountering
here. I see at least three variances. First, there is the actual variance of
the equity estimate produced by a rollout scheme. Second, there is the
variance of a variable distributed uniformly among the trials of a single
rollout, with or without the n/(n-1) adjustment, which is what we use to
estimate the first. Third, we can talk about the average value of that second
quantity over the rollout scheme, as opposed to its actual output in any one
rollout. (It's possible that another measure of the second quantity makes more
sense, like the average of the square roots.)

I'm going to call these the real variance, the observed variance, and the ideal 
sample variance. Please feel free to override these with better names; I 
research probability (among other things) but I'm not very familiar with 
statistics, and certainly not with statistical conventions.
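
To make the second quantity concrete, here is a minimal sketch (plain Python,
not the code gnubg actually uses) of the observed variance of one rollout,
with and without the n/(n-1) adjustment:

    def observed_variance(results, adjusted=True):
        # results: the equity results of the n trials of a single rollout.
        # Returns the variance of a variable distributed uniformly over
        # those trials; adjusted=True applies the n/(n-1) correction.
        n = len(results)
        mean = sum(results) / n
        ss = sum((x - mean) ** 2 for x in results)
        return ss / (n - 1) if adjusted else ss / n

The real variance, by contrast, can only be seen by repeating the whole
rollout and looking at the spread of the resulting estimates, and the ideal
sample variance is the average of observed_variance over such repetitions.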

> > This is very important to stress, especially in cases like this, where we
> > should expect the standard error to go _up_ even though the actual
> > trustworthiness of the rollout should improve
> 
> I believe I completely missed this point until now.  Thanks.
> 
> Hmmm.... but perhaps I am still missing it.  Because now I
> can't see how the standard error will go up; at least not
> the "standard error" that I am thinking about.

By this change of the rollout scheme, the real variance should go down, but the 
ideal sample variance will increase. I think one can view those as separate 
effects, and that's why there might be disagreement about whether the variance 
increases or decreases. This just restates, in a different way, a lot of what
I snipped.
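
Here is a toy simulation of the distinction (a made-up model, not a gnubg
rollout: the result of a trial is a function of its first roll plus
independent noise standing in for everything after it):

    import random
    import statistics

    # Toy first-roll contributions; any fixed table of 36 values will do.
    FIRST_ROLL_VALUE = {r: ((r % 7) - 3) / 10.0 for r in range(36)}

    def one_trial(rng, first_roll):
        # Result of one trial: first-roll contribution plus noise standing
        # in for everything after the first roll.
        return FIRST_ROLL_VALUE[first_roll] + rng.gauss(0.0, 0.1)

    def rollout(rng, n=1296, rotate=False):
        # rotate=True cycles evenly through the 36 first rolls;
        # rotate=False draws the first roll at random each trial.
        results = [one_trial(rng, i % 36 if rotate else rng.randrange(36))
                   for i in range(n)]
        return statistics.mean(results), statistics.variance(results)

    rng = random.Random(1)
    for rotate in (False, True):
        runs = [rollout(rng, rotate=rotate) for _ in range(200)]
        real_var = statistics.variance([est for est, _ in runs])
        ideal_sample_var = statistics.mean([sv for _, sv in runs])
        print(rotate, real_var, ideal_sample_var)

In this toy model the spread of the 200 estimates collapses once the first
roll is rotated, while the per-trial sample variance stays essentially where
it was; the two quantities are measured differently and need not move
together, which is where the apparent disagreement comes from.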

> But I would say that the true standard error is
> actually lower.  Maybe there is something else I
> should call this.  What I mean is that, if we do our
> 1296 game rollout many times, we can actually gather
> statistics directly on the spread of the rollouts.
> And if our technique is variance reducing, the spread
> will be less.  A quick thought experiment to demonstrate
this is to consider 1296 game rollouts with and without
> rotating the first two ply, truncated after two rolls.
> The true standard error of this rollout is 0 when you
> rotate the rolls.  You get the exact answer every time.

Equivalently, one could imagine that the variance reduction works perfectly 
after the first two rolls. Anyway, this is how I've been trying to test the 
effectiveness of stratification for my next GammonVillage column, which I plan 
to be an introduction to the normal distribution. 
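
Numerically, the truncated thought experiment above looks like this
(truncated_value is a hypothetical stand-in for whatever the evaluator would
return after the two rolls; any fixed function of the two rolls makes the
point):

    import random
    import statistics

    def truncated_value(roll1, roll2):
        # Hypothetical stand-in for the evaluation of the position reached
        # after the two rolls (numbered 0..35); any fixed function works.
        return ((roll1 % 6) - (roll2 % 6)) * 0.01 + (roll1 // 6) * 0.002

    exact = statistics.mean(truncated_value(r1, r2)
                            for r1 in range(36) for r2 in range(36))

    def rollout_1296(rotate, rng):
        if rotate:
            # Rotate the first two ply: every two-roll sequence exactly once.
            seqs = [(r1, r2) for r1 in range(36) for r2 in range(36)]
        else:
            seqs = [(rng.randrange(36), rng.randrange(36))
                    for _ in range(1296)]
        return statistics.mean(truncated_value(r1, r2) for r1, r2 in seqs)

    rng = random.Random(0)
    rotated = [rollout_1296(True, rng) for _ in range(10)]
    sampled = [rollout_1296(False, rng) for _ in range(10)]
    # Every entry of `rotated` equals `exact`; the entries of `sampled`
    # scatter around it.

As in the quoted example, the true standard error of the rotated rollout is
zero: it returns the exact truncated value every time.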

Let's suppose that the luck estimates are perfect after two rolls, but perhaps 
imperfect in the first two rolls. Then one level of stratification will have 
the effect of making the luck estimates perfect on the first roll, and the 
errors will just be from the second roll.

If the estimates on the first roll were already good, but the luck estimates on 
the second roll were terrible, then we have not gained much. If inaccurate luck 
estimates on the first roll were the problem, then we would gain a lot. 
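
One way to say the same thing: writing X for the (luck-adjusted) result of a
trial, Var(X) = Var( E[X | first roll] ) + E[ Var(X | first roll) ].
Stratifying on the first roll removes the first term from the real variance of
the rollout estimate; the second term, which is where an imperfect luck
estimate for the second roll lives, is untouched.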

> > Also, I don't understand why you bring "different plays" into this. This is
> > of course relevant for the "duplicate dice" evaluation, but not for the
> > rotation (for which I would reserve the word "stratification").

There is a similar issue in that the real variance of the difference between 
plays A and B may decrease. In this case the ideal sample variance also 
decreases, in most positions. However, the extent of the decrease is unclear. 
If you are deciding whether to leave a lethal shot 7 or 8 away, or 4 vs. 5 
away, then giving the same dice may increase both variances. Making the dice 
correlated will not change either variance for the estimates of the equities of 
the individual plays, of course.
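
For the record, the identity behind this is
Var(A - B) = Var(A) + Var(B) - 2 Cov(A, B): giving both plays the same dice
leaves Var(A) and Var(B) alone and only changes Cov(A, B). When the plays
respond to the rolls in the same direction the covariance is positive and the
difference is pinned down more tightly; when the same roll tends to be good
for one play and bad for the other (hitting a shot left 7 away but missing one
left 8 away, say), the covariance can go negative and the variance of the
difference grows.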

Douglas Zare




