bug-gnubg

Re: [Bug-gnubg] Re: The importance of METs


From: Douglas Zare
Subject: Re: [Bug-gnubg] Re: The importance of METs
Date: Thu, 4 Sep 2003 19:19:11 -0400
User-agent: Internet Messaging Program (IMP) 3.2.2

Quoting Jim Segrave <address@hidden>:

> On Wed 03 Sep 2003 (14:15 -0400), Douglas Zare wrote:
> > Quoting address@hidden:
> > 
> > > On Tue, 2 Sep 2003, Joseph Heled wrote:
> > > 
> > > > Here are the numbers.
> > > > E1 (woolsey wins both) - 31747
> > > > E2 (mec26 wins both)   - 32067
> > > > E3                     - 186186
> > > >
> > > > The verdict is?
> > 
> > Is this what has been touted as a 1.2% improvement? I would not conclude
> > that from those numbers.
> 
> The 1.2% was someone misplacing a decimal point in a surfeit of
> enthusiasm - the actual value suggested is 0.12 percent.

OK. I'm not sure that I see enough accuracy to say 0.12% rather than anywhere
from 0.0% to 0.2%, but I'll trust that someone has gone through that carefully. However, the
Woolsey-Heinrich MET is a straw man. Woolsey says he doesn't use it (for
extreme scores), and there are scores which seem to be quite wrong, such as for
3-away 4-away. If you have a new MET that is supposed to be an improvement over
what is out there, why not test it against METs people believe, or at least
better ones?
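
To make the accuracy question concrete, here is a rough sketch of the
standard error implied by the counts quoted above. It assumes that each trial
is a pair of matches and that E3 counts the pairs in which each side won one
match; that reading of E3 is an assumption, not something stated in the
thread. The function name is made up for illustration. Under that reading the
estimated edge is about 0.13% with a standard error of about 0.10%, so the
interval comfortably includes zero.

```python
import math

# Counts quoted earlier in the thread.
# Assumption: E3 is the number of pairs in which each side won one match.
E1_WOOLSEY_WINS_BOTH = 31747
E2_MEC26_WINS_BOTH = 32067
E3_SPLIT = 186186


def paired_advantage(e1, e2, e3):
    """Return (pairs, mean edge for mec26, standard error of that mean).

    Each pair is scored +1 if mec26 wins both matches, -1 if woolsey wins
    both, and 0 if the pair is split. The mean of that score equals the
    difference between the two sides' match winning rates.
    """
    n = e1 + e2 + e3
    mean = (e2 - e1) / n
    # The score is +1 or -1 exactly (e1 + e2) times, so E[score^2] = (e1+e2)/n.
    variance = (e1 + e2) / n - mean ** 2
    std_error = math.sqrt(variance / n)
    return n, mean, std_error


n, mean, std_error = paired_advantage(
    E1_WOOLSEY_WINS_BOTH, E2_MEC26_WINS_BOTH, E3_SPLIT
)
low = mean - 1.96 * std_error
high = mean + 1.96 * std_error

print(f"pairs: {n}")
print(f"edge: {100 * mean:.3f}% +/- {100 * std_error:.3f}% (one standard error)")
print(f"approx. 95% interval: {100 * low:.3f}% to {100 * high:.3f}%")
```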

If you just want to be able to report as large a (correct) advantage as
possible, you might want to use match lengths at which the defects in the W-H
table show up more. Rather than use a percentage that does not make sense
out of context, why not translate the advantage into Elo points?
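
As a sketch of what that translation could look like, the snippet below
inverts a FIBS-style rating formula, in which the favourite's chance of
winning an N-point match is 1 - 1/(10^(D*sqrt(N)/2000) + 1) for a rating
difference D. Using that particular formula is an assumption, the match
length is not given in the thread (the 7 below is purely hypothetical), and
the function names are made up for illustration.

```python
import math


def win_probability(rating_diff, match_length):
    """Favourite's winning chance under the assumed FIBS-style formula."""
    exponent = rating_diff * math.sqrt(match_length) / 2000.0
    return 1.0 - 1.0 / (10.0 ** exponent + 1.0)


def rating_difference(win_prob, match_length):
    """Invert win_probability: rating points implied by a winning chance."""
    odds = win_prob / (1.0 - win_prob)
    return 2000.0 / math.sqrt(match_length) * math.log10(odds)


# Winning rate for mec26 from the quoted counts, assuming E3 counts split pairs.
pairs = 31747 + 32067 + 186186
mec26_wins = 2 * 32067 + 186186
p = mec26_wins / (2 * pairs)

HYPOTHETICAL_MATCH_LENGTH = 7  # not stated in the thread

points = rating_difference(p, HYPOTHETICAL_MATCH_LENGTH)
print(f"winning rate {p:.5f} -> about {points:.2f} rating points")
```

The implied difference shrinks as the match length grows, which is one reason
a bare percentage is hard to interpret without knowing the match length.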

> > > I expected the correlation to be much higher - I am surprised that the
> > > MET used influences the outcome of more than a quarter of matches
> > > (although these MET's are much more different than Snowie and mec26)
> > 
> > Better variance reduction may fix this. If I understand your methodology,
> > if the length of a game but not the result depends on the MET, then the
> > rest of the match should be only slightly more correlated than independent
> > trials starting at the resulting match score. If so, you may find a greater
> > correlation if you make the rolls of each game independent of the number of
> > moves made up to that point. You could test why the matches diverge, too.

I guess I was wrong in assuming that the rolls were generated randomly for each
pair of matches rather than for each pair of games. Why not generate them
randomly for each match score, rather than for each game? There must be a lot
of transposition: the two matches in a pair will often reach the same score by
different sequences of games.
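
One way to key the dice to the match score is sketched below. It is only an
illustration, not how the test harness in this thread works: the function
names and the pair_id parameter are hypothetical. The seed for each game is
derived from the pair of matches and the score at which the game starts, so
both matches in a pair see the same rolls whenever they arrive at the same
score, however they got there.

```python
import hashlib
import random


def dice_for_score(pair_id, away_a, away_b):
    """Return a generator whose rolls depend only on the pair and the score.

    pair_id identifies the pair of matches being compared (hypothetical);
    away_a and away_b are the points each side still needs. A match passes
    through any given score at most once, so no further counter is needed.
    """
    key = f"{pair_id}:{away_a}:{away_b}".encode()
    # Hash the key so that nearby scores get unrelated seeds.
    seed = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return random.Random(seed)


def first_rolls(rng, count):
    """Draw the next `count` rolls as (die1, die2) tuples."""
    return [(rng.randint(1, 6), rng.randint(1, 6)) for _ in range(count)]


# Both matches of pair 17 reach 3-away 4-away, possibly by different routes,
# and from that point on they are fed identical dice.
match_one = first_rolls(dice_for_score(17, 3, 4), 5)
match_two = first_rolls(dice_for_score(17, 3, 4), 5)
print(match_one == match_two)
```

Rolls are still consumed in order within a game, so two games that start at
the same score but differ in length share only their common prefix of rolls.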

Douglas Zare




