Re: [Bug-gnubg] Re: Strange FIBS ratings


From: kvandoel
Subject: Re: [Bug-gnubg] Re: Strange FIBS ratings
Date: Mon, 8 Sep 2003 19:05:01 +0200 (CEST)

On Mon, 8 Sep 2003, Jim Segrave wrote:

> > >>Error rate (per decision)       -13.86 ( -0.036%)    -21.97 ( -0.075%)
> > >>Error based abs. FIBS rating    1936.0               1866.3
> > >>Chequerplay errors rating loss    95.9                131.8
> > >>Cube errors rating loss           18.1                 51.9
> > >
>
> I don't have direct experience of FIBs, but it surprises me that with
> a chequer error rate of about .020/move (two beers behind Joern?), the
> estimated ratings would be 1700 odd in the first match and 1866 in the
> second.

That .02 is NOT a chequer error rate but an overall error rate. That
fact may lessen your surprise.
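
To make the distinction concrete, here is an illustrative Python sketch
with made-up numbers (not gnubg code): the overall rate divides the total
equity lost by all decisions, chequer plays and cube actions together,
while the chequer rate only counts chequer plays.

# Illustrative sketch, not gnubg code.  All equity losses and decision
# counts below are hypothetical numbers chosen only for the example.

def error_rate(total_equity_loss, decisions):
    # average equity given up per decision
    return total_equity_loss / decisions if decisions else 0.0

chequer_loss, chequer_moves = 0.300, 20   # hypothetical chequer-play errors
cube_loss, cube_decisions = 0.200, 5      # hypothetical cube errors

print(f"chequer rate: {error_rate(chequer_loss, chequer_moves):.3f} per move")
print(f"overall rate: "
      f"{error_rate(chequer_loss + cube_loss, chequer_moves + cube_decisions):.3f}"
      f" per decision")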

> I sort of have serious doubts about the practicality of trying to
> correlate a single short match to a rating which is supposed to be
> accumulated over a large number of matches.

Of course it's only an indicator. "If you always played like this, you'd
have a rating of XYZ" is what it indicates. Yet on average, over a large
number of matches, I found the current FIBS estimator to be extremely
accurate.
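
For what it's worth, the estimate amounts to mapping the average equity
lost per decision to a rating. The Python sketch below assumes a simple
linear model; BASELINE and SCALE are placeholder values of my own, not
gnubg's constants, and the real estimator is more involved.

# Minimal sketch of an "if you always played like this" rating estimate.
# Assumes a linear model: a baseline rating for error-free play minus a
# penalty proportional to the average equity lost per decision.  BASELINE
# and SCALE are illustrative placeholders, not gnubg's actual constants.

BASELINE = 2050.0   # assumed rating of an error-free player
SCALE = 8000.0      # assumed rating points lost per unit of error rate

def estimated_rating(error_per_decision):
    return BASELINE - SCALE * abs(error_per_decision)

print(estimated_rating(0.020))   # roughly the .02 discussed above -> 1890.0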

> Questions  have also  been raised  in a  rather acrimonious  thread on
> gammonline  re.  the validity  of  modeling  ratings  using the  noise
> feature in gnubg to simulate human play.

> It would be interesting, though probably impractical even if the data
> were available, to take a *huge* number of matches from FIBS with the
> player's ratings and analyse those matches to try to correlate
> real-world play and error rates with the assigned ratings.

I don't see why it's impractical.

I have analysed about 500 matches from gamesite2000 (I can't do more on
my data set right now because of the illegal .mat import crash bug) and
tabulated the average GNUBG-estimated FIBS rating of every player. A
comparison with the actual ratings of the people involved proved to me
experimentally that the current GNUBG rating estimate is very accurate,
at least as accurate as you can expect from a 0-ply error estimate.
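
In case anyone wants to repeat this, here is roughly the kind of
tabulation I mean, as a Python sketch. It assumes the per-match analysis
results have already been dumped to a CSV with columns player,
estimated_rating, actual_rating, which is a format I made up for the
example.

# Sketch of the tabulation described above: average the per-match
# error-based rating estimate for each player and compare it with the
# player's actual rating.  Assumes a CSV with hypothetical columns:
#   player, estimated_rating, actual_rating
import csv
from collections import defaultdict
from statistics import mean

def tabulate(path):
    estimates = defaultdict(list)
    actual = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            estimates[row["player"]].append(float(row["estimated_rating"]))
            actual[row["player"]] = float(row["actual_rating"])
    for player in sorted(estimates):
        avg = mean(estimates[player])
        print(f"{player:20s} avg estimate {avg:7.1f}  "
              f"actual {actual[player]:7.1f}  diff {avg - actual[player]:+6.1f}")

tabulate("analysed_matches.csv")   # hypothetical file name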

Since the estimate is based on the assumption that real-world playing
errors can be modeled by noise, I would conclude that this (or a similar
experiment) settles the acrimonious thread on gammonline you allude to
above (for which I didn't pay, so I have no idea what they are bickering
about).
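
For those who haven't used it, the idea behind the noise feature is
simply to perturb the evaluation of each candidate play before choosing,
so the bot makes human-like mistakes. Here is a cartoon of that idea in
Python; the evaluate() callback and the Gaussian noise level are my own
assumptions for illustration, not gnubg's actual implementation.

# Cartoon of "modeling playing errors by noise": perturb each candidate
# play's equity with Gaussian noise before picking the best one.  Not
# gnubg's implementation; evaluate() and sigma are assumed for the sketch.
import random

def noisy_choice(candidates, evaluate, sigma=0.05):
    # larger sigma -> bigger and more frequent errors, i.e. a weaker player;
    # sigma=0.0 picks the same play as the unperturbed evaluator
    return max(candidates,
               key=lambda play: evaluate(play) + random.gauss(0.0, sigma))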

If someone gives me a large bag of .mat files, I can analyse them and
tabulate the estimated (error-based) ratings for comparison with the
players' actual ratings, if known.

> In the meantime, I find the one word summaries more useful than the
> estimated ratings when skipping through a long match to find my worst
> blunders. Awful/beginner/casual player (okay, that's two words) are a
> sign that I really ought to have a look at what I did wrong, the rare
> ET matches I can usually skip as far as learning goes.

So the textual summaries need to be brought into line with the rating
estimates.

Kees




