Re: New backgammon engine: wildbg


From: Carsten Wenderdel
Subject: Re: New backgammon engine: wildbg
Date: Wed, 22 Nov 2023 12:05:18 +0000

>> The current strength is roughly 1800 Elo, or an error rate of roughly 
>> 7.5 mEMG when analyzed with GnuBG.
> 
> May I suggest that you don't measure your bot by what other bots think of it 
> but by how they perform against it in terms of real win/loss results.

Agreed. I haven’t had time to implement that yet; help is welcome.


>> So far only the classic TD-Gammon inputs have been implemented, the expert 
>> features from Hans Berliner are missing.
>> It’s also still learning without expert features, so I’m optimistic that 
>> there is lots of room for improvement.
> 
> As far as I know TD-Gammon was initially trained through random cubeless, 
> matchless, (even backgammonless) self-play; that is only single games with 1 
> and 2 point wins.
> 
> Is this how you are training your bot?

Only the neural net inputs are "TD". The outputs also include backgammon 
probabilities.
No TD learning or other reinforcement learning is used. Instead, I use cubeless 
money game rollouts, followed by supervised learning.
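
To illustrate how rollout results can become supervised-learning labels, here is a minimal Python sketch. It assumes six mutually exclusive outcomes per game (normal win, gammon, backgammon, for either side). The outcome names and the functions `rollout_target` and `cubeless_equity` are hypothetical and are not taken from the wildbg code.

```python
from collections import Counter

# Points won by the player on roll for each outcome (cubeless money game).
# Assumption: the six outcomes are mutually exclusive and cover every game.
OUTCOME_POINTS = {
    "win_normal": 1, "win_gammon": 2, "win_bg": 3,
    "lose_normal": -1, "lose_gammon": -2, "lose_bg": -3,
}

def rollout_target(results):
    """Turn a list of rolled-out game outcomes into a probability vector.

    The returned dict can serve as the training label for one position.
    """
    if not results:
        raise ValueError("need at least one rollout result")
    counts = Counter(results)
    unknown = set(counts) - set(OUTCOME_POINTS)
    if unknown:
        raise ValueError(f"unknown outcomes: {sorted(unknown)}")
    n = len(results)
    return {name: counts[name] / n for name in OUTCOME_POINTS}

def cubeless_equity(probs):
    """Expected points per game implied by the outcome probabilities."""
    return sum(probs[name] * points for name, points in OUTCOME_POINTS.items())

# Toy data: 12 made-up game results for a single position.
results = (["win_normal"] * 6 + ["win_gammon"] * 2
           + ["lose_normal"] * 3 + ["lose_bg"] * 1)
target = rollout_target(results)
equity = cubeless_equity(target)
```

A net trained on such targets predicts the outcome probabilities directly; the equity is then derived from them rather than learned as a separate output.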


>> 4. Usage of 3rd party machine learning libraries instead of self written 
>> neural nets
>> This allows the use of, for example, PyTorch and the ONNX format. Changing the 
>> topology of the nets (ReLU instead of sigmoid, adding hidden layers) becomes 
>> very easy.
> 
> I find this the most interesting to comment about. I have no idea about the 
> specific technologies mentioned but the end result should ideally be an 
> Alpha-Zero type bot that learns through purely random cubeful and matchful 
> self-play; that is without human bias introduced by applying cube formulas to 
> cubeless equities and without using score tables for match equities.
> 
> The CPU power that didn't exist in the days of TD-Gammon does exist today, 
> even if it may not be accessible to just anybody. But there is a way to 
> transition using average desktop CPU power, by doing what I call "clinical 
> rollouts" done through purely random cubeful and matchful trials.
> 
> Gnu-BG's lowest player strength with maximum noise setting comes close to 
> this but not quite. Even though it would be trivial to implement this in 
> Gnu-BG, they are refusing to do it most likely for the fear that it would 
> prove to be self-destructive for Gnu-BG.
> 
> Random checker and cube decisions go so fast that in the same amount of time 
> that it takes to do a typical 1,296 trials, you can do 1,296,000 trials on an 
> average desktop PC.
> 
> So, even if you can't train your new bot through Alpha-Zero type learning, 
> you can at least implement Alpha-Zero type rollouts to double-check how an 
> Alpha-Zero BG bot would have played individual positions one at a time.
> 
> Over time, those "clinical rollouts" will accumulate and demonstrate that 
> none of the currently existing BG bots are good enough and that there is a 
> need for better BG bots. If you understand and agree with what I am 
> explaining, your bot will be the first one to progress in that direction.

Alpha Zero uses the same net for evaluation of positions and moves. For that 
you need thousands of outputs to encode each legal move. I don’t see how this 
is feasible in backgammon.
I currently don’t plan to do anything other than cubeless money game rollouts. If 
someone with programming skills wants to try something different themselves, 
I’m willing to help, for example by providing the move generation code.
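
For contrast with a net that encodes moves in its outputs, here is a minimal Python sketch of the approach common in backgammon bots: score the position each legal move leads to with a value function and pick the best. The names `best_successor` and `equity_of` are hypothetical, and `equity_of` is assumed to return the equity from the point of view of the player who just moved.

```python
from typing import Callable, Iterable, TypeVar

Position = TypeVar("Position")

def best_successor(successors: Iterable[Position],
                   equity_of: Callable[[Position], float]) -> Position:
    """Return the successor position with the highest equity for the mover.

    The net only needs to evaluate positions, so no output has to
    encode a move; the move generator supplies the candidates.
    """
    candidates = list(successors)
    if not candidates:
        raise ValueError("no legal moves to choose from")
    return max(candidates, key=equity_of)

# Toy stand-in for a neural net: a lookup table of made-up equities.
toy_equities = {"A": -0.12, "B": 0.31, "C": 0.05}
choice = best_successor(toy_equities, toy_equities.__getitem__)
```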


>> I don’t want to abuse this mailing list too much; if you are interested 
>> please join this Discord server:
>> https://discord.gg/Ckvf7y8nGz
> 
> Can you allow read-only access for the general public without having to 
> create an account?


I can’t.


Best, Carsten

