Re: New backgammon engine: wildbg

bug-gnubg


From:	Murat K
Subject:	Re: New backgammon engine: wildbg
Date:	Fri, 1 Dec 2023 15:21:46 -0700
User-agent:	Mozilla Thunderbird



On 11/22/2023 5:05 AM, Carsten Wenderdel wrote:

As far as I know TD-Gammon was initially trained through random cubeless, 
matchless, (even backgammonless) self-play; that is only single games with 1 
and 2 point wins.
Is this how you are training your bot?

Only the neural net inputs are „TD“. The outputs also include backgammon 
probabilities.


Okay, real rather than extrapolated backgammon probabilities are an improvement.

No TD-learning or other reinforcement learning is used. Instead I use cubeless 
money game rollouts, followed by supervised learning.
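To make "cubeless money game rollouts" concrete, here is a minimal sketch of how a set of rollout results could be tallied into outcome probabilities and a cubeless equity. The six category names and the sample results are illustrative assumptions, not taken from wildbg; the only fact relied on is that a cubeless money game is worth 1, 2 or 3 points for a normal win, gammon or backgammon.

```python
from collections import Counter

# Points from the roller's point of view for each cubeless game result.
# The category names are hypothetical, chosen for this sketch only.
POINTS = {
    "win_normal": 1, "win_gammon": 2, "win_bg": 3,
    "lose_normal": -1, "lose_gammon": -2, "lose_bg": -3,
}


def outcome_probabilities(results):
    """Turn a list of rollout results into relative frequencies."""
    if not results:
        raise ValueError("need at least one rollout result")
    counts = Counter(results)
    unknown = set(counts) - set(POINTS)
    if unknown:
        raise ValueError(f"unknown result categories: {sorted(unknown)}")
    total = len(results)
    # Report every category, including those that never occurred.
    return {name: counts[name] / total for name in POINTS}


def cubeless_equity(probs):
    """Expected points per game when no doubling cube is in play."""
    return sum(POINTS[name] * p for name, p in probs.items())


# Made-up results of ten rollout games from one position.
results = (["win_normal"] * 5 + ["win_gammon"] * 2
           + ["lose_normal"] * 2 + ["lose_gammon"] * 1)
probs = outcome_probabilities(results)
equity = cubeless_equity(probs)
```

In a rollout-then-supervised-learning setup, frequencies like `probs` would presumably serve as the training targets for the position that was rolled out; that reading is an assumption based on the description above.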

I don't understand the concept of a "cubeless money game". How is that different from a plain single game (which some people erroneously call a "1-pointer" even though it allows 2- and 3-point wins)?

Alpha Zero uses the same net for evaluation of positions and moves. For that 
you need thousands of outputs to encode each legal move. I don’t see how this 
is feasible in backgammon.
I currently don’t plan to do anything other than cubeless money game rollouts. If 
someone with programming skills wants to try something different themselves, 
I’m willing to help, for example providing the move generation code.

Perhaps my use of "Alpha-Zero" was misleading or misunderstood. I meant learning through entirely random decisions, without human bias injected through cube equity formulas or match equity tables.

With that hopefully clarified, why not train your bot using "cubeful money game rollouts"? It should be feasible, since rollouts with random cube decisions as well as random checker decisions go really fast. This would also give real rather than extrapolated cubeful equities. To me, "cubeless money game" sounds nonsensical, if not an oxymoron.

I was also suggesting that even if you don't train your bot now using random cube and checker decisions, you should later give it the ability to do such random rollouts, at least one position at a time, in order to see how an "unbiased bot" would have played it.

MK
