Reinforcement-Learning-TicTacToe

This is some code I wrote to teach an agent to play tic-tac-toe using a simple reinforcement learning algorithm. Every board state reached during a game is recorded in a dictionary along with its value, which is updated according to the equation below.

StateValue = StateValue + LearningRate*(NextStateValue*decay - StateValue)

State values are updated at the end of every game. The final state receives a reward of 1 if that player won, -1 if that player lost, and 0 if the game ended in a tie.
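
As a rough illustration of the end-of-game update, here is a minimal sketch, not the code from RL_tictactoe.py. It assumes the incremental form of the update (the old value plus the learning-rate-scaled difference), that states are stored as hashable keys, and that unseen states start at 0. The names update_values, visited_states, learning_rate and decay are hypothetical.

```python
def update_values(values, visited_states, reward, learning_rate=0.2, decay=0.9):
    """Propagate the final reward backwards through the states of one game.

    values         -- dict mapping a hashable board state to its current value
    visited_states -- states this player reached, in the order they occurred
    reward         -- 1 for a win, -1 for a loss, 0 for a tie
    """
    target = reward  # the last state is updated towards the game result
    for state in reversed(visited_states):
        current = values.get(state, 0.0)  # assumption: unseen states start at 0
        current += learning_rate * (decay * target - current)
        values[state] = current
        target = current  # earlier states are updated towards the later state's value
    return values
```

With this scheme, states closer to the end of the game move towards the reward faster than early states, because the target shrinks by the decay factor at every step back.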

Each player keeps its own hash table of state values and uses it to choose a move by picking the one that leads to the state with the highest value. During training, each player either exploits the hash table by making the best-known move or explores by selecting a random one.
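
The explore-or-exploit choice could look something like the sketch below. It is only an illustration under stated assumptions: the function name choose_move, the explore_rate parameter, and the candidate_states mapping (legal move to resulting state) are hypothetical and not taken from RL_tictactoe.py.

```python
import random


def choose_move(values, candidate_states, explore_rate=0.3):
    """Pick a move by exploring at random or exploiting the value table.

    values           -- dict mapping a hashable board state to its learned value
    candidate_states -- dict mapping each legal move to the state it leads to
    explore_rate     -- probability of ignoring the table and moving at random
    """
    moves = list(candidate_states)
    if random.random() < explore_rate:
        return random.choice(moves)  # explore: any legal move
    # exploit: the move whose resulting state has the highest known value
    # (assumption: states missing from the table count as 0)
    return max(moves, key=lambda move: values.get(candidate_states[move], 0.0))
```

Setting explore_rate to 0 after training makes the player always take the best-known move.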
