Lately, I have been working on artificial intelligence. I recently read the article “Mastering the Game of Go with Deep Neural Networks and Tree Search”, which introduces how to combine the best of deep neural networks and search algorithms to achieve what was once thought impossible.
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence because of its enormous search space and the difficulty of evaluating board positions and moves.
The authors of this research article introduced several new approaches. In one of them, the AI Go player uses value networks and policy networks. In addition, the authors introduced a new search algorithm that combines Monte Carlo simulation with the value and policy networks.
- “Value networks” are deep neural networks that evaluate board positions.
- “Policy networks” are deep neural networks that select moves.
- “Monte Carlo tree search” is a search algorithm that, in AlphaGo, combines neural network evaluations with the outcomes of Monte Carlo rollouts.
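To make the idea above concrete, here is a heavily simplified sketch of how a search could combine policy priors, value estimates, and Monte Carlo rollouts. This is not AlphaGo's actual algorithm or code: the "game" is a single three-way decision, and `policy_network`, `value_network`, and `rollout` are hypothetical stand-ins for the real deep networks and playouts.

```python
import math
import random

# Toy "game": a single decision among moves 0..2; move 2 is objectively best.
# Hypothetical stand-ins for the paper's networks (not the real AlphaGo models):
def policy_network(state):
    """Prior probability P(s, a) for each move (here: uniform)."""
    return {a: 1.0 / 3 for a in range(3)}

def value_network(state, move):
    """Scalar evaluation of the position reached after `move`."""
    return {0: 0.1, 1: 0.4, 2: 0.9}[move]

def rollout(state, move):
    """Fast, noisy Monte Carlo playout result for the position after `move`."""
    return value_network(state, move) + random.uniform(-0.2, 0.2)

def mcts(state, n_simulations=500, c_puct=1.0, lam=0.5):
    priors = policy_network(state)
    N = {a: 0 for a in priors}        # visit counts
    W = {a: 0.0 for a in priors}      # total action value
    for _ in range(n_simulations):
        total = sum(N.values())
        # Selection: prefer high value (Q) plus an exploration bonus that
        # is proportional to the prior and decays with the visit count.
        def score(a):
            q = W[a] / N[a] if N[a] else 0.0
            u = c_puct * priors[a] * math.sqrt(total + 1) / (1 + N[a])
            return q + u
        a = max(priors, key=score)
        # Evaluation: mix the value network with a rollout outcome.
        leaf_value = (1 - lam) * value_network(state, a) + lam * rollout(state, a)
        N[a] += 1
        W[a] += leaf_value
    # Play the most-visited move.
    return max(N, key=N.get)
```

A single-ply search like this omits the actual tree of positions, but it shows the core loop: select by prior plus value, evaluate with networks and rollouts, and finally play the most-visited move.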
These two deep neural networks (the value networks and the policy networks) are trained by a novel combination of supervised learning from human expert games and reinforcement learning from games of self-play. Using this search algorithm together with the networks, AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion 5 games to 0. This was the first time in history that a computer program defeated a human professional player in the full-size game of Go.
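The two-stage training recipe (supervised learning from expert moves, then reinforcement learning from self-play) can be illustrated with a toy sketch. This is not AlphaGo's training code: the "game" is a single three-move decision, a softmax over a table of logits stands in for a deep policy network, and the expert move and win/loss reward are invented for illustration.

```python
import math
import random

random.seed(0)  # reproducible toy run
MOVES = [0, 1, 2]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for a, p in zip(MOVES, probs):
        acc += p
        if r < acc:
            return a
    return MOVES[-1]

logits = [0.0, 0.0, 0.0]  # toy stand-in for policy network weights
lr = 0.1

# Stage 1: supervised learning from "expert" games (our expert always plays 2).
for _ in range(200):
    expert_move = 2
    probs = softmax(logits)
    # Cross-entropy gradient step: push probability toward the expert move.
    for a in MOVES:
        target = 1.0 if a == expert_move else 0.0
        logits[a] += lr * (target - probs[a])

# Stage 2: reinforcement learning by self-play (REINFORCE-style update).
def reward(move):
    return 1.0 if move == 2 else -1.0   # toy win/loss signal

for _ in range(200):
    probs = softmax(logits)
    a = sample(probs)
    z = reward(a)
    # Increase the log-probability of moves that led to a win (z > 0),
    # decrease it for moves that led to a loss (z < 0).
    for m in MOVES:
        grad = (1.0 if m == a else 0.0) - probs[m]
        logits[m] += lr * z * grad
```

After both stages the policy concentrates nearly all its probability on the winning move, mirroring the paper's recipe in miniature: imitation first, then improvement from game outcomes.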
To sum up, Go has an enormous search space, and evaluating its board positions and moves is difficult. The computer program AlphaGo nevertheless made a significant accomplishment, defeating a human professional player in the full-size game of Go for the first time.
Previously, Deep Blue was the chess machine that defeated World Chess Champion Garry Kasparov in a six-game match in 1997. In its match against Fan Hui, AlphaGo evaluated thousands of times fewer positions than Deep Blue did against Kasparov, compensating by selecting positions more intelligently with the policy networks and evaluating them more precisely with the value networks, an approach arguably closer to how humans play.
AlphaGo reached a professional level in Go for the first time by combining Monte Carlo tree search with policy and value networks. This accomplishment offers real hope that human-level performance can be achieved in other seemingly intractable artificial intelligence domains.
- “AlphaGo” is a narrow-AI computer program developed by Alphabet Inc.’s Google DeepMind in London to play the board game Go. In October 2015, it became the first computer Go program to beat a human professional Go player without handicaps on a full-sized 19×19 board.
- “Deep Blue” is the chess machine that defeated World Chess Champion Garry Kasparov in 1997.
- “Fan Hui” is the European Go champion.
References: Silver, D. et al., “Mastering the Game of Go with Deep Neural Networks and Tree Search,” Nature 529, 484–489 (2016).