ReBeL is an artificial intelligence tool that reportedly teaches itself
A team of researchers at Facebook says it has developed an artificial intelligence (AI) framework, Recursive Belief-based Learning (ReBeL), that plays heads-up no-limit Texas Hold'em poker at a level the team claims is better than human players. The researchers said ReBeL takes the technology one step further toward developing strategies for multi-agent interactions. They suggest the work could pave the way for new tools in other areas such as auctions, negotiations, cybersecurity and self-driving vehicles.
Reinforcement learning combined with search has been used to train AI models and, so far, has led to several significant advances. In reinforcement learning, agents learn to do something by maximizing rewards; search is the process of navigating from a start state to a goal state. The combination has been quite useful for training bots to beat players at games like chess or shogi, where every player can see the full state of the game, but not at imperfect-information games like poker, where some information, such as an opponent's cards, is hidden. In those games, the AI often makes assumptions that do not fit the situation.
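To make the idea of search concrete, here is a minimal sketch of navigating from a start state to a goal state with breadth-first search. It is only an illustration, not the method the researchers used: the function name `shortest_path` and the toy graph `TOY_GRAPH` are assumptions made up for this example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search: return the shortest list of states from
    start to goal, or None if the goal cannot be reached."""
    frontier = deque([[start]])  # paths waiting to be extended
    visited = {start}            # states already queued
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if state == goal:
            return path
        for nxt in graph.get(state, []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None


# Hypothetical toy state graph: each key lists the states reachable from it.
TOY_GRAPH = {
    "start": ["a", "b"],
    "a": ["c"],
    "b": ["goal"],
    "c": ["goal"],
}

print(shortest_path(TOY_GRAPH, "start", "goal"))  # ['start', 'b', 'goal']
```

A search like this assumes every state is fully visible. That assumption holds in chess, but in poker the true state depends on cards the agent cannot see.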
That's what the scientists were trying to fix with ReBeL. The framework expands the notion of "game state" so that it includes each agent's belief about what state it might be in. This belief is based on common knowledge and the policies of the other agents. ReBeL trains two AI models, a value network and a policy network, for these states using reinforcement learning through self-play. It uses both models for search, even during self-play, and the result is a simple, flexible algorithm that, according to the researchers, is capable of defeating top human poker players in a two-player game.
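As a rough illustration of what a "belief" about a hidden state can look like, the sketch below updates a probability distribution over an opponent's possible hands after observing an action, using Bayes' rule and an assumed opponent policy. This is not ReBeL's actual implementation: the function `update_belief`, the two-hand toy game, and all of the probabilities are hypothetical values chosen for the example.

```python
def update_belief(belief, policy, action):
    """Bayes' rule: P(hand | action) is proportional to
    P(action | hand) * P(hand), then normalized to sum to 1."""
    unnormalized = {
        hand: belief[hand] * policy[hand].get(action, 0.0) for hand in belief
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("action has zero probability under the assumed policy")
    return {hand: p / total for hand, p in unnormalized.items()}


# Hypothetical prior: the opponent is equally likely to hold either hand.
PRIOR = {"strong": 0.5, "weak": 0.5}

# Hypothetical policy, assumed to be common knowledge:
# probability of each action given the hand held.
ASSUMED_POLICY = {
    "strong": {"bet": 0.8, "check": 0.2},
    "weak": {"bet": 0.2, "check": 0.8},
}

posterior = update_belief(PRIOR, ASSUMED_POLICY, "bet")
print(posterior)  # belief shifts toward "strong" after seeing a bet
```

In this toy setting, seeing a bet moves the belief in a strong hand from 50% to about 80%. A state that includes such a distribution, rather than a single known position, is the kind of expanded "game state" the paragraph above describes.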