Pluribus, Carnegie Mellon’s poker bot, has won an AI award for Tuomas Sandholm, one of its creators.
Sandholm received the 2021 Robert S. Engelmore Memorial Lecture Award from the Association for the Advancement of Artificial Intelligence (AAAI) as recognition for his research in the field of AI and for his service to the AI community.
As well as being part of the team that cracked the complexity of multi-opponent games, Sandholm built the algorithms that currently run the U.S. kidney donor bank. Around 80% of U.S. kidney transplant sites have used them to pair donors and recipients since 2010.
How to think like a killer app
The basic way a bot works is by assessing the current situation and then simulating all the possible moves it could make, then all the possible answers to those moves, then all its possible answers to those answers and so on. After a set number of imagined moves into the future, it looks at the strings of decisions and makes the move that results in the best possible outcome for itself, assuming that its opponent plays the best possible moves for them.
The job of AI programmers like Sandholm is to create an algorithm that simulates all the possible moves efficiently, assessing the results (or end state) accurately.
The more options a bot has, the more simulations it must run. The clearer the metrics for assessment, the more straightforward the process. Poker has enormous complications in the first of those spheres, which Sandholm helped to crack.
Luckily, assessing the outcomes is a bit easier: which move wins the most, or loses the least, lettuce?
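The look-ahead described above is usually called minimax search. Here is a minimal sketch of it, assuming a toy take-away game rather than poker: players alternate removing one to three stones from a pile, and whoever takes the last stone wins. The game, the function names and the scoring (+1 for a bot win, -1 for a loss, 0 when the search runs out of depth) are illustrative assumptions, not anything taken from Pluribus.

```python
def minimax(pile: int, depth: int, bot_to_move: bool) -> int:
    """Score a position from the bot's point of view.

    Assumed toy game: take 1-3 stones per turn; taking the last stone wins.
    """
    if pile == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1 if bot_to_move else 1
    if depth == 0:
        # Search horizon reached: score the position as neutral.
        return 0
    values = [
        minimax(pile - take, depth - 1, not bot_to_move)
        for take in (1, 2, 3)
        if take <= pile
    ]
    # The bot picks its best outcome; the opponent is assumed to pick the bot's worst.
    return max(values) if bot_to_move else min(values)


def best_move(pile: int, depth: int) -> int:
    """Return how many stones the bot should take from `pile`."""
    moves = [take for take in (1, 2, 3) if take <= pile]
    return max(moves, key=lambda take: minimax(pile - take, depth - 1, False))


if __name__ == "__main__":
    print(best_move(5, 10))  # stones to take from a pile of five
```

Every extra option per turn multiplies the number of positions the recursion has to visit, which is why the number of available moves matters so much.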
The clockwork ticking
In 1997 it was chess. Garry Kasparov lost the deciding game of his rematch against Deep Blue.
In 2016, Go was next. Lee Sedol lost his match against AlphaGo 4-1. It was almost a whitewash, but something seemed to go badly wrong with AlphaGo’s search function in game 4.
The team behind AlphaGo has since moved on to long-unsolved scientific problems like protein folding, leaving the playing fields to younger AIs.
But it left one stone unturned. That of poker. Poker may be a less elegant game than Go, but that is precisely what made it the next holy grail for AI research.
Go is played by two players. Both players know the full state of the board. Poker, on the other hand, can have up to ten people per table. And players are never privy to an opponent’s hand until the showdown. Information is messy and incomplete.
It is this incomplete information that makes poker hard for an AI to handle. Each bit of information missing from the board makes for an extra set of simulations. The software has to run all of them to an end state before it can decide what to do. That costs time.
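To get a feel for how quickly hidden cards multiply the work, the sketch below counts the hole-card combinations the opponents could be holding. It assumes Texas hold'em with a 52-card deck and two hole cards each, counted before the flop when the bot knows only its own two cards; the function name and parameters are made up for illustration.

```python
from math import comb


def hidden_hand_combinations(opponents: int, known_cards: int = 2,
                             deck_size: int = 52, hole_cards: int = 2) -> int:
    """Count the ways the unseen cards could be dealt to the opponents.

    Assumes Texas hold'em and that the bot sees only `known_cards` cards.
    """
    remaining = deck_size - known_cards
    total = 1
    for _ in range(opponents):
        # Each opponent holds some pair drawn from the cards still unseen.
        total *= comb(remaining, hole_cards)
        remaining -= hole_cards
    return total


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, "opponent(s):", hidden_hand_combinations(n))
```

One opponent could hold any of 1,225 hands; add a second and the count passes 1.3 million, before a single bet has been considered.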
An additional complication with poker comes in table-stakes games. Two factors dictate the number of possible bet sizes: the players’ effective stack size and the increments in which a player can bet. A $10,000 effective stack allows 1,000,000 possible bet sizes if you can bet in one-cent increments.
To counter this, Sandholm’s Pluribus simply ran its simulations using a limited set of possible bet sizes. For example, with effective stacks of 100bb, it might consider just 20 bet sizes, spaced in 5bb increments.
This greatly reduces the computing power required.
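The arithmetic behind that saving can be sketched in a few lines. The numbers come from the examples above; the function names are hypothetical, and the evenly spaced grid is only the simplified scheme described here, not a claim about the bet sizes Pluribus actually uses.

```python
def possible_bet_sizes(stack_cents: int, increment_cents: int) -> int:
    """Number of distinct bet sizes available from a stack at a given increment."""
    return stack_cents // increment_cents


def abstract_bet_sizes(stack_bb: int, step_bb: int) -> list[int]:
    """Evenly spaced bet sizes, in big blinds, up to the full stack."""
    return list(range(step_bb, stack_bb + 1, step_bb))


if __name__ == "__main__":
    full = possible_bet_sizes(10_000 * 100, 1)  # $10,000 in one-cent steps
    grid = abstract_bet_sizes(100, 5)           # 100bb in 5bb steps
    print(full, "raw bet sizes versus", len(grid), "abstract ones:", grid)
```

Searching 20 sizes instead of a million shrinks every betting decision in the tree, and the savings compound with each round of betting.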
It was enough to let Pluribus’s forefather, Libratus, crush the best players heads-up. In June 2019, Pluribus was able to do the same in the chaotic world of six-max.
A year is a long time in Silicon Valley. There’s already a new shark in the AI player pool. This one is called ReBeL, and it does everything Pluribus does from a smaller starting blueprint. And it’s owned by Facebook.
Enjoy your prize while you can, Sandholm. The singularity is coming.