This was the week... AI took on the poker pros and won

The issue of AI-powered poker has been in the news again recently, with various stories swirling around about bots beating low and mid-stakes poker games online.

AI can do a lot of things - creating the image above, to choose an obvious example - but could a bot take on and defeat high-stakes, professional heads-up specialists? That was the question a team of academics at Carnegie Mellon University (CMU) were hoping to answer this week in 2017 using Libratus, a computer program designed to play heads-up no-limit hold’em.

Poker pros Jason Les, Dong Kyu Kim, Daniel McAulay and Jimmy Chou formed the team of human players. As Phil Galfond put it at the time, “Les, Kim, McAulay and Chou are among the very best heads-up no-limit Texas hold’em players in the world. Your favorite poker player almost surely wouldn't agree to play any of these guys for high stakes, and would lose a lot of money if they did. Each of the four would beat me decisively.”

Carnegie Mellon University's Tuomas Sandholm

In the other corner was Libratus, programmed by CMU Professor of Computer Science Tuomas Sandholm and Ph.D. student Noam Brown.

Harnessing the computational power of the Pittsburgh Supercomputing Center, they used around 15 million core hours of computation to set Libratus up to devise a winning heads-up strategy. “We don’t write the strategy,” Sandholm stated, “we write the algorithm that computes the strategy.”

The two factions would compete in a challenge dubbed ‘Brains Vs. Artificial Intelligence: Upping the Ante’, and the result took everyone by surprise.

Claudico: the first challenger

First, a little context. The computer scientists at CMU had initially designed a program called Tartanian, which in 2014 was pitted against a number of rival AI programs in a heads-up tournament, and won. The following year saw them reveal Tartanian’s successor, Claudico, which built on the foundational work they had already completed.

Poker player Doug Polk, pictured at a WPT event in 2014

Claudico - Latin for ‘I limp’, as pre-flop limping formed a big part of its strategy - was used in a 2015 challenge against four top human pros: Dong Kyu Kim and Jason Les, who would go on to compete against Libratus, Hong Kong high-roller Bjorn Li, and Doug Polk, who at the time was one of the world’s top-rated heads-up players.

80,000 hands of poker were played in that first ‘Brains Vs. Artificial Intelligence’ challenge in 2015, with each of the humans playing 20,000 hands against the AI. The team of human players collectively won over $732k from Claudico (theoretically - they did not play for real stakes but each received a share of $100k based on their performance), after around $170 million was bet during those 80,000 hands.

The margin was not regarded as statistically significant enough to declare a clear win for the human team, and so the challenge was deemed a ‘statistical tie’.

Libratus: a new challenger appears

Two years later, the CMU team returned with the next generation of their AI poker bot, named Libratus. As with Claudico, the name was chosen carefully to denote an element of its strategy; while ‘Claudico’ means ‘I limp’ in Latin, ‘Libratus’ means ‘balanced’.

New technology powering Libratus was intended to drive the program towards playing a perfectly balanced game theory optimal (GTO) strategy to attain a Nash equilibrium. A new approach to endgame strategies was also devised in order to help the program make fewer obvious and/or exploitable errors on the river.

A change to the format was also implemented, with the setting up of duplicate matches. This meant that Libratus would receive the same cards as a human playing a separate match against it, and vice versa, allowing for direct comparisons between the AI and human results.
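To see why the duplicate format matters, here is a minimal toy sketch in Python. It is not the CMU team's methodology; the function name `simulate` and the parameters `bot_edge`, `luck_sd` and `play_sd` are hypothetical values invented for illustration. Each deal carries some 'card luck' that helps one seat and hurts the other. When the same deal is played twice with the seats swapped, that luck cancels out of the paired result, leaving mostly the skill difference.

```python
import random
import statistics


def simulate(num_deals, bot_edge=0.1, luck_sd=10.0, play_sd=1.0, seed=0):
    """Toy model of single vs. duplicate scoring (all numbers are made up).

    Returns two lists of human results per deal:
    - single: one human plays the deal once
    - paired: average of two humans playing the same deal from opposite seats
    """
    rng = random.Random(seed)
    single, paired = [], []
    for _ in range(num_deals):
        luck = rng.gauss(0.0, luck_sd)  # card luck as seen from human A's seat
        # Human A holds one side of the deal against the bot.
        human_a = luck - bot_edge + rng.gauss(0.0, play_sd)
        # Human B plays the same deal with the cards swapped, so luck flips sign.
        human_b = -luck - bot_edge + rng.gauss(0.0, play_sd)
        single.append(human_a)
        paired.append((human_a + human_b) / 2)
    return single, paired


if __name__ == "__main__":
    single, paired = simulate(10_000)
    print("spread of single results:", round(statistics.stdev(single), 2))
    print("spread of paired results:", round(statistics.stdev(paired), 2))
    print("average paired result:", round(statistics.mean(paired), 3))
```

In this toy model the paired results are far less noisy than the single ones, which is the intuition behind comparing humans and AI on mirrored cards.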

The 20-day contest, ‘Brains Vs. Artificial Intelligence: Upping the Ante’, was held at Pittsburgh’s Rivers Casino in January 2017, with matches played in public view from 11am to 7pm each day.

Libratus pitches a shutout

This time around, Libratus defeated the team for a total of $1,766,250 over 120,000 hands, with each human player losing to the AI. Jason Les was down over $880k, Jimmy Chou lost $520k, Daniel McAulay $277k, and Dong Kim $85k. As before, the players were not wagering their actual money, and each was compensated a share of a $200k prize pool according to their results.

With the increased number of hands played, and the greater margin of victory, the result was determined to be a resounding statistical victory for Libratus.
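For a rough sense of scale, the two headline results can be put on a per-hand basis. The short Python sketch below uses only the totals quoted in this article; the 2015 figure is rounded down to $732,000 (the humans won 'over $732k'), and the helper name `per_hand` is purely illustrative.

```python
def per_hand(total_dollars, hands):
    """Average (theoretical) dollars won per hand."""
    return total_dollars / hands


# 2015: humans beat Claudico by a little over $732k across 80,000 hands.
humans_vs_claudico = per_hand(732_000, 80_000)

# 2017: Libratus beat the humans by $1,766,250 across 120,000 hands.
libratus_vs_humans = per_hand(1_766_250, 120_000)

print(f"2015, humans over Claudico: ${humans_vs_claudico:.2f} per hand")
print(f"2017, Libratus over humans: ${libratus_vs_humans:.2f} per hand")
```

That works out to roughly $9.15 per hand in the humans' favor in 2015, against about $14.72 per hand in the AI's favor in 2017, and the later figure comes from a sample half as large again.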

Poker player Jason Les at a WPT event in 2019

As CMU’s Tuomas Sandholm said at the close of the challenge, “The best AI’s ability to do strategic reasoning with imperfect information has now surpassed that of the best humans.”

Head of Computer Science at CMU, Frank Pfenning, added, “The computer can’t win if it can’t bluff. Developing an AI that can do that successfully is a tremendous step forward scientifically.”

Sandholm went on to explain one of the crucial differences in the AI’s approach to improving and honing its strategies. “After play ended each day, a meta-algorithm analyzed what holes the pros had identified and exploited in Libratus’ strategy. It then prioritized the holes and algorithmically patched the top three using the supercomputer each night… Typically, researchers develop algorithms that try to exploit the opponent’s weaknesses. In contrast, here the daily improvement is about algorithmically fixing holes in our own strategy.”

What happened next

Following Libratus’ heads-up victories, the team at CMU went back to the drawing board to develop an AI capable of victory in a 6-Max game. Once again they chose the name of the program wisely, with ‘Pluribus’ a Latin word meaning ‘Many’, indicative of the bot’s new multiplayer focus.

Built in collaboration with Facebook AI (now part of Meta), Pluribus took on a table of pros in 2019, including Chris Ferguson, Trevor Savage, Michael Gagliano and Seth Davies, alongside Jimmy Chou and Jason Les from the previous challenge.

The result was a decisive win for Pluribus. But that’s a story for another week.

Poker player Chris Ferguson, pictured in a Full Tilt cowboy hat in 2011

As the issue of AI bots continues to throw up new challenges and risks to online poker, it’s interesting to look back at the moment the scales tipped in favor of the computers.

When online poker operators discuss AI now, it’s often as much about using the technology to catch and identify bots as it is the bots themselves. As with all technological advances, let’s hope our ability to harness AI for good keeps pace with the greed that so often threatens to derail our best laid plans.

Images courtesy of WPT/Carnegie Mellon University/DeepAI