Polk, Schulman, Boeree explore AI poker showdown as OpenAI dominates

Mo Afdhal
Posted on: February 3, 2026 16:21 PST

In a collaborative effort between Google DeepMind and Kaggle, ten of the leading Large Language Models (LLMs) – Grok 4, Grok 4.1 Fast Reasoning, OpenAI o3, GPT 5.2, GPT 5-Mini, Gemini 3 Pro, Gemini 3 Flash, DeepSeek 3.2, Claude Opus 4.5, and Claude Sonnet 4.5 – have taken to the newly updated Kaggle Game Arena to battle it out across three different games: poker, chess, and Werewolf.

Liv Boeree, Doug Polk, Nick Schulman, and Chess Grandmaster Hikaru Nakamura were all tapped by Google DeepMind and Kaggle to cover the exhibition and share any findings with their audiences – with each of the four providing their own level of expertise and knowledge on the games in question. 

While we've seen organized poker exhibitions between LLMs in the past, the Kaggle Game Arena adds a new wrinkle to the fight by testing the LLMs across several games rather than poker alone.

Back in October 2025, Max Pavlov ran his own LLM-based poker tournament – and OpenAI came out ahead as the clear victor. Interestingly enough, the two LLMs that have reached the poker final (to be played February 4) are both owned and operated by OpenAI.

Does this position OpenAI as the best of the best when it comes to silicon poker players?

The experts weigh in

As Polk explains in the video above, the AI exhibition poker match kicked off on Monday, February 2, following a seeding round among the models. In Day 1's quarter-finals, the LLMs were paired off based on the seeding round, and the heads-up matches began. In his video, Polk highlights a number of hands that he found interesting – and a few that were just downright crazy.

Polk breaks down the hands from a human perspective but also makes note of the LLMs' explanations for their decisions – and is quick to point out the often flawed logic in their reasoning.

On Tuesday, February 3, Nakamura and Schulman teamed up for live-streamed commentary on both the semi-finals of the poker side of the exhibition and the ongoing chess matches between the LLMs.

If you've ever watched coverage of poker tournaments that features Schulman in the commentary booth, you'll know that his skills and knowledge on the felt translate well into his analysis of other players' decision-making. It was particularly interesting to listen to Schulman try to rationalize the choices made by the LLMs in the poker matches he spectated. Even more interesting was Schulman's inquisitive nature and general curiosity when it came to watching the chess matches. In the video below, Schulman and Nakamura's coverage starts at the 2:19:00 mark.

In an episode of her Win, Win podcast released on February 3, Boeree sat down with one of Google DeepMind's lead engineers to gain a better understanding of the project. 

Below, you'll find the entirety of their conversation – which highlights the surprising results we've seen so far and features Boeree offering a bit of pushback when she asks the all-important question: is teaching LLMs to play social games a dangerous step forward? 

"I can imagine that some people might find it concerning, or at least worth considering, that by training LLMs – or at least testing LLMs – on games of deception, like Werewolf and poker, that we could in some way be encouraging deceptive or manipulative behavior from LLMs," Boeree noted in the discussion. "And I think that is a fair concern. What would you say to that? Do you agree with it, or is it the wrong way to think about the problem?" 

The Kaggle Game Arena poker heads-up final plays out on February 4.