Computer science researcher Michael Bowling and his colleagues at the University of Alberta, along with Finnish software developer Oskari Tammelin, have devised an algorithm to solve a variant of poker called heads-up limit hold’em (HULHE) (1).

The authors’ strategy plays HULHE so flawlessly “as to render pointless further work on this game,” notes Eric Jackson, a computer-poker researcher based in Menlo Park, California (1).

Texas Hold 'Em

Computer scientists at the University of Alberta have written an effectively unbeatable algorithm for heads-up limit hold’em.

Unlike chess or checkers, where the full knowledge of past moves and the current game state is available to all players, poker is known as an imperfect-information game. To date, every nontrivial game that has been solved is a perfect-information game (2).

Poker playing algorithms typically employ a heuristic called ‘regret.’ The heuristic is attached to every decision the algorithm makes and retrospectively represents how poor the decision was. The algorithm developed by Bowling and colleagues played more than 1,500 games to gather all of the regret data needed to win poker consistently (1).

Poker has a massive number of potential game states. There are 3.16 * 1017 states HULHE can reach, and 3.19 * 1014 states at which a player must make a decision (1). Any algorithm that hopes to solve poker must have a calculated move to make at each of the game states.

Storing all the possible moves and their associated regret values requires approximately 262 terabytes of storage (2). To put this in context, one terabyte is roughly the total storage capacity of an off-the-shelf iMac.

The researchers managed to compress all the data into 11 terabytes to store the regret data and six terabytes to store the move strategies (2). “I think the […] regret algorithm is a major advance,” says computer scientist Jonathan Shapiro at the University of Manchester, United Kingdom. “But they have done several other very clever things to make this problem computationally feasible” (1).

Because poker is a game of imperfect information, there is not a mathematically rigorous means of winning every game. However, Bowling and his colleagues showed that their algorithm always wins in the long run, deeming poker ‘weakly solved’ (1, 2).

The implications of this poker-solving algorithm scale to larger operations of imperfect information, like managing stock portfolios. According to the Bowling et al., “the breakthroughs behind our result are general algorithmic advances that make game-theoretic reasoning in large-scale models of any sort more tractable” (2).

Recently, there has been a surge in game-theoretic applications involving security and medicine. Real life decisions almost always involve imperfect information or uncertainty, and algorithms that learn as they go along, like the one built by Bowling and colleagues, are crucial to future applications.

Sources:

1. Ball, Philip (2015, January 8). Game theorists crack poker. Retrieved January 11, 2015, from http://www.nature.com/news/game-theorists-crack-poker-1.16683#/b1

2. Bowling, et al. (2015, January 9). Heads-up limit hold’em poker is solved. Retrieved January 11, 2015, from http://www.sciencemag.org/content/347/6218/145.full