How A.I. Conquered Poker

It was this desire to model economic decision-making that led them to game play. Von Neumann rejected most games as unsuitable to the task, especially those like checkers or chess in which both players can see all the pieces on the board and share the same information. "Real life is not like that," he explained to Jacob Bronowski, a fellow mathematician. "Real life consists of bluffing, of little tactics of deception, of asking yourself what is the other man going to think I mean to do. And that is what games are about in my theory." Real life, von Neumann thought, was like poker.

Using his own simplified version of the game, in which two players were randomly dealt secret numbers and then asked to make bets of a predetermined size on whose number was higher, von Neumann derived the basis for an optimal strategy. Players should bet large both with their very best hands and, as bluffs, with some definable percentage of their very worst hands. (The percentage changed depending on the size of the bet relative to the size of the pot.) Von Neumann was able to demonstrate that by bluffing and calling at mathematically precise frequencies, players would do no worse than break even in the long run, even if they provided their opponents with an exact description of their strategy. And, if their opponents deployed any strategy against them other than the perfect one von Neumann had described, those opponents were guaranteed to lose, given a large enough sample.
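The logic can be sketched in miniature. The toy calculation below is an illustration of the indifference idea, not von Neumann's exact model: a player bets one unit into a one-unit pot holding either a winning hand (a value bet) or a hopeless one (a bluff), and the caller's profit depends only on how often that bet turns out to be a bluff.

```python
from fractions import Fraction

# A minimal sketch (assumed setup, not von Neumann's exact model): a player bets
# 1 unit into a 1-unit pot with either the best hand (a value bet) or a hopeless
# one (a bluff). At bluff fraction b / (P + 2b) = 1/3, calling and folding are
# worth the same to the opponent, so no counter-strategy gains anything.

POT, BET = 1, 1

def caller_ev(bluff_fraction):
    """Caller's expected profit (in units) from calling one bet."""
    win = bluff_fraction * (POT + BET)    # catch a bluff: win the pot plus the bet
    lose = (1 - bluff_fraction) * BET     # run into a value bet: lose the call
    return win - lose

for bluffs in (Fraction(0), Fraction(1, 3), Fraction(1, 2)):
    print(f"bluffs {float(bluffs):.2f} of the time -> caller EV of calling {float(caller_ev(bluffs)):+.2f}")

# bluffs 0.00 of the time -> caller EV of calling -1.00   (caller should always fold)
# bluffs 0.33 of the time -> caller EV of calling +0.00   (caller is indifferent)
# bluffs 0.50 of the time -> caller EV of calling +0.50   (caller should always call)
```

In this toy setup, the bigger the bet relative to the pot, the larger the share of bluffs the break-even strategy can sustain, which is the relationship behind the shifting percentages von Neumann derived.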

"There are a lot of really strange plays now that these guys are making that are effective, but if people saw them back in the day, I think that they'd be invited into the game every night."

Theory of Games pointed the way to a future in which all manner of competitive interactions could be modeled mathematically: auctions, submarine warfare, even the way species compete to pass their genes on to future generations. But in strategic terms, poker itself barely advanced in response to von Neumann's proof until it was taken up by members of the Department of Computing Science at the University of Alberta more than five decades later. The early star of the department's games research was a professor named Jonathan Schaeffer, who, after 18 years of work, discovered the solution to checkers. Alberta faculty and students also made significant progress on games as diverse as go, Othello, StarCraft and the Canadian pastime of curling. Poker, though, remained a particularly thorny problem, for precisely the reason von Neumann was attracted to it in the first place: the way hidden information in the game acts as an impediment to good decision-making.

Unlike in chess or backgammon, in which both players' moves are clearly legible on the board, in poker a computer has to interpret its opponents' bets despite never being certain what cards they hold. Neil Burch, a computer scientist who spent nearly two decades working on poker as a graduate student and researcher at Alberta before joining an artificial intelligence company called DeepMind, characterizes the team's early attempts as "pretty unsuccessful." "What we found was if you put a knowledgeable poker player in front of the computer and let them poke at it," he says, "the program got crushed, absolutely smashed."

Partly this was just a function of the difficulty of modeling all the decisions involved in playing a hand of poker. Game theorists use a diagram of a branching tree to represent the different ways a game can play out. In a straightforward one like rock-paper-scissors, the tree is small: three branches for the rock, paper and scissors you can play, each with three subsequent branches for the rock, paper and scissors your opponent can play. The more complicated the game, the larger the tree becomes. For even a simplified version of Texas Hold 'em, played heads up (i.e., between just two players) and with bets fixed at a predetermined size, a full game tree contains 316,000,000,000,000,000 branches. The tree for no-limit hold 'em, in which players can bet any amount, has even more than that. "It really does get truly enormous," Burch says. "Like, larger than the number of atoms in the universe."
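The arithmetic behind that explosion is easy to sketch. The numbers below are illustrative, not the Alberta group's actual tree counts: every decision point multiplies the number of branches, so even a handful of choices per decision compounds into an astronomical number of ways a game can play out.

```python
from itertools import product

# Rock-paper-scissors: one decision each, so the full tree has 3 * 3 = 9 leaves,
# one for every pairing of my move with my opponent's move.
MOVES = ("rock", "paper", "scissors")
print(len(list(product(MOVES, MOVES))))   # 9

# More generally, b choices at each of d sequential decision points give b**d leaves.
def tree_leaves(branching: int, depth: int) -> int:
    return branching ** depth

print(tree_leaves(3, 2))            # 9, the rock-paper-scissors case
print(f"{tree_leaves(3, 37):.2e}")  # ~4.5e17, the same order of magnitude as the
                                    # heads-up limit hold 'em figure cited above
```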

At first, the Alberta group's approach was to try to shrink the game to a more manageable scale by crudely bucketing together hands that were more or less alike, treating a pair of nines and a pair of tens, say, as if they were identical. But as the field of artificial intelligence grew more robust, and as the team's algorithms became better tuned to the intricacies of poker, its programs began to improve. Crucial to this development was an algorithm called counterfactual regret minimization. Computer scientists tasked their machines with identifying poker's optimal strategy by having the programs play against themselves billions of times and take note of which decisions in the game tree had been least profitable (the "regrets," which the A.I. would learn to minimize in future iterations by making other, better choices). In 2015, the Alberta team announced its success by publishing an article in Science titled "Heads-Up Limit Hold'em Poker Is Solved."
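The regret-matching rule at the heart of that approach is compact enough to sketch. The toy below is an illustrative reconstruction, not the Alberta team's code, and it runs self-play on rock-paper-scissors rather than poker: each player tracks how much better every alternative action would have done, plays in proportion to that accumulated regret, and lets the averaged strategy drift toward equilibrium.

```python
import random

# Regret matching via self-play on rock-paper-scissors: a minimal sketch of the
# bookkeeping that counterfactual regret minimization performs, not the full
# poker algorithm.

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(a: int, b: int) -> int:
    """+1 if action a beats b, -1 if it loses, 0 on a tie."""
    return 0 if a == b else (1 if (a - b) % 3 == 1 else -1)

def current_strategy(regrets):
    """Play each action in proportion to its accumulated positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1.0 / ACTIONS] * ACTIONS

regrets = [[0.0] * ACTIONS for _ in range(2)]
strategy_sum = [[0.0] * ACTIONS for _ in range(2)]

for _ in range(100_000):
    strategies = [current_strategy(regrets[p]) for p in range(2)]
    moves = [random.choices(range(ACTIONS), strategies[p])[0] for p in range(2)]
    for p in range(2):
        opponent_move = moves[1 - p]
        for a in range(ACTIONS):
            # Regret for action a: how much better it would have scored than
            # the move actually played against this opponent move.
            regrets[p][a] += payoff(a, opponent_move) - payoff(moves[p], opponent_move)
            strategy_sum[p][a] += strategies[p][a]

# The average strategy over all iterations is what converges toward equilibrium.
average = [s / sum(strategy_sum[0]) for s in strategy_sum[0]]
print([round(x, 2) for x in average])  # roughly [0.33, 0.33, 0.33]
```

In full counterfactual regret minimization, the same bookkeeping happens at every decision point in the game tree, weighted by the chance of reaching it, which is what lets the idea scale from a children's game up to poker.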
