An unbeatable poker bot offers glimpses of the future of video game AI

Whether it's a side diversion in Red Dead Redemption 2 or the whole game itself, poker fans are routinely irritated by artificial intelligence that ignores Kenny Rogers' timeless advice on holding, folding, and the like. Some bots at the table can be bluffed off any hand; others can never be bluffed. Some will fold at the slightest provocation, while others charge ahead with even worse cards than yours. Players have as much visibility into their CPU opponents' behavior as they do into their cards, which is to say, none.

For that reason, research published by high-level problem solvers at Facebook and Carnegie Mellon University caught my attention earlier this week. Don't expect it to appear in a video game anytime soon. But their Pluribus poker AI is important in that, through a game, computer engineers have again emulated a behavior previously accepted as uniquely human. And that is bluffing.

"This is true for many AI advances," Noam Brown, a research scientist at Facebook and co-creator of the bot, told me on Thursday. "Many of the things that we think are limited to human ability are possible with an AI.

"People thought in the 1950s that playing chess was something very human that AI could not do," Brown explained. "Then people thought that playing Go at a master level was something very human that AI could not do. And then people thought that bluffing was a very human thing that AI could not do. AI can bluff better than any living human."

One hand among six players testing the Pluribus poker AI.

The scientific first that Brown's research represents comes with some qualifiers. Scientists have used poker to study AI behavior and learning before. In 2015, researchers from the University of Alberta built a poker bot that was essentially unbeatable in two-player limit Texas hold 'em. And, of course, everyday applications such as video games have put multiple participants at a poker table, especially at the height of the poker fad around the turn of the century.

The AIs most people are familiar with are less analytical than they are a frequency of a type of behavior applied to a given situation, be it overall hand strength or being the first to raise on the flop. For years, poker simulators have included AI sliders for aggressive and conservative play, whose real usefulness is in training humans to play their own game, regardless of what anyone else does.

That's before we even get to bluffing, which is considered a human art form because of the tells or tendencies by which other players give away their confidence, or lack of it, in their hands. Coresoft's World Championship Poker series for PlayStation 2 even had a bluffing minigame, which made it a more viable tactic. But more often, you got games where opponents called everything, raised inexplicably, or held onto every hand as if it were a pair of jacks. These games rarely stay entertaining for long, because most players end up busting themselves out of boredom or impatience.

Pluribus is different in that it is, more or less, analyzing the effect of deception, that is, betting with a weak hand, rather than selling competitors on the strength of what it holds. "The bot doesn't view it as being deceptive or lying in any way," Brown said. "It just views it as, 'This is the action that is going to make me the most money in this situation.'"

Pluribus, which Brown created with his CMU colleague Tuomas Sandholm, resembles in some ways a process that calculates outcomes and hypotheticals many steps ahead. The difference is that Brown and Sandholm's bot looks only two or three moves ahead. This short-term approach helped its bluffing tendencies come fully into play against the five human professionals Pluribus defeated outright over more than 10,000 hands.

In a way, it raises an existential question about what defines a bluff more: the behavior or the result?

Brown wasn't ready to answer that, though. His interest in poker as a research environment dates back to his undergraduate days at Rutgers University about 15 years ago. "This whole idea that there is this mathematical strategy to the game, this perfect strategy that, if you can play it, nobody can beat you," fascinated Brown.

Professional gamblers have touted systems for different games, with varying degrees of intellectual rigor and honesty, for years. Poker seems to be system-proof because it depends on incomplete or imperfect information, unlike blackjack, Go or chess, where all participants know the information (and where, in blackjack, the dealer cannot act independently).

But, in a way, Brown has shown that a strategy can be developed to win consistently ($1,000 per hour) at poker; it is just beyond what a human can play, since it takes instantaneous math.

"This is one of the interesting things about this AI: it is not adapting to its opponent," Brown said. "It has its strategy. It's fixed; it doesn't change what it's playing according to how the humans play. This idea that there could be such a strategy in the game was really fascinating, and it really pushed me to study it more. It was kind of mystical, in a sense: there is a strategy that we know exists, but we can't find it."

A press release for Pluribus talks shop about the hardware that powers it: a 64-core server with less than 512 GB of RAM, running for eight days, developed the AI. The researchers estimated that using cloud servers to train the program would cost only $150.

But don't expect Pluribus to enter online poker rooms and start cleaning everyone out, or to train a generation of formidable human players who pocket $1,000 an hour. Brown said there are no plans to turn Pluribus into any kind of commercial product. The AI is simply a proof of concept, whose lessons will help Brown and other researchers address computer behavior in even more complex situations.

For example, self-driving cars. "One of the things we mentioned to reporters is the possibility of applying this to something like navigating traffic with a self-driving car," Brown said.

That also points back to another obvious application in video games, and one familiar to many video game fans: race car drivers, whose CPU counterparts could be far more sophisticated than their speed, the optimal line and the space they will give other drivers.

"Racing games are a great example of where this could be applied in the future, because it's a multi-agent interaction, there are multiple players, and there is also a certain level of hidden information," Brown said. "Many video game AIs, from what I understand, are not using principled techniques; they're more rigid, more specific to the kind of game it is. Of course, that makes it easier to debug and understand what's happening.

"But if we develop these fundamental techniques, I think we're going to start seeing them in the computer gaming industry, and they'll start becoming more prominent," he added. "It wouldn't surprise me if that's one of the first places where this really penetrates industrial applications."

Roster File is Polygon's column on the intersection of sports and video games.