However, I can say this.
In a game of Go, the state of the game is determined entirely by the players' moves, which both players see.
Figure 1: A game tree of an extensive form game.
For all $a_1 \in A_1$ and $a_2 \in A_2$, we have: $r_1(a_1, a_2) + r_2(a_1, a_2) = 0$.
Zero-sum games are interesting since any Nash equilibrium can be computed efficiently using the minimax theorem.
A Deep Q-network learns how to play under the reinforcement learning framework, where a single agent interacts with a fixed environment, possibly with imperfect information.
Dong Kim, one of the professionals that Libratus competed against.
Strength of the AI
Libratus had been leading against the human players from day one of the tournament.
Thanks to Noam Brown for bringing this to our attention.
One of the subteams was playing in the open, while the other subteam was located in a separate room nicknamed 'The Dungeon' where no mobile phones or other external communications were allowed.
Teaching AI poker could have significance outside of the poker world (although it has already transformed how humans play poker).
While Go and poker are both extensive form games, the key difference between the two is that Go is a perfect information game, while poker is an imperfect information game. (On a chessboard, every piece is visible, but in poker no player can see another player's cards.)
Libratus therefore uses reach subgame solving, which gives a slightly weaker guarantee: the opponent is no better off in those cases where they are likely to reach this subgame, rather than under every possible change of strategy.
In the game tree, this hidden information is denoted by an information set, drawn as a dashed line between the two states.
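To make the idea of an information set concrete, here is a minimal sketch in Python. It assumes a hypothetical three-card toy deck (not the actual Texas Hold'em deck) and groups the possible deals by what one player can observe: their own card plus the public betting history. The names `CARDS` and `information_sets` are illustrative only and do not come from Libratus.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical three-card toy deck, used only for illustration.
CARDS = ["J", "Q", "K"]


def information_sets(player, history=""):
    """Group game states by what `player` can observe.

    A state is (player 0's card, player 1's card, public history).
    The player sees only their own card and the public history, so all
    states that agree on those two things land in the same information set.
    """
    sets = defaultdict(list)
    for p0_card, p1_card in permutations(CARDS, 2):
        state = (p0_card, p1_card, history)
        own_card = state[player]
        sets[(own_card, history)].append(state)
    return dict(sets)


p0_sets = information_sets(player=0)
for key, states in sorted(p0_sets.items()):
    print(key, "->", states)
```

Each printed group contains two states that differ only in the opponent's card. Player 0 cannot tell them apart, which is what the dashed line in the game tree expresses. In a perfect information game such as Go, every information set would contain exactly one state.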
For one thing, it's a way of teaching AI to work with incomplete information, which comes up in real-world situations like negotiations.
The poker variant that Libratus can play, no-limit heads up Texas Hold'em poker, is an extensive-form imperfect-information zero-sum game.
Importantly, the Nash equilibria of two-player zero-sum games are computationally tractable, and every equilibrium is guaranteed to yield the same value.
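As a small illustration of why the value of a zero-sum game is well defined, the sketch below solves a 2x2 two-player zero-sum matrix game exactly. This is a textbook closed-form calculation, not the method Libratus uses, which operates on a vastly larger extensive-form game. The function name `solve_2x2_zero_sum` and the example payoff matrices are assumptions made for illustration.

```python
from fractions import Fraction


def solve_2x2_zero_sum(payoff):
    """Solve a 2x2 zero-sum game given the row player's payoffs.

    Returns (p, q, value), where p is the probability that the row player
    picks row 0, q is the probability that the column player picks
    column 0, and value is the row player's expected payoff at equilibrium.
    """
    (a, b), (c, d) = [[Fraction(x) for x in row] for row in payoff]

    # Pure-strategy check: a saddle point exists when maximin == minimax.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        row = 0 if min(a, b) == maximin else 1
        col = 0 if max(a, c) == minimax else 1
        return Fraction(1 - row), Fraction(1 - col), maximin

    # Mixed equilibrium: each player mixes so that the opponent is
    # indifferent between their two pure strategies.
    denom = a - b - c + d
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


# Matching pennies: both players mix evenly and the value is 0.
print(solve_2x2_zero_sum([[1, -1], [-1, 1]]))
# An asymmetric example with a non-zero value.
print(solve_2x2_zero_sum([[2, -1], [-1, 1]]))
```

In the second example the row player's mix earns the same expected payoff against either column, so the column player cannot push the result below the game's value by changing strategy.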