Finding equilibria in large sequential games of imperfect information


1
Finding equilibria in large sequential games of
imperfect information

Andrew Gilpin and Tuomas Sandholm
Carnegie Mellon University
Computer Science Department

2
Motivation: Poker

Poker is a wildly popular card game
This year's World Series of Poker prize pool
surpassed $103 million, including $56 million for
the World Championship event
ESPN is broadcasting parts of the tournament
Poker presents several challenges for AI
Imperfect information
Risk assessment and management
Deception (bluffing, slow-playing)
Counter-deception (calling a bluff)

3
Rhode Island Hold'em poker: The Deal
4
Rhode Island Hold'em poker: Round 1
5
Rhode Island Hold'em poker: Round 2
6
Rhode Island Hold'em poker: Round 3
7
Rhode Island Hold'em poker: Showdown
8
Sneak preview of results: Solving Rhode Island
Hold'em poker

Rhode Island Hold'em poker invented as a testbed
for AI research [Shi & Littman 2001]
Game tree has more than 3.1 billion nodes
Previously, the best techniques did not scale to
games this large
Using our algorithm we have computed optimal
strategies for this game
This is the largest poker game solved to date by
over four orders of magnitude

9
Outline of this talk

Game-theoretic foundations: Equilibrium
Model: Ordered games
Abstraction mechanism: Information filters
Strategic equivalence: Game isomorphisms
Algorithm: GameShrink
Solving Rhode Island Hold'em

10
Game Theory

In multi-agent systems, an agent's outcome
depends on the actions of the other agents
Consequently, an agent's optimal action depends
on the actions of the other agents
Game theory provides guidance as to how an agent
should act
A game-theoretic equilibrium specifies a strategy
for each agent such that no agent wishes to
deviate
Such an equilibrium always exists in finite games
when mixed strategies are allowed [Nash 1950]

11
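A simple example

Payoffs are listed as (row player, column player)

            Rock     Paper    Scissors
Rock        0, 0     -1, 1    1, -1
Paper       1, -1    0, 0     -1, 1
Scissors    -1, 1    1, -1    0, 0

In equilibrium, each player plays Rock, Paper and
Scissors with probability 1/3 each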
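
The equilibrium condition (no agent wishes to deviate) can be checked mechanically for the matrix above. The following is a minimal sketch, not code from the talk: the function names are made up for illustration, and it assumes the rows and columns are ordered Rock, Paper, Scissors.

```python
from fractions import Fraction

# Row player's payoffs from the matrix above (assumed order: Rock, Paper,
# Scissors). The game is zero-sum, so the column player gets the negation.
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def row_payoffs_against(col_mix):
    """Expected payoff of each pure row action against a mixed column strategy."""
    return [sum(a * q for a, q in zip(row, col_mix)) for row in A]

def col_payoffs_against(row_mix):
    """Expected payoff of each pure column action against a mixed row strategy."""
    return [sum(-A[i][j] * row_mix[i] for i in range(3)) for j in range(3)]

def is_equilibrium(row_mix, col_mix):
    """True if neither player can gain by switching to any pure action."""
    value = sum(row_mix[i] * A[i][j] * col_mix[j]
                for i in range(3) for j in range(3))
    no_row_gain = max(row_payoffs_against(col_mix)) <= value
    no_col_gain = max(col_payoffs_against(row_mix)) <= -value
    return no_row_gain and no_col_gain

uniform = [Fraction(1, 3)] * 3
print(is_equilibrium(uniform, uniform))      # uniform mixing by both players
print(is_equilibrium([1, 0, 0], [1, 0, 0]))  # both always play Rock
```

Checking only pure deviations is enough here, because a mixed deviation can never earn more than the best pure action it mixes over.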
12
Complexity of computing equilibria

Finding a Nash equilibrium is "a most fundamental
computational problem whose complexity is wide
open" and "together with factoring, the most
important concrete open question on the boundary
of P today" [Papadimitriou 2001]
Even for games with only two players
There are algorithms (requiring exponential time
in the worst case) for computing Nash equilibria
Good news: Two-person zero-sum matrix games can
be solved in polynomial time using linear programming
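
For reference, one standard way to write the linear program for a two-person zero-sum matrix game is sketched below. The notation is an assumption introduced here, not taken from the slides: A is the row player's payoff matrix, x is the row player's mixed strategy, and v is the payoff the row player can guarantee.

```latex
\begin{aligned}
\max_{x,\,v}\quad & v \\
\text{s.t.}\quad & \sum_{i} x_i A_{ij} \ge v \quad \text{for every column } j, \\
& \sum_{i} x_i = 1, \qquad x_i \ge 0 \quad \text{for every row } i.
\end{aligned}
```

For rock-paper-scissors, x = (1/3, 1/3, 1/3) with v = 0 satisfies every constraint with equality, which agrees with the uniform equilibrium of that game.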

13
What about sequential games?

Sequential games involve turn-taking, moves of
chance, and imperfect information
Every sequential game can be converted into a
simultaneous-move game
Basic idea: Make one strategy in the
simultaneous-move game for every complete plan,
i.e. every way of choosing an action in every
possible situation in the sequential game
This approach leads to an exponential blowup in
the number of strategies
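
The blowup is easy to quantify. As a rough sketch (the function name and the example sizes are illustrative assumptions, and it counts unreduced plans that pick an action in every situation, reachable or not), the number of strategies is the product of the action counts over all situations:

```python
from math import prod

def count_pure_strategies(actions_per_situation):
    """Number of complete plans: one action chosen in every situation."""
    return prod(actions_per_situation)

# A hypothetical player facing 10 situations with 3 actions in each:
situations = [3] * 10
print(sum(situations))                    # actions the game tree must store
print(count_pure_strategies(situations))  # strategies in the converted game
```

The tree grows by 3 actions per added situation, while the strategy count triples.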

14
Sequence form representation

The sequence form is an alternative
representation that is more compact [Koller,
Megiddo, von Stengel; Romanovskii]
Using the sequence form, two-player zero-sum
games with perfect recall can be solved in time
polynomial in the size of the game tree
But Texas Hold'em has 10^18 nodes
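
To see why the sequence form is compact, it helps to enumerate sequences for a tiny example. The sketch below uses a made-up decision structure for one player (the information-set and action names are hypothetical, not from the talk) and lists every sequence of that player's own actions.

```python
# Hypothetical decision structure for one player with perfect recall: each
# information set maps each action to the information sets that can follow.
TREE = {
    "root": {"bet": ["after_bet"], "check": ["after_check"]},
    "after_bet": {"call": [], "fold": []},
    "after_check": {"bet": [], "check": []},
}

def sequences(info_set, prefix=()):
    """Yield every sequence of the player's own actions through info_set."""
    for action, next_sets in TREE[info_set].items():
        seq = prefix + ((info_set, action),)
        yield seq
        for nxt in next_sets:
            yield from sequences(nxt, seq)

# The empty sequence (no action taken yet) is always included.
all_sequences = [()] + list(sequences("root"))
total_actions = sum(len(actions) for actions in TREE.values())
print(len(all_sequences), total_actions + 1)
```

In this example there is one sequence per action plus the empty sequence, so the count grows linearly with the tree, whereas the number of complete plans over the same three information sets is 2 x 2 x 2 = 8 and grows multiplicatively.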

15
Our approach

Instead of developing an equilibrium-finding
algorithm per se, we introduce an
automated abstraction technique that results in a
smaller, equivalent game
We prove that a Nash equilibrium in the smaller
game corresponds to a Nash equilibrium in the
original game
Our technique applies to n-player sequential
games with observed actions and ordered signals

16
Illustration of our approach
[Figure: diagram with the labels "Original game",
"Nash equilibrium" and "Nash equilibrium"]
17
Game with ordered signals (a.k.a. ordered game)

Players: I = {1, ..., n}
Stage games: G = G1, ..., Gr
Player label: L
Game-ending nodes: ?
Signal alphabet: T
Signal quantities: ? = ?1, ..., ?r and ? = ?1, ..., ?r
Signal probability distribution: p
Partial ordering of subsets of T
Utility function: u (increasing in private signals)

18
Information filters

Observation: We can make games smaller by
filtering the information a player receives
Instead of observing a specific signal exactly, a
player observes a filtered set of signals
E.g., receiving the signal {A♠, A♥, A♦, A♣} instead
of one specific ace
Combining an ordered game and a valid information
filter yields a filtered ordered game
Prop.: A filtered ordered game is a finite
sequential game with perfect recall
Corollary: If the filtered ordered game is
two-person zero-sum, we can solve it in polynomial time
using linear programming
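
As a concrete picture of an information filter, the sketch below maps each card to the set of cards the player cannot tell apart from it. The 16-card deck and the suit-blind rule are assumptions chosen for illustration; whether a given filter is valid, or preserves equilibria, depends on the game and is not checked here.

```python
RANKS = ["J", "Q", "K", "A"]
SUITS = ["s", "h", "d", "c"]
DECK = [rank + suit for rank in RANKS for suit in SUITS]

def suit_blind_filter(card):
    """Return the filtered signal: all cards sharing this card's rank."""
    return frozenset(c for c in DECK if c[0] == card[0])

# The distinct filtered signals partition the deck into blocks.
filtered_signals = {suit_blind_filter(card) for card in DECK}

print(sorted(suit_blind_filter("As")))
print(len(DECK), "raw signals ->", len(filtered_signals), "filtered signals")
```

A player dealt any ace now observes the same four-card set, so the filtered game has fewer distinct signals than the original one.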

19
Filtered signal trees

Every filtered ordered game has a corresponding
filtered signal tree
Each edge corresponds to the revelation of some
signal
Each path corresponds to the revelation of a set
of signals
Our algorithms operate directly on the filtered
signal tree
We never load the full game representation into
memory

20
Filtered signal tree: Example
21
Ordered game isomorphic relation

The ordered game isomorphic relation captures the
notion of strategic symmetry between nodes
We define the relationship recursively
Two leaves are ordered game isomorphic if the
payoffs to all players are the same at each leaf,
for all action histories
Two internal nodes are ordered game isomorphic if
they are siblings and there is a bijection
between their children such that only ordered
game isomorphic nodes are matched
We can compute this relationship efficiently
using dynamic programming and perfect matching
computations in a bipartite graph
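
The recursive definition can be sketched on a toy tree. This is a simplified illustration rather than the authors' implementation: leaves are single payoff tuples instead of payoffs over all action histories, the sibling requirement is not enforced, results are not memoized, and the bipartite matching uses a plain augmenting-path search.

```python
def has_perfect_matching(compat, n_right):
    """compat[i] lists the right-side indices that left vertex i may take."""
    match_of_right = [None] * n_right

    def try_assign(i, seen):
        # Look for an augmenting path that starts at left vertex i.
        for j in compat[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_of_right[j] is None or try_assign(match_of_right[j], seen):
                match_of_right[j] = i
                return True
        return False

    return all(try_assign(i, set()) for i in range(len(compat)))

def isomorphic(a, b):
    """Leaves are payoff tuples; internal nodes are lists of children."""
    a_leaf, b_leaf = isinstance(a, tuple), isinstance(b, tuple)
    if a_leaf or b_leaf:
        return a_leaf and b_leaf and a == b
    if len(a) != len(b):
        return False
    # Children may be paired only if they are themselves isomorphic.
    compat = [[j for j, cb in enumerate(b) if isomorphic(ca, cb)] for ca in a]
    return has_perfect_matching(compat, len(b))

x = [(1, -1), (0, 0)]
y = [(0, 0), (1, -1)]   # same leaves as x in a different order
z = [(1, -1), (1, -1)]
print(isomorphic(x, y), isomorphic(x, z), isomorphic([x, z], [z, y]))
```

The matching step is what lets children appear in a different order under the two nodes being compared.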

22
Ordered game isomorphic abstraction transformation

This operation transforms an existing information
filter into a new filter that merges two ordered
game isomorphic nodes
The new filter yields a smaller, abstracted game
Thm: If a strategy profile is a Nash equilibrium
in the smaller, abstracted game, then it is a
Nash equilibrium in the original game

23
Applying the ordered game isomorphic abstraction
transformation
24
Applying the ordered game isomorphic abstraction
transformation
25
Applying the ordered game isomorphic abstraction
transformation
26
GameShrink: Efficiently computing ordered game
isomorphic abstraction transformations

Recall: We have a dynamic program for determining
if two nodes of the filtered signal tree are
ordered game isomorphic
Algorithm: Starting from the top of the filtered
signal tree, perform the transformation where
applicable
Approximation algorithm: Instead of requiring a
perfect matching, require a matching with
a penalty below some threshold

27
GameShrink: Efficiently computing ordered game
isomorphic abstraction transformations

The Union-Find data structure provides an
efficient representation of the information
filter
Linear memory and almost linear time
Can eliminate certain perfect matching
computations by using easy-to-check necessary
conditions
Compact histogram databases for storing win/loss
frequencies to speed up the checks
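
A generic Union-Find (disjoint-set) structure is sketched below to show how merged signals can be tracked. It is a textbook version with union by size and path compression, not GameShrink's actual code; the class name, method names and card labels are illustrative.

```python
class UnionFind:
    """Disjoint sets with union by size and path compression."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in items}

    def find(self, x):
        """Return the representative of the set containing x."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # compress the path onto the root
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Merge the sets containing a and b."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

# Merging signals found to be interchangeable (hypothetical example):
uf = UnionFind(["As", "Ah", "Ad", "Ac", "Ks"])
uf.union("As", "Ah")
uf.union("Ad", "Ac")
uf.union("As", "Ad")
print(uf.find("Ac") == uf.find("Ah"), uf.find("Ks") == uf.find("As"))
```

The structure stores one parent pointer and one size per item, which is the linear memory noted above, and the two heuristics together give the almost linear running time.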

28
Solving Rhode Island Hold'em poker

GameShrink computes all ordered game isomorphic
abstraction transformations in under one second
Without abstraction, the linear program has
91,224,226 rows and columns
After applying GameShrink, the linear program has
only 1,237,238 rows and columns
By solving the resulting linear program, we are
able to compute optimal min-max strategies for
this game
The CPLEX barrier method takes 7 days, 17 hours and
25 GB of RAM to solve the linear program
This is the largest poker game solved to date by
over four orders of magnitude
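
The size figures above can be put side by side with a little arithmetic. This is only a back-of-the-envelope sketch; it assumes the two quoted counts are comparable size measures of the two linear programs.

```python
rows_before = 91_224_226   # linear program size without abstraction
rows_after = 1_237_238     # linear program size after GameShrink

shrink_factor = rows_before / rows_after
solve_hours = 7 * 24 + 17  # 7 days, 17 hours

print(f"about {shrink_factor:.1f}x fewer rows and columns")
print(f"{solve_hours} hours of solver time")
```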

29
Comparison to previous research

Rule-based
Limited success in even small poker games
Simulation/Learning
Do not take the multi-agent aspect into account
Game-theoretic
Manual abstraction
"Approximating Game-Theoretic Optimal Strategies
for Full-scale Poker," Billings, Burch, Davidson,
Holte, Schaeffer, Schauenberg, Szafron, IJCAI-03.
Distinguished Paper Award.
Automated abstraction

30
Directions for future work

Computing strategies for larger games
Requires approximation of solutions
Tournament poker
More than two players
Other types of abstraction

31
Summary

Introduced an automatic method for performing
abstractions in a broad class of games
Introduced information filters as a technique for
working with games with imperfect information
Developed an equilibrium-preserving abstraction
transformation, along with an efficient algorithm
Described a simple extension that yields an
approximation algorithm for tackling even larger
games
Solved the largest poker game to date
Playable on-line at http://www.cs.cmu.edu/gilpin/gsi.html

Thank you very much for your interest
