1
Finding equilibria in large sequential games of
imperfect information
  • Andrew Gilpin and Tuomas Sandholm
  • Carnegie Mellon University
  • Computer Science Department

2
Motivation: Poker
  • Poker is a wildly popular card game
  • This year's World Series of Poker prize pool
    surpassed $103 million, including $56 million for
    the World Championship event
  • ESPN is broadcasting parts of the tournament
  • Poker presents several challenges for AI:
  • Imperfect information
  • Risk assessment and management
  • Deception (bluffing, slow-playing)
  • Counter-deception (calling a bluff)

3
Rhode Island Hold'em poker: The Deal
4
Rhode Island Hold'em poker: Round 1
5
Rhode Island Hold'em poker: Round 2
6
Rhode Island Hold'em poker: Round 3
7
Rhode Island Hold'em poker: Showdown
8
Sneak preview of results: Solving Rhode Island
Hold'em poker
  • Rhode Island Hold'em poker was invented as a
    testbed for AI research [Shi & Littman 2001]
  • Game tree has more than 3.1 billion nodes
  • Previously, the best techniques did not scale to
    games this large
  • Using our algorithm we have computed optimal
    strategies for this game
  • This is the largest poker game solved to date by
    over four orders of magnitude

9
Outline of this talk
  • Game-theoretic foundations: Equilibrium
  • Model: Ordered games
  • Abstraction mechanism: Information filters
  • Strategic equivalence: Game isomorphisms
  • Algorithm: GameShrink
  • Solving Rhode Island Hold'em

10
Game Theory
  • In multi-agent systems, an agent's outcome
    depends on the actions of the other agents
  • Consequently, an agent's optimal action depends
    on the actions of the other agents
  • Game theory provides guidance as to how an agent
    should act
  • A game-theoretic equilibrium specifies a strategy
    for each agent such that no agent wishes to
    deviate
  • Such an equilibrium always exists [Nash 1950]

11
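A simple example
(Rock-Paper-Scissors; each cell is row payoff, column payoff)
            Rock     Paper    Scissors
Rock        0, 0     -1, 1    1, -1
Paper       1, -1    0, 0     -1, 1
Scissors    -1, 1    1, -1    0, 0
In equilibrium, each player plays each action with probability 1/3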
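
The matrix above matches Rock-Paper-Scissors if rows and columns are read in the order Rock, Paper, Scissors (an assumption about the slide's layout). As a minimal sketch, the check below confirms that mixing uniformly leaves the opponent with no profitable pure deviation; the names `ROW_PAYOFF` and `expected_payoffs_vs` are illustrative, not from the talk.

```python
from fractions import Fraction

# Row player's payoffs; the game is zero-sum, so the column player gets the negation.
# Assumed order of rows and columns: Rock, Paper, Scissors.
ROW_PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def expected_payoffs_vs(column_mix):
    """Expected payoff of each pure row strategy against a mixed column strategy."""
    return [sum(p * a for p, a in zip(column_mix, row)) for row in ROW_PAYOFF]

uniform = [Fraction(1, 3)] * 3
payoffs = expected_payoffs_vs(uniform)

# Every pure reply earns the same amount, so no deviation helps the row player.
# The game is symmetric, so the same argument applies to the column player.
is_equilibrium = all(v == payoffs[0] for v in payoffs)
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.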
12
Complexity of computing equilibria
  • Finding a Nash equilibrium is "a most fundamental
    computational problem whose complexity is wide
    open" and, "together with factoring, the most
    important concrete open question on the boundary
    of P today" [Papadimitriou 2001]
  • Even for games with only two players
  • There are algorithms (requiring exponential time
    in the worst case) for computing Nash equilibria
  • Good news: two-person zero-sum matrix games can
    be solved in poly-time using linear programming
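
For reference, one standard way to write the linear program for the row player of a two-person zero-sum matrix game is sketched below. The notation is an assumption for illustration, not taken from the slides: A is the m-by-n matrix of row-player payoffs, x is the row player's mixed strategy, and v is the payoff the row player can guarantee.

```latex
\begin{aligned}
\max_{x,\,v}\quad & v \\
\text{s.t.}\quad & \sum_{i=1}^{m} x_i A_{ij} \ge v, && j = 1,\dots,n \\
& \sum_{i=1}^{m} x_i = 1, \\
& x_i \ge 0, && i = 1,\dots,m
\end{aligned}
```

Each inequality says that x earns at least v against one pure column strategy, so the optimal v is the value of the game. The column player's strategy comes from the dual program.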

13
What about sequential games?
  • Sequential games involve turn-taking, moves of
    chance, and imperfect information
  • Every sequential game can be converted into a
    simultaneous-move game
  • Basic idea: make one strategy in the
    simultaneous-move game for every combination of
    action choices across all possible situations in
    the sequential game
  • This approach leads to an exponential blowup in
    the number of strategies
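
To make the blowup concrete, here is a small sketch that counts pure strategies in the (unreduced) converted game. The player and the numbers of information sets and actions are hypothetical, chosen only for illustration.

```python
from math import prod

def count_pure_strategies(actions_per_info_set):
    """A pure strategy picks one action at every information set,
    so the count is the product of the per-set action counts."""
    return prod(actions_per_info_set)

# Hypothetical player with 10 information sets and 3 actions at each.
small = count_pure_strategies([3] * 10)

# Doubling the number of information sets squares the count.
large = count_pure_strategies([3] * 20)
```

Here `small` is 3**10 and `large` is 3**20, so the strategy count grows exponentially while the sequential game itself grows only linearly in the number of information sets.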

14
Sequence form representation
  • The sequence form is an alternative
    representation that is more compact [Koller,
    Megiddo, von Stengel, Romanovskii]
  • Using the sequence form, two-player zero-sum
    games with perfect recall can be solved in time
    polynomial in the size of the game tree
  • But Texas Hold'em has 10^18 nodes
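
As a sketch of why the sequence form is compact, a player's strategy can be written as a realization plan x over that player's own action sequences. The notation below follows common presentations of the sequence form and is an assumption, not taken from the slides: σ_h is the player's sequence of own actions leading to information set h (unique under perfect recall), and A(h) is the set of actions available at h.

```latex
\begin{aligned}
x(\emptyset) &= 1 \\
\sum_{a \in A(h)} x(\sigma_h a) &= x(\sigma_h) && \text{for every information set } h \text{ of the player} \\
x(\sigma) &\ge 0 && \text{for every sequence } \sigma
\end{aligned}
```

There is one variable per sequence, that is 1 + Σ_h |A(h)| variables, which is linear in the size of the game tree rather than exponential.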

15
Our approach
  • Instead of developing an equilibrium-finding
    algorithm per se, we introduce an automated
    abstraction technique that results in a
    smaller, equivalent game
  • We prove that a Nash equilibrium in the smaller
    game corresponds to a Nash equilibrium in the
    original game
  • Our technique applies to n-player sequential
    games with observed actions and ordered signals

16
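Illustration of our approach
[Diagram: the original game is abstracted to a smaller game; a Nash equilibrium of the smaller game corresponds to a Nash equilibrium of the original game]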
17
Game with ordered signals (a.k.a. ordered game)
  1. Players I = {1, ..., n}
  2. Stage games G = (G1, ..., Gr)
  3. Player label L
  4. Game-ending nodes ω
  5. Signal alphabet T
  6. Signal quantities κ = (κ1, ..., κr) and γ = (γ1, ..., γr)
  7. Signal probability distribution p
  8. Partial ordering of subsets of T
  9. Utility function u (increasing in private signals)

18
Information filters
  • Observation: we can make games smaller by
    filtering the information a player receives
  • Instead of observing a specific signal exactly, a
    player observes a filtered set of signals
  • E.g. receiving the signal {A♠, A♥, A♦, A♣} instead
    of a single ace such as A♠
  • Combining an ordered game and a valid information
    filter yields a filtered ordered game
  • Prop.: a filtered ordered game is a finite
    sequential game with perfect recall
  • Corollary: if the filtered ordered game is
    two-person zero-sum, we can solve it in poly-time
    using linear programming
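
The idea of an information filter can be sketched as a partition of the signals into classes the player cannot tell apart. The example below uses a hypothetical filter that forgets suits. It only illustrates the mechanics: it makes no claim that this particular filter is valid in the talk's sense or preserves equilibria, and in a real poker game discarding suits can lose strategically relevant information such as flushes.

```python
from collections import defaultdict

RANKS = "23456789TJQKA"
SUITS = "shdc"  # spades, hearts, diamonds, clubs
DECK = [r + s for r in RANKS for s in SUITS]

def rank_filter(card):
    """Hypothetical filter: forget the suit, keep only the rank."""
    return card[0]

def filtered_signals(filter_fn, signals):
    """Group together the signals that the filter makes indistinguishable."""
    classes = defaultdict(list)
    for s in signals:
        classes[filter_fn(s)].append(s)
    return dict(classes)

classes = filtered_signals(rank_filter, DECK)
# A player who would have seen "As" now only learns the class
# classes["A"], which contains all four aces.
```

With this filter the 52 exact signals collapse into 13 classes, one per rank.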

19
Filtered signal trees
  • Every filtered ordered game has a corresponding
    filtered signal tree
  • Each edge corresponds to the revelation of some
    signal
  • Each path corresponds to the revelation of a set
    of signals
  • Our algorithms operate directly on the filtered
    signal tree
  • We never load the full game representation into
    memory

20
Filtered signal tree: Example
21
Ordered game isomorphic relation
  • The ordered game isomorphic relation captures the
    notion of strategic symmetry between nodes
  • We define the relationship recursively:
  • Two leaves are ordered game isomorphic if the
    payoffs to all players are the same at each leaf,
    for all action histories
  • Two internal nodes are ordered game isomorphic if
    they are siblings and there is a bijection
    between their children such that only ordered
    game isomorphic nodes are matched
  • We can compute this relationship efficiently
    using dynamic programming and perfect matching
    computations in a bipartite graph
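
The recursive definition above can be sketched in a simplified form. The code below is an illustration under stated assumptions, not the authors' implementation: the `Node` type is hypothetical, each leaf stores a single payoff tuple rather than payoffs for all action histories, the sibling requirement is left to the caller, and results are not memoized, so the dynamic-programming speedup is omitted. Perfect matchings are found with a plain augmenting-path search.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    payoffs: tuple = ()                       # used only at leaves
    children: list = field(default_factory=list)

def has_perfect_matching(left, right, compatible):
    """Augmenting-path bipartite matching; an edge joins each
    compatible (left, right) pair."""
    if len(left) != len(right):
        return False
    match_of_right = [None] * len(right)      # index of matched left node

    def try_assign(i, seen):
        for j in range(len(right)):
            if j not in seen and compatible(left[i], right[j]):
                seen.add(j)
                # Take j if it is free, or if its current partner can move.
                if match_of_right[j] is None or try_assign(match_of_right[j], seen):
                    match_of_right[j] = i
                    return True
        return False

    # A perfect matching exists only if every left node can be placed.
    return all(try_assign(i, set()) for i in range(len(left)))

def isomorphic(a, b):
    """Leaves match on equal payoffs; internal nodes match if their
    children can be paired so that every pair is itself isomorphic."""
    if not a.children and not b.children:
        return a.payoffs == b.payoffs
    if not a.children or not b.children:
        return False
    return has_perfect_matching(a.children, b.children, isomorphic)

# Two small trees whose children are the same up to reordering, and one that differs.
a = Node(children=[Node(payoffs=(1, -1)), Node(payoffs=(0, 0))])
b = Node(children=[Node(payoffs=(0, 0)), Node(payoffs=(1, -1))])
c = Node(children=[Node(payoffs=(0, 0)), Node(payoffs=(0, 0))])
same = isomorphic(a, b)
different = isomorphic(a, c)
```

In this sketch `a` and `b` are judged isomorphic because a bijection between their children exists, while `a` and `c` are not because no child of `c` matches the (1, -1) leaf.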

22
Ordered game isomorphic abstraction transformation
  • This operation transforms an existing information
    filter into a new filter that merges two ordered
    game isomorphic nodes
  • The new filter yields a smaller, abstracted game
  • Thm: if a strategy profile is a Nash equilibrium
    in the smaller, abstracted game, then it is a
    Nash equilibrium in the original game

23
Applying the ordered game isomorphic abstraction
transformation
26
GameShrink: Efficiently computing ordered game
isomorphic abstraction transformations
  • Recall: we have a dynamic program for determining
    if two nodes of the filtered signal tree are
    ordered game isomorphic
  • Algorithm: starting from the top of the filtered
    signal tree, perform the transformation where
    applicable
  • Approximation algorithm: instead of requiring a
    perfect matching, require a matching with a
    penalty below some threshold

27
GameShrink: Efficiently computing ordered game
isomorphic abstraction transformations
  • The Union-Find data structure provides an
    efficient representation of the information
    filter
  • Linear memory and almost linear time
  • Can eliminate certain perfect matching
    computations by using easy-to-check necessary
    conditions
  • Compact histogram databases for storing win/loss
    frequencies to speed up the checks
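
A Union-Find structure suits this role because each merge of two isomorphic nodes is a union of two classes in the filter. The following is a textbook sketch with union by size and path compression, which together give the near-linear running time mentioned above. The class and the card labels are illustrative and are not taken from GameShrink itself.

```python
class UnionFind:
    """Disjoint sets with union by size and path compression."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in items}

    def find(self, x):
        # First pass: locate the representative of x's class.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        # Attach the smaller class under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return ra

# Merge the four aces into one class; the king stays separate.
uf = UnionFind(["As", "Ah", "Ad", "Ac", "Ks"])
uf.union("As", "Ah")
uf.union("Ad", "Ac")
uf.union("As", "Ad")
```

After these unions, `find` returns the same representative for all four aces, so the filter needs only one entry per signal (linear memory).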

28
Solving Rhode Island Hold'em poker
  • GameShrink computes all ordered game isomorphic
    abstraction transformations in under one second
  • Without abstraction, the linear program has
    91,224,226 rows and columns
  • After applying GameShrink, the linear program has
    only 1,237,238 rows and columns
  • By solving the resulting linear program, we are
    able to compute optimal min-max strategies for
    this game
  • CPLEX's barrier method takes 7 days, 17 hours and
    25 GB of RAM to solve the resulting linear program
  • This is the largest poker game solved to date by
    over four orders of magnitude
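
As a quick worked calculation from the figures on this slide, the reduction in the linear program's size can be estimated as follows. The dense cell count is only a rough indication, since the actual constraint matrix is presumably sparse.

```python
rows_before = 91_224_226
rows_after = 1_237_238

# Reduction factor in each dimension (rows, and likewise columns).
per_dimension = rows_before / rows_after

# Reduction in rows x columns if the matrix were counted densely.
dense_cells = per_dimension ** 2
```

The program shrinks by a factor of roughly 74 in each dimension.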

29
Comparison to previous research
  • Rule-based: limited success in even small poker
    games
  • Simulation/Learning: do not take the multi-agent
    aspect into account
  • Game-theoretic, with manual abstraction:
    "Approximating Game-Theoretic Optimal Strategies
    for Full-scale Poker," Billings, Burch, Davidson,
    Holte, Schaeffer, Schauenberg, Szafron, IJCAI-03.
    Distinguished Paper Award.
  • Game-theoretic, with automated abstraction

30
Directions for future work
  • Computing strategies for larger games (requires
    approximation of solutions)
  • Tournament poker
  • More than two players
  • Other types of abstraction

31
Summary
  • Introduced an automatic method for performing
    abstractions in a broad class of games
  • Introduced information filters as a technique for
    working with games with imperfect information
  • Developed an equilibrium-preserving abstraction
    transformation, along with an efficient algorithm
  • Described a simple extension that yields an
    approximation algorithm for tackling even larger
    games
  • Solved the largest poker game to date
  • Playable on-line at
    http://www.cs.cmu.edu/gilpin/gsi.html

Thank you very much for your interest