Transcript and Presenter's Notes

Title: JuariBot


1
JuariBot Poker Playing Bot
  • Nimar S. Arora

2
Why Research Poker
  • Game of Imperfect Information
  • Game of Chance
  • Dynamic Environment (unlike solitaire)
  • Although zero-sum rather than general-sum, poker
    approximates many issues of the real world.

3
Texas Hold'em
  • 2 Hole Cards per player
  • 4 rounds of betting: Pre-flop, Flop (3 cards),
    Turn (1 card), River (1 card)
  • At most 4 raises per round
  • Bet size doubles on the Turn and River.
  • Big-blind and small-blind (cost of playing)

4
Poker Approaches: Game Theory
  • Zero-sum games have a Nash Equilibrium strategy
    which can be proven to be optimal for each
    player.
  • To solve for the Nash equilibrium, one needs to
    represent each player's strategy and form a
    payoff matrix for every combination of
    strategies.
  • Normal form of the payoff matrix is exponential
    in the game tree size
  • Koller's sequence form is linear in the game
    tree size. But Texas Hold'em has 10^18 nodes in
    the game tree.
  • PsOpti approximates the game tree down to 10^7
    nodes. There is no way to tell how good the
    approximation is.

5
Poker Approaches: Opponent Modeling
  • Opponent modeling based approaches deduce some
    sort of model of the opponent and predict
    expected payoffs
  • Prediction of payoff is based on either an
    expectimax search or Bayesian reasoning.
  • Action selection is based on an arbitrary
    function of expected payoffs
  • Example: Pr(a_i) = e^Exp(a_i) / n (sketched below)
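
A minimal Python sketch of this selection rule; the function name, the
example payoff numbers, and the reading of n as a normalizing constant are
assumptions for illustration, not part of the original slides.

  import math
  import random

  def softmax_action(expected_payoffs):
      """Pick an action with probability proportional to e^(expected payoff).

      expected_payoffs: dict mapping action -> estimated payoff from the
      opponent model. n normalizes the weights into a distribution.
      """
      weights = {a: math.exp(v) for a, v in expected_payoffs.items()}
      n = sum(weights.values())
      actions = list(weights)
      return random.choices(actions, weights=[weights[a] / n for a in actions])[0]

  # Illustrative expected payoffs (in bets) for the three legal actions
  print(softmax_action({"fold": 0.0, "call": 0.8, "raise": 1.2}))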

6
Rejected Approach: Reinforcement Learning
  • Learns a policy with a deterministic action in
    each state, but in poker we need a randomized
    strategy
  • RL in Markov Games (Littman) can learn a
    randomized policy. However, this requires both
    players to know the game state
  • In poker if both players know the game state then
    the best policy is deterministic!

7
My Take
  • The key issue is action selection
  • Build a decent opponent modeling system to
    predict odds of winning and expected payoffs
  • Compare different methods for action selection
  • Pr(a_i) = e^Exp(a_i) / n
  • Rule based (example: if odds > 0.7 and the
    opponent has not raised, then raise)

8
Opponent Modeling
  • Strategy Class
  • Learn Pr(action | hand)
  • Problem: the hand is not revealed unless the
    game goes to showdown
  • Observation Class
  • Learn Pr(action)
  • Deduce hand strength from the occurrence of
    infrequent actions

9
Action Observation
  • Build a histogram of bets made in each round of
    the game for each opponent
  • For an opponent who ends the round with a raise,
    assume that he would have gone to the maximum bet
    (see the sketch below)
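
A minimal sketch of such a per-opponent histogram, assuming the surrounding
code can report how many bets an opponent made in a round and whether his
last action was a raise; the names record_round and bet_distribution are
hypothetical.

  from collections import defaultdict

  MAX_BETS = 4  # betting is capped at 4 raises per round

  # histogram[(opponent, round)][bets] = number of times that bet count was seen
  bet_histograms = defaultdict(lambda: defaultdict(int))

  def record_round(opponent, betting_round, bets_made, ended_with_raise):
      """Update the opponent's bet histogram for one betting round.

      If the opponent ended the round with a raise we never saw how far he
      was willing to go, so credit him with the maximum bet (as on the slide).
      """
      if ended_with_raise:
          bets_made = MAX_BETS
      bet_histograms[(opponent, betting_round)][bets_made] += 1

  def bet_distribution(opponent, betting_round):
      """Empirical Pr(bets) for this opponent in this betting round."""
      counts = bet_histograms[(opponent, betting_round)]
      total = sum(counts.values()) or 1
      return {bets: c / total for bets, c in counts.items()}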

10
Deducing Hand Strength
  • From each opponent action (call or raise) deduce
    the upper and lower limit of the relative hand
    strength
  • From a raise we can only deduce a lower limit
  • Definition of Relative Hand Strength
  • Compute the odds of winning for each legal hand
    (randomly simulate unknown cards)
  • Sort all the legal hands by their odds of winning
  • Relative position of a hand in this order is its
    relative hand strength (a real number in [0, 1];
    sketched below)
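
A rough sketch of this definition; win_odds is assumed to be a Monte-Carlo
estimator of the winning probability (randomly simulating the unknown cards)
supplied by the caller, and the card/deck representation is left open.

  import itertools

  def relative_hand_strength(my_hole, board, deck, win_odds):
      """Relative hand strength as defined on this slide.

      win_odds(hole_cards, board) -> estimated probability of winning
      (placeholder for the simulation described above).
      """
      seen = set(my_hole) | set(board)
      remaining = [c for c in deck if c not in seen]
      # Every legal two-card hand, including my own
      legal_hands = [tuple(h) for h in itertools.combinations(remaining, 2)]
      legal_hands.append(tuple(my_hole))
      ranked = sorted(legal_hands, key=lambda h: win_odds(h, board))
      # Relative position in the sorted order, scaled into [0, 1]
      return ranked.index(tuple(my_hole)) / (len(ranked) - 1)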

11
Odds of Winning
  • Down-weight all possible opponent hands outside
    the limits deduced from the opponent actions
  • Compare my hand to each possible opponent hand
  • Compute weighted odds of winning (see the sketch below)
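
A sketch of the weighted comparison, assuming the limits [lower, upper] have
already been deduced from the opponent's actions; strength_of (relative
hand strength lookup) and beats (showdown comparison) are placeholder
callables, not functions from the original presentation.

  def weighted_win_odds(my_hand, board, opponent_hands,
                        strength_of, beats, lower, upper, discount=0.1):
      """Weighted odds of winning against one opponent.

      Hands whose relative strength falls outside [lower, upper] are
      down-weighted by the discount factor rather than excluded outright.
      """
      total = 0.0
      wins = 0.0
      for opp_hand in opponent_hands:
          weight = 1.0 if lower <= strength_of(opp_hand) <= upper else discount
          total += weight
          if beats(my_hand, opp_hand, board):
              wins += weight
      return wins / total if total else 0.0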

12
Action Selection
  • Simple rules
  • Never fold if odds of winning > 0.5
  • Never call a raise if odds of winning < 0.1
  • Always raise if odds of winning > 0.7 (and no one
    else has raised yet)
  • Otherwise Pr(a_i) = e^Exp(a_i) / n (sketched below)
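
A minimal sketch combining these rules with the softmax fallback; the action
names, the facing_raise flag, and the expected_payoffs argument (estimates
from the opponent model) are illustrative assumptions.

  import math
  import random

  def choose_action(win_odds, facing_raise, expected_payoffs):
      """Apply the simple rules above, then fall back to the softmax rule.

      win_odds: estimated probability of winning the hand.
      facing_raise: True if another player has already raised this round.
      expected_payoffs: dict action -> estimated payoff,
          e.g. {"fold": 0.0, "call": 0.6, "raise": 0.9}.
      """
      if win_odds > 0.7 and not facing_raise:
          return "raise"
      allowed = dict(expected_payoffs)
      if win_odds > 0.5:
          allowed.pop("fold", None)   # never fold with odds > 0.5
      if facing_raise and win_odds < 0.1:
          allowed.pop("call", None)   # never call a raise with odds < 0.1
      # Otherwise: Pr(a_i) = e^Exp(a_i) / n over the remaining actions
      weights = {a: math.exp(v) for a, v in allowed.items()}
      n = sum(weights.values())
      actions = list(weights)
      return random.choices(actions, weights=[weights[a] / n for a in actions])[0]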

13
Results
  • On the Univ. of Alberta poker server, the bot
    very consistently deduces the opponents' hands
  • Over the course of thousands of hands, it seemed
    to neither lose nor win

14
Future Work
  • Need to improve hand strength accuracy.
  • Prone to simulation errors
  • Python is quite slow!
  • Need to combine hand strength and relative hand
    strength in one pass to reduce the number of
    simulations
  • Need to experiment with different action
    selection rules
  • Experiment with different poker bots
  • Experiment with heads up play (so far only ring
    play)

15
Conclusions
  • Simple opponent modeling with easily available
    information
  • Provides a platform for future research in action
    selection