Title: Search applied to a problem against an adversary
1 Game playing
- Search applied to a problem against an adversary
- Some actions are not under the control of the problem-solver: there is an opponent (a hostile agent)
- Since it is a search problem, we must specify the states and operators/actions:
  - initial state: the current board
  - operators: the legal moves
  - goal state: game over
  - utility function: the value of the outcome of the game
- Usually, (board) games have well-defined rules and the entire state is accessible
2 Basic idea
- Consider all possible moves for yourself
- Consider all possible moves for your opponent
- Continue this process until a point is reached where we know the outcome of the game
- From this point, propagate the best move back up the tree:
  - choose the best move for yourself at every turn
  - assume your opponent will make the optimal move on their turn
3 Problem
- For interesting games, it is simply not computationally feasible to look at all possible moves
  - in chess, there are on average about 35 choices per turn
  - on average, a game lasts about 50 moves per player, i.e. about 100 plies
  - thus, the number of possibilities to consider is roughly 35^100
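As a rough sanity check of that number, here is a short illustrative Python computation (an aside, not part of the original slides):

import math

# ~35 legal moves per ply, ~100 plies (50 moves per player) in a typical game
b, plies = 35, 100
digits = plies * math.log10(b)
print(f"35^100 is roughly 10^{digits:.0f}")   # about 10^154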
4 Types of Games
Games vary along two dimensions:
- deterministic vs. chance
- perfect information vs. imperfect information
5 Deterministic Two-person Games
- Two players take turns.
- Each player knows what the other has done up to this point and what the other can do.
- One of the players wins and the other loses (or there is a draw).
6 Games as Search: Grundy's Game (Nilsson 1980, page 112)
- Start with a single stack of coins.
- On each turn, a player divides one of the current stacks into two unequal stacks (one having more coins than the other).
- The game ends when every stack contains one or two coins, since such stacks cannot be split further.
- The first player who cannot play loses.
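To make the state space concrete, here is a minimal Python sketch of Grundy's game states and legal moves; the representation as a sorted tuple of stack sizes is an assumption made for illustration, not something given on the slides:

from typing import List, Tuple

State = Tuple[int, ...]  # sorted sizes of the current stacks

def legal_moves(state: State) -> List[State]:
    """All states reachable by splitting one stack into two unequal stacks."""
    moves = []
    for i, stack in enumerate(state):
        # splitting into (k, stack - k) with k < stack - k keeps the parts unequal
        for k in range(1, (stack + 1) // 2):
            new_state = list(state[:i]) + [k, stack - k] + list(state[i + 1:])
            moves.append(tuple(sorted(new_state)))
    return moves

# Example: a 7-coin starting stack, as in the game-tree figure on the next slide
print(legal_moves((7,)))    # [(1, 6), (2, 5), (3, 4)]
print(legal_moves((1, 2)))  # []  (the player to move cannot play and loses)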
7 Grundy's Game: States
Can Max devise a strategy to always win?
(Figure: the game tree of states starting from a single stack of 7 coins; levels alternate between Max's turn and Min's turn.)
9 Conduct a depth-first search until a terminal node is reached, or a breadth-first search until all nodes at level 2 are generated, and then apply a static evaluation function to the frontier nodes.

Evaluation function example: in tic-tac-toe, for a node n,
e(n) = (number of complete lines, i.e. rows, columns, or diagonals, that are still open for Max)
     - (number of complete lines, i.e. rows, columns, or diagonals, that are still open for Min)
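A small Python sketch of this evaluation function; the board encoding and helper names are my own assumptions for illustration:

# Board: 9-element list, entries 'X' (Max), 'O' (Min), or ' ' (empty).
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def open_lines(board, player):
    """Count lines containing no opponent piece, i.e. lines still open for player."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))

def e(board):
    """e(n) = open lines for Max ('X') minus open lines for Min ('O')."""
    return open_lines(board, 'X') - open_lines(board, 'O')

print(e([' '] * 9))                  # 0 on the empty board
print(e(['X', ' ', ' ',
         ' ', ' ', ' ',
         ' ', ' ', ' ']))            # X in a corner: 8 - 5 = 3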
10 General Minimax Algorithm on a Game Tree
For each move by the computer:
1. perform a depth-first search down to the terminal states
2. assign utilities at each terminal state
3. propagate the minimax choices upwards:
   - if the parent is a minimizer (opponent), propagate up the minimum value of the children
   - if the parent is a maximizer (computer), propagate up the maximum value of the children
4. choose the move (the child of the current node) corresponding to the maximum of the minimax values of the children

Minimax values are gradually propagated upwards as the depth-first search proceeds, i.e., they propagate up the tree in a left-to-right fashion. Because the minimax values for each sub-tree are propagated upwards as we go, only O(bd) nodes need to be kept in memory at any time.
11 Minimax
(Figure: a game tree showing the minimax value computed by the function at each node.)
12 Minimax Algorithm Illustrated
(Figure: an example game tree annotated with the static evaluation of each leaf, the value returned at each internal node, and the move selected by minimax at the root.)
13 Minimax Algorithm
function MINIMAX(N) is
begin
    if N is a leaf then
        return the estimated score of this leaf
    else
        let N1, N2, ..., Nm be the successors of N
        if N is a Min node then
            return min(MINIMAX(N1), ..., MINIMAX(Nm))
        else
            return max(MINIMAX(N1), ..., MINIMAX(Nm))
end MINIMAX
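A runnable Python version of this pseudocode, for a game tree given explicitly as nested lists; the tree encoding (leaves are numbers, internal nodes are lists of successors) is an assumption made for the sketch:

def minimax(node, is_max):
    """Return the minimax value of node.

    A leaf is a number (its estimated score); an internal node is a list of
    successor nodes. is_max says whose turn it is at node; the players
    alternate, so successors of a Max node are Min nodes and vice versa.
    """
    if not isinstance(node, list):      # leaf: return its static score
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # Max root with three Min children
print(minimax(tree, is_max=True))            # -> 3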
14 Properties of Minimax
- Complete? Yes, if the tree is finite.
- Optimal? Yes, against an optimal opponent. Otherwise??
- Time complexity: O(b^m)
- Space complexity: O(bm)
(b is the branching factor, m is the depth of the tree)
- For chess, b ≈ 35 and m ≈ 100 for a reasonable game, which is infeasible.
15 Alpha-Beta (α-β) Pruning
- In minimax, position evaluation takes place only after the search is complete, which is inefficient.
- Essential idea: stop searching down a branch of the tree when you can determine that it is a dead end.
16 Alpha-Beta Procedure
- Depth-first search of the game tree, keeping track of:
  - Alpha: highest value seen so far on a maximizing level
  - Beta: lowest value seen so far on a minimizing level
- Pruning:
  - When maximizing, do not expand any more sibling nodes once a node has been seen whose evaluation is smaller than Alpha.
  - When minimizing, do not expand any more sibling nodes once a node has been seen whose evaluation is greater than Beta.
18 Alpha-Beta Pruning Example
(Figure: a Max root with three Min successors A1, A2, A3. A1's leaves A11, A12, A13 evaluate to 3, 12, 8, so A1 = 3 and the root value is at least 3. A2's first leaf A21 evaluates to 2, so A2 is at most 2 and its remaining leaves A22, A23, with unknown values x and y, are pruned. A3's leaves A31, A32, A33 evaluate to 14, 5, 2, so A3 = 2. The root's minimax value is 3.)
- Max(3, Min(2, x, y)) is always 3
- Min(2, Max(3, x, y)) is always 2
- We know this without knowing x and y.
19 Alpha-Beta Pruning: Summary
- Alpha: the value of the best choice we've found so far for MAX (highest)
- Beta: the value of the best choice we've found so far for MIN (lowest)
- When maximizing, cut off values lower than Alpha.
- When minimizing, cut off values greater than Beta.
20 Alpha-beta algorithm
function MAX-VALUE(state, game, alpha, beta)
    ;; alpha = best MAX so far, beta = best MIN so far
    if CUTOFF-TEST(state) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        alpha <- MAX(alpha, MIN-VALUE(s, game, alpha, beta))
        if alpha >= beta then return beta
    end
    return alpha

function MIN-VALUE(state, game, alpha, beta)
    if CUTOFF-TEST(state) then return EVAL(state)
    for each s in SUCCESSORS(state) do
        beta <- MIN(beta, MAX-VALUE(s, game, alpha, beta))
        if beta <= alpha then return alpha
    end
    return beta
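A runnable Python version of this procedure, using the same nested-list tree encoding as the minimax sketch above; the encoding and the math.inf bounds are assumptions of the sketch, not part of the slides:

import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf):
    """Alpha-beta search over a tree of nested lists (leaves are numbers)."""
    if not isinstance(node, list):          # leaf: return its static evaluation
        return node
    if is_max:
        for child in node:
            alpha = max(alpha, alphabeta(child, False, alpha, beta))
            if alpha >= beta:               # cut off: MIN above will never allow this
                return beta
        return alpha
    else:
        for child in node:
            beta = min(beta, alphabeta(child, True, alpha, beta))
            if beta <= alpha:               # cut off: MAX above will never allow this
                return alpha
        return beta

# The tree from the pruning example on slide 18: the leaves standing in for x and y
# are never examined, so any placeholder values give the same root value of 3.
tree = [[3, 12, 8], [2, 999, 999], [14, 5, 2]]
print(alphabeta(tree, is_max=True))         # -> 3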
21 Effectiveness of Alpha-Beta Search
- Worst case
  - branches are ordered so that no pruning takes place
  - alpha-beta gives no improvement over exhaustive search
- Best case
  - each player's best move is the left-most alternative (i.e., evaluated first)
  - in practice, performance is closer to the best case than to the worst case
- In practice we often get O(b^(d/2)) rather than O(b^d)
  - this is the same as having a branching factor of sqrt(b), since (sqrt(b))^d = b^(d/2)
  - i.e., we have effectively gone from b to the square root of b
  - e.g., in chess, from b = 35 to b ≈ 6
- This permits much deeper search in the same amount of time and makes computer chess competitive with humans!
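A quick numerical check of that claim (an illustrative aside, not from the slides):

import math

b, d = 35, 100
print(f"effective branching factor: sqrt({b}) = {math.sqrt(b):.1f}")   # ~5.9
print(f"b^d     is roughly 10^{d * math.log10(b):.0f}")                # ~10^154
print(f"b^(d/2) is roughly 10^{(d / 2) * math.log10(b):.0f}")          # ~10^77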
22 Alpha-Beta pruning in Tic-Tac-Toe