Title: Uri Zwick Tel Aviv University
 1Uri ZwickTel Aviv University
Simple Stochastic GamesMean Payoff GamesParity 
Games
TexPoint fonts used in EMF. Read the TexPoint 
manual before you delete this box. AAAA 
 2Randomized subexponential algorithm for SSG
Deterministic subexponential algorithm for PG 
 3Simple Stochastic Games
Mean Payoff Games
Parity Games 
 4A simple Simple Stochastic Game 
 5Simple Stochastic game (SSGs) Reachability 
version Condon (1992)
R
min
MAX
RAND
Two Players MAX and min
Objective MAX/min the probability of getting to 
the MAX-sink 
 6Simple Stochastic games (SSGs)Strategies
A general strategy may be randomized and history 
dependent
A positional strategy is deterministicand 
history independent
Positional strategy for MAX choice of an 
outgoing edge from each MAX vertex 
 7Simple Stochastic games (SSGs)Values
Every vertex i in the game has a value ?vi
general
positional
general
positional
Both players have positional optimal strategies
There are strategies that are optimal for every 
starting position 
 8Simple Stochastic game (SSGs) Condon (1992)
Terminating binary games
 The outdegrees of all non-sinks are 2
 All probabilities are ½.
The game terminates with prob. 1
Easy reduction from general gamesto terminating 
binary games 
 9Solving terminating binary SSGs
The values vi of the vertices of a game are the 
unique solution of the following equations
The values are rational numbersrequiring only a 
linear number of bits
Corollary Decision version in NP ? co-NP 
 10Value iteration (for binary SSGs)
Iterate the operator
Converges to the unique solution
But, may require an exponentialnumber of 
iterations to get close 
 11Simple Stochastic game (SSGs) Payoff version 
Shapley (1953)
R
min
MAX
RAND
Limiting average version
Discounted version 
 12Markov Decision Processes (MDPs)
R
min
MAX
RAND
Theorem Epenoux (1964)
Values and optimal strategies of a MDP can be 
found by solving an LP 
 13Markov Decision Processes (MDPs)
R
min
MAX
RAND
Theorem Epenoux (1970)
Values and optimal strategies of a MDP can be 
found by solving an LP 
 14SSG ? NP ? co-NP  Another proof
Deciding whether the value of a game isat least 
(at most) v is in NP ? co-NP
To show that value ? v ,guess an optimal 
strategy ? for MAX
Find an optimal counter-strategy ? for min by 
solving the resulting MDP.
Is the problem in P ? 
 15Mean Payoff Games (MPGs)Ehrenfeucht, Mycielski 
(1979)
R
min
MAX
RAND
Non-terminating version
Discounted version
PayoffSSGs
ReachabilitySSGs
MPGs
Pseudo-polynomial algorithm
(PZ96) 
 16Mean Payoff Games (MPGs)Ehrenfeucht, Mycielski 
(1979)
Again, both players have optimal positional 
strategies.
Value(s,?)  average of cycle formed 
 17Selecting the second largest element with only 
four storage locations PZ96 
 18Parity Games (PGs) A simple example
Priorities
2
3
2
1
4
1
EVEN wins if largest priorityseen infinitely 
often is even 
 19Parity Games (PGs)
EVEN wins if largest priorityseen infinitely 
often is even
Equivalent to many interesting problemsin 
automata and verification
Non-emptyness of ?-tree automata
modal ?-calculus model checking 
 20Parity Games (PGs)
Mean Payoff Games (MPGs)
Stirling (1993) Puri (1995)
Replace priority k by payoff (?n)k
Move payoffs to outgoing edges 
 21Simple Stochastic games (SSGs)Switches
A switch is a change of strategy at a single 
vertex
A switch is profitable for MAX if it increases 
the value of the game (sum of values of all 
vertices)
A strategy is optimal iff no switch is profitable 
 22Switches 
 23Strategy/Policy Iteration
Start with some strategy s (of MAX)
While there are improving switches, perform some 
of them
As each step is strictly improving and as there 
is a finite number of strategies, the algorithm 
must end with an optimal strategy
SSG ? PLS (Polynomial Local Search) 
 24Strategy/Policy IterationComplexity?
Performing only one switch at a time may lead to 
exponentially many improvements,even for MDPs 
Condon (1992)
What happens if we perform all profitable 
switches Hoffman-Karp (1966)
???
Not known to be polynomialO(2n/n) Mansour-Singh 
(1999)
No non-linear examples2n-O(1) Madani (2002) 
 25A randomized subexponential algorithm for simple 
stochastic games 
 26A randomized subexponentialalgorithm for binary 
SSGsLudwig (1995)Kalai (1992) 
Matousek-Sharir-Welzl (1992)
Start with an arbitrary strategy ? for MAX
Choose a random vertex i??VMAX
Find the optimal strategy ? for MAX in the 
gamein which the only outgoing edge of i is 
(i,?(i))
If switching ? at i is not profitable, then ? 
is optimal
Otherwise, let ??? (?)i and repeat 
 27A randomized subexponentialalgorithm for binary 
SSGsLudwig (1995)Kalai (1992) 
Matousek-Sharir-Welzl (1992)
MAX vertices
All correct !
Would never be switched !
There is a hidden order of MAX vertices under 
which the optimal strategy returned by the first 
recursive call correctly fixes the strategy of 
MAX at vertices 1,2,,i 
 28The hidden order
Let ui be the sum of values of the optimal 
strategy of MAX that agrees with s on i
Order the vertices such that
Positions 1,..,Iwere switchedand would neverbe 
switched again 
 29The hidden order
ui(s) - the maximum sum of values of a strategy 
of MAX that agrees with s on i 
 30The hidden order
Order the vertices such that
Positions 1,..,iwere switchedand would neverbe 
switched again 
 31SSGs are LP-type problems Halman (2002)
General (non-binary) SSGs can be solved in time 
Independently observed byBjörklund-Sandberg-Voro
byov (2005)
AUSO  Acyclic Unique Sink Orientations 
 32SSGs GPLCPGärtner-Rüst (2005)Björklund-
Svensson-Vorobyov (2005)
GPLCP Generalized Linear ComplementaryProblem 
with a P-matrix 
 33A deterministic subexponential algorithm for 
parity games
Mike PatersonMarcin JurdzinskiUri Zwick 
 34Parity Games (PGs) A simple example
Priorities
2
3
2
1
4
1
EVEN wins if largest priorityseen infinitely 
often is even 
 35Parity Games (PGs)
Mean Payoff Games (MPGs)
Stirling (1993) Puri (1995)
Replace priority k by payoff (?n)k
Move payoffs to outgoing edges 
 36Exponential algorithm for PGsMcNaughton (1993) 
 Zielonka (1998)
Vertices of highest priority(even) 
Firstrecursivecall
Second recursivecall
In the worst case, both recursive calls are on 
games of size n?1
Vertices from whichEVEN can force thegame to 
enter A 
 37Exponential algorithm for PGsMcNaughton (1993) 
 Zielonka (1998)
Vertices of highest priority(even)
Firstrecursivecall
Vertices from whichEVEN can force thegame to 
enter A 
Lemma (i) (ii) 
 38Exponential algorithm for PGsMcNaughton (1993) 
 Zielonka (1998)
Second recursivecall
In the worst case, both recursive calls are on 
games of size n?1 
 39Deterministic subexponential alg for PGs 
Jurdzinski, Paterson, Z (2006)
Idea Look for small dominions!
Second recursivecall
Dominions of size s can be found in O(ns) time
Dominion
Dominion A (small) set from which one of the 
players can win without the play ever leaving 
this set 
 40Open problems
- Polynomial algorithms? 
 - Is the Policy Improvement algorithm polynomial? 
 - Faster subexponential algorithmsfor parity 
games?  - Deterministic subexponential algorithmsfor MPGs 
and SSGs?  - Faster pseudo-polynomial algorithmsfor MPGs?