Title: Game Theory Folk Theorems
1Game Theory Folk Theorems
- Univ. Prof.dr. M.C.W. Janssen
- University of Vienna
- Winter semester 2008-9
- Week 48 (November 24-6)
2Notation in repeated games
- Define history of play as follows.
- Let $a^0 = (a^0_1, a^0_2, \ldots, a^0_n)$ be the action profile that is played in stage 0, i.e., the actions played by all players.
- History at the beginning of period 1: $h^1 = a^0$.
- History at the beginning of stage $t+1$: $h^{t+1} = (a^0, \ldots, a^t)$.
- The set $H^t$ is the set of all possible histories $h^t$; $A_i(h^t)$ is the set of actions that player i can choose after history $h^t$, and $A_i(H^t)$ is the union of these sets over all possible histories.
- Strategy $s_i$ of player i is a sequence of mappings $s_i^k$, where each $s_i^k$ maps $H^k$ to mixed actions.
- Note that you cannot condition on the random events.
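A strategy as a mapping from histories to actions can be sketched in code. This is a minimal illustration, assuming a two-player prisoner's dilemma with hypothetical action labels "C" and "D" and pure strategies only (a pure action is a degenerate mixed action); the names `grim_trigger`, `play` and `defect_in_period_one` are invented for this example.

```python
from typing import Callable, Tuple

Action = str                       # "C" (cooperate) or "D" (defect); hypothetical labels
Profile = Tuple[Action, Action]    # one action per player: (a_1, a_2)
History = Tuple[Profile, ...]      # h^t = (a^0, ..., a^{t-1})
Strategy = Callable[[History], Action]


def grim_trigger(player: int) -> Strategy:
    """Cooperate until the other player has ever defected, then defect forever."""
    other = 1 - player

    def strategy(history: History) -> Action:
        return "D" if any(profile[other] == "D" for profile in history) else "C"

    return strategy


def play(s1: Strategy, s2: Strategy, periods: int) -> History:
    """Build the history stage by stage from two pure strategies."""
    history: History = ()
    for _ in range(periods):
        history += ((s1(history), s2(history)),)
    return history


def defect_in_period_one(history: History) -> Action:
    """Hypothetical deviation: play D in stage 1 only, C otherwise."""
    return "D" if len(history) == 1 else "C"


cooperative_path = play(grim_trigger(0), grim_trigger(1), 3)
deviation_path = play(grim_trigger(0), defect_in_period_one, 4)
```

Each strategy sees only the tuple of past action profiles, which is exactly the information contained in $h^t$.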
3Pay-offs in repeated games
- Overall pay-offs $u_i$, stage game pay-offs $g_i$, continuation pay-off from period t onwards.
- Want to have an expression where one can easily compare stage game pay-offs and repeated game pay-offs, i.e., normalisation.
- Time averaging is sometimes used for the case of complete patience.
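The normalisation can be written out as follows. This is a sketch in the standard textbook form, not necessarily the exact expressions used in the lecture; it assumes a discount factor $\delta \in [0,1)$ and writes $a^t$ for the action profile played in stage t.

```latex
% Normalised discounted pay-off: the factor (1 - \delta) puts u_i in stage-game units
u_i = (1-\delta) \sum_{t=0}^{\infty} \delta^{t}\, g_i(a^t)

% Continuation pay-off from period t onwards, normalised the same way
(1-\delta) \sum_{\tau=t}^{\infty} \delta^{\tau-t}\, g_i(a^\tau)

% Time-average criterion (complete patience)
\liminf_{T\to\infty} \frac{1}{T} \sum_{t=0}^{T-1} g_i(a^t)
```

With this normalisation, a player who receives the same stage pay-off c in every period has an overall pay-off of exactly c.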
4Folk Theorem I
- If players are sufficiently patient, then any feasible, individually rational pay-offs can be enforced by an equilibrium.
- Individually rational pay-offs: those above the minimax pay-off $\underline{v}_i$.
- $m^i_j$ is the action player j chooses to minimax player i.
- Feasible pay-offs: the convex hull V of the static game pay-offs, i.e., $V = \text{convex hull}\,\{v \mid \text{there is an } a \in A \text{ such that } g(a) = v\}$.
- Both terms need some explanation.
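For reference, the minimax pay-off is usually defined as below. This is the standard textbook definition rather than a formula taken from the slides; the symbols $\alpha_i$ and $\alpha_{-i}$ (mixed actions of player i and of the other players) are notation assumed here.

```latex
% Lowest pay-off the other players can hold player i to,
% given that player i best-responds to their (mixed) actions
\underline{v}_i = \min_{\alpha_{-i}} \max_{\alpha_i} g_i(\alpha_i, \alpha_{-i})
```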
5Minimax pay-offs
- What are the Nash equilibria of this game?
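- Denote by q the probability that player 2 chooses L.
- Pure strategy eq: none. Against L ($q = 1$) player 1 prefers M, since $3q - 2 = 1 > 0$; against R ($q = 0$) player 1 prefers U, since $-3q + 1 = 1 > 0$.
- In a mixed strategy eq player 1 plays D and $1/3 \le q \le 2/3$; pay-offs are 0 and 1.
- Minimax for player 1: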
- $u(U) = -3q + 1$
- $u(M) = 3q - 2$
- $u(D) = 0$
- Minimax is 0
- Minimax for player 2 is also 0, by player 1 choosing (½,½,0)
- Thus, minimax pay-offs can be lower than Nash eq. pay-offs
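The minimax computation for player 1 can be checked numerically. The sketch below uses only the three expected pay-off expressions listed above; the grid over q is an illustrative assumption, and the function names are invented for this example.

```python
from fractions import Fraction


def player1_payoffs(q: Fraction) -> dict:
    """Player 1's expected pay-off from each row when player 2 plays L with probability q."""
    return {"U": -3 * q + 1, "M": 3 * q - 2, "D": Fraction(0)}


def best_response_value(q: Fraction) -> Fraction:
    """Pay-off player 1 obtains by best-responding to q."""
    return max(player1_payoffs(q).values())


# Player 2 picks q to make player 1's best-response pay-off as small as possible.
grid = [Fraction(k, 12) for k in range(13)]
minimax_value = min(best_response_value(q) for q in grid)
minimising_q = [q for q in grid if best_response_value(q) == minimax_value]
```

Because D guarantees player 1 a pay-off of 0 for every q, the grid minimum of 0 is also the exact minimax value; it is attained for q between 1/3 and 2/3.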
6Feasible pay-offs
- Equilibrium pay-offs are (2,1), (1,2) and (?, ?)
- Convex hull is the triangle connecting the three points (it also contains e.g. (1½,1½))
- But (1½,1½) cannot be obtained by independent mixing, only as correlated eq.
- Correlated mixing can happen in a repeated setting by alternating between playing two equilibria (and time averaging pay-offs or $\delta$ close to 1)
[Figure: Eq. pay-offs]
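The effect of alternating between the two equilibria with pay-offs (2,1) and (1,2) can be illustrated numerically. This is a sketch under assumptions: pay-offs are normalised by $(1-\delta)$, the infinite sum is truncated after a hypothetical 5000 stages, and $\delta = 0.99$ is chosen only for illustration.

```python
def normalised_discounted_payoff(stream, delta: float, periods: int = 5000) -> float:
    """(1 - delta) * sum_t delta^t * stream(t), truncated after `periods` stages."""
    return (1 - delta) * sum(delta ** t * stream(t) for t in range(periods))


def alternating(first: float, second: float):
    """Stage pay-off when play alternates: `first` in even stages, `second` in odd stages."""
    return lambda t: first if t % 2 == 0 else second


delta = 0.99
payoff_1 = normalised_discounted_payoff(alternating(2, 1), delta)  # player 1 gets 2, 1, 2, ...
payoff_2 = normalised_discounted_payoff(alternating(1, 2), delta)  # player 2 gets 1, 2, 1, ...
```

In closed form the two pay-offs are $(2+\delta)/(1+\delta)$ and $(1+2\delta)/(1+\delta)$, so both approach 1½ as $\delta$ goes to 1.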
7Folk Theorem II
- Prop. For every feasible pay-off vector v with $v_i > \underline{v}_i$ for all players i, there exists a $\underline{\delta} < 1$ such that for all $\delta > \underline{\delta}$ there exists a Nash equilibrium of the infinitely repeated game with pay-off v.
- Pay-offs in the repeated game can be not only larger, but also smaller than static Nash eq. pay-offs!
- Basic idea: if players are sufficiently patient, then any finite gain from a one-period deviation is nothing compared to a small, but permanent loss in future pay-offs (punishment by minimaxing a player).
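The trade-off between a one-period gain and a permanent loss can be made concrete. The sketch below assumes normalised discounted pay-offs, a deviator who earns the best one-period deviation pay-off once and the minimax pay-off afterwards, and purely hypothetical numbers; the function names are invented for this example.

```python
def deviation_is_unprofitable(v: float, best_deviation: float, minimax: float, delta: float) -> bool:
    """Compare conforming forever (v) with deviating once and being minimaxed afterwards."""
    deviation_payoff = (1 - delta) * best_deviation + delta * minimax
    return deviation_payoff <= v


def critical_delta(v: float, best_deviation: float, minimax: float) -> float:
    """Discount factor at which the player is exactly indifferent.

    Requires minimax < v <= best_deviation.
    """
    return (best_deviation - v) / (best_deviation - minimax)


# Hypothetical numbers: target pay-off 3, best one-period deviation 5, minimax pay-off 1.
threshold = critical_delta(v=3, best_deviation=5, minimax=1)
```

With these numbers the threshold is 0.5: a player with $\delta = 0.6$ conforms, while a player with $\delta = 0.4$ prefers to deviate.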
8Proof Nash Folk Theorem
- Consider a feasible pay-off v and an action profile a with $g(a) = v$.
- If there is no action profile a that yields v, you may choose a sequence of actions such that v is (close to) the average (discounted) pay-off (or use a public randomization).
- Consider the strategy: start by playing $a_i$; play $a_i$ as long as the others do; if one player j deviates, minimax him forever, i.e., choose $m^j_i$.
- A deviation in period t yields a pay-off consisting of a one-period gain followed by the minimax pay-off forever,
- which is smaller than $v_i$ if $\delta$ is larger than $\underline{\delta}_i$, where $\underline{\delta}_i$ solves the corresponding indifference condition.
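The two expressions this argument relies on can be written out as follows. This is a sketch in the standard textbook form, assuming normalised discounted pay-offs and bounding the one-period deviation gain by $\max_a g_i(a)$; it is not a transcription of the lecture's own formulas.

```latex
% Upper bound on player i's pay-off from deviating in period t and being minimaxed afterwards
(1-\delta^{t})\, v_i \;+\; \delta^{t}(1-\delta) \max_{a} g_i(a) \;+\; \delta^{t+1}\, \underline{v}_i

% Threshold discount factor for player i: the deviation pay-off equals v_i
(1-\underline{\delta}_i) \max_{a} g_i(a) \;+\; \underline{\delta}_i\, \underline{v}_i \;=\; v_i
```

Subtracting $(1-\delta^t) v_i$ and dividing by $\delta^t$ shows that the first expression is at most $v_i$ exactly when $(1-\delta)\max_a g_i(a) + \delta\,\underline{v}_i \le v_i$, which holds for every $\delta \ge \underline{\delta}_i$ because $\underline{v}_i < v_i$.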
9Is the threat of Minimaxing credible?
- If we restrict analysis to static Nash threats, then Friedman shows that only pay-offs larger than the static Nash equilibrium pay-offs can be supported.
- Others show that in games where the minimax pay-offs are lower than the static equilibrium pay-offs, even worse outcomes can be compatible with an SPE of the infinitely repeated game.
10Basic idea of SPE with minimax pay-offs: time averaging
- After a deviation, play the minimax profile for N periods, where N is chosen for all players s.t. the one-period gain from deviating is outweighed by N periods of punishment.
- After N periods return to the cooperative mode.
- (finite) N ensures that no player has an incentive to deviate.
- Cost of punishment is extremely small, as with time averaging the pay-offs in a finite number of periods do not make a difference.
- Average pay-off to player j when i is punished is $v_j$.
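Choosing the punishment length N can be sketched as below. The condition used, $\max_a g_i(a) + N\underline{v}_i < \min_a g_i(a) + N v_i$ for every player i, is the one commonly found in textbook proofs and is an assumption here, as are the numbers and the function name `punishment_length`.

```python
import math


def punishment_length(g_max: float, g_min: float, v: float, minimax: float) -> int:
    """Smallest integer N with g_max + N * minimax < g_min + N * v (requires v > minimax)."""
    if v <= minimax:
        raise ValueError("target pay-off must exceed the minimax pay-off")
    return math.floor((g_max - g_min) / (v - minimax)) + 1


# Hypothetical stage-game numbers for two players: (g_max, g_min, v_i, minimax_i)
players = [(5, 0, 3, 1), (4, -2, 2, 0)]

# N must work for all players, so take the largest individual requirement.
N = max(punishment_length(*p) for p in players)
```

Because the inequality is strict, a player whose ratio is exactly an integer still needs one extra period; that is why the second hypothetical player requires N = 4 rather than 3.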
11Basic idea of SPE with minimax pay-offs: discounted pay-offs
- Reward punishers, instead of punishing them if they don't punish.
- Choose a vector in the interior of V such that for each i you can still give a higher pay-off.
- V needs to be of full dimension.
- Play in three phases:
- Initial cooperative phase.
- Punishment phase where players minimax the deviator for N periods (as before); switch back to the initial phase if this happens.
- If a player deviates in the punishment phase, start to punish that player.
- What to do in case pay-offs can only be obtained with randomizations?
12Renegotiation proofness in repeated games
- Is SPE the best notion of a credible threat?
- Suppose you cooperate for some time in the PD and then someone defects, by chance. Should you go back immediately to always defect?
- Or should players renegotiate?
- It is in both players' interest to revert back to the cooperative outcome.
- In any subgame, the equilibrium played must not be Pareto-dominated.
- Pareto-optimality as an assumption and the critique that is possible (risk dominance and Pareto-dominance).
- Deviations are accidents and unlikely to be repeated? Bygones are bygones.
13Pareto perfection only applies in two-player games
[Figure: pay-off matrices A and B]
- Two Nash equilibria in pure strategies: (U,L,A) and (D,R,B).
- (U,L,A) is Pareto-efficient.
- Natural candidate?
- Suppose players 1 and 2 expect the matrix chooser to choose A. Then they can renegotiate and gain by playing (D,R).
14Definition of Pareto perfect equilibrium
- Fix stage game g and play it for T periods.
- Let P(T) be the set of pay-offs of pure strategy SPE of G(T).
- R(t) is the set of strongly efficient points of P(t), i.e., the set of points such that there is no other pay-off point where no player is worse off and some player is better off.
- Set Q(1) = P(1).
- For any t, let Q(t) be the set of pay-offs of pure strategy SPE that can be enforced with continuation pay-offs in R(t-1).
- An SPE is Pareto perfect if for every possible history and in every time period t, the continuation pay-offs are in R(T-t).
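The notion of strongly efficient points can be illustrated on a finite pay-off set. The sketch below is an assumption-laden example: the list `P` is a hypothetical stand-in for P(t), and the function names are invented here.

```python
from typing import List, Tuple

Payoff = Tuple[float, ...]


def dominates(x: Payoff, y: Payoff) -> bool:
    """x Pareto-dominates y: no player is worse off and some player is better off."""
    return all(a >= b for a, b in zip(x, y)) and any(a > b for a, b in zip(x, y))


def strongly_efficient(points: List[Payoff]) -> List[Payoff]:
    """Points of the finite set that are not Pareto-dominated by another point in the set."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical finite pay-off set standing in for P(t)
P = [(4, 2), (2, 4), (3, 3), (2, 2), (3, 2)]
R = strongly_efficient(P)
```

Here (2,2) is dominated by (3,3) and (3,2) is dominated by (4,2), so only (4,2), (2,4) and (3,3) remain.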
15Pareto perfection restricts threats
- Some efficient equilibria cannot be supported anymore under Pareto-perfection.
- It restricts the set of threats, and thereby makes it more difficult to keep players on the equilibrium path.
- Example
16Example Pareto-perfection
- Three pure strategy equilibria in G(1) with pay-offs (4,2), (2,4) and (3,3).
- In G(2) without discounting, a pay-off of 8 is possible. It is the unique element in R(2).
- Without restriction to Pareto perfection, a pay-off of 13 is possible in G(3).
- With Pareto perfection, no threat is possible in the first period of G(3): one has to play a stage game equilibrium.
- Equilibrium play alternates between odd and even periods under Pareto perfection.
17Exercises
- Fudenberg and Tirole 4.5
- Fudenberg and Tirole 5.1
- Consider the following normal form and the infinite repetition of it. What are the SPE of the infinite game? How does your result depend on $\delta$?