1
Computing Nash Equilibrium
  • Presenter: Yishay Mansour

2
Outline
  • Problem Definition
  • Notation
  • Last week: zero-sum games
  • This week:
  • Zero-sum games: online algorithm
  • General-sum games
  • Multiple players: approximate Nash
  • Two players: exact Nash

3
Model
  • Multiple players $N = \{1, \ldots, n\}$
  • Strategy set
  • Player $i$ has $m$ actions $S_i = \{s_{i1}, \ldots, s_{im}\}$
  • $S_i$ are the pure actions of player $i$
  • $S = \times_i S_i$
  • Payoff functions
  • Player $i$: $u_i : S \to \mathbb{R}$

4
Strategies
  • Pure strategies: actions
  • Mixed strategy
  • Player $i$: $p_i$, a distribution over $S_i$
  • Game: $P = \times_i p_i$
  • Product distribution
  • Modified distribution
  • $P_{-i}$: the profile $P$ except for player $i$
  • $(q, P_{-i})$: player $i$ plays $q$, every other player $j$ plays $p_j$

5
Notations
  • Average payoff
  • Player $i$: $u_i(P) = E_{s \sim P}[u_i(s)] = \sum_s P(s)\, u_i(s)$
  • $P(s) = \prod_i p_i(s_i)$
  • Nash equilibrium
  • $P$ is a Nash equilibrium if for every player $i$
  • and any distribution $q_i$
  • $u_i(q_i, P_{-i}) \le u_i(P)$
  • Best response

6
Two player games
  • Payoff matrices (A,B)
  • m rows and n columns
  • player 1 has $m$ actions, player 2 has $n$ actions
  • strategies $p$ and $q$
  • Payoffs: $u_1(p,q) = p A q^T$ and $u_2(p,q) = p B q^T$
  • Zero-sum game
  • $A = -B$
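
The payoff formulas and the Nash condition from the previous slides can be checked numerically. The following is a minimal sketch, not code from the presentation: the function names and the matching-pennies matrices are illustrative assumptions. It relies on the fact that a player's payoff is linear in their own mixed strategy, so checking pure deviations is enough.

```python
from fractions import Fraction as F

def u(Mx, p, q):
    """Expected payoff p * Mx * q^T for mixed strategies p (rows) and q (columns)."""
    return sum(p[i] * Mx[i][j] * q[j] for i in range(len(p)) for j in range(len(q)))

def is_nash(A, B, p, q):
    """True if no player gains by deviating; pure deviations suffice by linearity."""
    m, n = len(p), len(q)
    pure = lambda k, size: [F(int(i == k)) for i in range(size)]
    best1 = max(u(A, pure(i, m), q) for i in range(m))  # player 1's best deviation
    best2 = max(u(B, p, pure(j, n)) for j in range(n))  # player 2's best deviation
    return best1 <= u(A, p, q) and best2 <= u(B, p, q)

# Hypothetical example: matching pennies, a zero-sum game with A = -B.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
half = [F(1, 2), F(1, 2)]
print(is_nash(A, B, half, half))      # uniform play by both players
print(is_nash(A, B, [1, 0], [1, 0]))  # a pure profile
```

Exact rational arithmetic (`Fraction`) is used so that the equality cases of the Nash condition are not blurred by rounding.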

7
Online learning
  • Playing with unknown payoff matrix
  • Online algorithm
  • at each step selects an action.
  • can be stochastic or fractional
  • Observes all possible payoffs
  • Updates its parameters
  • Goal: achieve the value of the game
  • The payoff matrix of the game is defined at the end

8
Online learning - Algorithm
  • Notations
  • Opponent distribution $Q_t$
  • Our distribution $P_t$
  • Observed cost $M(i, Q_t)$
  • The observed cost vector is $M Q_t$, and $M(P_t, Q_t) = P_t M Q_t$
  • costs in $[0,1]$
  • Goal: minimize cost
  • Algorithm: exponential weights
  • Action $i$ has weight proportional to $b^{L(i,t)}$
  • $L(i,t)$: loss of action $i$ until time $t$

9
Online algorithm: Notations
  • Formally
  • The total number of steps $T$ is known
  • Parameter $b$, $0 < b < 1$
  • $w_{t+1}(i) = w_t(i)\, b^{M(i,Q_t)}$
  • $Z_{t+1} = \sum_i w_{t+1}(i)$
  • $P_{t+1}(i) = w_{t+1}(i) / Z_{t+1}$
  • Initially, $P_1(i) > 0$ for every $i$
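
The update rule on this slide can be sketched in a few lines. This is an illustrative sketch, assuming the weights are normalized by their own sum after each update; the function names and the small cost matrix are hypothetical.

```python
def expected_costs(M, Q):
    """M(i, Q) = sum_j M[i][j] * Q[j]: expected cost of each row action against Q."""
    return [sum(mij * qj for mij, qj in zip(row, Q)) for row in M]

def exp_weights_step(w, costs, b):
    """One update w_{t+1}(i) = w_t(i) * b ** M(i, Q_t); returns (weights, distribution)."""
    new_w = [wi * b ** c for wi, c in zip(w, costs)]
    z = sum(new_w)                        # normalizing constant
    return new_w, [wi / z for wi in new_w]

# Hypothetical run: costs in [0, 1], opponent always plays the first column.
M = [[0, 1], [1, 0]]
w, P = [1.0, 1.0], [0.5, 0.5]             # uniform start, so P_1(i) > 0 for every i
for _ in range(2):
    w, P = exp_weights_step(w, expected_costs(M, [1, 0]), b=0.5)
print(P)  # mass moves toward the row with cost 0
```

A smaller `b` reacts faster to observed costs; the later optimization slide picks `b` as a function of `T` and the number of actions.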

10
Online algorithm: Theorem
  • Theorem
  • For any matrix $M$ with entries in $[0,1]$
  • and any sequence of distributions $Q_1, \ldots, Q_T$
  • the algorithm generates $P_1, \ldots, P_T$
  • $RE(A \| B) = E_{x \sim A}[\ln(A(x) / B(x))]$
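
The inequality of the theorem did not survive in this transcript. For reference, the guarantee usually stated for this multiplicative-weights scheme is given below, under the assumption that the deck follows the standard Freund-Schapire style analysis (which its notation suggests). The constant names $a_b$ and $c_b$ are introduced here for illustration only.

```latex
\sum_{t=1}^{T} M(P_t, Q_t) \;\le\;
\min_{P} \Big[\, a_b \sum_{t=1}^{T} M(P, Q_t) \;+\; c_b \, RE(P \,\|\, P_1) \Big],
\qquad a_b = \frac{\ln(1/b)}{1-b}, \quad c_b = \frac{1}{1-b}.
```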

11
Relative Entropy
  • For any two distributions $A$ and $B$
  • $RE(A \| B) = E_{x \sim A}[\ln(A(x) / B(x))]$
  • Can be infinite
  • when $B(x) = 0$ and $A(x) \ne 0$
  • Always non-negative
  • log is concave
  • $\sum_i a_i \log b_i \le \log \sum_i a_i b_i$
  • $\sum_x A(x) \ln \frac{B(x)}{A(x)} \le \ln \sum_x A(x) \frac{B(x)}{A(x)} = 0$
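
A small numeric sketch of the definition above, covering the infinite case and the zero case. The function name and the sample distributions are illustrative assumptions, and distributions are given as plain lists over the same finite set.

```python
import math

def relative_entropy(A, B):
    """RE(A || B) = sum_x A(x) * ln(A(x) / B(x)), with 0 * ln(0 / b) taken as 0."""
    total = 0.0
    for a, b in zip(A, B):
        if a == 0:
            continue            # points outside the support of A contribute nothing
        if b == 0:
            return math.inf     # B(x) = 0 while A(x) != 0
        total += a * math.log(a / b)
    return total

print(relative_entropy([0.5, 0.5], [0.5, 0.5]))  # identical distributions
print(relative_entropy([1.0, 0.0], [0.5, 0.5]))  # equals ln 2
print(relative_entropy([0.5, 0.5], [1.0, 0.0]))  # infinite
```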

12
Online algorithm: Analysis
  • Lemma
  • For any mixed strategy P
  • Corollary

13
Online algorithm: Optimization
  • $b = 1 / (1 + \sqrt{2 (\ln n) / T})$
  • additional loss
  • $O(\sqrt{(\ln n) / T})$
  • Zero-sum game
  • Average loss: the value $v$
  • plus additional loss $O(\sqrt{(\ln n) / T})$
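
The tuned parameter can be tried out on a small zero-sum game. The sketch below is an illustration under stated assumptions: the cost matrix is matching pennies with costs in [0,1] (value 1/2), the opponent always picks the column that costs us most, and the explicit bound `sqrt(2 ln n / T) + ln n / T` is the constant from the standard analysis of this scheme (the slide only states the O(.) form).

```python
import math

def average_loss(M, T):
    """Run exponential weights for T rounds against a worst-case column each round."""
    n, cols = len(M), len(M[0])
    b = 1 / (1 + math.sqrt(2 * math.log(n) / T))   # parameter from the slide
    w = [1.0] * n                                   # uniform initial distribution
    total = 0.0
    for _ in range(T):
        z = sum(w)
        P = [wi / z for wi in w]
        # Expected cost of our mixed strategy P against each pure column.
        col_cost = [sum(P[i] * M[i][j] for i in range(n)) for j in range(cols)]
        j = max(range(cols), key=col_cost.__getitem__)   # adversarial column
        total += col_cost[j]
        w = [w[i] * b ** M[i][j] for i in range(n)]      # multiplicative update
    return total / T

M = [[0, 1], [1, 0]]      # hypothetical cost matrix, game value v = 1/2
T = 1000
bound = math.sqrt(2 * math.log(len(M)) / T) + math.log(len(M)) / T
avg = average_loss(M, T)
print(avg, 0.5 + bound)   # average loss stays between v and v + bound
```

Because the opponent best-responds in every round, each round costs at least the value of the game, so the average loss is squeezed between `v` and `v` plus the additional loss.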

14
Example Zero Sum
5 1
3 2
2 3
3 4
15
Two players General sum games
  • Input matrices (A,B)
  • No unique value
  • Computational issues
  • find some Nash,
  • all Nash
  • Can be exponentially many
  • identity matrix
  • Example 2xN

16
Computational Complexity
  • The complexity of finding a sample equilibrium is unknown
  • no proof of NP-completeness seems possible (Papadimitriou, 94)
  • Finding equilibria with certain properties is NP-hard
  • e.g., max-payoff, max-support
  • (Even) for symmetric 2-player games
  • $\exists$ NE with expected social welfare at least $k$?
  • $\exists$ NE with least payoff at least $k$?
  • $\exists$ Pareto-optimal NE?
  • $\exists$ NE with player 1 expected utility of at least $k$?
  • $\exists$ multiple NE?
  • $\exists$ NE where player 1 plays (or does not play) a particular strategy?

(Gilboa & Zemel; Conitzer & Sandholm)
17
Two players General sum games
  • Player 1's best response
  • As in the zero-sum case
  • Fix strategy $q$ of player 2
  • maximize $p (A q^T)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$
  • Dual LP: minimize $u$ such that $u \ge A q^T$ (componentwise)
  • Strong duality: $p (A q^T) = u = p\,u$
  • $p (u - A q^T) = 0$
  • Complementary system
  • Player 2: $q (v - pB)^T = 0$

18
Nash Linear Complementary System
  • Find distributions $p$ and $q$ and values $u$ and $v$
  • $u \ge A q^T$
  • $v \ge pB$
  • $p (u - A q^T) = 0$
  • $q (v - pB)^T = 0$
  • $\sum_j p_j = 1$ and $p_j \ge 0$
  • $\sum_j q_j = 1$ and $q_j \ge 0$
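
The conditions of the system can be verified mechanically for a candidate solution. This is a minimal sketch: the function name and the 2x2 coordination game are hypothetical and not taken from the slides.

```python
from fractions import Fraction as F

def satisfies_lcs(A, B, p, q, u, v):
    """Check feasibility and both complementarity conditions of the Nash system."""
    m, n = len(A), len(A[0])
    Aq = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]   # A q^T
    pB = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]   # p B
    feasible = (sum(p) == 1 and sum(q) == 1 and min(p) >= 0 and min(q) >= 0
                and all(u >= x for x in Aq) and all(v >= y for y in pB))
    slack_p = sum(p[i] * (u - Aq[i]) for i in range(m))   # p (u - A q^T)
    slack_q = sum(q[j] * (v - pB[j]) for j in range(n))   # q (v - p B)^T
    return feasible and slack_p == 0 and slack_q == 0

# Hypothetical game with one mixed and two pure equilibria.
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]
mixed = satisfies_lcs(A, B, [F(2, 3), F(1, 3)], [F(1, 3), F(2, 3)], F(2, 3), F(2, 3))
pure = satisfies_lcs(A, B, [1, 0], [1, 0], 2, 1)
bad = satisfies_lcs(A, B, [1, 0], [0, 1], 1, 0)
print(mixed, pure, bad)
```

Complementarity says that every action played with positive probability must attain the bound `u` (or `v`), which is exactly the best-response condition.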

19
Two players General sum games
  • Assume the supports of the strategies are known
  • $p$ has support $S_p$ and $q$ has support $S_q$
  • Can formulate the Nash equilibrium as an LP
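
Once the supports are fixed, the equilibrium conditions become linear. The sketch below is a simplification of the LP, assuming a non-degenerate game so that only equal-size supports need to be tried: it solves the indifference equations exactly and then checks the remaining inequalities. All function names are illustrative, and the 3x2 payoff matrices are the ones from the deck's Lemke-Howson example, read here as player 1's and player 2's payoffs.

```python
from fractions import Fraction as F
from itertools import combinations

def solve(a, b):
    """Gauss-Jordan elimination over the rationals; returns None if singular."""
    n = len(a)
    rows = [[F(x) for x in row] + [F(y)] for row, y in zip(a, b)]
    for c in range(n):
        piv = next((r for r in range(c, n) if rows[r][c] != 0), None)
        if piv is None:
            return None
        rows[c], rows[piv] = rows[piv], rows[c]
        d = rows[c][c]
        rows[c] = [x / d for x in rows[c]]
        for r in range(n):
            if r != c:
                f = rows[r][c]
                rows[r] = [x - f * y for x, y in zip(rows[r], rows[c])]
    return [row[n] for row in rows]

def nash_for_supports(A, B, Sp, Sq):
    """Equilibrium with supports exactly Sp and Sq (equal sizes), or None."""
    m, n, k = len(A), len(A[0]), len(Sp)
    rhs = [0] * k + [1]
    # q makes all rows in Sp pay the same value u; p does the same for columns.
    sq = solve([[A[i][j] for j in Sq] + [-1] for i in Sp] + [[1] * k + [0]], rhs)
    sp = solve([[B[i][j] for i in Sp] + [-1] for j in Sq] + [[1] * k + [0]], rhs)
    if sp is None or sq is None:
        return None
    if not all(x > 0 for x in sp[:k] + sq[:k]):
        return None                       # probabilities on the support must be positive
    p, q = [F(0)] * m, [F(0)] * n
    for i, x in zip(Sp, sp):
        p[i] = x
    for j, x in zip(Sq, sq):
        q[j] = x
    u, v = sq[k], sp[k]
    # No action outside the supports may do better than u (resp. v).
    if any(sum(A[i][j] * q[j] for j in range(n)) > u for i in range(m)):
        return None
    if any(sum(p[i] * B[i][j] for i in range(m)) > v for j in range(n)):
        return None
    return tuple(p), tuple(q)

def all_nash(A, B):
    """Brute-force over all equal-size support pairs."""
    m, n = len(A), len(A[0])
    found = []
    for k in range(1, min(m, n) + 1):
        for Sp in combinations(range(m), k):
            for Sq in combinations(range(n), k):
                eq = nash_for_supports(A, B, Sp, Sq)
                if eq is not None:
                    found.append(eq)
    return found

U1 = [[0, 6], [2, 5], [3, 3]]
U2 = [[1, 0], [0, 2], [4, 3]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(len(all_nash(U1, U2)), len(all_nash(I3, I3)))
```

The second call uses identity payoff matrices for both players, the case mentioned earlier as having many equilibria: every nonempty set of actions, played uniformly by both players, is an equilibrium, which gives 2^n - 1 of them.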

20
Approximate Nash
  • Assume we are given a Nash equilibrium
  • strategies $(p, q)$
  • Show that there exists an epsilon-Nash equilibrium
  • with small support
  • Brute-force search
  • enumerate all small supports
  • each one requires only polynomial time
  • Proof!
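
The slide does not say which proof is meant. One standard route is a sampling argument: draw k actions independently from each equilibrium strategy and play the empirical distributions, which have support at most k. The sketch below only illustrates that construction and measures the resulting epsilon. The function names, the rock-paper-scissors game, the value of k and the random seed are all assumptions made for the example.

```python
import random
from collections import Counter
from fractions import Fraction as F

def epsilon(A, B, p, q):
    """Largest gain any player can obtain by deviating; 0 means exact Nash."""
    m, n = len(p), len(q)
    rows = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    cols = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    u1 = sum(p[i] * rows[i] for i in range(m))
    u2 = sum(cols[j] * q[j] for j in range(n))
    return max(max(rows) - u1, max(cols) - u2)

def sparsify(x, k, rng):
    """Empirical distribution of k independent draws from x (support size <= k)."""
    draws = rng.choices(range(len(x)), weights=[float(w) for w in x], k=k)
    counts = Counter(draws)
    return [F(counts[i], k) for i in range(len(x))]

# Hypothetical example: rock-paper-scissors, whose equilibrium is uniform play.
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
B = [[-a for a in row] for row in A]
uniform = [F(1, 3)] * 3
rng = random.Random(0)            # fixed seed so the run is repeatable
pk = sparsify(uniform, 30, rng)
qk = sparsify(uniform, 30, rng)
print(epsilon(A, B, uniform, uniform), float(epsilon(A, B, pk, qk)))
```

In a game this small the support cannot shrink much; the construction becomes interesting when the number of actions is large, since the number of samples needed grows only logarithmically with it.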

21
Nash Linear Complementary System
  • Find distributions $p$ and $q$ and values $u$ and $v$
  • $u \ge A q^T$
  • $v \ge pB$
  • $p (u - A q^T) = 0$
  • $q (v - pB)^T = 0$
  • $\sum_j p_j = 1$ and $p_j \ge 0$
  • $\sum_j q_j = 1$ and $q_j \ge 0$

22
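Lemke-Howson
  • Define labeling
  • For strategy $p$ (player 1)
  • Label $i$ if $p_i = 0$, where $i$ is an action of player 1
  • Label $j$ if action $j$ (player 2) is a best response to $p$
  • $b_j \cdot p \ge b_k \cdot p$ for every $k$, where $b_j$ is column $j$ of $B$
  • Similarly for player 2
  • Label $j$ if $q_j = 0$, where $j$ is an action of player 2
  • Label $i$ if action $i$ (player 1) is a best response to $q$
  • $a_i \cdot q \ge a_k \cdot q$ for every $k$, where $a_i$ is row $i$ of $A$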

23
Lemke-Howson algorithm
  • A strategy pair $(p, q)$ is a Nash equilibrium if and only if
  • each label $k$ is a label of $p$ or of $q$ (or both)
  • Proof!
  • Example

24
Lemke-Howson Example
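[Figure: labeled strategy diagrams G1 (player 1, actions a1, a2, a3) and G2 (player 2, actions a4, a5). G1 shows the points (1,0,0), (0,1,0), (0,0,1), (0,1/3,2/3) and (2/3,1/3,0); G2 shows the points (1,0), (2/3,1/3), (1/3,2/3) and (0,1). Regions and points carry the labels 1 to 5.]

U1
      a4  a5
a1     0   6
a2     2   5
a3     3   3

U2
      a4  a5
a1     1   0
a2     0   2
a3     4   3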
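
The labeling rule can be applied to this example directly. The sketch below is illustrative: it reads the two payoff tables of the slide as U1 (player 1) and U2 (player 2), which is an assumption about the garbled layout, takes the points listed in the figure as candidate strategies, and reports the pairs that carry all five labels. Function and variable names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

U1 = [[0, 6], [2, 5], [3, 3]]   # player 1's payoffs: rows a1..a3, columns a4, a5
U2 = [[1, 0], [0, 2], [4, 3]]   # player 2's payoffs, same layout
M, N = 3, 2                     # labels 1..3 are player 1's actions, 4..5 player 2's

def labels_p(p):
    """Labels of p: player 1's unplayed actions plus player 2's best responses to p."""
    pay = [sum(p[i] * U2[i][j] for i in range(M)) for j in range(N)]
    unplayed = {i + 1 for i in range(M) if p[i] == 0}
    best = {M + j + 1 for j in range(N) if pay[j] == max(pay)}
    return unplayed | best

def labels_q(q):
    """Labels of q: player 2's unplayed actions plus player 1's best responses to q."""
    pay = [sum(U1[i][j] * q[j] for j in range(N)) for i in range(M)]
    unplayed = {M + j + 1 for j in range(N) if q[j] == 0}
    best = {i + 1 for i in range(M) if pay[i] == max(pay)}
    return unplayed | best

G1 = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, F(1, 3), F(2, 3)), (F(2, 3), F(1, 3), 0)]
G2 = [(1, 0), (0, 1), (F(1, 3), F(2, 3)), (F(2, 3), F(1, 3))]
ALL = set(range(1, M + N + 1))

# A pair is completely labeled when every label appears on p or on q.
equilibria = [(p, q) for p, q in product(G1, G2) if labels_p(p) | labels_q(q) == ALL]
for p, q in equilibria:
    print(p, q)
```

Under this reading of the tables the example has three completely labeled pairs, one pure and two mixed.
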
25
Lemke-Howson Example
[Figure: the diagrams G1 and G2 and the payoff tables U1 and U2, repeated from the previous slide.]
26
Lemke-Howson: non-degenerate games
  • A two-player game is non-degenerate if
  • given a strategy ($p$ or $q$)
  • with support of size $k$
  • there are at most $k$ pure best responses
  • Many equivalent definitions
  • Theorem: for a non-degenerate game there is
  • a finite number of $p$ with $m$ labels
  • a finite number of $q$ with $n$ labels

27
Lemke-Howson: Graphs
  • Consider distributions where
  • player 1 has $m$ labels
  • player 2 has $n$ labels
  • Graph (per player)
  • join nodes that share all but one label
  • Product graph
  • nodes are pairs of nodes $(p, q)$
  • edges: if $(p, p')$ is an edge then $(p, q)$-$(p', q)$ is an edge

28
Lemke-Howson
  • Completely labeled node
  • a node that has all $m + n$ labels
  • Nash!
  • $k$-almost completely labeled node
  • has all labels except label $k$
  • $k$-almost completely labeled edge
  • all labels on both endpoints except label $k$
  • Artificial node (0,0)

29
Lemke-Howson: Paths
  • Any Nash equilibrium
  • is connected to exactly one vertex that is
  • $k$-almost completely labeled
  • Any $k$-almost completely labeled node
  • has two neighbors in the graph
  • This follows from non-degeneracy!

30
Lemke-Howson algorithm
  • start at (0,0)
  • drop label k
  • follow a path
  • end of the path is a Nash
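
The four steps above can be traced on the deck's 3x2 example. The following is a sketch under explicit assumptions rather than a general implementation: the vertices and their label sets are hard-coded (derived from the labeling rule with the example tables read as U1 and U2), the artificial node is represented by all-zero vectors, and the helper names are hypothetical. A general solver would pivot on the best-response polytopes instead of scanning a vertex table.

```python
from fractions import Fraction as F

t1, t2 = F(1, 3), F(2, 3)
# Vertex -> label set for each player's graph; all-zero vectors are the artificial node.
G1 = {(0, 0, 0): {1, 2, 3}, (1, 0, 0): {2, 3, 4}, (0, 1, 0): {1, 3, 5},
      (0, 0, 1): {1, 2, 4}, (0, t1, t2): {1, 4, 5}, (t2, t1, 0): {3, 4, 5}}
G2 = {(0, 0): {4, 5}, (1, 0): {3, 5}, (0, 1): {1, 4},
      (t1, t2): {1, 2}, (t2, t1): {2, 3}}

def move(graph, node, drop):
    """Leave `node` along the edge that gives up label `drop`; return (neighbor, new label)."""
    keep = graph[node] - {drop}
    (nxt,) = [v for v in graph if v != node and keep <= graph[v]]   # unique by non-degeneracy
    (picked,) = graph[nxt] - keep
    return nxt, picked

def lemke_howson(k):
    """Start at the artificial node, drop label k, follow the path to a Nash equilibrium."""
    p, q = (0, 0, 0), (0, 0)
    drop, in_g1 = k, k in G1[p]       # label k is held on exactly one side at the start
    while True:
        if in_g1:
            p, picked = move(G1, p, drop)
        else:
            q, picked = move(G2, q, drop)
        if picked == k:               # missing label recovered: completely labeled
            return p, q
        drop, in_g1 = picked, not in_g1   # the duplicate label is dropped on the other side

for k in range(1, 6):
    print(k, lemke_howson(k))
```

In this example, starting from the artificial node reaches only two of the game's three equilibria, whichever label is dropped; the remaining one lies on a path that does not pass through (0,0).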

31
Lemke-Howson Algorithm
[Figure: the labeled diagrams G1 and G2 of the Lemke-Howson example, with the same vertices and labels 1 to 5 as before.]
32
Lemke-Howson Algorithm
[Figure: the labeled diagrams G1 and G2 of the Lemke-Howson example, with the same vertices and labels 1 to 5 as before.]
33
Lemke-Howson Algorithm
[Figure: the labeled diagrams G1 and G2 of the Lemke-Howson example, with the same vertices and labels 1 to 5 as before.]
34
Lemke-Howson: Other Equilibria
[Figure: the labeled diagrams G1 and G2 of the Lemke-Howson example, with the same vertices and labels 1 to 5 as before.]
35
Lemke-Howson: Theorem
  • Consider a non-degenerate game
  • The graph consists of disjoint paths and cycles
  • Endpoints of paths are Nash equilibria
  • or (0,0)
  • The number of Nash equilibria is odd

36
Lemke-Howson: Sketch of Proof
  • Deleting a label $k$ means either
  • making the support larger, or
  • making the best-response (BR) set smaller
  • Smaller BR
  • solve for the smaller BR set
  • subtract from the distribution until one component is zero
  • Larger support
  • unique solution (since non-degenerate)