Computing Nash Equilibrium

1
Computing Nash Equilibrium
  • Presenter: Yishay Mansour

2
Outline
  • Problem Definition
  • Notation
  • Today: Zero-Sum games
  • Next week: General-Sum games
  • Multiple players

3
Model
  • Multiple players $N = \{1, \ldots, n\}$
  • Strategy set
  • Player $i$ has $m$ actions $S_i = \{s_{i1}, \ldots, s_{im}\}$
  • $S_i$ are the pure actions of player $i$
  • $S = \times_i S_i$
  • Payoff functions
  • Player $i$: $u_i : S \to \mathbb{R}$

4
Strategies
  • Pure strategies: actions
  • Mixed strategy
  • Player $i$: $p_i$, a distribution over $S_i$
  • Game: $P = \times_i p_i$
  • Product distribution
  • Modified distribution
  • $P_{-i}$: the distribution $P$ except for player $i$
  • $(q, P_{-i})$: player $i$ plays $q$, every other player $j$ plays $p_j$

5
Notations
  • Average Payoff
  • Player $i$: $u_i(P) = E_{s \sim P}[u_i(s)] = \sum_s P(s)\, u_i(s)$
  • $P(s) = \prod_i p_i(s_i)$
  • Nash Equilibrium
  • $P$ is a Nash Eq. if for every player $i$
  • and for any distribution $q_i$
  • $u_i(q_i, P_{-i}) \le u_i(P)$
  • Best Response

6
Notations
  • Alternative payoff
  • $x_{ij}(P) = u_i(s_{ij}, P_{-i}) = E_{s \sim P}[u_i(s) \mid s_i = s_{ij}]$
  • Difference in payoff
  • $z_{ij}(P) = x_{ij}(P) - u_i(P)$
  • Improvement in payoff
  • $g_{ij}(P) = \max\{z_{ij}(P), 0\}$
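
The three quantities on this slide can be computed directly from their definitions. The following is a minimal sketch, assuming a payoff function `u(s)` that returns one payoff per player and a product distribution `P` given as a list of per-player probability lists; the function names and the matching-pennies example are illustrative assumptions, not part of the talk.

```python
from itertools import product

def expected_payoff(u, P, i):
    """u_i(P): sum over pure profiles s of P(s) * u_i(s), with P(s) = prod_k p_k(s_k)."""
    total = 0.0
    for s in product(*(range(len(p)) for p in P)):
        prob = 1.0
        for k, action in enumerate(s):
            prob *= P[k][action]
        total += prob * u(s)[i]
    return total

def deviation_stats(u, P, i):
    """Return the lists (x_i, z_i, g_i), indexed by player i's pure actions."""
    base = expected_payoff(u, P, i)
    xs, zs, gs = [], [], []
    for j in range(len(P[i])):
        # Replace player i's mixed strategy by the pure action j.
        pure = [1.0 if a == j else 0.0 for a in range(len(P[i]))]
        Q = P[:i] + [pure] + P[i + 1:]
        x = expected_payoff(u, Q, i)
        xs.append(x)
        zs.append(x - base)
        gs.append(max(x - base, 0.0))
    return xs, zs, gs

# Hypothetical example: matching pennies (player 0 wins on a match).
def pennies(s):
    a = 1.0 if s[0] == s[1] else -1.0
    return (a, -a)

uniform = [[0.5, 0.5], [0.5, 0.5]]
both_heads = [[1.0, 0.0], [1.0, 0.0]]
print(deviation_stats(pennies, uniform, 0))     # no action improves at the uniform profile
print(deviation_stats(pennies, both_heads, 1))  # player 1 gains by switching to tails
```

The enumeration over all pure profiles is exponential in the number of players, so this only illustrates the definitions for small games.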

7
Fixed point Theorems
  • Intermediate Value Theorem
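  • domain $[a,b]$
  • function $f$ continuous
  • $f(a)\, f(b) < 0$
  • there exists $z$ such that $f(z) = 0$
  • Proof: $M^+ = \{x : f(x) \ge 0\}$, $M^- = \{x : f(x) \le 0\}$
  • both are non-empty closed sets covering $[a,b]$, so they intersect, and $f = 0$ on the intersection.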
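
The theorem is an existence statement, but under the same hypotheses a zero can also be located numerically. The sketch below uses bisection, which is a swapped-in technique and not the proof on the slide; the function name `bisect_root` and the tolerance are assumptions made for illustration.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Locate z in [a, b] with f(z) close to 0, assuming f is continuous
    and f(a) * f(b) < 0 (the hypothesis of the Intermediate Value Theorem)."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("need f(a) * f(b) < 0")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid)
        if fm == 0:
            return mid
        # Keep the half-interval on which the sign change survives.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return (a + b) / 2

root = bisect_root(lambda x: x * x - 2, 0.0, 2.0)
print(root)  # close to the square root of 2
```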

8
Brouwer's Fixed Point Theorem
  • $f : S \to S$ continuous, $S$ compact and convex
  • There exists $z \in S$ with $z = f(z)$
  • For $S = [0,1]$, this follows from the previous theorem

9
Kakutani Fixed Point Theorem
  • $L : S \to S$ is a correspondence (each $L(x)$ is a non-empty subset of $S$)
  • $L(x)$ is a convex set
  • $L$ is (upper) semi-continuous
  • $S$ compact and convex
  • There exists $z$ with $z \in L(z)$

10
Nash Equilibrium I
  • Best response correspondence
  • $L(P) = \{Q : q_i \in \arg\max_{q} u_i(q, P_{-i}) \text{ for every player } i\}$
  • $L$ is a correspondence, continuous
  • Nash is a fixed point of $L$
  • $P \in L(P)$
  • Kakutani's fixed point theorem

11
Nash Equilibrium II
  • Fixed point
  • $K(P)$ has $m \cdot n$ parameters
  • $K_{ij}(P) = (p_{ij} + g_{ij}(P)) / (1 + \sum_k g_{ik}(P))$
  • Nash is a fixed point of $K$
  • $P = K(P)$
  • Original proof of Nash
  • Continuous function on a compact, convex space
  • Brouwer's fixed point theorem
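
To make the map $K$ concrete, here is a minimal sketch for the two-player case, assuming payoff matrices `A` (row player) and `B` (column player) stored as nested lists. The helper names `gains` and `nash_map` are hypothetical. The example only checks that an equilibrium is left unchanged and that a non-equilibrium profile is moved; iterating $K$ is not claimed to converge.

```python
def gains(payoff, own, other):
    """g_j = max(x_j - u, 0), where x_j is the payoff of pure action j
    against `other` and u is the payoff of the mixed strategy `own`."""
    x = [sum(payoff[j][k] * other[k] for k in range(len(other)))
         for j in range(len(own))]
    u = sum(own[j] * x[j] for j in range(len(own)))
    return [max(xj - u, 0.0) for xj in x]

def nash_map(A, B, p, q):
    """One application of K to the profile (p, q) of the bimatrix game (A, B)."""
    B_t = [list(col) for col in zip(*B)]  # index B by the column player's action
    gp = gains(A, p, q)
    gq = gains(B_t, q, p)
    new_p = [(pj + gj) / (1 + sum(gp)) for pj, gj in zip(p, gp)]
    new_q = [(qk + gk) / (1 + sum(gq)) for qk, gk in zip(q, gq)]
    return new_p, new_q

# Hypothetical example: matching pennies.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(nash_map(A, B, [0.5, 0.5], [0.5, 0.5]))  # the equilibrium is a fixed point
print(nash_map(A, B, [1.0, 0.0], [1.0, 0.0]))  # the column player shifts weight to action 1
```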

12
Nash Equilibrium III
  • Non-linear complementarity problem (NCP)
  • Recall $z_{ij}(P)$
  • For every player $i$ and action $s_{ij}$
  • $z_{ij}(P)\, p_{ij} = 0$
  • $z_i(P)$ is orthogonal to $p_i$
  • Nash: $z(P) \le 0$
  • $z_{ij}(P) \le 0$

13
Nash Equilibrium IV
  • Stationary point problem
  • Recall $x$: the alternative payoff
  • Nash: $P^*$
  • For every $P$
  • $(P - P^*) \cdot x(P^*) \le 0$
  • $\sum_{i,j} (p_{ij} - p^*_{ij})\, x_{ij}(P^*) \le 0$

14
Nash Equilibrium V
  • Minimizing a function
  • Objective function
  • $V(P) = \sum_i \sum_j g_{ij}(P)^2$
  • $V(P)$ is a continuous, differentiable, non-negative function
  • Nash: $V(P) = 0$
  • Local Minima
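
As an illustration of the objective, the sketch below evaluates $V$ for a two-player game and searches a coarse grid for its zero. The grid search, the function name `V`, and the prisoner's-dilemma payoffs are assumptions chosen for the example; a grid is a stand-in for a real minimization method and does not address the local-minima issue noted above.

```python
def V(A, B, p, q):
    """Sum of squared improvements g_ij(P)^2 over both players' pure actions."""
    def squared_gains(payoff, own, other):
        x = [sum(payoff[j][k] * other[k] for k in range(len(other)))
             for j in range(len(own))]
        u = sum(own[j] * x[j] for j in range(len(own)))
        return sum(max(xj - u, 0.0) ** 2 for xj in x)
    B_t = [list(col) for col in zip(*B)]  # index B by the column player's action
    return squared_gains(A, p, q) + squared_gains(B_t, q, p)

# Hypothetical prisoner's dilemma; action 0 = cooperate, action 1 = defect.
A = [[3, 0], [5, 1]]
B = [[3, 5], [0, 1]]

# Grid over the probability each player puts on "cooperate".
grid = [k / 10 for k in range(11)]
best = min(((pc, qc) for pc in grid for qc in grid),
           key=lambda s: V(A, B, [s[0], 1 - s[0]], [s[1], 1 - s[1]]))
print(best)                        # probabilities of cooperating at the grid minimum
print(V(A, B, [1, 0], [1, 0]))     # positive away from equilibrium
```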

15
Nash Equilibrium VI
  • Semi-Algebraic set
  • distribution $P$: $\sum_j p_{ij} = 1$
  • difference in payoff
  • $z_{ij}(P) \le 0$
  • $z_{ij}(P) = x_{ij}(P) - u_i(P) \le 0$
  • Explicitly

16
Two player games
  • Payoff matrices (A,B)
  • m rows and n columns
  • player 1 has $m$ actions, player 2 has $n$ actions
  • strategies $p$ and $q$
  • Payoffs: $u_1(p,q) = p A q^T$ and $u_2(p,q) = p B q^T$
  • Zero sum game
  • $A = -B$

17
Linear Programming
  • Primal LP
  • $x \in \mathrm{SET}_{\mathrm{primal}}$ is feasible
  • maximize $\langle c, x \rangle$ subject to $x \in \mathrm{SET}_{\mathrm{primal}}$

18
Linear Programming
  • Dual LP
  • $y \in \mathrm{SET}_{\mathrm{dual}}$ is feasible
  • minimize $\langle b, y \rangle$ subject to $y \in \mathrm{SET}_{\mathrm{dual}}$

19
Duality Theorem
  • Weak duality: $\langle c, x \rangle \le \langle b, y \rangle$
  • for any feasible $x$ and $y$
  • proof!
  • Strong Duality
  • If there are feasible solutions then
  • $\langle c, x \rangle = \langle b, y \rangle$ for some feasible $x$ and $y$
  • sketch of proof.
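
The slides do not spell out the two feasible sets, so the following derivation of weak duality rests on an assumption: the standard form in which the primal feasible set is $\{x \ge 0 : Ax \le b\}$ and the dual feasible set is $\{y \ge 0 : A^{\top} y \ge c\}$. The matrix $A$ here is the LP constraint matrix, not a payoff matrix. Under that assumption, for any feasible $x$ and $y$:

```latex
\begin{align*}
\langle c, x \rangle
  &\le \langle A^{\top} y, x \rangle && (A^{\top} y \ge c,\ x \ge 0) \\
  &= \langle y, A x \rangle \\
  &\le \langle y, b \rangle = \langle b, y \rangle && (A x \le b,\ y \ge 0).
\end{align*}
```

Each inequality uses one feasibility condition together with the non-negativity of the other variable.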

20
Two players zero sum
  • Fix strategy $q$ of player 2,
  • player 1 best response
  • maximize $p (A q^T)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$
  • dual LP: minimize $u$ such that $u \ge A q^T$
  • Player 2 selects strategy $q$
  • minimize $u$ such that $u \ge A q^T$ and $\sum_i q_i = 1$ and $q_i \ge 0$
  • dual (strategy for player 1)
  • maximize $v$ such that $v \le p A$, $\sum_j p_j = 1$ and $p_j \ge 0$
  • There exists a unique value $v$.
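
For a game in which player 1 has only two actions, the last LP above (maximize $v$ such that $v \le pA$) can be solved without an LP solver. The guaranteed payoff $\min_j (pA)_j$ is a concave piecewise-linear function of the probability on row 1, so its maximum is attained at an endpoint or where two column lines cross. The sketch below covers only this special case and assumes integer (or `Fraction`) payoffs; the function name and the example matrix are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve_two_row_game(A):
    """Return (p, v): the probability p that the row player puts on row 0 in a
    maximin strategy, and the value v, for a 2 x n zero-sum game in which the
    row player maximizes. Entries of A are assumed to be ints or Fractions."""
    r1, r2 = A
    n = len(r1)

    def guaranteed(p):
        # Worst case over the column player's pure actions: min_j (pA)_j.
        return min(p * r1[j] + (1 - p) * r2[j] for j in range(n))

    candidates = [Fraction(0), Fraction(1)]
    for j, k in combinations(range(n), 2):
        # p at which columns j and k give the row player the same payoff.
        denom = (r1[j] - r2[j]) - (r1[k] - r2[k])
        if denom != 0:
            p = Fraction(r2[k] - r2[j], denom)
            if 0 <= p <= 1:
                candidates.append(p)
    best = max(candidates, key=guaranteed)
    return best, guaranteed(best)

# Hypothetical 2 x 2 example.
p, v = solve_two_row_game([[3, -1], [-2, 1]])
print(p, v)
```

For general $m \times n$ games the same problem would be handed to a linear programming solver.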

21
Example

22
Summary
  • Two players zero sum
  • linear programming
  • polynomial time
  • can have multiple Nash
  • unique value!
  • If $(p,q)$ and $(p',q')$ are Nash then
  • $(p,q')$ and $(p',q)$ are Nash

23
Online learning
  • Playing with unknown payoff matrix
  • Online algorithm
  • at each step selects an action.
  • can be stochastic or fractional
  • Observes all possible payoffs
  • Updates its parameters
  • Goal: achieve the value of the game
  • Payoff matrix of the game is defined at the end

24
Online learning - Algorithm
  • Notations
  • Opponent distribution $Q_t$
  • Our distribution $P_t$
  • Observed cost $M(i, Q_t)$
  • Should be $M Q_t$ (the vector of costs of all actions)
  • Goal: minimize cost
  • Algorithm: exponential weights
  • Action $i$ has weight proportional to $b^{L(i,t)}$
  • $L(i,t)$: loss of action $i$ until time $t$

25
Online algorithm: Notations
  • Formally
  • parameter $b$, $0 < b < 1$
  • $w_{t+1}(i) = w_t(i)\, b^{M(i,Q_t)}$
  • $Z_t = \sum_i w_t(i)$
  • $P_{t+1}(i) = w_{t+1}(i) / Z_{t+1}$
  • Number of total steps T is known
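
The update rule above can be sketched in a few lines. This is a minimal illustration, assuming costs in $[0,1]$, uniform initial weights, and an opponent whose mixed strategies are supplied up front as a list; the function name, the cost matrix, and the opponent sequence are hypothetical. The choice of $b$ follows the formula given on the optimization slide later in the talk.

```python
import math

def exponential_weights(M, opponent_plays, b):
    """Run w_{t+1}(i) = w_t(i) * b**M(i, Q_t) against the given sequence of
    opponent mixed strategies. Returns (average expected cost, final P)."""
    n = len(M)
    w = [1.0] * n  # assumption: start from uniform weights
    total = 0.0
    for Q in opponent_plays:
        Z = sum(w)
        P = [wi / Z for wi in w]
        # M(i, Q_t): expected cost of action i against the opponent's mix.
        cost = [sum(M[i][j] * Q[j] for j in range(len(Q))) for i in range(n)]
        total += sum(P[i] * cost[i] for i in range(n))
        w = [w[i] * b ** cost[i] for i in range(n)]
    Z = sum(w)
    return total / len(opponent_plays), [wi / Z for wi in w]

# Hypothetical cost matrix; the opponent always plays column 0.
M = [[0.0, 1.0], [1.0, 0.0]]
T = 100
b = 1 / (1 + math.sqrt(2 * math.log(len(M)) / T))
avg_cost, final_P = exponential_weights(M, [[1.0, 0.0]] * T, b)
print(avg_cost, final_P)
```

Against this fixed opponent, action 0 costs nothing, so the weights concentrate on it and the average cost falls well below the game's value of 1/2.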

26
Online algorithm: Theorem
  • Theorem
  • For any matrix $M$ with entries in $[0,1]$
  • Any sequence of dist. $Q_1, \ldots, Q_T$
  • The algorithm generates $P_1, \ldots, P_T$
  • $RE(A \| B) = E_{x \sim A}[\ln(A(x) / B(x))]$

27
Online algorithm: Analysis
  • Lemma
  • For any mixed strategy P
  • Corollary

28
Online Algorithm: Optimization
  • $b = 1 / (1 + \sqrt{2 (\ln n) / T})$
  • Average loss $\le v + O(\sqrt{(\ln n) / T})$

29
Two players General sum games
  • Input matrices (A,B)
  • No unique value
  • Computational issues: find some Nash, or all Nash
  • player 1 best response
  • Like for zero sum
  • Fix strategy $q$ of player 2
  • maximize $p (A q^T)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$
  • dual LP: minimize $u$ such that $u \ge A q^T$

30
Two players General sum games
  • Assume the supports of the strategies are known.
  • $p$ has support $S_p$ and $q$ has support $S_q$
  • Can formulate the Nash as an LP
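
Once the supports are fixed, the equilibrium conditions become linear: every action in a player's support must earn the same payoff. The sketch below handles only the smallest case, a 2 x 2 game in which both players are assumed to mix over both actions, and solves the two indifference equations directly instead of calling an LP solver. The function name and the example payoffs are hypothetical, and integer entries are assumed.

```python
from fractions import Fraction

def full_support_2x2(A, B):
    """Mixed Nash equilibrium of the 2 x 2 bimatrix game (A, B), assuming both
    supports are {0, 1}. Returns (p, q), or None if no such equilibrium exists."""
    # q must make player 1 indifferent between rows: (A q)_0 = (A q)_1.
    dq = (A[0][0] - A[1][0]) - (A[0][1] - A[1][1])
    # p must make player 2 indifferent between columns: (p B)_0 = (p B)_1.
    dp = (B[0][0] - B[0][1]) - (B[1][0] - B[1][1])
    if dq == 0 or dp == 0:
        return None
    q0 = Fraction(A[1][1] - A[0][1], dq)
    p0 = Fraction(B[1][1] - B[1][0], dp)
    # Full support requires strictly positive probabilities on both actions.
    if not (0 < p0 < 1 and 0 < q0 < 1):
        return None
    return (p0, 1 - p0), (q0, 1 - q0)

# Hypothetical example: a coordination game with asymmetric preferences.
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]
print(full_support_2x2(A, B))
```

With unknown supports one would have to try the candidate support pairs in turn, and the number of pairs grows exponentially with the number of actions.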

31
Approximate Nash

32
Lemke-Howson

33
Example