IMLayers - PowerPoint PPT Presentation

About This Presentation

Title:

IMLayers

Slides: 37

Provided by: gedonr

Transcript and Presenter's Notes

Title: IMLayers

1
Graphical Models for Game Theory
by Michael Kearns, Michael L. Littman,
Satinder Singh
Presented by Gedon Rosner
2
Agenda

Introduction
Motivation
Goals
Terminology
The Algorithm
Outline
Details
Proof
Back up

3
Introduction

This paper describes a graphical representation
of multi-player single-stage games.
Presents a polynomial algorithm that provides
approximations to well-defined problems that
would otherwise be computationally hard.
Presents an exponential algorithm with precise
results that will not be described.

4
Introduction cont.

Multi-player game theory uses tables to represent
the game's payoffs to each player for each joint
course of action.
Tables require immense computational resources
(space and time).
In certain cases graphical structures describe
the game succinctly and may be computationally less
expensive as well (depending on what is computed).

5
Motivation - Tabular Form

n agents with X possible actions each require
$n \cdot X^n$ space in matrix/tabular form.
When each agent has X = 2 possible actions {0,1}, the
possible results of the game are represented in n
matrices (one per player), where each matrix has
$2^n$ cells, one for every combination of actions
$(v_1, v_2, \ldots, v_n)$ that the players may perform.
The representation itself is exponential in
the number of players; computation seems at least
as hard.

6
Motivation - Graphical Form

Matrices vs. graphs: special graphs (e.g. trees)
are better suited to describing sparse matrices.
A complete graph (V,E) is isomorphic to a matrix.
Trees: graph traversal algorithms are better
for flow computation and for representing dependencies.
If a game has dependencies between sets of
localized players, and mutual influence is
propagated across the board, a tree structure is
inherent.

7
Motivation - Computation

Nash equilibria are sets of strategies in which
no player can unilaterally change his/her
strategy and gain benefit (a local maximum).
Example: radio stations, music format vs. rating benefit.

8
Nash equilibrium

The danger is that both stations will choose the
more profitable ?????? format and split the
market, getting only 25% each! Actually, there is
an even worse danger: each station might
assume that the other station will choose ??????,
and each choose MTV, splitting that market and
leaving each with a market share of just 15%.

9
Nash equil. motivation

The problem for the players is to figure out
which equilibrium will in fact occur.
Coordination problem: how can the players
coordinate their strategies to avoid the danger
of a mutually inferior outcome?
Thomas Schelling (1960): any bit of information
available to all participants in a coordination
game might enable them all to focus on the same
equilibrium and might solve the problem.
10
Goals

Provide a complete graphical representation for
multi-player one-stage games.
Define how/when the graphical structure may
provide a representation that is more succinct by
an order of magnitude (polynomial vs. exponential).
Provide a polynomial algorithm for computing
approximate Nash equilibria in one-stage games
represented by trees or sparse graphs.

11
Agenda

Introduction
Motivation
Goals
Terminology
The Algorithm
Outline
Details
Proof
Back up

12
Terminology

Games in Tabular form
An n-player, two-action game is defined by n
matrices $M_i$, each with n indices. The entry
$M_i(x_1, \ldots, x_n)$ specifies the payoff to player i when the
combined action of the n players is $x \in \{0,1\}^n$.
Each matrix has $2^n$ entries.
Pure and Mixed Strategies
The actions 0 and 1 are pure. A
mixed strategy is a probability $p_i \in [0,1]$ that the player
will play 0.

13
Terminology cont.

Expected Payoff for mixed strategy
Player i expects the payoff $M_i(p)$, which is
defined as $E_{x \sim p}[M_i(x)]$.
Here $x \sim p$ indicates that, independently for each j,
$x_j = 0$ with probability $p_j$,
$x_j = 1$ with probability $1 - p_j$.
Nash Theorem (1951)
For any game, there exists a Nash equilibrium
in the space of joint mixed strategies.
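
The expected-payoff definition above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_payoff`, the dictionary layout of `M_i` (pure profile tuple to payoff) and the matching-pennies payoffs are assumptions made here, not something taken from the slides.

```python
import itertools

def expected_payoff(M_i, p):
    """Expected payoff to player i under the joint mixed strategy p.

    M_i maps each pure profile x in {0,1}^n (a tuple) to a payoff.
    p[j] is the probability that player j plays action 0.
    """
    total = 0.0
    for x in itertools.product((0, 1), repeat=len(p)):
        prob = 1.0
        for p_j, x_j in zip(p, x):
            # x_j = 0 with probability p_j, x_j = 1 with probability 1 - p_j
            prob *= p_j if x_j == 0 else 1.0 - p_j
        total += prob * M_i[x]
    return total

# Matching pennies, payoff to the first player (illustrative game only).
M_0 = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 1.0}
print(expected_payoff(M_0, (0.5, 0.5)))
```

The loop visits all $2^n$ pure profiles, which is exactly the exponential cost of the tabular form.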

14
Terminology cont.

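Nash equilibrium
A joint mixed strategy of all the players, denoted
$p$, s.t. for any player i and for any other
strategy $p_i' \in [0,1]$: $M_i(p) \ge M_i(p[i : p_i'])$,
where $p[i : p_i']$ is $p$ with player i's entry replaced by $p_i'$.
This just means that no player can improve their
payoff by deviating unilaterally from the Nash
equilibrium.
$\epsilon$-Nash equilibrium
$M_i(p) + \epsilon \ge M_i(p[i : p_i'])$: a player can improve by at most $\epsilon$.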
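
As a hedged sketch, the $\epsilon$-Nash condition can be checked directly on a small tabular game. Because the expected payoff is linear in a player's own probability, testing the two pure deviations is enough. The names `is_epsilon_nash` and `expected_payoff` and the dictionary layout of the payoff tables are assumptions made for this example.

```python
import itertools

def expected_payoff(M_i, p):
    """Expected payoff when player j plays action 0 with probability p[j]."""
    total = 0.0
    for x in itertools.product((0, 1), repeat=len(p)):
        prob = 1.0
        for p_j, x_j in zip(p, x):
            prob *= p_j if x_j == 0 else 1.0 - p_j
        total += prob * M_i[x]
    return total

def is_epsilon_nash(M, p, eps=0.0):
    """True if no player gains more than eps by a unilateral deviation.

    M is a list of payoff tables (one per player), p a tuple of probabilities.
    Only pure deviations are tested, since payoffs are linear in p[i].
    """
    for i, M_i in enumerate(M):
        current = expected_payoff(M_i, p)
        for pure in (0.0, 1.0):
            deviated = p[:i] + (pure,) + p[i + 1:]
            if expected_payoff(M_i, deviated) > current + eps:
                return False
    return True
```

With `eps=0.0` the function tests for an exact Nash equilibrium.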

15
Agenda

Introduction
Motivation
Goals
Terminology
The Algorithm
Outline
Details
Proof
Back up

16
Graphical Game description

An n-player graphical game is a pair (G,M): G is an undirected
graph on n vertices and M is a set of n matrices $M_i$,
one for each player. Player i is represented by a
vertex labeled i.
$N_G(i) \subseteq \{1, \ldots, n\}$: the neighbors j of i in G, i.e. those with
an undirected edge $(i,j) \in E(G)$; by convention $i \in N_G(i)$.
If $|N_G(i)| \le k$ then $p \in [0,1]^k$ $\Rightarrow$ the expected
payoff is affected by the k neighbors only, and $M_i(p) = E_{x \sim p}[M_i(x)]$
needs $O(2^k) \ll O(2^n)$ entries.
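
To make the space saving concrete, here is a small arithmetic sketch comparing the two representations for two-action games. The function names and the values n = 30, k = 3 are chosen here purely for illustration.

```python
def tabular_entries(n):
    """n matrices with 2^n entries each (full tabular form)."""
    return n * 2 ** n

def graphical_entries(n, k):
    """n matrices with 2^k entries each (every neighborhood has size at most k)."""
    return n * 2 ** k

n, k = 30, 3
print(tabular_entries(n), graphical_entries(n, k))
```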

17
A Complete Description

Proof
There is a complete mapping between the graph
representation and the tabular representation. Every
game has a trivial representation as a graphical
game by choosing the complete graph.
In cases (like Bayesian networks) where a flow
or a local neighborhood may be defined and can be
bounded by $k \ll n$, exponential space saving occurs.

Attaining Goals 1 and 2
18
The Tree Algorithm - Abstract

The graphical game is (G,M), where G is a tree.
The computation yields an $\epsilon$-Nash equilibrium of the
game.
The algorithm traverses the tree in reverse
depth-first order, using a relaxation computation
in each step. Inductively, a set of Nash
equilibria is determined.
Finally, the tree is traversed in depth-first
order, where a single Nash equilibrium is
chosen.

19
Terminology of the game

V is the parent of U; R is the root of the tree.
Denote
$G^U$ - the sub-tree rooted at U, down to its
leaves.
$M^U_{V=v}$ - the subset of matrices of M
corresponding to the vertices in $G^U$, where the
matrix $M_U$ has its index for V fixed at V = v.
Theorem 1
A Nash equilibrium of $(G^U, M^U_{V=v})$ is an
equilibrium downstream from U given that V
plays v.

20
Traversing the Tree

Upstream traversal - each node $U_i$ will send V all
the Nash equilibria found on the corresponding
sub-graph $G^{U_i}$. V will then perform the
relaxation step of the algorithm, which determines
which equilibria should be passed on.
In each step of the traversal, every vertex
communicates a binary-valued table T which is
indexed by all the possible values of the mixed
strategies $v \in [0,1]$ of V and $u_i \in [0,1]$ of $U_i$
(!!!!).

21
The Relaxation

If U is a leaf then T(v,u) = 1 iff U = u is a best
response to V = v.
In general, $T(v, u_i) = 1$ iff there exists a Nash equilibrium
for $(G^{U_i}, M^{U_i}_{V=v})$ in which $U_i = u_i$.
V uses the k tables $T_i$ it received and computes
the table for its parent W: for each pair of
strategies (w,v), T(w,v) = 1 iff there exists a set
of strategies $u_1, \ldots, u_k$ (one per child) s.t. $T(v, u_i) = 1$
($\forall i \le k$) and V = v is a best response to $U_i = u_i$, W = w.
V remembers the list of $(u_1, \ldots, u_k)$ witnesses.

22
Abstract Algorithm Proof

Base case
Every leaf U sends its parent V the table
T(v,u) for every strategy pair (v,u).
General case
If T(w,v) = 1 for some pair (w,v) then there
exists a witness $(u_1, \ldots, u_k)$ s.t. $T(v, u_i) = 1$ for all
i.
Induction assumption + Theorem 1 $\Rightarrow$ there exists
a downstream equilibrium in which each $U_i = u_i$; since
V = v is a best response, there is an equilibrium
downstream from V.

23
Abstract Algorithm Proof cont.

If T(w,v) = 0, the same reasoning $\Rightarrow$ there is
no equilibrium in which W = w and V = v.
Nash's theorem assures that for every
game there exists at least one pair (w,v) s.t.
T(w,v) = 1.
R receives a table that, along with the witnesses,
represents all Nash equilibria.
R chooses a strategy non-deterministically and
informs its children; one of the equilibria is
determined at the end of the downstream flow.

24
Agenda

Introduction
Motivation
Goals
Terminology
The Algorithm
Outline
Details
Proof
Back up

25
Details

We claimed to find an approximation of a Nash
equilibrium in O(n), yet it looks like we've found
every Nash equilibrium?
The table T(w,v) is unrealistic: w and v are
continuous, not discrete.
There may be an exponential number of Nash
equilibria, so a deterministic algorithm that lists
them all can't be polynomial.

26
Quantization

Instead of continuous values: discrete values
of finite size, which are computationally easy.
Determine a grid $\{0, \tau, 2\tau, \ldots, 1\}$. Player i plays
$q_i \in \{0, \tau, 2\tau, \ldots, 1\}$ and $q \in \{0, \tau, 2\tau, \ldots, 1\}^n$.
Each table consists of binary values for $(1/\tau)^2$
entries.
Finding best responses is a simple search across
the table; they are now approximate best responses.
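
With the grid in place, the upstream and downstream passes become implementable. The following is a minimal sketch, not the authors' implementation: the names `tree_nash`, `is_approx_best`, `expected_payoff`, the parameter `tau_steps` (standing for $1/\tau$) and the data layout are all assumptions made here. `game[i]` is a pair `(nbrs, payoff)`, where `nbrs` is a tuple of the players in i's neighborhood (including i) and `payoff` maps pure action tuples, in that order, to payoffs. A table entry is stored only when it would be 1, and its value is the witness.

```python
import itertools

def expected_payoff(payoff, nbrs, strat):
    """Expected payoff when neighbor j plays action 0 with probability strat[j]."""
    total = 0.0
    for x in itertools.product((0, 1), repeat=len(nbrs)):
        prob = 1.0
        for j, x_j in zip(nbrs, x):
            prob *= strat[j] if x_j == 0 else 1.0 - strat[j]
        total += prob * payoff[x]
    return total

def is_approx_best(i, game, strat, eps):
    """True if player i gains at most eps by deviating (pure deviations suffice)."""
    nbrs, payoff = game[i]
    current = expected_payoff(payoff, nbrs, strat)
    best = max(expected_payoff(payoff, nbrs, {**strat, i: d}) for d in (0.0, 1.0))
    return current >= best - eps

def tree_nash(children, root, game, tau_steps, eps):
    """Return one grid strategy per player, or None if no table entry survives."""
    grid = [s / tau_steps for s in range(tau_steps + 1)]
    tables = {}  # tables[v][(w, pv)] = witness tuple of child strategies

    def upstream(v, parent):
        kids = children.get(v, [])
        for u in kids:
            upstream(u, v)
        table = {}
        for w in (grid if parent is not None else [None]):
            for pv in grid:
                for us in itertools.product(grid, repeat=len(kids)):
                    # every child must report T(v, u_i) = 1
                    if any((pv, pu) not in tables[u] for u, pu in zip(kids, us)):
                        continue
                    strat = {v: pv, **dict(zip(kids, us))}
                    if parent is not None:
                        strat[parent] = w
                    if is_approx_best(v, game, strat, eps):
                        table[(w, pv)] = us  # remember one witness
                        break
        tables[v] = table

    def downstream(v, w, pv, out):
        out[v] = pv
        for u, pu in zip(children.get(v, []), tables[v][(w, pv)]):
            downstream(u, pv, pu, out)

    upstream(root, None)
    if not tables[root]:
        return None
    _, pv = next(iter(tables[root]))
    out = {}
    downstream(root, None, pv, out)
    return out
```

The sketch keeps a single witness per table entry and picks the first surviving root entry, so it returns one approximate equilibrium rather than enumerating all of them.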

27
Agenda

Introduction
Motivation
Goals
Terminology
The Algorithm
Outline
Details
Proof
Back up

28
Determining $\tau$

$\tau$ needs to ensure that the loss suffered by any
player in moving to the grid is bounded.
$\tau$ needs to ensure the Nash equilibria are
approximately preserved $\Rightarrow$ existence of an $\epsilon$-Nash
equilibrium.
$\tau$ needs to scale with the size of the
representation to allow the algorithm to be
polynomial: $1/\tau = O(n^x)$ for some constant x.

29
Bound Loss of Players - 1

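Let $|N_G(i)| = k$; then, as defined,
$M_i(p) = E_{x \sim p}[M_i(x)] = \sum_{x \in \{0,1\}^k} M_i(x) \prod_{j=1}^{k} p_j^{1 - x_j} (1 - p_j)^{x_j}$

Remember $x_j \in \{0,1\}$, so the product is merely the
probability that x actually occurs.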

30
Lemma 1

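Let $p, q \in [0,1]^k$ satisfy $|p_i - q_i| \le \tau$ (i = 1..k).
Then, provided $\tau \le 4/(k \log^2(k/2))$,

$\left| \prod_{i=1}^{k} p_i - \prod_{i=1}^{k} q_i \right| \le (k \log k)\,\tau$

Proof by induction on k
Base case: k = 2, so k log k = 2.
$|p_1 p_2 - q_1 q_2| = |p_1 (p_2 - q_2) + q_2 (p_1 - q_1)| \le p_1 \tau + q_2 \tau \le 2\tau$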

31
Lemma 1 Proof cont.

Without loss of generality assume k is even.

The lemma holds if $-k\tau + ((k/2)\log(k/2)\,\tau)^2 \le 0$.
So $\tau \le 4/(k \log^2(k/2))$.
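
The condition above can be traced back to the inductive step, whose formulas did not survive in this transcript. What follows is a reconstruction under stated assumptions: logarithms are base 2, and the induction hypothesis is applied to each half of the product. The symbols $A$, $B$, $A'$, $B'$ and $d$ are introduced here for illustration and are not from the slides.

```latex
\begin{align*}
d &= \tfrac{k}{2}\log\tfrac{k}{2}\,\tau, \qquad
A = \prod_{i \le k/2} p_i,\; B = \prod_{i > k/2} p_i,\;
A' = \prod_{i \le k/2} q_i,\; B' = \prod_{i > k/2} q_i \\
|A - A'| &\le d, \qquad |B - B'| \le d
  && \text{(induction hypothesis on each half)} \\
AB &\le (A' + d)(B' + d) \le A'B' + 2d + d^2
  && \text{(since } A', B' \le 1\text{)} \\
2d + d^2 &= (k \log k)\,\tau - k\tau + \left(\tfrac{k}{2}\log\tfrac{k}{2}\,\tau\right)^2
  && \text{(since } \log\tfrac{k}{2} = \log k - 1\text{)}
\end{align*}
```

The same bound holds with the roles of p and q exchanged, so $|AB - A'B'| \le (k \log k)\,\tau$ whenever the last two terms sum to at most zero, which is the stated condition.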

32
Lemma 2

Let p, q, mixed strategies for (G,M), satisfy
$|p_i - q_i| \le \tau$ (i = 1..k). Then, provided
$\tau \le 4/(k \log^2(k/2))$,
$|M_i(p) - M_i(q)| \le 2^k (k \log k)\,\tau$
This lemma gives an upper bound on the loss
suffered by any player in moving to the nearest
joint strategy on the $\tau$-grid.

33
Lemma 2 - Proof
34
$\epsilon$-Nash equilibrium - 2

Lemma 3
Let p be a Nash equilibrium for (G,M) and let
q be the nearest mixed strategy on the grid. Then,
provided $\tau \le 4/(k \log^2(k/2))$, q is a
$2^{k+1}(k \log k)\,\tau$-Nash equilibrium for (G,M).
Proof
Let $r_i$ be the best response for player i to q.
We bound $M_i(q[i : r_i]) - M_i(q)$, which is the
benefit player i could attain from deviating from
q (here $q[i : r_i]$ is q with player i's entry replaced by $r_i$).

35
Lemma 3 - Proof

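By Lemma 2
$|M_i(q[i : r_i]) - M_i(p[i : r_i])| \le 2^k (k \log k)\,\tau$
$M_i(q) \ge M_i(p) - 2^k (k \log k)\,\tau$
Since p is an equilibrium
$M_i(p) \ge M_i(p[i : r_i]) \Rightarrow M_i(q[i : r_i]) \le M_i(p) + 2^k (k \log k)\,\tau$
Sum the inequalities, which results in
$M_i(q[i : r_i]) - M_i(q) \le 2^{k+1} (k \log k)\,\tau$ $\blacksquare$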

36
Polynomial scalability

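We now choose $\tau$ in accordance with the
constraints $2^{k+1}(k \log k)\,\tau \le \epsilon$ and
$\tau \le 4/(k \log^2(k/2))$.
So
$\tau \le \min\left( \epsilon / (2^{k+1}(k \log k)),\ 4/(k \log^2(k/2)) \right)$
Notice that $\tau$ is exponentially small in k (so $1/\tau$ is
exponential in k), with $k \ll n$. Each
step in the algorithm computes over $(1/\tau)^2$
entries, totaling $(1/\tau)^{2k}$; the complete run time
is polynomial in n.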
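
The choice of $\tau$ can be sketched numerically using the constants exactly as they appear on this slide. The function name `grid_resolution` is hypothetical, logarithms are assumed to be base 2, and the second constraint is treated as vacuous for k = 2, where $\log(k/2) = 0$.

```python
import math

def grid_resolution(k, eps):
    """Largest tau allowed by the two constraints on the slide (log base 2 assumed)."""
    if k < 2:
        raise ValueError("the bounds are only meaningful for k >= 2")
    # accuracy constraint: 2^(k+1) * (k log k) * tau <= eps
    accuracy = eps / (2 ** (k + 1) * k * math.log2(k))
    # lemma constraint: tau <= 4 / (k log^2(k/2)); vacuous when log(k/2) = 0
    lemma = math.inf if k == 2 else 4.0 / (k * math.log2(k / 2) ** 2)
    return min(accuracy, lemma)

tau = grid_resolution(k=3, eps=0.1)
entries_per_table = (1.0 / tau) ** 2
print(tau, entries_per_table)
```

For fixed k the resolution does not depend on n, which is why the total work stays polynomial in n.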

37
Graphical Models for Game Theory