Bayesian and non-Bayesian Learning in Games presentation

1
Bayesian and non-Bayesian Learning in Games
Ehud Lehrer
Tel Aviv University, School of Mathematical
Sciences
Including joint works with Ehud Kalai, Rann
Smorodinsky, and Eilon Solan.
2
Learning in Games
Informal definition of learning: a decentralized
process that converges (in some sense) to (some)
equilibrium.

Bayesian (rational) learning: Players do not start in equilibrium, but
they have some initial belief about other players' strategies,
they are rational: they maximize their payoffs,
they take into account future payoffs.
Convergence in the REPEATED GAME.

Non-Bayesian learning: Players
don't have any initial belief about other players' strategies,
don't maximize their payoffs,
don't take into account future payoffs.
Convergence (of the empirical frequency) to an equilibrium of the ONE-SHOT GAME.

3
Bayesian vs. non-Bayesian
Bayesian learning: Players do not start in
equilibrium, but they start with a grain of an
idea about what other players do. Nature of
results: players eventually play something close
to an equilibrium of the repeated game.
Non-Bayesian learning: Players have no idea about
other players' actions. They don't care to
maximize payoffs. Nature of results: the
statistics of past actions look like an
equilibrium of the one-shot game.
4
Important tools

Bayesian learning: merging of two probability
measures along a filtration (an increasing
sequence of $\sigma$-fields).

Non-Bayesian learning: approachability.
Both were initiated by Blackwell (the first with
Dubins).
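
As a rough illustration of merging, the sketch below lets a Bayesian learner predict coin tosses. It assumes a finite prior over i.i.d. coin models that puts positive weight on the true one, which makes the true measure absolutely continuous with respect to the learner's belief. The function name, the candidate grid and the true bias are all hypothetical choices made for this example, not taken from the talk.

```python
import random


def bayes_predictions(true_bias, candidate_biases, prior, n_tosses, seed=0):
    """Return the learner's predictive P(heads) after each observed toss.

    The learner holds a prior over finitely many i.i.d. coin models
    (an assumption of this sketch); the data come from `true_bias`.
    """
    rng = random.Random(seed)
    posterior = list(prior)
    predictions = []
    for _ in range(n_tosses):
        heads = rng.random() < true_bias
        # Bayes update: reweight each candidate by the likelihood of the toss.
        likelihoods = [b if heads else 1.0 - b for b in candidate_biases]
        posterior = [w * l for w, l in zip(posterior, likelihoods)]
        total = sum(posterior)
        posterior = [w / total for w in posterior]  # renormalise every step
        # One-step-ahead prediction under the current posterior.
        predictions.append(sum(w * b for w, b in zip(posterior, candidate_biases)))
    return predictions


candidates = [0.1 * k for k in range(1, 10)]        # 0.1, 0.2, ..., 0.9
uniform_prior = [1.0 / len(candidates)] * len(candidates)
preds = bayes_predictions(0.7, candidates, uniform_prior, n_tosses=2000)
gap = abs(preds[-1] - 0.7)  # distance between belief and truth at the end
```

The one-step predictions drift toward the true bias as data accumulate, which is the finite-horizon face of merging along the filtration generated by the tosses.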
5
Repeated Games with Vector Payoffs

I = finite set of actions of player 1.
J = finite set of actions of player 2.
M = $(m_{i,j})$: a payoff matrix. Entries are
vectors in $\mathbb{R}^d$.

A set F is approachable by player 1 if there is a
strategy $\sigma$ s.t.
There are sets which are neither approachable nor
excludable.
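
The condition that follows "s.t." is not reproduced on this slide. For reference, Blackwell's standard formulation is written out below; it is an assumption that the slide showed this version. The symbols $\bar{x}_n$ (average vector payoff up to stage n), $\tau$ (a strategy of player 2) and $d$ (Euclidean distance) are notation introduced here.

```latex
% Assumed notation: (i_t, j_t) is the pair of actions played at stage t.
\[
  \bar{x}_n = \frac{1}{n} \sum_{t=1}^{n} m_{i_t, j_t},
  \qquad
  \forall \varepsilon > 0 \;\; \exists N \;\; \forall \tau : \quad
  \mathbf{P}_{\sigma,\tau}\Bigl( \sup_{n \ge N} d(\bar{x}_n, F) \ge \varepsilon \Bigr) < \varepsilon .
\]
% F is excludable by player 2 if, for some \delta > 0, player 2 can approach
% the set \{ x : d(x, F) \ge \delta \}.
```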
6
Approachability

Applications (a sample):
No-regret (Hannan)
Repeated games with incomplete information (Aumann-Maschler)
Learning (Foster-Vohra, Hart-Mas-Colell)
Manipulation of calibration tests (Foster-Vohra, Lehrer, Smorodinsky-Sandroni-Vohra)
Generating generalized normal numbers (Lehrer)
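
To make the no-regret and learning items concrete, here is a minimal sketch of unconditional regret matching in the style of Hart and Mas-Colell, which can be read as Blackwell's strategy for approaching the non-positive orthant in the space of regret vectors. The game (matching pennies), the function names and the number of stages are assumptions chosen for illustration.

```python
import random

# Matching pennies: payoff to player 1; player 2 receives the negative.
PAYOFF = [[1.0, -1.0],
          [-1.0, 1.0]]


def regret_matching_action(cum_regret, rng):
    """Play each action with probability proportional to its positive regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    if sum(positive) == 0.0:
        return rng.randrange(len(cum_regret))  # no positive regret: play uniformly
    return rng.choices(range(len(cum_regret)), weights=positive)[0]


def play(n_stages, seed=0):
    """Both players use regret matching; return their empirical frequencies of action 0."""
    rng = random.Random(seed)
    regret1, regret2 = [0.0, 0.0], [0.0, 0.0]
    count1, count2 = 0, 0
    for _ in range(n_stages):
        i = regret_matching_action(regret1, rng)
        j = regret_matching_action(regret2, rng)
        count1 += (i == 0)
        count2 += (j == 0)
        for a in range(2):
            # Regret of a: what a would have earned minus what was earned.
            regret1[a] += PAYOFF[a][j] - PAYOFF[i][j]
            regret2[a] += PAYOFF[i][j] - PAYOFF[i][a]
    return count1 / n_stages, count2 / n_stages


freq1, freq2 = play(20000)
```

In a two-player zero-sum game such as this one, no-regret play by both sides pushes the empirical frequencies toward the minimax mix (1/2, 1/2), even though neither player forms beliefs or optimizes against them.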

7
Characterization of Approachable Sets
$m_{p,q} = \sum_{i,j} p_i \, m_{i,j} \, q_j$; $\quad H(p) = \{\, m_{p,q} : q \in \Delta(J) \,\}$.

A closed set $F \subseteq \mathbb{R}^d$ is a B-set if for every $x \notin F$
there is $y \in F$ that satisfies:
1. y is a closest point in F to x.
2. The hyperplane perpendicular to the line xy that passes through y separates x from H(p), for some $p \in \Delta(I)$.
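
The two B-set conditions can be checked numerically in small cases. The sketch below computes the expected payoff, the extreme points of H(p), and the separation test for a 2x2 game with payoffs in the plane. The matrix is the one from the Example slide of this talk; the function names and the test points are hypothetical.

```python
# M[i][j] is the vector payoff when player 1 plays i and player 2 plays j.
M = [[(1.0, -1.0), (-1.0, 1.0)],
     [(-1.0, -1.0), (1.0, 1.0)]]


def m(p, q):
    """Expected vector payoff: sum over i, j of p_i * m_{i,j} * q_j."""
    dim = len(M[0][0])
    return tuple(
        sum(p[i] * M[i][j][k] * q[j] for i in range(len(p)) for j in range(len(q)))
        for k in range(dim)
    )


def H_vertices(p):
    """Extreme points of H(p): the payoffs against each pure action of player 2."""
    n_cols = len(M[0])
    return [m(p, [1.0 if j == c else 0.0 for j in range(n_cols)])
            for c in range(n_cols)]


def separates(x, y, p, tol=1e-12):
    """Check that the hyperplane through y with normal x - y has x strictly
    on one side and H(p) on the other (closed) side."""
    normal = [xk - yk for xk, yk in zip(x, y)]

    def side(z):
        return sum(n * (zk - yk) for n, zk, yk in zip(normal, z, y))

    # H(p) is the convex hull of its vertices, so testing the vertices suffices.
    return side(x) > 0 and all(side(v) <= tol for v in H_vertices(p))


# Target assumed for this example: the vertical segment F = {0} x [-1, 1].
x, y = (0.3, 0.2), (0.0, 0.2)            # y is the closest point in F to x
ok_half = separates(x, y, [0.5, 0.5])    # the even mix puts H(p) on the hyperplane
ok_pure = separates(x, y, [1.0, 0.0])    # a pure action does not separate
```

With the even mix, H(p) is the segment from (0, -1) to (0, 1), so it lies inside the separating hyperplane itself; that is enough for condition 2.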

8
Characterization of Approachable Sets
Theorem [Blackwell, 1956]: every B-set F is
approachable.
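The approaching strategy plays at each stage n
the mixed action p such that H(p) and the current
average payoff x are separated by the hyperplane
that passes through a closest point y to x in F
and is perpendicular to the segment xy. With this
strategy the expected distance from the average
payoff to F is $O(1/\sqrt{n})$.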
Theorem [Blackwell, 1956]: every convex set is
either approachable or excludable.
Theorem [Hou, 1971; Spinat, 2002]: every minimal
(w.r.t. set inclusion) approachable set is a
B-set. Equivalently, a set is approachable if and
only if it contains a B-set.
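
A minimal simulation of the approaching strategy is sketched below, under stated assumptions: the payoff matrix is the one from the Example slide, the target is taken to be the convex set F = {0} x [-1, 1], the separating mix is found by a coarse grid search rather than by solving a zero-sum game exactly, and the opponent simply randomizes. None of these choices comes from the talk itself.

```python
import math
import random

M = [[(1.0, -1.0), (-1.0, 1.0)],
     [(-1.0, -1.0), (1.0, 1.0)]]


def project(x):
    """Closest point to x in the assumed target F = {0} x [-1, 1]."""
    return (0.0, min(1.0, max(-1.0, x[1])))


def expected_payoff(a, j):
    """Payoff when player 1 mixes (a, 1 - a) and player 2 plays column j."""
    return tuple(a * M[0][j][k] + (1.0 - a) * M[1][j][k] for k in range(2))


def blackwell_mix(x_bar, grid=41):
    """Grid-search the mix whose H(p) lies across the hyperplane from x_bar."""
    y = project(x_bar)
    normal = (x_bar[0] - y[0], x_bar[1] - y[1])
    if normal == (0.0, 0.0):
        return 0.5  # already in F: any mix will do
    best_a, best_worst = 0.5, math.inf
    for s in range(grid):
        a = s / (grid - 1)
        # Worst case over player 2's columns of <normal, m_{p,j} - y>.
        worst = max(
            sum(n * (v - yk) for n, v, yk in zip(normal, expected_payoff(a, j), y))
            for j in range(2)
        )
        if worst < best_worst:
            best_a, best_worst = a, worst
    return best_a


def simulate(n_stages, seed=0):
    """Return the distance from the final average payoff to F."""
    rng = random.Random(seed)
    total = [0.0, 0.0]
    x_bar = (0.0, 0.0)
    for n in range(1, n_stages + 1):
        a = blackwell_mix(x_bar)
        i = 0 if rng.random() < a else 1
        j = rng.randrange(2)  # hypothetical opponent: uniformly random column
        total[0] += M[i][j][0]
        total[1] += M[i][j][1]
        x_bar = (total[0] / n, total[1] / n)
    y = project(x_bar)
    return math.hypot(x_bar[0] - y[0], x_bar[1] - y[1])


final_distance = simulate(2000, seed=1)
```

For this particular matrix the search always lands on the even mix, because H((1/2, 1/2)) is contained in F; richer targets would make the chosen mix depend on the current average.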
9
Bounded Computational Capacity
A strategy is k-bounded-recall if it depends only
on the last k pairs of actions (and it does not
depend on previously played actions).

A (non-deterministic) automaton is given by:
A finite state space.
A probability distribution over states, according to which the initial state is chosen.
A set of inputs (say, the set $I \times J$ of action pairs).
A set of outputs (say, I, the set of player 1's actions).
A rule that assigns to each state a probability distribution over outputs.
A transition rule that assigns to every state and every input a probability distribution over the next state.
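
The two notions can be sketched as small data structures. The class names, the dictionary encoding of probability distributions and the copy-the-opponent example are hypothetical choices for illustration; only the listed ingredients (states, initial distribution, inputs, outputs, output rule, transition rule) come from the definition above.

```python
import random
from collections import deque


class BoundedRecallStrategy:
    """Chooses player 1's action from the last k action pairs only."""

    def __init__(self, k, rule):
        self.memory = deque(maxlen=k)  # older pairs are dropped automatically
        self.rule = rule               # maps a tuple of at most k pairs to an action

    def act(self):
        return self.rule(tuple(self.memory))

    def observe(self, pair):
        self.memory.append(pair)


class Automaton:
    """Probabilistic automaton; distributions are dicts {outcome: probability}."""

    def __init__(self, initial_dist, output_rule, transition_rule, seed=0):
        self.rng = random.Random(seed)
        self.output_rule = output_rule          # state -> distribution over outputs
        self.transition_rule = transition_rule  # (state, input) -> distribution over states
        self.state = self._draw(initial_dist)

    def _draw(self, dist):
        outcomes = list(dist)
        return self.rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]

    def act(self):
        return self._draw(self.output_rule[self.state])

    def observe(self, pair):
        self.state = self._draw(self.transition_rule[(self.state, pair)])


# Example: "repeat the opponent's last action", first as a 1-recall strategy ...
copycat = BoundedRecallStrategy(1, lambda mem: mem[-1][1] if mem else 0)

# ... and then as a two-state automaton whose state stores that action.
copycat_automaton = Automaton(
    initial_dist={0: 1.0},
    output_rule={0: {0: 1.0}, 1: {1: 1.0}},
    transition_rule={(s, (i, j)): {j: 1.0}
                     for s in (0, 1) for i in (0, 1) for j in (0, 1)},
)
```

In this encoding a k-bounded-recall strategy is an automaton whose state is the window of the last k pairs, while an automaton may carry information that is older than any fixed window.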

10
Approachability and Bounded Capacity
A set F is approachable with bounded-recall
strategies by player 1 if for every $\varepsilon > 0$, the set
$B(F, \varepsilon) = \{\, y : d(y, F) \le \varepsilon \,\}$ is approachable by
some bounded-recall strategy.
A set F is excludable against bounded-recall
strategies by player 2 if player 2 has a strategy
$\tau$ such that

Theorem (w/ Eilon Solan): The following
statements are equivalent.
1. The set F is approachable with bounded-recall strategies.
2. The set F is approachable with automata.
3. The set F contains a convex approachable set.
4. The set F is not excludable against bounded-recall strategies.

4 points to note
11
Main Theorem

Theorem: The following statements are equivalent
for closed sets.
1. The set F is approachable with bounded-recall strategies.
2. The set F is approachable with automata.
3. The set F contains a convex approachable set.
4. The set F is not excludable against bounded-recall strategies.

1. A set is approachable with automata if and only
if it is approachable with bounded-recall
strategies.
2. A complete characterization of sets that are
approachable with bounded-recall strategies.
3. A set which is not approachable with
bounded-recall strategies is excludable against
all bounded-recall strategies.
4. We do not know whether the same holds for
automata.
12
Example
(1,-1) (-1,1)
(-1,-1) (1,1)
On board
Good news: in applications, target sets are
convex (a point or a whole orthant, positive or
negative).
13
Approachability in Hilbert space

I = finite set of actions of player 1.
J = finite set of actions of player 2.
M = $(m_{i,j})$: a payoff matrix. Entries are points
in a Hilbert space (random variables).
All may change with the stage n.

A set F is approachable by player 1 if there is a
strategy $\sigma$ s.t.
Advantage: allows for infinitely many constraints.
Theorem: Suppose that at stage n, the average
payoff is $\bar{x}_n$ and y is a closest point in F to
$\bar{x}_n$. If the hyperplane perpendicular to the line
$\bar{x}_n y$ that passes through y separates between
$\bar{x}_n$ and H(p), for some $p \in \Delta(I)$, then F is
approachable.
14
Approachability and law of large numbers
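$X_1, X_2, \ldots$ are uncorrelated r.v.s with $E(X_n) = 0$.

$\langle X, Y \rangle = E(XY)$ is the dot product.
F is $\{0\}$.
At any stage n, $\langle \bar{x}_n, X_{n+1} \rangle = 0$.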
15
The game: each player has only one action. The
payoff at stage n is the random variable $X_n$. Thus, F is
approachable. This is the strong law of large
numbers. (When the payoffs are not uniformly
bounded, there is an additional boundedness
condition.)
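
The link to the law of large numbers can be made explicit with a short computation. It is a sketch under assumptions that are not all visible in the slide text: the stage payoffs $X_t$ are mean-zero and uncorrelated, $E(X_t^2) \le C$ for all t, the inner product is $\langle X, Y \rangle = E(XY)$, the target is $F = \{0\}$, and $\bar{x}_n$ denotes the average payoff.

```latex
% Orthogonality: the next payoff lies on the hyperplane through 0
% that is perpendicular to the current average.
\[
  \langle \bar{x}_n, X_{n+1} \rangle
  = \frac{1}{n} \sum_{t=1}^{n} E(X_t X_{n+1}) = 0 .
\]
% Distance to F = \{0\}: the cross terms vanish for the same reason.
\[
  d(\bar{x}_n, F)^2 = \| \bar{x}_n \|^2
  = \frac{1}{n^2} \sum_{s,t=1}^{n} E(X_s X_t)
  = \frac{1}{n^2} \sum_{t=1}^{n} E(X_t^2)
  \le \frac{C}{n} \longrightarrow 0 .
\]
```

This computation only gives convergence in the Hilbert-space norm; the almost-sure statement of the strong law needs the full approachability argument.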
16
Problem: Approachability in normed spaces.
17
Activeness function
H is (even over a finite probability space).
At stage n the characteristic function
indicates which coordinates are active and which
are not.
The average payoff at stage n is
Applications:
1. Repeated games with incomplete information: different games are active at different times.
2. Construction of normal numbers.
3. Manipulability of many calibration tests.
4. General no-regret theorem (against many replacing schemes).
5. Convergence to correlated equilibrium along many sequences.
18
Activeness function cont.
Theorem: Suppose that F is convex. Let $y_n$ be
the closest point in F to the average payoff at
time n, $\bar{x}_n$. If the hyperplane perpendicular
to the line $\bar{x}_n y_n$ that passes through $y_n$
separates between $\bar{x}_n$ and H(p), for some $p \in \Delta(I)$,
then F is approachable.
