Title: Bayesian and non-Bayesian Learning in Games
1Bayesian and non-Bayesian Learning in Games
Ehud Lehrer
Tel Aviv University, School of Mathematical
Sciences
Including joint works with Ehud Kalai, Rann
Smorodinsky, Eioln Solan.
2Learning in Games
Informal definition of learning a decentralized
process that converges (in some sense) to (some)
equilibrium.
- Bayesian (rational) learning Players do not
start in equilibrium, but - they have some initial belief about other
players strategies - they are rational they maximize their payoffs
- they take into account future payoffs
- Convergence in REPEATED
GAME
- Non-Bayesian learning Players
- dont have any initial belief about other
players strategies - dont maximize their payoffs
- dont take into account future payoffs
- Convergence (of the empirical frequency) to an
equilibrium of the
ONE-SHOT GAME
3Bayesian vs. non-Bayesian
Bayesian learning Players do not start in
equilibrium, but they start with a grain of
idea about what other players do. Nature of
results players eventually play something close
to an equilibrium of the repeated game.
Non-Bayesian learning Players have no idea about
other players actions. They dont care to
maximize payoffs. Nature of results the
statistics of past actions looks like an
equilibrium of the one-shot game.
4Important tools
- Bayesian learning merging of two probability
measures along a a filtration (an increasing
sequence of ? - fields)
Non-Bayesian learning approachability
Both were initiated by Blackwell (the first with
Dubins)
5Repeated Games with Vector Payoffs
- I finite set of actions of player 1.
- J finite set of actions of player 2.
- M (mi,j) a payoff matrix. Entries are
vectors in Rd.
A set F is approachable by player 1 if there is a
strategy ? s.t.
There are sets which are neither approachable nor
excludable.
6Approachability
- Applications (a sample)
- No-regret (Hannan)
- Repeated games with incomplete information
(Aumann-Maschler) - Learning (Foster-Vohra, Hart-Mas Colell)
- Manipulation of calibration tests (Foster-Vohra,
Lehrer, -
Smorodinsky-Sandroni-Vohra) - Generating generalized normal-number (Lehrer)
7Characterization of Approachable Sets
mp,q ?i,j pi mi,j qj H(p) mp,q , q ? ?(I)
- A closed set F ? Rd is a B-set if for every x ? F
there is y ? F that satisfies - y is a closest point in F to x.
- The hyperplane perpendicular to the line xy that
passes through y separates between x and H(p),
for some p ? ?(I).
8Characterization of Approachable Sets
Theorem Blackwell, 1956 every B-set F is
approachable.
The approaching strategy plays at each stage n
the mixed action p such that H(p) and x are
separated by the hyperplane connecting x and a
closest point to x in F. With this strategy
Theorem Blackwell, 1956 every convex set is
either approachable or excludable.
Theorem Hou, 1971 Spinat, 2002 every minimal
(w.r.t. set inclusion) approachable set is a
B-set. Or A set is approachable if and only if
it contains a B-set.
9Bounded Computational Capacity
A strategy is k-bounded-recall if it depends only
on the last k pairs of actions (and it does not
depend on previously played actions).
- A (non-deterministic) automaton is given by
- A finite state space.
- A probability distribution over states,
according to which the initial state is chosen. - A set of inputs (say, the set I J of action
pairs). - A set of outputs (say, I , the set of player 1s
actions). - A rule that assigns to each state a probability
distribution over outputs. - A transition rule that assigns to every state
and every input a probability distribution over
the next state.
10Approachability and Bounded Capacity
A set F is approachable with bounded-recall
strategies by player 1 if for every ?gt0, the set
B(F, ?) y d(y, F) ? ? is approachable by
some bounded-recall strategy.
A set F is excludable against bounded-recall
strategies by player 2 if player 2 has a strategy
? such that
- Theorem (w/ Eilon Solan) The following
statements are equivalent. - The set F is approachable with bounded-recall
strategies. - The set F is approachable with automata.
- The set F contains a convex approachable set.
- The set F is not excludable against
bounded-recall strategies.
4 points to note
11Main Theorem
- Theorem The following statements are equivalent
for closed sets. - The set F is approachable with bounded-recall
strategies. - The set F is approachable with automata.
- The set F contains a convex approachable set.
- The set F is not excludable against
bounded-recall strategies.
- A set is approachable with automata if and only
if it is approachable by bounded-recall
strategies.
2. A complete characterization of sets that are
approachable with bounded-recall strategies.
3. A set which is not approachable with
bounded-recall strategies, is excludable against
all bounded-recall strategies.
4. We do not know whether the same holds for
automata.
12Example
(1,-1) (-1,1)
(-1,-1) (1,1)
On board
Good news in applications target sets are
convex ( a point or a whole --
positive or negative -- orthant).
13Approachability in Hilbert space
- I finite set of actions of player 1.
- J finite set of actions of player 2.
- M (mi,j) a payoff matrix. Entries are points
in HS -
(random variables). - All may change with the stage n.
A set F is approachable by player 1 if there is a
strategy ? s.t.
Advantage allows for infinitely many constraints
Theorem Suppose that at stage n, the average
payoff is and y is a closest point in F to
. If the hyperplane perpendicular to the line
that passes through y separates between
and H(p), for some p ? ?(I), then F is
approachable.
14Approachability and law of large numbers
are uncorrelated r.v.s with .
is the dot product.
F is
At any stage n, .
15The game each players has only one action. The
payoff at stage n is . Thus, F is
approachable. This is the strong law of large
numbers. (When the payoffs are not uniformly
bounded, there is an additional boundedness
condition.)
16Problem Approachability in norm spaces.
17Activeness function
H is (even over a finite probability space).
At stage n the characteristic function
indicates which coordinates are active and which
are not.
The average payoff at stage n is
Applications 1. repeated games with incomplete
information different
games are active on different times
2. construction of normal numbers
3. manipulability of many calibration
tests 4. general no-regret
theorem (against many replacing schemes)
5. convergence to correlated eq. along
many sequences
18Activeness function cont.
Theorem suppose that F is convex. Let be
the closest point in F to the average payoff at
time n, . If the hyperplane perpendicular
to the line that passes through
separates between and H(p), for some p ?
?(I), then F is approachable.