When is it Best to Best-Reply? presentation

Title: When is it Best to Best-Reply?

1
When is it Best to Best-Reply?

Michael Schapira
(Yale University and UC Berkeley)
Joint work with Noam Nisan (Hebrew U),
Gregory Valiant (UC Berkeley)
and Aviv Zohar (Hebrew U)

2
Motivation: Internet Routing

Establish routes between Autonomous Systems
(ASes).
Currently handled by the Border Gateway
Protocol (BGP).

3
Internet Routing as a Game [Levin-S-Zohar]

Internet routing is a game!
players = ASes
players' types = preferences over routes
strategies = routes
BGP = Best-Response Dynamics
each AS constantly selects its best available
route to each destination
until a stable state (= PNE) is reached.

4
But

Challenge I: No synchronization of players'
actions
players can best-reply simultaneously.
players can best-reply based on outdated
information.
When is BGP guaranteed to converge to a stable
state?
Challenge II: Are players incentivized to follow
best-response dynamics?
Can an AS gain from not executing BGP?

5
Agenda

Mechanism design approach to best-response
dynamics (main focus of this talk).
Convergence of best-response dynamics in
asynchronous environments [Jaggard-S-Wright] (if
time permits).

6
Agenda

Part I: mechanism design approach to
best-response dynamics.
Part II: on the convergence of best-response
dynamics in asynchronous environments.

Incentive-Compatible Best-Response Dynamics
7
Main Questions

When is myopic best-replying also good in the
long run?
When can stable outcomes be implemented in
partial-information settings?
Can we reason about partial-information settings
via complete-information games?

8
Our Results Have Implications For

Internet protocols
Internet routing (BGP), congestion control (TCP)
Auctions
1st-price auctions, unit-demand auctions, GSP
Matching
correlated markets, interns and hospitals
Cost-sharing mechanisms
Moulin mechanisms, ...

9
1st Price Auction
Alice (v_a = 4), Bob (v_b = 2)
Each cell: winner, winner's utility

                     Alice's bid
Bob's bid     0      1      2      3      4      5
    0        B,2    A,3    A,2    A,1    A,0    A,-1
    1        B,1    B,1    A,2    A,1    A,0    A,-1
    2        B,0    B,0    B,0    A,1    A,0    A,-1
    3        B,-1   B,-1   B,-1   B,-1   A,0    A,-1
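
The payoff table on this slide can be regenerated from the two valuations. The sketch below is illustrative only: the function names are hypothetical, and the rule that ties go to Bob is inferred from the table (at bids 0 and 0 the winner is B), not stated on the slide.

```python
def first_price_cell(alice_bid, bob_bid, v_alice=4, v_bob=2):
    """Return (winner, winner's utility) for one pair of bids.

    Assumption read off the table: a tie is resolved in Bob's favor.
    """
    if alice_bid > bob_bid:
        return "A", v_alice - alice_bid
    return "B", v_bob - bob_bid


def payoff_table(alice_bids=range(6), bob_bids=range(4)):
    """Rows are Bob's bids, columns are Alice's bids."""
    return [[first_price_cell(a, b) for a in alice_bids] for b in bob_bids]


for bob_bid, row in enumerate(payoff_table()):
    print(bob_bid, " ".join(f"{winner}{utility}" for winner, utility in row))
```

Each printed row corresponds to one row of the table, with Bob's bid in the first column.
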
10
Ascending-Price English Auction
Alice (v_a = 4), Bob (v_b = 2)
Each cell: winner, winner's utility

                     Alice's bid
Bob's bid     0      1      2      3      4      5
    0        B,2    A,3    A,2    A,1    A,0    A,-1
    1        B,1    B,1    A,2    A,1    A,0    A,-1
    2        B,0    B,0    B,0    A,1    A,0    A,-1
    3        B,-1   B,-1   B,-1   B,-1   A,0    A,-1
11
Best-Reply (with some tie-breaking)
Alice (v_a = 4), Bob (v_b = 2)
Each cell: winner, winner's utility

                     Alice's bid
Bob's bid     0      1      2      3      4      5
    0        B,2    A,3    A,2    A,1    A,0    A,-1
    1        B,1    B,1    A,2    A,1    A,0    A,-1
    2        B,0    B,0    B,0    A,1    A,0    A,-1
    3        B,-1   B,-1   B,-1   B,-1   A,0    A,-1
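
To see how best-replying moves through this table, here is a minimal simulation. It rests on assumptions that are not on the slide: ties in the auction go to Bob (as the table suggests), a losing bidder gets utility 0, the players alternate starting with Alice from bids (0, 0), and a player who is indifferent between several best replies picks the highest bid among them. All names are hypothetical.

```python
VALUE = {"A": 4, "B": 2}                 # valuations from the slide
BIDS = {"A": range(6), "B": range(4)}    # bid ranges shown in the table


def utility(player, bid_a, bid_b):
    """Winner pays own bid; the loser gets 0. Ties go to Bob (assumed)."""
    winner = "A" if bid_a > bid_b else "B"
    if player != winner:
        return 0
    return VALUE[player] - (bid_a if player == "A" else bid_b)


def best_reply(player, bid_a, bid_b):
    """Best bid against the other player's current bid."""
    def u(bid):
        if player == "A":
            return utility("A", bid, bid_b)
        return utility("B", bid_a, bid)

    best = max(u(b) for b in BIDS[player])
    # Assumed tie-breaking rule: the highest bid among the best replies.
    return max(b for b in BIDS[player] if u(b) == best)


def run(bid_a=0, bid_b=0, max_rounds=20):
    """Alice, then Bob, best-reply in turn until nothing changes."""
    history = [(bid_a, bid_b)]
    for _ in range(max_rounds):
        new_a = best_reply("A", bid_a, bid_b)
        new_b = best_reply("B", new_a, bid_b)
        if (new_a, new_b) == (bid_a, bid_b):
            break
        bid_a, bid_b = new_a, new_b
        history.append((bid_a, bid_b))
    return history


print(run())
```

Under these assumptions the bids climb step by step, much like an ascending-price auction, and stop once Bob no longer gains from outbidding Alice.
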
12
The Model

n players
Player i has
action set A_i
(private) type t_i ∈ T_i
utility function u_i

13
The Model Dynamic Interaction

Discrete time steps. Initial action profile a0.
One player is activated in each time step
round-robin (cyclic) order
our results are independent of the order (and
also hold for asynchronous environments)
Players' strategies specify which actions are
selected in each time step.
can be history-dependent
Best-response dynamics: the strategy profile in
which each player constantly best-replies to
others' actions

14
Two Possible Payoff Models

Cumulative model
Payoffs are accumulated.
Alternative formulation with discount factors.
More natural, but sometimes too restrictive.

Payoff at the limit
If the dynamics converges to a stable outcome a,
each player's payoff is his utility from a.
If there is no convergence, the resulting payoff is low.
Weaker (actively discourages oscillations), but
has interesting applications.
15
Solution Concept

A strategy profile is an ex-post Nash
equilibrium if no player wishes to deviate from it,
regardless of the types
(this is essentially the best possible in a
distributed environment [Shneidman-Parkes])

16
Best-Replying is Not Always Best
The row player is of Type 1 or Type 2; each type yields
one of these games (payoffs: row player, column player):

2,1   0,0
3,0   1,3

3,1   1,0
2,0   0,3

dominance-solvable
potential game
unique and Pareto optimal PNE
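
A short check makes the point of this slide concrete. The sketch below encodes the two 2x2 matrices in the order they appear (GAME_1 is the first matrix, GAME_2 the second; which row-player type each belongs to is not recoverable from the transcript, so the names are only labels). It assumes the payoff-at-the-limit model and a column player who keeps best-replying.

```python
# game[row_action][col_action] = (row player's utility, column player's utility)
GAME_1 = [[(2, 1), (0, 0)],
          [(3, 0), (1, 3)]]
GAME_2 = [[(3, 1), (1, 0)],
          [(2, 0), (0, 3)]]


def pure_nash(game):
    """All pure Nash equilibria of a 2x2 game, as (row, col) pairs."""
    equilibria = []
    for r in range(2):
        for c in range(2):
            row_ok = all(game[r][c][0] >= game[r2][c][0] for r2 in range(2))
            col_ok = all(game[r][c][1] >= game[r][c2][1] for c2 in range(2))
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria


def column_best_reply(game, row_action):
    """Column player's best reply to a fixed row action."""
    return max(range(2), key=lambda c: game[row_action][c][1])


# Honest play in GAME_1 ends at its unique equilibrium.
(honest_row, honest_col), = pure_nash(GAME_1)
honest_utility = GAME_1[honest_row][honest_col][0]

# Deviation: the row player behaves as if the game were GAME_2 and sticks
# to GAME_2's equilibrium row; the column player best-replies to that row.
(fake_row, _), = pure_nash(GAME_2)
col_reply = column_best_reply(GAME_1, fake_row)
deviating_utility = GAME_1[fake_row][col_reply][0]

print(honest_utility, deviating_utility)
```

The row player's utility is higher when he stubbornly plays the other game's equilibrium action than when he best-replies, which is why best-response dynamics is not incentive-compatible here even though each game is dominance-solvable.
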

17
When is it Good to Best-Reply?

Goal: identify a class of games in which
best-response dynamics is an ex-post Nash
equilibrium.
i.e., best-replying is incentive-compatible
close in spirit to learning equilibria
[Brafman-Tennenholtz]
This class is going to be VERY restricted. Still,
it covers a variety of mechanisms/protocols.
Remark: The best replies are not always unique.
Thus, we must handle tie-breaking.

18
One Class of Games

Lemma: If each realization of types yields a game
in which each player has a single dominant
strategy, then best-response dynamics is an
ex-post Nash equilibrium.

19
On the Other Hand
The row player is of Type 1 or Type 2; each type yields
one of these games (payoffs: row player, column player):

9,0    1,1   1,3
10,0   0,2   0,1

10,0   0,1   0,3
9,0    1,2   1,1

no player has a dominant strategy (in both
realizations).
best-response dynamics is an ex-post Nash
equilibrium.
This game is blindly solvable.

20
Blindly-Dominated Strategy Sets
[Figure: example payoff matrix with a strategy set T marked]
9 7 8
8 6 5
1 2 3
0 4 3
21
Blindly-Solvable Games

Defn: A game is blindly-solvable if iterated
elimination of blindly-dominated strategy sets
results in a single strategy profile.
Observation: the surviving strategy profile is
the unique PNE of the game.
Defn: A partial-information game is
blindly-solvable if every realization of types
yields a blindly-solvable game.
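
The transcript does not preserve the formal definition of a blindly-dominated strategy set, so the sketch below works under an assumed, simplified reading: a strategy is eliminated when even its best-case utility (over the other player's surviving actions) is lower than the worst-case utility of some other surviving strategy. The function name and the two-player restriction are illustrative choices, not the authors' formulation.

```python
def blind_elimination(payoffs):
    """Iteratively eliminate strategies in a two-player game.

    payoffs[r][c] = (row player's utility, column player's utility).
    Assumed rule: drop strategy s if max utility of s is below the
    min utility of another surviving strategy of the same player.
    Returns the surviving (row actions, column actions).
    """
    rows = list(range(len(payoffs)))
    cols = list(range(len(payoffs[0])))
    changed = True
    while changed:
        changed = False
        for player, own, other in ((0, rows, cols), (1, cols, rows)):
            def u(s, t):
                # Utility of `player` for own action s against action t.
                return payoffs[s][t][0] if player == 0 else payoffs[t][s][1]

            best_case = {s: max(u(s, t) for t in other) for s in own}
            worst_case = {s: min(u(s, t) for t in other) for s in own}
            doomed = [s for s in own
                      if any(best_case[s] < worst_case[s2]
                             for s2 in own if s2 != s)]
            for s in doomed:
                own.remove(s)
            changed = changed or bool(doomed)
    return rows, cols


# The two 2x3 games from the "On the Other Hand" slide.
GAME_X = [[(9, 0), (1, 1), (1, 3)],
          [(10, 0), (0, 2), (0, 1)]]
GAME_Y = [[(10, 0), (0, 1), (0, 3)],
          [(9, 0), (1, 2), (1, 1)]]
# The first 2x2 game from the "Best-Replying is Not Always Best" slide.
GAME_Z = [[(2, 1), (0, 0)],
          [(3, 0), (1, 3)]]

print(blind_elimination(GAME_X), blind_elimination(GAME_Y), blind_elimination(GAME_Z))
```

Under this assumed rule the two 2x3 games each reduce to a single profile, while the dominance-solvable 2x2 game does not reduce at all, which matches the contrast the slides draw between the two examples.
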

22
1st-Price Auctions Revisited
Alice (v_a = 4), Bob (v_b = 2)
Each cell: winner, winner's utility

                     Alice's bid
Bob's bid     0      1      2      3      4      5
    0        B,2    A,3    A,2    A,1    A,0    A,-1
    1        B,1    B,1    A,2    A,1    A,0    A,-1
    2        B,0    B,0    B,0    A,1    A,0    A,-1
    3        B,-1   B,-1   B,-1   B,-1   A,0    A,-1
23
Merits of Blindly-Solvable Games

Thm: Let G be a blindly-solvable
partial-information game. Let a be the surviving
strategy profile. Then,
(1) Best-response dynamics converges to a within
n · Σ_j |A_j| time steps.
(2) In the payoff-at-the-limit model, best-response
dynamics is incentive-compatible, and even
collusion-proof, in ex-post Nash.

24
Intuition for Proof of (2)

The first action that was not eliminated in the
elimination sequence of G must belong to a
manipulator.
The manipulator's utility from that action is
lower than his utility from a.

25
Best-Response 1st-Price Auction Mechanism
Alice (v_a = 4), Bob (v_b = 2)
Each cell: winner, winner's utility

                     Alice's bid
Bob's bid     0      1      2      3      4      5
    0        B,2    A,3    A,2    A,1    A,0    A,-1
    1        B,1    B,1    A,2    A,1    A,0    A,-1
    2        B,0    B,0    B,0    A,1    A,0    A,-1
    3        B,-1   B,-1   B,-1   B,-1   A,0    A,-1
26
Implications for Internet Environments

Under realistic conditions, routing with the
Border Gateway Protocol is incentive-compatible
[Levin-S-Zohar].
Convergence and incentive compatibility results
for congestion control [Godfrey-S-Zohar-Shenker].

Mechanism design without money!
27
Beyond Blindly-Solvable Games
28
Generalized 2nd-Price Auction (GSP)

Used for selling ads on search engines.
k slots. Each slot j has its own click-through rate.
Users submit bids (per click) b_i.
They are ranked in order of bids.
If an ad is clicked, its owner pays the next-highest bid.
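
The GSP rule described on this slide can be sketched in a few lines. The bidder names and click-through rates in the example are made up for illustration, and charging 0 to a winner with no lower bid beneath it is an assumption (no reserve price is mentioned on the slide).

```python
def gsp(bids, ctrs):
    """Generalized second-price allocation.

    bids: {bidder: bid per click}
    ctrs: click-through rates of the slots, best slot first
    Returns a list of (bidder, slot index, price per click,
    expected payment) for the bidders who receive a slot.
    """
    ranking = sorted(bids, key=bids.get, reverse=True)  # highest bid first
    result = []
    for slot, bidder in enumerate(ranking[:len(ctrs)]):
        # Price per click is the next-highest bid (0 if there is none).
        if slot + 1 < len(ranking):
            price = bids[ranking[slot + 1]]
        else:
            price = 0
        result.append((bidder, slot, price, ctrs[slot] * price))
    return result


# Hypothetical example: three bidders, two slots.
example = gsp({"x": 5, "y": 3, "z": 1}, [0.5, 0.25])
print(example)
```

Bidders with equal bids keep their input order here; the slide does not say how such ties are resolved.
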

29
Generalized 2nd-Price Auction (GSP)

No dominant strategy equilibrium.
There exists an equilibrium with VCG payments
[Edelman-Ostrovsky-Schwarz, Varian].
Best-response dynamics (with tie-breaking)
converge with probability 1 to that equilibrium
[Cary et al.].
Thm (informal): Best-replying in GSP is
incentive-compatible.
Generalizes the English auction of
Edelman-Ostrovsky-Schwarz.

30
Auctions With Unit-Demand Bidders

n bidders. m items.
Each bidder i has value v_{i,j} for each item j,
and is interested in at most one item.
Thm: There exists a best-response mechanism for
auctions with unit-demand bidders that is
incentive-compatible in ex-post Nash and
converges to the VCG outcome.
Generalizes the English auction of
Demange-Gale-Sotomayor.
The proof of incentive-compatibility is simple.
The proof of convergence is more complex and is
based on Kuhn's Hungarian method.

31
A New Perspective on Some Centralized Mechanisms
32
Centralized vs. Distributed
Distributed: players reach a stable outcome in a
distributed manner.
Centralized: players declare types; the mechanism
simulates the interaction and outputs the outcome.
An ex-post equilibrium in the decentralized setting
yields a dominant-strategy implementation in the
centralized setting.
33
The Centralized Setting

Each player i has an action set Ai, a private
type ti, and a utility function ui (as before).
Wanted: a direct-revelation mechanism that
outputs a pure Nash equilibrium of the game
and incentivizes truthfulness.

34
Clearly, This is Not Always Possible
The row player is of Type 1 or Type 2; each type yields
one of these games (payoffs: row player, column player):

2,1   0,0
3,0   1,3

3,1   1,0
2,0   0,3
35
Corollary I

If every player has a single dominant strategy in
every realization, then the direct-revelation
mechanism is truthful.
Give each player his dominant strategy in the
reported realization.

36
Corollary II

If the game is blindly solvable, then the
direct-revelation mechanism is truthful.

The row player is of Type 1 or Type 2; each type yields
one of these games (payoffs: row player, column player):

9,0    1,1   1,3
10,0   0,2   0,1

10,0   0,1   0,3
9,0    1,2   1,1
37
More Blindly-Solvable Games

Cost-Sharing mechanisms
Moulin mechanisms [Moulin, Moulin-Shenker]
Acyclic mechanisms [Mehta-Roughgarden-Sundararajan]
Matching games
Interns and Hospitals
Correlated two sided markets

38
Directions for Future Research

Implementability of other kinds of equilibria
(mixed Nash, correlated, ...)?
Incentive-compatibility of other kinds of
dynamics (fictitious play, regret minimization)?

39
Agenda
Best-Response Dynamics Out of Sync

Part I: mechanism design approach to
best-response dynamics.
Part II: on the convergence of best-response
dynamics in asynchronous environments.

40
Synchronous Environments

In traditional best-response dynamics players are
activated one at a time.
More generally, the study of game dynamics
normally supposes synchrony.
What if the interaction between players is
asynchronous? (Internet, markets)

41
Illustration
[Figure: 2x2 game; payoffs are (row player, column player)]
              Column Player
Row Player    2,1    0,0
              0,0    1,2
42
Illustration
[Figure: 2x2 game; payoffs are (row player, column player)]
              Column Player
Row Player    2,1    0,0
              0,0    1,2
43
But
[Figure: 2x2 game; payoffs are (row player, column player)]
              Column Player
Row Player    2,1    0,0
              0,0    1,2
44
Model for Analyzing Asynchronous Best-Response
Dynamics

Infinite sequence of discrete time-steps
In each time-step a subset of the players
best-replies.
The schedule is chosen by an adversarial entity
(the Scheduler).
The schedule must be fair (no player is
indefinitely starved from best-replying).
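
The model above can be simulated directly. This is a sketch under assumptions: the 2x2 game is the one from the preceding illustration slides, laid out with the two equilibria (2,1) and (1,2) on the diagonal (the exact layout is reconstructed, not given in the transcript), and a schedule is represented as a list of sets of active players. All names are hypothetical.

```python
# GAME[row_action][col_action] = (row player's utility, column player's utility)
GAME = [[(2, 1), (0, 0)],
        [(0, 0), (1, 2)]]


def best_reply(player, profile):
    """Best action of `player` (0 = row, 1 = column) against `profile`."""
    r, c = profile
    if player == 0:
        return max(range(2), key=lambda a: GAME[a][c][0])
    return max(range(2), key=lambda a: GAME[r][a][1])


def run(schedule, start):
    """Apply a schedule: each entry is the set of players activated
    in that time step. All active players reply to the same old profile."""
    profile = start
    trace = [profile]
    for active in schedule:
        r = best_reply(0, profile) if 0 in active else profile[0]
        c = best_reply(1, profile) if 1 in active else profile[1]
        profile = (r, c)
        trace.append(profile)
    return trace


# Both players activated in every step: a fair schedule that never converges.
simultaneous = run([{0, 1}] * 4, start=(0, 1))
# One player at a time: the dynamics settle at a pure Nash equilibrium.
alternating = run([{0}, {1}, {0}, {1}], start=(0, 1))
print(simultaneous)
print(alternating)
```

With simultaneous activation the players keep jumping between the two miscoordinated profiles, whereas the one-at-a-time schedule reaches an equilibrium after a single move.
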

45
Result [Jaggard-S-Wright]

Thm: If two (or more) pure Nash equilibria exist
in a game, then asynchronous best-reply dynamics
can potentially oscillate.
Implications for Internet protocols, diffusion of
innovations in social networks, and more.

46
Directions for Future Research

Characterization of games for which asynchronous
best-response dynamics converge.
More generally, exploring game dynamics in the
realm that lies beyond synchronization
(fictitious play, regret minimization).

47
Thank You!
