Title: When is it Best to Best-Reply?
1When is it Best to Best-Reply?
- Michael Schapira
- (Yale University and UC Berkeley)
- Joint work with Noam Nisan (Hebrew U),
- Gregory Valiant (UC Berkeley)
- and Aviv Zohar (Hebrew U)
2Motivation: Internet Routing
- Establish routes between Autonomous Systems (ASes).
- Currently handled by the Border Gateway Protocol (BGP).
3Internet Routing as a Game [Levin-S-Zohar]
- Internet routing is a game!
- players: ASes
- players' types: preferences over routes
- strategies: routes
- BGP: Best-Response Dynamics
- each AS constantly selects its best available route to each destination
- until a stable state (a pure Nash equilibrium, PNE) is reached.
4But
- Challenge I: No synchronization of players' actions
- players can best-reply simultaneously.
- players can best-reply based on outdated information.
- When is BGP guaranteed to converge to a stable state?
- Challenge II: Are players incentivized to follow best-response dynamics?
- Can an AS gain from not executing BGP?
5Agenda
- Mechanism design approach to best-response dynamics (main focus of this talk).
- Convergence of best-response dynamics in asynchronous environments [Jaggard-S-Wright] (if time permits).
6Agenda
- Part I: mechanism design approach to best-response dynamics.
- Part II: on the convergence of best-response dynamics in asynchronous environments.
Incentive-Compatible Best-Response Dynamics
7Main Questions
- When is myopic best-replying also good in the long run?
- When can stable outcomes be implemented in partial-information settings?
- Can we reason about partial-information settings via complete-information games?
8Our Results Have Implications For
- Internet protocols
- Internet routing (BGP), congestion control (TCP)
- Auctions
- 1st-price auctions, unit-demand auctions, GSP
- Matching
- correlated markets, interns and hospitals
- Cost-sharing mechanisms
- Moulin mechanisms,
91st Price Auction
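Alice (va = 4), Bob (vb = 2). Each cell shows the winner and the winner's utility (e.g., A3: Alice wins with utility 3).

            Alice's bid
Bob's bid   0    1    2    3    4    5
0           B2   A3   A2   A1   A0   A-1
1           B1   B1   A2   A1   A0   A-1
2           B0   B0   B0   A1   A0   A-1
3           B-1  B-1  B-1  B-1  A0   A-1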
10Ascending-Price English Auction
Alice (va = 4), Bob (vb = 2)

            Alice's bid
Bob's bid   0    1    2    3    4    5
0           B2   A3   A2   A1   A0   A-1
1           B1   B1   A2   A1   A0   A-1
2           B0   B0   B0   A1   A0   A-1
3           B-1  B-1  B-1  B-1  A0   A-1
11Best-Reply (with some tie-breaking)
Alice (va = 4), Bob (vb = 2)

            Alice's bid
Bob's bid   0    1    2    3    4    5
0           B2   A3   A2   A1   A0   A-1
1           B1   B1   A2   A1   A0   A-1
2           B0   B0   B0   A1   A0   A-1
3           B-1  B-1  B-1  B-1  A0   A-1
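The table above can be regenerated and played out in a few lines. The sketch below is only an illustration under stated assumptions: ties between bids go to Bob (as the diagonal of the table suggests), a losing bidder gets utility 0, and ties between best replies are broken toward the highest bid. That last rule is an assumption made here, not necessarily the tie-breaking used in the talk. The function names are hypothetical.

```python
ALICE_VALUE, BOB_VALUE = 4, 2
ALICE_BIDS = range(6)   # Alice may bid 0..5
BOB_BIDS = range(4)     # Bob may bid 0..3


def outcome(alice_bid, bob_bid):
    """Return (winner, winner's utility); equal bids go to Bob, as in the table."""
    if bob_bid >= alice_bid:
        return "B", BOB_VALUE - bob_bid
    return "A", ALICE_VALUE - alice_bid


def utility(player, alice_bid, bob_bid):
    """The winner pays their own bid; the loser gets 0 (assumption)."""
    winner, u = outcome(alice_bid, bob_bid)
    return u if winner == player else 0


def best_reply(player, alice_bid, bob_bid):
    """Best reply to the other bid; ties broken toward the highest bid (assumption)."""
    if player == "A":
        options = {a: utility("A", a, bob_bid) for a in ALICE_BIDS}
    else:
        options = {b: utility("B", alice_bid, b) for b in BOB_BIDS}
    best = max(options.values())
    return max(bid for bid, u in options.items() if u == best)


def run_best_reply(alice_bid=0, bob_bid=0, max_rounds=50):
    """Alice and Bob alternate best replies until neither bid changes."""
    for _ in range(max_rounds):
        new_a = best_reply("A", alice_bid, bob_bid)
        new_b = best_reply("B", new_a, bob_bid)
        if (new_a, new_b) == (alice_bid, bob_bid):
            return alice_bid, bob_bid
        alice_bid, bob_bid = new_a, new_b
    return None  # no convergence within max_rounds


if __name__ == "__main__":
    for b in BOB_BIDS:
        print(b, " ".join(f"{w}{u}" for w, u in (outcome(a, b) for a in ALICE_BIDS)))
    print(run_best_reply())
```

Starting from bids (0, 0), this particular tie-breaking rule settles at Alice bidding 3 and Bob bidding 2 (cell A1). Breaking ties toward the lowest bid instead makes the same dynamics cycle, which is why the tie-breaking rule matters.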
12The Model
- n players
- Player i has
- action set Ai
- (private) type ti ∈ Ti
- utility function ui
13The Model Dynamic Interaction
- Discrete time steps. Initial action profile a0.
- One player is activated in each time step
- round-robin (cyclic) order
- our results are independent of the order (and also hold for asynchronous environments)
- Players' strategies specify which actions are selected in each time step.
- can be history-dependent
- Best-response dynamics: the strategy profile in which each player constantly best-replies to others' actions
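A minimal sketch of round-robin best-response dynamics in this model, for illustration only. The three-player game used to exercise it is hypothetical (player 0 always wants action 2, and every other player wants to match the player before it), and the rule "keep the current action if it is already a best reply, otherwise take the first best action listed" is an assumed tie-breaking rule.

```python
def best_reply(i, profile, actions, utility):
    """Player i's best reply to the others' current actions.

    Tie-breaking (assumption): keep the current action if it is already a
    best reply; otherwise take the first best action in actions[i].
    """
    def u(a):
        return utility(i, profile[:i] + (a,) + profile[i + 1:])

    best = max(actions[i], key=u)
    return profile[i] if u(profile[i]) == u(best) else best


def round_robin_dynamics(initial, actions, utility, max_rounds=100):
    """Activate players cyclically, one per time step, until a full round changes nothing."""
    profile = tuple(initial)
    for _ in range(max_rounds):
        changed = False
        for i in range(len(profile)):
            a = best_reply(i, profile, actions, utility)
            if a != profile[i]:
                profile = profile[:i] + (a,) + profile[i + 1:]
                changed = True
        if not changed:
            return profile  # every player is best-replying: a pure Nash equilibrium
    return None  # no convergence within max_rounds


def example_utility(i, profile):
    """Hypothetical game: player 0 wants action 2, player i > 0 wants to match player i - 1."""
    target = 2 if i == 0 else profile[i - 1]
    return -abs(profile[i] - target)


if __name__ == "__main__":
    actions = [range(3)] * 3
    print(round_robin_dynamics((0, 0, 0), actions, example_utility))
```

In the example, the players switch to action 2 one after another, and the dynamics stop at the profile (2, 2, 2).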
14Two Possible Payoff Models
- Cumulative model
- Payoffs are accumulated
- Alternative formulation with discount factors
- More natural, but sometimes too restrictive.
- Payoff at the limit
- If the dynamics converges to a stable outcome a, player i's payoff is ui(a).
- If no convergence, the resulting payoff is low.
- Weaker (actively discourages oscillations), but with interesting applications.
15Solution Concept
- A strategy profile is an ex-post Nash equilibrium if no player wishes to deviate from it, regardless of the types.
- (this is essentially the best possible in a distributed environment [Shneidman-Parkes])
16Best-Replying is Not Always Best
One payoff matrix per row-player type (Type 1 and Type 2); entries are (row payoff, column payoff):

    2,1  0,0        3,1  1,0
    3,0  1,3        2,0  0,3
- dominance-solvable
- potential game
- unique and Pareto optimal PNE
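To see why best-replying can fail to be best, consider the matrix with rows (2,1 0,0) and (3,0 1,3) from this slide. The sketch below is an illustration in the payoff-at-the-limit model: the column player always best-replies, and the row player either best-replies or stubbornly plays the top row forever. The function names and the fixed number of rounds are assumptions made for the example.

```python
# Entries are (row payoff, column payoff); rows and columns are indexed from 0.
GAME = [[(2, 1), (0, 0)],
        [(3, 0), (1, 3)]]


def row_best_reply(game, col):
    """Row that maximizes the row player's payoff against column col."""
    return max(range(len(game)), key=lambda r: game[r][col][0])


def col_best_reply(game, row):
    """Column that maximizes the column player's payoff against row."""
    return max(range(len(game[0])), key=lambda c: game[row][c][1])


def always_top(game, col):
    """A deviating row strategy: ignore the column player and always play row 0."""
    return 0


def final_payoffs(game, row_policy, start=(0, 0), rounds=20):
    """Alternate moves for a fixed number of rounds and return the last cell's payoffs.

    In this small game both policies settle after one or two rounds, so the
    last cell is the limit outcome.
    """
    row, col = start
    for _ in range(rounds):
        row = row_policy(game, col)
        col = col_best_reply(game, row)
    return game[row][col]


if __name__ == "__main__":
    print("row best-replies:  ", final_payoffs(GAME, row_best_reply))
    print("row always top row:", final_payoffs(GAME, always_top))
```

When both players best-reply, play settles at the unique PNE with payoffs (1, 3). When the row player insists on the top row, the column player's best reply moves to the left column and the row player ends up with 2 instead of 1, so best-replying is not the row player's best long-run strategy here.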
17When is it Good to Best-Reply?
- Goal: identify a class of games in which best-response dynamics is an ex-post Nash equilibrium.
- i.e., best-replying is incentive-compatible
- close in spirit to learning equilibria [Brafman-Tennenholtz]
- This class is going to be VERY restricted. Still, it covers a variety of mechanisms/protocols.
- Remark: The best replies are not always unique. Thus, we must handle tie-breaking.
18One Class of Games
- Lemma: If each realization of types yields a game
in which each player has a single dominant
strategy, then best-response dynamics is an
ex-post Nash equilibrium.
19On the Other Hand
One payoff matrix per row-player type (Type 1 and Type 2); entries are (row payoff, column payoff):

     9,0  1,1  1,3       10,0  0,1  0,3
    10,0  0,2  0,1        9,0  1,2  1,1

- no player has a dominant strategy (in both realizations).
- best-response dynamics is an ex-post Nash equilibrium.
- This game is blindly solvable.
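The first claim, and the uniqueness of the pure Nash equilibrium in each realization, can be checked mechanically. The sketch below is an illustration that takes "dominant" to mean weakly dominant (at least as good as every alternative against every opponent action); it does not test blind solvability itself. The function names are hypothetical.

```python
# The two payoff matrices from the slide; entries are (row payoff, column payoff).
REALIZATIONS = [
    [[(9, 0), (1, 1), (1, 3)],
     [(10, 0), (0, 2), (0, 1)]],
    [[(10, 0), (0, 1), (0, 3)],
     [(9, 0), (1, 2), (1, 1)]],
]


def dominant_rows(game):
    """Rows that are at least as good as every other row against every column."""
    rows, cols = range(len(game)), range(len(game[0]))
    return [r for r in rows
            if all(game[r][c][0] >= game[o][c][0] for c in cols for o in rows)]


def dominant_cols(game):
    """Columns that are at least as good as every other column against every row."""
    rows, cols = range(len(game)), range(len(game[0]))
    return [c for c in cols
            if all(game[r][c][1] >= game[r][o][1] for r in rows for o in cols)]


def pure_nash(game):
    """All cells where neither player gains by a unilateral deviation."""
    rows, cols = range(len(game)), range(len(game[0]))
    return [(r, c) for r in rows for c in cols
            if all(game[r][c][0] >= game[o][c][0] for o in rows)
            and all(game[r][c][1] >= game[r][o][1] for o in cols)]


if __name__ == "__main__":
    for game in REALIZATIONS:
        print(dominant_rows(game), dominant_cols(game), pure_nash(game))
```

Both realizations report no dominant row and no dominant column, and each has exactly one pure Nash equilibrium: cell (0, 2) in the first matrix and cell (1, 1) in the second.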
20Blindly-Dominated Strategy Sets
9 7 8
8 6 5
1 2 3
0 4 3
T
21Blindly-Solvable Games
- Defn: A game is blindly-solvable if iterated elimination of blindly-dominated strategy sets results in a single strategy profile.
- Observation: the surviving strategy profile is the unique PNE of the game.
- Defn: A partial-information game is blindly-solvable if every realization of types yields a blindly-solvable game.
221st-Price Auctions Revisited
Alice (va = 4), Bob (vb = 2)

            Alice's bid
Bob's bid   0    1    2    3    4    5
0           B2   A3   A2   A1   A0   A-1
1           B1   B1   A2   A1   A0   A-1
2           B0   B0   B0   A1   A0   A-1
3           B-1  B-1  B-1  B-1  A0   A-1
23Merits of Blindly-Solvable Games
- Thm: Let G be a blindly-solvable partial-information game. Let a be the surviving strategy profile. Then,
- (1) Best-response dynamics converges to a within n · Σj |Aj| time steps.
- (2) In the payoff-at-the-limit model, best-response dynamics is incentive-compatible, and even collusion-proof, in ex-post Nash.
24Intuition for Proof of (2)
- The first action that was not eliminated in the elimination sequence of G must belong to a manipulator.
- The manipulator's utility from that action is lower than his utility from a.
25Best-Response 1st-Price Auction Mechanism
Alice (va = 4), Bob (vb = 2)

            Alice's bid
Bob's bid   0    1    2    3    4    5
0           B2   A3   A2   A1   A0   A-1
1           B1   B1   A2   A1   A0   A-1
2           B0   B0   B0   A1   A0   A-1
3           B-1  B-1  B-1  B-1  A0   A-1
26Implications for Internet Environments
- Under realistic conditions, routing with the Border Gateway Protocol is incentive-compatible. [Levin-S-Zohar]
- Convergence and incentive-compatibility results for congestion control. [Godfrey-S-Zohar-Shenker]
Mechanism design without money!
27Beyond Blindly-Solvable Games
28Generalized 2nd-Price Auction (GSP)
- Used for selling ads on search engines.
- k slots. Each slot j has a click-through rate αj.
- Users submit bids (per click) bi.
- They are ranked in order of bids.
- If an ad is clicked, its owner pays the next-highest bid.
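The allocation and pricing rule above can be sketched as follows. This is an illustration only: the bidder names, bids, and click-through rates are made up, equal bids are ranked in input order, and the lowest-ranked winner pays 0 per click when no lower bid exists. All three are assumptions not taken from the slides.

```python
def gsp(bids, ctrs):
    """Generalized second-price allocation.

    bids: dict mapping bidder name to per-click bid.
    ctrs: click-through rates of the slots, best slot first.
    Returns {bidder: (slot index, slot click-through rate, price per click)}
    for the bidders that win a slot.
    """
    # Rank bidders by bid, highest first (sorted is stable, so equal bids keep input order).
    ranked = sorted(bids, key=bids.get, reverse=True)
    result = {}
    for slot, bidder in enumerate(ranked[:len(ctrs)]):
        # Each winner pays the next-highest bid per click, or 0 if there is none.
        next_bid = bids[ranked[slot + 1]] if slot + 1 < len(ranked) else 0
        result[bidder] = (slot, ctrs[slot], next_bid)
    return result


if __name__ == "__main__":
    bids = {"x": 5, "y": 3, "z": 1}   # hypothetical per-click bids
    ctrs = [0.10, 0.04]               # hypothetical click-through rates, k = 2 slots
    print(gsp(bids, ctrs))
```

With these numbers, bidder x takes the top slot and pays 3 per click, bidder y takes the second slot and pays 1 per click, and bidder z gets no slot.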
29Generalized 2nd-Price Auction (GSP)
- No dominant strategy equilibrium.
- There exists an equilibrium with VCG payments. [Edelman-Ostrovsky-Schwarz, Varian]
- Best-response dynamics (with tie-breaking) converge with probability 1 to that equilibrium. [Cary et al.]
- Thm (informal): Best-replying in GSP is incentive-compatible.
- Generalizes the English auction of [Edelman-Ostrovsky-Schwarz]
30Auctions With Unit-Demand Bidders
- n bidders. m items.
- Each bidder i has value vi,j for each item j, and is interested in at most one item.
- Thm: There exists a best-response mechanism for auctions with unit-demand bidders that is incentive-compatible in ex-post Nash and converges to the VCG outcome.
- Generalizes the English auction of [Demange-Gale-Sotomayor]
- The proof of incentive-compatibility is simple. The proof of convergence is more complex and is based on Kuhn's Hungarian method.
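For reference, the VCG outcome that the mechanism converges to can be computed directly on a tiny instance. The sketch below uses brute force over all assignments. It is not the best-response mechanism from the talk and not the Hungarian method. It assumes nonnegative values and at least as many items as bidders, so that giving every bidder exactly one item loses no welfare. The valuation numbers are hypothetical.

```python
from itertools import permutations


def max_welfare(values, bidders):
    """Best assignment of distinct items to the given bidders, by brute force.

    values[i][j] is bidder i's value for item j.
    Returns (total welfare, {bidder: item}).
    """
    num_items = len(values[0])
    best_welfare, best_assignment = -1, None
    for items in permutations(range(num_items), len(bidders)):
        welfare = sum(values[b][j] for b, j in zip(bidders, items))
        if welfare > best_welfare:
            best_welfare, best_assignment = welfare, dict(zip(bidders, items))
    return best_welfare, best_assignment


def vcg(values):
    """Welfare-maximizing assignment with VCG payments.

    Bidder i pays the welfare the others could get without i, minus the
    welfare the others actually get in the chosen assignment.
    """
    bidders = list(range(len(values)))
    welfare, assignment = max_welfare(values, bidders)
    payments = {}
    for i in bidders:
        others = [b for b in bidders if b != i]
        welfare_without_i, _ = max_welfare(values, others)
        others_welfare_with_i = welfare - values[i][assignment[i]]
        payments[i] = welfare_without_i - others_welfare_with_i
    return assignment, payments


if __name__ == "__main__":
    values = [[5, 2],   # hypothetical values of bidder 0 for items 0 and 1
              [3, 1]]   # hypothetical values of bidder 1
    print(vcg(values))
```

In this instance bidder 0 receives item 0 and pays 2, while bidder 1 receives item 1 and pays 0.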
31A New Perspective on Some Centralized Mechanisms
32Centralized vs. Distributed
Distributed: players reach a stable outcome in a distributed manner.
Centralized: players declare types; the mechanism simulates the interaction and outputs the outcome.
Ex-post equilibrium in the decentralized setting implies dominant-strategy implementation in the centralized setting.
33The Centralized Setting
- Each player i has an action set Ai, a private type ti, and a utility function ui (as before).
- Wanted: a direct revelation mechanism that outputs a pure Nash equilibrium of the game.
- and incentivizes truthfulness
34Clearly, This is Not Always Possible
One payoff matrix per row-player type (Type 1 and Type 2); entries are (row payoff, column payoff):

    2,1  0,0        3,1  1,0
    3,0  1,3        2,0  0,3
35Corollary I
- If every player has a single dominant strategy in every realization, then the direct-revelation mechanism is truthful.
- Give each player his dominant strategy in the reported realization.
36Corollary II
- If the game is blindly solvable, then the direct-revelation mechanism is truthful.

One payoff matrix per row-player type (Type 1 and Type 2); entries are (row payoff, column payoff):

     9,0  1,1  1,3       10,0  0,1  0,3
    10,0  0,2  0,1        9,0  1,2  1,1
37More Blindly-Solvable Games
- Cost-Sharing mechanisms
- Moulin mechanisms [Moulin, Moulin-Shenker]
- Acyclic mechanisms [Mehta-Roughgarden-Sundararajan]
- Matching games
- Interns and Hospitals
- Correlated two-sided markets
38Directions for Future Research
- Implementability of other kinds of equilibria (mixed Nash, correlated, ...)?
- Incentive-compatibility of other kinds of dynamics (fictitious play, regret minimization)?
39Agenda
Best-Response Dynamics Out of Sync
- Part I: mechanism design approach to best-response dynamics.
- Part II: on the convergence of best-response dynamics in asynchronous environments.
40Synchronous Environments
- In traditional best-response dynamics, players are activated one at a time.
- More generally, the study of game dynamics normally supposes synchrony.
- What if the interaction between players is asynchronous? (Internet, markets)
41Illustration
[2x2 payoff matrix, Row Player vs. Column Player; cell entries: 0,0 / 2,1 / 0,0 / 1,2]
42Illustration
[2x2 payoff matrix, Row Player vs. Column Player; cell entries: 0,0 / 2,1 / 0,0 / 1,2]
43But
[2x2 payoff matrix, Row Player vs. Column Player; cell entries: 0,0 / 2,1 / 0,0 / 1,2]
44Model for Analyzing Asynchronous Best-Response Dynamics
- Infinite sequence of discrete time-steps
- In each time-step a subset of the players best-replies.
- The schedule is chosen by an adversarial entity (the Scheduler).
- The schedule must be fair (no player is indefinitely starved from best-replying).
45Result [Jaggard-S-Wright]
- Thm: If two pure Nash equilibria (or more) exist in a game, then asynchronous best-reply dynamics can potentially oscillate.
- Implications for Internet protocols, diffusion of innovations in social networks, and more.
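A small simulation illustrates the oscillation. The 2x2 coordination game below is an assumed layout, with pure Nash equilibria at cells (0, 0) and (1, 1) carrying payoffs (2, 1) and (1, 2), and payoffs (0, 0) elsewhere. The schedules are hypothetical: one activates both players in every time step, which is fair, and one alternates them.

```python
# Assumed layout: keys are (row action, column action),
# values are (row payoff, column payoff).
PAYOFFS = {
    (0, 0): (2, 1), (0, 1): (0, 0),
    (1, 0): (0, 0), (1, 1): (1, 2),
}


def best_reply(player, state):
    """Best reply of player 0 (row) or player 1 (column) to the current state."""
    row, col = state
    if player == 0:
        return max((0, 1), key=lambda r: PAYOFFS[(r, col)][0])
    return max((0, 1), key=lambda c: PAYOFFS[(row, c)][1])


def step(state, activated):
    """All activated players best-reply at once to the same (old) state."""
    row, col = state
    new_row = best_reply(0, state) if 0 in activated else row
    new_col = best_reply(1, state) if 1 in activated else col
    return (new_row, new_col)


def run(state, schedule):
    """Return the list of states visited under the given schedule."""
    trajectory = [state]
    for activated in schedule:
        state = step(state, activated)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print("simultaneous:", run((0, 1), [{0, 1}] * 5))
    print("one at a time:", run((0, 1), [{0}, {1}] * 3))
```

From the miscoordinated state (0, 1), simultaneous best replies flip between (1, 0) and (0, 1) forever, even though no player is starved. Activating the players one at a time from the same state reaches the equilibrium (1, 1) after the first move.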
46Directions for Future Research
- Characterization of games for which asynchronous best-response dynamics converge.
- More generally, exploring game dynamics in the realm that lies beyond synchronization (fictitious play, regret minimization).
47Thank You!