When is it Best to Best-Reply?

Michael Schapira (Yale University and UC Berkeley). Joint work with Noam Nisan (Hebrew U), Gregory Valiant (UC Berkeley), and Aviv Zohar (Hebrew U).

Slides: 48
Provided by: Vijay90
Learn more at: http://www.cs.yale.edu



1
When is it Best to Best-Reply?
  • Michael Schapira
  • (Yale University and UC Berkeley)
  • Joint work with Noam Nisan (Hebrew U),
  • Gregory Valiant (UC Berkeley)
  • and Aviv Zohar (Hebrew U)

2
Motivation: Internet Routing
  • Establish routes between Autonomous Systems
    (ASes).
  • Currently handled by the Border Gateway
    Protocol (BGP).

3
Internet Routing as a Game [Levin-S-Zohar]
  • Internet routing is a game!
  • players: ASes
  • players' types: preferences over routes
  • strategies: routes
  • BGP = Best-Response Dynamics
  • each AS constantly selects its best available
    route to each destination
  • until a stable state (= PNE, a pure Nash equilibrium) is reached.

4
But
  • Challenge I: No synchronization of players'
    actions
  • players can best-reply simultaneously.
  • players can best-reply based on outdated
    information.
  • When is BGP guaranteed to converge to a stable
    state?
  • Challenge II: Are players incentivized to follow
    best-response dynamics?
  • Can an AS gain from not executing BGP?

5
Agenda
  • Mechanism design approach to best-response
    dynamics (main focus of this talk).
  • Convergence of best-response dynamics in
    asynchronous environments [Jaggard-S-Wright]
    (if time permits).

6
Agenda
  • Part I: mechanism design approach to
    best-response dynamics.
  • Part II: on the convergence of best-response
    dynamics in asynchronous environments.

Incentive-Compatible Best-Response Dynamics
7
Main Questions
  • When is myopic best-replying also good in the
    long run?
  • When can stable outcomes be implemented in
    partial-information settings?
  • Can we reason about partial-information settings
    via complete-information games?

8
Our Results Have Implications For
  • Internet protocols: Internet routing (BGP), congestion control (TCP)
  • Auctions: 1st-price auctions, unit-demand auctions, GSP
  • Matching: correlated markets, interns and hospitals
  • Cost-sharing mechanisms: Moulin mechanisms, ...

9
1st Price Auction
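Alice (v_a = 4) bids along the columns; Bob (v_b = 2) bids along the rows.
Each cell gives the winner (A or B) followed by the winner's utility.

Bob \ Alice    0     1     2     3     4     5
     0         B2    A3    A2    A1    A0    A-1
     1         B1    B1    A2    A1    A0    A-1
     2         B0    B0    B0    A1    A0    A-1
     3         B-1   B-1   B-1   B-1   A0    A-1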
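
The table can be regenerated from the two values alone. The sketch below assumes what the table entries suggest: the higher bid wins, ties go to Bob, and the winner's utility is value minus own bid. The function names are illustrative and not from the talk.

```python
V_ALICE, V_BOB = 4, 2
ALICE_BIDS = range(6)   # columns of the table
BOB_BIDS = range(4)     # rows of the table


def outcome(alice_bid, bob_bid):
    """Return (winner, winner's utility); ties go to Bob, as the diagonal of the table shows."""
    if alice_bid > bob_bid:
        return "A", V_ALICE - alice_bid
    return "B", V_BOB - bob_bid


def table():
    """One row per Bob bid, one column per Alice bid."""
    return [[outcome(a, b) for a in ALICE_BIDS] for b in BOB_BIDS]


for bob_bid, row in zip(BOB_BIDS, table()):
    print(bob_bid, " ".join(f"{winner}{utility}" for winner, utility in row))
```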
10
Ascending-Price English Auction
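Alice (v_a = 4) bids along the columns; Bob (v_b = 2) bids along the rows.
Each cell gives the winner (A or B) followed by the winner's utility.

Bob \ Alice    0     1     2     3     4     5
     0         B2    A3    A2    A1    A0    A-1
     1         B1    B1    A2    A1    A0    A-1
     2         B0    B0    B0    A1    A0    A-1
     3         B-1   B-1   B-1   B-1   A0    A-1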
11
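Best-Reply (with some tie-breaking)
Alice (v_a = 4) bids along the columns; Bob (v_b = 2) bids along the rows.
Each cell gives the winner (A or B) followed by the winner's utility.

Bob \ Alice    0     1     2     3     4     5
     0         B2    A3    A2    A1    A0    A-1
     1         B1    B1    A2    A1    A0    A-1
     2         B0    B0    B0    A1    A0    A-1
     3         B-1   B-1   B-1   B-1   A0    A-1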
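
A minimal sketch of best-reply dynamics on this auction, under assumptions the slide does not spell out: the loser's utility is 0, Alice and Bob alternate starting from bids (0, 0), and ties between best replies are broken toward the highest bid. That tie-breaking rule is only one possible choice, and all names are illustrative.

```python
VALUES = {"alice": 4, "bob": 2}
BIDS = {"alice": range(6), "bob": range(4)}


def utility(player, alice_bid, bob_bid):
    """Winner gets value minus own bid; the loser gets 0 (assumed). Ties go to Bob."""
    alice_wins = alice_bid > bob_bid
    if player == "alice":
        return VALUES["alice"] - alice_bid if alice_wins else 0
    return VALUES["bob"] - bob_bid if not alice_wins else 0


def best_reply(player, alice_bid, bob_bid):
    """Best reply to the other bid; among equally good bids, pick the highest (assumed rule)."""
    def payoff(bid):
        if player == "alice":
            return utility("alice", bid, bob_bid)
        return utility("bob", alice_bid, bid)

    top = max(payoff(bid) for bid in BIDS[player])
    return max(bid for bid in BIDS[player] if payoff(bid) == top)


def run(alice_bid=0, bob_bid=0, max_rounds=50):
    """Alternate best replies until neither bid changes; return the stable bids."""
    for _ in range(max_rounds):
        new_alice = best_reply("alice", alice_bid, bob_bid)
        new_bob = best_reply("bob", new_alice, bob_bid)
        if (new_alice, new_bob) == (alice_bid, bob_bid):
            return alice_bid, bob_bid
        alice_bid, bob_bid = new_alice, new_bob
    return None


print(run())  # (3, 2): Alice wins at price 3, matching cell A1 in the table
```

With a different tie-breaking rule (for example, Bob dropping to the lowest bid when indifferent) the same dynamics need not settle, which is why the slide title stresses tie-breaking.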
12
The Model
  • n players
  • Player i has:
  • action set A_i
  • (private) type t_i ∈ T_i
  • utility function u_i

13
The Model: Dynamic Interaction
  • Discrete time steps. Initial action profile a0.
  • One player is activated in each time step:
  • round-robin (cyclic) order
  • our results are independent of the order (and
    also hold for asynchronous environments)
  • Players' strategies specify which actions are
    selected in each time step.
  • can be history-dependent
  • Best-response dynamics: the strategy profile in
    which each player constantly best-replies to
    others' actions
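
The round-robin model above can be sketched generically. This is an illustrative implementation, not code from the talk: the function name, the stopping rule (stop once every player has been activated in a row without changing its action), and the rule of keeping the current action when it is already a best reply are all assumptions. The Prisoner's Dilemma payoffs are a standard textbook example added only to exercise the function.

```python
def best_response_dynamics(action_sets, utilities, start, max_steps=1000):
    """Round-robin best-response dynamics.

    action_sets[i]: list of player i's actions
    utilities[i]:   function mapping an action profile (tuple) to player i's payoff
    start:          initial action profile
    Returns (stable profile, number of time steps used), or None if no
    stable profile is reached within max_steps.
    """
    profile = list(start)
    n = len(action_sets)
    quiet = 0  # consecutive activations that changed nothing
    for step in range(max_steps):
        i = step % n  # round-robin activation

        def payoff(action):
            trial = profile[:i] + [action] + profile[i + 1:]
            return utilities[i](tuple(trial))

        best = max(action_sets[i], key=payoff)
        if payoff(best) > payoff(profile[i]):
            profile[i] = best
            quiet = 0
        else:
            quiet += 1  # current action is already a best reply: keep it
        if quiet == n:
            return tuple(profile), step + 1
    return None


# Illustrative game (not from the slides): a Prisoner's Dilemma.
PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
result = best_response_dynamics(
    action_sets=[["C", "D"], ["C", "D"]],
    utilities=[lambda p: PD[p][0], lambda p: PD[p][1]],
    start=("C", "C"),
)
print(result)
```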

14
Two Possible Payoff Models
  • Cumulative model
  • Payoffs are accumulated
  • Alternative formulation with discount factors
  • More natural, sometimes too restrictive
  • Payoff at the limit
  • If the dynamics converges to a stable outcome a, each player's payoff is its utility from a
  • If no convergence, the resulting payoff is low.
  • Weaker (actively discourages oscillations), interesting applications
15
Solution Concept
  • A strategy profile σ is an ex-post Nash
    equilibrium if no player wishes to deviate from σ
    regardless of the types
  • (this is essentially the best possible in a
    distributed environment [Shneidman-Parkes])

16
Best-Replying is Not Always Best
Row Player Type 1
Row Player Type 2
(one payoff matrix per type of the row player; each cell is row payoff, column payoff)

2,1  0,0
3,0  1,3

3,1  1,0
2,0  0,3
  • dominance-solvable
  • potential game
  • unique and Pareto optimal PNE
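
Why best-replying can be a bad idea here is easy to check numerically. The sketch below uses the first of the two matrices on the slide, for whichever type it belongs to, under the payoff-at-the-limit model. It compares a row player who best-replies with one who always plays the top row, which is what the row player's other type would do. Function names and the starting profile are illustrative assumptions.

```python
# (row payoff, column payoff); first matrix listed on the slide
GAME = [[(2, 1), (0, 0)],
        [(3, 0), (1, 3)]]


def row_best_reply(col):
    return max(range(2), key=lambda r: GAME[r][col][0])


def col_best_reply(row):
    return max(range(2), key=lambda c: GAME[row][c][1])


def limit_outcome(row_strategy, row=0, col=0, rounds=20):
    """Row moves according to row_strategy, then column best-replies; return the final payoffs."""
    for _ in range(rounds):
        row = row_strategy(col)
        col = col_best_reply(row)
    return GAME[row][col]


honest = limit_outcome(row_best_reply)      # row player best-replies
stubborn = limit_outcome(lambda col: 0)     # row player always plays the top row
print(honest, stubborn)  # (1, 3) (2, 1)
```

The column player cannot tell the stubborn row player from one whose type makes the top row dominant, so it settles on the left column and the deviating row player ends up with 2 instead of 1.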

17
When is it Good to Best-Reply?
  • Goal: identify a class of games in which
    best-response dynamics is an ex-post Nash
    equilibrium.
  • i.e., best-replying is incentive-compatible
  • close in spirit to learning equilibria
    [Brafman-Tennenholtz]
  • This class is going to be VERY restricted. Still,
    it covers a variety of mechanisms/protocols.
  • Remark: The best replies are not always unique.
    Thus, we must handle tie-breaking.

18
One Class of Games
  • Lemma: If each realization of types yields a game
    in which each player has a single dominant
    strategy, then best-response dynamics is an
    ex-post Nash equilibrium.

19
On the Other Hand
Row Player Type 2
Row Player Type 1
(one payoff matrix per type of the row player; each cell is row payoff, column payoff)

9,0   1,1  1,3
10,0  0,2  0,1

10,0  0,1  0,3
9,0   1,2  1,1
  • no player has a dominant strategy (in both
    realizations).
  • best-response dynamics is an ex-post Nash
    equilibrium.
  • This game is blindly solvable.
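
The first two claims can be checked mechanically for both matrices on the slide. The sketch below is illustrative (names are assumptions). Because best replies in these matrices are always unique, a player has a dominant strategy exactly when its best reply is the same against every opposing action. The last function alternates best replies from every starting profile, which checks convergence but not incentive compatibility.

```python
# (row payoff, column payoff); the two matrices in the order listed on the slide
MATRICES = {
    "first": [[(9, 0), (1, 1), (1, 3)], [(10, 0), (0, 2), (0, 1)]],
    "second": [[(10, 0), (0, 1), (0, 3)], [(9, 0), (1, 2), (1, 1)]],
}


def row_reply(game, col):
    return max(range(2), key=lambda r: game[r][col][0])


def col_reply(game, row):
    return max(range(3), key=lambda c: game[row][c][1])


def has_dominant_row(game):
    return len({row_reply(game, c) for c in range(3)}) == 1


def has_dominant_col(game):
    return len({col_reply(game, r) for r in range(2)}) == 1


def limit(game, row, col, rounds=10):
    """Alternate best replies (row first) and return the profile reached."""
    for _ in range(rounds):
        row = row_reply(game, col)
        col = col_reply(game, row)
    return row, col


for name, game in MATRICES.items():
    limits = {limit(game, r, c) for r in range(2) for c in range(3)}
    print(name, has_dominant_row(game), has_dominant_col(game), limits)
```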

20
Blindly-Dominated Strategy Sets
[Figure: a payoff matrix in which a set of strategies T is marked]
9 7 8
8 6 5
1 2 3
0 4 3
21
Blindly-Solvable Games
  • Defn: A game is blindly-solvable if iterated
    elimination of blindly-dominated strategy sets
    results in a single strategy profile.
  • Observation: the surviving strategy profile is
    the unique PNE of the game.
  • Defn: A partial-information game is
    blindly-solvable if every realization of types
    yields a blindly-solvable game.

22
1st-Price Auctions Revisited
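Alice (v_a = 4) bids along the columns; Bob (v_b = 2) bids along the rows.
Each cell gives the winner (A or B) followed by the winner's utility.

Bob \ Alice    0     1     2     3     4     5
     0         B2    A3    A2    A1    A0    A-1
     1         B1    B1    A2    A1    A0    A-1
     2         B0    B0    B0    A1    A0    A-1
     3         B-1   B-1   B-1   B-1   A0    A-1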
23
Merits of Blindly-Solvable Games
  • Thm: Let G be a blindly-solvable
    partial-information game. Let a be the surviving
    strategy profile. Then,
  • (1) Best-response dynamics converges to a within
    n · Σ_j |A_j| time steps.
  • (2) In the payoff-at-the-limit model, best-response
    dynamics is incentive-compatible, and even
    collusion-proof, in ex-post Nash.

24
Intuition for Proof of (2)
  • The first action that was not eliminated in the
    elimination sequence of G must belong to a
    manipulator.
  • The manipulator's utility from that action is
    lower than his utility from a.

25
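Best-Response 1st-Price Auction Mechanism
Alice (v_a = 4) bids along the columns; Bob (v_b = 2) bids along the rows.
Each cell gives the winner (A or B) followed by the winner's utility.

Bob \ Alice    0     1     2     3     4     5
     0         B2    A3    A2    A1    A0    A-1
     1         B1    B1    A2    A1    A0    A-1
     2         B0    B0    B0    A1    A0    A-1
     3         B-1   B-1   B-1   B-1   A0    A-1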
26
Implications for Internet Environments
  • Under realistic conditions, routing with the
    Border Gateway Protocol is incentive-compatible.
    [Levin-S-Zohar]
  • Convergence and incentive compatibility results
    for congestion control. [Godfrey-S-Zohar-Shenker]

Mechanism design without money!
27
Beyond Blindly-Solvable Games
28
Generalized 2nd-Price Auction (GSP)
  • Used for selling ads on search engines.
  • k slots. Each slot j has a click-through rate α_j.
  • Users submit bids (per click) b_i.
  • They are ranked in order of bids.
  • If an ad is clicked, its owner pays the next-highest bid.
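
The allocation and pricing rule in the bullets above can be sketched in a few lines. This is an illustrative sketch: the function name, the data layout, the choice that the lowest-ranked winner pays 0 when no lower bid exists, and breaking equal bids by input order are all assumptions not stated on the slide.

```python
def gsp(bids, click_through_rates):
    """Generalized second-price allocation.

    bids:                dict mapping bidder -> bid per click
    click_through_rates: list of slot click-through rates, best slot first
    Returns a list of (bidder, slot click-through rate, price per click),
    one entry per filled slot.
    """
    ranking = sorted(bids, key=bids.get, reverse=True)  # highest bid first
    result = []
    for slot, bidder in enumerate(ranking[:len(click_through_rates)]):
        # Price per click is the next-highest bid; 0 if nobody bid lower (assumed).
        next_bid = bids[ranking[slot + 1]] if slot + 1 < len(ranking) else 0
        result.append((bidder, click_through_rates[slot], next_bid))
    return result


print(gsp({"x": 5, "y": 3, "z": 1}, [0.2, 0.1]))
# [('x', 0.2, 3), ('y', 0.1, 1)]
```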

29
Generalized 2nd-Price Auction (GSP)
  • No dominant strategy equilibrium.
  • There exists an equilibrium with VCG payments.
    [Edelman-Ostrovsky-Schwarz, Varian]
  • Best-response dynamics (with tie-breaking)
    converge with probability 1 to that equilibrium.
    [Cary et al.]
  • Thm (informal): Best-replying in GSP is
    incentive-compatible.
  • Generalizes the English auction of
    [Edelman-Ostrovsky-Schwarz]

30
Auctions With Unit-Demand Bidders
  • n bidders. m items.
  • Each bidder i has value v_{i,j} for each item j,
    and is interested in at most one item.
  • Thm: There exists a best-response mechanism for
    auctions with unit-demand bidders that is
    incentive-compatible in ex-post Nash and
    converges to the VCG outcome.
  • Generalizes the English auction of
    [Demange-Gale-Sotomayor]
  • The proof of incentive-compatibility is simple.
    The proof of convergence is more complex and is
    based on Kuhn's Hungarian method.
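
For reference, the VCG outcome that the theorem names as the target can be computed by brute force on small instances. The sketch below is not the best-response mechanism from the talk and does not use the Hungarian method. It enumerates all assignments, and the function names and example values are illustrative assumptions.

```python
from itertools import permutations


def best_welfare(values, bidders, num_items):
    """Maximum total value when each listed bidder gets at most one distinct item."""
    slots = list(range(num_items)) + [None] * len(bidders)  # None = no item
    best, best_assignment = 0, {}
    for choice in permutations(slots, len(bidders)):
        pairs = [(i, j) for i, j in zip(bidders, choice) if j is not None]
        welfare = sum(values[i][j] for i, j in pairs)
        if welfare > best:
            best, best_assignment = welfare, dict(pairs)
    return best, best_assignment


def vcg(values):
    """values[i][j] = bidder i's value for item j. Returns (assignment, payments)."""
    n, m = len(values), len(values[0])
    everyone = list(range(n))
    welfare, assignment = best_welfare(values, everyone, m)
    payments = {}
    for i in everyone:
        others = [k for k in everyone if k != i]
        welfare_without_i, _ = best_welfare(values, others, m)
        own_value = values[i][assignment[i]] if i in assignment else 0
        # VCG payment: the harm bidder i's presence causes the others.
        payments[i] = welfare_without_i - (welfare - own_value)
    return assignment, payments


print(vcg([[5, 2], [3, 1]]))  # ({0: 0, 1: 1}, {0: 2, 1: 0})
```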

31
A New Perspective on Some Centralized Mechanisms
32
Centralized vs. Distributed
  • distributed: players reach a stable outcome in a distributed manner
  • centralized: players declare types; the mechanism simulates the interaction and outputs the outcome
  • ex-post equilibrium in the decentralized setting yields dominant-strategy implementation in the centralized setting.
33
The Centralized Setting
  • Each player i has an action set A_i, a private
    type t_i, and a utility function u_i (as before).
  • Wanted: a direct-revelation mechanism that
    outputs a pure Nash equilibrium of the game
  • and incentivizes truthfulness

34
Clearly, This is Not Always Possible
Row Player Type 2
Row Player Type 1
(one payoff matrix per type of the row player; each cell is row payoff, column payoff)

2,1  0,0
3,0  1,3

3,1  1,0
2,0  0,3
35
Corollary I
  • If every player has a single dominant strategy in
    every realization, then the direct-revelation
    mechanism is truthful.
  • Give each player his dominant strategy in the
    reported realization.

36
Corollary II
  • If the game is blindly solvable, then the
    direct-revelation mechanism is truthful.

Row Player Type 1
Row Player Type 2
(one payoff matrix per type of the row player; each cell is row payoff, column payoff)

9,0   1,1  1,3
10,0  0,2  0,1

10,0  0,1  0,3
9,0   1,2  1,1
37
More Blindly-Solvable Games
  • Cost-Sharing mechanisms
  • Moulin mechanisms [Moulin, Moulin-Shenker]
  • Acyclic mechanisms [Mehta-Roughgarden-Sundararajan]
  • Matching games
  • Interns and Hospitals
  • Correlated two-sided markets

38
Directions for Future Research
  • Implementability of other kinds of equilibria
    (mixed Nash, correlated, ...)?
  • Incentive-compatibility of other kinds of
    dynamics (fictitious play, regret minimization)?

39
Agenda
Best-Response Dynamics Out of Sync
  • Part I: mechanism design approach to
    best-response dynamics.
  • Part II: on the convergence of best-response
    dynamics in asynchronous environments.

40
Synchronous Environments
  • In traditional best-response dynamics, players are
    activated one at a time.
  • More generally, the study of game dynamics
    normally supposes synchrony.
  • What if the interaction between players is
    asynchronous? (Internet, markets)

41
Illustration
              Column Player
Row Player    2,1    0,0
              0,0    1,2
42
Illustration
              Column Player
Row Player    2,1    0,0
              0,0    1,2
43
But
              Column Player
Row Player    2,1    0,0
              0,0    1,2
44
Model for Analyzing Asynchronous Best-Response
Dynamics
  • Infinite sequence of discrete time-steps
  • In each time-step a subset of the players
    best-replies.
  • The schedule is chosen by an adversarial entity
    (the Scheduler).
  • The schedule must be fair (no player is
    indefinitely starved from best-replying).

45
Result [Jaggard-S-Wright]
  • Thm: If two (or more) pure Nash equilibria exist
    in a game, then asynchronous best-reply dynamics
    can potentially oscillate.
  • Implications for Internet protocols, diffusion of
    innovations in social networks, and more.
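
The oscillation is easy to reproduce on the 2x2 game from the Illustration slides. The sketch below assumes the two equilibria (2,1) and (1,2) sit on the diagonal of the matrix, since the transcript does not preserve the cell layout. It compares a schedule that activates both players in every step, which is fair, with the usual one-at-a-time order. Function names are illustrative.

```python
# (row payoff, column payoff); diagonal placement of the two equilibria is assumed
GAME = [[(2, 1), (0, 0)],
        [(0, 0), (1, 2)]]


def row_reply(col):
    return max((0, 1), key=lambda r: GAME[r][col][0])


def col_reply(row):
    return max((0, 1), key=lambda c: GAME[row][c][1])


def simultaneous(state, steps):
    """Both players best-reply in every time step to the previous profile."""
    trace = [state]
    for _ in range(steps):
        row, col = state
        state = (row_reply(col), col_reply(row))
        trace.append(state)
    return trace


def round_robin(state, steps):
    """Players best-reply one at a time, row player first."""
    trace = [state]
    for t in range(steps):
        row, col = state
        state = (row_reply(col), col) if t % 2 == 0 else (row, col_reply(row))
        trace.append(state)
    return trace


print(simultaneous((0, 1), 4))  # [(0, 1), (1, 0), (0, 1), (1, 0), (0, 1)]
print(round_robin((0, 1), 4))   # [(0, 1), (1, 1), (1, 1), (1, 1), (1, 1)]
```

Under the simultaneous schedule the players keep missing each other and never reach either equilibrium, while the one-at-a-time schedule settles after a single step.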

46
Directions for Future Research
  • Characterization of games for which asynchronous
    best-response dynamics converge.
  • More generally, exploring game dynamics in the
    realm that lies beyond synchronization
    (fictitious play, regret minimization).

47
Thank You!