Title: Game Theory
1Game Theory
2The First Experiment in Game Theory
It took place in January of 1950. Merrill Flood
and Melvin Dresher of the RAND Corporation were
the experimenters. The experimental subjects were
Armen Alchian, an economics professor at UCLA,
and John Williams, head of the mathematics
department of the RAND Corporation. The game was
an asymmetric Prisoner's Dilemma. Alchian and
Williams played the game 100 times in succession.
3The Play
It is clear that they started out with different
expectations, and to some extent retained quite
different expectations. Alchian expected Williams
to defect, while Williams tried to bring about a
cooperative solution by starting cooperatively
and playing a trigger strategy. Alchian initially
didn't get it and assumed that Williams was
playing a mixed strategy. (Williams commented
that Alchian was a dope.)
4Results
Despite all this confusion, the two players
managed to cooperate on 60 of the 100 games.
Mutual defection, the Nash equilibrium, occurred
only 14 times. Recall that "always defect" is the
subgame perfect Nash equilibrium for this
repeated game. This pattern is typical of many
experiments on the Prisoner's Dilemma.
5Prisoner's Dilemma
- Interpretations of negative results on the
Prisoner's Dilemma:
- People are not really as rational as game theory
assumes, and fail to play the dominant strategy
equilibrium because they do not understand the
game.
- People are better at solving social dilemmas than
game theorists suppose, perhaps because they do
not always base their actions on self-interest.
6Bounded Rationality
Many social scientists (but a minority of
economists) argue that real human rationality is
"bounded rationality." That is, people do not
spontaneously choose the mathematically rational
solutions to games, but think them through in
complex and fallible ways. People tend to play
according to heuristic rules, or "rules of thumb"
such as Tit-for-Tat, which work well in many
cases but are fallible.
7Non-Self-Interested Types
Another assumption in neoclassical economics and
much game theory is that people act in
self-interested ways. While it can be difficult
to get really meaningful evidence on this, we
have learned some things from game theoretic
experiments.
8Altruism
- It is possible that at least some people act in
altruistic ways. However,
- The meaning of altruism is somewhat vague.
- Maximize the total payoff?
- Maximize a weighted sum?
- Act according to ethical rules?
- There is little uncontroversial evidence for
altruistic behavior.
9Reciprocity
Reciprocity means a tendency to return good for
good or evil for evil, even at a sacrifice of
one's own payoff. This is best understood in
terms of an example, and the Centipede Game is
a good start.
10A Small Centipede
11Centipede Results
- The subgame perfect strategy is to grab as soon
as possible.
- This is a lose-lose outcome.
- In experiments, pass-pass is a fairly common
pattern.
- This is consistent with reciprocity.
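The backward-induction reasoning behind "grab as soon as possible" can be sketched in Python. The payoffs below are hypothetical, chosen only to have the centipede shape; they are not the payoffs from the slide's game tree, and the names `GRAB` and `END_IF_ALL_PASS` are illustrative.

```python
# Hypothetical centipede payoffs (NOT the ones from the slide): (payoff to A, payoff to B).
# At node i the mover may grab, ending the game with GRAB[i], or pass to the next node.
GRAB = [(1, 0), (0, 2), (3, 1), (2, 4)]   # movers alternate: A, B, A, B
END_IF_ALL_PASS = (5, 3)                  # payoffs if everyone passes


def backward_induction(grab, end):
    """Solve the game from the last node back to the first.

    Returns the mover's choice at every node and the payoffs that result
    when both players follow the subgame perfect plan.
    """
    value = end                           # payoffs if play continues past the current node
    plan = []
    for node in reversed(range(len(grab))):
        mover = node % 2                  # 0 = player A, 1 = player B
        if grab[node][mover] >= value[mover]:
            plan.append("grab")           # grabbing is at least as good for the mover
            value = grab[node]
        else:
            plan.append("pass")           # value is unchanged when the mover passes
    plan.reverse()
    return plan, value


plan, outcome = backward_induction(GRAB, END_IF_ALL_PASS)
print(plan, outcome)
```

With these assumed numbers every mover grabs, so play ends at the first node with payoffs (1, 0), even though passing throughout would have given (5, 3): both players do worse than under full cooperation.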
12Reciprocity
- The second pass is an instance of good-for-good
-- positive reciprocity.
- There is also bad-for-bad -- negative
reciprocity.
- They can reinforce one another.
13Centipede with Retaliation
14Retaliation
- This does not change the subgame perfect
equilibrium -- since the threat of retaliation is
never credible with common knowledge of
rationality.
- However, in experiments, retaliation is observed.
- Also, pass-pass is even more frequently
observed.
15Conclusion on Reciprocity
Evidence (and opinion) is building that
reciprocity is a pervasive human tendency with
many practical implications.
16Level k (1)
- A lot of evidence indicates that the rationality
of real human beings is bounded.
- We also see that people often seem to try to
outsmart each other -- and there would be no
point to that if we knew that the other players
were rational.
- Some recent research begins from the idea that
real decision-makers are of different types, or
at least they behave as though they are.
17Level k (2)
- Level 0: Players at level zero do not do any
strategic thinking at all, but choose their
strategies without much thinking, perhaps at
random.
- Level 1: Players at level one choose the best
response to the decisions of players at level
zero.
- Level k: Whenever k > 0, a player at level k
chooses the best response to a player at level
k-1. For example, a level 2 player chooses the
best response to a player at level 1.
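The recursive definition above can be sketched in Python. The payoff matrix is a hypothetical symmetric game invented for illustration (it is not one of the games in these slides), and level 0 is assumed to choose uniformly at random. Because the game is symmetric, one function serves for both players; an asymmetric game would need a separate matrix for each player.

```python
# Hypothetical symmetric 3-strategy game (not from the slides).
# PAYOFF[i][j] = payoff to a player choosing strategy i when the opponent chooses j.
# Strategies are numbered 0, 1, 2.
PAYOFF = [
    [4, 1, 0],
    [5, 3, 1],
    [2, 4, 2],
]


def best_response(belief):
    """Strategy with the highest expected payoff against `belief`,
    a list of probabilities over the opponent's strategies."""
    expected = [sum(p * u for p, u in zip(belief, row)) for row in PAYOFF]
    return max(range(len(expected)), key=expected.__getitem__)


def level_k_choice(k, level0_belief):
    """Strategy chosen by a level-k player, for k >= 1."""
    choice = best_response(level0_belief)      # level 1 responds to level 0
    for _ in range(k - 1):                     # levels 2..k respond to the level below
        belief = [1.0 if j == choice else 0.0 for j in range(len(PAYOFF))]
        choice = best_response(belief)
    return choice


uniform = [1 / 3] * 3
for k in (1, 2, 3):
    print(k, level_k_choice(k, uniform))
```

In this assumed game a level 1 player picks strategy 1, while players at level 2 and above pick strategy 2, which is a best response to itself and therefore a Nash equilibrium.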
18Level k (3)
- There also seem to be
- equilibrium players, who simply play a Nash
equilibrium, and
- sophisticated players, who try to estimate the
odds of being matched against a player of one of
the other types and choose their response to get
the best expected value payoff on the basis of
the estimate.
19Level k (4)
- The level-k theory is a theory of play for
one-off games, that is, games that are played
only once or played for the first time. In
experiments, typically, the experimental subjects
will be matched to play the games with rotation
so that each pair plays only once.
20Remember the Location Game
21Zero and One
- We suppose that level 0 players choose among the
four locations at random, with a probability of ¼
for each of the four strategies. Now suppose
Mimbels believes that Gacys is a level 0
player. We see that, as a level 1 player,
Mimbels' best response is Center City, for an
expected value payoff of 116.5.
- Now suppose instead that Gacys assumes that
Mimbels is a level zero player and so will
choose among the strategies at random. We see
that Gacys' best response to level 0 play is a
West Side location for an expected value payoff
of 97.5.
22Two and Above
- If Mimbels is a level 2 player, it will play
its best response to Gacys' level 1 strategy,
and that is (again) Center City. If Gacys is a
level 2 player, it will play its best
response to Mimbels' level 1 strategy, which is
Center City. At level 3, again, each will play
its best response to Center City, which is
Center City, and the same is true at every
higher level. Recall that the Nash equilibrium
for this game is for both to play Center City, so
if both players play at least at level 2, they
will play the Nash equilibrium.
23Cognitive Salience
- A difficulty with the level k approach is to
determine what the level 0 players will do, that
is, what it means to say that they do no strategic
thinking at all. In some games, it does not seem
that choosing the strategy at random really is
what an unthinking player would do. Some
strategies may have some special property that
attracts attention to them, so that the
unthinking decision-maker would tend to choose
that strategy. The term for a property like this
is cognitive salience.
24Greed Game
25Zero and One
- Suppose that a level zero player chooses among
the four strategies at random. Then a level 1
player will play strategy 2. We expect to see all
players above level 0 playing the equilibrium at
strategies 2,2.
- However, the payoff of 1000 at strategy pair 1,4
or 4,1 is twice as large as any other payoff in
the game, something of a jackpot payoff. Suppose,
then, that instead of choosing at random, a level
0 player goes for the gold by choosing strategy 1.
Then the best response, chosen by a level 1
player, is strategy 3, and a level 2 or higher
player also responds with strategy 3, playing the
Nash equilibrium at strategy pair 3,3.
26Question Mark?
- In this game, as we see, different models of
level 0 play lead to quite different predictions.
In fact each predicts the play of a Nash
equilibrium, but they predict different
equilibria: (2,2) if level 0 chooses at random
and (3,3) if level 0 chooses according to
cognitive salience. On the one hand, the level k
theory certainly would be more specific if we
could always identify a single model of level 0
play. On the other hand, in experiments we can
let the evidence speak for itself.
- This is one instance of the more general problem
of framing in decision theory.
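The dependence on the level 0 model can be illustrated with a small Python sketch. The matrix `U` is hypothetical: it was constructed only to reproduce the qualitative pattern described above (a 1000 jackpot for strategy 1 against strategy 4, and equilibria at 2,2 and 3,3) and should not be read as the Greed Game's actual payoff table.

```python
# Hypothetical 4-strategy symmetric game; NOT the Greed Game payoffs from the slide.
# U[i][j] = payoff to strategy i+1 against an opponent playing strategy j+1.
U = [
    [0,   0,   0,   1000],   # strategy 1: the "jackpot" row
    [300, 400, 200, 400],
    [350, 100, 300, 100],
    [0,   100, 100, 200],
]


def best_response(belief):
    """1-based number of the strategy with the highest expected payoff."""
    expected = [sum(p * u for p, u in zip(belief, row)) for row in U]
    return expected.index(max(expected)) + 1


def predictions(level0_belief, max_level=3):
    """Strategies chosen at levels 1..max_level for a given model of level 0."""
    choices, belief = [], level0_belief
    for _ in range(max_level):
        s = best_response(belief)
        choices.append(s)
        # the next level up believes its opponent plays s for certain
        belief = [1.0 if j == s - 1 else 0.0 for j in range(len(U))]
    return choices


uniform = predictions([0.25] * 4)            # level 0 picks at random
salient = predictions([1.0, 0.0, 0.0, 0.0])  # level 0 always picks strategy 1
print(uniform, salient)
```

Under the random model every level from 1 up plays strategy 2, while under the salience model every level from 1 up plays strategy 3, so the same level k logic yields two different equilibrium predictions.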
27Interim Summary
- Thus far we have explored experimental evidence
for
- Non-self-interested motivations
- Altruism?
- Reciprocity
- Bounded rationality
- Level k
- Next: evolution
28Remember the Hawk vs Dove Game
29Hawk vs. Dove Notes
- In biological applications, payoffs are
inclusive fitness -- expected number of
descendants
- Animals are randomly matched to play 2x2 games
- The probability of meeting a hawk depends on the
proportion of hawks in the population.
30Expected Payoffs
[Figure not reproduced: expected payoffs for Hawk and Dove]
31Equilibrium
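Let z be the proportion of hawks in the
population and (1-z) the proportion of doves.
EV(Hawk) = -25z + 14(1-z) = 14 - 39z
EV(Dove) = -9z + 5(1-z) = 5 - 14z
In equilibrium the two expected values are equal:
5 - 14z = 14 - 39z
(39 - 14)z = 14 - 5
25z = 9
z = 9/25 = .36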
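The same calculation can be checked in Python with exact fractions. The four payoffs are read off the coefficients in the expected-value expressions above; the dictionary layout and function names are illustrative choices.

```python
from fractions import Fraction

# Payoffs implied by the expected values above: PAYOFF[(mine, theirs)] = my payoff.
PAYOFF = {
    ("hawk", "hawk"): -25, ("hawk", "dove"): 14,
    ("dove", "hawk"): -9,  ("dove", "dove"): 5,
}


def expected_value(strategy, z):
    """Expected payoff of `strategy` when a fraction z of the population plays hawk."""
    return z * PAYOFF[(strategy, "hawk")] + (1 - z) * PAYOFF[(strategy, "dove")]


def equilibrium_share():
    """Solve EV(hawk) = EV(dove) for z; both sides are linear in z."""
    a = PAYOFF[("hawk", "hawk")] - PAYOFF[("dove", "hawk")]   # advantage of hawk against a hawk: -16
    b = PAYOFF[("hawk", "dove")] - PAYOFF[("dove", "dove")]   # advantage of hawk against a dove: 9
    # a*z + b*(1 - z) = 0  =>  z = b / (b - a)
    return Fraction(b, b - a)


z_star = equilibrium_share()
print(z_star, expected_value("hawk", z_star), expected_value("dove", z_star))
```

At z = 9/25 both strategies earn the same expected payoff (-1/25), which is what makes the mixed population an equilibrium.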
32ESS
(Evolutionarily Stable Strategy)
According to Fernando Vega-Redondo, a strategy
is said to be an ESS if, once adopted by the whole
population, no mutation adopted by an
arbitrarily small fraction of individuals can
invade (i.e., enter and survive) by getting at
least a comparable payoff.
33Application
To apply this concept to the Hawk vs Dove game,
we have to interpret the 9/25 equilibrium
proportion as a mixed strategy -- as if each
individual bird adopted a mixed strategy with
9/25 probability of playing Hawk. We then ask, if
a small subpopulation were to adopt a different
probability, would they get higher payoffs? The
answer is no, so 9/25 is evolutionarily stable.
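A minimal sketch of this check in Python follows. It uses the standard two-condition test for an ESS, which is a different but commonly used formulation than the verbal definition quoted earlier, and it tries only a grid of mutant probabilities rather than proving the result for all of them. The payoffs are those implied by the Hawk vs Dove expected values.

```python
from fractions import Fraction

HH, HD, DH, DD = -25, 14, -9, 5   # payoffs implied by the Hawk vs Dove expected values


def u(p, q):
    """Expected payoff to playing Hawk with probability p against an
    opponent who plays Hawk with probability q."""
    return (p * q * HH + p * (1 - q) * HD
            + (1 - p) * q * DH + (1 - p) * (1 - q) * DD)


def resists_invasion(q_star, mutant):
    """Two-condition ESS test against a single mutant strategy."""
    if u(q_star, q_star) != u(mutant, q_star):
        # the incumbent must do strictly better against the incumbent population
        return u(q_star, q_star) > u(mutant, q_star)
    # on a tie, the incumbent must do strictly better against the mutant
    return u(q_star, mutant) > u(mutant, mutant)


Q_STAR = Fraction(9, 25)
mutants = [Fraction(n, 100) for n in range(101) if Fraction(n, 100) != Q_STAR]
all_repelled = all(resists_invasion(Q_STAR, m) for m in mutants)
print(all_repelled)
```

Every mutant earns exactly the same payoff as the incumbent against the 9/25 population, so the test is decided by the second condition: the incumbent does strictly better against each mutant than the mutant does against itself. By contrast, an all-Hawk population fails the test against a Dove mutant.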
34Replicator Dynamics
The dynamic idea underlying this is the
replicator dynamics. Again quoting Vega-Redondo,
the share of the population which plays any
given strategy changes in proportion to its
relative payoff (i.e., in proportion to its
deviation, positive or negative, from the average
payoff). In simple games such as Hawk vs. Dove,
the stable states under these dynamics coincide
with the ESS.
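A discrete-time approximation of the replicator dynamics for the Hawk vs Dove game can be sketched as follows. The expected-value formulas come from the equilibrium slide; the step size, the number of rounds and the starting shares are arbitrary assumptions made for illustration.

```python
def ev_hawk(z):
    """Expected payoff to Hawk when a share z of the population plays Hawk."""
    return 14 - 39 * z


def ev_dove(z):
    """Expected payoff to Dove when a share z of the population plays Hawk."""
    return 5 - 14 * z


def replicator_step(z, step=0.01):
    """Move the hawk share in proportion to Hawk's deviation from the average payoff."""
    average = z * ev_hawk(z) + (1 - z) * ev_dove(z)
    return z + step * z * (ev_hawk(z) - average)


def simulate(z0, rounds=5000, step=0.01):
    """Iterate the replicator step from an initial hawk share z0."""
    z = z0
    for _ in range(rounds):
        z = replicator_step(z, step)
    return z


for start in (0.05, 0.95):
    print(start, round(simulate(start), 4))
```

Starting from either a mostly-dove or a mostly-hawk population, the hawk share settles near 0.36, the equilibrium proportion 9/25.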
35Bounded Rationality
In neoclassical economics and much of game
theory, it is assumed that people maximize, or
infallibly choose their best response. Others
argue that real people cannot do that, but rather
that real human rationality is bounded.
36Classical Artificial Intelligence
One version of that is expressed by the concept
of production systems -- that people act
according to rules, though the rules may be very
complex. This idea comes from studies in
artificial intelligence, but fits well with
strategy rules like Tit-for-Tat.
37Learning
This is not to say that people don't learn. But a
rational being would learn by applying Bayes'
rule to all available information. Boundedly
rational creatures learn much less
systematically. One important form of boundedly
rational learning is imitative learning.
38Example 1
Suppose people are randomly matched to play the
social dilemma in the following payoff table.
39Example 2
The agents will play the game repeatedly with no
definite number of repetitions. The discount
factor, allowing for both time discount and the
probability that there will be no next round of
play, is 2/3. The population who play this game
are boundedly rational, in that they play
according to one of three rules:
40The Rules
- always C
- always D
- Tit for Tat
41 Discounted Expected Value Payoffs
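With a discount factor of 2/3, a payoff of v received in every round is worth v + (2/3)v + (2/3)^2 v + ... = 3v. The sketch below applies this to the three rules. The stage-game payoffs in `STAGE` are hypothetical, since the payoff table from the slide is not reproduced here; only the discount factor and the three rules are taken from the text.

```python
from fractions import Fraction

DELTA = Fraction(2, 3)   # discount factor given in the text

# Hypothetical social-dilemma payoffs (NOT the slide's table):
# STAGE[(mine, theirs)] = my payoff in one round.
STAGE = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def move(rule, opponent_last):
    """Action a rule plays, given the opponent's previous action (None in round 1)."""
    if rule == "always C":
        return "C"
    if rule == "always D":
        return "D"
    return "C" if opponent_last in (None, "C") else "D"   # Tit for Tat


def discounted_value(rule, other):
    """Discounted payoff to `rule` when matched with `other`.

    For these three rules, play is constant from round 2 onward, so the
    tail of the payoff stream is a geometric series.
    """
    a1, b1 = move(rule, None), move(other, None)   # round 1 actions
    a2, b2 = move(rule, b1), move(other, a1)       # actions in round 2 and after
    first, steady = STAGE[(a1, b1)], STAGE[(a2, b2)]
    return first + DELTA / (1 - DELTA) * steady


for rule in ("always D", "Tit for Tat"):
    for other in ("always D", "Tit for Tat"):
        print(rule, "vs", other, discounted_value(rule, other))
```

With these assumed payoffs, Tit for Tat earns 9 against itself while always D earns only 7 against Tit for Tat, so defection does not pay once enough of the population plays Tit for Tat; against an always D opponent, always D (3) still beats Tit for Tat (2).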
42How Agents Learn
- At a given time, there is a small probability
that an agent may switch strategies.
- Such an agent will shift from strategy R to
strategy S with a probability that is
proportionate to the payoff to strategy S
relative to the payoff to strategy R.
- Thus we can apply the replicator dynamic.
43Complicated Dynamics
Since the agents are matched at random, the
probability of being matched with a C player, a D
player, or a Tit-for-Tat player depends on the
proportions of C players, D players and
Titfortatters in the population. This is
complicated as we have two proportions to
consider. However, "always C" is dominated, so we
will keep it simple by considering only a few
cases with the proportion of C players at zero
and various proportions of Titfortatters.
44What Happens?
45Informationally (Almost) Efficient Markets
- Many economists and financial theorists have
argued that financial markets are informationally
efficient. This means that the current price of a
share in the XYZ corporation (for example)
reflects all information about the profitability
and risks of investment in XYZ that is available
to the public.
- Thus, you may as well buy stocks at random -- or
buy an "index" fund.
- But if everyone does that, markets cannot be
efficient.
- Two strategies: study and pick stocks, or buy an
index fund.
46A Two-Person Game
47A Proportional Game
48Nash Equilibrium
- The equilibrium condition is that 31.6% of
investors are informed. There are a large number
of Nash equilibria -- depending on which
investors are informed and which are not -- but
every equilibrium has 31.6% informed and the rest
uninformed.
- Is this evolutionarily stable? It is.
- In equilibrium, then, we have two "kinds" of
investors. One "kind," the minority, does
extensive market research and invests with care.
They get a net payoff of 7. The other kind buys
index funds or invests at random. Since the prices
in the marketplace are fairly efficient, the
inactive investors also get a payoff of 7.
49Conclusion
- Evolutionary game theory is well established in
population biology, and has a growing application
as a model of boundedly rational learning on the
part of human beings.