Evolutionary Games - PowerPoint PPT Presentation

About This Presentation
Title:

Evolutionary Games


Slides: 23
Provided by: vic36

Transcript and Presenter's Notes

Title: Evolutionary Games


1
Evolutionary Games
  • The solution concepts that we have discussed in
    some detail include
  • strategically dominant solutions
  • equilibrium solutions
  • Pareto optimal solutions
  • best response solutions
  • mixed strategy solutions

2
  • We now turn attention to another kind of
    equilibrium-based solution. 
  • This is a solution that is produced by some form
    of learning or adaptation process. 
  • We will focus on the kinds of things that can be
    learned in a population of learning agents. 
  • We'll have to be careful because evolution is
    affected by a huge number of things: mating,
    mutation, environment, catastrophes, other agents
    in the population, etc.  We'll restrict attention
    to just a couple of these factors. 

3
Reaching an equilibrium
  • The main requirement for reaching an equilibrium
    in learning is that the learning algorithms stop
    changing. 
  • This type of equilibrium can be very weak as
    when, for example, a learning agent happens to
    select parameter values that cause another
    learning agent to stop adapting, and vice versa. 
  • Or, both agents get tired of adapting and just
    "freeze" their solutions even though they may not
    be good solutions.  This type of equilibrium
    may also be weak because even the smallest
    perturbation to this type of equilibrium can
    cause the system to adapt to another solution. 
  • A stronger notion of equilibrium is a learned
    solution that is not easily changed by perturbing
    the system. 
  • We call such an equilibrium a stable solution.

4
  • Finally, not every learning process has an
    equilibrium. 
  • Since only certain types of learning processes
    and games produce these equilibria, the notion of
    a learning-based equilibrium is not as universal
    as the notion of a Nash equilibrium.

5
  • In evolutionary games, the two main factors that
    contribute to what is learned are:
  • The types of interactions that occur between the
    agents in a population.
  • The rules that are applied to determine which
    strategies within the population are fit and
    therefore likely to be learned by the population.

6
  • Let's begin by using an example. 
  • Suppose that we have two large and separate
    groups of agents (males and females) who will be
    playing the battle of the sexes game. 
  • Suppose that each of these two groups has a mix
    of agents that either always play cooperate (vote
    for what the other wants) or always play defect
    (vote for what they want themselves). 
  • One agent from each group, one male and one
    female, is selected at random. They each make
    their choice, and they get the reward that
    results.

7
  • Battle of the Sexes

            Defect   Coop
  Defect    1,1      3,2
  Coop      2,3      0,0

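The payoff table can be checked mechanically for pure-strategy Nash equilibria. The sketch below is only an illustration: it assumes the first number in each cell is the row player's payoff and the second is the column player's, and the names `PAYOFFS` and `pure_nash_equilibria` are made up for this example.

```python
# Payoffs copied from the table above.
# ASSUMPTION: each cell is (row player's payoff, column player's payoff).
PAYOFFS = {
    ("Defect", "Defect"): (1, 1),
    ("Defect", "Coop"): (3, 2),
    ("Coop", "Defect"): (2, 3),
    ("Coop", "Coop"): (0, 0),
}
ACTIONS = ("Defect", "Coop")


def pure_nash_equilibria(payoffs, actions):
    """Return the action pairs where neither player gains by deviating alone."""
    equilibria = []
    for row in actions:
        for col in actions:
            row_pay, col_pay = payoffs[(row, col)]
            # The row player compares against every alternative row action.
            row_ok = all(payoffs[(alt, col)][0] <= row_pay for alt in actions)
            # The column player compares against every alternative column action.
            col_ok = all(payoffs[(row, alt)][1] <= col_pay for alt in actions)
            if row_ok and col_ok:
                equilibria.append((row, col))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, ACTIONS))
# prints [('Defect', 'Coop'), ('Coop', 'Defect')]
```

Under that reading of the table, the two equilibria are the outcomes in which exactly one player cooperates.
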
8
  • In these images, the x-axis represents the number
    of rounds that the game was played, and the y-axis
    represents the percentage of the female group
    (red circles) and of the male group (green
    squares) that plays always cooperate.
  • The two graphs represent the two most common
    outcomes: all the females play always
    cooperate while all the males play always defect
    (top graph), or all the females play always
    defect while all the males play always cooperate
    (bottom graph).
  • This should make some intuitive sense.  If the
    two groups play many rounds, they should learn
    to settle on one of the two Pareto optimal, Nash
    equilibrium solutions, but which solution is
    chosen depends on the initial make-up of the
    groups.
  • For these simulations, the initial population was
    very close to 50/50, but with a small random
    perturbation towards either always defect or
    always cooperate for each group.
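
The dependence on the initial make-up can be reproduced with a small sketch. It is not the simulation that produced the graphs. It assumes that each agent's payoff is the first number in the cell for (own action, partner's action), that the groups are large enough to use expected payoffs, and that each group's new cooperate share equals the share of that group's utility earned by its cooperators. The names `step` and `simulate` are hypothetical.

```python
def step(x, y):
    """One round. x, y = fraction of females, males playing always cooperate."""
    # Expected payoff of each pure strategy against a random partner.
    # ASSUMPTION: Coop vs Coop = 0, Coop vs Defect = 2,
    #             Defect vs Coop = 3, Defect vs Defect = 1.
    f_coop, f_defect = 2 * (1 - y), 3 * y + 1 * (1 - y)
    m_coop, m_defect = 2 * (1 - x), 3 * x + 1 * (1 - x)
    # New share of cooperators = their share of the group's total utility.
    # (The denominators are zero only if both groups are 100% cooperators.)
    new_x = x * f_coop / (x * f_coop + (1 - x) * f_defect)
    new_y = y * m_coop / (y * m_coop + (1 - y) * m_defect)
    return new_x, new_y


def simulate(x, y, rounds=200):
    """Apply `step` repeatedly and return the final cooperate fractions."""
    for _ in range(rounds):
        x, y = step(x, y)
    return x, y


# Two starts that are almost 50/50 but tilted in opposite directions.
print(simulate(0.51, 0.49))  # females end up cooperating, males defecting
print(simulate(0.49, 0.51))  # the mirror-image outcome
```

In this sketch a tilt of one percentage point is enough to decide which of the two outcomes the groups settle on.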

9
(No Transcript)
10
Relative Fitness
  • When we look at the strategies, if 1/3 of the
    agents are playing strategy A and getting 1/3 of
    the total utility, they are getting what they
    expected, so they shouldn't change.
  • However, if 1/3 of the agents are getting ½ of
    the total utility for all players, they are
    playing better than the others. We will do better
    if we have more agents like these super-achieving
    agents. But how many more?
  • The simple thing to do is reset the agents so the
    number of each type of agent exactly matches the
    percent of utility that group achieved in the
    last round.
  • When we are happy with the division (no under or
    over achieving group), we are done learning.
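
The reset rule and the stopping test above can be written down in a few lines. This is a minimal sketch with hypothetical names (`next_shares`, `is_settled`) and a tolerance chosen purely for illustration.

```python
def next_shares(utility_by_strategy):
    """New population share of each strategy = its share of total utility."""
    total = sum(utility_by_strategy.values())
    return {s: u / total for s, u in utility_by_strategy.items()}


def is_settled(current_shares, new_shares, tol=1e-6):
    """True when no strategy is under- or over-achieving (within `tol`)."""
    return all(abs(current_shares[s] - new_shares[s]) <= tol
               for s in current_shares)


# Strategy A is 1/3 of the population but earned 1/2 of the utility.
current = {"A": 1 / 3, "B": 2 / 3}
earned = {"A": 50.0, "B": 50.0}
updated = next_shares(earned)
print(updated)                       # {'A': 0.5, 'B': 0.5}
print(is_settled(current, updated))  # False: A over-achieved, keep learning
```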

11
Imitator Dynamics
  • Replicator dynamics and random pairings of
    solutions are not the only models for evolution. 
    Thus, they are not the only learning models that
    have some claim to justification. 
  • We will explore a different technique for
    selecting the proportion of strategies that
    evolve from one generation to another, but first
    we will need to explore other models for
    selecting which agents interact with each other.

12
Playing with Neighbors
  • In the previous section, agents were randomly
    paired with other agents from the group.  From an
    evolutionary perspective, it sometimes makes more
    sense to assume that agents are paired with their
    neighbors rather than being randomly paired with
    any other agent.  This pairing with neighbors can
    be implemented in two ways.
  • Agents have some way to recognize another agent. 
    If they are randomly paired with another agent
    that they do not like, they can ask to be
    reassigned.  The reassignment will be random, but
    at least they get one chance to reject an
    undesirable agent and they therefore get more
    chances to interact with their friends.
  • Agents are physically arranged in a group.  For
    example, agents may be arranged on a grid and
    restricted so that they can only interact with
    their immediate neighbors.  These immediate
    neighbors can be defined as those agents to the
    N, S, E, or W of the agent, or to the N, NE, E,
    SE, S, SW, W, or NW of the agent.  For another
    example, agents may be arranged on the perimeter
    of a circle and only able to interact with an
    agent to their right or left.
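
The physical arrangements described above can be sketched as neighbor-lookup functions. The function names are hypothetical, and the grid version assumes the board wraps around at its edges.

```python
def grid_neighbors(row, col, n_rows, n_cols, diagonals=False):
    """Neighbors on a grid: N, S, E, W, plus NE, SE, SW, NW if `diagonals`.

    ASSUMPTION: the board wraps around, so edge cells also have a full set.
    """
    offsets = [(-1, 0), (1, 0), (0, 1), (0, -1)]          # N, S, E, W
    if diagonals:
        offsets += [(-1, 1), (1, 1), (1, -1), (-1, -1)]   # NE, SE, SW, NW
    return [((row + dr) % n_rows, (col + dc) % n_cols) for dr, dc in offsets]


def ring_neighbors(i, n):
    """Left and right neighbors of agent i among n agents on a circle."""
    return [(i - 1) % n, (i + 1) % n]


print(grid_neighbors(0, 0, 3, 3))   # [(2, 0), (1, 0), (0, 1), (0, 2)]
print(ring_neighbors(0, 5))         # [4, 1]
```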

13
  • Standard evolutionary game (random interactions)
    leads to all Defect
  • Modification: spatial games. Interactions are no
    longer random, but take place with spatial
    neighbours
  • Sum the scores. The player with the highest score
    among the 9 shaded squares takes the square
    (territory, food, mates) in the next generation
  • Some degree of cooperation evolves!

14
Imitator Dynamics
  • When agents can only play with their neighbors,
    we can introduce a different way (different from
    replicator dynamics) of selecting which
    strategies propagate to the next generation.  One
    way to do this is for an agent to imitate its
    most successful neighbor.  The algorithm for
    doing this goes something like this:
  • Interact with all of my neighbors (wrapping around
    the board as needed), and let all my neighbors
    interact with their neighbors.
  • After the interactions with my neighbors are
    complete, identify the interaction strategy from
    my neighbors that was most successful unless my
    current strategy beat all of my neighbors (in
    which case I'll stick to my strategy).
  • Change my strategy to the most successful
    strategy of my neighbors -- imitate them -- on
    the next round.
  • Imitator dynamics can produce vastly different
    results than replicator dynamics. 
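
One round of the algorithm above might look like the following sketch. It makes several assumptions not stated in the slides: agents sit on a wrap-around grid with N, S, E, W neighbors, all agents update at the same time, an agent keeps its strategy on a tie, and the payoffs reuse the first number of each cell of the Battle of the Sexes table. All names are hypothetical.

```python
# ASSUMPTION: payoff to me for (my action, neighbor's action).
PAYOFF = {("D", "D"): 1, ("D", "C"): 3, ("C", "D"): 2, ("C", "C"): 0}


def neighbors(r, c, n_rows, n_cols):
    """N, S, E, W neighbors, wrapping around the board as needed."""
    return [((r - 1) % n_rows, c), ((r + 1) % n_rows, c),
            (r, (c + 1) % n_cols), (r, (c - 1) % n_cols)]


def imitator_step(grid):
    """Return the next generation of a grid of 'C'/'D' strategies."""
    n_rows, n_cols = len(grid), len(grid[0])
    # Step 1: every agent plays all of its neighbors and sums its payoffs.
    score = [[sum(PAYOFF[(grid[r][c], grid[nr][nc])]
                  for nr, nc in neighbors(r, c, n_rows, n_cols))
              for c in range(n_cols)] for r in range(n_rows)]
    # Steps 2-3: imitate the best-scoring neighbor unless I did at least as well.
    new_grid = []
    for r in range(n_rows):
        new_row = []
        for c in range(n_cols):
            br, bc = max(neighbors(r, c, n_rows, n_cols),
                         key=lambda rc: score[rc[0]][rc[1]])
            if score[br][bc] > score[r][c]:
                new_row.append(grid[br][bc])
            else:
                new_row.append(grid[r][c])
        new_grid.append(new_row)
    return new_grid


start = [["C", "C", "C"],
         ["C", "D", "C"],
         ["C", "C", "C"]]
for row in imitator_step(start):
    print(" ".join(row))
```

With these payoffs, the lone defector in the centre scores 12 and its four neighbors score 2, so those neighbors switch to defect while the corners stay as they are.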

15
Battle of the Sexes
  • Suppose we have 12 agents, and four strategies
    (as described in the homework). Suppose,
    initially, that there are equal numbers of each
    type of agent.
  • If the strategies are equally good, we would
    expect that each type of agent would do equally
    well.

16
Relative fit
  • We don't need to worry about computing expected
    utility, as we will produce actual utility.
  • K times, we randomly select two players and have
    them compete. We use gamma to decide how many
    times to repeat the interaction (as tit-for-tat
    strategies require repeated play with the same
    agent). For each agent, we compute the average
    utility earned per single interaction, over all
    the interactions it had.
  • We don't want to be biased by how long the
    interaction continues or by how many times the
    player was selected to play. Thus, we work with
    average utility earned.
  • We pick K to be a large number so each player
    gets to play lots of times (so the average is
    representative). This is important because a
    score of 2.2 (averaged over 10 games) is not as
    certain as the same score averaged over 10000
    games.
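
The sampling procedure above can be sketched as follows. The slides do not say exactly how gamma is used, so this example assumes it is the probability that a pair plays one more interaction. It also assumes only always-cooperate and always-defect agents (no tit-for-tat) and reuses the first number of each cell of the Battle of the Sexes table as the payoff. The function name and parameters are hypothetical.

```python
import random

# ASSUMPTION: payoff to me for (my action, opponent's action).
PAYOFF = {("D", "D"): 1, ("D", "C"): 3, ("C", "D"): 2, ("C", "C"): 0}


def average_utilities(strategies, k, gamma, rng):
    """Average utility per single interaction for each agent.

    strategies: list of 'C' or 'D', one fixed action per agent.
    k:          number of random pairings.
    gamma:      ASSUMED probability (< 1) of repeating the interaction.
    rng:        a random.Random instance, so runs are reproducible.
    """
    n = len(strategies)
    total = [0.0] * n   # utility earned so far
    plays = [0] * n     # single interactions played so far
    for _ in range(k):
        i, j = rng.sample(range(n), 2)   # two distinct players
        while True:
            total[i] += PAYOFF[(strategies[i], strategies[j])]
            total[j] += PAYOFF[(strategies[j], strategies[i])]
            plays[i] += 1
            plays[j] += 1
            if rng.random() >= gamma:    # stop repeating with prob. 1 - gamma
                break
    # Dividing by the number of interactions removes the bias from long
    # pairings and from being selected more often than other agents.
    return [t / p if p else None for t, p in zip(total, plays)]


agents = ["C", "D", "C", "D"]
print(average_utilities(agents, k=10000, gamma=0.5, rng=random.Random(0)))
```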

17
Redistributing
  • Suppose after the first round we see the
    following average utilities:

Agent   Strategy   Utility
1       A          2
2       B          2.1
3       C          1.7
4       D          3
5       A          2
6       B          1
7       C          0.4
8       D          1.5
9       A          3
10      B          1.2
11      C          1.6
12      D          1.8

18
To find relative fit, we add up the total utility
earned by all agents of the same type.
The total for all agents is 21.3. The share of
utility earned by each agent type is shown in the
table. Notice that agents of strategy A should
be 32% of the agents in the new round (up
from 25% originally), while agents of type C
should be only 17% in the next round.

Strategy   Total utility   Share of utility
Agent A    7               0.328638
Agent B    4.3             0.201878
Agent C    3.7             0.173709
Agent D    6.3             0.295775

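
The arithmetic above can be reproduced from the first-round utilities. The sketch below also turns the shares into whole numbers of agents using largest-remainder rounding. That rounding rule is an assumption, since the slides do not say how fractional agents are handled, and the function names are hypothetical.

```python
from collections import defaultdict

# (strategy, average utility) for agents 1..12, copied from the first round.
ROUND_ONE = [("A", 2.0), ("B", 2.1), ("C", 1.7), ("D", 3.0),
             ("A", 2.0), ("B", 1.0), ("C", 0.4), ("D", 1.5),
             ("A", 3.0), ("B", 1.2), ("C", 1.6), ("D", 1.8)]


def utility_shares(results):
    """Share of the total utility earned by each strategy."""
    totals = defaultdict(float)
    for strategy, utility in results:
        totals[strategy] += utility
    grand_total = sum(totals.values())
    return {s: t / grand_total for s, t in totals.items()}


def allocate_agents(shares, n_agents):
    """Whole-number agent counts that sum to n_agents.

    ASSUMPTION: largest-remainder rounding (round down, then give the
    leftover agents to the strategies with the largest fractional parts).
    """
    exact = {s: share * n_agents for s, share in shares.items()}
    counts = {s: int(e) for s, e in exact.items()}
    leftover = n_agents - sum(counts.values())
    by_remainder = sorted(exact, key=lambda s: exact[s] - counts[s],
                          reverse=True)
    for s in by_remainder[:leftover]:
        counts[s] += 1
    return counts


shares = utility_shares(ROUND_ONE)
print({s: round(v, 6) for s, v in shares.items()})
print(allocate_agents(shares, 12))   # {'A': 4, 'B': 2, 'C': 2, 'D': 4}
```

With this rounding rule, the 12 agents are split into 4 of type A, 2 of type B, 2 of type C, and 4 of type D.
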
19
So in the next round, we adjust the number of
agents of each type:

Agent   Strategy   Utility
1       A          2
2       B          2.1
3       C          2
4       D          3
5       A          2
6       B          1
7       C          0.4
8       D          1.5
9       A          2.9
10      D          1.2
11      A          2.1
12      D          2

Strategy   Total utility   Share of utility   Number of agents (share × 12)
Agents A   9               0.405405           4.864865
Agents B   3.1             0.13964            1.675676
Agents C   2.4             0.108108           1.297297
Agents D   7.7             0.346847           4.162162

20
As we continue
  • What we want to show is how the percentage of
    each type of agent changes over time.

21
Over time, the percentages could vary as shown
below (one column per round):

A   32   40   45   50   47   46
B   20   14   15   16   16   14
C   17   11   16   24   21   18
D   31   35   24   10   16   22

22
Using Excel's Chart Wizard, we can visualize the
results.