Title: Iterated Prisoner's Dilemma Game in Evolutionary Computation
1. Iterated Prisoner's Dilemma Game in Evolutionary Computation
- 2003. 10. 2
- Seung-Ryong Yang
2. Agenda
- Motivation
- Iterated Prisoner's Dilemma Game
- Related Works
- Strategic Coalition
- Improving Generalization Ability
- Experimental Results
- Conclusion
3. Motivation
- Evolutionary approach
- Understanding complex behaviors by investigating simulation results of an evolutionary process
- Giving a way to find optimal strategies in a dynamic environment
- IPD game
- Models complex phenomena such as social and economic behaviors
- Provides a testbed for modeling a dynamic environment
- Objectives
- Obtaining multiple good strategies
- Forming coalitions to improve generalization ability
4. Iterated Prisoner's Dilemma Game (1/2)
- Overview
- Prisoner's possible choices
- Defection
- Cooperation
- Characteristics
- Non-cooperative
- Non-zero-sum
- Types of Game
- 2IPD (2-player Iterated Prisoner's Dilemma) game
- NIPD (N-player Iterated Prisoner's Dilemma) game
Payoff Matrix of 2IPD Game by Axelrod, R. (1984)
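The matrix values themselves are not shown in the text of this slide. As a minimal sketch, the block below uses the values commonly cited from Axelrod (1984): temptation 5, reward 3, punishment 1, sucker's payoff 0. The names `PAYOFF`, `play_round`, and `is_prisoners_dilemma` are illustrative and do not come from the talk.

```python
# Payoffs commonly cited from Axelrod (1984); assumed here because the
# matrix itself is not present in the slide text.
C, D = "C", "D"

PAYOFF = {
    (C, C): (3, 3),  # mutual cooperation: reward R
    (C, D): (0, 5),  # sucker S versus temptation T
    (D, C): (5, 0),
    (D, D): (1, 1),  # mutual defection: punishment P
}


def play_round(move_a, move_b):
    """Return (payoff_a, payoff_b) for one simultaneous move."""
    return PAYOFF[(move_a, move_b)]


def is_prisoners_dilemma(t, r, p, s):
    """Check the usual dilemma conditions T > R > P > S and 2R > T + S."""
    return t > r > p > s and 2 * r > t + s
```

For example, `play_round("D", "C")` gives the defector 5 and the cooperator 0, and `is_prisoners_dilemma(5, 3, 1, 0)` holds for these values.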
5. Iterated Prisoner's Dilemma Game (2/2)
- Representation of Strategy
[Figure: history table built from the player's own history and the opponent's history, each holding the recent action and the last action ("2N History"). Example with l = 2: history 11 01.]
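A history-table strategy of this kind can be sketched as a lookup table indexed by the recent moves of both players. The encoding below is an assumption, not taken from the slide: 0 means cooperate, 1 means defect, and the index is formed from the own history followed by the opponent's history, most recent move first. The function names are hypothetical.

```python
# Assumed encoding (not from the slides): 0 = cooperate, 1 = defect.
# The table index is own history followed by opponent history, read as bits.
HISTORY_LENGTH = 2  # l = 2, as in the slide's example


def history_index(own, opp):
    """Map two l-move histories (lists of 0/1) to an index in [0, 2**(2l))."""
    index = 0
    for bit in list(own) + list(opp):
        index = (index << 1) | bit
    return index


def next_move(table, own, opp):
    """Look up the strategy's next move for the given histories."""
    return table[history_index(own, opp)]


def update(history, move):
    """Push the newest move to the front and drop the oldest one."""
    return [move] + list(history[:-1])


# The example history "11 01": own = [1, 1], opponent = [0, 1].
example_table = [0] * 2 ** (2 * HISTORY_LENGTH)
example_table[history_index([1, 1], [0, 1])] = 1
```

With l = 2 the table has 16 entries, so a strategy is a 16-bit string that a genetic algorithm can manipulate directly.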
6. Related Works
- Previous Study
- Paul J. Darwen and Xin Yao (1997), Speciation as Automatic Categorical Modularization
- Onn M. Shehory, et al. (1998), Multi-agent Coordination through Coalition Formation
- Y. G. Seo and S. B. Cho (1999), Exploiting Coalition in Co-Evolutionary Learning
- Issues
- Topics on coalition formation in multi-agent environments are broad
- Darwen and Yao have studied coalition in the IPD game, but in a different way
- Focused on cooperation, the number of players, payoff variances, etc.
7. What is Different?
- Co-evolutionary Learning
- Selection Method
- Rank Based
- Roulette wheel
- Tournament
- Coalition Formation
- Coalition keeps surviving to the next generation
- Condition to form a coalition is flexible
- Decision Making in Coalition
- Adapting several decision-making methods to the coalition
- Borda Function, Condorcet Function
- Average Payoff, Highest Payoff
- Weighted Voting
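The three selection methods named on this slide can be sketched as follows. This is a generic illustration rather than the talk's implementation: the function names, the `fitness` callback, and the linear rank weights are assumptions.

```python
import random


def rank_based(population, fitness, rng):
    """Pick one individual with probability proportional to its rank (worst = 1)."""
    ranked = sorted(population, key=fitness)
    weights = list(range(1, len(ranked) + 1))
    return rng.choices(ranked, weights=weights, k=1)[0]


def roulette_wheel(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness.

    Fitness values must be non-negative and not all zero.
    """
    weights = [fitness(ind) for ind in population]
    return rng.choices(population, weights=weights, k=1)[0]


def tournament(population, fitness, rng, size=2):
    """Sample `size` distinct individuals and return the fittest of them."""
    contestants = rng.sample(population, size)
    return max(contestants, key=fitness)
```

Passing a seeded `random.Random` instance as `rng` keeps a run reproducible, which helps when comparing selection methods on the same population.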
8. Evolving Strategy
- To evolve strategies, we use
- Genetic algorithm
- Co-evolutionary learning
- Strategic coalition
- Evolutionary Process
9. Evolution of Agents (1/2)
- Evolution of Agents
- Agents can develop their strategies using co-evolutionary learning
- Weak agents are removed from the population
- Evolution of Coalition
- A formed coalition survives to the next generation
- Agents can join a coalition generation by generation
[Figure: previous, current, and next populations containing coalitions Ci, Cl, and Ck. A coalition survives or grows from one generation to the next.]
10. Evolution of Agents (2/2)
- Problem: the population may end up evolving from weak agents
- Caused by removing from the population the better agents that belong to a coalition
- Making new agents by mixing the better agents within a coalition
[Figure: generating new agents from a coalition. Labels: Population, Coalition, Ci, Cj, Ck, A1, A2, Ai, Random Extraction, Mutation. Repeat as many times as the number of agents belonging to the coalition.]
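One way to read "mixing the better agents within a coalition" is sketched below: two members are extracted at random, recombined, and mutated, once per coalition member. The use of one-point crossover as the mixing step, the bit-list representation, and all function names are assumptions for illustration. The coalition must have at least two members.

```python
import random


def crossover(parent_a, parent_b, rng):
    """One-point crossover of two equal-length bit lists (assumed mixing step)."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def offspring_from_coalition(coalition, rate, rng):
    """Create one new agent per coalition member from two random members."""
    children = []
    for _ in range(len(coalition)):
        a1, a2 = rng.sample(coalition, 2)  # random extraction of A1 and A2
        children.append(mutate(crossover(a1, a2, rng), rate, rng))
    return children
```

The children would then be returned to the population in place of the agents that were absorbed into the coalition.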
11. Strategic Coalition (1/2)
- What is a Coalition?
- A cooperative game as a set A of agents in which each subset of A is called a coalition (Matthias Klusch and Andreas Gerber, 2002)
- A group of agents that work jointly in order to accomplish their tasks (Onn M. Shehory, 1995)
- Coalition in the IPD game
- Forming coalitions through round-robin games
- Pursuing more payoff using generalization ability
- Coalitions form autonomously without supervision
12. Strategic Coalition (2/2)
- Definitions
- Definition 1: Coalition Value
- Definition 2: Payoff Function
- Definition 3: Coalition Identification
- Definition 4: Decision Making
- Definition 5: Payoff Distribution
13. Coalition Formation (1/2)
14. Coalition Formation (2/2)
- Forming coalition
- Round-robin 2IPD game
- Obtain rank
- Determine confidence of agent according to the rank
- Joining coalition
- Round-robin 2IPD game
- Obtain rank
- If number of agents > max. number of agents within a coalition, remove the weakest agent
- Determine confidence of each agent
[Flowchart: 2IPD Game; Game type? (Agent vs. Agent, Agent vs. Coalition, Coalition vs. Coalition); Satisfy condition for forming coalition? (Y/N); Forming Coalition; Joining Coalition; Exceeds iteration per generation? (Y/N); Genetic Operation; Satisfy condition? (Y/N); Stop.]
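The forming and joining steps listed on this slide can be sketched as a round-robin ranking followed by a confidence assignment. The slide does not give the confidence formula, so the linear rank-to-confidence rule below is an assumption, as are the function names and the `play` callback, which stands in for a 2IPD game between two agents.

```python
from itertools import combinations


def round_robin_ranking(agents, play):
    """Play every pair once; return agents sorted by total payoff, best first.

    `play(a, b)` is a hypothetical callback returning (payoff_a, payoff_b).
    """
    totals = {name: 0 for name in agents}
    for a, b in combinations(agents, 2):
        pa, pb = play(a, b)
        totals[a] += pa
        totals[b] += pb
    return sorted(agents, key=lambda name: totals[name], reverse=True)


def confidences_from_rank(ranking):
    """Assumed rule: confidence falls linearly with rank and sums to 1."""
    n = len(ranking)
    total = n * (n + 1) / 2
    return {name: (n - i) / total for i, name in enumerate(ranking)}


def join_coalition(coalition, newcomer, play, max_size):
    """Add a newcomer, then drop the weakest members if the coalition is too large."""
    members = round_robin_ranking(coalition + [newcomer], play)
    if len(members) > max_size:
        members = members[:max_size]
    return members, confidences_from_rank(members)
```

Agents are identified by hashable names here; in a full simulation each name would map to a strategy table.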
15. Coalition Decision Making
- Decision making
- To decide the coalition's opinion
- Use weighted voting method
- Sharing profits
- Distribute payoff according to each agent's confidence
- Rank influences each weight
- Determining next action of coalition
- Weight for cooperation of coalition Ci
- Weight for defection of coalition Ci
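The weight formulas are not present in the slide text, so the following is only a sketch of weighted voting and confidence-proportional profit sharing under simple assumptions: the coalition's weight for an action is the sum of the weights of the agents proposing it, and ties go to defection. Both choices, and the function names, are illustrative.

```python
def coalition_action(votes, weights):
    """Weighted vote over the members' proposed moves.

    `votes` maps agent -> "C" or "D"; `weights` maps agent -> weight.
    Ties are resolved toward defection, which is an arbitrary choice.
    """
    w_coop = sum(weights[a] for a, v in votes.items() if v == "C")
    w_defect = sum(weights[a] for a, v in votes.items() if v == "D")
    return "C" if w_coop > w_defect else "D"


def distribute_payoff(payoff, confidences):
    """Split a coalition payoff in proportion to each agent's confidence."""
    total = sum(confidences.values())
    return {a: payoff * c / total for a, c in confidences.items()}
```

In this sketch a single high-weight agent can outvote several low-weight agents, which is how rank would influence the coalition's decision.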
16. Weight of Agents
- Adjusting weight
- Gives an incentive to agents in the coalition
- Reflects the decision making of the coalition
Adjusting weight
17. Improving Generalization Ability (1/2)
- Problem of one good strategy
- Not adaptive to a dynamic environment
- Obtain multiple good strategies for a specific environment
- Ex) Biological immune system
- Method
- Fitness sharing
- Adjust confidences of multiple strategies by evolution
- Co-evolution
- Coalition formation
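Fitness sharing, listed above as one of the methods, is usually implemented by dividing each individual's raw fitness by a niche count so that crowded regions of the search space are penalized. The sketch below uses Hamming distance on bit-list strategies and a triangular sharing function with radius `sigma`; these specifics are standard textbook choices assumed here, not details given in the talk.

```python
def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def shared_fitness(population, raw_fitness, sigma):
    """Divide each raw fitness by its niche count (triangular sharing function).

    `sigma` must be positive; each individual always counts itself, so the
    niche count is at least 1.
    """
    shared = []
    for i, ind in enumerate(population):
        niche = 0.0
        for other in population:
            d = hamming(ind, other)
            if d < sigma:
                niche += 1.0 - d / sigma
        shared.append(raw_fitness[i] / niche)
    return shared
```

Two identical strategies split their fitness, while a unique strategy keeps all of it, which pushes the population toward multiple distinct good strategies.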
18. Improving Generalization Ability (2/2)
- How well a player performs against an unknown player
- Evaluation
[Figure: evaluation procedure. Labels: Random Generation of 100 Strategies, IPD Game, 2IPD Game, Extract Top Strategies in the Population, Top Strategies, Genetically Evolved Strategies.]
19. Test Strategy
Tit-for-Tat: 0 0 1 0 1 1 0 0
CDCD: 0 1 0 1 0 1 0 1
Trigger: 0 0 0 1 1 1 1 1
CCD: 0 0 1 0 0 1 0 0
AllD: 1 1 1 1 1 1 1 1
Random: 1 1 0 1 0 0 1 1
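The six test strategies can be sketched as small functions. The encoding 0 = cooperate and 1 = defect is an assumption, suggested by the all-1 values listed for AllD on this slide. The function signatures and the `moves` helper are hypothetical.

```python
import random

C, D = 0, 1  # assumed encoding: 0 = cooperate, 1 = defect


def tit_for_tat(round_no, opp_history):
    """Cooperate first, then repeat the opponent's previous move."""
    return C if not opp_history else opp_history[-1]


def trigger(round_no, opp_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return D if D in opp_history else C


def all_d(round_no, opp_history):
    """Always defect."""
    return D


def ccd(round_no, opp_history):
    """Fixed cycle C, C, D regardless of the opponent."""
    return (C, C, D)[round_no % 3]


def cdcd(round_no, opp_history):
    """Fixed cycle C, D regardless of the opponent."""
    return (C, D)[round_no % 2]


def make_random(seed=None):
    """Return a strategy that picks C or D uniformly at random."""
    rng = random.Random(seed)
    return lambda round_no, opp_history: rng.choice((C, D))


def moves(strategy, opp_moves):
    """Moves a strategy makes against a fixed sequence of opponent moves."""
    return [strategy(i, opp_moves[:i]) for i in range(len(opp_moves))]
```

Under this encoding, `moves(ccd, [0] * 8)` yields 0 0 1 0 0 1 0 0 and `moves(cdcd, [0] * 8)` yields 0 1 0 1 0 1 0 1.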
20. Example of Game
[Figure: Tit-for-Tat vs. Evolved Strategy. Each strategy is shown as a table indexed by history (0 to 15), together with the moves played in rounds 1 to 5.]
Payoff: 3 5 1 1 1
Payoff: 3 0 1 1 1
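The two payoff sequences are consistent with a game in which both players cooperate in round 1, the evolved strategy then defects while Tit-for-Tat still cooperates, and both defect afterwards. The sketch below reproduces that sequence under two assumptions: the payoff values commonly cited from Axelrod (1984), and a hypothetical stand-in for the evolved strategy that cooperates once and then always defects. The actual evolved table is not recoverable from the slide text.

```python
# Payoff values commonly cited from Axelrod (1984); assumed here.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own, opp):
    """Cooperate first, then repeat the opponent's previous move."""
    return "C" if not opp else opp[-1]


def cooperate_once_then_defect(own, opp):
    """Hypothetical stand-in for the evolved strategy: C in round 1, then D."""
    return "C" if not own else "D"


def play(strategy_a, strategy_b, rounds):
    """Play an iterated game and return the per-round payoffs of both players."""
    hist_a, hist_b, pay_a, pay_b = [], [], [], []
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        hist_a.append(a)
        hist_b.append(b)
        pay_a.append(pa)
        pay_b.append(pb)
    return pay_a, pay_b
```

Calling `play(tit_for_tat, cooperate_once_then_defect, 5)` gives Tit-for-Tat the payoffs 3 0 1 1 1 and its opponent 3 5 1 1 1.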
21. Test Environment
Experimental Result
- Population size: 100
- Crossover rate: 0.3
- Mutation rate: 0.001
- Number of generations: 200
- Number of iterations: a third of the population
- Training set: six well-known strategies
22. Evolved Strategy vs. Random
The Random strategy is one of the weakest strategies for the 2IPD game. In this game, the evolved strategies perform well: all of them win against the Random test strategies with high payoffs.
23. Evolved Strategy vs. Tit-for-Tat
Tit-for-Tat is a mimicking strategy that cooperates on the first move in the 2IPD game. The evolved strategies respond appropriately so as not to lose the game, which demonstrates their generalization ability.
24. Evolved Strategy vs. Trigger
The Trigger strategy never forgives an opponent's defection. The way to win a game against Trigger is also to choose defection repeatedly.
25. Evolved Strategy vs. AllD
The only way not to lose the game against AllD is to choose defection on every move. There is no room for cooperation in this game.
26. Number of Coalitions
[Figure: number of coalitions plotted against generation.]
Coalitions survive to the next generation. Most coalitions are formed early in the evolutionary process. This keeps genetic diversity high and leads to better choices against opponents. A coalition can grow if the agents satisfy the conditions.
27. Comparing the Results
The evolved strategies get more payoff against Random, CCD and CDCD than against Tit-for-Tat, Trigger and AllD. This indicates that the evolved strategies exploit their opponents' actions well.
28. Bias of the Strategy
[Figure: bias plotted against generation.]
Bias shows how the strategies select their next choice against their opponents. A higher bias rate means that a strategy chooses cooperation more often than defection, and vice versa.
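The slide does not give a formula for bias. One plausible reading, assumed here purely for illustration, is the fraction of entries in a strategy's lookup table that prescribe cooperation, averaged over the population. The encoding 0 = cooperate and both function names are also assumptions.

```python
def bias(table, cooperate=0):
    """Fraction of table entries that prescribe cooperation (assumed definition)."""
    return sum(1 for move in table if move == cooperate) / len(table)


def population_bias(tables, cooperate=0):
    """Average bias over all strategy tables in a population."""
    return sum(bias(t, cooperate) for t in tables) / len(tables)
```

Under this reading, a value above 0.5 means a strategy cooperates in more situations than it defects, and a value below 0.5 means the opposite.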
29. Conclusions
- Conclusion
- Strategic coalition might be a robust method that can adapt to a dynamic environment
- Decision-making methods influence the results, but not seriously
- The strategies evolved by coalition generalize well against various opponents
- Discussion
- Can the strategic coalition be adapted to the NIPD game?
- Which parameters in the IPD game influence generalization ability?
- How can opponent strategies be made for testing?
- How can this problem be adapted to the real world?
30. Examples (1)
31. Examples (2)