Title: Evolution of Teamwork in Multiagent Systems
1Evolution of Teamwork in Multiagent Systems
- Research Preparation Examination
- by Jacob Schrum
2Why Multiple Agents?
- Many applications
- Physical World
- Robotics
- Autonomous automobiles
- Military applications
- Network Systems
- Artificial World
- Games
- Graphics
- Entertainment
- Artificial Life
3Why Multiagent Perspective?
- Decentralized control
- Failure recovery
- Individual agents simpler than whole
- Some environments don't support central control
- Human interaction
- Humans are also agents
- Agents interacting with humans are in MAS
4Teamwork in Multiagent Systems
- Problem divided amongst many agents
- Teamwork often required for success
- Communication sometimes an issue
- How to learn teamwork open question
5Direct Approach: Careful Design
- Hand code everything
- Benefits
- Understand end product
- Drawbacks
- Not general
- Difficult
- Programmer time
- Common in
- Robotics
- Video games
- Most deployed systems
- What if no one knows how to program it?
6Learn it: Reinforcement Learning
- Environment is Markov Decision Process
- Learn optimal policy
- Depends on value function (TD methods)
- Proven convergence in tabular case
- Function approximation needed for bigger problems
- Problems with Partially Observable MDPs
- Successes in
- Pred/Prey Scenarios (Tan 1993)
- Soccer keep away (Kalyanakrishnan, Stone 2009)
- Robocup soccer (many)
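As a small illustration of the value-function idea on this slide, here is a sketch of one tabular update using Q-learning, which is one TD method among several. The state names, action names, learning rate `alpha`, and discount `gamma` are assumptions made up for the example, not values from the talk or the cited studies.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the TD error."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # TD error: (reward + discounted next value) minus the current estimate.
    td_error = reward + gamma * best_next - q[(state, action)]
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * td_error
    return td_error

# Hypothetical two-action problem; unseen entries default to 0.0.
q = defaultdict(float)
actions = ["left", "right"]
q[("s1", "right")] = 2.0
err = q_learning_update(q, "s0", "right", 1.0, "s1", actions)
print(err, q[("s0", "right")])
```

The table `q` is exactly what stops scaling on bigger problems, which is where function approximation comes in.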
7Breed it: Evolution
- Based on evolution via natural selection
- Benefits
- Less restrictive policy representation
- Demonstrated success in POMDP domains
- Drawbacks
- Computationally intensive
- Time intensive
- Focus of talk
8Evolution Basics
- Initialize population P
- Evaluate all p in P (assign fitness)
- Derive P' by selecting/modifying members of P based on their fitness scores
- Repeat from step 2 with P' as P until done
- P' is usually similar to P, but slightly better
- Many variations
- Genetic Algorithms, Evolution Strategies, etc.
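The loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not any specific algorithm from the talk: the bit-string genome, the toy `onemax` fitness function, truncation selection, and all parameter values are assumptions chosen to keep the example short.

```python
import random

def onemax(genome):
    """Toy fitness (an assumption for illustration): number of 1 bits."""
    return sum(genome)

def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]

def evolve(fitness, genome_len=20, pop_size=30, generations=40,
           mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Step 1: initialize the population.
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Step 2: evaluate every member and rank by fitness.
        ranked = sorted(population, key=fitness, reverse=True)
        # Step 3: derive the next population from the fitter half.
        # Parents are carried over unchanged, so the best score never drops.
        parents = ranked[:pop_size // 2]
        children = [mutate(rng.choice(parents), mutation_rate, rng)
                    for _ in range(pop_size - len(parents))]
        # Step 4: the next population replaces the current one.
        population = parents + children
    return max(population, key=fitness)

best = evolve(onemax)
print(onemax(best), "of", len(best))
```

Genetic Algorithms and Evolution Strategies differ mainly in how step 3 is carried out (crossover, mutation distributions, which parents survive).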
9Evolution in Multiagent Systems
- Team Composition
- Homogeneous
- Heterogeneous
- Heterogeneous from Subpopulations
- Entire population
- Type of Selection
- Individual
- Team
- Self-Selection
- Multiple Objectives
Pick one member from each subpopulation to make a team
101.A. Homogeneous Teams
- Team members share same policy
- Members know what to expect from team members
- One individual evaluated per trial
- Evaluations reliable because of consistent team composition
111.B. Heterogeneous Teams
- Team composed of several policies
- Uncertainty as to who teammates will be
- Multiple individuals evaluated per trial
- Evaluation differs depending on choice of team members
121.C. Subpopulations
- Each slot filled by representative from specific subpopulation
- Subpopulations specialize
- Individuals know what to expect of members in each slot
- Team composition is still heterogeneous
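The first three team compositions differ only in how the slots of a team are filled for an evaluation trial. A rough sketch, assuming policies are opaque objects (plain strings here) and that heterogeneous teams are drawn uniformly at random; the function names are hypothetical:

```python
import random

def homogeneous_team(policy, team_size):
    """Every slot holds the same policy, so one individual is evaluated."""
    return [policy] * team_size

def heterogeneous_team(population, team_size, rng):
    """Slots are filled by distinct individuals from a single population."""
    return rng.sample(population, team_size)

def subpopulation_team(subpopulations, rng):
    """Slot i is filled by one representative of subpopulation i."""
    return [rng.choice(subpop) for subpop in subpopulations]

rng = random.Random(0)
population = ["p0", "p1", "p2", "p3", "p4", "p5"]
subpopulations = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]

print(homogeneous_team("p0", 3))
print(heterogeneous_team(population, 3, rng))
print(subpopulation_team(subpopulations, rng))
```

In the subpopulation case the team is still heterogeneous, but each member always sees teammates drawn from the same specialized pools.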
131.D. Entire Population
- The entire population is seen as a cooperating team
- Team level selection not possible
- Population may divide into competing subpopulations
- Mating restrictions
- Genetic/Tag-based recognition
142.A. Individual Selection
- Individuals selected based on own fitness
- Commonly used with heterogeneous teams
- Can result in selfish behaviors
- Altruism relevant
- sacrificing own fitness to raise fitness of another
- Reciprocity relevant
- helping another to get help in return
152.B. Team Selection
- Individuals selected based on team fitness
- Common fitness, sum, average, etc.
- Commonly used with homogeneous teams
- Enables slackers in heterogeneous teams
- Altruism and reciprocity have no meaning
- No credit assignment problems between members
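The difference between individual and team selection comes down to which score each member carries into selection. A minimal sketch, assuming each team member earns a raw score during a trial; the function names and the sample scores are made up for illustration:

```python
from statistics import mean

def individual_selection_fitness(member_scores):
    """Individual selection: each member keeps its own score."""
    return list(member_scores)

def team_selection_fitness(member_scores, combine=sum):
    """Team selection: every member receives the same team-level score."""
    team_score = combine(member_scores)
    return [team_score] * len(member_scores)

# Hypothetical trial: the second member contributed nothing.
scores = [5, 0, 3]
print(individual_selection_fitness(scores))
print(team_selection_fitness(scores))
print(team_selection_fitness(scores, combine=mean))
```

Under team selection the idle member is rated exactly like its teammates, which is how slackers can persist in heterogeneous teams. In a homogeneous team every member is the same policy, so the shared score is unambiguous.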
162.C. Self-Selection
- Individuals choose when and with whom to mate
- Common in Artificial Life simulations
- AL studies emergence of biological phenomena
- Usually involves a spatial component
- Extinction is possible
- Auto restart
- Spawn new members
173. Multiple Objectives
- Assume individual has fitness scores
- $F = (f_1, \dots, f_N)$ in objectives 1 through N
- Which values of F are best?
- Traditional approach
- $\mathrm{fitness}(F) = f_1 w_1 + \dots + f_N w_N$ for weights $w_1, \dots, w_N$
- Pareto-based approach
- Partition population into non-dominated Pareto fronts
- Assign fitness based on Pareto front
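The traditional weighted-sum approach can be sketched as below. The two score vectors and the weight settings are invented for illustration; the point is only that the ranking of individuals depends on the chosen weights.

```python
def weighted_sum_fitness(scores, weights):
    """Collapse a score vector into one number using per-objective weights."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per objective")
    return sum(f * w for f, w in zip(scores, weights))

# Hypothetical individuals scored on two objectives.
a = (10, 1)
b = (4, 6)

equal = (1, 1)
favor_second = (1, 2)
print(weighted_sum_fitness(a, equal), weighted_sum_fitness(b, equal))
print(weighted_sum_fitness(a, favor_second), weighted_sum_fitness(b, favor_second))
```

With equal weights `a` ranks higher, while doubling the second weight puts `b` ahead. The Pareto-based approach avoids committing to a single weighting.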
18Pareto Front Example
- Each point represents an individual's scores
- Point dominates other points in its box
- 3 Pareto fronts of non-dominated points
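Partitioning a population into Pareto fronts can be sketched as follows. This is a simple quadratic-time version that assumes every objective is maximized; the six sample points are invented so that they split into three fronts, and they are not the points from the slide's figure.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_fronts(points):
    """Repeatedly peel off the non-dominated points of what remains."""
    remaining = list(points)
    fronts = []
    while remaining:
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts

# Hypothetical two-objective scores.
points = [(1, 5), (3, 4), (5, 1), (2, 3), (1, 2), (4, 1)]
for rank, front in enumerate(pareto_fronts(points), start=1):
    print(rank, front)
```

Fitness can then be assigned from the front index, with members of the first front preferred over members of later fronts.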
19Case Studies
- Review State of the Art
- For each study
- Classify type of selection
- Classify team composition
- Identify unanswered questions
- Future research directions
20AntFarm
- Evolve foraging behavior
- Pheromones to communicate
- Individual selection
- Entire population as a team
- No cooperative foraging!
- Likely cause: individual selection
- Individual selection offers less incentive for teamwork
- Teamwork especially difficult when there is only one team
AntFarm: Towards Simulated Evolution. Collins, Jefferson. 1991
21Evolving Communication
- Exploration task
- Pheromones to communicate
- Team selection
- Homogeneous teams vs. static bots
- Pairs of objectives, Pareto-based
- Different behaviors in different runs
- Compromise strategy
- Blocking strategy
- Teamwork possible with homogeneous teams
- Need to move beyond grid-worlds
- Move beyond two objectives
Emergence of Communication in Competitive Multi-Agent Systems: A Pareto Multi-Objective Approach. McPartland, Nolfi, Abbass. 2005
22SwarmEvolveTags
- Birds visit food stations
- Energy can be shared
- Sharing based on tags
- Self-selection
- Entire population as team
- Competing subpopulations emerged
- Cooperation in entire population without team selection
- Altruism via aiding similar individuals
- Teamwork as a result of subpopulation homogeneity
Evolution of cooperation without reciprocity. Riolo, Cohen, Axelrod. 2001
Tags and the Evolution of Cooperation in Complex Environments. Spector, Klein, Perry. 2004
23Legion-I
- Roman legions defend countryside and cities
- Team level selection
- Homogeneous teams
- Multi-modal behavior
- Defend city
- Pursue barbarians
- Homogeneous team members must fill all roles
- Could not learn more complicated/strategic tasks
- Example: building roads to speed up travel
Neuroevolution for Adaptive Teams. Bryant, Miikkulainen. 2003
24Role-Based Cooperation
- Toroidal predator/prey grid world
- Individual selection
- Team fitness shared by team members
- Multi-Agent ESP: subpopulation-based
- Simple non-communicating method outperforms communicating method
- Teamwork without homogeneity
- Communication not always needed
- May only apply to simple domains
- Still need to scale up complexity
- Get away from grid worlds
Coevolution of Role-Based Cooperation in Multi-Agent Systems. Yong, Miikkulainen. 2007
25NERO
- Machine Learning game
- Human interaction via fitness function
- Individual selection
- Entire population is team
- Multiple objectives
- User defines weights dynamically
- Maintenance of fitness function
- Old behaviors can be forgotten when learning new ones
- Need to learn multiple tasks simultaneously
Evolving Neural Network Agents in the NERO Videogame. Stanley, Bryant, Miikkulainen. 2005
26Pareto Multi-objective NPCs
- Evolved monsters vs. bot with stick
- Individual selection
- Large heterogeneous teams of 15
- Third of entire population
- Multiple objectives, Pareto-based
- Credit assignment trick
- Learns multiple objectives simultaneously
- Different runs can lead to very different results
- Different areas of trade-off surface
- Population becomes mostly homogeneous
Constructing Complex NPC Behavior via Multi-Objective Neuroevolution. Schrum, Miikkulainen. 2008
27Dead End Game
- Human prey vs. predators
- Offline evolution vs. bot
- Team level selection
- Homogeneous teams
- Online evolution vs. human
- Individual selection
- Small heterogeneous team
- Different configurations appropriate at different levels
- Sometimes the domain leaves no choice
Interactive Opponents Generate Interesting Games. Yannakakis, Hallam. 2004
28Cooperating Robots
- Retrieve tokens
- Simulation → Robots
- Compared selection levels
- Individual vs. Team
- Compared team compositions
- Homogeneous vs. heterogeneous
- Homogeneous better with teamwork and altruism
- Homogeneous best with team selection
- Heterogeneous best with individual selection
- Did not consider subpopulations
- Tasks only involved foraging (no other objectives)
Genetic Team Composition and Level of Selection in the Evolution of Cooperation. Waibel, Keller, Floreano. 2008
29Summary of Issues
- More complexity
- Move beyond grid worlds
- Need multiple contradictory objectives
- Act in continuous, real-time world
- Best evolutionary configuration
- More comparisons between team compositions
- Especially subpopulation-based method
- Task/configuration pairings?
- Credit assignment issues
- Multi-modal behavior
- What to do and when
30Experiment
- Four monsters vs. bot with stick
- Smaller team makes task harder
- Compare homogeneous, heterogeneous and subpopulation
- Homogeneous uses team selection
- Others use individual selection
- Multiple objectives
- Group damage
- Individual injury
- Individual time alive
31Heterogeneous Results
- Many generations (600)
- Not that long in real time
- Mostly selfish
- Good teamwork can arise though (Baiting)
- Teamwork depends on population being homogeneous
[Figure labels: Teamwork, Selfish]
32Homogeneous Results
- Fewer Generations (100-200)
- Actually longer in real time
- Always some form of teamwork
- Baiting
- Timed Assault
[Figure labels: Timed Assault, Baiting]
33Subpopulations Results
- Many Generations (400)
- Each generation takes a lot of real time
- Easy for slacker subpopulation to persist
- Limited teamwork
- Only some members participate
[Figure label: Cooperating Pair]
34Discussion
- Can subpopulation method do better?
- Better credit assignment
- Team level selection (how?)
- Speed up homogeneous and subpopulations
- Heterogeneous: discourage selfishness
35Future Research Questions
- Credit assignment issues
- Cooperating individuals cannot be identified
- Objectives define best evolutionary configuration?
- Complex domains/real problems
- Many objectives
- Continuous, real-time
- Potential challenge domains
- Robocup Soccer
- Unreal Tournament
36Conclusion
- Teamwork in Multiagent Systems important area
- Evolution has been successful
- Better understand why
- Team configuration
- Level of selection
- Presence/absence of credit assignment problems
- Apply to harder domains
- Real-time
- Continuous/noisy
- Multiple contradictory objectives
37Questions?
38Auxiliary Slides
39Cooperation Without Reciprocity
- Abstract study of the evolution of cooperation
- Donor/recipient model
- 3 random pairings with option of donating fitness c so that recipient can gain fitness b
- Choice to donate based on similarity of tags
- Individual selection with entire population as team
- Subpopulations emerged based on tags
- Donation rate changes cyclically, but generally stays high (73%) for c < b
- Need to apply in actual domain requiring teamwork
Evolution of cooperation without reciprocity. Riolo, Cohen, Axelrod. 2001
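One round of the donor/recipient model can be sketched as follows. The slide only says that donation depends on tag similarity; the per-agent `tolerance` threshold used to decide similarity, the numeric tags, and the default `cost` and `benefit` values are assumptions for this illustration, and reproduction between rounds is left out.

```python
import random
from dataclasses import dataclass

@dataclass
class Agent:
    tag: float        # arbitrary marker that other agents can observe
    tolerance: float  # assumed similarity threshold for donating
    score: float = 0.0

def will_donate(donor, recipient):
    """Donate only when the recipient's tag is close enough to the donor's."""
    return abs(donor.tag - recipient.tag) <= donor.tolerance

def play_round(agents, pairings=3, cost=0.1, benefit=1.0, rng=None):
    """Give every agent `pairings` chances to donate; return the donation rate."""
    rng = rng or random.Random()
    donations = 0
    opportunities = 0
    for donor in agents:
        others = [a for a in agents if a is not donor]
        for _ in range(pairings):
            recipient = rng.choice(others)
            opportunities += 1
            if will_donate(donor, recipient):
                donor.score -= cost         # donor pays c
                recipient.score += benefit  # recipient gains b
                donations += 1
    return donations / opportunities

clones = [Agent(tag=0.5, tolerance=0.0) for _ in range(4)]
rate = play_round(clones, rng=random.Random(0))
print(rate, sum(a.score for a in clones))
```

With identical tags every pairing leads to a donation, and because c < b the group as a whole gains from each one. That is the sense in which tag-similar subpopulations behave like cooperating teams.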
40Cooperation Without Reciprocity Results
41Team Composition in MAS
- Taxonomy proposed by Stone
- Definition of communication is broad
- Message passing, blackboard, information sharing, etc.
Multiagent Systems: A Survey from a Machine Learning Perspective. Stone. 2000