Evolution of Teamwork in Multiagent Systems

Slides: 42

Provided by: nnCsUtex

Learn more at: https://nn.cs.utexas.edu

Transcript and Presenter's Notes

Title: Evolution of Teamwork in Multiagent Systems

1
Evolution of Teamwork in Multiagent Systems

Research Preparation Examination
by Jacob Schrum

2
Why Multiple Agents?

Many applications
Physical World
Robotics
Autonomous automobiles
Military applications
Network Systems
Artificial World
Games
Graphics
Entertainment
Artificial Life

3
Why Multiagent Perspective?

Decentralized control
Failure recovery
Individual agents simpler
than whole
Some environments don't
support central control
Human interaction
Humans are also agents
Agents interacting with
humans are in MAS

4
Teamwork in Multiagent Systems

Problem divided amongst many agents
Teamwork often required for success
Communication sometimes an issue
How to learn teamwork open question

5
Direct Approach: Careful Design

Hand code everything
Benefits
Understand end product
Drawbacks
Not general
Difficult
Programmer time
Common in
Robotics
Video games
Most deployed systems
What if no one knows how to program it?

6
Learn It: Reinforcement Learning

Environment is Markov Decision Process
Learn optimal policy
Depends on value function (TD methods)
Proven convergence in tabular case
Function approximation needed for bigger problems
Problems with Partially Observable MDPs
Successes in
Pred/Prey Scenarios (Tan 1993)
Soccer keep away
(Kalyanakrishnan, Stone 2009)
Robocup soccer (many)
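
The tabular case can be sketched as a single update rule. This is a minimal sketch assuming Q-learning as the TD method (the slide does not name a specific algorithm); the state names, action names, alpha and gamma are illustrative.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # TD error: difference between the bootstrapped target and the estimate.
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-action problem; every table entry starts at 0.
q = defaultdict(float)
err = q_learning_update(q, "s0", "right", 1.0, "s1", ["left", "right"])
```

Function approximation replaces the table `q` with a parameterized model once the state space is too large to enumerate.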

7
Breed It: Evolution

Based on evolution via natural selection
Benefits
Less restrictive policy representation
Demonstrated success in POMDP domains
Drawbacks
Computationally intensive
Time intensive
Focus of talk

8
Evolution Basics

1. Initialize population P
2. Evaluate all p in P (assign fitness)
3. Derive P' by selecting/modifying members of P based on their fitness scores
4. Repeat from step 2 with P' as P until done

P' is usually similar to P, but slightly better
Many variations
Genetic Algorithms, Evolution Strategies, etc.
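
The loop above can be sketched in a few lines. This is one possible variation, not the method used in the talk: truncation selection with mutation only is an assumption, and the function names, the toy fitness function and all parameters are illustrative.

```python
import random


def evolve(fitness, init, mutate, pop_size=20, generations=50, seed=0):
    """Generic evolutionary loop: evaluate, select, modify, repeat."""
    rng = random.Random(seed)
    # Initialize population P.
    population = [init(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluate all p in P and rank them by fitness.
        ranked = sorted(population, key=fitness, reverse=True)
        # Derive P' from P: keep the top half (assumed truncation selection)
        # and fill the rest with mutated copies of the survivors.
        parents = ranked[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        # Repeat with P' as P.
        population = parents + children
    return max(population, key=fitness)


# Toy problem: find the x that maximizes -(x - 3)^2.
best = evolve(
    fitness=lambda x: -(x - 3.0) ** 2,
    init=lambda rng: rng.uniform(-10, 10),
    mutate=lambda x, rng: x + rng.gauss(0, 0.5),
)
```

Because the survivors are carried over unchanged, the best fitness in P' is never worse than in P, which matches the "similar to P, but slightly better" remark.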

9
Evolution in Multiagent Systems

Team Composition
Homogeneous
Heterogeneous
Heterogeneous from Subpopulations
Entire population
Type of Selection
Individual
Team
Self-Selection
Multiple Objectives

Pick one member from each subpopulation to make a team

10
1.A. Homogeneous Teams

Team members share same policy
Members know what to expect from team members
One individual evaluated per trial
Evaluations reliable because of consistent team
composition

11
1.B. Heterogeneous Teams

Team composed of several policies
Uncertainty as to who teammates will be
Multiple individuals evaluated per trial
Evaluation differs depending on choice of team
members

12
1.C. Subpopulations

Each slot filled by representative from specific
subpopulation
Subpopulations specialize
Individuals know what to expect of members in
each slot
Team composition is still heterogeneous

13
1.D. Entire Population

The entire population is seen as a cooperating
team
Team level selection not possible
Population may divide into competing
subpopulations
Mating restrictions
Genetic/Tag-based recognition
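
The first three team compositions differ only in how a team is drawn for an evaluation trial. A minimal sketch, assuming policies are opaque objects (strings here) and that heterogeneous teams are sampled without replacement; all names are hypothetical.

```python
import random


def homogeneous_team(individual, size):
    """Every slot is filled by the same policy."""
    return [individual] * size


def heterogeneous_team(population, size, rng):
    """Slots are filled by distinct policies drawn from one population."""
    return rng.sample(population, size)


def subpopulation_team(subpopulations, rng):
    """Slot i is filled by a representative of subpopulation i."""
    return [rng.choice(sub) for sub in subpopulations]


rng = random.Random(0)
population = [f"policy{i}" for i in range(6)]
subpops = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"]]

homo_team = homogeneous_team(population[0], 3)
het_team = heterogeneous_team(population, 3, rng)
sub_team = subpopulation_team(subpops, rng)
```

In the entire-population case there is no sampling step: every individual is present in the one shared environment.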

14
2.A. Individual Selection

Individuals selected based on own fitness
Commonly used with heterogeneous teams
Can result in selfish behaviors
Altruism relevant
sacrificing own fitness to raise fitness of
another
Reciprocity relevant
helping another to get help in return

15
2.B. Team Selection

Individuals selected based on team fitness
Common fitness, sum, average, etc.
Commonly used with homogeneous teams
Enables slackers in heterogeneous teams
Altruism and reciprocity have no meaning
No credit assignment problems between members
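
The difference between the two selection levels comes down to which score each member is credited with. A minimal sketch with hypothetical per-member scores; the function names are illustrative.

```python
from statistics import mean


def individual_selection_fitness(scores):
    """Each member is selected on its own score."""
    return list(scores)


def team_selection_fitness(scores, combine=sum):
    """Every member is credited with the same combined team score."""
    team_score = combine(scores)
    return [team_score] * len(scores)


# Hypothetical trial: the second member contributed nothing.
scores = [5.0, 0.0, 1.0]
by_individual = individual_selection_fitness(scores)
by_team_sum = team_selection_fitness(scores)
by_team_mean = team_selection_fitness(scores, combine=mean)
```

Under team selection the idle member receives the same fitness as its teammates, which is how slackers can persist in heterogeneous teams.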

16
2.C. Self-Selection

Individuals choose when and with whom to mate
Common in Artificial Life simulations
AL studies emergence of biological phenomena
Usually involves a spatial component
Extinction is possible
Auto restart
Spawn new members

17
3. Multiple Objectives

Assume individual has fitness scores
$F = (f_1, \ldots, f_N)$ in objectives 1 through N
Which values of F are best?
Traditional approach
$\mathrm{fitness}(F) = f_1 w_1 + \cdots + f_N w_N$ for weights $w_1, \ldots, w_N$
Pareto-based approach
Partition population into non-dominated Pareto
fronts
Assign fitness based on Pareto-front
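
The traditional approach collapses the score vector into one number, so the ranking depends on the chosen weights. A minimal sketch with made-up scores and weights:

```python
def weighted_sum_fitness(scores, weights):
    """Scalar fitness f1*w1 + ... + fN*wN."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per objective")
    return sum(f * w for f, w in zip(scores, weights))


# Two hypothetical individuals scored on two objectives.
a = (10.0, 1.0)
b = (4.0, 6.0)

# Weights that favor objective 1 rank a above b ...
a_first = weighted_sum_fitness(a, (1.0, 0.5)) > weighted_sum_fitness(b, (1.0, 0.5))
# ... while weights that favor objective 2 reverse the ranking.
b_first = weighted_sum_fitness(b, (0.2, 1.0)) > weighted_sum_fitness(a, (0.2, 1.0))
```

Neither `a` nor `b` is better in both objectives, which is the situation the Pareto-based approach handles without picking weights.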

18
Pareto Front Example

Each point represents
an individual's scores
Point dominates other points
in its box
3 Pareto fronts of
non-dominated points
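
Partitioning into fronts can be sketched by repeatedly peeling off the non-dominated points. This is a simple quadratic-time sketch that assumes all objectives are maximized; the sample points are made up and are not the ones from the slide's figure.

```python
def dominates(a, b):
    """a dominates b: at least as good in every objective, better in one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_fronts(points):
    """Split points into successive non-dominated fronts."""
    remaining = list(points)
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing left dominates it.
        front = [p for p in remaining
                 if not any(dominates(q, p) for q in remaining)]
        fronts.append(front)
        remaining = [p for p in remaining if p not in front]
    return fronts


points = [(1, 5), (3, 4), (5, 1), (2, 3), (4, 0), (1, 1)]
fronts = pareto_fronts(points)
```

Fitness can then be assigned by front index, so that every member of the first front outranks every member of the second.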

19
Case Studies

Review State of the Art
For each study
Classify type of selection
Classify team composition
Identify unanswered questions
Future research directions

20
AntFarm

Evolve foraging behavior
Pheromones to communicate
Individual selection
Entire population as a team
No cooperative foraging!
Likely cause: individual selection
Individual selection offers less incentive for
teamwork
Teamwork especially difficult when there is only
one team

AntFarm: Towards Simulated Evolution. Collins, Jefferson. 1991

21
Evolving Communication

Exploration task
Pheromones to communicate
Team selection
Homogeneous teams vs. static bots
Pairs of objectives, Pareto-based
Different behaviors in different runs
Compromise strategy
Blocking strategy
Teamwork possible with homogeneous teams
Need to move beyond grid-worlds
Move beyond two objectives

Emergence of Communication in Competitive Multi-Agent Systems: A Pareto Multi-Objective Approach. McPartland, Nolfi, Abbass. 2005

22
SwarmEvolveTags

Birds visit food stations
Energy can be shared
Sharing based on tags
Self-selection
Entire population as team
Competing subpopulations emerged
Cooperation in entire population without team
selection
Altruism via aiding similar individuals
Teamwork as a result of subpopulation homogeneity

Evolution of cooperation without reciprocity. Riolo, Cohen, Axelrod. 2001
Tags and the Evolution of Cooperation in Complex Environments. Spector, Klein, Perry. 2004

23
Legion-I

Roman legions defend countryside and cities
Team level selection
Homogeneous teams
Multi-modal behavior
Defend city
Pursue barbarians
Homogeneous team members must fill all roles
Could not learn more complicated/strategic tasks
Example: building roads to speed up travel

Neuroevolution for Adaptive Teams. Bryant, Miikkulainen. 2003

24
Role-Based Cooperation

Toroidal predator/prey grid world
Individual selection
Team fitness shared by team members
Multi-Agent ESP: subpopulation-based
Simple non-communicating method
outperforms communicating method
Teamwork without homogeneity
Communication not always needed
May only apply to simple domains
Still need to scale up complexity
Get away from grid worlds
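
In a toroidal grid world the edges wrap around, so movement and distance both use modular arithmetic. A minimal sketch, assuming a rectangular grid and Manhattan distance; the grid size and function names are illustrative and not taken from the paper.

```python
def toroidal_offset(a, b, size):
    """Shortest signed offset from a to b on a ring of the given length."""
    d = (b - a) % size
    return d - size if d > size // 2 else d


def toroidal_distance(p, q, width, height):
    """Manhattan distance between cells p and q when the edges wrap."""
    dx = abs(toroidal_offset(p[0], q[0], width))
    dy = abs(toroidal_offset(p[1], q[1], height))
    return dx + dy


def step(pos, move, width, height):
    """Move one cell; leaving one edge re-enters from the opposite edge."""
    return ((pos[0] + move[0]) % width, (pos[1] + move[1]) % height)


corner_distance = toroidal_distance((0, 0), (9, 9), 10, 10)
wrapped = step((9, 0), (1, 0), 10, 10)
```

Because there are no walls, prey cannot be cornered by a single predator, which is one reason capture in such a world tends to require coordination.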

Coevolution of Role-Based Cooperation in Multi-Agent Systems. Yong, Miikkulainen. 2007

25
NERO

Machine Learning game
Human interaction via fitness function
Individual selection
Entire population is team
Multiple objectives
User defines weights dynamically
Maintenance of fitness function
Old behaviors can be forgotten
when learning new ones
Need to learn multiple tasks simultaneously

Evolving Neural Network Agents in the NERO Videogame. Stanley, Bryant, Miikkulainen. 2005

26
Pareto Multi-objective NPCs

Evolved monsters vs. bot with stick
Individual selection
Large heterogeneous teams of 15
Third of entire population
Multiple objectives, Pareto-based
Credit assignment trick
Learns multiple objectives simultaneously
Different runs can lead to very different results
Different areas of trade-off surface
Population becomes mostly homogeneous

Constructing Complex NPC Behavior via Multi-Objective Neuroevolution. Schrum, Miikkulainen. 2008

27
Dead End Game

Human prey vs. predators
Offline evolution vs. bot
Team level selection
Homogeneous teams
Online evolution vs. human
Individual selection
Small heterogeneous team
Different configurations appropriate at different
levels
Sometimes the domain leaves no choice

Interactive Opponents Generate Interesting Games. Yannakakis, Hallam. 2004

28
Cooperating Robots

Retrieve tokens
Simulation → Robots
Compared selection levels
Individual vs. Team
Compared team compositions
Homogeneous vs. heterogeneous
Homogeneous better with teamwork and altruism
Homogeneous best with team selection
Heterogeneous best with individual selection
Did not consider subpopulations
Tasks only involved foraging (no other objectives)

Genetic Team Composition and Level of Selection in the Evolution of Cooperation. Waibel, Keller, Floreano. 2008

29
Summary of Issues

More complexity
Move beyond grid worlds
Need multiple contradictory objectives
Act in continuous, real-time world
Best evolutionary configuration
More comparisons between team compositions
Especially subpopulation-based method
Task/configuration pairings?
Credit assignment issues
Multi-modal behavior
What to do and when

30
Experiment

Four monsters vs. bot with stick
Smaller team makes task harder
Compare homogeneous, heterogeneous and
subpopulation
Homogeneous uses team selection
Others use individual selection
Multiple objectives
Group damage
Individual injury
Individual time alive

31
Heterogeneous Results

Many generations (600)
Not that long in real time
Mostly selfish
Good teamwork can arise though (Baiting)
Teamwork depends on population being homogeneous

Figure panels: Teamwork, Selfish

32
Homogeneous Results

Fewer Generations (100-200)
Actually longer in real time
Always some form of teamwork
Baiting
Timed Assault

Figure panels: Timed Assault, Baiting

33
Subpopulations Results

Many Generations (400)
Each generation takes a lot of real time
Easy for slacker subpopulation to persist
Limited teamwork
Only some members participate

Figure panel: Cooperating Pair

34
Discussion

Can subpopulation method do better?
Better credit assignment
Team level selection (how?)
Speed up homogeneous and subpopulations
Heterogeneous: discourage selfishness

35
Future Research Questions

Credit assignment issues
Cooperating individuals cannot be identified
Objectives define best evolutionary
configuration?
Complex domains/real problems
Many objectives
Continuous, real-time
Potential challenge domains
Robocup Soccer
Unreal Tournament

36
Conclusion

Teamwork in Multiagent Systems important area
Evolution has been successful
Better understand why
Team configuration
Level of selection
Presence/absence of credit assignment problems
Apply to harder domains
Real-time
Continuous/noisy
Multiple contradictory objectives

37
Questions?

schrum2_at_cs.utexas.edu

38
Auxiliary Slides

39
Cooperation Without Reciprocity

Abstract study of the evolution of cooperation
Donor/recipient model
3 random pairings with option of donating fitness
c so that recipient can gain fitness b
Choice to donate based on similarity of tags
Individual selection with entire population as
team
Subpopulations emerged based on tags
Donation rate changes cyclically, but generally
stays high (73%) for c < b
Need to apply in actual domain requiring teamwork
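
One round of the donor/recipient model can be sketched as follows. The slide only says that donation depends on tag similarity; the per-agent tolerance threshold used here is an assumption, as are the class, field and function names. Reproduction and mutation are left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    tag: float        # value that other agents can observe
    tolerance: float  # assumed rule: donate if the tag difference is within this
    score: float = 0.0


def play_round(agents, cost, benefit, pairings=3, rng=None):
    """Pair each agent with random others and return the donation rate."""
    rng = rng or random.Random(0)
    donations = 0
    for i, donor in enumerate(agents):
        for _ in range(pairings):
            # Pick a random partner other than the donor itself.
            j = rng.randrange(len(agents) - 1)
            if j >= i:
                j += 1
            recipient = agents[j]
            if abs(donor.tag - recipient.tag) <= donor.tolerance:
                donor.score -= cost          # donor gives up fitness c
                recipient.score += benefit   # recipient gains fitness b
                donations += 1
    return donations / (len(agents) * pairings)


# Identical tags: every pairing leads to a donation, even with zero tolerance.
clones = [Agent(tag=0.5, tolerance=0.0) for _ in range(4)]
rate = play_round(clones, cost=0.1, benefit=1.0)
```

With identical tags the population behaves like a homogeneous team: each donation costs the donor c but adds b > c to the group total.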