Collective Specialization for Evolutionary Design of Agent Controllers presentation


1
Collective Specialization for Evolutionary
Design of Agent Controllers

A.E. Eiben, G.S. Nitschke, M.C. Schut
Computational Intelligence Group
Department of Computer Science, Faculty of Sciences
De Boelelaan 1081a, 1081 HV Amsterdam
The Netherlands
gusz_at_cs.vu.nl, nitschke_at_cs.vu.nl, schut_at_cs.vu.nl

2
Introduction

Challenge: Developing agent controllers for a
group of simulated aerial explorers.
Task: Search-and-find (collective gathering)
missions.
1st goal: Demonstrate that specialization
(behavior exhibited at the agent group level) is
beneficial for this task.
2nd goal: Use neuro-evolution as an adaptive
mechanism to derive specialization (to increase
task performance).
Compare: Heuristic benchmark methods and other
adaptive methods with respect to specialized
behavior in the given task.

3
Background

Collective behavior systems: Specialization of
behavior or morphology (or both) increases
efficiency and performance in many collective
behavior tasks.
Neuro-evolution: An approach that combines neural
networks and evolutionary computation techniques.
Neuro-evolution collective behavior applications:
RoboCup, multi-agent computer games, and
multi-robot system controllers.

4
Hypotheses

Hypothesis 1: There exist particular types of
environments where specialization increases task
performance.
Hypothesis 2: The collective
specialization neuro-evolution method is
appropriate for deriving specialized groups that
yield a high task performance.

5
Adaptive Methods: Conventional Neuro-Evolution

One genotype population (evolved offline)

Genotype population: 1000 neurons
Each neuron tested in a complete controller (with 9 other
neurons)
All neurons evaluated before recombination
Best 20% of neurons selected (each cloned 5
times) to replace the population
End of 100 test scenarios: best 20% of neurons
selected as controllers (cloned 5 times) to run
in the full simulation
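
The selection step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the selection figure is 20% (200 of 1000 neurons, so that 5 clones each restore the population size), represents a neuron as a hypothetical list of weights, and uses a placeholder fitness function. Recombination and mutation are left out.

```python
import random

POP_SIZE = 1000        # neurons in the genotype population (from the slide)
ELITE_PERCENT = 20     # "best 20%" (percent sign is an assumption)
CLONES_PER_ELITE = 5   # each selected neuron is cloned 5 times


def next_generation(population, fitness):
    """Return a new population made of clones of the fittest neurons."""
    n_elite = len(population) * ELITE_PERCENT // 100
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[:n_elite]
    # 200 elite neurons x 5 clones = 1000, so the population size is kept.
    return [list(neuron) for neuron in elite for _ in range(CLONES_PER_ELITE)]


rng = random.Random(0)
# Hypothetical genotype: a neuron is a short list of connection weights.
population = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(POP_SIZE)]
# Placeholder fitness (sum of weights); in the presentation, fitness comes
# from testing each neuron inside a complete controller.
new_population = next_generation(population, fitness=sum)
```

Using an integer percentage keeps the elite count exact, so the replacement population always has the same size as the original.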

6
Adaptive Methods: Collective Neuro-Evolution

N genotype populations (evolved online)

100 genotype populations of 100 neurons each
10 neurons randomly selected from the best 20% of a
population → controller
Mutation (Cauchy distribution) applied to the best
20%, each cloned 5 times to replace the current
population
Agent controllers updated when the agent receives a
fitness reward
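
A rough sketch of one agent's update under the collective scheme is given below. Several details are assumptions made for illustration only: the weight-list genotype, the placeholder fitness function, the mutation scale of 0.1, and the choice to mutate each clone independently (the slide does not say whether mutation happens before or after cloning).

```python
import math
import random

ELITE_PERCENT = 20      # "best 20%" (percent sign is an assumption)
CLONES_PER_ELITE = 5
NEURONS_PER_CONTROLLER = 10


def cauchy_sample(rng, scale=1.0):
    """Draw from a zero-centred Cauchy distribution by inverse-CDF sampling."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))


def elite_of(population, fitness):
    """Return the fittest 20% of a neuron population."""
    n_elite = len(population) * ELITE_PERCENT // 100
    return sorted(population, key=fitness, reverse=True)[:n_elite]


def assemble_controller(population, fitness, rng):
    """Pick 10 distinct neurons at random from the elite to form a controller."""
    return rng.sample(elite_of(population, fitness), NEURONS_PER_CONTROLLER)


def replace_population(population, fitness, rng, scale=0.1):
    """Clone each elite neuron 5 times, adding Cauchy noise to every weight."""
    return [
        [w + cauchy_sample(rng, scale) for w in neuron]
        for neuron in elite_of(population, fitness)
        for _ in range(CLONES_PER_ELITE)
    ]


rng = random.Random(1)
# Hypothetical population for one agent: 100 neurons of 3 weights each.
population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
controller = assemble_controller(population, sum, rng)
next_population = replace_population(population, sum, rng)
```

The Cauchy distribution has heavy tails, so most mutations are small but an occasional large jump is possible.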

7
Experimental Setup

Agent group: 100 explorers and 1 Lander (recharge
station)
Environment: Discrete,
200 x 200 x 200 voxels,
40000 red rocks

Task: Maximize the number of features of interest
(red rocks) discovered, given U energy and T time
to complete the task.

8
Experimental Setup (cont.)

Non-Adaptive experiment set
Specialized Agent Groups
Non-Specialized Agent Groups
Adaptive experiment set
Conventional Neuro-Evolution
Collective Neuro-Evolution

9
Experimental Setup (cont.)

Test environments
10 distributions (ranging from low to high
structure of red rocks)
4 clusters (fixed positions) with radius r
(variable), generated with a Gaussian mixture model data
generator (Paalanen and Kälviäinen, 2006)

1: Unstructured
10: Structured
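
The idea of controlling structure through the cluster radius can be sketched with a simple four-component Gaussian mixture. This is not the Paalanen and Kälviäinen generator; the cluster centres, the equal mixture weights, the placement of rocks on a 2D ground plane, and the clamping at the borders are all assumptions made for illustration.

```python
import math
import random

GRID = 200  # environment edge length in voxels (from the slide)
# Hypothetical fixed cluster centres; the slides do not list the real ones.
CENTRES = [(50, 50), (50, 150), (150, 50), (150, 150)]


def sample_rocks(n_rocks, radius, rng):
    """Sample rock positions around 4 fixed centres; radius sets the spread."""
    rocks = []
    for _ in range(n_rocks):
        cx, cy = rng.choice(CENTRES)  # equal mixture weights (assumed)
        x = round(rng.gauss(cx, radius))
        y = round(rng.gauss(cy, radius))
        # Clamp to the grid so every rock lies inside the environment.
        rocks.append((min(max(x, 0), GRID - 1), min(max(y, 0), GRID - 1)))
    return rocks


def mean_distance_to_nearest_centre(rocks):
    """Small value: tightly clustered (structured); large value: spread out."""
    return sum(
        min(math.dist(rock, centre) for centre in CENTRES) for rock in rocks
    ) / len(rocks)


structured = sample_rocks(1000, radius=5, rng=random.Random(0))
unstructured = sample_rocks(1000, radius=60, rng=random.Random(0))
```

A small radius concentrates rocks near the four centres, while a large radius approaches a scattered layout, which is one plausible way to move between the structured and unstructured ends of the scale.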
10
Agent Design Sensors and Actuators

Agent types are classified according to their
probabilities of selecting one of four actions:
a: Detect (sensor)
b: Evaluate (sensor)
c: Move (actuator)
d: Communicate (actuator)
where a + b + c + d = 1.0

Battery: 1000 units
Cost: 1 unit per sensor or actuator use

Energy rewards (adaptive and non-adaptive
experiments):
Equal to the number of red rocks discovered x
a scalar (given as recharge at the Lander)
Fitness rewards (adaptive experiments):
Equal to the number of red rocks discovered (used
by the evolutionary selection process)
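
The agent design above (action probabilities, battery cost, energy reward) can be sketched as a small class. The class name, method names, and the example preference values are hypothetical, and the sketch assumes the battery is not capped when recharging, which the slides do not specify.

```python
import random

ACTIONS = ("detect", "evaluate", "move", "communicate")  # a, b, c, d


class Explorer:
    """Hypothetical explorer agent with static action preferences."""

    def __init__(self, preferences, battery=1000):
        # The four probabilities a, b, c, d must sum to 1.0.
        if abs(sum(preferences) - 1.0) > 1e-9:
            raise ValueError("action preferences must sum to 1.0")
        self.preferences = preferences
        self.battery = battery

    def step(self, rng):
        """Choose one action by preference; each use costs 1 battery unit."""
        if self.battery <= 0:
            return None  # out of energy, no action possible
        action = rng.choices(ACTIONS, weights=self.preferences, k=1)[0]
        self.battery -= 1
        return action

    def recharge(self, rocks_discovered, scalar):
        """Energy reward: rocks discovered x scalar, given at the Lander."""
        self.battery += rocks_discovered * scalar


rng = random.Random(42)
# Example of a detection-oriented agent type (values are illustrative).
detector = Explorer((0.7, 0.1, 0.1, 0.1))
actions = [detector.step(rng) for _ in range(1000)]
```

After 1000 steps the example agent has spent its full battery and can act again only once it has been recharged.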

11
Adaptive Methods: Agent Neural Controller

0..6 Non-visual neurons:
Nodes 1-4: Previous motor output values (MO0 - MO3)
Node 5: Ambient wind
Node 6: Ambient light
Node 7: Previous red rock evaluation
7..55 Visual neurons:
Detection sensor resolution (encoded in genotype)
Each visual neuron represents a voxel on the ground z
plane
Higher resolution → lower probability of
detection success

12
Non-Adaptive Methods: Defining agent types

At the individual level: Specialized agent
types

Action preferences were set a priori and remained
static throughout a simulation run.

13
Non-Adaptive Methods: Non-specialized groups

At the group level: Specialized group types

At the group level: Non-specialized group types

14
Results

Adaptive methods: Neuro-evolution comparison
Non-adaptive methods: benchmarks

15
Results (cont.)

Results from heuristic and neuro-evolution
methods: Conformed to normal distributions
(Kolmogorov-Smirnov test applied).
T-test applied: Specialized and non-specialized
heuristic results. Null hypothesis (the two data
sets are not significantly different) was rejected.
Supported the 1st hypothesis: specialization is
advantageous (increases task performance) in
certain environment types.
T-test applied: Comparative neuro-evolution
results. Null hypothesis was rejected. Partially
supported the 2nd hypothesis: collective
neuro-evolution would yield a (relatively) higher
task performance.
Supporting the 2nd hypothesis: Compared the best
performing specialized heuristic method with the
specialization (composition of agent types in the
group) derived by the collective neuro-evolution
method.
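
For readers unfamiliar with the comparison step, a two-sample t statistic can be computed as sketched below. The slides do not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, and the two samples are made-up placeholder numbers, not results from the presentation.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error


# Placeholder task-performance scores for two methods (illustrative only).
method_a = [10, 12, 11, 13]
method_b = [7, 8, 6, 9]
t_statistic = welch_t(method_a, method_b)
```

A large absolute t statistic, judged against the t distribution with the appropriate degrees of freedom, is what leads to rejecting the null hypothesis that the two data sets do not differ.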

16
Results (cont.)

Group composition: best performing specialized
non-adaptive group

Group composition: best performing evolved
groups
RRVG: Red Rock Value Gathered
DoS: Degree of Structure
(test environment type)

17
Conclusions

Supporting the 1st hypothesis: Heuristic
pre-designed specialization methods showed that
specialization was beneficial in a range of test
environments (defined by different resource
distributions).
Supporting the 2nd hypothesis: The collective
neuro-evolution method yielded a higher
performance (in all test environments)
compared to a conventional neuro-evolution
method.
Collective neuro-evolution: The best performing
group converged to a specialized group
composition (the majority of the agents assumed one
role). This resembled the group composition of the highest
performing pre-designed (heuristic) specialized
group.

19
Experimental Setup (Adaptive Experiments)

Collective (online) Neuro-Evolution
Generations: Evolutionary operator applied to an
agent controller each time a fitness reward is
received.
Conventional (offline) Neuro-Evolution
Offline testing phase: n test scenarios (250
iterations) testing n genotypes.
End of testing phase: best 20% of genotypes
selected, 5 clones of each, corresponding
phenotypes placed into the task environment.
Generations: At the end of a testing scenario
the fittest 20% (elite portion) of genotypes were
recombined and mutated, producing 5 offspring each
so as to replace the entire genotype population.
The next test scenario then began with the next
generation of genotypes.
