Collective Specialization for Evolutionary Design of Agent Controllers

1
Collective Specialization for Evolutionary
Design of Agent Controllers
  • A.E. Eiben, G.S. Nitschke, M.C. Schut
  • Computational Intelligence Group
  • Department of Computer Science, Faculty of Sciences
  • De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
  • gusz_at_cs.vu.nl, nitschke_at_cs.vu.nl, schut_at_cs.vu.nl

2
Introduction
  • Challenge: Developing agent controllers for a
    group of simulated aerial explorers.
  • Task: Search and find (collective gathering)
    missions.
  • 1st goal: Demonstrate that specialization
    (behavior exhibited at agent group level) is
    beneficial for this task.
  • 2nd goal: Use neuro-evolution as an adaptive
    mechanism to derive specialization (to increase
    task performance).
  • Compare: Heuristic benchmark methods and other
    adaptive methods with respect to specialized
    behavior in the given task.

3
Background
  • Collective behavior systems: Specialization of
    behavior or morphology (or both) increases
    efficiency and performance in many collective
    behavior tasks.
  • Neuro-evolution: An approach that combines neural
    networks and evolutionary computation techniques.
  • Neuro-evolution collective behavior applications:
    RoboCup, multi-agent computer games, and
    multi-robot system controllers.

4
Hypotheses
Hypothesis 1: There exist particular types of
environments where specialization increases task
performance.
Hypothesis 2: The collective specialization
neuro-evolution method is appropriate for deriving
specialized groups that yield a high task performance.
5
Adaptive Methods: Conventional Neuro-Evolution
  • One genotype population (evolved offline)
  • Genotype population: 1000 neurons
  • Each neuron tested in a complete controller
    (with 9 other neurons)
  • All neurons evaluated before recombination
  • Best 20% of neurons selected (each cloned 5
    times) to replace the population
  • End of 100 test scenarios: Best 20% of neurons
    selected as controllers (cloned 5 times) to run
    in the full simulation
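
The selection-and-cloning step above can be sketched as follows. This is only an illustration, not the authors' implementation: it takes the elite portion to be the top 20% of the population (so that 200 elites x 5 clones refill the 1000-neuron population), represents a neuron genotype as a plain list of weights, and uses `sum` as a stand-in fitness function. The names `select_and_clone`, `elite_percent`, and `clones` are hypothetical.

```python
import random

def select_and_clone(population, fitness, elite_percent=20, clones=5):
    """Keep the fittest elite_percent of genotypes and copy each one `clones` times."""
    ranked = sorted(population, key=fitness, reverse=True)
    n_elite = len(ranked) * elite_percent // 100  # integer arithmetic avoids float rounding
    elite = ranked[:n_elite]
    # With 20% elites and 5 clones each, the new population has the original size.
    return [list(genotype) for genotype in elite for _ in range(clones)]

# Usage with placeholder genotypes (random weight lists) and a placeholder fitness.
rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(1000)]
next_population = select_and_clone(population, fitness=sum)
```

In a real run the fitness would come from testing each neuron inside a complete controller, as the slide describes; the placeholder here only makes the sketch executable.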

6
Adaptive Methods: Collective Neuro-Evolution
  • N genotype populations (evolved online)
  • 100 genotype populations of 100 neurons
  • 10 neurons randomly selected from the best 20% of
    a population → controller
  • Mutation (Cauchy distribution) applied to the best
    20%, each cloned 5 times to replace the current
    population
  • Agent controllers updated when an agent received
    a fitness reward
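
A minimal sketch of the per-agent update described above, under several assumptions that the slides do not confirm: the elite portion is the top 20% of a 100-neuron population, each clone is mutated independently, neurons are lists of weights, and the Cauchy scale of 0.1 is arbitrary. The function names are hypothetical.

```python
import math
import random

def cauchy(rng, scale=1.0):
    """Draw one Cauchy-distributed value using the inverse CDF."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))

def elite_of(population, fitness, elite_percent=20):
    """Return the fittest elite_percent of the population, best first."""
    ranked = sorted(population, key=fitness, reverse=True)
    return ranked[: len(ranked) * elite_percent // 100]

def build_controller(population, fitness, rng, n_neurons=10):
    """Assemble a controller from neurons drawn at random from the elite."""
    return rng.sample(elite_of(population, fitness), n_neurons)

def mutate_population(population, fitness, rng, clones=5, scale=0.1):
    """Replace the population with Cauchy-mutated clones of the elite."""
    return [
        [weight + cauchy(rng, scale) for weight in neuron]
        for neuron in elite_of(population, fitness)
        for _ in range(clones)
    ]

# Usage with placeholder neurons and a placeholder fitness (sum of weights).
rng = random.Random(1)
population = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(100)]
controller = build_controller(population, sum, rng)
population = mutate_population(population, sum, rng)
```

The heavy tails of the Cauchy distribution occasionally produce large jumps, which is the usual reason for preferring it over Gaussian mutation; whether that was the motivation here is not stated in the slides.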

7
Experimental Setup
  • Agent Group: 100 explorers; 1 Lander (recharge
    station)
  • Environment: Discrete, 200 x 200 x 200 voxels
  • 40000 red rocks

Task: Maximize the number of features of interest
(red rocks) discovered, given U energy and T time
to complete the task.
8
Experimental Setup (cont.)
  • Non-Adaptive experiment set: Specialized Agent
    Groups and Non-Specialized Agent Groups
  • Adaptive experiment set: Conventional
    Neuro-Evolution and Collective Neuro-Evolution

9
Experimental Setup (cont.)
  • Test environments: 10 distributions (ranging from
    low to high structure of red rocks)
  • 4 clusters (fixed positions) with variable radius
    r, generated with a Gaussian mixture model data
    generator (Paalanen and Kälviäinen, 2006)

Distribution 1: Unstructured
Distribution 10: Structured
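
To make the clustered test environments concrete, here is a rough sketch of sampling red rock positions from four Gaussian clusters. It is not the Paalanen and Kälviäinen generator: the cluster centres, the use of the radius as a standard deviation, equal cluster weights, and placing rocks on a 200 x 200 ground plane are all assumptions made for illustration.

```python
import random

GRID = 200  # voxels per side, taken from the experimental setup
# Hypothetical fixed cluster centres; the slides do not give the real positions.
CENTERS = [(50, 50), (50, 150), (150, 50), (150, 150)]

def clamp(value):
    """Round a coordinate to a voxel index and keep it inside the grid."""
    return min(GRID - 1, max(0, round(value)))

def sample_red_rocks(n_rocks, radius, rng):
    """Sample (x, y) ground positions from an equal-weight 4-component Gaussian mixture."""
    rocks = []
    for _ in range(n_rocks):
        cx, cy = rng.choice(CENTERS)
        rocks.append((clamp(rng.gauss(cx, radius)), clamp(rng.gauss(cy, radius))))
    return rocks

# Usage: a small radius concentrates rocks near the four centres.
rng = random.Random(42)
rocks = sample_red_rocks(40000, radius=10.0, rng=rng)
```

Under these assumptions a larger radius spreads the rocks more evenly, which would correspond to a less structured environment. Several rocks may land in the same voxel in this sketch.
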
10
Agent Design: Sensors and Actuators
  • Agent types are classified according to the
    probabilities of selecting one of four actions:
    a Detect, b Evaluate, c Move, d Communicate

Where $a + b + c + d = 1.0$
a: Detect (sensor)
b: Evaluate (sensor)
c: Move (actuator)
d: Communicate (actuator)
  • Battery: 1000 units. Cost: 1 unit per sensor or
    actuator use
  • Energy Rewards (Adaptive and Non-Adaptive
    experiments): Equal to number of red rocks
    discovered x Scalar (given as recharge at Lander)
  • Fitness Rewards (Adaptive Experiments): Equal to
    number of red rocks discovered (used by the
    evolution selection process)
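
The action-preference and energy bookkeeping above can be sketched as below. This is a simplified illustration: the action names, the function names, and the loop that runs an agent until its battery is empty are assumptions, and no sensing, movement, or recharging logic is modelled.

```python
import random

ACTIONS = ("detect", "evaluate", "move", "communicate")

def choose_action(preferences, rng):
    """Pick one action given probabilities (a, b, c, d) that must sum to 1.0."""
    if abs(sum(preferences) - 1.0) > 1e-9:
        raise ValueError("action preferences must sum to 1.0")
    return rng.choices(ACTIONS, weights=preferences, k=1)[0]

def run_until_empty(preferences, rng, battery=1000, cost=1):
    """Count the actions taken while each sensor or actuator use costs one unit."""
    counts = dict.fromkeys(ACTIONS, 0)
    while battery >= cost:
        counts[choose_action(preferences, rng)] += 1
        battery -= cost
    return counts

def energy_reward(rocks_discovered, scalar):
    """Energy recharged at the lander: rocks discovered times a scalar."""
    return rocks_discovered * scalar

# Usage: a hypothetical agent type that mostly detects.
rng = random.Random(7)
counts = run_until_empty((0.7, 0.1, 0.1, 0.1), rng)
```

An agent type is then simply a fixed tuple of preferences, which matches the way the non-adaptive experiments define specialized agents a priori.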

11
Adaptive Methods: Agent Neural Controller
0..6: Non-Visual Neurons
Nodes 1-4: Previous motor output values (MO0 - MO3)
Node 5: Ambient wind
Node 6: Ambient light
Node 7: Previous red rock evaluation
7..55: Visual Neurons
Detection sensor resolution (encoded in genotype)
Each visual neuron represents a voxel on the ground z plane
Higher resolution → Lower probability of detection success
12
Non-Adaptive Methods: Defining agent types
  • At the individual level: Specialized agent
    types
  • Action preferences were set a priori and remained
    static throughout a simulation run.

13
Non-Adaptive Methods: Non-specialized groups
  • At the group level: Specialized group types
  • At the group level: Non-Specialized group types

14
Results
Adaptive methods: Neuro-evolution comparison
Non-adaptive methods: Benchmarks
15
Results (cont.)
  • Results from heuristic and neuro-evolution
    methods conformed to normal distributions
    (Kolmogorov-Smirnov test applied).
  • T-test applied to specialized and non-specialized
    heuristic results. The null hypothesis (the two
    data sets are not significantly different) was
    rejected. This supported the 1st hypothesis:
    specialization is advantageous (increases task
    performance) in certain environment types.
  • T-test applied to comparative neuro-evolution
    results. The null hypothesis was rejected. This
    partially supported the 2nd hypothesis: collective
    neuro-evolution would yield a (relatively) higher
    task performance.
  • Supporting the 2nd hypothesis: Compared the best
    performing specialized heuristic method with the
    specialization (composition of agent types in
    group) derived by the collective neuro-evolution
    method.
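
For readers who want to reproduce this kind of comparison, the sketch below computes a two-sample t statistic with the standard library. The slides do not say which t-test variant was used; Welch's unequal-variance form is assumed here, and the function name is hypothetical. Turning the statistic into a p-value additionally requires the t distribution, which is left out.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    mean_a = statistics.mean(sample_a)
    mean_b = statistics.mean(sample_b)
    # statistics.variance is the sample variance (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / standard_error
```

Each sample would hold the task performance of one method over repeated simulation runs; a large absolute value of the statistic is what leads to rejecting the null hypothesis of no difference.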

16
Results (cont.)
  • Group Composition: best performing specialized
    non-adaptive group
  • Group Composition: best performing evolved
    groups
  • RRVG: Red Rock Value Gathered
  • DoS: Degree of Structure (test environment type)

17
Conclusions
  • Supporting the 1st hypothesis: Heuristic
    pre-designed specialization methods showed that
    specialization was beneficial in a range of test
    environments (defined by different resource
    distributions).
  • Supporting the 2nd hypothesis: The collective
    neuro-evolution method yielded a higher
    performance (in all test environments)
    compared with a conventional neuro-evolution
    method.
  • Collective neuro-evolution: The best performing
    group converged to a specialized group
    composition (the majority of the agents assumed
    one role). This resembled the group composition of
    the highest performing pre-designed (heuristic)
    specialized group.

18
  • ?

19
Experimental Setup (Adaptive Experiments)
  • Collective (online) Neuro-Evolution
  • Generations: Evolutionary operator applied to an
    agent controller each time a fitness reward is
    received.
  • Conventional (offline) Neuro-Evolution
  • Offline testing phase: n test scenarios (250
    iterations) testing n genotypes.
  • End of testing phase: best 20% of genotypes
    selected, 5 clones of each, corresponding
    phenotypes placed into the task environment.
  • Generations: At the end of a testing scenario
    the fittest 20% (elite portion) of genotypes were
    recombined and mutated, producing 5 offspring each
    so as to replace the entire genotype population.
    The next test scenario then began with the next
    generation of genotypes.