Title: Evolutionary Computation
1Evolutionary Computation
2Evolutionary Complexification
- Two major goals in intelligent systems are the
discovery and improvement of solutions to complex
problems.
- Complexification, i.e. the incremental
elaboration of solutions through adding new
structure, achieves both these goals.
- To discover and improve complex solutions,
evolution, and search in general, should be
allowed to complexify as well as optimize.
3Evolutionary Computation
- Class of algorithms that can be applied to
open-ended learning problems in AI
- Traditionally such algorithms evolve fixed-length
genomes, assuming the space of the genome is
sufficient to encode the solution
- In many cases a solution may be known to exist in
that space
4Indefinite numbers of parameters
- Many common structures are defined by an
indefinite number of parameters
- E.g., the number of neurons in an ANN
- So it is often not clear what number of genes is
appropriate to solve a problem
- Researchers must use heuristics to determine a
priori the appropriate number of genes
5Fixed Length Encoding
- Pre-determination of the appropriate number of
genes is difficult
- The larger the genome, the larger the search space
- Sometimes the solutions should evolve in an
open-ended way (games) with no final solution
- Fixing the maximum size of the genome also fixes
the maximum complexity of the evolved solutions
6Examples
- Ping-pong playing robot: the solution is to make
the genome very large
- Open-ended problems where no final solution can
be accepted: improving after a certain point is
not possible with a fixed-length genome
7Continual Evolution
- Such continual evolution is difficult with a
fixed genome for two reasons:
- When a good strategy is found in a fixed-length
genome, the entire representational space of the
genome is used to encode it. Thus, the only way
to improve it is to alter the strategy, thereby
sacrificing some of the functionality learned
over previous generations.
- Fixing the size of the genome in such domains
arbitrarily fixes the maximum complexity of
evolved creatures, defeating the purpose of the
experiment.
8Phenotype and Genotype
Extending the length and size of the genome adds
new genes that lead to increased phenotypic
complexity
A phenotype is an individual's observable traits,
such as height, eye color, and blood type. The
genetic contribution to the phenotype is called
the genotype. Some traits are largely
determined by the genotype, while other traits
are largely determined by environmental factors.
9Complexification
- Extending the length and size of the genome
adds new genes that lead to increased phenotypic
complexity
- This process is called complexification
- Specifically, when evolving neural nets it means
adding nodes and connections to an already
functional ANN
- It allows more complex strategies to elaborate on
simpler strategies.
10Complexification in Nature
- In nature, optimization does not occur with
fixed-size genomes
- New genes are occasionally added to the genome
- Speciation protects newly formed, more complex
genomes
11Evolving neuro-architecture
- Over many generations, new hidden nodes and
connections are added, complexifying the space of
potential solutions.
- In this way, more complex strategies elaborate on
simpler strategies, focusing search on solutions
that are likely to maintain existing
capabilities.
12Emergence of Strategies
Dn: network with dominance level n
Sk: best network in species S at generation k
hl: lth hidden node to arise from a structural mutation
- Begin with S100: mature no-hidden-node strategy,
followed even when the opponent had more energy,
leaving it vulnerable to attack
- S200: evolved a resting strategy. Not a
complexification
- S267: h22 appeared. Switched between resting and
all-out attack
- S315: improved ability to attack at appropriate
times
13Duel Robot Domain
Food is represented by sandwiches and robots by
the circles representing sensors and arrows
representing directions. The objective is to
forage to obtain a higher level of energy than
the opponent and then collide with it.
The duel domain supports sophisticated strategies
that are recognizable.
http://nn.cs.utexas.edu/pages/research/neatdemo.html
14Alteration vs. Elaboration
15Alteration vs. Elaboration
- The dark robot must evolve to avoid the lighter
robot, which attempts to cause a collision.
- In the alteration scenario (top), the dark robot
first evolves a strategy to go around the left
side of the opponent. However, the strategy fails
in a future generation when the opponent begins
moving to the left.
- The dark robot alters its strategy by evolving
the tendency to move right instead of left.
However, when the light robot later moves right,
the new, altered strategy fails because the dark
robot did not retain its old ability to move
left.
- In the elaboration scenario (bottom), the
original strategy of moving left also fails.
However, instead of altering the strategy, it is
elaborated by adding a new ability to move right
as well. Thus, when the opponent later moves
right, the dark robot still has the ability to
avoid it by using its original strategy.
- Elaboration is necessary for a coevolutionary
arms race to emerge, and it can be achieved
through complexification.
16Key ideas
- Keeping track of which genes match up between
differently sized genomes throughout evolution
- Speciation, so that solutions of differing
complexity can exist independently
- Beginning with a uniform population of small
networks
17Scalability
- Open-ended problems with no explicit fitness
function
- Fitness depends on comparisons with other agents
performing the same task (uses coevolution)
- Robot duel domain: no known best strategy for a
robot.
18Gene duplication
- Gene duplication is a kind of mutation in which
one or more parental genes are copied into the
offspring genome more than once
- The offspring has redundant genes expressing the
same proteins
- Gene duplication is a possible explanation of how
natural evolution expanded the size of genomes
throughout evolution
19Evidence for Gene Duplication
- Gene duplication has been responsible for key
innovations in overall body morphology over the
course of natural evolution
- A major gene duplication event occurred around
the time that vertebrates separated from
invertebrates.
- Invertebrates have a single HOX cluster (of
genes) while vertebrates have four, suggesting
that cluster duplication significantly
contributed to elaborations in vertebrate
body plans
- Researchers agree that gene duplication in some
form contributed significantly to body-plan
elaboration.
20Gene Duplication and Genetic Programming
- Gene duplication is a possible explanation of how
natural evolution expanded the size of
genomes throughout evolution, and provides
inspiration for adding new genes to artificial
genomes as well.
- Gene duplication motivated Koza (1995) to allow
entire functions in genetic programs to be
duplicated through a single mutation, and later
differentiated through further mutations.
- When evolving neural networks, this process means
adding new neurons and connections to the
networks.
21Challenges
- Such systems evolve network topologies of
different sizes and shapes, which are difficult to
cross over without losing information
- Artificial crossover may disrupt evolved
topologies
- Optimizing variable-length genomes may take
longer, and more complex networks may be eliminated
before they have had a chance to be optimized
22Implementing Variable Length Genes
- Crossover causes problems through misalignment
- Optimization takes longer causing early
elimination of possible innovations
23Alignment
- Depending on when new structure was added, the
same gene may exist at different positions, or
conversely, different genes may exist at the same
position.
- Thus, artificial crossover may disrupt evolved
topologies through misalignment.
- Alignment processes have been observed in nature
(synapsis)
24Speciation
- Innovations in nature are protected
through speciation. Organisms with significantly
divergent genomes never mate because they are in
different species.
- If any organism could mate with any other,
organisms with initially larger, less-fit genomes
would be forced to compete for mates with their
simpler, more fit counterparts.
- As a result, the larger, more innovative genomes
would fail to produce offspring and disappear
from the population.
25NEAT ALGORITHM
- NeuroEvolution of Augmenting Topologies (NEAT)
improves on genetic algorithms by including
complexification and speciation in the algorithm
- Alignment during crossover through synapsis
- Speciation protects complexification
26Competitive Coevolution
- Fitness signifies only the relative strength of
solutions
- Ideally, solutions evolve in an arms race
towards better performance
- Interesting strategies only evolve if the arms
race continues for a large number of generations
27Progress in Evolution
- Evolution finds simplest strategy that can win
- Strategies switch back and forth
opportunistically between variations, losing some
abilities and attaining others
28Pareto Coevolution
Pareto coevolution finds the best learners and
the best teachers in two populations by casting
coevolution as a multiobjective optimization
problem. This information enables choosing the
best individuals to reproduce, as well as
maintaining an informative and diverse set of
opponents.
29Progress in Evolution
- Techniques: Hall of Fame, Fitness Sharing, and
Pareto Coevolution (finding the best learners
and best teachers in a population)
- These techniques allow sustaining the arms race
longer but do not encourage continual evolution,
i.e. creating new solutions that maintain existing
capabilities.
30Complexification
- Complexification elaborates strategies by adding
new dimensions, enabling indefinite progress
31NeuroEvolution of Augmenting Topologies (NEAT)
- Using historical markings to line up genes for
crossover
- Protecting topological evolution through
speciation
- Minimization of topologies throughout evolution
32Genetic Encoding
- A genome includes a list of connection genes,
each specifying an in-node, an out-node, a weight,
an enable bit (whether the gene is expressed) and
an innovation number
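
As an illustration of this encoding, here is a minimal Python sketch. The class and field names (`ConnectionGene`, `Genome`, `node_ids`, `expressed`) are assumptions chosen for this example and do not come from any NEAT implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConnectionGene:
    """One connection gene, with the fields listed on the slide."""
    in_node: int
    out_node: int
    weight: float
    enabled: bool      # the "enable bit": is this gene expressed?
    innovation: int    # historical marking of the gene


@dataclass
class Genome:
    """Hypothetical container: node ids plus a list of connection genes."""
    node_ids: List[int] = field(default_factory=list)
    connections: List[ConnectionGene] = field(default_factory=list)

    def expressed(self) -> List[ConnectionGene]:
        # Only enabled genes contribute to the phenotype (the network).
        return [g for g in self.connections if g.enabled]


genome = Genome(
    node_ids=[0, 1, 2],
    connections=[
        ConnectionGene(0, 2, 0.5, True, 1),
        ConnectionGene(1, 2, -1.2, False, 2),  # present but not expressed
    ],
)
print(len(genome.expressed()))
```

A disabled gene stays in the genome, so it can still be lined up with other genomes by its innovation number even though it does not appear in the network.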
33Genetic Encoding
34Historical Origins
Two genes with the same historical origin
represent the same structure (although possibly
with different weights), since they were both
derived from the same ancestral gene at some
point in the past. Thus all a system needs to do
is keep track of the historical origin of every
gene in the system.
35Mutation
- Mutation in NEAT can change both connection
weights and network structures.
- Connection weights mutate (as in the usual NE
algorithm)
- Structural mutation operates in two ways: add
connection and add node
- Add node: an existing connection is split; the
new in-weight is 1 and the out-weight is the same
as the old weight, so functionality does not
change initially
36Structural Mutation in NEAT
The connection between the first node and the new
node is given the weight 1, and the connection
between the new node and the second node is given
the same weight as the connection being split.
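
The add-node mutation can be sketched as follows. This is an illustrative sketch only: the names `ConnectionGene`, `add_node` and the `innovations` counter are assumptions, and disabling the split connection follows the usual description of NEAT, not anything stated on the slide.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class ConnectionGene:
    in_node: int
    out_node: int
    weight: float
    enabled: bool
    innovation: int


def add_node(connections, gene_index, new_node_id, innovations):
    """Split connections[gene_index] by placing new_node_id in the middle."""
    old = connections[gene_index]
    old.enabled = False  # assumption: the split connection is disabled
    # The connection into the new node gets weight 1 ...
    into_new = ConnectionGene(old.in_node, new_node_id, 1.0, True,
                              next(innovations))
    # ... and the connection out of it keeps the old weight.
    out_of_new = ConnectionGene(new_node_id, old.out_node, old.weight, True,
                                next(innovations))
    connections.extend([into_new, out_of_new])
    return into_new, out_of_new


innovations = count(start=2)   # hypothetical global innovation counter
genes = [ConnectionGene(0, 1, 0.7, True, 1)]
first, second = add_node(genes, 0, new_node_id=2, innovations=innovations)
print([(g.in_node, g.out_node, g.weight, g.enabled) for g in genes])
```

Each new gene receives the next innovation number, which is what later allows genes of the same origin to be matched during crossover.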
37Historical Markings
- If the two above mutations occur consecutively
the innovation numbers associated with the new
genes allow the system to keep track of the
histories of every gene in the system
38Crossover using innovation numbers
Genes with matching historical markings are lined
up, and one gene of each matching pair is chosen
at random for the offspring. Genes that do not
match are inherited from the more fit parent, or
randomly if the parents are equally fit. Disabled
genes are inherited at 25%
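
A minimal sketch of crossover by innovation number is given below. The names `Gene` and `crossover` are assumptions for illustration; the sketch assumes the first parent is the more fit one and leaves out both the equal-fitness case and the handling of disabled genes.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    innovation: int
    weight: float


def crossover(fitter, weaker, rng):
    """Return offspring genes; `fitter` is the parent with higher fitness."""
    weaker_by_innovation = {g.innovation: g for g in weaker}
    child = []
    for gene in fitter:
        match = weaker_by_innovation.get(gene.innovation)
        if match is not None:
            # Matching genes: pick one of the two versions at random.
            child.append(rng.choice([gene, match]))
        else:
            # Non-matching genes: inherited from the more fit parent.
            child.append(gene)
    return child


fitter = [Gene(1, 0.1), Gene(2, 0.2), Gene(4, 0.4)]
weaker = [Gene(1, 1.1), Gene(3, 1.3), Gene(5, 1.5)]
child = crossover(fitter, weaker, random.Random(0))
print([g.innovation for g in child])
```

Because genes are matched by innovation number and not by position, the two parents may have different lengths without any explicit alignment step.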
39Speciating
- It turns out that a population of varying
complexities cannot maintain topological
innovations on its own.
- Because smaller structures optimize faster than
larger structures, and adding nodes and
connections usually initially decreases the
fitness of the network, recently augmented
structures have little hope of surviving more
than one generation even though the innovations
they represent might be crucial towards solving
the task in the long run.
- The solution is to protect innovation by
speciating the population.
40Speciation
- NEAT speciates the population so that individuals
compete primarily within their own niches instead
of with the population at large. This way,
topological innovations are protected and have
time to optimize their structure before they have
to compete with other niches in the population.
- Speciation prevents bloating of genomes: species
with smaller genomes survive as long as their
fitness is competitive, ensuring that small
networks are not replaced by larger ones
unnecessarily.
- Protecting innovation through speciation follows
the philosophy that new ideas must be given time
to reach their potential before they are
eliminated.
41Speciation
Distance between networks:
$$\delta = \frac{c_1 E}{N} + \frac{c_2 D}{N} + c_3 \bar{W}$$
$E$ is the number of excess genes, $D$ is the number
of disjoint genes, $\bar{W}$ is the average weight
difference of matching genes, $N$ is the number of
genes in the larger genome, and $c_1$, $c_2$, $c_3$
are coefficients that weight the three terms.
If the distance from a test genome to a randomly
chosen member of a species is less than the
current compatibility threshold, the test genome
is placed in that species.
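
The distance measure and the species assignment rule can be sketched in Python as follows. Genomes are simplified here to dictionaries mapping innovation number to weight; the function names, the coefficient defaults (`c1`, `c2`, `c3`) and the threshold value are assumptions for illustration, not values taken from the slides.

```python
import random


def compatibility_distance(genome_a, genome_b, c1=1.0, c2=1.0, c3=0.4):
    """Weighted sum of excess genes, disjoint genes and mean weight difference.

    Both genomes are non-empty dicts: innovation number -> weight.
    """
    innov_a, innov_b = set(genome_a), set(genome_b)
    matching = innov_a & innov_b
    unmatched = innov_a ^ innov_b
    # Excess genes lie beyond the other genome's highest innovation number.
    cutoff = min(max(innov_a), max(innov_b))
    excess = sum(1 for i in unmatched if i > cutoff)
    disjoint = len(unmatched) - excess
    if matching:
        mean_diff = sum(abs(genome_a[i] - genome_b[i])
                        for i in matching) / len(matching)
    else:
        mean_diff = 0.0
    n = max(len(genome_a), len(genome_b))  # genes in the larger genome
    return c1 * excess / n + c2 * disjoint / n + c3 * mean_diff


def assign_to_species(genome, species, threshold, rng):
    """Place genome in the first compatible species, else start a new one."""
    for members in species:
        representative = rng.choice(members)  # randomly chosen member
        if compatibility_distance(genome, representative) < threshold:
            members.append(genome)
            return
    species.append([genome])


a = {1: 0.5, 2: -0.3, 3: 0.8}
b = {1: 0.7, 2: -0.3, 4: 0.1, 5: 0.2}
c = {10: 5.0}
species = [[a]]
rng = random.Random(0)
assign_to_species(b, species, threshold=1.0, rng=rng)
assign_to_species(c, species, threshold=1.0, rng=rng)
print(compatibility_distance(a, b), len(species))
```

In this toy run `b` joins the species of `a` (two excess genes, one disjoint gene, small weight difference), while `c` shares no genes with either and founds a new species.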
42Speciation
43Fitness Sharing
Organisms in the same species must share the
fitness of their niche. The adjusted fitness $f'_i$
for organism $i$ is calculated according to its
distance $\delta$ from every other organism $j$ in
the population:
$$f'_i = \frac{f_i}{\sum_{j=1}^{n} \mathrm{sh}(\delta(i,j))}$$
where sh is set to 0 when the distance $\delta(i,j)$
is above the threshold and 1 otherwise.
The factor $\sum_{j=1}^{n} \mathrm{sh}(\delta(i,j))$
reduces to the number of organisms in the same
species as organism $i$.
Every species is assigned a potentially different
number of offspring in proportion to the sum of
adjusted fitnesses $f'_i$ of its member organisms.
Species reproduce by first eliminating the lowest
performing members from the population. The
entire population is then replaced by the
offspring of the remaining organisms in each
species.
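
A small sketch of fitness sharing and offspring allocation follows. It relies on the simplification stated on the slide, that the sharing sum reduces to the size of the organism's species. The function names and the rounding of quotas are assumptions; rounded quotas need not sum exactly to the population size.

```python
def adjusted_fitnesses(species):
    """species: list of lists of raw fitness values, one list per species.

    Each organism's fitness is divided by the size of its species.
    """
    return [[f / len(members) for f in members] for members in species]


def offspring_quotas(species, population_size):
    """Offspring per species, proportional to its summed adjusted fitness."""
    totals = [sum(adjusted) for adjusted in adjusted_fitnesses(species)]
    grand_total = sum(totals)
    return [round(population_size * t / grand_total) for t in totals]


# Two species: one with two organisms, one with a single organism.
species = [[4.0, 2.0], [3.0]]
print(adjusted_fitnesses(species))
print(offspring_quotas(species, population_size=10))
```

In the example the larger species has the higher raw fitness total (6 against 3), yet both species receive the same quota after sharing, which is how a small species is kept from being crowded out.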
44A Run models increasing complexity
- A run begins with a uniform population of
networks with no hidden nodes that differ only in
the random assignments of weights
- The gradual production of increasingly complex
structures constitutes the model of
complexification
45Coevolution Domain
- Domain where it is possible to develop a wide
range of increasingly sophisticated strategies
- Sophistication can be readily measured.
- A coevolution domain is particularly appropriate
because a sustained arms race should lead to
increasing sophistication.
46Duel Robot Domain
Food is represented by sandwiches and robots by
the circles representing sensors and arrows
representing directions. The objective is to
forage to obtain a higher level of energy than
the opponent and then collide with it.
The duel domain supports sophisticated strategies
that are recognizable.
http://nn.cs.utexas.edu/pages/research/neatdemo.html
47The Robot ANN
Each robot has five sensors to find the other
robot and five to sense food. Each has two wheels
controlled by separate motors, can read the
opponent's energy level, and has a wall sensor.
Energy is consumed in proportion to the amount of
force applied to the motors.
48About the duel domain
- The observed state taken by the sensors does not
include the internal state of the opponent
- The next observed state depends on the decision
of the opponent
- It is necessary for the robots to learn to
predict what the opponent is likely to do.
49Opponent Sampling
- Evolve two separate populations.
- In each generation, each population is evaluated
against an intelligently chosen sample of
networks from the other population.
- The population currently being evaluated is
called the host population, and the population
from which opponents are chosen is called the
parasite population
50Competition
- Each host was evaluated against the four highest
species champions. They are good opponents
because they are the best of the best species,
and they are guaranteed to be diverse because
their distance must exceed the species threshold.
- Another eight opponents were chosen randomly from
a Hall of Fame composed of all generation
champions. The Hall of Fame ensures that existing
abilities need to be maintained to obtain a high
fitness.
- Together, speciation, fitness sharing, opponent
sampling and Hall of Fame comprise an effective
competitive coevolution methodology.
51Population and Competition
- Each population had 256 networks
- Host networks received 1 point for each win and 0
for each loss
- Each host was evaluated in 24 games (12 opponents
x 2 games each)
- Of the 12, 4 were species champions and 8 were
Hall of Famers.
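
The evaluation scheme above (4 species champions plus 8 Hall of Fame members, 2 games each, 1 point per win) can be sketched like this. The function names and the `play_game` callback are hypothetical; the sketch assumes the champion list is already ordered best-first and does not model the robot simulation itself.

```python
import random


def sample_opponents(species_champions, hall_of_fame, rng,
                     n_champions=4, n_hall=8):
    """Pick the top species champions plus random Hall of Fame members."""
    champions = species_champions[:n_champions]  # assumed sorted best-first
    hall = rng.sample(hall_of_fame, min(n_hall, len(hall_of_fame)))
    return champions + hall


def host_fitness(host, opponents, play_game, games_per_opponent=2):
    """1 point per win, 0 per loss, over every game against every opponent."""
    return sum(
        1
        for opponent in opponents
        for game_index in range(games_per_opponent)
        if play_game(host, opponent, game_index)  # True if the host wins
    )


rng = random.Random(0)
champions = ["c1", "c2", "c3", "c4", "c5"]
hall = [f"h{i}" for i in range(20)]
opponents = sample_opponents(champions, hall, rng)


def always_win(host, opponent, game_index):
    # Toy stand-in for a real duel: the host wins every game.
    return True


score = host_fitness("host", opponents, always_win)
print(len(opponents), score)
```

With 12 opponents and 2 games each, the maximum score a host can reach is 24, matching the figures on the slide.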
52Difficulty of tournaments
- For example, if strategy A defeats 499 out of 500
opponents, and B defeats 498, counting will
designate A as superior to B even if B defeats A
in a direct comparison.
- In order to decisively track strategic
innovation, we need to identify dominant
strategies, i.e. those that defeat all previous
dominant strategies.
- This way, we can make sure that evolution
proceeds by developing a progression of strictly
more powerful strategies, instead of e.g.
switching between alternative ones.
53Dominance Tournament
- A run returns a record of every generation
champion from both populations
- A network a is superior to a network b if a wins
more games than b out of 288 total games with
different food placements
- A generation champion is the winner of a 288-game
comparison between the host and parasite
champions of a single generation
- The first dominant strategy $d_1$ is the first
generation champion
- The dominant strategy $d_j$, $j > 1$, is a
generation champion such that for all $i < j$,
$d_j$ is superior to $d_i$
- This process is called a dominance tournament
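
The dominance tournament defined above can be sketched in a few lines. The `is_superior` callback is hypothetical and stands in for the 288-game comparison; the integer "strategies" in the usage example are a toy assumption, used only to make the sketch runnable.

```python
def dominance_tournament(generation_champions, is_superior):
    """Return the sequence of dominant strategies d_1, d_2, ...

    A champion becomes the next dominant strategy only if it is superior
    to every dominant strategy found so far.
    """
    dominant = []
    for champion in generation_champions:
        # For the first champion the list is empty, so it becomes d_1.
        if all(is_superior(champion, d) for d in dominant):
            dominant.append(champion)
    return dominant


# Toy stand-in: strategies are integers, and a higher number wins.
champions = [3, 1, 4, 2, 5]
dominant = dominance_tournament(champions, lambda a, b: a > b)
print(dominant)
```

Each candidate is compared only against the current list of dominant strategies, not against every other champion, which keeps the number of comparisons small.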
54Features of a dominance tournament
- Fewer games than other tournaments
- Allows identification of a sequence of
increasingly sophisticated strategies (dominant
individuals)
55Results: 33 Evolutions
- Each of the 33 evolution runs took days,
depending on the progress of evolution and sizes
of the networks involved.
56Measuring Complexity
- Define complexity as the number of nodes and
connections in a network. The more nodes and
connections there are in the network, the more
complex behavior it can potentially implement.
- The results were analyzed to answer three
questions:
- (1) As evolution progresses does it also
continually complexify?
- (2) Does such complexification lead to more
sophisticated strategies?
- (3) Does complexification allow better strategies
to be discovered than does evolving
fixed-topology networks?
57Emergence of Complexity
The hashed lines represent the average over 33
runs of the structure (nodes and connections) of
the highest dominant network in each generation.
A hash mark appears each time a new dominant
network emerged. The two other lines represent
the average over five runs of the most and least
complex networks without fitness selection
(random assignment of fitness). This shows that
without fitness a wide range of complexity is
evolved.
58Emergence of Strategies
Dn: network with dominance level n
Sk: best network in species S at generation k
hl: lth hidden node to arise from a structural mutation
- Begin with S100: mature no-hidden-node strategy,
followed even when the opponent had more energy,
leaving it vulnerable to attack
- S200: evolved a resting strategy. Not a
complexification
- S267: h22 appeared. Switched between resting and
all-out attack
- S315: improved ability to attack at appropriate
times
59Best Complexifying Network
11 hidden nodes and 202 connections
60Fixed-Topology vs Complexification
61Conclusions
Complexifying Evolution only searches
higher-dimensional structures that are
elaborations of known good lower-dimensional
structures. The values of the existing genes have
already been optimized over preceding
generations. This may mean that the search in the
higher-dimensional space is starting in a
position of some advantage compared to a purely
random position in that space. This may explain
why this method is able to find solutions that
fixed topology coevolution cannot.