Title: Introductory Workshop on Evolutionary Computing
1Introductory Workshop on Evolutionary Computing
Part I Introduction to Evolutionary Algorithms
- Dr. Daniel Tauritz
- Director, Natural Computation Laboratory
- Associate Professor, Department of Computer Science
- Research Investigator, Intelligent Systems Center
- Collaborator, Energy Research Development Center
2Motivation
- Real-world optimization problems are typically characterized by huge, ill-behaved solution spaces
- Infeasible to exhaustively search
- Defy traditional (gradient-based) optimization algorithms because they are non-linear, non-differentiable, non-continuous, or non-convex
3Real-World Example
- Electric Power Transmission Systems
- Supply is not keeping up with demand
- Expansion hampered by
- Social, environmental, and economic constraints
- Transmission system is stressed
- Already carrying more than intended
- Dramatic increase in incidence reports
4The Grid
5The Grid: Failure
6The Grid: Redistribution
7The Grid: A Cascade
8The Grid: Redistribution
9The Grid: Unsatisfiable
10The Grid: Unsatisfiable
11Failure Analysis
- Failure spreads relatively quickly
- Too quickly for conventional control
- Cascade may be avoidable
- Utilize unused capacities (flow compensation)
- Unsatisfiable condition may be avoidable
- Better power flow control to reduce severity
12Possible Solution
- Strategically place a number of power flow control devices
- Flexible A/C Transmission System (FACTS) devices are a promising type of high-speed power-electronics power flow control devices
- Unified Power Flow Controller (UPFC)
13FACTS Interaction Laboratory
[Figure labels: UPFC, Simulation Engine, HIL Line]
14The placement optimization problem
- UPFCs are extremely expensive, so only a limited number can be placed
- Placement is a combinatorial problem
- Given 1000 high-voltage lines and 10 UPFCs, there are $\binom{1000}{10}$ total possible placements (about $2.6 \times 10^{23}$)
- If each placement is evaluated in 1 minute, then it will take about $5 \times 10^{15}$ centuries to solve using exhaustive search
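The arithmetic on this slide can be checked with a short sketch. The constant names are illustrative, and a year is assumed to have 365.25 days.

```python
import math

LINES = 1000    # high-voltage lines available for placement
DEVICES = 10    # UPFC devices to place
MINUTES_PER_CENTURY = 60 * 24 * 365.25 * 100  # assumes 365.25 days per year

# Number of ways to choose 10 lines out of 1000 (order does not matter).
placements = math.comb(LINES, DEVICES)

# One minute per evaluation, evaluated sequentially.
centuries = placements / MINUTES_PER_CENTURY

print(f"placements: {placements:.2e}")
print(f"centuries:  {centuries:.1e}")
```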
15The placement solution space
- Placements of individual UPFC devices are not independent tasks
- There are complex non-linear interactions between UPFC devices
- The placement solution space is ill-behaved, so traditional optimization algorithms are not usable
16Evolutionary Computing
- The field of Evolutionary Computing (EC) studies the theory and application of Evolutionary Algorithms (EAs)
- EAs can be described as a class of stochastic, population-based optimization algorithms inspired by natural evolution, genetics, and population dynamics
17Very high-level EA schematic
[Schematic labels: problem instance, EA, representation, fitness function, EA operators, EA parameters, solution]
18Intuitive view of why EAs work
- Trial-and-error (aka generate-and-test)
- Graduated solution quality creates virtual gradient
- Stochastic local search of solution landscape
20(Darwinian) Evolution
- The environment contains populations of individuals of the same species which are reproductively compatible
- Natural selection
- Random variation
- Survival of the fittest
- Inheritance of traits
21(Mendelian) Genetics
- Genotypes vs. phenotypes
- Pleiotropy: one gene affects multiple phenotypic traits
- Polygeny: one phenotypic trait is affected by multiple genes
- Chromosomes (haploid vs. diploid)
- Loci and alleles
22Nature versus the digital realm
Nature        Digital realm
Environment   Problem (solution space)
Fitness       Fitness function
Population    Set
Individual    Data structure
Genes         Elements
Alleles       Data type
23Scope
- Genotype: functional unit of inheritance
- Individual: functional unit of selection
- Population: functional unit of evolution
24Solution Representation
- Structural types: linear, tree, FSM, etc.
- Data types: bit strings, integers, permutations, reals, etc.
- EA genotype encodes solution representation and attributes
- EA phenotype expresses the EA genotype in the current environment
- Encoding and decoding
25Fitness Function
- Determines an individual's fitness-based selection chances
- Transforms the objective function to a linearly ordered set, with higher fitness values corresponding to higher quality solutions (i.e., solutions which better satisfy the objective function)
- Knapsack Problem Example
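The slide does not spell out the knapsack example, so the following is only a minimal sketch of one possible fitness function for the 0/1 knapsack problem. The bit-string genotype, the function name, and the choice to assign overweight selections a fitness of 0 are assumptions made for illustration.

```python
def knapsack_fitness(genotype, values, weights, capacity):
    """Fitness of a 0/1 knapsack genotype (hypothetical helper).

    genotype[i] == 1 means item i is packed. Higher fitness means a
    better solution; overweight selections get fitness 0 (one simple
    way to handle infeasibility, assumed here).
    """
    total_value = sum(v for bit, v in zip(genotype, values) if bit)
    total_weight = sum(w for bit, w in zip(genotype, weights) if bit)
    return total_value if total_weight <= capacity else 0


values = [60, 100, 120]
weights = [10, 20, 30]
print(knapsack_fitness([0, 1, 1], values, weights, capacity=50))
print(knapsack_fitness([1, 1, 1], values, weights, capacity=50))
```

Mapping every infeasible selection to 0 is easy to implement but gives evolution no hint about how far a selection is from feasibility; a graded penalty is a common alternative.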
26Initialization
- (Initial) population size
- Uniform random
- Heuristic based
- Knowledge based
- Genotypes from previous runs
- Seeding
27Parent selection
- Fitness Proportional Selection (FPS)
- Roulette wheel sampling
- High risk of premature convergence
- Uneven selective pressure
- Fitness function not transposition invariant
- Fitness Rank Selection
- Mapping function (like a cooling schedule)
- Tournament selection
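Two of the parent selection schemes listed above can be sketched as follows. The function names and signatures are illustrative, and the roulette wheel assumes non-negative fitness values with a positive sum.

```python
import random


def roulette_wheel(population, fitnesses, rng):
    """Fitness proportional selection via roulette wheel sampling.

    Assumes all fitness values are >= 0 and at least one is > 0.
    """
    pick = rng.uniform(0, sum(fitnesses))
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off


def tournament(population, fitnesses, k, rng):
    """Pick k distinct contestants at random and return the fittest one."""
    contestants = rng.sample(range(len(population)), k)
    best = max(contestants, key=lambda i: fitnesses[i])
    return population[best]


rng = random.Random(42)
pop = ["a", "b", "c"]
print(roulette_wheel(pop, [1.0, 3.0, 2.0], rng))
print(tournament(pop, [1.0, 3.0, 2.0], k=2, rng=rng))
```

Adding a constant to every fitness value changes the roulette wheel probabilities but not the tournament outcome, which illustrates why fitness proportional selection is not transposition invariant.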
28Variation operators
- Mutation: Stochastic unary variation operator
- Recombination: Stochastic multi-ary variation operator
29Mutation
- Bit-String Representation
- Bit-Flip
- $E[\#\text{flips}] = L \cdot p_m$ (L = string length, $p_m$ = per-bit mutation rate)
- Integer Representation
- Random Reset (cardinal attributes)
- Creep Mutation (ordinal attributes)
30Mutation cont.
- Floating-Point
- Uniform
- Non-uniform from fixed distribution
- Gaussian, Cauchy, Lévy, etc.
- Permutation
- Swap
- Insert
- Scramble
- Inversion
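A few of the mutation operators named on these two slides can be sketched as below. The function names are illustrative, and each operator returns a new list rather than modifying its argument.

```python
import random


def bit_flip(bits, p_m, rng):
    """Flip each bit independently with probability p_m."""
    return [1 - b if rng.random() < p_m else b for b in bits]


def swap_mutation(perm, rng):
    """Exchange the elements at two distinct random positions."""
    child = list(perm)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def inversion_mutation(perm, rng):
    """Reverse the segment between two distinct random positions."""
    child = list(perm)
    i, j = sorted(rng.sample(range(len(child)), 2))
    child[i:j + 1] = reversed(child[i:j + 1])
    return child


rng = random.Random(1)
print(bit_flip([0, 1, 0, 1, 1, 0], p_m=1 / 6, rng=rng))
print(swap_mutation([0, 1, 2, 3, 4], rng))
print(inversion_mutation([0, 1, 2, 3, 4], rng))
```

The permutation operators keep every element exactly once, so the result is always a valid permutation.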
31Recombination
- Recombination rate: asexual vs. sexual
- N-Point Crossover (positional bias)
- Uniform Crossover (distributional bias)
- Discrete recombination (no new alleles)
- (Uniform) arithmetic recombination
- Simple recombination
- Single arithmetic recombination
- Whole arithmetic recombination
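Three of the recombination operators listed above can be sketched as follows. The names, the two-child return convention, and the default mixing weight `alpha=0.5` are assumptions for illustration.

```python
import random


def one_point_crossover(p1, p2, rng):
    """Cut both parents at one random point and exchange the tails."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def uniform_crossover(p1, p2, rng):
    """For each position, swap the parents' genes with probability 0.5."""
    c1, c2 = [], []
    for a, b in zip(p1, p2):
        if rng.random() < 0.5:
            a, b = b, a
        c1.append(a)
        c2.append(b)
    return c1, c2


def whole_arithmetic(p1, p2, alpha=0.5):
    """Children are weighted averages of the parents at every position."""
    c1 = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
    c2 = [alpha * b + (1 - alpha) * a for a, b in zip(p1, p2)]
    return c1, c2


rng = random.Random(7)
print(one_point_crossover([0] * 6, [1] * 6, rng))
print(uniform_crossover([0] * 6, [1] * 6, rng))
print(whole_arithmetic([0.0, 2.0], [4.0, 6.0]))
```

The two crossover operators only rearrange alleles already present in the parents, whereas arithmetic recombination can create new allele values.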
32Survivor selection
- (μ+λ) plus strategy
- (μ,λ) comma strategy (aka generational)
- Typically fitness-based
- Deterministic vs. stochastic
- Truncation
- Elitism
- Alternatives include completely stochastic and
age-based
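The difference between the plus and comma strategies can be sketched with deterministic truncation. The function names are illustrative, higher fitness is assumed to be better, and the comma variant assumes at least μ offspring.

```python
def plus_selection(parents, offspring, fitness, mu):
    """(mu+lambda): the best mu of parents and offspring together survive."""
    return sorted(parents + offspring, key=fitness, reverse=True)[:mu]


def comma_selection(offspring, fitness, mu):
    """(mu,lambda): the best mu of the offspring survive; parents are discarded."""
    return sorted(offspring, key=fitness, reverse=True)[:mu]


parents = [9, 1]
offspring = [5, 4, 3]
identity = lambda x: x  # toy fitness: the value itself

print(plus_selection(parents, offspring, identity, mu=2))
print(comma_selection(offspring, identity, mu=2))
```

With truncation, the plus strategy is implicitly elitist because the best parent can never be lost, while the comma strategy can lose it.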
33Termination
- CPU time / wall time
- Number of fitness evaluations
- Lack of fitness improvement
- Lack of genetic diversity
- Solution quality / solution found
- Combination of the above
34Simple Genetic Algorithm (SGA)
- Representation: Bit-strings
- Recombination: 1-Point Crossover
- Mutation: Bit Flip
- Parent Selection: Fitness Proportional
- Survival Selection: Generational
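The SGA components listed above can be combined into a minimal sketch. The OneMax fitness function (count of 1-bits), all parameter defaults, and the best-so-far bookkeeping are assumptions chosen for illustration and are not taken from the slides.

```python
import random


def simple_ga(length=20, pop_size=30, p_c=0.7, p_m=None, generations=60, seed=0):
    """Minimal SGA sketch on OneMax. pop_size is assumed to be even."""
    rng = random.Random(seed)
    p_m = 1.0 / length if p_m is None else p_m
    fitness = sum  # OneMax: number of 1-bits (assumed toy problem)

    # Uniform random initialization of bit-strings.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    for _ in range(generations):
        # Fitness proportional parent selection (tiny offset avoids all-zero weights).
        weights = [fitness(ind) + 1e-9 for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)

        # 1-point crossover applied to consecutive pairs with probability p_c.
        offspring = []
        for a, b in zip(parents[::2], parents[1::2]):
            if rng.random() < p_c:
                cut = rng.randint(1, length - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            offspring += [a, b]

        # Bit-flip mutation, then generational replacement.
        pop = [[1 - g if rng.random() < p_m else g for g in ind] for ind in offspring]

        # Track the best individual seen so far (kept outside the population).
        best = max(pop + [best], key=fitness)
    return best


result = simple_ga()
print(sum(result), result)
```

Because survivor selection is generational, the population itself can lose its best member; the sketch only remembers it for reporting.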
35Problem solving steps
- Collect problem knowledge (at minimum solution representation and objective function)
- Define gene representation and fitness function
- Creation of initial population
- Parent selection, mate pairing
- Define variation operators
- Survival selection
- Define termination condition
- Parameter tuning
36Typical EA Strategy Parameters
- Population size
- Initialization related parameters
- Selection related parameters
- Number of offspring
- Recombination chance
- Mutation chance
- Mutation rate
- Termination related parameters
37EA Pros
- More general purpose than traditional optimization algorithms, i.e., less problem-specific knowledge required
- Ability to solve difficult problems
- Solution availability
- Robustness
- Inherent parallelism
38EA Cons
- Fitness function and genetic operators often not obvious
- Premature convergence
- Computationally intensive
- Difficult parameter optimization
39Behavioral aspects
- Exploration versus exploitation
- Selective pressure
- Population diversity
- Fitness values
- Phenotypes
- Genotypes
- Alleles
- Premature convergence
40Genetic Programming (GP)
- Characteristic property: variable-size hierarchical representation vs. fixed-size linear in traditional EAs
- Application domain: model optimization vs. input values in traditional EAs
- Unifying Paradigm: Program Induction
41Program induction examples
- Optimal control
- Planning
- Symbolic regression
- Automatic programming
- Discovering game playing strategies
- Forecasting
- Inverse problem solving
- Decision Tree induction
- Evolution of emergent behavior
- Evolution of cellular automata
42GP specification
- S-expressions
- Function set
- Terminal set
- Arity
- Correct expressions
- Closure property
- Strongly typed GP
43GP notes
- Mutation or recombination (not both)
- Bloat (survival of the fattest)
- Parsimony pressure
44Case Study employing GP: Deriving Gas-Phase Exposure History through Computationally Evolved Inverse Diffusion Analysis
45Introduction
46Background
- Indoor air pollution is among the top five environmental health risks
- 160 billion could be saved every year by improving indoor air quality
- Current exposure history is inadequate
- A reliable method is needed to determine past contamination levels and times
47Problem Statement
- A forward diffusion differential equation predicts concentration in materials after exposure
- An inverse diffusion equation finds the timing and intensity of previous gas contamination
- Knowledge of early exposures would greatly strengthen epidemiological conclusions
49Proposed Solution
- Use Genetic Programming (GP) as a directed search for the inverse equation
- Fitness based on forward equation
[Figure: example candidate expressions built from x, y, sin, cos, tan, e, and pi, with an expression tree containing the nodes X, Sin, and /]
50Related Research
- It has been proven that the inverse equation exists
- Symbolic regression with GP has successfully found both differential equations and inverse functions
- Similar inverse problems in thermodynamics and geothermal research have been solved
51Interdisciplinary Work
- Collaboration between Environmental Engineering, Computer Science, and Math
[Figure labels: Parent Selection, Forward Diffusion Equation, Competition, Reproduction, Genetic Programming Algorithm]
52Genetic Programming Background
[Figure: expression tree with nodes Sin, X, X, X, Pi]
53Summary
- Ability to characterize exposure history will
enhance ability to assess health risks of
chemical exposure
54Parameter Tuning
- A priori optimization of EA strategy parameters
- Start with stock parameter values
- Manually adjust based on user intuition
- Monte Carlo sampling of parameter values on a few (short) runs
- Meta-tuning algorithm (e.g., meta-EA)
55Parameter Tuning drawbacks
- Exhaustive search for optimal values of parameters, even assuming independence, is infeasible
- Parameter dependencies
- Extremely time consuming
- Optimal values are very problem specific
- Different values may be optimal at different evolutionary stages
56Parameter Control
- Blind
- Example: replace $p_i$ with $p_i(t)$
- akin to cooling schedule in Simulated Annealing
- Adaptive
- Example: Rechenberg's 1/5 success rule
- Self-adaptive
- Example: mutation-step size control
57Evaluation Function Control
- Example 1: Parsimony Pressure in GP
- Example 2: Penalty Functions in Constraint Satisfaction Problems (aka Constrained Optimization Problems)
58Penalty Function Control
- $eval(x) = f(x) + W \cdot penalty(x)$
- Deterministic example
- $W = W(t) = (C \cdot t)^{\alpha}$ with $C, \alpha \geq 1$
- Adaptive example
- Self-adaptive example
- Note: this allows evolution to cheat!
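The deterministic penalty weight on this slide can be sketched as below, assuming a minimization problem in which `penalty_x` is zero for feasible solutions. The function name and the default values of `C` and `alpha` are illustrative.

```python
def penalized_eval(f_x, penalty_x, t, C=1.0, alpha=2.0):
    """eval(x) = f(x) + W(t) * penalty(x), with W(t) = (C * t) ** alpha.

    t is the generation counter; C and alpha are assumed to be >= 1,
    so constraint violations cost more as the run progresses.
    """
    weight = (C * t) ** alpha
    return f_x + weight * penalty_x


# The same constraint violation becomes more expensive over time.
for t in (1, 10, 100):
    print(t, penalized_eval(f_x=3.0, penalty_x=2.0, t=t))
```

Early in the run infeasible solutions are only mildly penalized, which lets the search cross infeasible regions; later the growing weight pushes the population toward feasibility.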
59Parameter Control aspects
- What is changed?
- Parameters vs. operators
- What evidence informs the change?
- Absolute vs. relative
- What is the scope of the change?
- Gene vs. individual vs. population
- Ex.: one-bit allele for recombination operator selection (pairwise vs. vote)
60Parameter control examples
- Representation (GP: ADFs, delta coding)
- Evaluation function (objective function/)
- Mutation (ES)
- Recombination (Davis' adaptive operator fitness: implicit bucket brigade)
- Selection (Boltzmann)
- Population
- Multiple
61Self-Adaptive Mutation Control
- Pioneered in Evolution Strategies
- Now in widespread use in many types of EAs
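62Uncorrelated mutation with one σ
- Chromosomes: $\langle x_1, \ldots, x_n, \sigma \rangle$
- $\sigma' = \sigma \cdot \exp(\tau \cdot N(0,1))$
- $x'_i = x_i + \sigma' \cdot N_i(0,1)$
- Typically the "learning rate" $\tau \propto 1/n^{1/2}$
- And we have a boundary rule: $\sigma' < \varepsilon_0 \Rightarrow \sigma' = \varepsilon_0$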
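The single-step-size scheme on this slide can be sketched as below. The proportionality constant for the learning rate is assumed to be 1, and the function name and the default `eps0` are illustrative.

```python
import math
import random


def mutate_one_sigma(x, sigma, rng, eps0=1e-8):
    """Self-adaptive uncorrelated mutation with a single step size.

    The step size is mutated first (log-normal update), then used to
    perturb every coordinate with its own Gaussian draw.
    """
    n = len(x)
    tau = 1.0 / math.sqrt(n)  # learning rate, proportionality constant assumed 1
    new_sigma = sigma * math.exp(tau * rng.gauss(0.0, 1.0))
    new_sigma = max(new_sigma, eps0)  # boundary rule
    new_x = [xi + new_sigma * rng.gauss(0.0, 1.0) for xi in x]
    return new_x, new_sigma


rng = random.Random(0)
child, child_sigma = mutate_one_sigma([1.0, 2.0, 3.0], sigma=0.5, rng=rng)
print(child, child_sigma)
```

Mutating the step size before the object variables means a child is evaluated with the step size that actually produced it, which is what lets selection act on good step sizes.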
63Mutants with equal likelihood
- Circle: mutants having the same chance to be created
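64Uncorrelated mutation with n σs
- Chromosomes: $\langle x_1, \ldots, x_n, \sigma_1, \ldots, \sigma_n \rangle$
- $\sigma'_i = \sigma_i \cdot \exp(\tau' \cdot N(0,1) + \tau \cdot N_i(0,1))$
- $x'_i = x_i + \sigma'_i \cdot N_i(0,1)$
- Two learning rate parameters
- $\tau'$: overall learning rate
- $\tau$: coordinate-wise learning rate
- $\tau' \propto 1/(2n)^{1/2}$ and $\tau \propto 1/(2n^{1/2})^{1/2}$
- $\tau'$ and $\tau$ have individual proportionality constants which both have default values of 1
- $\sigma'_i < \varepsilon_0 \Rightarrow \sigma'_i = \varepsilon_0$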
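The per-coordinate scheme on this slide can be sketched in the same way, using the default proportionality constants of 1. The function name and the default `eps0` are illustrative.

```python
import math
import random


def mutate_n_sigmas(x, sigmas, rng, eps0=1e-8):
    """Self-adaptive uncorrelated mutation with one step size per coordinate."""
    n = len(x)
    tau_prime = 1.0 / math.sqrt(2.0 * n)       # overall learning rate
    tau = 1.0 / math.sqrt(2.0 * math.sqrt(n))  # coordinate-wise learning rate

    shared = rng.gauss(0.0, 1.0)  # one draw shared by all step sizes
    new_sigmas = [
        max(s * math.exp(tau_prime * shared + tau * rng.gauss(0.0, 1.0)), eps0)
        for s in sigmas
    ]
    new_x = [xi + si * rng.gauss(0.0, 1.0) for xi, si in zip(x, new_sigmas)]
    return new_x, new_sigmas


rng = random.Random(0)
print(mutate_n_sigmas([1.0, 2.0, 3.0], [0.5, 0.1, 1.0], rng))
```

The shared draw scales all step sizes together, while the per-coordinate draws let individual step sizes diverge.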
65Mutants with equal likelihood
- Ellipse: mutants having the same chance to be created
66Correlated mutations
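- Chromosomes: $\langle x_1, \ldots, x_n, \sigma_1, \ldots, \sigma_n, \alpha_1, \ldots, \alpha_k \rangle$
- where $k = n(n-1)/2$
- and the covariance matrix $C$ is defined as:
- $c_{ii} = \sigma_i^2$
- $c_{ij} = 0$ if $i$ and $j$ are not correlated
- $c_{ij} = \frac{1}{2}(\sigma_i^2 - \sigma_j^2)\tan(2\alpha_{ij})$ if $i$ and $j$ are correlated
- Note the numbering / indices of the $\alpha$s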
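67Correlated mutations cont'd
- The mutation mechanism is then:
- $\sigma'_i = \sigma_i \cdot \exp(\tau' \cdot N(0,1) + \tau \cdot N_i(0,1))$
- $\alpha'_j = \alpha_j + \beta \cdot N(0,1)$
- $x' = x + N(0, C')$
- $x$ stands for the vector $\langle x_1, \ldots, x_n \rangle$
- $C'$ is the covariance matrix $C$ after mutation of the $\alpha$ values
- $\tau' \propto 1/(2n)^{1/2}$ and $\tau \propto 1/(2n^{1/2})^{1/2}$ and $\beta \approx 5^{\circ}$
- $\sigma'_i < \varepsilon_0 \Rightarrow \sigma'_i = \varepsilon_0$ and
- $|\alpha'_j| > \pi \Rightarrow \alpha'_j = \alpha'_j - 2\pi \, \mathrm{sign}(\alpha'_j)$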
68Mutants with equal likelihood
- Ellipse: mutants having the same chance to be created
69Learning Classifier Systems (LCS)
- Note: LCS is technically not a type of EA, but can utilize an EA
- Condition-Action Rule Based Systems
- rule format: <condition:action>
- Reinforcement Learning
- LCS rule format:
- <condition:action> → predicted payoff
- don't care symbols
71LCS specifics
- Multi-step credit allocation: Bucket Brigade algorithm
- Rule Discovery Cycle: EA
- Pitt approach: each individual represents a complete rule set
- Michigan approach: each individual represents a single rule, a population represents the complete rule set
72Multimodal Problems
- Multimodal def.: multiple local optima and at least one local optimum is not globally optimal
- Basins of attraction and niches
- Motivation for identifying a diverse set of high quality solutions
- Allow for human judgement
- Sharp peak niches may be overfitted
73Restricted Mating
- Panmictic vs. restricted mating
- Finite population size with panmictic mating -> genetic drift
- Local Adaptation (environmental niche)
- Punctuated Equilibria
- Evolutionary Stasis
- Demes
- Speciation (end result of increasingly
specialized adaptation to particular
environmental niches)
74Implicit Diversity Maintenance (1)
- Multiple runs of standard EA
- Non-uniform basins of attraction problematic
- Island Model (coarse-grain parallel)
- Punctuated Equilibria
- Epoch, migration
- Communication characteristics
- Initialization: number of islands and respective population sizes
75Implicit Diversity Maintenance (2)
- Diffusion Model EAs
- Single Population, Single Species
- Overlapping demes distributed within Algorithmic Space (e.g., grid)
- Equivalent to cellular automata
- Automatic Speciation
- Genotype/phenotype mating restrictions
76Explicit Diversity Maintenance
- Fitness Sharing: individuals share fitness within their niche
- Crowding: replace similar parents
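The slide does not give a formula for fitness sharing, so the following sketch uses one commonly used formulation as an assumption: each raw fitness is divided by a niche count built from the sharing function sh(d) = 1 - (d / sigma_share)^alpha for d < sigma_share and 0 otherwise. One-dimensional real-valued individuals and all names are illustrative.

```python
def shared_fitness(points, fitnesses, sigma_share=1.0, alpha=1.0):
    """Divide each raw fitness by its niche count (assumed formulation).

    points are 1-D real-valued individuals; distance is |p - q|.
    Individuals in crowded niches end up with lower shared fitness.
    """
    shared = []
    for p, raw in zip(points, fitnesses):
        niche_count = 0.0
        for q in points:
            d = abs(p - q)
            if d < sigma_share:
                niche_count += 1.0 - (d / sigma_share) ** alpha
        # niche_count >= 1 because every individual is at distance 0 from itself
        shared.append(raw / niche_count)
    return shared


# Two individuals crowd the same spot; the isolated one keeps its full fitness.
print(shared_fitness([0.0, 0.0, 5.0], [6.0, 6.0, 6.0]))
```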
77Multi-Objective EAs (MOEAs)
- Extension of regular EA which maps multiple objective values to single fitness value
- Objectives typically conflict
- In a standard EA, an individual A is said to be better than an individual B if A has a higher fitness value than B
- In a MOEA, an individual A is said to be better than an individual B if A dominates B
78Domination in MOEAs
- An individual A is said to dominate individual B iff:
- A is no worse than B in all objectives
- A is strictly better than B in at least one objective
79Pareto Optimality
- Given a set of alternative allocations of, say,
goods or income for a set of individuals, a
movement from one allocation to another that can
make at least one individual better off without
making any other individual worse off is called a
Pareto Improvement. An allocation is Pareto
Optimal when no further Pareto Improvements can
be made. This is often called a Strong Pareto
Optimum (SPO).
80Pareto Optimality in MOEAs
- Among a set of solutions P, the non-dominated subset of solutions P' are those that are not dominated by any member of the set P
- The non-dominated subset of the entire feasible search space S is the globally Pareto-optimal set
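The domination test and the non-dominated subset can be sketched directly from their definitions. The sketch assumes that every objective is to be maximized and that solutions are tuples of objective values; the function names are illustrative.

```python
def dominates(a, b):
    """True iff a is no worse than b in all objectives and strictly
    better in at least one (all objectives maximized, by assumption)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(solutions):
    """Members of the set that no other member dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


candidates = [(1, 3), (2, 2), (1, 1), (3, 1)]
print(non_dominated(candidates))
```

The pairwise check is quadratic in the number of solutions, which is fine for a sketch but is usually replaced by faster sorting schemes in practice.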
81Goals of MOEAs
- Identify the Global Pareto-Optimal set of solutions (aka the Pareto Optimal Front)
- Find a sufficient coverage of that set
- Find an even distribution of solutions
82MOEA metrics
- Convergence: How close is a generated solution set to the true Pareto-optimal front?
- Diversity: Are the generated solutions evenly distributed, or are they in clusters?
83Deterioration in MOEAs
- Competition can result in the loss of a non-dominated solution which dominated a previously generated solution
- This loss in its turn can result in the previously generated solution being regenerated and surviving
84Game-Theoretic Problems
- Adversarial search: multi-agent problem with conflicting utility functions
- Ultimatum Game
- Select two subjects, A and B
- Subject A gets 10 units of currency
- A has to make an offer (ultimatum) to B, anywhere from 0 to 10 of his units
- B has the option to accept or reject (no negotiation)
- If B accepts, A keeps the remaining units and B the offered units; otherwise they both lose all units
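The payoff rule of the Ultimatum Game described above can be written as a small function. The function name and the (A, B) return convention are illustrative.

```python
def ultimatum_payoffs(offer, accepts, total=10):
    """Return the payoffs (A, B) for one round of the Ultimatum Game.

    A offers `offer` of its `total` units to B. If B accepts, A keeps
    the rest; if B rejects, both end up with nothing.
    """
    if not 0 <= offer <= total:
        raise ValueError("offer must be between 0 and total")
    return (total - offer, offer) if accepts else (0, 0)


print(ultimatum_payoffs(3, accepts=True))
print(ultimatum_payoffs(3, accepts=False))
```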
85Real-World Game-Theoretic Problems
- Real-world examples
- economic and military strategy
- arms control
- cyber security
- bargaining
- Common problem: real-world games are typically incomputable
86Armsraces
- Military armsraces
- Prisoner's Dilemma
- Biological armsraces
87Approximating incomputable games
- Consider the space of each user's actions
- Perform local search in these spaces
- Solution quality in one space is dependent on the search in the other spaces
- The simultaneous search of co-dependent spaces is naturally modeled as an armsrace
88Evolutionary armsraces
- Iterated evolutionary armsraces
- Biological armsraces revisited
- Iterated armsrace optimization is doomed!
89Coevolutionary Algorithm (CoEA)
- A special type of EA where the fitness of an individual is dependent on other individuals (i.e., individuals are explicitly part of the environment)
- Single species vs. multiple species
- Cooperative vs. competitive coevolution
90CoEA difficulties (1)
- Disengagement
- Occurs when one population evolves so much faster than the other that all individuals of the other are utterly defeated, making it impossible to differentiate between better and worse individuals, without which there can be no evolution
91CoEA difficulties (2)
- Cycling
- Occurs when populations have lost the genetic knowledge of how to defeat an earlier generation adversary and that adversary re-evolves
- Potentially this can cause an infinite loop in which the populations continue to evolve but do not improve
92CoEA difficulties (3)
- Suboptimal Equilibrium
- (aka Mediocre Stability)
- Occurs when the system stabilizes in a suboptimal
equilibrium
93Case Study from Critical Infrastructure Protection
- Infrastructure Hardening
- Hardenings (defenders) versus contingencies (attackers)
- Hardenings need to balance spare flow capacity with flow control
94Case study from Automated Software Engineering: Coevolutionary Automated Software Correction (CASC)
95Objective: Find a way to automate the process of software testing and correction.
Approach: Create a Coevolutionary Automated Software Correction (CASC) system which will take a software artifact as input and produce a corrected version of the software artifact as output.
97Coevolutionary Cycle
98Population Initialization
99Population Initialization
100Population Initialization
101Population Initialization
102Initial Evaluation
103Initial Evaluation
104Reproduction Phase
105Reproduction Phase
106Reproduction Phase
107Evaluation Phase
108Evaluation Phase
109Competition Phase
110Competition Phase
111Termination
112Termination