Title: 3. Algorithms for self-configuration and evolution
1. Algorithms for self-configuration and evolution
- General perspective on search, optimization and adaptation algorithms
- Essence of evolutionary algorithms
- Details of operation of Genetic Algorithms
- Multi-criteria optimization, hybrid search
2. Objectives: control self-configuration for desired functionality
- A control C creates a structure/topology/architecture S that has the function F.
- Specification is in terms of S or F. F may include constraints, preferences, etc.
- Behavior/function may change in time; in the simple case it doesn't.
- C is often a single control, even for a set of states which can be decomposed, but it could be a sequence C1, C2, C3 if the system has memory.
- Digital or analog controls (analog signals often obtained by conversion from digital).
3. Search problems
- A search space consists of a set of objects for potential consideration during the search.
- One may search for the minimum of a function, or for a circuit that performs the function. Points in the search space are candidate solutions (or simply solutions).
- The goal of search problems is finding solutions that respect requirements/desired properties, defined usually with respect to a function called the objective function f.
- If the search goal is optimization, the goal is to maximize f, in which case it is called a utility or fitness function, or figure of merit; or to minimize f, in which case it is often called a cost function or energy.
- If the search goal is constraint satisfaction, f measures the degree to which a solution violates the constraints and is called a penalty function; the goal is to reduce it to zero.
- N. Radcliffe, in Handbook of Evolutionary Computation.
4. Search landscapes
- Find peaks or valleys in a rugged landscape.
- If looking for peaks, higher means a higher degree of satisfying the objective in maximization problems. (Wright)
- In an inverted perspective, populations advance toward lower behavioral error. (Atmar)
- Searching for peaks depicts evolution as slow and fragile: an optimized solution may quickly fall off the peak under slight perturbations. In the inverted perspective, once a solution is reached, stagnation sets in. In varying environments this never happens.
5. Problem landscapes
- Landscape in space
- Linear functions contain one global maximum that can be reached using gradient methods.
- Deceptive functions contain isolated optima; the best points tend to be surrounded by the worst.
- Landscape in time
- Stationary environment
- Non-stationary/dynamic environment
- Link to one example of a deceptive function: http://www.cs.unr.edu/humphrey/deceptive.html
6. Search/optimization algorithms and NFL theorems
- Start with an initial "guess" at a solution.
- The estimated solution is updated on an iteration-by-iteration basis with the aim of improving the performance measure (objective function).
- Multiple variables influence the function: a multivariable optimization problem of minimizing or maximizing an objective function.
- No Free Lunch theorem: no search algorithm is uniformly better than all other algorithms across all possible problems.
- Cheaper lunches in certain places: some algorithms may work better than others on certain classes of problems as a consequence of being able to exploit the problem structure.
- E.g. traditional nonlinear programming methods (e.g., constrained conjugate gradient) are well suited to deterministic optimization problems with exact knowledge of the gradient of the objective function; more generally, stochastic gradient methods are effective if one has direct (unbiased) measurements of the gradient of the objective function.
7. Search techniques
Scope of evolutionary algorithms: discontinuous, non-differentiable, multimodal and noisy response surfaces.
8. No-gradient population-based searches
- Explicit modeling information describing the relationship between the parameters to be optimized and the objective function is often lacking.
- No direct measurements of the gradient/derivative of the objective function.
- A class of recursive optimization algorithms that rely on measurements of only the objective function to be optimized.
- Population-based searches (with generate-and-test strategies).
- Evolutionary Algorithms: search algorithms based on a loose analogy with evolution in nature and the mechanisms of natural selection and genetics. Key characteristic features: survival of the fittest, and variation.
- Found to outperform other techniques on difficult problems, including search on discontinuous, non-differentiable, multimodal, noisy response surfaces.
9. Principles of evolutionary processes
- Genetic program -> genotype; expressed behavioral traits -> phenotype.
- Pleiotropy: a single gene may simultaneously affect several phenotypic traits.
- Polygeny: a single phenotypic characteristic may be determined by the simultaneous interaction of many genes.
- Epistasis: expression of one gene masks the phenotypic effects of another.
- There are no one-gene, one-trait relationships in naturally evolved systems.
- Very different genetic structures may code for equivalent behaviors, e.g. the various circuits that implement a function with electronic components.
10. Selected characteristics of neo-Darwinism and EA
- The individual is the primary target of selection.
- Genetic variation is largely a chance phenomenon: a stochastic process.
- Genotypic variation is largely a product of recombination and only ultimately of mutation.
- Gradual evolution may incorporate phenotypic discontinuities.
- Selection is probabilistic, not deterministic.
Mayr (1988), also in Handbook of Evolutionary Computation.
11. Principle of operation of evolutionary algorithms
- Coding solutions as chromosomes. The algorithm operates on the code, not on the solution.
- A string is a candidate solution.
- Switch states: 11011 (bitstring)
- Program: (x(x (- x 1)))
- Vector: (4.3 3.2 500)
The basic loop:
1. Initialize a population of candidate solutions.
2. Evaluate the population.
3. Acceptable solution found? If yes, output the solution.
4. If no, select the best individuals.
5. Use genetic operators to create a new population based on the old one; go to 2.
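The loop above can be sketched as a minimal generational GA. Everything concrete here (truncation selection of the top half, one-point crossover, a 2% bit-flip rate, the toy OneMax objective) is an illustrative assumption, not a setting from the slides.

```python
import random

def evolve(fitness, n_bits=16, pop_size=20, generations=50, seed=0):
    """Minimal generational GA: evaluate, select, recombine, mutate, repeat."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]              # truncation selection of the best
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)            # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < 0.02) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness)

# Toy objective ("OneMax"): maximize the number of 1s; the optimum is n_bits.
best = evolve(fitness=sum)
```

With only selection, crossover and mutation, the best individual climbs quickly toward the all-ones optimum, which is the whole mechanism of the loop in miniature.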
12. Mechanisms that power the EA
- Survival of the best (selection pressure): chooses the basis for new samples; in normal contexts it tends to cluster solutions around the best point, or around a few best points if some Pareto selection is made and solutions are ranked according to multiple criteria.
- Diversification of search (mutation): trying to expand the search domain to new areas by modifying the code at random.
- Exploitation of search (cross-over): trying to combine known search areas by combining codes at random.
- Selection clusters solutions, diversification spreads them: a fine balance needs to be maintained during the search to avoid premature convergence into local optima or, at the other extreme, with too much mutation, a derivative of a random walk.
13. Types of evolutionary algorithms
- Basic components of EAs:
- Representation
- Fitness evaluation function
- Selection
- Recombination
- Crossover
- Mutation
Classes of EAs: Evolutionary Strategies (ES), Evolutionary Programming (EP), Genetic Algorithms (GA), Genetic Programming (GP).
14. Evolutionary Programming / Evolutionary Strategies
- Evolutionary Programming (Lawrence J. Fogel, 1962)
- Representation: state machines
- Motivation: forecast nonstationary series
- A population of state machines is deterministically selected based on a fitness value assigned according to the performance in the particular task.
- The selected machines are then mutated, creating a new generation of state machines, i.e. a new population.
- Evolutionary Strategies (Rechenberg and Schwefel, 1965)
- Representation: a tuple (X, σ) ∈ R^n × R, where X is a real vector representing an individual and σ is a control parameter for the mutation operator.
- Motivation: mechanical shape optimization
- (μ+λ) Evolutionary Strategy: a total of μ individuals makes up the population. λ new individuals are then selected among the original μ individuals, each with probability 1/μ. The new pool of (μ+λ) individuals undergoes a process of mutation and recombination to generate a new population.
Mutation in Evolutionary Strategies: a new mutation parameter σ(t+1) is computed (log-normally, σ(t+1) = σ(t)·exp(τ·N(0,1))) and used to generate the new individual X(t+1) = X(t) + σ(t+1)·Z, where Z is a standard normal random vector and τ is a control parameter.
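The ES mutation just described can be sketched as follows; the log-normal σ update is the standard uncorrelated one-σ scheme, and the default τ = 1/√n is an assumption, not a value from the slides.

```python
import math
import random

def es_mutate(x, sigma, rng, tau=None):
    """ES-style mutation: self-adapt the step size sigma log-normally,
    then perturb every component of x with the new sigma."""
    n = len(x)
    tau = tau if tau is not None else 1.0 / math.sqrt(n)   # assumed learning rate
    sigma_new = sigma * math.exp(tau * rng.gauss(0.0, 1.0))  # mutate sigma first
    x_new = [xi + sigma_new * rng.gauss(0.0, 1.0) for xi in x]  # then the variables
    return x_new, sigma_new

rng = random.Random(1)
child, s = es_mutate([0.0, 0.0, 0.0], sigma=0.5, rng=rng)
```

Mutating σ before X is what lets the strategy parameter co-evolve with the solution, as the (X, σ) tuple representation implies.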
15. Genetic Programming
- Motivation for GP (Koza, early 90s): evolutionary algorithms to evolve computer programs.
- Representation: trees composed of functional and terminal nodes.
- Genetic programming breeds a population of rooted, point-labeled trees (trees without cycles) with ordered branches, as opposed to the labeled cyclic graphs encountered in the world of electrical circuits.
- Developmental approach to map circuits to GP trees:
- A GP tree encodes a set of instructions to build a circuit from an initial (embryonic) one.
- The circuit structure encompasses a fixed and a modifiable part: the fixed part contains the circuit source and load; the modifiable part consists of pieces of wires subject to the application of building instructions.
16. Constructing circuits with Genetic Programming
[Figure: a constructing GP tree (LIST, SERIES and END nodes, with writing heads WR1, WR2, WR3) is executed against an embryonic circuit (Vin, Rin, Rload, GND plus modifiable wires); development produces the final circuit with evolved inductor and capacitor values (589u, 218u, 0.0367u, 15200n, 5490n).]
17. Use of domain knowledge
- Claim: EAs are blind (i.e. work well with no info about the problem domain).
- Fact: seeing may help. Incorporation of domain knowledge helps the search.
- Representation (e.g. contingency of physical cells)
- E.g. useful for not destroying useful blocks by crossover.
- Specific operators
- E.g. known good sub-circuits can get a higher probability for insertion.
- Constraints
- E.g. restrictions on interconnect (cells, terminals); increased probability for using certain connects (e.g. more VDD points).
18. Representations
- Binary strings, real-valued vectors, trees, state machines.
- Representation has a critical role in determining performance.
- Some say: choose the representation that is most suitable for the search algorithm.
- My take: choose the representation most suitable to the problem, then choose the search algorithm.
- A good representation should be:
- Simple and compact (small chromosomes)
- Flexible, to map solutions of various sizes and shapes
- Representations can be of fixed or variable size, respectively processing chromosomes of fixed or variable length. The improved flexibility of variable-length representations may be accompanied by drawbacks, such as the bloat problem.
19. Representations
- Epistasis: expression of one gene masks the phenotypic effects of another.
- Problems with low epistasis are too easy for GAs.
- Problems with medium epistasis are the most adequate for GAs.
- Problems with strong epistasis are too difficult for GAs, being essentially random search problems.
- Use a less epistatic representation!
20. Common representations in EHW
- Binary representations when evolving circuits on reconfigurable devices (JPL).
- Tree representation of analog circuits, with developmental rules to grow the circuits (Koza).
- Linear representation of analog circuits, with developmental rules to grow the circuits (Lohn).
- Linear representation with correction for invalid circuits and increasing-length chromosomes (JPL).
21. From representation to manifestation
- Chromosome A1 A2 A3, where each block Ai has the structure B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 B11 B12.
- Fields: block function, block interconnect, analog signal, parameter of a passive component. Example decoding:
- 101 -> 2-input amp with gain g1, out of 8 choices
- 0101 1000 0011 -> NESW connections for 2 inputs and the output: E and W connect to In 1, N connects to In 2, Out goes to S and W
- 101 -> an analog bias of 6/8 on the bias node
- 1111 -> selection of a 10K resistor
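A decoder for one such block might look like the sketch below. The field widths and mappings are hypothetical reconstructions from the example above (and a plain value/8 bias mapping is used, although the slide's "6/8 for code 101" suggests the real encoding may be offset).

```python
def decode_block(bits):
    """Hypothetical decoder for one block of the representation above:
    3 bits of block function, 3 x 4 bits of NESW interconnect masks,
    3 bits of analog bias, 4 bits selecting a passive-component value.
    All field widths and mappings are assumptions for illustration."""
    func = int(bits[0:3], 2)                                  # one of 8 block functions
    connect = [bits[3 + 4 * i: 7 + 4 * i] for i in range(3)]  # NESW connection masks
    bias = int(bits[15:18], 2) / 8.0                          # bias as a fraction (assumed value/8)
    r_sel = int(bits[18:22], 2)                               # index into a component table
    return func, connect, bias, r_sel

func, connect, bias, r_sel = decode_block("101" + "010110000011" + "101" + "1111")
```

The point of the sketch is only that the flat bitstring the GA manipulates maps mechanically onto a structured circuit description.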
22. Measurement and evaluation of individuals
- Testbench, or in-system measurement
- Stimulation signals
- Load on the output signals
- What do we measure in the testbench?
- Time response, taking samples
- Often A/D conversion for processing in digital
- Frequency response, directly (spectrum analyzer) or indirectly (FFT)
- Other measures, such as current
- Derived effects (if the electrical device controls something else)
- How we assess the quality of an individual: an overall fitness value is determined based on the individual fitness functions associated with the testbench, and their weighting.
23. Objective functions
The objective function evaluates how well each individual performs. Goal: maximize the objective function. Standard method: compute a distance to a target.
Fitness F is computed over n samples: Ri is the individual response, Ti the target response, and Wi weights reflecting some knowledge of the problem.
The design of a good fitness evaluation function is critical for evolution.
24. Components of the fitness function
In evolving electronic circuits in simulations, three kinds of circuit analysis are used: transient, DC transfer, and small-signal analysis.
F = Σi wi · ei^k
w - weight vector; ei - error of output sample i relative to the desired response; k - power index applied to the error, commonly k = 2; i - index over the time domain, DC transfer domain or frequency domain.
- To improve the performance:
- Set the weight vector components according to the problem.
- Consider shape/significant-point descriptors for the analysis domain, e.g. peaks/valleys for DC analysis, cut-off frequency for AC analysis, particular time intervals for transient analysis.
- Probe internal points.
- Co-evolve the weights.
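The weighted-error fitness F = Σi wi·ei^k is a one-liner in code (cost form, lower is better; the sample values are made up for illustration):

```python
def fitness(response, target, weights, k=2):
    """F = sum_i w_i * |R_i - T_i|**k over the n samples (cost: lower is better)."""
    return sum(w * abs(r - t) ** k for r, t, w in zip(response, target, weights))

# A perfect match costs zero; heavier weights emphasize significant samples.
perfect = fitness([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
weighted = fitness([1.0, 3.0], [1.0, 2.0], [1.0, 2.0])  # error 1 on a weight-2 sample
```

Raising k sharpens the penalty on large deviations, which is why k = 2 (squared error) is the common default.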
25. Improvement in individual and population
- Often we care only about the best individual.
- Sometimes we care about a population:
- For monitoring purposes, to understand better what is going on.
- For fault tolerance we may want several good mutants: a fault gives a mutant which still has high fitness.
[Plot: fitness of the best individual, and average fitness across the population, improving over the generations.]
26. Selection
- Based on the principle of survival of the fittest.
- Better candidate solutions get more offspring with the same/close genetic code.
- Deterministic in ES and EP.
- Probabilistic in GA and GP.
- Selection techniques:
- Proportional selection
- Rank-based selection
- Exponential selection
- Tournament selection
Proportional selection (roulette-wheel selection): spin the roulette. The size of an individual's slice of the wheel is proportional to its fitness, so individuals with higher fitness have a higher probability of being selected for mating.
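Roulette-wheel selection can be sketched directly from that description; the four-individual population and its fitness values are invented for the demonstration.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Proportional (roulette-wheel) selection: each individual's slice of the
    wheel is proportional to its (non-negative) fitness."""
    total = sum(fitnesses)
    spin = rng.uniform(0.0, total)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if spin <= running:
            return individual
    return population[-1]  # guard against floating-point round-off

rng = random.Random(0)
# 'd' owns 7/10 of the wheel, so it should dominate repeated draws.
picks = [roulette_select("abcd", [1, 1, 1, 7], rng) for _ in range(1000)]
```

Note that proportional selection is sensitive to the fitness scale; rank-based and tournament selection (also listed above) avoid that by using only the ordering.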
27. Steady-state/generational, elitism
- Stationary environment. Steady-state: we keep part of the old population.
- Non-stationary/dynamic environment. Generational: regenerate the entire population.
- Elitist GA: keep the best from the old population.
- Keep the best, and some not as good.
28. Keeping search diversified
- Islands: regions of search.
- Avoids clustering in regions that temporarily look most promising (a gold rush?).
- Works well for dynamic environments, when the best solution at some moment may not be good at the next.
29. Exploitation by recombination
- Combine good parents to exploit current solutions.
- Replace parents (some or all) with offspring.
30. Crossover
[Diagram: two parents exchange the segment between two cut points (2-point crossover) to produce two offspring.]
- Recombination of genetic material contributes to the variability in the population.
- Harmful effects: destroying potentially useful building blocks.
- Automatically Defined Functions (ADFs): protection against the disruptive effect of crossover.
31. Various crossovers; mutation as crossover with a random string
[Diagram: 1-point, n-point and uniform crossover, each mapping a pair of parents to a pair of offspring.]
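The three crossover variants in the diagram can be sketched as follows (list chromosomes and an 8-bit example are assumptions for illustration):

```python
import random

def one_point(a, b, rng):
    """1-point crossover: swap the tails after a single random cut."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def n_point(a, b, n, rng):
    """n-point crossover: swap the tails at each of n sorted random cut points."""
    c1, c2 = list(a), list(b)
    for cut in sorted(rng.sample(range(1, len(a)), n)):
        c1[cut:], c2[cut:] = c2[cut:], c1[cut:]
    return c1, c2

def uniform(a, b, rng, p=0.5):
    """Uniform crossover: swap each locus independently with probability p."""
    c1, c2 = list(a), list(b)
    for i in range(len(a)):
        if rng.random() < p:
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2

rng = random.Random(3)
p1, p2 = [0] * 8, [1] * 8
kids = [one_point(p1, p2, rng), n_point(p1, p2, 2, rng), uniform(p1, p2, rng)]
```

All three operators only rearrange the parents' alleles: at every locus the two children together still carry exactly the two parental bits, which is what makes crossover exploitation rather than diversification.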
32. Schemata, building blocks, ADFs
- GA/GP (EAs using crossover) rely on building-block theory: useful components of what makes a solution (chunks of chromosomes) can be efficiently manipulated and used to lead to the solution.
- A problem decomposition.
- Looking for similarities: patterns in the chromosomes of similarly performing solutions.
- 1100 -> 10
- 0010 -> 3
- 0101 -> 4
- 1101 -> 20, so 11xx may be a good building block.
- Schema: the set of all combinations based on the same pattern.
- Goldberg: ensure BB supply and growth, understand BB speed, ensure good BB decisions, know BB challenges, ensure good BB mixing.
- Crossover probability: rules of thumb.
33. Mutation
- Each bit of a new string can be changed (mutated) with a probability given by the mutation rate.
- Low values for the mutation rate are often used.
- Traditional interpretation: only a support for crossover.
- More recent voices: the driving force of GAs (something other EA camps have always stated), or it depends on the problem/representation.
- GA performance is largely affected by the mutation rate.
[Diagram: crossover of Parent 1 with a random string as Parent 2 produces the offspring, i.e. mutation.]
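The slide's view of mutation as crossover with a random string can be sketched as below (the rate and string length are arbitrary illustration values):

```python
import random

def mutate_via_random_string(parent, rate, rng):
    """Mutation expressed as uniform crossover with a random string: at each
    locus, with probability `rate`, take the random string's bit instead of
    the parent's. Since the random bit equals the parent's bit half the time,
    the effective flip probability per locus is rate / 2."""
    random_string = [rng.randint(0, 1) for _ in parent]
    return [r if rng.random() < rate else p
            for p, r in zip(parent, random_string)]

rng = random.Random(0)
child = mutate_via_random_string([0] * 10000, rate=0.5, rng=rng)
ones = sum(child)  # expect roughly 10000 * 0.5 * 0.5 = 2500 flipped bits
```

This is why the two operator families blur into each other: the same swap mechanism yields recombination when the second string is another parent, and mutation when it is random noise.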
34. Adaptive operators
- Adapting the probabilities associated with the evolutionary operators improves convergence:
- Crossover probability
- Mutation probability
- Change representation
- Change selection probability and method; everything is permitted.
35. Specific crossover and mutation
Example of a mutation operation. Examples from T. Arslan's work.
36. Specific Genetic Algorithm implementation
- Problem-specific crossover:
- Pipeline identified in Parent 1
- Retime identified in Parent 2
- Transformations crossed over to produce 2 children
Examples from T. Arslan's work.
37. Population size, generations, stopping, multiple runs
- Fixed or variable population size.
- Small populations need more generations, and vice-versa; a balance is needed.
- 100 individuals is very common.
- Usually GP asks for more, e.g. 640,000 in some of Koza's experiments.
- Hundreds of generations.
- Sampling a small fraction of the space.
- See if it is still improving, tracking the amount of change in recent generations.
- Stop on: number of generations, time, lack of improvement.
- Re-start, change the initial population, seed with solutions.
38. Evolutionary algorithms visualized
[Figure: one generation of a GA on a population of 512 chromosomes of 24 bits each, shown on example circuit bitstrings with their fitness values. The population is initialized randomly; each individual is evaluated against a target over 30 samples (fitness = MSE), using simulators (SPICE) or hardware (PTAs); the population is sorted; the elite (10%) is carried over, and the rest of the new population is produced by binary tournament selection (size 2), two-point crossover (prob. 70%), uniform mutation (prob. 4%), plus some random individuals.]
39. Guiding evolution
- Shaping: increase difficulty gradually.
- Guide toward certain directions of the search space by rewarding certain types of solutions.
- Pressure can guide away from certain regions, e.g. avoid patented solutions, avoid areas with faults.
- Island-based GA: search with multiple populations interacting during evolution.
40. Multi-criteria optimization, trade-offs, Pareto optimality
- The simultaneous optimization of multiple, possibly competing, objective functions deviates from single-function optimization in that it seldom admits a perfect (or Utopian) solution.
- Instead, multi-objective optimization problems tend to be characterized by a family of alternatives that must be considered equivalent in the absence of information concerning the relevance of each objective relative to the others.
- Two different methods: plain aggregating approaches and Pareto-based approaches.
- Plain aggregating approaches perform a scalarization of the objective vectors: each objective fi(x) is multiplied by a weight wi.
41. Multi-criteria optimization, trade-offs, Pareto optimality
- Given a population of GA individuals, one particular individual v dominates another individual u if and only if vi <= ui for all i and vi < ui for at least one i,
- where vi and ui represent the fitness values for objective i of the individuals v and u respectively. In this particular example there are n objectives, all to be minimized.
- Different implementations:
- All the non-dominated individuals are extracted from the population and given the same probability of reproduction, higher than the one for dominated individuals (Goldberg).
- An individual's rank (using rank selection) corresponds to the number of individuals in the current population by which it is dominated (Fonseca and Fleming).
- Non-dominated individuals are always selected for crossover/mutation (Schnier and Yao).
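The dominance test above, plus a brute-force Pareto-front filter, can be sketched as follows (minimization, as on the slide; the four sample points are invented):

```python
def dominates(v, u):
    """v dominates u (minimization): v is no worse in every objective
    and strictly better in at least one."""
    return (all(vi <= ui for vi, ui in zip(v, u))
            and any(vi < ui for vi, ui in zip(v, u)))

def non_dominated(points):
    """Pareto front: the points not dominated by any other point (O(n^2) sketch)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# (4, 4) is dominated by (2, 2); the other three trade off the two objectives.
front = non_dominated([(1, 5), (2, 2), (5, 1), (4, 4)])
```

Ranking schemes like Fonseca and Fleming's are built from exactly this test: count, for each individual, how many others dominate it.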
42. Multi-objective circuit optimization
- Analog circuit design is intrinsically multi-objective.
- Conventional design usually decomposes synthesis tasks into two sub-tasks:
- general performance requirements (e.g. frequency response)
- specific circuit requirements (e.g. noise and fault-tolerance)
- The designer may choose among a number of solutions provided by the Genetic Algorithm.
General fitness expression: Fitness = Σi wi · fi, where wi is the weight-vector component for objective i and fi is the fitness of objective i.
How to find the optimal weight vector? Co-evolution of circuits and weights (Lohn, 1998) (Zebulum, 1998).
43. Enhancements in EAs
- Adaptive mutation rate
- Escape local optima by increasing the rate of mutation.
- Speciation
- Keep diversity by creating sub-populations.
- Multimodal problems: subpopulations sampling different and interesting solutions to a particular problem.
- Variable-length representations
- Map solutions of different sizes.
- Evolution of electronic circuits of different sizes.
44. Co-evolution idea
- Two populations that evolve together.
- Powerful evolutionary dynamics: arms races.
- Predators and prey improve performance together to maintain equilibrium. If the predator gets faster, so does the prey, in order to survive. If predators get too fast, they eat too much prey and later do not have enough left to survive. The system self-regulates.
45. In coevolution a second population mimics dynamic fitness functions
- Standard evolution: static fitness function; difficult to weight objectives.
- Coevolution: dynamic, multiple fitness functions; objectives/weights are evolved.
Lohn, in evolving antennas.
46. Coevolutionary Algorithm
- Concept: the goals start out easy, then increase in difficulty as the hardware designs improve (ZPD, zone of proximal development).
- Hardware designs are rewarded for solving goal vectors, with extra points for solving difficult goals.
- Goals that are too difficult (such that no design can solve them) or too easy are given low fitness.
Difficulty is defined
47. Other techniques: SPSA
- Simultaneous perturbation stochastic approximation (SPSA) method.
- The essential feature of SPSA, which provides its power and relative ease of use in difficult multivariate optimization problems, is the underlying gradient approximation that requires only two objective function measurements per iteration, regardless of the dimension of the optimization problem. These two measurements are made by simultaneously varying, in a "proper" random fashion, all of the variables in the problem (the "simultaneous perturbation"). This contrasts with the classical ("finite-difference") method, where the variables are varied one at a time. If the number of terms being optimized is p, then the finite-difference method takes 2p measurements of the objective function at each iteration (to form one gradient approximation), while SPSA takes only two measurements. A fundamental result on relative efficiency then follows:
- Under reasonably general conditions, SPSA and the standard finite-difference SA method achieve the same level of statistical accuracy for a given number of iterations, even though SPSA uses p times fewer measurements of the objective function at each iteration (since each gradient approximation uses only 1/p the number of function measurements). This indicates that SPSA will converge to the optimal solution within a given level of accuracy with p times fewer measurements of the objective function than the standard method.
- http://www.jhuapl.edu/SPSA/
48. Hybrid search: efficient-global followed by efficient-local. Combine evolution with learning
- Standard GAs are efficient global search algorithms, but not as good for local search.
- Solution: hybrid GAs.
- Combine with local search:
- Genetic local search
- Lamarckian GAs
- Imanishian GAs
In general, methods involving a population of candidate solutions, such as evolutionary algorithms, may be useful for a broad search over the domain of the parameters being optimized and subsequent initialization of more powerful local search algorithms. (The local search is not implemented as often as it should be.)
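A minimal sketch of the hybrid idea: take a result from the global stage (faked here as a random string standing in for a GA output) and refine it with bit-flip hill climbing, using the toy OneMax objective. All settings are illustrative assumptions.

```python
import random

def hill_climb(x, fitness, rng, steps=200):
    """Local refinement: repeated single-bit-flip hill climbing from a
    GA-provided starting point, keeping only non-worsening moves."""
    best, best_f = x[:], fitness(x)
    for _ in range(steps):
        i = rng.randrange(len(best))
        cand = best[:]
        cand[i] ^= 1                      # flip one bit
        if fitness(cand) >= best_f:       # accept improvements (greedy)
            best, best_f = cand, fitness(cand)
    return best

rng = random.Random(0)
rough = [rng.randint(0, 1) for _ in range(32)]   # stand-in for a GA's best individual
refined = hill_climb(rough, sum, rng)
```

The division of labor is the point: the population handles the broad, multimodal exploration, and the cheap local search polishes the basin the GA has already found.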
49. Multi-stage search: search for topology, followed by parameter optimization
- First stage: GA-based evolution of the circuit topology (200 generations).
- Second stage: GA-based optimization of the transistor sizes for the best topology resulting from the first stage (40 generations). Initialization is made with the best topology and random parameters.
[Figure: multiplier evolved through multi-stage search; W/L in um.]
50. Example of software structure
T. Arslan's work.
51. Currents and execution times for GA subroutines using ARM and Thumb

Name           | ARM Current (mA) | ARM Time (CLK) | ARM Code Size (Byte) | Thumb Current (mA) | Thumb Time (CLK) | Thumb Code Size (Byte) | Power Ratio (%) | Code Size Ratio (%)
App malloc     | 3.3 | 146     | 1835  | 3.1 | 253     | 1563  | 61 | 85
Advance random | 3.8 | 8051    | 3763  | 3.6 | 10751   | 3335  | 79 | 88
Objfun         | 3.9 | 38910   | 11739 | 3.6 | 46619   | 10023 | 90 | 85
Statistic      | 4.0 | 14875   | 4259  | 3.7 | 19134   | 4187  | 84 | 98
Initdata       | 4.7 | 1789    | 20    | 3.5 | 2555    | 12    | 94 | 60
App init       | 3.4 | 403     | 488   | 3.4 | 559     | 324   | 72 | 66
Preselect      | 4.3 | 6334    | 3691  | 4.0 | 7826    | 3567  | 87 | 96
Initmalloc     | 3.6 | 14008   | 1987  | 3.1 | 21996   | 1667  | 74 | 83
Warmup random  | 4.0 | 44572   | 4431  | 3.6 | 59904   | 4063  | 83 | 91
Randomperc     | 3.8 | 8281    | 4631  | 3.6 | 12159   | 4195  | 72 | 90
Randomize      | 4.0 | 43020   | 4519  | 3.6 | 57608   | 4219  | 83 | 93
Flip           | 3.7 | 8332    | 4759  | 3.6 | 11061   | 4556  | 77 | 95
Rnd            | 3.8 | 8375    | 6143  | 3.2 | 11064   | 5415  | 90 | 88
Initpop        | 3.7 | 948736  | 22508 | 3.6 | 1355264 | 15815 | 72 | 70
Mutation       | 3.9 | 427520  | 4859  | 3.6 | 627712  | 4591  | 74 | 94
Crossover      | 3.8 | 8704    | 6403  | 3.6 | 12544   | 6011  | 73 | 93
Select         | 3.8 | 17018   | 8219  | 3.6 | 24752   | 7871  | 72 | 95
Initialize     | 4.0 | 1001472 | 7439  | 3.5 | 1268480 | 6583  | 90 | 88
Generation     | 3.8 | 1014272 | 16343 | 3.6 | 1281792 | 15207 | 91 | 80

T. Arslan's work.