3. Algorithms for self-configuration and evolution - PowerPoint PPT Presentation

1
3. Algorithms for self-configuration and evolution
  • General perspective on search, optimization and
    adaptation algorithms
  • Essence of evolutionary algorithms
  • Details of operation of Genetic Algorithms
  • Multi-criteria optimization, Hybrid Search

2
Objectives: control self-configuration for
desired functionality
  • A control C that creates a structure / topology /
    architecture S, that has the function F.
    Specification in terms of S or F. F may include
    constraints, preferences, etc.
  • Behavior/Function may change in time; in the
    simple case it doesn't
  • Often C is a single control, even for a set of
    states which can be decomposed, but it could be a
    sequence C1, C2, C3 if the system has memory
  • Digital or analog controls (analog signals often
    obtained by conversion from digital)

3
Search problems
  • A search space consists of a set of objects for
    potential consideration during the search.
  • One may search for the minimum of a function or a
    circuit that does the function. Points in the
    search space are candidate solutions (or simply
    solutions).
  • The goal of search problems is finding solutions
    that respect requirements / desired properties,
    usually defined with respect to a function called
    the objective function f.
  • If the search goal is optimization, the goal is to
    maximize f, in which case it is called a utility or
    fitness function or figure of merit, or to minimize
    f, in which case it is often called a cost or
    energy function.
  • If the search goal is constraint satisfaction, f
    measures the degree to which a solution violates
    the constraints and is called a penalty function,
    and the goal is to reduce it to zero.
  • N. Radcliffe, in Handbook of Evolutionary
    Computation.

4
Search landscapes
  • Find peaks or valleys in rugged landscape
  • If looking for peaks, higher means a higher degree
    of satisfying the objective in maximization
    problems (Wright)
  • In an inverted perspective, populations advance
    toward lower behavioral error (Atmar)
  • Searching for peaks depicts evolution as slow and
    fragile, and an optimized solution may quickly
    fall off the peak under slight perturbations. In
    the inverted perspective, once a solution is
    reached stagnation sets in. In varying
    environments this never happens.

5
Problem Landscapes
  • Landscape in Space
  • Linear functions contain one global maximum that
    can be reached using gradient methods
  • Deceptive functions contain isolated optima; the
    best points tend to be surrounded by the worst
  • Landscape in Time
  • Stationary Environment
  • Non-Stationary/Dynamic Environment
  • Link to one example of a deceptive function:
    http://www.cs.unr.edu/humphrey/deceptive.html

6
Search/optimization algorithms and NFL Theorems
  • Start with an initial "guess" at a solution,
  • The estimated solution is updated on an
    iteration-by-iteration basis with the aim of
    improving the performance measure (objective
    function).
  • Multiple variables influence the function: a
    multivariable optimization problem of minimizing
    or maximizing an objective function.
  • No Free Lunch theorem: no search algorithm is
    uniformly better than all other algorithms across
    all possible problems.
  • Cheaper lunches in certain places: some
    algorithms may work better than others on certain
    classes of problems as a consequence of being
    able to exploit the problem structure.
  • E.g. traditional nonlinear programming methods
    (e.g., constrained conjugate gradient) are well
    suited to deterministic optimization problems
    with exact knowledge of the gradient of the
    objective function; more generally, stochastic
    gradient methods are effective if one has direct
    (unbiased) measurements of the gradient of the
    objective function.

7
Search Techniques
Scope of evolutionary algorithms: discontinuous,
non-differentiable, multimodal, and noisy
response surfaces.
8
No-gradient population-based searches
  • Explicit modeling information describing the
    relationship between the parameters to be
    optimized and the objective function is often
    lacking.
  • No direct measurements of gradient/derivative of
    objective function.
  • A class of recursive optimization algorithms that
    rely on measurements of only the objective
    function to be optimized,
  • Population-based searches (with generate-and-test
    strategies)
  • Evolutionary algorithms: search algorithms based
    on a loose analogy with evolution in nature, and
    mechanisms of natural selection and genetics. Key
    characteristic features: survival of the fittest
    and variation.
  • Found to outperform other techniques on difficult
    problems including search on discontinuous,
    non-differentiable, multimodal, noisy response
    surfaces.

9
Principles of evolutionary processes
  • Genetic program -> genotype; expressed behavioral
    traits -> phenotype
  • Pleiotropy: a single gene may simultaneously
    affect several phenotypic traits.
  • Polygeny: a single phenotypic characteristic may
    be determined by the simultaneous interaction of
    many genes.
  • Epistasis: expression of one gene masks the
    phenotypic effects of another.
  • There are no one-gene, one-trait relationships in
    naturally evolved systems.
  • Very different genetic structures may code for
    equivalent behaviors: various circuits that
    implement a function with electronic components.

10
Selected characteristics of neo-Darwinism and EA
  • The individual is the primary target of selection
  • Genetic variation is largely a chance phenomenon,
    a stochastic process
  • Genotypic variation is largely a product of
    recombination and only ultimately of mutation
  • Gradual evolution may incorporate phenotypic
    discontinuities
  • Selection is probabilistic, not deterministic

Mayr (1988) also in Handbook of Evolutionary
Computation
11
Principle of operation of evolutionary algorithms
  • Coding solutions as chromosomes; the algorithm
    operates on the code, not on the solution.
  • A string is a candidate solution.
  • Switch states: 11011 (bitstring)
  • Program: (x (x (- x 1)))
  • Vector: (4.3 3.2 500)

Flowchart: initialize a population of candidate
solutions; evaluate the population; if an acceptable
solution is found, output the solution; otherwise
select the best individuals, use genetic operators to
create a new population based on the old one, and
repeat from the evaluation step.
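The loop on this slide can be sketched as a toy GA; the OneMax fitness (count of 1-bits), population size, and rates below are illustrative assumptions, not from the slides:

```python
import random

# Toy sketch of the EA loop on this slide, using OneMax (maximize the
# number of 1-bits) as a stand-in problem; all parameters are assumed.

def fitness(chrom):
    return sum(chrom)  # count of 1-bits

def evolve(pop_size=20, length=16, generations=50, mut_rate=0.05):
    # initialize a population of candidate solutions
    pop = [[random.randint(0, 1) for _ in range(length)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)   # evaluate population
        if fitness(pop[0]) == length:         # acceptable solution found?
            break
        parents = pop[:pop_size // 2]         # select the best individuals
        children = []
        while len(children) < pop_size:       # genetic operators -> new pop
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, length)             # 1-point crossover
            child = [1 - g if random.random() < mut_rate else g
                     for g in a[:cut] + b[cut:]]          # bitwise mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness)

random.seed(0)
best = evolve()
```

Selection here is simple truncation to the top half; the later slides discuss proportional and tournament alternatives.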
12
Mechanisms that power the EA
  • Survival of the best - selection pressure:
    chooses the basis for new samples; in normal
    contexts it tends to cluster solutions around the
    best point, or a few best points if some Pareto
    selection is made and solutions are ranked
    according to multiple criteria
  • Diversification of search - mutation: trying to
    expand the search domain to new areas by modifying
    the code at random
  • Exploitation of search - crossover: trying to
    combine known search areas by combining codes at
    random
  • Selection clusters solutions, diversification
    spreads them; a fine balance needs to be
    maintained during the search to avoid falling into
    premature convergence to local optima, or, at
    the other extreme, with too much mutation, a
    variant of random walk.

13
Types of evolutionary algorithms
  • Basic components of EAs
  • Representation
  • Fitness Evaluation Function.
  • Selection
  • Recombination
  • Crossover
  • Mutation

Classes of EAs: Evolutionary Strategies (ES),
Evolutionary Programming (EP), Genetic Algorithms
(GA), Genetic Programming (GP)
14
Evolutionary Programming/Evolutionary Strategies
  • Evolutionary Programming (Lawrence J. Fogel,
    1962)
  • Representation: state machines
  • Motivation: forecast nonstationary series
  • A population of state machines is deterministically
    selected based on the fitness value assigned
    according to the performance in the particular
    task
  • Selected machines are then mutated, creating
    a new generation of state machines, i.e. a new
    population
  • Evolutionary Strategies (Rechenberg and Schwefel,
    1965)
  • Representation: tuple (X, σ) ∈ R^n x R, where X is
    a real vector representing an individual and σ
    is a control parameter for the mutation operator
  • Motivation: mechanical shape optimization
  • (μ+λ) Evolutionary Strategy: a total of μ
    individuals makes up the population. λ new
    individuals are then selected among the original
    μ individuals, with a probability of 1/μ. The new
    pool of (μ+λ) individuals undergoes a process of
    mutation and recombination to generate a new
    population

Mutation in Evolutionary Strategies: a new mutation
parameter σ(t+1) is computed to generate the new
individual X(t+1) = X(t) + σ(t+1)·Z, where Z is a
standard normal random vector and σ is a control
parameter.
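A minimal sketch of this mutation step; the log-normal update of sigma and the constant tau = 0.5 are assumed details (one common choice in the ES literature), not taken from the slide:

```python
import math
import random

# ES-style mutation: adapt the strategy parameter sigma first, then perturb
# the individual as X(t+1) = X(t) + sigma(t+1) * Z, Z standard normal.
# The log-normal sigma update and tau = 0.5 are assumptions for the sketch.

def es_mutate(x, sigma, tau=0.5):
    sigma_new = sigma * math.exp(tau * random.gauss(0, 1))
    x_new = [xi + sigma_new * random.gauss(0, 1) for xi in x]
    return x_new, sigma_new

random.seed(0)
child, new_sigma = es_mutate([1.0, 2.0], sigma=0.1)
```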
15
Genetic Programming
  • Motivation for GP (Koza, early '90s):
    evolutionary algorithms to evolve computer
    programs
  • Representation: trees composed of functional
    and terminal nodes
  • Genetic programming breeds a population of
    rooted, point-labeled trees (trees without
    cycles) with ordered branches, as opposed to the
    labeled cyclic graphs encountered in the
    world of electrical circuits
  • Developmental approach to map circuits to GP
    trees
  • A GP tree encodes a set of instructions to build a
    circuit from an initial one (embryonic).
  • The circuit structure encompasses a fixed and a
    modifiable part: the fixed part contains the
    circuit source and load; the modifiable part
    consists of pieces of wire subject to
    the application of building instructions

       
16
Constructing circuits with Genetic Programming
Figure: a constructing tree (LIST root, writing heads
Wr1, Wr2, Wr3) whose nodes carry building instructions
(SERIES, END) and component values (589u, 15200n,
218u, 0.0367u, 5490n); development proceeds from the
embryonic circuit (Vin, Rin, Rload, GND) through the
instructions to the final circuit.
17
Use of domain knowledge
  • Claim: EAs are blind
    (i.e. work well with no info about the problem
    domain)
  • Fact: seeing may help
  • Incorporation of domain knowledge helps the
    search
  • Representation (e.g. contiguity of physical
    cells)
  • E.g. useful for not destroying useful blocks by
    crossover
  • Specific operators
  • E.g.
  • Known good sub-circuits can get a higher
    probability of insertion
  • Constraints
  • E.g. restrictions on interconnect (cells,
    terminals)
  • Increased probability of using certain connects
    (e.g. more VDD points)

18
Representations
  • Binary Real-values vectors Trees State
    Machines.
  • Representation has a critical role in determining
    performance
  • Some: choose the representation that is most
    suitable for the search algorithm
  • My take: choose the representation most suitable
    to the problem, then choose the search algorithm
  • A good representation should be:
  • Simple and compact (small chromosomes)
  • Flexible, to map solutions of various sizes and
    shapes
  • Representations can be of fixed or variable size,
    respectively processing chromosomes of fixed or
    variable length. The improved flexibility of
    variable-length representations may be accompanied
    by drawbacks, such as the bloat problem

19
Representations
  • Epistasis: expression of one gene masks the
    phenotypic effects of another
  • Problems with low epistasis are too easy for
    GAs
  • Problems with medium epistasis are the most
    adequate for GAs
  • Problems with strong epistasis are too difficult
    for GAs, being random search problems
  • Use a less epistatic representation!

20
Common Representations in EHW
  • Binary representations when evolving circuits on
    reconfigurable devices (JPL)
  • Tree representation of analog circuits with
    developmental rules to grow analog circuits
    (Koza)
  • Linear representation of analog circuits with
    developmental rules to grow analog circuits
    (Lohn)
  • Linear representation with correction for
    invalid circuits and increasing length
    chromosomes (JPL)

21
From representation to manifestation
  • A1 A2 A3, where each block Ai has the structure
    B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 B11 B12
  • Fields: block function, block interconnect, analog
    signal, parameter of passive component
  • 101 -> 2-input amp with gain g1, out of 8 choices
  • 0101 1000 0011 -> NESW connections
    for 2 inputs and output: E, W connect to In 1, N
    connects to In 2, Out goes to S and W
  • 101 -> an analog bias of 6/8 on the bias node
  • 1111 -> selection of a 10K R
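Decoding such a block can be sketched as below; the field widths and the 1-indexed bias reading are inferred from the examples above and are assumptions about the exact scheme:

```python
# Hypothetical decoder for one block Ai: 3 bits of function, 12 bits of
# NESW interconnect, 3 bits of analog bias, 4 bits of passive-component
# parameter (field widths inferred from the slide's examples).

def decode_block(bits):
    assert len(bits) == 22
    function = int(bits[0:3], 2)            # '101' -> choice 5 of 8 amps
    interconnect = bits[3:15]               # '010110000011' NESW routing
    bias = (int(bits[15:18], 2) + 1) / 8.0  # '101' -> 6/8, 1-indexed reading
    param = int(bits[18:22], 2)             # '1111' -> index 15 -> 10K R
    return function, interconnect, bias, param

f, ic, b, p = decode_block('101' + '010110000011' + '101' + '1111')
```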

22
Measurement and evaluation of individuals
  • Testbench, or in-system measurement
  • Stimulation signals
  • Load on the output signals
  • What do we measure in the testbench?
  • Time response, taking samples
  • Often A/D conversion for processing in digital
  • Frequency response, directly (spectrum analyzer)
    or indirectly (FFT)
  • Other measures, such as current
  • Effects: derived effects (if the electrical device
    controls something else)
  • How we assess the quality of an individual: an
    overall fitness value is determined based on the
    individual fitness functions associated with the
    testbench, and their weighting

23
Objective Functions
The objective function evaluates how well each
individual performs. Goal: maximize the objective
function. Standard method: compute a distance to a
target.
Fitness F is computed over n samples, with Ri the
individual response, Ti the target response, and Wi
weights reflecting some knowledge of the problem.
The design of a good fitness evaluation function
is critical for evolution.
24
Components of fitness function
In evolving electronic circuits in simulations,
three kinds of circuit analysis are used:
transient, DC transfer, and small-signal analysis.
F = Σ wi · ei^k
w - weight vector; ei - error of the output
sample i relative to the desired response; k - power
index applied to the error, commonly k = 2; i - index
related to the time domain, DC transfer domain, or
frequency domain.
  • To improve the performance:
  • Set weight vector components according to the
    problem
  • Consider shape/significant-point descriptors for
    the analysis domain
  • e.g. peaks/valleys for DC analysis, cut-off
    frequency for AC analysis, particular time
    intervals for transient analysis
  • Probe internal points
  • Co-evolve the weights.
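The fitness expression above, with k = 2, can be sketched directly; the response, target, and weight values below are purely illustrative:

```python
# Weighted-error fitness F = sum_i(w_i * e_i^k) with k = 2; response,
# target, and weights are made-up values for illustration.

def weighted_fitness(response, target, weights, k=2):
    # e_i is the error of output sample i to the desired response
    return sum(w * abs(r - t) ** k
               for r, t, w in zip(response, target, weights))

F = weighted_fitness(response=[0.9, 1.8, 3.2],
                     target=[1.0, 2.0, 3.0],
                     weights=[1.0, 1.0, 2.0])
# 1*0.01 + 1*0.04 + 2*0.04 = 0.13
```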

25
Improvement in individual and population
  • Often we care only about the best individual
  • Sometimes we care about a population
  • For monitoring purposes, to understand better what
    is going on
  • For fault tolerance we may want several good
    mutants: a fault gives a mutant which still
    has high fitness

Figure: example of improvement in the fitness of the
best individual over the generations (and improvement
of the average fitness across the population)
26
Selection
  • Based on the principle of survival of the
    fittest: better candidate solutions get more
    offspring with the same/close genetic code
  • Deterministic in ES and EP
  • Probabilistic in GA and GP
  • Selection Techniques
  • Proportional Selection
  • Rank based selection
  • Exponential Selection
  • Tournament Selection

Proportional selection: roulette-wheel selection.
Spin the roulette; an individual's slice of the
roulette is proportional to its fitness, so those
with higher fitness have a higher probability of
being selected for mating.
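Roulette-wheel selection as described can be sketched as:

```python
import random

# Roulette-wheel (fitness-proportional) selection: each individual's slice
# of the wheel is proportional to its fitness.

def roulette_select(population, fitnesses):
    spin = random.uniform(0, sum(fitnesses))  # spin the roulette
    cumulative = 0.0
    for individual, fit in zip(population, fitnesses):
        cumulative += fit
        if spin <= cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off

random.seed(0)
# 'c' holds 60% of the wheel, so it is picked most often over many spins
picks = [roulette_select(['a', 'b', 'c'], [1.0, 3.0, 6.0])
         for _ in range(1000)]
```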
27
Steady-state/Generational, Elitism
  • Stationary environment - steady-state: we keep
    part of the old population
  • Non-stationary/dynamic environment - generational:
    regenerate the whole population
  • Elitist GA: keep the best from the old population
  • Keep the best and some not as good

28
Keeping search diversified
  • Islands, regions of search
  • Avoids clustering in regions that temporarily
    look most promising (a gold rush)
  • Works well for dynamic environments, when the best
    solution at some moment may not be good at the next

29
Exploitation by Recombination
  • Combine good parents to exploit current solutions
  • Replace parents (some or all) with offspring

30
Crossover
  • Genetic algorithms

Figure: two parents and their offspring under 2-point
crossover.
  • Genetic Programming
Figure: two parent trees and their offspring under
subtree crossover.
  • Recombination of genetic material contributes to
    the variability in the population
  • Harmful effects: destroying potentially useful
    building blocks
  • Automatically Defined Functions (ADFs): protection
    against the disruptive effect of crossover.

31
Various crossovers, mutation as crossover with
random string
Figure: 1-point, n-point, and uniform crossover, each
mapping two parents to two offspring.
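The three variants can be sketched on bitstrings as follows (the slide's view of mutation as crossover with a random string corresponds to the uniform case):

```python
import random

# Sketches of 1-point, n-point, and uniform crossover on bitstrings.

def one_point(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def n_point(a, b, n=2):
    for cut in sorted(random.sample(range(1, len(a)), n)):
        a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]  # swap tails at each cut
    return a, b

def uniform(a, b):
    pairs = [(x, y) if random.random() < 0.5 else (y, x)
             for x, y in zip(a, b)]
    return (''.join(p[0] for p in pairs),
            ''.join(p[1] for p in pairs))

random.seed(0)
c1, c2 = one_point('11111111', '00000000')
d1, d2 = n_point('11111111', '00000000')
u1, u2 = uniform('11111111', '00000000')
```

In every variant the two offspring are complementary: each bit position keeps one copy of each parent's gene.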
32
Schemata, building blocks, ADFs
  • GA/GP (EAs using crossover) use the building
    block theory: useful components of what makes a
    solution (chunks of chromosomes) can be
    efficiently manipulated and used to lead to the
    solution.
  • A problem decomposition
  • Looking for similarities, patterns in chromosomes
    of similarly performing solutions
  • 1100 -> 10
  • 0010 -> 3
  • 0101 -> 4
  • 1101 -> 20; 11xx may be a good building block
  • Schemata: the set of all combinations based on the
    same pattern
  • Goldberg: ensure BB supply and growth, understand
    BB spread, ensure good BB decisions, know BB
    challenges, ensure good BB mixing
  • Crossover probability: rules of thumb

33
Mutation
  • Each bit of a new string can be changed (mutated)
    with a probability given by the mutation rate
  • Low values for the mutation rate are often used
  • Traditional interpretation: only a support for
    crossover
  • More recent voices: the driving force of GAs
    (something other EA camps have always stated), or
    it depends on the problem/representation
  • GA performance is largely affected by the mutation
    rate.

Figure: mutation seen as crossover with a random
string (Parent 1 crossed with a random-string Parent 2
to give the offspring).
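Bitwise mutation at a fixed rate can be sketched as:

```python
import random

# Bitwise mutation: each bit of the string flips with probability mut_rate
# (low rates are typical, per the slide).

def mutate(chrom, mut_rate=0.01):
    out = []
    for bit in chrom:
        if random.random() < mut_rate:
            out.append('1' if bit == '0' else '0')  # flip this bit
        else:
            out.append(bit)                          # keep this bit
    return ''.join(out)
```

With mut_rate = 0 the string is returned unchanged; with mut_rate = 1 every bit flips.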
34
Adaptive operators
  • Adapting the probability associated to
    evolutionary operators improves convergence
  • Crossover probability
  • Mutation probability
  • Change representation
  • Change selection probability and method all is
    permitted

35
Specific Crossover and Mutation
Example of Mutation Operation
Examples from T. Arslan's work
36
Specific Genetic Algorithm Implementation
  • Problem Specific Crossover
  • Pipeline Identified In Parent 1
  • Retime Identified In Parent 2
  • Transformations Crossed-Over To Produce 2
    Children

Examples from T. Arslan's work
37
Population size, Generations, Stopping, multiple
runs
  • Fixed or variable
  • Small populations need more generations, and
    vice-versa; balanced
  • 100 individuals is very common
  • Usually GP asks for more, e.g. 640,000 in some of
    Koza's experiments
  • Hundreds of generations
  • Sampling a small fraction of the space
  • See if it is still improving by tracing the amount
    of change in the last generations
  • Stop: number of generations, time, lack of
    improvement
  • Re-start, change initial population, seed with
    solutions

38
Evolutionary algorithms visualized
Figure: one generation of the EA visualized. A
population (size 512, 24-bit chromosomes) is randomly
initialized and evaluated (30 samples, MSE fitness,
compared to a target) using simulators (SPICE) or
hardware (PTAs). Individuals are sorted by fitness;
the elite (10%) is kept, and the rest of the new
population is produced by binary tournament selection
(size 2), two-point crossover (prob. 70%), and uniform
mutation (prob. 4%).
39
Guiding evolution
  • Shaping: increase difficulty gradually
  • Guide in certain directions of the search space by
    rewarding certain types of solutions
  • Pressure can guide away from certain regions,
    e.g. avoid patented solutions, avoid areas with
    faults
  • Island-based GA: search with multiple
    populations interacting during evolution

40
Multi-criteria optimization, trade-offs, Pareto
optimality
  • The simultaneous optimization of multiple,
    possibly competing, objective functions deviates
    from the single-function optimization in that it
    seldom admits a perfect (or Utopian) solution
  • Instead, multi-objective optimization problems
    tend to be characterized by a family of
    alternatives that must be considered equivalent
    in the absence of information concerning the
    relevance of each objective relative to the
    others
  • Two different methods: plain aggregating
    approaches and Pareto-based approaches
  • Plain aggregating approaches perform a
    scalarization of the objective vectors:
    each objective fi(x) is multiplied by a weight
    wi.

41
Multi-criteria optimization, trade-offs, Pareto
optimality
  • Given a population of GA individuals, one
    particular individual v dominates another
    individual u if and only if vi <= ui for all
    objectives i and vj < uj for at least one
    objective j,
    where vi and ui represent the fitness values
    referring to objective i for the individuals v
    and u respectively. In this particular example,
    there are n objectives that have to be
    minimized.
  • Different implementations:
  • All the non-dominated individuals are removed
    from the population and given the same
    probability of reproduction, higher than the one
    for dominated individuals (Goldberg)
  • An individual's rank (using rank selection)
    corresponds to the number of individuals in the
    current population by which it is dominated
    (Fonseca and Fleming)
  • Non-dominated individuals are always selected for
    crossover/mutation (Schnier and Yao)
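The dominance test for minimization can be sketched as:

```python
# Pareto dominance for minimization: v dominates u iff v is no worse on
# every objective and strictly better on at least one.

def dominates(v, u):
    return (all(vi <= ui for vi, ui in zip(v, u))
            and any(vi < ui for vi, ui in zip(v, u)))

assert dominates((1.0, 2.0), (1.5, 2.0))      # better on one, equal on other
assert not dominates((1.0, 3.0), (1.5, 2.0))  # trade-off: neither dominates
assert not dominates((1.0, 2.0), (1.0, 2.0))  # equal vectors do not dominate
```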

42
Multi-Objective Circuit Optimization
  • Analog circuit design is intrinsically
    multi-objective
  • Conventional design usually decomposes synthesis
    tasks into two sub-tasks:
  • general performance requirements (e.g. frequency
    response)
  • specific circuit requirements (e.g. noise and
    fault tolerance)
  • The designer may choose among a number of
    solutions provided by the Genetic Algorithm.

General fitness expression: Fitness = Σ (wi · fi),
where wi is the weight-vector component for objective
i and fi is the fitness for objective i.
How to find the optimal weight vector?
Co-evolution of circuits and weights (Lohn, 1998)
(Zebulum, 1998)
43
Enhancements in EAs
  • Adaptive mutation rate
  • Escape local optima by increasing rate of
    mutation
  • Speciation
  • Keep diversity by creating sub-populations
  • Multimodal problems subpopulations sampling
    different and interesting solutions to a
    particular problem.
  • Variable Length Representations
  • Map solutions of different sizes
  • Evolution of electronic circuits of different
    sizes.

44
Co-evolution idea
  • Two populations that evolve together
  • Powerful evolutionary dynamics: arms races
  • Predators and prey: performance improves
    together to maintain equilibrium. If the predator
    gets faster, so does the prey in order to survive.
    If predators get too fast they eat too much prey
    and later have not enough left to survive. The
    system self-regulates.

45
In coevolution a second population mimics
dynamic fitness functions.
  • Std evolution: static fitness function;
    difficult to weight objectives
  • Coevolution: dynamic, multiple fitness functions;
    objectives/weights are evolved

Lohn, in evolving antennas
46
Coevolutionary Algorithm
  • Concept: the goals start out easy, then increase
    in difficulty as the hardware designs improve
    (ZPD, zone of proximal development)
  • Hardware designs are rewarded for solving goal
    vectors, with extra points for solving difficult
    goals
  • Goals that are too difficult (such that no design
    can solve them) or too easy are given low fitness.
    Difficulty is defined

47
Other techniques SPSA
  • Simultaneous perturbation stochastic
    approximation (SPSA) method.
  • The essential feature of SPSA, which provides its
    power and relative ease of use in difficult
    multivariate optimization problems, is the
    underlying gradient approximation that requires
    only two objective function measurements per
    iteration regardless of the dimension of the
    optimization problem. These two measurements are
    made by simultaneously varying in a "proper"
    random fashion all of the variables in the
    problem (the "simultaneous perturbation"). This
    contrasts with the classical ("finite-difference")
    method where the variables are varied one at a
    time. If the number of terms being optimized is
    p, then the finite-difference method takes 2p
    measurements of the objective function at each
    iteration (to form one gradient approximation)
    while SPSA takes only two measurements. A
    fundamental result on relative efficiency then
    follows
  • Under reasonably general conditions, SPSA and the
    standard finite-difference SA method achieve the
    same level of statistical accuracy for a given
    number of iterations even though SPSA uses p
    times fewer measurements of the objective
    function at each iteration (since each gradient
    approximation uses only 1/p the number of
    function measurements). This indicates that SPSA
    will converge to the optimal solution within a
    given level of accuracy with p times fewer
    measurements of the objective function than the
    standard method.
  • http://www.jhuapl.edu/SPSA/
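A minimal SPSA step can be sketched as follows; the constant gains a and c and the toy sphere objective are assumptions for the sketch (practical SPSA uses decaying gain sequences):

```python
import random

# SPSA gradient approximation: two objective-function measurements per
# iteration, obtained by simultaneously perturbing all p parameters with
# a random Bernoulli +/-1 vector. Constant gains a, c are a simplification.

def spsa_step(f, theta, a=0.1, c=0.1):
    delta = [random.choice((-1.0, 1.0)) for _ in theta]   # perturbation
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    diff = f(plus) - f(minus)                             # two measurements
    grad = [diff / (2.0 * c * d) for d in delta]          # gradient estimate
    return [t - a * g for t, g in zip(theta, grad)]

def sphere(x):  # toy objective to minimize
    return sum(xi * xi for xi in x)

random.seed(0)
theta = [1.0, -1.0]
for _ in range(200):
    theta = spsa_step(sphere, theta)
```

Note that the cost per iteration is two measurements regardless of the dimension, versus 2p for the finite-difference approach described above.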

48
Hybrid search efficient-global followed by
efficient-local. Combine evolution with learning
  • Standard GAs are efficient global search
    algorithms but not as good for local search
  • Solutions: hybrid GAs
  • Combine with local search
  • Genetic local search
  • Lamarckian GAs
  • Imanishian GAs

In general, methods involving a population of
candidate solutions, such as evolutionary
algorithms, may be useful for a broad search over
the domain of the parameters being optimized and
subsequent initialization of more powerful local
search algorithms (the local search is not
implemented as often as it should be).
49
Multi-stage Search Search for topology followed
by parameter optimization
  • First stage: GA-based evolution of the circuit
    topology
  • Second stage: GA-based optimization of the
    transistor sizes for the best topology resulting
    from the first stage. Initialization is made with
    the best topology and random parameters.

Figure: multiplier evolved through multi-stage search
(first stage: 200 generations; second stage: 40
generations); W/L in um.
50
Example of Software Structure
T. Arslan's work
51
Currents and Execution Times for GA subroutines
Using ARM and Thumb
                 ARM                        Thumb                      Ratio
Name             Current  Time     Code     Current  Time     Code     Power  Code
                 (mA)     (CLK)    (Byte)   (mA)     (CLK)    (Byte)   (%)    Size (%)
App malloc       3.3      146      1835     3.1      253      1563     61     85
Advance random   3.8      8051     3763     3.6      10751    3335     79     88
Objfun           3.9      38910    11739    3.6      46619    10023    90     85
Statistic        4.0      14875    4259     3.7      19134    4187     84     98
Initdata         4.7      1789     20       3.5      2555     12       94     60
App init         3.4      403      488      3.4      559      324      72     66
Preselect        4.3      6334     3691     4.0      7826     3567     87     96
Initmalloc       3.6      14008    1987     3.1      21996    1667     74     83
Warmup random    4.0      44572    4431     3.6      59904    4063     83     91
Randomperc       3.8      8281     4631     3.6      12159    4195     72     90
Randomize        4.0      43020    4519     3.6      57608    4219     83     93
Flip             3.7      8332     4759     3.6      11061    4556     77     95
Rnd              3.8      8375     6143     3.2      11064    5415     90     88
Initpop          3.7      948736   22508    3.6      1355264  15815    72     70
Mutation         3.9      427520   4859     3.6      627712   4591     74     94
Crossover        3.8      8704     6403     3.6      12544    6011     73     93
Select           3.8      17018    8219     3.6      24752    7871     72     95
Initialize       4.0      1001472  7439     3.5      1268480  6583     90     88
Generation       3.8      1014272  16343    3.6      1281792  15207    91     80
T. Arslan's work