Evolutionary Computation I - PowerPoint PPT Presentation

Provided by: jhall9

1
Evolutionary Computation I
  • COMP4001/7001
  • 5 September 2005

2
Learning Objectives
At the end of this lecture students will understand
  • What is an evolutionary algorithm?
  • Effects of evolutionary operators
  • The course of computational evolution
  • Interactions between evolution and

4
Why look to evolution?
  • Evolution is inherently interesting
  • Evolutionary theories are easy to generate,
    almost impossible to test
  • EC modelling can be used as a proof of principle,
    but never as an existence proof
  • Evolution is an optimization technique
  • Biological evolution optimizes organisms to their
    environments
  • EC can be used to optimize programs / processes
  • Evolution is the only process to date which has
    produced intelligence
  • EC can be used in an attempt to understand how
    human intelligence works
  • May be the best hope for AI

5
History of EC
  • 1957 Fraser (Australian geneticist)
  • bitstring representation of a chromosome and a
    stochastic Monte Carlo approach
  • investigating questions such as the effects of
    linkage on the efficiency of selection, the
    relationship between the fitnesses of alleles and
    factors such as population size and intensity of
    selection, and the comparison of the efficiencies
    of different breeding plans for varying degrees
    of inter-locus interactions
  • The algorithm was run on SILLIAC, the parallel
    computer at the University of Sydney. SILLIAC
    (Sydney ILLIAC) was a slightly modified version
    of the ILLIAC developed by the University of
    Illinois, Urbana, United States. It cost 50,000
    to construct, had a store of 1024 40-bit words,
    could perform 13,333 additions/subtractions per
    second, and read its input off punched paper tape

6
More history
  • 1957 Box
  • evolutionary optimisation process for the
    improvement of processes in a chemical plant,
    involving carefully planned variations to the
    procedures used in the operation of the plant
    itself
  • 1966 Fogel, Owens and Walsh
  • groundbreaking analysis of the possibilities of
    simulated evolution for the development of
    artificial intelligence
  • 1975 John Holland
  • Adaptation in Natural and Artificial Systems: An
    Introductory Analysis with Applications to
    Biology, Control, and Artificial Intelligence
  • Classic GA

7
Evolutionary algorithms
  • All EC algorithms involve
  • a population of individuals
  • which undergo repeated generations of genetic
    modification, fitness evaluation and
    fitness-proportionate selection.
  • The genetic operators used to perform the
    genetic modifications are simplified versions of
    those found in biological systems.
  • Many operators have been described in the
    literature
  • Lots of different flavours of EA
  • Each makes different decisions about
    implementation

9
Operators
  • Representation
  • Holland's GA used binary chromosomes
    (bitstrings)
  • Representations ranging from strings of floating
    point numbers to entire Lisp programs are used
    for different problems by various practitioners
  • Mutation
  • acts to introduce variability into the population
    by altering the chromosome
  • most usual mutation operator for a bitstring
    chromosome consists of flipping a bit from 0 to 1
    or vice versa, with a given probability, the
    mutation rate.
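
The bit-flip operator described above can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function name `mutate`, the list-of-ints representation and the choice of a rate of one expected flip per chromosome are all assumptions made for the example.

```python
import random

def mutate(chromosome, mutation_rate, rng=random):
    """Return a copy of a bitstring chromosome in which each bit is
    flipped independently with probability mutation_rate."""
    return [1 - bit if rng.random() < mutation_rate else bit
            for bit in chromosome]

# Hypothetical usage: on average one bit flips per chromosome.
parent = [0, 1, 1, 0, 1, 0, 0, 1]
child = mutate(parent, mutation_rate=1 / len(parent))
```

The parent is left untouched, so the same individual can be selected again later in the generation.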

10
More operators
  • Crossover
  • recombines parts of two (or more) chromosomes to
    form new individuals
  • Single point crossover
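
Single-point crossover can be sketched as below. The function name and the decision to return both complementary children are assumptions for illustration; some implementations keep only one child.

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Cut both parents at the same random point and swap the tails."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    # The cut point lies strictly inside the chromosome, so each child
    # receives material from both parents.
    point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```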

11
Selection
  • Selection should be fitness proportionate
  • fitter individuals should contribute more to the
    next generation, on average, than less fit
    individuals
  • selection method should have an element of
    stochasticity so that every individual, no matter
    how unfit, has a chance of becoming a parent
  • If only the fittest individuals in each
    generation are allowed to breed the population
    rapidly converges to the best solution found
    early, which is very unlikely to be the global
    best solution
  • Many different selection algorithms exist,
    producing different types of selection pressure
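
One common way to implement fitness-proportionate selection is the roulette wheel, sketched below. The function name is hypothetical and fitness values are assumed to be non-negative. Note that under a pure roulette wheel an individual with zero fitness is never chosen, so giving every individual a chance of becoming a parent would need a small positive floor on fitness.

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        # Degenerate case: no fitness information, so choose uniformly.
        return rng.choice(population)
    pick = rng.uniform(0, total)
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off
```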

12
The Simple Genetic Algorithm
13
Other Approaches
  • Evolutionary Programming (EP)
  • Introduced by Fogel in the early 1960s; it has no
    genomic representation. Each individual in the
    population is an algorithm chosen at random over
    an appropriate sample space. Mutation is the only
    genetic operator used; EP does not use crossover
  • Evolution Strategies (ES)
  • Introduced by Schwefel, also in the 1960s, as an
    optimisation tool. ES uses a real-valued
    chromosome with a population size of one and
    mutation as the only genetic operator. In each
    generation the parent is mutated to produce a
    descendant; if the descendant is fitter it
    becomes the parent for the next generation,
    otherwise the original parent is retained.
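
The one-parent, one-descendant scheme described for ES can be sketched as follows. Gaussian mutation with a fixed step size `sigma` is an assumption made to keep the example short (practical evolution strategies usually adapt the step size), and the test function `negative_sphere` is purely illustrative.

```python
import random

def one_plus_one_es(fitness, parent, sigma=0.1, generations=1000, rng=random):
    """Mutate a single real-valued parent; keep the descendant only if fitter."""
    parent_fitness = fitness(parent)
    for _ in range(generations):
        child = [x + rng.gauss(0.0, sigma) for x in parent]
        child_fitness = fitness(child)
        if child_fitness > parent_fitness:
            parent, parent_fitness = child, child_fitness
    return parent, parent_fitness

def negative_sphere(x):
    """Illustrative fitness with its maximum (zero) at the origin."""
    return -sum(v * v for v in x)

best, best_fitness = one_plus_one_es(
    negative_sphere, [2.0, -3.0], rng=random.Random(1))
```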

14
And more
  • Classifier Systems
  • Holland (1975). A classifier takes inputs from
    the environment and produces outputs indicating a
    classification of the input events. A classifier
    system produces new classifiers through the
    action of a genetic algorithm on the system's
    population of classifiers
  • Genetic Programming (GP)
  • Introduced by Koza in the late 1980s; the aim of
    GP is the automatic programming of computers,
    allowing programs to evolve to solve a given
    problem. The population consists of programs
    expressed as parse trees; operators used include
    crossover, mutation and architecture-altering
    operations patterned after gene duplication and
    gene deletion in nature
  • Many others, often tailored to problem at hand

16
Fitness landscapes
  • Wright (1932): for a given set of genes, each
    possible combination of gene values (alleles)
    could be assigned a fitness value for a
    particular set of conditions
  • Entire genotype space can then be visualized as a
    landscape, with genotypes of high fitness
    occupying peaks and those of low fitness forming
    troughs
  • Generally very high-dimensional

17
The course of evolution in silico
This EA has a chromosome length of 10 bits and a
population of 10 individuals. The fitness
function is simply a count of the number of 1s in
the chromosome; maximum fitness is therefore 10.
The EA uses elitism, where the fittest individual
in each generation is retained. Elitism ensures
that a good solution, once found, is never lost,
and means that the maximum fitness in the
population never decreases
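
An EA of the kind described on this slide might look like the sketch below: 10-bit chromosomes, 10 individuals, a count-the-ones fitness function and elitism. The slide does not give the remaining details, so the single-point crossover, the per-bit mutation rate of 0.1, the `fitness + 1` selection weights and the number of generations are all assumptions.

```python
import random

def onemax(chromosome):
    """Fitness: the number of 1s in the chromosome."""
    return sum(chromosome)

def evolve(length=10, pop_size=10, generations=50, mutation_rate=0.1, seed=0):
    """Run a small elitist GA; return the best fitness in each generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        fitnesses = [onemax(ind) for ind in population]
        elite = max(population, key=onemax)
        best_per_generation.append(onemax(elite))
        # Elitism: the fittest individual is copied unchanged.
        next_population = [elite[:]]
        # Assumption: +1 so that an all-zero individual can still be chosen.
        weights = [f + 1 for f in fitnesses]
        while len(next_population) < pop_size:
            mum, dad = rng.choices(population, weights=weights, k=2)
            point = rng.randint(1, length - 1)   # single-point crossover
            child = mum[:point] + dad[point:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]           # bit-flip mutation
            next_population.append(child)
        population = next_population
    return best_per_generation

history = evolve()
print(history)
```

Because the elite is carried over unmodified, the printed sequence of best fitness values can stay flat but cannot go down.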
18
Computational evolution
  • Fitness originally random
  • Increases over time
  • Faster at first
  • Eventually converges to a local optimum
  • Not necessarily the global optimum
  • Stochastic, so usually must be repeated
  • Can be time consuming
  • Can produce good solutions that work in
    unexpected ways

19
Schema Theorem
  • Holland, 1975
  • short, low-order, above-average schemata receive
    exponentially increasing trials in subsequent
    generations
  • If the chromosome is a bit string, a schema is a
    set of building blocks described by a template
    consisting of ones, zeros and asterisks
  • Template 10*001*1 can be
  • 10100111
  • 10100101
  • 10000101
  • 10000111
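
Schema templates are easy to experiment with in code. The sketch below assumes the template on this slide is 10*001*1, with `*` as the wildcard, which is consistent with the four instances listed. The helper names are hypothetical; `order` and `defining_length` compute the two quantities that "low-order" and "short" refer to in the schema theorem.

```python
from itertools import product

def matches(schema, bitstring):
    """True if the bitstring is an instance of the schema."""
    return len(schema) == len(bitstring) and all(
        s == "*" or s == b for s, b in zip(schema, bitstring))

def order(schema):
    """Number of fixed (non-wildcard) positions."""
    return sum(c != "*" for c in schema)

def defining_length(schema):
    """Distance between the first and last fixed positions."""
    fixed = [i for i, c in enumerate(schema) if c != "*"]
    return fixed[-1] - fixed[0] if fixed else 0

def instances(schema):
    """All bitstrings that match the schema."""
    slots = [("0", "1") if c == "*" else (c,) for c in schema]
    return ["".join(bits) for bits in product(*slots)]

print(instances("10*001*1"))
```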

20
Schema theorem
  • an evolutionary algorithm proceeds by identifying
    short schemas of high fitness in different
    individuals, and recombining them using crossover
    in order to produce longer schemas of higher
    fitness, and eventually entire individuals having
    high fitness
  • attractive because it suggests that schemas can
    be identified and the effects of mutation and
    crossover upon schemas in a population of a given
    size can be calculated exactly
  • mathematical tractability would potentially
    provide useful insights into the way in which an
    EA functions
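
For reference, the schema theorem is commonly written as the bound below. This is the usual textbook form rather than anything taken from these slides, and it assumes fitness-proportionate selection, single-point crossover and bit-flip mutation. Here m(H,t) is the number of instances of schema H at generation t, f(H,t) is their mean fitness, \bar{f}(t) is the mean fitness of the population, \delta(H) is the defining length of H, o(H) is its order, l is the chromosome length, and p_c and p_m are the crossover and per-bit mutation probabilities.

```latex
\[
E\big[m(H,t+1)\big] \;\ge\; m(H,t)\,\frac{f(H,t)}{\bar{f}(t)}
\left[\,1 - p_c\,\frac{\delta(H)}{l-1} - o(H)\,p_m\right]
\]
```

The bracketed term is why short, low-order schemata are favoured: a small \delta(H) makes disruption by crossover unlikely, and a small o(H) makes disruption by mutation unlikely.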

21
Testing schema theory
  • Royal road functions - Mitchell, Forrest and
    Holland (1991)
  • structured to provide a smooth, easy path to
    maximum fitness under the assumptions of schema
    theory
  • hierarchical fitness landscape, in which
    crossover between instances of fit lower-order
    schemas tends to produce ever fitter higher-order
    schemas
  • relatively highly fit intermediate stages could
    in fact interfere with the finding of fit
    higher-order solutions, since once an instance of
    a fit intermediate schema is discovered its
    relatively high fitness allows it to spread
    quickly throughout the population, carrying with
    it hitchhiking genes in positions not included
    in the schema. Low-order schemas tend to be
    discovered more-or-less sequentially, rather than
    in parallel

22
Variability
  • Basis of evolution
  • Mostly mutation
  • In EAs, mostly point mutations

23
Mutation
24
Mutation rate
  • Mutational meltdown: a mutation rate so high
    that the species cannot survive in the face of
    the number of errors generated
  • about 1 mutation per genome per generation, given
    that mutations occur at random
  • maximum rate at which an organism can expect to
    produce at least one error-free offspring in its
    lifetime
  • Many EC implementations use a mutation rate of
    1 per genome
  • In RNA viruses, about one nucleotide per genome
    is incorrectly reproduced per replication; for
    retroviruses the rate is one nucleotide per ten
    genomic replications; and for DNA-based microbes
    it is about one per 300 replications
  • Longer genomes do not have higher mutation rates:
    error-correcting machinery
25
Error correction (Ridley, 2000)
  • Autocopying: the first reproducers were probably
    molecules of RNA or something similar, that could
    copy themselves using bases from their
    environment
  • Copying enzymes: the evolution of enzymes which
    catalysed the copying process would also have
    made the process more reliable
  • Double stranded genetic material: organisms which
    used DNA rather than RNA have the advantage of
    having a more stable information carrying
    molecule, plus the advantage of having two
    complementary copies of the sequence, to
    facilitate error checking
  • Suite of proofreading and repair enzymes
  • Development: a developmental process translating
    a genotype into a phenotype allows for the
    correction of errors on the fly in the course of
    development; not all errors have to be
    corrected in the genome
  • Ploidy: using two or more copies of each
    chromosome provides redundancy of the genetic
    information, permitting the identification and
    correction of errors
  • Sex: recombination of genetic material from more
    than one individual introduces the possibility of
    concentrating genetic errors in a small
    proportion of scapegoat offspring, allowing the
    other offspring to be error-free

26
Neutrality
  • Before the details of the molecular basis of
    genetics were worked out in the late 1950s, it
    was generally assumed that most mutations cause
    phenotypic alterations that are immediately
    subject to selection. Under these circumstances
    all the variation in a population is adaptive
  • Electrophoresis: huge amount of variability at
    the protein level
  • Motoo Kimura (1968): evolution is driven
    primarily by random drift among equally
    well-adapted sequence variants
  • Ohta (1973): nearly neutral variants which do,
    in fact, have a small selective difference can
    become effectively neutral in small populations,
    where random events become more important
  • Neutral networks in EC have been demonstrated
    to affect the course of evolution by facilitating
    random drift to more useful areas of the search
    space

27
Managing variability
  • Variability is systematically eroded by
    selection, while at the same time being
    replenished via mutation and recombination
  • Different flavours of EA emphasise the importance
    of mutation (e.g. evolutionary programming)
    versus recombination (e.g. genetic algorithms) in
    generating novelty
  • Effects of selection tend to outweigh those of
    mutation and recombination, and the population
    converges towards a peak in fitness space
  • Neutral mutations rarely occur, unless
    deliberately designed into the algorithm

28
Premature convergence
  • In most EAs the entire population eventually
    reaches a single peak and tends to stay there
  • If this peak is not the global maximum, the EA is
    considered to have converged prematurely
  • Premature convergence occurs when the population
    loses the genetic variability which is essential
    to continued evolution
  • This almost complete loss of genetic diversity is
    never observed in biological populations

29
Causes of premature convergence
  • Haploid genotype exposes every mutation to
    selection
  • Diploid genotypes have been used; they require a
    dominance map or equivalent
  • EAs using diploid chromosomes do tend to maintain
    more genetic variability than haploid EAs, but
    they rarely find better solutions
  • The benefit of recessively masked variability
    will only be realised if the environment in
    which the population is evolving changes

30
Pseudo Founder Effects
  • The Founder Effect occurs when a population
    passes through a population size bottleneck, from
    which only a few individuals emerge to establish
    a new population, for example when a small number
    of individuals colonize a new island
  • In EAs a related phenomenon frequently occurs:
    when a very fit individual arises in the
    population it tends to dominate future
    generations
  • Since most individuals are descended from a
    single individual they tend to be very similar in
    sequence, and so the crossover operator will have
    little effect
  • Any genes which happen to be on the
    pseudo-founder's chromosome will also spread
    throughout the population, whether or not they
    are valuable, a phenomenon known as hitchhiking

31
Other factors
  • Intense, unidirectional selection pressure
  • Development
  • Troubleshooting mechanisms
  • Added source of noise
  • Environmental interactions

32
Speciation
  • Preselection: two individuals are mated to
    produce an offspring, which is compared with both
    the parents. If the fitness of the child is
    greater than that of the worst parent, it
    replaces that parent in the population. The idea
    is that individuals are replaced by others which
    are fitter than they are, but similar in
    sequence, so that a number of different solutions
    can be maintained in the population, improving
    gradually over time
  • Crowding: the crowding of solutions in search
    space is discouraged, for example by comparing a
    new individual with a subset of the existing
    population, and replacing the most similar of
    that subset with the new individual.
  • Fitness Sharing: when there are a number of
    individuals with very similar sequences, the
    fitness of that genotype is shared amongst them
    all. This is a very popular diversity maintenance
    operator, and there are a large number of
    variants on the scheme.
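
Fitness sharing, the last of the three schemes above, can be sketched as follows. This is one common variant only: it assumes Hamming distance between bitstrings, a triangular sharing function and a niche radius `sigma_share`, none of which are specified on the slide, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def shared_fitnesses(population, raw_fitness, sigma_share=3.0):
    """Divide each raw fitness by a niche count of nearby individuals."""
    shared = []
    for individual in population:
        niche_count = 0.0
        for other in population:
            distance = hamming(individual, other)
            if distance < sigma_share:
                # Closer neighbours contribute more; an individual always
                # contributes 1.0 for itself, so the count is never zero.
                niche_count += 1.0 - distance / sigma_share
        shared.append(raw_fitness(individual) / niche_count)
    return shared

# Two identical individuals split their fitness; the loner keeps all of it.
population = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0]]
print(shared_fitnesses(population, lambda ind: 4.0, sigma_share=2.0))
```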

33
More speciation
  • Niching: encouraging the development of different
    ecological niches in the population, using an
    approach such as the spatially restrained
    grid-based algorithm
  • Coevolution: evolving more than one type of
    individual at once, with different species
    attempting as part of their fitness function to
    maintain as much genetic distance from other
    species as possible.
  • Restricted Mating: individuals are only allowed
    to mate if they are in the basin of attraction of
    the same optimum. Once again, this scheme
    attempts to replace like with like in the
    population. There are a number of variants on the
    restricted mating approach

34
Hill Climbing
  • Implicit parallelization: by maintaining a
    population of candidate solutions which are
    modified by mutation and/or crossover, the
    algorithm is, in effect, exploring different
    regions of its search space in parallel
  • The simplest alternative to a population based EA
    is a hill climber, an algorithm which has a
    population of one individual, and performs a
    strictly local search using mutation
  • The parallel nature of an EA provides no
    advantages over multiple random restarts of a
    hill climber in terms of the number of solution
    evaluations performed
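
A hill climber with random restarts is short enough to sketch in full. The version below assumes a bitstring chromosome, single-bit-flip mutation, acceptance of equal-or-better neighbours and a fixed evaluation budget per restart; these choices and the function names are illustrative rather than taken from the lecture.

```python
import random

def hill_climb(fitness, length, evaluations, rng):
    """Local search from a random start using single-bit mutations."""
    current = [rng.randint(0, 1) for _ in range(length)]
    current_fitness = fitness(current)
    for _ in range(evaluations - 1):
        neighbour = current[:]
        i = rng.randrange(length)
        neighbour[i] = 1 - neighbour[i]
        neighbour_fitness = fitness(neighbour)
        if neighbour_fitness >= current_fitness:
            current, current_fitness = neighbour, neighbour_fitness
    return current, current_fitness

def restart_hill_climb(fitness, length, restarts, evaluations_per_restart, seed=0):
    """Run several independent climbs and keep the best result."""
    rng = random.Random(seed)
    runs = [hill_climb(fitness, length, evaluations_per_restart, rng)
            for _ in range(restarts)]
    return max(runs, key=lambda run: run[1])

best, best_fitness = restart_hill_climb(
    sum, length=10, restarts=5, evaluations_per_restart=200, seed=1)
```

The total cost is `restarts * evaluations_per_restart` fitness evaluations, which is the figure to compare against `pop_size * generations` for a population-based EA.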

35
When is an EA better?
  • An EA would be better if the action of the
    genetic operators used in the EA provided
    advantages over local search, which would,
    indeed, be the case if the schema theorem was
    acting as described, with useful partial
    solutions discovered by different individuals
    being recombined to produce fitter individuals
    more rapidly than could be done by mutation
    alone; or
  • if the structure of the fitness landscape was
    such that the implicit memory of a
    population-based algorithm (i.e. the memory
    encoded into the structure of the population
    itself as a result of evolution) allowed it to
    concentrate its search in areas of high fitness
    in a manner that would not be possible for a
    hill climber
  • In practice, hill climbers with multiple restarts
    often perform as efficiently as or better than
    population-based algorithms

37
Coevolution
  • The interactions between two or more species as
    they evolve
  • Kauffman's rubber sheet: evolution by one species
    modifies the fitness landscape for both species;
    the coevolving species is thus given a spur to
    further evolution, as its environment changes
  • Fitness landscape is constantly changing
  • Powerful strategy for avoiding premature
    convergence in evolutionary algorithms: there is
    less chance of the population converging to a
    local optimum, since local (and global) optima
    are constantly forming and dissolving as the
    fitness landscape changes

38
Using coevolution
  • Samuel's checkers players (1963)
  • hill climber, in which two programs played
    against each other
  • In the course of the game one program modified
    its parameter settings, while the other remained
    static
  • If the modified copy won the game, it was
    accepted, otherwise the original was retained
  • eventually played checkers at the level of a
    human champion
  • Fogel (2001): still using evolution to develop
    checkers players (Blondie24)

39
Learning and evolution
  • Neural networks may be evolved: architecture,
    connection weights, or both
  • Baldwin Effect (Baldwin, 1896): learning on the
    part of individuals could guide the course of
    evolution in the population as a whole
  • A particular trait may be learned, or it may be
    innate
  • A learned trait has the advantage of providing
    flexibility, but the disadvantage of being slow
    to acquire; an innate trait is present from
    birth, but inflexible
  • Traits which are initially learned may become,
    over time, encoded into the genotype of the
    population

40
The Baldwin effect
  • Two preconditions must be met
  • The trait in question (which may be a behaviour
    or a physical trait) must be influenced by
    several interacting genes, so that a mutation in
    one of these genes will make the phenotypic
    expression of the trait more likely; and
  • an individual bearing such a mutation can learn
    to express the trait
  • learning acts to provide partial credit for a
    mutation
  • An individual carrying a mutation that
    predisposes it towards an advantageous phenotype
    will learn the trait more easily than its less
    fortunately genetically endowed conspecifics, and
    thus will survive and pass on more copies of that
    allele to the next generation. Over time,
    multiple mutations will accumulate in the genes
    for the desirable trait, which will thus become
    innate in the population

41
Baldwin landscapes
42
Conclusions
  • Evolutionary computation can be used to
  • Model biological evolution
  • Optimize arbitrary functions
  • Evolve artificial intelligence?
  • EC is a very simplified version of real evolution
  • It is important to understand what these
    simplifications involve
  • Can be combined with many of the other approaches
    discussed to date
  • EC can be used to solve problems which are
    otherwise intractable

43
Learning Objectives
At the end of this lecture students will understand
  • What is an evolutionary algorithm?
  • Population-based, adaptive, optimization, not
    necessarily global optimum
  • Effects of evolutionary operators
  • Mutation, crossover, selection, others
  • The course of computational evolution
  • Fitness increase, variability, premature
    convergence, etc
  • Interactions between evolution and
  • Coevolution, evolution and learning