An Introduction to Genetic Algorithms - PowerPoint PPT Presentation

Title: An Introduction to Genetic Algorithms
Description: COMP5318 Data Mining, week 6, semester 1, 2005

Transcript and Presenter's Notes
1
An Introduction to Genetic Algorithms
Lecture 6 COMP4044/5318
  • Josiah Poon

2
Outline of the lecture
  • History of Evolutionary Algorithms
  • Genetic Algorithm (GA)
  • Overview
  • The Basic GA Algorithm
  • Two Simple Examples
  • Why GA works: the Schema Theorem
  • Advantages & Disadvantages
  • GA for Rule Discovery
  • Case Study of Applying GA to ML & DM
  • GABIL

3
Biological Evolution
  • Lamarck and others
  • Species transmute over time
  • Experience of an organism directly affects the
    genetic makeup of its offspring
  • Darwin and Wallace
  • Consistent, heritable variation among individuals
    in population
  • Natural selection of the fittest
  • Mendel and genetics
  • A mechanism for inheriting traits
  • Mapping of genotypes to phenotype

4
The Big Picture (primitive organisms)
[Diagram: a population at time t evolving through selection, reproduction, and mutation]
5
The Big Picture (higher-level organisms)
[Diagram: selection and reproduction with chromosomes and recombination ("sex is good")]
6
The Metaphor
  • Machine Learning
  • Evolution
  • Individual
  • Fitness
  • Environment

7
Taxonomy
Computational Intelligence (Soft Computing)
  • Neural Networks
  • Evolutionary Computation
    • Evolutionary Programming
    • Evolution Strategies
    • Genetic Algorithms
    • Genetic Programming
  • Fuzzy Systems
8
Family of Evolutionary Computation
  • Computational procedures patterned after
    biological evolution
  • A search procedure that probabilistically applies
    search operators to a set of points in the search
    space

9
GA Overview
  • Developed by John Holland
  • Search algorithms based on the mechanics of
    natural evolution: survival of the fittest
  • Ability to create an initial population of
    feasible solutions, then recombine them in such a
    way to direct the path to the most promising
    areas of the search space.
  • Each individual solution is encoded as a
    chromosome (also called a genotype), typically with
    a binary representation; a fitness function is used
    to measure the fitness of the corresponding phenotype.
  • The fitness of a phenotype determines its chances
    of survival.

10
SGA: The Basic Algorithm

t = 0
Generate initial population P_t
Evaluate all individuals in P_t using a fitness function
While not end of evolution:
    t = t + 1
    Reproduce P_t from P_(t-1)
    Perform crossover in P_t
    Perform mutation in P_t
    Evaluate all individuals in P_t using a fitness function
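The loop above can be sketched in Python. This is a minimal illustration, not the lecture's code: the OneMax fitness (count of 1-bits), the parameter values, and the choice of tournament selection are all stand-ins.

```python
import random

random.seed(0)

# Assumed illustrative parameters: chromosome length, population size,
# crossover rate, per-locus mutation rate, number of generations.
L, POP, PC, PM, GENS = 10, 8, 0.75, 0.01, 40

def fitness(chrom):
    # OneMax: a toy fitness function, the number of 1-bits.
    return chrom.count('1')

def select(pop):
    # Tournament selection: the fitter of two random individuals survives.
    a, b = random.choice(pop), random.choice(pop)
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    if random.random() < PC:
        cut = random.randint(1, L - 1)   # single-point crossover
        return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
    return p1, p2

def mutate(chrom):
    # Flip each bit independently with probability PM.
    return ''.join(b if random.random() >= PM else '10'[int(b)] for b in chrom)

pop = [''.join(random.choice('01') for _ in range(L)) for _ in range(POP)]
for _ in range(GENS):
    nxt = []
    while len(nxt) < POP:
        c1, c2 = crossover(select(pop), select(pop))
        nxt += [mutate(c1), mutate(c2)]
    pop = nxt[:POP]

best = max(pop, key=fitness)
```

With strong selection pressure the population quickly converges toward the all-ones string on this toy problem.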
11
Simple GA (continued)
  • Use probabilistic rules to evolve a population
    from one generation to the next
  • Biased (not random) reproduction
  • Crossover
  • Mutation
  • A few parameters to tweak
  • Population size
  • Crossover rate
  • Mutation rate

12
GA Preliminary Considerations
  • Representation
  • Bit string
  • Can be real numbers, integers, characters, list
    of rules, matrices
  • All chromosomes in a population are of the same
    type
  • Choice of alphabets
  • Length of chromosome
  • Chromosomes are of the same length
  • Most GAs use a haploid representation, in
    contrast to humans' diploid representation
  • Fitness Function

13
GA Preliminary Considerations
  • Population Size
  • Remain constant throughout all generations
  • What is the problem with a small or a large
    population?
  • Too small: efficient computation but premature
    convergence, i.e. trapped in a local optimum
  • Too large: greater chance of finding the global
    optimum, but higher computational cost

14
GA Selection for Reproduction
  • Fitness Proportionate Methods
  • Roulette Wheel
  • The classical selection operator for the
    generational GA
  • Each member of the pool is assigned space on a
    roulette wheel proportional to its fitness.
  • The members with the greatest fitness have the
    highest probability of selection.
  • Tournament
  • Randomly choose two individuals; the fitter one
    is kept for the next generation
  • Rank
  • Sort the individuals according to their fitness
  • The selection is based on their ranks
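The three schemes above can be sketched as follows. The function names and the `(pool, fits)` interface are hypothetical, chosen only for illustration.

```python
import random

random.seed(1)

def roulette(pool, fits):
    # Each member gets wheel space proportional to its fitness.
    r = random.uniform(0, sum(fits))
    acc = 0.0
    for ind, f in zip(pool, fits):
        acc += f
        if r <= acc:
            return ind
    return pool[-1]

def tournament(pool, fits):
    # Pick two at random; the fitter one goes to the next generation.
    i, j = random.randrange(len(pool)), random.randrange(len(pool))
    return pool[i] if fits[i] >= fits[j] else pool[j]

def rank_select(pool, fits):
    # Sort by fitness; selection probability depends on rank, not raw fitness.
    order = sorted(range(len(pool)), key=lambda i: fits[i])
    ranks = [order.index(i) + 1 for i in range(len(pool))]  # rank 1 = worst
    r = random.uniform(0, sum(ranks))
    acc = 0.0
    for i, ind in enumerate(pool):
        acc += ranks[i]
        if r <= acc:
            return ind
    return pool[-1]
```

Rank selection is less sensitive than roulette to one individual having a hugely dominant raw fitness.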

15
GA Crossover
  • Mate each chromosome randomly
  • In each mating:
  • randomly select the crossover position(s)
  • the genetic material between the two parents is swapped
  • Various crossover techniques
  • One-point
  • Two-point
  • Uniform

16
GA Crossover Example

Single-point crossover:
  Parent1: 100 | 1001010    Parent2: 001 | 0110111
  Child1:  100 0110111      Child2:  001 1001010

Two-point crossover:
  Parent1: 100 | 1001 | 010    Parent2: 001 | 0110 | 111
  Child1:  100 0110 010        Child2:  001 1001 111

Uniform crossover:
  Parent1:  1001001010    Parent2: 0010110111
  Template: 1001110001
  Child1:   1011000110    Child2:  0000111011
  (where the template bit is 1, Child1 copies Parent1; elsewhere it copies Parent2)
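The three operators in the example can be reproduced with a few lines of Python. For clarity the cut positions and template are passed in explicitly rather than drawn at random, so the outputs match the slide.

```python
def one_point(p1, p2, cut):
    # Swap everything after a single cut position.
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point(p1, p2, c1, c2):
    # Swap the segment between the two cut positions.
    return (p1[:c1] + p2[c1:c2] + p1[c2:],
            p2[:c1] + p1[c1:c2] + p2[c2:])

def uniform(p1, p2, template):
    # Template bit 1: child1 copies parent1; bit 0: child1 copies parent2.
    c1 = ''.join(a if t == '1' else b for a, b, t in zip(p1, p2, template))
    c2 = ''.join(b if t == '1' else a for a, b, t in zip(p1, p2, template))
    return c1, c2
```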
17
GA Mutation
  • Randomly change the allele at a random locus
  • Can search the entire state space if given enough time
  • Restores lost information or adds new information to
    the population
  • Performed on a child after crossover
  • Performed very infrequently; p_m is usually ≤ 0.01

Mutation example:
  Child:          1001001010
  After mutation: 1001101010
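A sketch of bit-flip mutation, assuming string chromosomes as above. `flip` reproduces the slide's example deterministically (one chosen locus); `mutate` applies the usual independent per-locus probability p_m.

```python
import random

def flip(chrom, locus):
    # Deterministically flip the bit at one locus (0-based).
    return chrom[:locus] + '10'[int(chrom[locus])] + chrom[locus + 1:]

def mutate(chrom, pm=0.01, rng=random.Random(0)):
    # Flip each locus independently with small probability pm.
    # (The shared default rng keeps this sketch reproducible.)
    return ''.join(b if rng.random() >= pm else '10'[int(b)] for b in chrom)
```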
18
Generational versus Steady State
  • Generational GA
  • Replace the whole population with new individuals
    in the next generation
  • Steady State GA
  • Keep the old population but replace the k weakest
    individuals by new offspring

19
GA versus Traditional Search Algorithm
[Diagram: transition from state s_t to s_(t+1)]
20
GA versus Traditional Search Algorithm
  • GA works from a population of strings instead of
    a single point.
  • Application of GA operators causes information
    from the previous generation to be carried over
    to the next.
  • GA uses probabilistic transition rules, not
    deterministic rules.

21
The Search Mechanism
  • A search is composed of exploration and
    exploitation
  • The search in GA
  • Exploration by
  • Recombination
  • Mutation
  • Exploitation by
  • Selection

22
An Example
  • f(x) = 4cos(x) + x + 2.5
  • 0 ≤ x ≤ 31
  • Representation: a 5-bit binary string
  • Parameter settings:
  • Population size: 8
  • Crossover rate: 0.75
  • Mutation rate: 0.001
  • Max. generations: 40

23
f(x) when 0 ≤ x ≤ 31
[Plot of f(x); best found = 28.33144]
24
Another Example
  • Same f(x)
  • 0 ≤ x ≤ 2^32 − 1
  • Representation: a 32-bit binary string
  • Parameter settings:
  • Population size: 20
  • Crossover rate: 0.75
  • Mutation rate: 0.001
  • Max. generations: 50

25
f(x) when 0 ≤ x ≤ 2^32 − 1
26
f(x) when 0 ≤ x ≤ 2^32 − 1
[Plot of f(x); best found = 2.0708049813083885E9]
27
Example: f(x) = 4cos(x) + x + 2.5
Selection Using Roulette Wheel
28
Why it works?
  • An abstract way to view the complexities of
    crossover
  • Schema: a string of 0, 1, * (don't care)
  • Consider a 6-bit representation
  • 0***** represents a subset of 32 strings
  • 100*** represents a subset of 8 strings
  • Let H represent a schema such as 1**1**
  • Order o(H): the number of fixed positions in the
    schema H
  • o(1*****) = 1
  • o(100***) = 3
  • Defining length δ(H): the distance between the
    outermost fixed positions in H
  • δ(1**1**) = 4 − 1 = 3
  • δ(1*****) = 0
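The two schema measures defined above can be computed directly from their definitions; the function names here are illustrative.

```python
def order(schema):
    # o(H): number of fixed (non-'*') positions in the schema.
    return sum(1 for c in schema if c != '*')

def defining_length(schema):
    # δ(H): distance between the leftmost and rightmost fixed positions.
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if fixed else 0
```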

29
Why it works? - Consider just the selection (1)
m(s, t): the number of instances of schema s in the
population at time t
30
Why it works? - Consider just the selection (2)
Probability of selecting h in one selection step:
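The equations on these two slides appeared as figures in the original deck; a standard reconstruction (following Mitchell's Machine Learning, Ch. 9, which this lecture draws on) is:

```latex
% Probability that one fitness-proportionate selection step picks
% individual h from a population of n members with mean fitness \bar{f}(t):
\Pr(h) = \frac{f(h)}{\sum_{i=1}^{n} f(h_i)} = \frac{f(h)}{n\,\bar{f}(t)}

% Expected number of instances of schema s after selection alone,
% with \hat{u}(s,t) the mean fitness of the instances of s:
E\bigl[m(s,t+1)\bigr] = \frac{\hat{u}(s,t)}{\bar{f}(t)}\, m(s,t)
```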
31
Why it works? - Schema Theorem
  • m(s,t): number of instances of schema s in the population at time t
  • f̄(t): average fitness of the population at time t
  • û(s,t): average fitness of the instances of s at time t
  • p_c: probability of single-point crossover
  • p_m: probability of mutation
  • l: length of the individual bit strings
  • o(s): number of defined (non-*) bits in s
  • δ(s): distance between the leftmost and rightmost
    defined bits in s
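The theorem itself was shown as a figure in the deck; combining the variables above, the standard statement (as in Mitchell, Ch. 9) is:

```latex
E\bigl[m(s,t+1)\bigr] \;\ge\;
  \frac{\hat{u}(s,t)}{\bar{f}(t)}\, m(s,t)
  \left(1 - p_c \frac{\delta(s)}{l-1}\right)
  (1 - p_m)^{o(s)}
```

Selection multiplies the count by the schema's relative fitness; the two trailing factors bound the probability that crossover and mutation, respectively, leave the schema intact. Short, low-order, above-average schemata therefore receive exponentially increasing numbers of trials.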

32
GA (and other EAs) Advantages
  • A robust search technique
  • Requires no (or little) knowledge of, or
    assumptions about, the problem space
  • Fairly simple to develop; low development costs
  • Easy to incorporate with other methods
  • Solutions are interpretable

33
GA Advantages (continued)
  • Can be run interactively, i.e. accommodate user
    preference
  • Provide many alternative solutions
  • Acceptable performance at acceptable costs on a
    wide range of problems
  • Intrinsic parallelism (robustness, fault
    tolerance)

34
GA Disadvantages
  • No guarantee for optimal solution within a finite
    time
  • Weak theoretical basis
  • Interdependency of genes
  • Parameter tuning is an issue
  • Often computationally expensive, i.e. slow

35
GP: An Example
36
GP - Crossover
37
GA for Rule Discovery
  • Representation
  • How are rules encoded?
  • Rule antecedent
  • Rule consequent
  • Genetic Operators
  • Selection
  • Generalisation/Specialisation
  • Fitness Function

38
GA for Rule Discovery
  • Representation: how are rules encoded?
  • Michigan versus Pittsburgh Approach
  • Michigan each individual encodes a single rule
  • Pittsburgh each individual encodes a set of
    rules

39
GA Application
40
GA for Rule Discovery- Representing the Rule
Consequent
  • Three ways:
  • Encode the predicted class in the genome
  • Associate all individuals of the population with
    the same predicted class
  • Choose the predicted class most suitable for a
    rule
  • Can be the class with the most representatives
  • Or the class that maximizes the individual's fitness

41
GABIL Representation
     a1  a2  c     a1  a2  c
     10  01  1     11  10  0

     a1  a2  c     a1  a2  c
     01  11  0     10  01  0
  • Binary string
  • Conjunctive form with internal disjunction
  • The LHS of each rule consists of a conjunction of
    one or more tests involving feature values
  • A concept is represented as a disjunctive set of
    overlapping classification rules, i.e. in
    Disjunctive Normal Form

42
GA for Rule Discovery- Genetic Operators
  • Selection
  • In the Michigan approach
  • Avoid the convergence to the same single rule
  • Forming niches encouraging the evolution of
    several different rules (each covering a
    different part of the data space)
  • Generalization/Specialization
  • Can be implemented via bitwise logical functions
  • Generalization:
  • subtract a small quantity from a value
  • delete a condition from the antecedent
  • Specialization:
  • add a small quantity to a value
  • add another condition to the antecedent

43
GA for Rule Discovery- Fitness Function (1)
  • The discovered rules should
  • Have high predicted accuracy
  • Be comprehensible
  • Be interesting
  • For a rule A → C, accuracy is measured using the
    confidence factor, |A ∧ C| / |A|
  • But such a simple measure may lead to
    overfitting the data, e.g.
  • a rule that covers just one example in the training set

44
GA for Rule Discovery - Fitness Function (2)

                     Actual class
                     C         not C
Predicted C          TP        FP
Predicted not C      FN        TN

  • Confidence = TP / (TP + FP)
  • Completeness = TP / (TP + FN)
  • Simplicity ∝ 1 / num_conditions_in_antecedent
  • Fitness = w1 × confidence × completeness + w2 × simplicity
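The measures above, written as functions. The weight values w1 and w2 below are arbitrary illustrative choices, not from the lecture.

```python
def confidence(tp, fp):
    # Fraction of examples predicted as the class that really are.
    return tp / (tp + fp)

def completeness(tp, fn):
    # Fraction of the class's examples that the rule covers.
    return tp / (tp + fn)

def simplicity(num_conditions):
    # Inversely proportional to the number of antecedent conditions.
    return 1.0 / num_conditions

def rule_fitness(tp, fp, fn, num_conditions, w1=0.8, w2=0.2):
    # Weighted combination, as on the slide.
    return (w1 * confidence(tp, fp) * completeness(tp, fn)
            + w2 * simplicity(num_conditions))
```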

45
An Example: GABIL (DeJong, 1993)
  • Learns a disjunctive set of propositional rules;
    competitive with C4.5
  • Fitness: Fitness(h) = (correct(h))^2
  • Representation
  • IF a1=T ∧ a2=F THEN c=T;  IF a2=T THEN c=F
    can be represented by:

a1  a2  c        a1  a2  c
10  01  1        11  10  0
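How such a bit-string is interpreted can be sketched as follows. The bit-per-value ordering assumed here (first bit = T, second bit = F) is an illustrative assumption; a set bit means that attribute value is allowed, and the final bit is the predicted class.

```python
VALUES = ['T', 'F']  # assumed bit order within each attribute field

def rule_matches(rule_bits, example):
    # rule_bits: e.g. '10011' = a1:10, a2:01, class:1
    # example:   tuple of attribute values, e.g. ('T', 'F')
    for i, val in enumerate(example):
        field = rule_bits[2 * i: 2 * i + 2]
        if field[VALUES.index(val)] != '1':
            return False
    return True

def predicted_class(rule_bits):
    # The last bit encodes the rule consequent.
    return rule_bits[-1]
```

Under this reading, '10011' matches exactly the examples with a1=T and a2=F, i.e. the first rule on the slide.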
46
GABIL Crossover with Variable-Length Bit-strings

     a1  a2  c     a1  a2  c
h1:  10  01  1     11  10  0
h2:  01  11  0     10  01  0

  • Choose crossover points for h1, e.g. after bits
    1 and 8
  • Restrict the points in h2 to those that produce
    bit-strings with well-defined semantics, e.g.
    <1,3>, <1,8>, <6,8>
  • If we choose <1,3>, the result is

h3:  11  10  0
h4:  00  01  1     11  11  0     10  01  0
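The swap above can be checked mechanically. `gabil_crossover` is a hypothetical helper that exchanges the segments between the chosen cut points, producing children of different lengths.

```python
def gabil_crossover(h1, h2, pts1, pts2):
    # Swap the middle segment between two (possibly different-length)
    # parents; pts are 0-based cut indices ("after bit k" -> index k).
    a, b = pts1
    c, d = pts2
    child1 = h1[:a] + h2[c:d] + h1[b:]
    child2 = h2[:c] + h1[a:b] + h2[d:]
    return child1, child2

h1 = '1001111100'   # 10 01 1  11 10 0
h2 = '0111010010'   # 01 11 0  10 01 0
h3, h4 = gabil_crossover(h1, h2, (1, 8), (1, 3))
```

h3 comes out as one 5-bit rule ('11100' = 11 10 0) and h4 as three rules, matching the slide.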
47
GABIL Extensions
  • New genetic operators (applied probabilistically)
  • AddAlternative: generalize by changing a bit in a
    constraint on a_i from 0 to 1
  • DropCondition: generalize by changing every bit in
    a constraint on a_i to 1

48
GABIL Results
  • Performance of GABIL comparable to symbolic
    rule/tree learning methods: C4.5, ID5R, AQ14
  • Average performance on a set of 12 synthetic
    problems:
  • GABIL without AA and DC: 92.1% accuracy
  • GABIL with AA and DC: 95.2% accuracy
  • Symbolic learning methods ranged from 91.2% to
    96.6%

49
Books
  • T. Mitchell, Machine Learning (Ch.9).
    McGraw-Hill. 1997
  • M. Mitchell, An Introduction to Genetic
    Algorithms, MIT Press, 1996.
  • J. Koza, Genetic Programming, MIT Press, 1992.
  • D.E. Goldberg, Genetic Algorithms in Search,
    Optimization and Machine Learning,
    Addison-Wesley, 1989.
  • T. Bäck. Evolutionary Algorithms in Theory and
    Practice. Oxford University Press, 1996.
  • D.B. Fogel, Evolutionary Computation, IEEE, 1995.
  • Z. Michalewicz, Genetic Algorithms + Data
    Structures = Evolution Programs, Springer, 3rd
    ed., 1996.

50
Papers
  • Freitas, Alex (2002). A Survey of Evolutionary
    Algorithms for Data Mining and Knowledge
    Discovery. In A Ghosh and S Tsutsui, editors,
    Advances in Evolutionary Computation, pages
    819-845. Springer-Verlag, August 2002
  • DeJong, K., Spears, W. and Gordon, D. (1993).
    Using Genetic Algorithms for Concept Learning.
    Machine Learning, 13, pp.161-188.
  • Weiss, G. (1999). Timeweaver: A Genetic Algorithm
    for Identifying Predictive Patterns in Sequences
    of Events. Proceedings of the Genetic and
    Evolutionary Computation Conference (GECCO-99),
    Morgan Kaufmann, San Francisco, CA, 718-725.
  • Kim, Y., Street, W. and Menczer, F. (2003).
    Feature Selection in Data Mining. In Data Mining:
    Opportunities and Challenges, (ed.) John Wang,
    Hershey, PA/London, 2003, pp. 80-105.
  • Mitra, S. and Pal, S. (2000). Data Mining in Soft
    Computing Framework: A Survey. IEEE Trans. on
    Neural Networks, 13(1).