GENETIC ALGORITHMS AND GENETIC PROGRAMMING

1
GENETIC ALGORITHMS AND GENETIC PROGRAMMING
2
  • John R. Koza
  • Consulting Professor (Medical Informatics)
  • Department of Medicine
  • School of Medicine
  • Consulting Professor
  • Department of Electrical Engineering
  • School of Engineering
  • Stanford University
  • Stanford, California 94305
  • koza_at_stanford.edu
  • http://www.smi.stanford.edu/people/koza/

3
DEFINITION OF THE GENETIC ALGORITHM (GA)
  • The genetic algorithm is a probabilistic search
    algorithm that iteratively transforms a set
    (called a population) of mathematical objects
    (typically fixed-length binary character
    strings), each with an associated fitness value,
    into a new population of offspring objects using
    the Darwinian principle of natural selection and
    using operations that are patterned after
    naturally occurring genetic operations, such as
    crossover (sexual recombination) and mutation.

4
GENETIC ALGORITHM (GA)
Generation 0             Generation 1
Individuals   Fitness    Offspring
011           3          111
001           1          010
110           6          110
010           2          010
5
HAMBURGER RESTAURANT PROBLEM
  • Price
  • 1 = $0.50 price
  • 0 = $10.00 price
  • Drink
  • 1 = Coca Cola
  • 0 = Wine
  • Ambiance
  • 1 = Fast snappy service
  • 0 = Leisurely service with tuxedoed waiter

6
CHROMOSOME (GENOME) OF THE GLOBAL OPTIMUM
  • McDONALD's

1 1 1
7
THE SEARCH SPACE
1 000
2 001
3 010
4 011
5 100
6 101
7 110
8 111
  • Alphabet size K = 2, length L = 3
  • Size of search space: K^L = 2^L = 2^3 = 8

8
IMPRACTICALITY OF RANDOM OR ENUMERATIVE SEARCH
  • 81-bit problems are very small for a GA
  • However, even if L is as small as 81, the search
    space has 2^81 ≈ 2.4 × 10^24 points, within a few
    orders of magnitude of the roughly 10^27
    nanoseconds since the beginning of the universe
    15 billion years ago

9
GA FLOWCHART
10
GENERATION 0
     Generation 0
No.  String  Fitness
1    011     3
2    001     1
3    110     6
4    010     2
Total
Worst
Average
Best
11
DEFINITION OF THE GENETIC ALGORITHM (GA)
  • The genetic algorithm is a probabilistic search
    algorithm that iteratively transforms a set
    (called a population) of mathematical objects
    (typically fixed-length binary character
    strings), each with an associated fitness value,
    into a new population of offspring objects using
    the Darwinian principle of natural selection and
    using operations that are patterned after
    naturally occurring genetic operations, such as
    crossover (sexual recombination) and mutation.

12
PROBABILISTIC SELECTION BASED ON FITNESS
  • Better individuals are preferred
  • Best is not always picked
  • Worst is not necessarily excluded
  • Nothing is guaranteed
  • Mixture of greedy exploitation and adventurous
    exploration
  • Similarities to simulated annealing (SA)

13
PROBABILISTIC SELECTION BASED ON FITNESS
14
DARWINIAN FITNESS PROPORTIONATE SELECTION
     Generation 0                 Mating pool
No.  String  Fitness  Fraction    String  Fitness
1    011     3        .25         011     3
2    001     1        .08         110     6
3    110     6        .50         110     6
4    010     2        .17         010     2
Total        12                           17
Worst        1                            2
Average      3.00                         4.25
Best         6                            6
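
A minimal sketch of fitness-proportionate (roulette-wheel) selection for the four-individual population above. It assumes that fitness is the integer value of the bit string, which agrees with the fitness column in the table; the function names and the fixed random seed are illustrative only.

```python
import random

def fitness(bits: str) -> int:
    # Assumption: fitness equals the integer value of the string (011 -> 3, 110 -> 6).
    return int(bits, 2)

def selection_probabilities(population):
    # Each individual's share of the total fitness, e.g. 3/12 = 0.25.
    total = sum(fitness(ind) for ind in population)
    return [fitness(ind) / total for ind in population]

def select(population, rng):
    # One spin of the wheel: fitter strings are more likely, but nothing is guaranteed.
    weights = [fitness(ind) for ind in population]
    return rng.choices(population, weights=weights, k=1)[0]

generation0 = ["011", "001", "110", "010"]
probs = selection_probabilities(generation0)
rng = random.Random(0)  # seed fixed only so that the sketch is repeatable
mating_pool = [select(generation0, rng) for _ in generation0]
print([round(p, 2) for p in probs])
print(mating_pool)
```

Because the picks are random, the mating pool differs from run to run; the best string is not always picked and the worst is not necessarily excluded.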
15
DEFINITION OF THE GENETIC ALGORITHM (GA)
  • The genetic algorithm is a probabilistic search
    algorithm that iteratively transforms a set
    (called a population) of mathematical objects
    (typically fixed-length binary character
    strings), each with an associated fitness value,
    into a new population of offspring objects using
    the Darwinian principle of natural selection and
    using operations that are patterned after
    naturally occurring genetic operations, such as
    crossover (sexual recombination) and mutation.

16
MUTATION OPERATION
  • Parent chosen probabilistically based on fitness
  • Mutation point chosen at random
  • One offspring

Parent
010
Parent
--0
Offspring
011
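
The mutation step can be sketched as a single bit flip at a randomly chosen position. This is an illustrative sketch, not the presenter's code; `mutate_at` and `mutate` are hypothetical names, and positions are counted from 0 on the left.

```python
import random

def mutate_at(parent: str, point: int) -> str:
    # Flip the bit at the given position and leave the rest unchanged.
    flipped = "1" if parent[point] == "0" else "0"
    return parent[:point] + flipped + parent[point + 1:]

def mutate(parent: str, rng: random.Random) -> str:
    # The mutation point is chosen uniformly at random.
    return mutate_at(parent, rng.randrange(len(parent)))

child = mutate_at("010", 2)          # mutate the last bit, as on the slide
random_child = mutate("010", random.Random(0))
print(child, random_child)
```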
17
AFTER MUTATION OPERATION
     Generation 0                 Mating pool       Generation 1
No.  String  Fitness  Fraction    String  Fitness   Point  String  Fitness
1    011     3        .25         011     3
2    001     1        .08         110     6
3    110     6        .50         110     6
4    010     2        .17         010     2         ---    011     3
Total        12                           17
Worst        1                            2
Average      3.00                         4.25
Best         6                            6
18
CROSSOVER OPERATION
  • 2 parents chosen probabilistically based on
    fitness

Parent 1 Parent 2
011 110
19
CROSSOVER (CONTINUED)
  • Interstitial point picked at random
  • 2 remainders
  • 2 offspring produced by crossover

Fragment 1 Fragment 2
01- 11-
Remainder 1 Remainder 2
- - 1 - - 0
Offspring 1 Offspring 2
111 010
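
A short sketch of one-point crossover on fixed-length strings, using the parents from the slides. The function name and argument order are assumptions; the crossover point counts the number of leading characters in each fragment.

```python
import random

def crossover(parent1: str, parent2: str, point: int):
    # Fragments are the leading parts, remainders the trailing parts.
    fragment1, remainder1 = parent1[:point], parent1[point:]
    fragment2, remainder2 = parent2[:point], parent2[point:]
    # Each offspring combines one parent's fragment with the other's remainder.
    return fragment2 + remainder1, fragment1 + remainder2

def random_point(length: int, rng: random.Random) -> int:
    # Interstitial points lie between characters: 1 .. length - 1.
    return rng.randrange(1, length)

offspring1, offspring2 = crossover("011", "110", 2)
print(offspring1, offspring2)
```

With parents 011 and 110 and the point after the second character, the fragments are 01- and 11-, the remainders are --1 and --0, and the offspring are 111 and 010.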
20
AFTER CROSSOVER OPERATION
     Generation 0                 Mating pool       Generation 1
No.  String  Fitness  Fraction    String  Fitness   Point  String  Fitness
1    011     3        .25         011     3         2      111     7
2    001     1        .08         110     6         2      010     2
3    110     6        .50         110     6
4    010     2        .17         010     2
Total        12                           17
Worst        1                            2
Average      3.00                         4.25
Best         6                            6
21
AFTER REPRODUCTION OPERATION
     Generation 0                 Mating pool       Generation 1
No.  String  Fitness  Fraction    String  Fitness   Point  String  Fitness
1    011     3        .25
2    001     1        .08
3    110     6        .50         110     6         ---    110     6
4    010     2        .17
Total        12                           17
Worst        1                            2
Average      3.00                         4.25
Best         6                            6
22
DEFINITION OF THE GENETIC ALGORITHM (GA)
  • The genetic algorithm is a probabilistic search
    algorithm that iteratively transforms a set
    (called a population) of mathematical objects
    (typically fixed-length binary character
    strings), each with an associated fitness value,
    into a new population of offspring objects using
    the Darwinian principle of natural selection and
    using operations that are patterned after
    naturally occurring genetic operations, such as
    crossover (sexual recombination) and mutation.

23
GENERATION 1
     Generation 0                 Mating pool       Generation 1
No.  String  Fitness  Fraction    String  Fitness   Point  String  Fitness
1    011     3        .25         011     3         2      111     7
2    001     1        .08         110     6         2      010     2
3    110     6        .50         110     6         ---    110     6
4    010     2        .17         010     2         ---    011     3
Total        12                           17                       18
Worst        1                            2                        2
Average      3.00                         4.25                     4.50
Best         6                            6                        7
24
DEFINITION OF THE GENETIC ALGORITHM (GA)
  • The genetic algorithm is a probabilistic search
    algorithm that iteratively transforms a set
    (called a population) of mathematical objects
    (typically fixed-length binary character
    strings), each with an associated fitness value,
    into a new population of offspring objects using
    the Darwinian principle of natural selection and
    using operations that are patterned after
    naturally occurring genetic operations, such as
    crossover (sexual recombination) and mutation.

25
DEFINITION OF THE GENETIC ALGORITHM (GA)
  • The genetic algorithm is a probabilistic search
    algorithm that iteratively transforms a set
    (called a population) of mathematical objects
    (typically fixed-length binary character
    strings), each with an associated fitness value,
    into a new population of offspring objects using
    the Darwinian principle of natural selection and
    using operations that are patterned after
    naturally occurring genetic operations, such as
    crossover (sexual recombination) and mutation.

26
PROBABILISTIC STEPS
  • The initial population is typically random
  • Probabilistic selection based on fitness
  • - Best is not always picked
  • - Worst is not necessarily excluded
  • Random picking of mutation and crossover points
  • Often, there is a probabilistic scenario as part
    of the fitness measure
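
The probabilistic steps listed above can be combined into one generational loop. The following is a rough sketch under stated assumptions: fitness is the integer value of the string, and the operation probabilities (`p_crossover`, `p_mutation`), population size, and seed are arbitrary illustrative choices that do not come from the slides.

```python
import random

def fitness(bits):
    # Assumption: fitness is the integer value of the bit string.
    return int(bits, 2)

def select(population, rng):
    # Fitness-proportionate pick; the tiny offset keeps the weights valid
    # even when every individual has fitness 0.
    weights = [fitness(ind) + 1e-9 for ind in population]
    return rng.choices(population, weights=weights, k=1)[0]

def next_generation(population, rng, p_crossover=0.5, p_mutation=0.25):
    offspring = []
    while len(offspring) < len(population):
        r = rng.random()
        room = len(population) - len(offspring)
        if r < p_crossover and room >= 2:
            # Crossover: two parents, random interstitial point, two offspring.
            a, b = select(population, rng), select(population, rng)
            point = rng.randrange(1, len(a))
            offspring += [b[:point] + a[point:], a[:point] + b[point:]]
        elif r < p_crossover + p_mutation:
            # Mutation: one parent, one random bit flipped, one offspring.
            parent = select(population, rng)
            point = rng.randrange(len(parent))
            flipped = "1" if parent[point] == "0" else "0"
            offspring.append(parent[:point] + flipped + parent[point + 1:])
        else:
            # Reproduction: copy the selected parent unchanged.
            offspring.append(select(population, rng))
    return offspring

rng = random.Random(1)
# The initial population is random.
population = ["".join(rng.choice("01") for _ in range(3)) for _ in range(4)]
history = [population]
for _ in range(10):
    population = next_generation(population, rng)
    history.append(population)
print(history[0], history[-1])
```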

27
ANTENNA DESIGN
28
ANTENNA DESIGN
  • The problem (Altshuler and Linden 1998) is to
    determine the x-y-z coordinates of the
    3-dimensional position of the ends (X1, Y1, Z1,
    X2, Y2, Z2, ..., X7, Y7, Z7) of 7 straight wires so
    that the resulting 7-wire antenna satisfies
    certain performance requirements
  • The first wire starts at feed point (0, 0, 0) in
    the middle of the ground plane
  • The antenna must fit inside the 0.5λ cube

29
ANTENNA GENOME
X1     Y1     Z1     X2     Y2     Z2     ...
+0010  -1110  +0001  +0011  -1011  +0011  ...
  • 105-bit chromosome (genome)
  • Each x-y-z coordinate is represented by 5 bits
    (4-bit granularity for data plus a sign bit)
  • Total chromosome is 3 × 7 × 5 = 105 bits
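
A hedged sketch of how such a 105-bit chromosome might be decoded into wire endpoints. The slides give only the layout (a sign bit plus 4 data bits per coordinate); the convention that a leading 1 means negative, the integer units, and the function names are assumptions made for illustration.

```python
def decode_coordinate(gene: str) -> int:
    # Assumption: first bit is the sign (1 = negative), remaining 4 bits are magnitude.
    sign = -1 if gene[0] == "1" else 1
    return sign * int(gene[1:], 2)

def decode_antenna(chromosome: str, wires: int = 7):
    # 3 coordinates per wire end, 5 bits per coordinate.
    if len(chromosome) != wires * 3 * 5:
        raise ValueError("unexpected chromosome length")
    genes = [chromosome[i:i + 5] for i in range(0, len(chromosome), 5)]
    coords = [decode_coordinate(g) for g in genes]
    # Group into (x, y, z) triples, one per wire end.
    return [tuple(coords[i:i + 3]) for i in range(0, len(coords), 3)]

# First wire end encodes +0010, -1110, +0001; the remaining genes are zero filler.
example = "00010" + "11110" + "00001" + "00000" * 18
ends = decode_antenna(example)
print(ends[0])
```

A real decoder would also scale the integers so that every endpoint falls inside the allowed cube; that scaling is not specified in the slides.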

30
ANTENNA FITNESS
  • Antenna is for ground-to-satellite communications
    for cars and handsets
  • We desire near-uniform gain pattern 10° above the
    horizon
  • Fitness is measured based on the antenna's
    radiation pattern. The radiation pattern is
    simulated by Numerical Electromagnetics Code (NEC)

31
ANTENNA FITNESS
  • Fitness is sum of the squares of the difference
    between the average gain and the antenna's gain
  • Sum is taken for angles θ between -90° and +90°
    and all azimuth angles φ from 0° to 180°
  • The smaller the value of fitness, the better
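
The fitness measure described above can be sketched as a sum of squared deviations from the average gain. The gain samples here are made-up numbers; in the application described they would come from the electromagnetic simulation, which is not reproduced.

```python
def antenna_fitness(gains):
    # gains: one simulated gain value per sampled (elevation, azimuth) direction.
    average = sum(gains) / len(gains)
    # Sum of squared differences from the average; 0 means perfectly uniform gain.
    return sum((g - average) ** 2 for g in gains)

uniform = antenna_fitness([2.0, 2.0, 2.0, 2.0])   # hypothetical flat pattern
uneven = antenna_fitness([1.0, 3.0])              # hypothetical uneven pattern
print(uniform, uneven)
```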

32
GRAPH OF ANTENNA FITNESS
33
U. S. PATENT 5,719,794
34
10-MEMBER TRUSS
35
10-MEMBER TRUSS
  • Prespecified topological arrangement of the 10
    members, the load, and the wall (Goldberg and
    Samtani 1986)
  • Truss has 10 members (6 are length of 30 feet and
    4 are length 30√2 ≈ 42.4 feet)
  • The problem is to determine the cross-sectional
    areas (A1, ..., A10) of each of the 10 members so
    as to minimize weight of the material for a truss
    that supports the 2 loads
  • The weight is based on volume (i.e.,
    cross-sectional area × length)

36
TRUSS GENOME
A1    A2    A3    A4    A5    A6    A7    A8    A9    A10
0010  1110  0001  0011  1011  0011  1111  0011  0011  1010
  • 40-bit chromosome (genome)
  • 4-bit granularity for truss diameters
  • 0000 = smallest diameter
  • 1111 = largest diameter
  • Total chromosome is 4 × 10 = 40 bits

37
TRUSS FITNESS
  • Two-part (multiobjective) fitness measure
  • First, fitness is computed by taking the sum,
    over the 10 members, of the cross-sectional area
    of each member times the length of each member
    (30 feet or 30√2 ≈ 42.4 feet).
  • Second, a penalty (up to 10) is imposed for
    violating the stress constraints. Stresses are
    computed using standard mechanical engineering
    techniques.
  • The smaller the total fitness, the better
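
A rough sketch of the first (weight) part of the truss fitness. Several details are assumptions, because the slides do not give them: which four members are the diagonals, the smallest and largest cross-sectional areas (`min_area`, `max_area`), and how the stress penalty is combined with the volume. The penalty is therefore passed in as a number rather than computed.

```python
import math

SHORT = 30.0
LONG = 30.0 * math.sqrt(2)
# Assumption: the last four members are the diagonals.
MEMBER_LENGTHS = [SHORT] * 6 + [LONG] * 4

def decode_areas(chromosome, min_area=0.1, max_area=10.0):
    # Ten 4-bit genes; 0000 maps to the smallest area, 1111 to the largest.
    genes = [chromosome[i:i + 4] for i in range(0, 40, 4)]
    step = (max_area - min_area) / 15
    return [min_area + int(g, 2) * step for g in genes]

def truss_fitness(chromosome, penalty=0.0):
    # Volume = sum of area * length; the caller supplies any stress penalty.
    areas = decode_areas(chromosome)
    volume = sum(a * length for a, length in zip(areas, MEMBER_LENGTHS))
    return volume + penalty

genome = "0010111000010011101100111111001100111010"  # the 40 bits shown on the slide
print(round(truss_fitness(genome), 2))
```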

38
CELLULAR AUTOMATA
39
STATE TRANSITION TABLE
#     WWW  WW  W  X  E  EE  EEE   Rule
0     0    0   0  0  0  0   0     a0
1     0    0   0  0  0  0   1     a1
2     0    0   0  0  0  1   0     a2
3     0    0   0  0  0  1   1     a3
4     0    0   0  0  1  0   0     a4
...
127   1    1   1  1  1  1   1     a127
40
CELLULAR AUTOMATA
A0  A1  A2  ...  A127
a0  a1  a2  ...  a127
  • 128-bit chromosome (genome)
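
A minimal sketch of how a 128-bit chromosome can act as the state transition table of a one-dimensional cellular automaton with a 7-cell neighborhood (WWW, WW, W, X, E, EE, EEE). The wrap-around boundary and the bit ordering (WWW as the most significant bit of the table index) are assumptions of this sketch.

```python
def step(cells, rule):
    # cells: list of 0/1 states; rule: list of 128 output bits a0 .. a127.
    n = len(cells)
    new_cells = []
    for i in range(n):
        index = 0
        # Read the 7 neighbors from WWW (offset -3) to EEE (offset +3).
        for offset in range(-3, 4):
            index = (index << 1) | cells[(i + offset) % n]
        new_cells.append(rule[index])
    return new_cells

# Two hand-made rules for checking the indexing (not evolved rules):
identity_rule = [(i >> 3) & 1 for i in range(128)]  # copy the center cell X
shift_rule = [(i >> 4) & 1 for i in range(128)]     # copy the W neighbor

cells = [1, 0, 0, 0, 0, 0, 0, 0]
print(step(cells, identity_rule))
print(step(cells, shift_rule))
```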

41
PROBLEM-SPECIFIC GENOMES
  • N × M GENOME

1 1 0 1 1 1 1
1 1 1 1 0 1 1
1 0 1 1 1 0 0
1 1 1 1 1 1 1
1 1 0 1 1 1 1
1 1 1 1 0 1 1
1 1 0 1 1 1 0
42
GENETIC ALGORITHM USING VARIABLE-LENGTH STRINGS
  • 5-WIRE ANTENNA (5 × 15 = 75 bits)
  • 4-WIRE ANTENNA (4 × 15 = 60 bits)

X1     Y1     Z1     ...  X5     Y5     Z5
+0010  -1110  +0001  ...  +0010  -1110  +0001
X1     Y1     Z1     ...  X4     Y4     Z4
+1010  -0110  +1101  ...  +1010  -0110  +1001
43
GENETIC PROGRAMMING
44
THE CHALLENGE
  • "How can computers learn to solve problems
    without being explicitly programmed? In other
    words, how can computers be made to do what is
    needed to be done, without being told exactly how
    to do it?"
  • Attributed to Arthur Samuel (1959)

45
CRITERION FOR SUCCESS
  • "The aim is ... to get machines to exhibit
    behavior, which if done by humans, would be
    assumed to involve the use of intelligence."
  • Arthur Samuel (1983)

46
REPRESENTATIONS
  • Binary decision diagrams
  • Formal grammars
  • Coefficients for polynomials
  • Reinforcement learning tables
  • Conceptual clusters
  • Classifier systems
  • Decision trees
  • If-then production rules
  • Horn clauses
  • Neural nets
  • Bayesian networks
  • Frames
  • Propositional logic

47
A COMPUTER PROGRAM
48
GENETIC PROGRAMMING (GP)
  • GP applies the approach of the genetic algorithm
    to the space of possible computer programs
  • Computer programs are the lingua franca for
    expressing the solutions to a wide variety of
    problems
  • A wide variety of seemingly different problems
    from many different fields can be reformulated as
    a search for a computer program to solve the
    problem.

49
GP: MAIN POINTS
  • Genetic programming now routinely delivers
    high-return human-competitive machine
    intelligence.
  • Genetic programming is an automated invention
    machine.
  • Genetic programming has delivered a progression
    of qualitatively more substantial results in
    synchrony with five approximately
    order-of-magnitude increases in the expenditure
    of computer time.

50
DEFINITION OF HIGH-RETURN
  • The AI ratio (the artificial-to-intelligence
    ratio) of a problem-solving method is the ratio
    of that which is delivered by the automated
    operation of the artificial method to the amount
    of intelligence that is supplied by the human
    applying the method to a particular problem

51
DEFINITION OF ROUTINE
  • A problem solving method is routine if it is
    general and relatively little human effort is
    required to get the method to successfully handle
    new problems within a particular domain and to
    successfully handle new problems from a different
    domain.

52
CRITERIA FOR HUMAN-COMPETITIVENESS
  • Previously patented, an improvement over a
    patented invention, or patentable today
  • Publishable in its own right as a new scientific
    result, independent of the fact that the result
    was mechanically created
  • Holds its own in regulated competition against
    humans (or programs)
  • 5 other similar criteria that are arms-length
    from the fields of AI, ML, GP

53
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL
RESULTS PRODUCED BY GP
  • Toy problems
  • Human-competitive non-patent results
  • 20th-century patented inventions
  • 21st-century patented inventions
  • Patentable new inventions

54
GP FLOWCHART
55
A COMPUTER PROGRAM IN C
    int foo (int time)
    {
        int temp1, temp2;
        if (time > 10)
            temp1 = 3;
        else
            temp1 = 4;
        temp2 = temp1 + 1 + 2;
        return (temp2);
    }

56
OUTPUT OF C PROGRAM
Time  Output
0     7
1     7
2     7
3     7
4     7
5     7
6     7
7     7
8     7
9     7
10    7
11    6
12    6
57
PROGRAM TREE
  • (+ 1 2 (IF (> TIME 10) 3 4))
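
A small sketch of how a program tree of this kind can be represented and executed. The tree is written as nested Python tuples; the operators are taken to be + and > (an assumption where the symbols are not legible), and `evaluate` is a hypothetical helper, not part of any GP system.

```python
def evaluate(node, env):
    # Constants evaluate to themselves; names are looked up in the environment.
    if isinstance(node, (int, float)):
        return node
    if isinstance(node, str):
        return env[node]
    op, *args = node
    if op == "IF":
        # Only the chosen branch is evaluated.
        condition, then_branch, else_branch = args
        chosen = then_branch if evaluate(condition, env) else else_branch
        return evaluate(chosen, env)
    values = [evaluate(arg, env) for arg in args]
    if op == "+":
        return sum(values)
    if op == ">":
        return values[0] > values[1]
    raise ValueError(f"unknown function: {op}")

PROGRAM = ("+", 1, 2, ("IF", (">", "TIME", 10), 3, 4))
outputs = {t: evaluate(PROGRAM, {"TIME": t}) for t in range(13)}
print(outputs)
```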

58
CREATING RANDOM PROGRAMS
59
CREATING RANDOM PROGRAMS
  • Available functions
    F = {+, -, *, %, IFLTE}
  • Available terminals
    T = {X, Y, Random-Constants}
  • The random programs are
  • Of different sizes and shapes
  • Syntactically valid
  • Executable
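
One way to create random programs of different sizes and shapes is to grow trees recursively from the function and terminal sets. This is a sketch only: the arithmetic function names (+, -, *, %), their arities, the 0.3 chance of stopping early, and the range of the random constants are assumptions chosen for illustration.

```python
import random

FUNCTIONS = {"+": 2, "-": 2, "*": 2, "%": 2, "IFLTE": 4}  # name -> number of arguments
TERMINALS = ["X", "Y"]

def random_terminal(rng):
    # Either a variable or a random floating-point constant.
    if rng.random() < 1 / 3:
        return round(rng.uniform(-1.0, 1.0), 3)
    return rng.choice(TERMINALS)

def grow(rng, max_depth):
    # Stop at the depth limit, or early with some probability, by picking a terminal.
    if max_depth == 0 or rng.random() < 0.3:
        return random_terminal(rng)
    name = rng.choice(list(FUNCTIONS))
    children = tuple(grow(rng, max_depth - 1) for _ in range(FUNCTIONS[name]))
    return (name,) + children

def depth(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(child) for child in tree[1:])

def is_valid(tree):
    # Syntactically valid: every function has exactly the right number of arguments.
    if not isinstance(tree, tuple):
        return isinstance(tree, float) or tree in TERMINALS
    name, *children = tree
    return FUNCTIONS.get(name) == len(children) and all(is_valid(c) for c in children)

rng = random.Random(7)
population = [grow(rng, 4) for _ in range(4)]
for program in population:
    print(depth(program), program)
```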

60
GP GENETIC OPERATIONS
  • Reproduction
  • Mutation
  • Crossover (sexual recombination)
  • Architecture-altering operations

61
MUTATION OPERATION
62
MUTATION OPERATION
  • Select 1 parent probabilistically based on
    fitness
  • Pick point from 1 to NUMBER-OF-POINTS
  • Delete subtree at the picked point
  • Grow new subtree at the mutation point in same
    way as generated trees for initial random
    population (generation 0)
  • The result is a syntactically valid executable
    program
  • Put the offspring into the next generation of the
    population

63
CROSSOVER OPERATION
64
CROSSOVER OPERATION
  • Select 2 parents probabilistically based on
    fitness
  • Randomly pick a number from 1 to NUMBER-OF-POINTS
    for 1st parent
  • Independently randomly pick a number for 2nd
    parent
  • Identify the subtrees rooted at the two picked
    points
  • Swap the two subtrees to produce 2 offspring
  • Each offspring is a syntactically valid executable
    program
  • Put the offspring into the next generation of the
    population
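
Subtree crossover on program trees can be sketched as follows, with programs held as nested tuples. The helper names (`points`, `get`, `replace`) and the example parents are hypothetical; parent selection is omitted, and the crossover points are picked uniformly over all points of each parent.

```python
import random

def points(tree, path=()):
    # Paths to every point (node) of the tree, root first.
    result = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            result.extend(points(child, path + (i,)))
    return result

def get(tree, path):
    # Subtree rooted at the given point.
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new_subtree):
    # Copy of the tree with the subtree at the given point replaced.
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def crossover(parent1, parent2, rng):
    # Pick one point in each parent independently, then swap the subtrees.
    p1 = rng.choice(points(parent1))
    p2 = rng.choice(points(parent2))
    sub1, sub2 = get(parent1, p1), get(parent2, p2)
    return replace(parent1, p1, sub2), replace(parent2, p2, sub1)

parent1 = ("+", "X", ("*", "X", "X"))
parent2 = ("-", ("+", "X", 1.0), "Y")
child1, child2 = crossover(parent1, parent2, random.Random(3))
print(child1)
print(child2)
```

Because whole subtrees are exchanged, the total number of points in the two offspring equals the total in the two parents.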

65
REPRODUCTION OPERATION
  • Select parent probabilistically based on fitness
  • Copy it (unchanged) into the next generation of
    the population

66
FIVE MAJOR PREPARATORY STEPS FOR GP
  • Determining the set of terminals
  • Determining the set of functions
  • Determining the fitness measure
  • Determining the parameters for the run
  • Determining the method for designating a result
    and the criterion for terminating a run

67
ILLUSTRATIVE GP RUN
68
SYMBOLIC REGRESSION
Independent variable X Dependent variable Y
-1.00 1.00
-0.80 0.84
-0.60 0.76
-0.40 0.76
-0.20 0.84
0.00 1.00
0.20 1.24
0.40 1.56
0.60 1.96
0.80 2.44
1.00 3.00
69
PREPARATORY STEPS
Objective: Find a computer program with one input (independent variable X) whose output equals the given data
1 Terminal set: T = {X, Random-Constants}
2 Function set: F = {+, -, *, %}
3 Fitness: The sum of the absolute values of the differences between the candidate program's output and the given data (computed over numerous values of the independent variable x from -1.0 to +1.0)
4 Parameters: Population size M = 4
5 Termination: An individual emerges whose sum of absolute errors is less than 0.1
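
The fitness measure in step 3 can be sketched as a sum of absolute errors over sample points. The 11 sample points and the target x^2 + x + 1 are read off the data table of the symbolic regression slide; representing candidate programs as plain Python functions is an assumption made to keep the sketch short.

```python
# Sample points -1.0, -0.8, ..., 1.0, as in the data table.
XS = [round(-1.0 + 0.2 * i, 1) for i in range(11)]

def target(x):
    # The function behind the given data: x^2 + x + 1.
    return x * x + x + 1

def fitness(program):
    # Sum of absolute differences between program output and the data; 0 is perfect.
    return sum(abs(program(x) - target(x)) for x in XS)

def is_solution(program, threshold=0.1):
    # Termination criterion from step 5.
    return fitness(program) < threshold

candidate = lambda x: x + 1          # a hypothetical individual that lacks the x^2 term
print(round(fitness(candidate), 2), is_solution(candidate))
print(fitness(target), is_solution(target))
```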
70
SYMBOLIC REGRESSION
  • POPULATION OF 4 RANDOMLY CREATED INDIVIDUALS FOR
    GENERATION 0

71
SYMBOLIC REGRESSION: x^2 + x + 1
  • FITNESS OF THE 4 INDIVIDUALS IN GEN 0

72
SYMBOLIC REGRESSION: x^2 + x + 1
  • GENERATION 1

73
CLASSIFICATION
74
GP TABLEAU: INTERTWINED SPIRALS
Objective: Create a program to classify a given point in the x-y plane to the red or blue spiral
1 Terminal set: T = {X, Y, Random-Constants}
2 Function set: F = {+, -, *, %, IFLTE, SIN, COS}
3 Fitness: The number of correctly classified points (0 to 194)
4 Parameters: M = 10,000, G = 51
5 Termination: An individual program scores 194
75
WALL-FOLLOWER
76
FITNESS
77
BEST OF GENERATION 57
78
BOX MOVER: BEST OF GEN 0
79
BOX MOVER: GEN 45, FITNESS CASE 1
80
TRUCK BACKER UPPER
81
TRUCK BACKER UPPER
  • 4-Dimensional control problem
  • horizontal position, x
  • vertical position, y
  • angle between trailer and horizontal, θt
  • angle between trailer and cab, θd
  • One control variable (steering wheel turn angle)
  • State transition equations map the 4 state
    variables into 1 output (the control variable)
  • Simulation run over many initial conditions and
    over hundreds of time steps

82
GENETIC PROGRAMMING: ON THE PROGRAMMING OF
COMPUTERS BY MEANS OF NATURAL SELECTION (Koza
1992)
83
2 MAIN POINTS FROM 1992 BOOK
  • Virtually all problems in artificial
    intelligence, machine learning, adaptive systems,
    and automated learning can be recast as a search
    for a computer program.
  • Genetic programming provides a way to
    successfully conduct the search for a computer
    program in the space of computer programs.

84
SOME RESULTS FROM 1992 BOOK
  • Intertwined Spirals
  • Truck Backer Upper
  • Broom Balancer
  • Wall Follower
  • Box Mover
  • Artificial Ant
  • Differential Games
  • Inverse Kinematics
  • Central Place Foraging
  • Block Stacking
  • Randomizer
  • Cellular Automata
  • Task Prioritization
  • Image Compression
  • Econometric Equation
  • Optimization
  • Boolean Function Learning
  • Co-Evolution of Game-Playing Strategies

85
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL
RESULTS PRODUCED BY GP
  • Toy problems
  • Human-competitive non-patent results
  • 20th-century patented inventions
  • 21st-century patented inventions
  • Patentable new inventions

86
COMPUTER PROGRAMS
  • Subroutines provide one way to REUSE code,
    possibly with different instantiations of the
    dummy variables (formal parameters)
  • Loops (and iterations) provide a 2nd way to REUSE
    code
  • Recursion provides a 3rd way to REUSE code
  • Memory provides a 4th way to REUSE the results of
    executing code

87
SYMBOLIC REGRESSION
Fitness case L0 W0 H0 L1 W1 H1 Dependent variable D
1 3 4 7 2 5 3 54
2 7 10 9 10 3 1 600
3 10 9 4 8 1 6 312
4 3 9 5 1 6 4 111
5 4 3 2 7 6 1 -18
6 3 3 1 9 5 4 -171
7 5 9 9 1 7 6 363
8 1 2 9 3 9 2 -36
9 2 6 8 2 6 10 -24
10 8 1 10 7 5 1 45
88
EVOLVED SOLUTION
  • (- (* (* W0 L0) H0)
  •    (* (* W1 L1) H1))

89
DIFFERENCE IN VOLUMES
  • D = L0*W0*H0 - L1*W1*H1

90
AUTOMATICALLY DEFINED FUNCTION volume
91
AUTOMATICALLY DEFINED FUNCTION volume
    (progn
      (defun volume (arg0 arg1 arg2)
        (values
          (* arg0 (* arg1 arg2))))
      (values
        (- (volume L0 W0 H0)
           (volume L1 W1 H1))))
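
For comparison, the same two-branch structure can be written in Python and checked against the ten fitness cases of the symbolic regression table. The function names mirror the ones above; the Python rendering itself is only an illustration of the reuse idea, not output of a GP run.

```python
def volume(arg0, arg1, arg2):
    # Function-defining branch: one reusable subroutine with three dummy arguments.
    return arg0 * (arg1 * arg2)

def result(L0, W0, H0, L1, W1, H1):
    # Result-producing branch: calls the subroutine twice with different arguments.
    return volume(L0, W0, H0) - volume(L1, W1, H1)

# Fitness cases (L0, W0, H0, L1, W1, H1, D) copied from the table.
CASES = [
    (3, 4, 7, 2, 5, 3, 54),
    (7, 10, 9, 10, 3, 1, 600),
    (10, 9, 4, 8, 1, 6, 312),
    (3, 9, 5, 1, 6, 4, 111),
    (4, 3, 2, 7, 6, 1, -18),
    (3, 3, 1, 9, 5, 4, -171),
    (5, 9, 9, 1, 7, 6, 363),
    (1, 2, 9, 3, 9, 2, -36),
    (2, 6, 8, 2, 6, 10, -24),
    (8, 1, 10, 7, 5, 1, 45),
]
hits = sum(result(*case[:6]) == case[6] for case in CASES)
print(hits, "of", len(CASES), "fitness cases matched")
```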

92
AUTOMATICALLY DEFINED FUNCTIONS
  • ADFs provide a way to REUSE code
  • Code is typically reused with different
    instantiations of the dummy variables (formal
    parameters)

93
ADDITION OF V0 AND V1
Fitness case L0 W0 H0 L1 W1 H1 V0 V1 D
1 3 4 7 2 5 3 84 30 54
2 7 10 9 10 3 1 630 30 600
3 10 9 4 8 1 6 360 48 312
4 3 9 5 1 6 4 135 24 111
5 4 3 2 7 6 1 24 42 -18
6 3 3 1 9 5 4 9 180 -171
7 5 9 9 1 7 6 405 42 363
8 1 2 9 3 9 2 18 54 -36
9 2 6 8 2 6 10 96 120 -24
10 8 1 10 7 5 1 80 35 45
94
DIVIDE AND CONQUER
95
DIVIDE AND CONQUER
  • Decompose a problem into sub-problems
  • Solve the sub-problems
  • Assemble the solutions of the sub-problems into a
    solution for the overall problem

96
CHANGE OF REPRESENTATION
97
CHANGE OF REPRESENTATION
  • Identify regularities
  • Change the representation
  • Solve the overall problem

98
ADF IMPLEMENTATION
  • Each overall program in population includes
  • a main result-producing branch (RPB) and
  • function-defining branch (i.e., automatically
    defined function, ADF)
  • In generation 0, create random programs with
    different ingredients for the RPB and the ADF
  • Terminal set for ADF typically contains dummy
    arguments (formal parameters), such as ARG0,
    ARG1, ...
  • Function set of the RPB contains ADF0
  • ADFs are private and associated with a particular
    individual program in the population

99
ADF MUTATION
  • Select parent probabilistically on the basis of
    fitness
  • Pick a mutation point from either RPB or an ADF
  • Delete sub-tree rooted at the picked point
  • Grow a new sub-tree at the picked point composed
    of the allowable ingredients appropriate for the
    picked point
  • The offspring is a syntactically valid executable
    program

100
ADF CROSSOVER
  • Select 2 parents probabilistically on the basis of
    fitness
  • Pick a crossover point from either RPB or an ADF
    of the FIRST parent
  • The choice of crossover point in the SECOND
    parent is RESTRICTED to the picked RPB or to the
    picked ADF
  • The sub-trees are swapped
  • The offspring are syntactically valid executable
    programs

101
GENETIC PROGRAMMING II: AUTOMATIC DISCOVERY OF
REUSABLE PROGRAMS (Koza 1994)
102
MAIN POINTS OF 1994 BOOK
  • Scalability is essential for solving non-trivial
    problems in artificial intelligence, machine
    learning, adaptive systems, and automated
    learning
  • Scalability can be achieved by reuse
  • Genetic programming provides a way to
    automatically discover and reuse subprograms in
    the course of automatically creating computer
    programs to solve problems

103
COMPUTER PROGRAMS
  • Subroutines provide one way to REUSE code,
    possibly with different instantiations of the
    dummy variables (formal parameters)
  • Loops (and iterations) provide a 2nd way to REUSE
    code
  • Recursion provides a 3rd way to REUSE code
  • Memory provides a 4th way to REUSE the results of
    executing code

104
MEMORY
  • Settable (named) variables
  • Indexed vector memory
  • Matrix memory
  • Relational memory
105
LANGDON'S DATA STRUCTURES
  • Stacks
  • Queues
  • Lists
  • Rings

106
COMPUTER PROGRAMS
  • Subroutines provide one way to REUSE code,
    possibly with different instantiations of the
    dummy variables (formal parameters)
  • Loops (and iterations) provide a 2nd way to REUSE
    code
  • Recursion provides a 3rd way to REUSE code
  • Memory provides a 4th way to REUSE the results of
    executing code

107
AUTOMATICALLY DEFINED ITERATION (ADI)
  • The overall program includes an
    iteration-performing branch (IPB) in addition to
    a result-producing branch (RPB) and
    function-defining branches (ADF)
  • There are no infinite loops because the iteration
    is performed over a known, fixed set
  • protein or DNA sequence (of varying length)
  • time-series data
  • two-dimensional array of pixels
  • Memory is usually involved and is used to
    communicate between IPB, RPB, and ADF

108
TRANSMEMBRANE SEGMENT IDENTIFICATION PROBLEM
  • Goal is to classify a given protein segment as
    being a transmembrane domain or non-transmembrane
    area of the protein

109
TRANSMEMBRANE SEGMENT IDENTIFICATION PROBLEM
  • (progn
  • (defun ADF0 ()
  • (ORN (ORN (ORN (I?) (H?)) (ORN (P?) (G?))) (ORN
    (ORN (ORN (Y?) (N?)) (ORN (T?) (Q?))) (ORN (A?)
    (H?))))))
  • (defun ADF1 ()
  • (values (ORN (ORN (ORN (A?) (I?)) (ORN (L?)
    (W?))) (ORN (ORN (T?) (L?)) (ORN (T?) (W?))))))
  • (defun ADF2 ()
  • (values (ORN (ORN (ORN (ORN (ORN (D?) (E?)) (ORN
    (ORN (ORN (D?) (E?)) (ORN (ORN (T?) (W?)) (ORN
    (Q?) (D?)))) (ORN (K?) (P?)))) (ORN (K?) (P?)))
    (ORN (T?) (W?))) (ORN (ORN (E?) (A?)) (ORN (N?)
    (R?))))))
  • (progn (loop-over-residues (SETM0 ( (-
    (ADF1) (ADF2)) (SETM3 M0))))
  • (values ( ( M3 M0) ( ( ( (- L -0.53) ( M0
    M0)) ( ( ( M3 M0) ( ( M0 M3) ( M1 M2)))
    M2)) ( M3 M0))))))

110
TRANSMEMBRANE SEGMENT IDENTIFICATION PROBLEM
  • in-sample correlation of 0.976
  • out-of-sample correlation of 0.968
  • out-of-sample error rate 1.6%

111
AUTOMATICALLY DEFINED LOOP (ADL)
  • loop initialization branch, LIB
  • loop condition branch, LCB
  • loop body branch, LBB
  • loop update branch, LUB

112
ADL
113
COMPUTER PROGRAMS
  • Subroutines provide one way to REUSE code,
    possibly with different instantiations of the
    dummy variables (formal parameters)
  • Loops (and iterations) provide a 2nd way to REUSE
    code
  • Recursion provides a 3rd way to REUSE code
  • Memory provides a 4th way to REUSE the results of
    executing code

114
AUTOMATICALLY DEFINED RECURSION (ADR)
  • recursion condition branch, RCB
  • recursion body branch, RBB
  • recursion update branch, RUB
  • recursion ground branch, RGB

115
ADR
116
HUMAN-COMPETITIVE RESULTS (NOT RELATED TO PATENTS)
  • Transmembrane segment identification problem for proteins
  • Motifs for DEAD box family and manganese superoxide dismutase family of proteins
  • Cellular automata rule for Gacs-Kurdyumov-Levin (GKL) problem
  • Quantum algorithm for the Deutsch-Jozsa early promise problem
  • Quantum algorithm for Grover's database search problem
  • Quantum algorithm for the depth-two AND/OR query problem
  • Quantum algorithm for the depth-one OR query problem
  • Protocol for communicating information through a quantum gate
  • Quantum dense coding
  • Soccer-playing program that won its first two games in the 1997 RoboCup competition
  • Soccer-playing program that ranked in the middle of field in 1998 RoboCup competition
  • Antenna designed by NASA for use on spacecraft
  • Sallen-Key filter
117
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL
RESULTS PRODUCED BY GP
  • Toy problems
  • Human-competitive non-patent results
  • 20th-century patented inventions
  • 21st-century patented inventions
  • Patentable new inventions

118
GENETIC PROGRAMMING III: DARWINIAN INVENTION AND
PROBLEM SOLVING (Koza, Bennett, Andre, Keane 1999)
119
SUBROUTINE DUPLICATION
120
SUBROUTINE CREATION
121
SUBROUTINE DELETION
122
ARGUMENT DUPLICATION
123
ARGUMENT DELETION
124
16 ATTRIBUTES OF A SYSTEM FOR AUTOMATICALLY
CREATING COMPUTER PROGRAMS
  • Starts with "What needs to be done"
  • Tells us "How to do it"
  • Produces a computer program
  • Automatic determination of program size
  • Code reuse
  • Parameterized reuse
  • Internal storage
  • Iterations, loops, and recursions
  • Self-organization of hierarchies
  • Automatic determination of program architecture
  • Wide range of programming constructs
  • Well-defined
  • Problem-independent
  • Wide applicability
  • Scalable
  • Competitive with human-produced results

125
GENETIC PROGRAMMING PROBLEM SOLVER (GPPS)
126
AUTOMATIC SYNTHESIS OF BOTH THE TOPOLOGY AND
SIZING OF ANALOG ELECTRICAL CIRCUITS BY MEANS OF
DEVELOPMENTAL GENETIC PROGRAMMING
127
AUTOMATED CIRCUIT SYNTHESIS
  • The topology of a circuit includes specifying the
    gross number of components in the circuit, the
    type of each component (e.g., a capacitor), and a
    netlist specifying where each lead of each
    component is to be connected.
  • Sizing involves specifying the values (typically
    numerical) of each of the circuit's components.

128
COMPONENT-CREATING FUNCTIONS
  • Resistor R function
  • Capacitor C function
  • Inductor L function
  • Diode D function
  • Transistor Q function (3-leaded)

129
COMPONENT-CREATING FUNCTIONS
130
TOPOLOGY-MODIFYING FUNCTIONS
  • SERIES division
  • PARALLEL division
  • VIA
  • FLIP

131
TOPOLOGY-MODIFYING FUNCTIONS
132
DEVELOPMENT-CONTROLLING FUNCTIONS
  • END function
  • NOP (No Operation) function
  • SAFE_CUT function

133
THE INITIAL CIRCUIT
134
DEVELOPMENTAL GP
  • (LIST (C ( 0.963 ( ( -0.875 -0.113) 0.880))
    (series (flip end) (series (flip end) (L -0.277
    end) end) (L ( -0.640 0.749) (L -0.123 end))))
    (flip (nop (L -0.657 end)))))

135
CAPACITOR-CREATING FUNCTION
  • (LIST (C ( 0.963 ( ( -0.875 -0.113) 0.880))
    (series (flip end) (series (flip end) (L -0.277
    end) end) (L ( -0.640 0.749) (L -0.123 end))))
    (flip (nop (L -0.657 end)))))

136
CAPACITOR-CREATING FUNCTION
137
SERIES DIVISION FUNCTION
  • (LIST (C ( 0.963 ( ( -0.875 -0.113) 0.880))
    (series (flip end) (series (flip end) (L -0.277
    end) end) (L ( -0.640 0.749) (L -0.123 end))))
    (flip (nop (L -0.657 end)))))

138
SERIES DIVISION
139
DEVELOPMENTAL GP
140
EVALUATION OF FITNESS
141
DESIRED BEHAVIOR OF A LOWPASS FILTER
142
EVOLVED CAMPBELL FILTER
  • U. S. patent 1,227,113
  • George Campbell
  • American Telephone and Telegraph
  • 1917

143
EVOLVED ZOBEL FILTER
  • U. S. patent 1,538,964
  • Otto Zobel
  • American Telephone and Telegraph Company
  • 1925

144
EVOLVED SALLEN-KEY FILTER
145
EVOLVED DARLINGTON EMITTER-FOLLOWER SECTION
  • U. S. patent 2,663,806
  • Sidney Darlington
  • Bell Telephone Laboratories
  • 1953
146
NEGATIVE FEEDBACK
147
HAROLD BLACK'S RIDE ON THE LACKAWANNA FERRY
Courtesy of Lucent Technologies
148
20th-CENTURY PATENTS
Campbell ladder topology for filters
Zobel M-derived half section and constant K filter sections
Crossover filter
Negative feedback
Cauer (elliptic) topology for filters
PID and PID-D2 controllers
Darlington emitter-follower section and voltage gain stage
Sorting network for seven items using only 16 steps
60 and 96 decibel amplifiers
Analog computational circuits
Real-time analog circuit for time-optimal robot control
Electronic thermometer
Voltage reference circuit
Philbrick circuit
NAND circuit
Simultaneous synthesis of topology, sizing, placement, and routing
149
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL
RESULTS PRODUCED BY GP
  • Toy problems
  • Human-competitive non-patent results
  • 20th-century patented inventions
  • 21st-century patented inventions
  • Patentable new inventions

150
SIX POST-2000 PATENTED INVENTIONS
151
EVOLVED HIGH CURRENT LOAD CIRCUIT
152
REGISTER-CONTROLLED CAPACITOR CIRCUIT
153
LOW-VOLTAGE CUBIC CIRCUIT
154
VOLTAGE-CURRENT-CONVERSION CIRCUIT
155
LOW-VOLTAGE BALUN CIRCUIT
156
TUNABLE INTEGRATED ACTIVE FILTER
157
21st-CENTURY PATENTED INVENTIONS
Low-voltage balun circuit
Mixed analog-digital variable capacitor circuit
High-current load circuit
Voltage-current conversion circuit
Cubic function generator
Tunable integrated active filter
158
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL
RESULTS PRODUCED BY GP
  • Toy problems
  • Human-competitive non-patent results
  • 20th-century patented inventions
  • 21st-century patented inventions
  • Patentable new inventions

159
NOVELTY-DRIVEN EVOLUTION
  • Two factors in fitness measure
  • Circuit's behavior in the frequency domain
  • Largest number of nodes and edges (circuit
    components) of a subgraph of the given circuit
    that is isomorphic to a subgraph of a template
    representing the prior art. Graph isomorphism
    algorithm with the cost function being based on
    the number of shared nodes and edges (instead of
    just the number of nodes).

160
NOVELTY-DRIVEN EVOLUTION
  • For circuits not scoring the maximum number of
    hits (101), the fitness of a circuit is the
    product of the two factors.
  • For circuits scoring 101 hits (100%-compliant
    individuals), fitness is the number of shared
    nodes and edges divided by 10,000.

161
PRIOR ART TEMPLATE
162
NON-INFRINGING SOLUTION NO. 1
163
NON-INFRINGING SOLUTION NO. 5
164
GP AS AN INVENTION MACHINE
165
CIRCUIT-CONSTRUCTING PROGRAM TREE WITH ADFs
166
LOWPASS FILTER WITH ADFs
167
ADF0
168
AUTOMATIC SYNTHESIS OF CIRCUIT LAYOUT, INCLUDING
THE PLACEMENT OF COMPONENTS AND ROUTING OF WIRES
ALONG WITH THE TOPOLOGY AND SIZING
169
CIRCUIT LAYOUT
  • Circuit placement involves the assignment of each
    of the circuit's components to a particular
    physical location on a printed circuit board or
    silicon wafer.
  • Routing involves the assignment of a particular
    physical location to the wires between the leads
    of the circuit's components.

170
LAYOUT
171
LAYOUT GENERATION 0
172
100%-COMPLIANT LOWPASS FILTER, GENERATION 25,
WITH 5 CAPACITORS AND 11 INDUCTORS: AREA OF
1775.2
173
100%-COMPLIANT LOWPASS FILTER, GENERATION 30, WITH
10 INDUCTORS AND 5 CAPACITORS: AREA OF 950.3
174
100%-COMPLIANT LOWPASS FILTER, BEST-OF-RUN
CIRCUIT OF GENERATION 138, WITH 4 INDUCTORS AND 4
CAPACITORS: AREA OF 359.4
175
LAYOUT: 60 DB AMPLIFIER
176
AUTOMATIC SYNTHESIS OF BOTH THE TOPOLOGY AND
TUNING OF CONTROLLERS
177
PROGRAM TREE FOR A CONTROLLER
178
CONTROLLER BLOCKS
  • gain
  • integrator
  • differentiator
  • adder
  • subtractor
  • multiplier
  • differential input integrators
  • inverter
  • lead
  • lag
  • two-parameter lag
  • absolute value
  • limiter
  • divider
  • delay
  • conditional operators (switches)

179
FUNCTION SET FOR CONTROLLER SYNTHESIS
  • F = {GAIN, INVERTER, LEAD, LAG, LAG2,
    DIFFERENTIAL_INPUT_INTEGRATOR, DIFFERENTIATOR,
    ADD_SIGNAL, SUB_SIGNAL, ADD_3_SIGNAL, ADF0,
    ADF1, ADF2, ADF3, ADF4}

180
TERMINAL SET FOR CONTROLLER SYNTHESIS
  • T = {REFERENCE_SIGNAL, CONTROLLER_OUTPUT,
    PLANT_OUTPUT}

181
CONSTRAINED SYNTACTIC STRUCTURE
  • A grammar that specifies what functions and
    terminals are allowed as arguments to particular
    functions
  • For example, the first argument of the GAIN
    function must be a value-setting subtree whereas
    the second can be from the general pool of
    functions
  • Also known as strong typing
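A constrained syntactic structure can be sketched as a table of argument types that every program tree is checked against. Only the GAIN rule (value first, signal second) comes from the slide. The other signatures, the tuple encoding of trees, and the use of bare numbers as value-setting leaves are assumptions made for illustration.

```python
# Return type and required argument types for each node kind.
# Only GAIN's signature is taken from the slide; the rest are illustrative.
SIGNATURES = {
    "GAIN": ("signal", ["value", "signal"]),
    "INVERTER": ("signal", ["signal"]),
    "ADD_SIGNAL": ("signal", ["signal", "signal"]),
    "REFERENCE_SIGNAL": ("signal", []),
    "PLANT_OUTPUT": ("signal", []),
}


def node_type(node):
    """Type produced by a node; bare numbers act as value-setting leaves."""
    if isinstance(node, (int, float)):
        return "value"
    return SIGNATURES[node[0]][0]


def is_valid(node) -> bool:
    """Check that every function receives arguments of the required types."""
    if isinstance(node, (int, float)):
        return True
    if not isinstance(node, tuple) or not node or node[0] not in SIGNATURES:
        return False
    name, *args = node
    arg_types = SIGNATURES[name][1]
    if len(args) != len(arg_types):
        return False
    return all(is_valid(a) and node_type(a) == t
               for a, t in zip(args, arg_types))


good = ("GAIN", 2.5, ("ADD_SIGNAL", ("REFERENCE_SIGNAL",), ("PLANT_OUTPUT",)))
bad = ("GAIN", ("REFERENCE_SIGNAL",), 2.5)  # arguments in the wrong order
print(is_valid(good), is_valid(bad))
```

In a full system the same table would also guide tree creation, crossover, and mutation, so that invalid trees are never produced in the first place.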

182
TWO-LAG PLANT
183
8 FITNESS CASES
  • The 8 elements of the fitness measure represent
    2 × 2 × 2 choices
  • 2 different values of the plant's internal gain,
    K (1.0 and 2.0),
  • 2 different values of the plant's time constant,
    τ (0.5 and 1.0),
  • 2 different values for the height of the
    reference signal (rising from 0 to 1 volt or
    from 0 to 1 microvolt at t = 100 milliseconds)
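The 8 fitness cases are the Cartesian product of the three two-valued choices listed above. A minimal sketch, with hypothetical dictionary keys:

```python
from itertools import product

PLANT_GAINS = (1.0, 2.0)           # plant's internal gain K
TIME_CONSTANTS = (0.5, 1.0)        # plant's time constant
STEP_HEIGHTS_VOLTS = (1.0, 1e-6)   # reference step: 1 volt or 1 microvolt

# One dictionary per fitness case; the key names are illustrative.
fitness_cases = [
    {"K": k, "time_constant": tc, "step_height": h}
    for k, tc, h in product(PLANT_GAINS, TIME_CONSTANTS, STEP_HEIGHTS_VOLTS)
]

for case in fitness_cases:
    print(case)
```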

184
FITNESS MEASURE
  • For each of these 8 fitness cases, a transient
    analysis (time domain) is performed using the
    SPICE simulator.
  • The contribution to fitness for each of the 8
    elements is the integral of time-weighted
    absolute error (ITAE)
  • e(t) is the difference between the plant output
    and the reference signal.
  • Multiplication by B (10^6 or 1) makes both
    reference signals equally influential.
  • An additional weighting function, A, heavily
    penalizes non-compliant amounts of overshoot. A
    weights all variations up to 2% above the
    reference signal by 1.0, but bigger variations by
    10.0.
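Read together, the bullets above describe an integrand of the form t · |e(t)| · A(e(t)) · B. The sketch below evaluates it numerically on sampled signals with the trapezoidal rule. It is an illustration under stated assumptions, not the SPICE-based procedure from the slides: the function name, the sampling, and the reading of "2% above the reference" as 2% of the step height are all assumptions.

```python
def weighted_itae(times, plant_output, reference, step_height, b):
    """Trapezoidal estimate of the integral of t * |e(t)| * A(e(t)) * B.

    step_height: height of the reference step (1.0 or 1e-6 volts).
    b: scale factor B (1 for the 1-volt step, 1e6 for the 1-microvolt step).
    """
    def integrand(t, y, r):
        e = y - r  # plant output minus reference signal
        # A: overshoot beyond 2% of the step height is weighted 10x.
        a = 10.0 if e > 0.02 * step_height else 1.0
        return t * abs(e) * a * b

    values = [integrand(t, y, r)
              for t, y, r in zip(times, plant_output, reference)]
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (values[i] + values[i - 1]) * (times[i] - times[i - 1])
    return total


times = [0.0, 1.0, 2.0]
undershoot = weighted_itae(times, [0.5, 0.5, 0.5], [1.0, 1.0, 1.0], 1.0, 1.0)
overshoot = weighted_itae(times, [1.5, 1.5, 1.5], [1.0, 1.0, 1.0], 1.0, 1.0)
print(undershoot, overshoot)
```

The same absolute error costs ten times more when it is an overshoot, and multiplying by B puts the microvolt case on the same scale as the volt case.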

185
EVOLVED CONTROLLER FOR TWO-LAG PLANT
186
LESS ITAE AND OVERSHOOT
187
BETTER DISTURBANCE REJECTION
188
REVERSE ENGINEERING OF METABOLIC PATHWAYS
189
EVOLVED PATHWAY
190
ANTENNA SYNTHESIS USING GP
  • (PROGN3
  • (TURN-RIGHT 0.125)
  • (LANDMARK
  • (REPEAT 2
  • (PROGN2
  • (DRAW 1.0 HALF-MM-WIRE)
  • (DRAW 0.5 NO-WIRE)))
  • (TRANSLATE-RIGHT 0.125 0.75))

191
USING A TURTLE TO DRAW TWO-DIMENSIONAL ANTENNA
192
BEST-OF-RUN ANTENNA FROM GENERATION 90
193
3-DIMENSIONAL ANTENNA
194
NASA EVOLVED ANTENNA
  • To be on satellite to be launched in 2004

195
OTHER STRUCTURES
196
GENETIC NETWORK FOR lac operon
197
EVOLVED NETWORK
  • (IF (lt LACTOSE_LEVEL 9.139 ) (IF (lt
  • REPRESSOR_LEVEL 6.270 ) (IF (gt GLUCOSE_LEVEL
  • 5.491 ) 2.02 (IF (lt CAP_LEVEL 0.639 ) 2.033 (IF
  • (lt CAP_LEVEL 4.858 ) (IF (gt LACTOSE_LEVEL 2.511 )
  • (IF (gt CAP_LEVEL 7.807 ) 5.586 (IF (gt
  • LACTOSE_LEVEL 2.114 ) 1.978 2.137 ) ) 0.0 ) (IF
  • (gt REPRESSOR_LEVEL 4.015 ) 0.036 (IF (lt
  • GLUCOSE_LEVEL 5.128 ) 10.0 (IF (lt REPRESSOR_LEVEL
  • 4.268 ) 2.022 9.122 ) ) ) ) ) ) (IF (gt CAP_LEVEL
  • 0.842 ) 0.0 5.97 ) ) (IF (lt CAP_LEVEL 1.769 )
  • 2.022 (IF (lt GLUCOSE_LEVEL 2.382 ) (IF (gt
  • LACTOSE_LEVEL 1.256 ) (IF (gt LACTOSE_LEVEL 1.933
  • ) (IF (gt GLUCOSE_LEVEL 2.022 ) (IF (lt
  • GLUCOSE_LEVEL 5.183 ) 6.323 (IF (gt CAP_LEVEL
  • 1.208 ) 9.713 0.842 ) ) 10.0 ) (IF (gt
  • GLUCOSE_LEVEL 6.270 ) 2.109 ) 1.965 ) ) 0.665 )
  • 1.982 ) ) )

198
IN C-STYLE PSEUDO CODE
  • if (LACTOSE_LEVEL < 9.139)
  •   if (REPRESSOR_LEVEL < 6.270)
  •     LAC_mRNA_LEVEL = 2.022
  •   else
  •     LAC_mRNA_LEVEL = 0.0
  • else
  •   if (CAP_LEVEL < 1.769)
  •     LAC_mRNA_LEVEL = 2.022
  •   else
  •     if (GLUCOSE_LEVEL < 2.382)
  •       LAC_mRNA_LEVEL = 10.0
  •     else
  •       LAC_mRNA_LEVEL = 1.982
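The pseudo code on this slide can be written as a small runnable function. The branch nesting used here is an assumption: it follows the outermost tests of the evolved S-expression on the preceding slide (LACTOSE_LEVEL first, then REPRESSOR_LEVEL or CAP_LEVEL, then GLUCOSE_LEVEL). The function and parameter names are illustrative.

```python
def lac_mrna_level(lactose: float, repressor: float,
                   cap: float, glucose: float) -> float:
    """Simplified decision tree for the lac mRNA level (nesting assumed)."""
    if lactose < 9.139:
        return 2.022 if repressor < 6.270 else 0.0
    if cap < 1.769:
        return 2.022
    return 10.0 if glucose < 2.382 else 1.982


print(lac_mrna_level(lactose=9.5, repressor=0.0, cap=2.0, glucose=1.0))
```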

199
PARAMETERIZED TOPOLOGIES
  • One of the most important characteristics of
    computer programs is that they ordinarily contain
    inputs (free variables) and conditional operations

200
PARAMETERIZED TOPOLOGY FOR LOWPASS FILTER
201
PARAMETERIZED TOPOLOGY FOR HIGHPASS FILTER
202
PARAMETERIZED TOPOLOGY FOR GENERAL-PURPOSE
CONTROLLER
203
EVOLVED EQUATIONS FOR GENERAL-PURPOSE CONTROLLER
204
EVOLVED EQUATIONS FOR GENERAL-PURPOSE CONTROLLER
205
PATENTABLE NEW INVENTIONS
PID tuning rules that outperform the Ziegler-Nichols and Åström-Hägglund tuning rules
General-purpose controllers outperforming Ziegler-Nichols and Åström-Hägglund rules


206
PROGRESSION OF QUALITATIVELY MORE SUBSTANTIAL
RESULTS PRODUCED BY GP
  • Toy problems
  • Human-competitive non-patent results
  • 20th-century patented inventions
  • 21st-century patented inventions
  • Patentable new inventions

207
PARALLELIZATION WITH SEMI-ISOLATED SUBPOPULATIONS
208
GP PARALLELIZATION
  • Like Hormel, Get Everything Out of the Pig,
    Including the Oink
  • Keep on Trucking
  • It Takes a Licking and Keeps on Ticking
  • The Whole is Greater than the Sum of the Parts

209
PETA-OPS
  • Human brain: 10^12 neurons, each operating at
    10^3 operations per second = 10^15 ops per second
  • 10^15 ops = 1 peta-op = 1 bs (brain second)
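The brain-second arithmetic can be checked directly. The variable names are illustrative; the figures are the order-of-magnitude estimates from the slide.

```python
NEURONS = 10**12                     # neurons in the human brain (slide's estimate)
OPS_PER_NEURON_PER_SECOND = 10**3    # operations per neuron per second

brain_ops_per_second = NEURONS * OPS_PER_NEURON_PER_SECOND
PETA_OP = 10**15                     # one peta-op

# Number of brain seconds (bs) represented by one peta-op.
brain_seconds_per_peta_op = PETA_OP // brain_ops_per_second
print(brain_ops_per_second, brain_seconds_per_peta_op)
```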

210
EVOLVABLE HARDWARE: CORNER OF XILINX XC6216
211
FUNCTION UNIT FOR CELL OF XILINX XC6216
212
SORTING NETWORK
213
EVOLVED SORTING NETWORK
214
GP 1987-2002
System | Dates | Speed-up over first system | Human-competitive results | Problem category
Serial LISP | 1987-1994 | 1 (base) | 0 | toy problems
64 transputers | 1994-1997 | 9 | 2 | human-competitive results not related to patented inventions
64 PowerPCs | 1995-2000 | 204 | 12 | 20th-century patented inventions
70 Alphas | 1999-2001 | 1,481 | 2 | 20th-century patented inventions
1,000 Pentium IIs | 2000-2002 | 13,900 | 12 | 21st-century patented inventions
4-week runs on 1,000 Pentium IIs | 2002-2003 | 130,000 | 2 | patentable new inventions
215
PROMISING GP APPLICATION AREAS
  • Problem areas involving many variables that are
    interrelated in highly non-linear ways
  • Inter-relationship of variables is not well
    understood
  • Discovery of the size and shape of the solution
    is a major part of the problem
  • "Black art" problems

216
PROMISING GP APPLICATION AREAS (CONTINUED)
  • Areas where you simply have no idea how to
    program a solution, but where you know what you
    want

217
PROMISING GP APPLICATION AREAS (CONTINUED)
  • Problem areas where a good approximate solution
    is satisfactory
  • design
  • control
  • bioinformatics
  • classification
  • data mining
  • system identification
  • forecasting

218
PROMISING GP APPLICATION AREAS (CONTINUED)
  • Areas where large computerized databases are
    accumulating and computerized techniques are
    needed to analyze the data
  • genome, protein, microarray data
  • satellite image data
  • astronomical data
  • petroleum databases
  • financial databases
  • medical records
  • marketing databases

219
PROMISING GP APPLICATION AREAS (CONTINUED)
  • Areas for which humans find it very difficult
    to write good programs
  • parallel computers
  • cellular automata
  • multi-agent strategies
  • field-programmable gate arrays
  • digital signal processors
  • swarm intelligence

220
CHARACTERISTICS SUGGESTING USE OF GP
  • (1) discovering the size and shape of the
    solution,
  • (2) reusing substructures,
  • (3) discovering the number of substructures,
  • (4) discovering the nature of the hierarchical
    references among substructures,
  • (5) passing parameters to a substructure,
  • (6) discovering the type of substructures (e.g.,
    subroutines, iterations, loops, recursions, or
    storage),
  • (7) discovering the number of arguments possessed
    by a substructure,
  • (8) maintaining syntactic validity and locality
    by means of a developmental process, or
  • (9) discovering a general solution in the form of
    a parameterized topology containing free variables

221
DESIGNING A GIRAFFE
  • Long neck
  • Long tongue
  • Vegetable-digesting enzymes in stomach
  • 4 legs
  • Long legs
  • Brown coloration

222
THE DESIGN OF A GOOD GIRAFFE
Neck length | Tongue length | Carnivorous? | Number of legs | Leg length | Coloration
15.11 feet | 14 inches | No | 4 | 9.96 feet | Brown
Floating point | Floating point | Boolean | Integer | Floating point | Categorical
223
NON-LINEARITY: GIRAFFE
  • Taken one-by-one, some gene values found in a
    giraffe, such as the long neck, contribute (alone)
    negatively to fitness
  • requires considerable material to construct
  • requires considerable energy to maintain
  • prone to injury (thereby hurting rate of survival
    and reproduction)
  • Thus, maximizing any one variable will not lead
    to the global optimum solution

224
NON-LINEARITY (CONTINUED)
  • When the variables are taken in pairs (there are
    15 possible pairs), many combinations of pairs
    (e.g., Long neck and long tongue) are doubly
    detrimental
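The count of 15 pairs follows from choosing 2 of the 6 design variables in the giraffe genome. A quick check, with the trait names written as plain strings for illustration:

```python
from itertools import combinations

# The six design variables of the giraffe genome.
TRAITS = ["neck length", "tongue length", "carnivorous",
          "number of legs", "leg length", "coloration"]

pairs = list(combinations(TRAITS, 2))  # all unordered pairs of traits
print(len(pairs))
```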

225
NON-LINEARITY (CONTINUED)
  • But, certain combinations of traits, when taken
    together, are "co-adapted sets of alleles" that
    yield a very fit animal for eating high acacia
    leaves in the jungle environment, having good
    camouflage, having high escape velocity when
    faced with predators, and exploiting a niche (and
    avoiding competition) with other animals feeding
    on low-hanging vegetation

226
SEARCH METHODS IN GENERAL
  • Initial structure(s)
  • Fitness measure
  • Operations for creating new structures
  • Parameters
  • Termination criterion and method of designating
    the result

227
SPACE WITH MANY LOCAL OPTIMA
228
SEARCH METHODS
  • Blind random search does not use acquired
    information in deciding on the future direction
    of the search
  • Hill climbing and gradient descent use acquired
    information; however, they are prone to becoming
    trapped on local optima
  • The previous point is especially true for
    non-trivial search spaces
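The trapping behavior can be seen on a tiny one-dimensional landscape. This is a toy sketch: the fitness values in `LANDSCAPE` are made up, and the climber simply moves to the better neighbor until neither neighbor improves.

```python
# Made-up fitness values: a local optimum at index 2, the global one at index 7.
LANDSCAPE = [1, 3, 5, 4, 2, 6, 9, 12, 10, 7]


def hill_climb(start: int) -> int:
    """Greedy ascent: step to the best neighbor until no neighbor is better."""
    pos = start
    while True:
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < len(LANDSCAPE)]
        best = max(neighbors, key=lambda p: LANDSCAPE[p])
        if LANDSCAPE[best] <= LANDSCAPE[pos]:
            return pos  # no improving neighbor: a (possibly local) optimum
        pos = best


print(hill_climb(0), hill_climb(4))
```

Starting at index 0 the climber stops on the smaller peak, while starting at index 4 it reaches the global optimum, so the outcome depends entirely on the starting point.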

229
7 DIFFERENCES BETWEEN GP AND ARTIFICIAL
INTELLIGENCE AND MACHINE LEARNING APPROACHES
230
REPRESENTATION
  • Genetic programming overtly conducts its search
    for a solution to the given problem in program
    space

231
ROLE OF POINT-TO-POINT TRANSFORMATIONS IN THE
SEARCH
  • Genetic programming does not conduct its search
    by transforming a single point in the search
    space into another single point, but instead
    transforms a set of points into another set of
    points

232
ROLE OF HILL CLIMBING IN THE SEARCH
  • Genetic programming does not rely exclusively on
    greedy hill climbing to conduct its search, but
    instead allocates a certain number of trials, in
    a principled way, to choices that are known to
    be inferior

233
DETERMINISM IN THE SEARCH
  • Genetic programming conducts its search
    probabilistically

234
ROLE OF AN EXPLICIT KNOWLEDGE BASE
  • Genetic programming does NOT make use of a
    knowledge base

235
ROLE OF FORMAL LOGIC IN THE SEARCH
  • Genetic programming does not utilize formal
    logic in its search strategy. Contradictory
    alternatives are created and actively
    maintained.

236
UNDERPINNINGS OF THE TECHNIQUE
  • Biologically inspired

237
TURING (1948)
  • Turing made the connection between searches and
    the challenge of getting a computer to solve a
    problem without explicitly programming it in his
    1948 essay "Intelligent Machinery"
  • "Further research into intelligence of machinery
    will probably be very greatly concerned with
    'searches' ..."

238
TURING'S 3 APPROACHES TO MACHINE INTELLIGENCE
(1948)
  • LOGIC-BASED SEARCH
  • One approach that Turing identified is a search
    through the space of integers representing
    candidate computer programs.

239
TURING'S 3 APPROACHES (CONTINUED)
  • CULTURAL SEARCH
  • A second approach is the "cultural search,"
    which relies on knowledge and expertise acquired
    over a period of years from others (akin to
    present-day knowledge-based systems).

240
TURING'S 3 APPROACHES (CONTINUED)
  • GENETICAL OR EVOLUTIONARY SEARCH
  • "There is the genetical or evolutionary search
    by which a combination of genes is looked for,
    the criterion being the survival value."

241
TURING (1950)
  • From Turing's 1950 paper "Computing Machinery
    and Intelligence"
  • We cannot expect to find a good child-machine at
    the first attempt. One must experiment with
    teaching one such machine and see how well it
    learns. One can then try another and see if it
    is better or worse. There is an obvious
    connection between this process and evolution, by
    the identifications

242
TURING (1950) (CONTINUED)
  • Structure of the child machine = Hereditary
    material
  • Changes of the child machine = Mutations
  • Natural selection = Judgment of the experimenter