1
Advanced Artificial Intelligence, Lecture 23
Salustowicz & Schmidhuber: Probabilistic
Incremental Program Evolution (PIPE)
  • Bob McKay
  • School of Computer Science and Engineering
  • College of Engineering
  • Seoul National University

2
Outline
  • PIPE
  • Probability representation
  • Update function
  • Experiments

3
Estimation of Distribution Algorithms
  • EDAs have been extremely successful in
    fixed-complexity search spaces
  • Can they extend to GP representations?
  • An EDA for GP must explicitly distinguish between
    • Structure learning
    • Content learning
  • In EDA terms, it must explicitly distinguish
    • Probability model
    • Probabilities

4
Prototype Tree EDAs
  • The underlying model is a full tree of maximum
    arity
  • Each node holds a probability table for the
    content of the node
  • Original version (PIPE, Salustowicz &
    Schmidhuber 1997) has the node probabilities
    independent
  • More recent versions learn dependent probabilities
    (a sketch of the independent model follows)
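
To make this concrete, here is a minimal Python sketch of a
prototype-tree node with an independent probability table; the
names (INSTRUCTIONS, ARITY, Node) and the example instruction
set are illustrative assumptions, not taken from the paper.

    # A minimal sketch of one prototype-tree node. Each node holds an
    # independent probability table over the instruction set.
    INSTRUCTIONS = ["+", "*", "sin", "x", "1.0"]          # functions and terminals
    ARITY = {"+": 2, "*": 2, "sin": 1, "x": 0, "1.0": 0}  # maximum arity: 2

    class Node:
        def __init__(self):
            # independent, initially uniform distribution over instructions
            self.prob = {i: 1.0 / len(INSTRUCTIONS) for i in INSTRUCTIONS}
            self.children = {}  # argument position -> Node, created on demand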

5
PIPE Prototype Tree
6
Prototype Tree Reduction
  • PIPE doesn't hold the whole prototype tree
  • Initially, the PPT holds only the root node
  • It is grown on demand
  • Whenever a node is probabilistically selected in
    sampling, the corresponding node is created and
    initialised with
    • a uniform distribution
  • It is pruned based on convergence
  • If the probability of a branch exceeds a threshold
    t < 1
  • We prune all branches not required as arguments
    (as sketched below)
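
Continuing the Node sketch from the previous slide, a rough
rendering of grow-on-demand sampling and convergence-based
pruning; the threshold value and all names are assumptions.

    import random

    def sample(node):
        """Sample a program; each node's symbol is drawn independently,
        and child nodes are created (uniform) only when first visited."""
        instr = random.choices(list(node.prob),
                               weights=list(node.prob.values()))[0]
        args = []
        for k in range(ARITY[instr]):
            child = node.children.setdefault(k, Node())  # grow on demand
            args.append(sample(child))
        return (instr, args)

    def prune(node, t=0.9999):
        """Once some instruction's probability exceeds the threshold t < 1,
        drop the subtrees that instruction can never use as arguments."""
        instr, p = max(node.prob.items(), key=lambda kv: kv[1])
        if p > t:
            node.children = {k: c for k, c in node.children.items()
                             if k < ARITY[instr]}
        for child in node.children.values():
            prune(child, t)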

7
PIPE Selection
  • PIPE selects only one best individual in each
    generation
  • Selection is primarily based on the fitness
    function
  • Whichever has the lower fitness value is selected
  • If two individuals have the same fitness value,
    the smaller is selected
  • Weak parsimony pressure
  • This best individual b is compared with the best
    so far, el; if b is better than el, b replaces el
  • PIPE uses a conservative learning approach
  • Hence small populations can be used
  • One bad selection doesn't change the learning much
    (see the selection sketch below)
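
A small sketch of this selection scheme; the fitness and size
callables are assumed helpers, and lower fitness values are
better.

    def select_best(population, fitness, size):
        """Generation's best: lowest fitness value, ties broken in
        favour of the smaller program (the weak parsimony pressure)."""
        return min(population, key=lambda prog: (fitness(prog), size(prog)))

    def update_elite(b, el, fitness):
        """Replace the best-so-far program el only if b is strictly better."""
        return b if el is None or fitness(b) < fitness(el) else el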

8
PIPE Generational Update
  • The probability table is updated to increase the
    probability of generating b to some target value
  • Target value is set by formula
  • P_target = P(Prog_b) + (1 − P(Prog_b)) · lr ·
    (ε + fit(Prog_el)) / (ε + fit(Prog_b))
  • where lr is the learning rate and ε a small
    positive constant
  • Probabilities of all symbols in b are incremented
    by small amounts
  • This is repeated until P(Prog_b) > P_target
  • Note that there is no particular statistical
    justification for this update function (a sketch
    of the update loop follows)
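
A sketch of that loop, reusing the Node sketch above; prob_of,
bump, and the constants lr, eps, and step are illustrative
assumptions, not the paper's settings.

    def prob_of(node, prog):
        """P(Prog): product of the node probabilities of prog's symbols
        (the child nodes exist because prog was sampled from this tree)."""
        instr, args = prog
        p = node.prob[instr]
        for k, arg in enumerate(args):
            p *= prob_of(node.children[k], arg)
        return p

    def bump(node, prog, step):
        """Raise the probability of one of b's symbols a little, then
        renormalise the node's table."""
        instr, args = prog
        node.prob[instr] += step * (1.0 - node.prob[instr])
        total = sum(node.prob.values())
        for i in node.prob:
            node.prob[i] /= total
        for k, arg in enumerate(args):
            bump(node.children[k], arg, step)

    def generational_update(root, b, fit_b, fit_el, lr=0.01, eps=1.0, step=0.1):
        """Increment the probabilities of b's symbols until P(Prog_b)
        exceeds the target value from the slide's formula."""
        p_b = prob_of(root, b)
        p_target = p_b + (1.0 - p_b) * lr * (eps + fit_el) / (eps + fit_b)
        while prob_of(root, b) <= p_target:
            bump(root, b, step)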

9
PIPE Mutation
  • After generational update, each entry in PIPE's
    probability tables is mutated with probability
  • P_Mp = P_M / ((l + k) · √|Prog_b|)
  • where l and k are the numbers of functions and
    terminals, and |Prog_b| is the size of b
  • The update amount is governed by a mutation rate mr
  • P_new = P_old + mr · (1 − P_old)
  • Small probabilities (close to zero) are
    influenced more
  • Use of mutation in EDAs is relatively unusual (a
    mutation sketch follows)
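
A sketch of the mutation step under the slide's formula;
n_instr stands for l + k, and the pm and mr values are assumed
constants.

    import math, random

    def mutate(node, size_b, n_instr, pm=0.4, mr=0.1):
        """Mutate each table entry with probability
        P_Mp = pm / ((l + k) * sqrt(|Prog_b|)).
        The step mr * (1 - P) moves small probabilities more."""
        p_mp = pm / (n_instr * math.sqrt(size_b))
        for i in node.prob:
            if random.random() < p_mp:
                node.prob[i] += mr * (1.0 - node.prob[i])
        total = sum(node.prob.values())  # renormalise the table afterwards
        for i in node.prob:
            node.prob[i] /= total
        for child in node.children.values():
            mutate(child, size_b, n_instr, pm, mr)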

10
Elite update
  • Essentially the same as generation update, but
    using the elite
  • The probability table is updated to increase the
    probability of generating el to the target value
  • P_target = P(Prog_el) + lr · (1 − P(Prog_el))
  • Probabilities of all symbols in el are
    incremented by small amounts
  • This is repeated until P(Prog_el) > P_target
  • Mutation is not used with elite update (see the
    sketch below)
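
A sketch of the elite update, reusing prob_of and bump from the
generational update above; substituting Prog_el into the
generational formula would make the fitness ratio 1, which
yields the simpler target used here.

    def elite_update(root, el, lr=0.01, step=0.1):
        """Same increment loop as the generational update, but aimed at
        the elite el and not followed by mutation."""
        p_el = prob_of(root, el)
        p_target = p_el + lr * (1.0 - p_el)
        while prob_of(root, el) <= p_target:
            bump(root, el, step)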

11
PIPE Experiments
  • Symbolic regression
  • 24% of PIPE runs found programs better than any
    found by GP (training set)
  • But 33% were worse than any found by GP
  • The corresponding test-set figures are 33% and 29%
  • 6-multiplexer
  • 70% of runs found a solution (vs 60%)
  • Median run time to a solution was around half that
    of GP's

12
PIPE Experiments
  • The paper also reports experiments on a difficult
    maze-learning task
  • The results are impressive, but unfortunately no
    comparison with GP or any other technique is given

13
Discussion
  • PIPE is the earliest attempt to use EDA methods
    for GP
  • Fairly unsophisticated representation - no
    dependent probabilities
  • At the time, nobody was using dependent
    probabilities in GA-EDA either
  • Fairly unsophisticated update function
  • Makes no attempt to generate a statistically
    justifiable posterior probability
  • Possibly fairer to describe PIPE as an ant
    algorithm than as an EDA

14
Summary
  • PIPE
  • Probability representation
  • Update function
  • Experiments
