1
Integrating Probabilistic Modeling and
Representation-Building
  • Doctoral Thesis Proposal
  • Moshe Looks
  • March 2nd, 2006

2
Outline
  • Background
  • Thesis
  • Proposed Approach
  • Proposed Goals

3
Problems and Problem-Solving
  • Levels of Analysis
  • Pre-representational - how to describe the
    problem as formalized input?
  • Post-representational - how to solve the
    formal problem?

4
Problems and Problem-Solving
  • Hofstadter, 1985
  • Knob creation - discovering novel values to
    parameterize
  • Knob twiddling - adjusting the values of
    existing parameters

5
General Optimization
  • Formal Representation
  • Solution space S (e.g., {0,1}^n)
  • Scoring function maps solutions to reals
  • Solving the problem means maximizing the score
  • To outperform enumeration and random sampling
  • assume some knowledge of the space

6
What Knowledge?
  • Complete separability would be nice
  • Near-decomposability (Simon, 1969) is more
    realistic

(Figure: modules with stronger interactions within them and weaker interactions between them)
7
How to Exploit This?
  • Separability → Independence Assumptions
  • Given a prior over the solution space
  • Represented as a probability vector
  • Sample solutions from the model (distribution)
  • Update the model toward higher-scoring points
  • Iterate... (a minimal sketch follows below)
  • Baluja, 1994
  • Works surprisingly well, even when the
    assumptions don't hold completely
  • when the interactions are weak
  • or there is little deception
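For concreteness, a minimal sketch of this probability-vector scheme (in the spirit of Baluja, 1994); the OneMax-style scoring function and all parameter values are illustrative assumptions, not taken from the slides:

```python
import random

N_BITS = 20          # solution space S = {0,1}^n
POP_SIZE = 50
LEARNING_RATE = 0.1
GENERATIONS = 100

def score(solution):
    # Illustrative scoring function: count of ones (OneMax).
    return sum(solution)

def sample(prob_vector):
    # Sample one solution from the model: each bit is drawn independently.
    return [1 if random.random() < p else 0 for p in prob_vector]

def pbil():
    # Start from a uniform prior over the solution space.
    prob_vector = [0.5] * N_BITS
    for _ in range(GENERATIONS):
        population = [sample(prob_vector) for _ in range(POP_SIZE)]
        best = max(population, key=score)
        # Update the model toward the higher-scoring point, then iterate.
        prob_vector = [(1 - LEARNING_RATE) * p + LEARNING_RATE * b
                       for p, b in zip(prob_vector, best)]
    return max((sample(prob_vector) for _ in range(POP_SIZE)), key=score)

if __name__ == "__main__":
    print(score(pbil()))
```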

8
How to Exploit This?
  • A known correct problem decomposition may be
    incorporated into the model
  • Mühlenbein & Mahnig, 1998
  • An unknown decomposition may be learned
  • algorithms that adaptively learn such linkages
    are termed competent
  • Optimization via probabilistic modeling is
    surveyed in
  • Pelikan, Goldberg, & Lobo, 1999

9
The Bayesian Optimization Algorithm
  • Represents problem decomposition as a Bayes Net
  • learned greedily, via a network scoring metric
    (a simplified stand-in is sketched below)
  • Augmented in the hierarchical Bayesian
    Optimization Algorithm (hBOA)
  • uses Bayes Nets with local structure
  • allows smaller model-building steps
  • leads to more accurate models
  • restricted tournament replacement
  • promotes diversity
  • Robust and scalable results on problems with both
    known and unknown decompositions
  • Pelikan & Goldberg, 2003
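For intuition, here is a toy stand-in for this kind of dependency learning, not the BOA itself: each variable is given at most one parent, chosen greedily by empirical mutual information over the selected population, and new candidates are sampled from the resulting conditional tables. BOA proper learns a full Bayesian network via a network scoring metric, and hBOA adds local structure and restricted tournament replacement; everything below is an illustrative sketch.

```python
import random
from collections import Counter
from math import log

def mutual_information(pop, i, j):
    # Empirical mutual information between bit positions i and j.
    n = len(pop)
    joint = Counter((s[i], s[j]) for s in pop)
    pi = Counter(s[i] for s in pop)
    pj = Counter(s[j] for s in pop)
    mi = 0.0
    for (a, b), c in joint.items():
        p = c / n
        mi += p * log(p * n * n / (pi[a] * pj[b]))
    return mi

def learn_model(pop, n_bits):
    # Each variable i > 0 gets the single most informative earlier variable
    # as its parent; conditional probabilities use Laplace smoothing.
    parents = [None] + [max(range(i), key=lambda j: mutual_information(pop, i, j))
                        for i in range(1, n_bits)]
    tables = []
    for i, par in enumerate(parents):
        if par is None:
            ones = sum(s[i] for s in pop)
            tables.append({None: (ones + 1) / (len(pop) + 2)})
        else:
            table = {}
            for v in (0, 1):
                subset = [s for s in pop if s[par] == v]
                table[v] = (sum(s[i] for s in subset) + 1) / (len(subset) + 2)
            tables.append(table)
    return parents, tables

def sample_model(parents, tables):
    # Ancestral sampling; a parent always has a smaller index than its child.
    s = []
    for i, par in enumerate(parents):
        p_one = tables[i][None if par is None else s[par]]
        s.append(1 if random.random() < p_one else 0)
    return s
```

This drops into the same sample/select/update loop as the earlier sketch, replacing the independent-bit model.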

10
Decompositions & Representations
  • Competent adaptive optimization algorithms
  • can overcome a poor choice of representation
  • via problem decomposition
  • Requires the existence of a problem decomposition
  • compact
  • satisficing
  • in the model-space searched by the algorithm

11
Decompositions & Representations
  • I propose extending methods such as hBOA to
    domains where a compact decomposition does not
    exist directly in the user-specified problem

12
Representation-Building An Example
  • Optimizing over strings (x1, x2, ..., xn)
  • A separate distribution maintained for each xi
  • What if there is positional ambiguity?
  • Some features refer to absolute position, some do
    not
  • E.g., DNA - a gene's position is sometimes
    critical, and sometimes irrelevant
  • Consider abstracted features, defined in terms of
    "base-level variables" (the xi's)
  • E.g., contains a prime number of ones
  • E.g., does not contain the substring AATGC
  • Model-based instance generation (sampling) must
    be generalized to accommodate features (see the
    sketch below)
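A minimal illustration, under assumed choices of alphabet, features, and base model, of what "accommodating features" in sampling can mean: features are predicates over the base-level variables, and the simplest (if inefficient) way to respect them during instance generation is rejection sampling. This is a sketch, not the mechanism proposed in the thesis.

```python
import random

ALPHABET = "ACGT"
LENGTH = 30

def contains_motif(s, motif="AATGC"):
    # A feature defined in terms of the base-level variables (the characters),
    # but not tied to any absolute position.
    return motif in s

def sample_base():
    # Base model: independent uniform choice per position (a probability
    # vector over positions would work the same way).
    return "".join(random.choice(ALPHABET) for _ in range(LENGTH))

def sample_with_features(required=(), forbidden=(), max_tries=10000):
    # Generate an instance consistent with the requested feature values.
    for _ in range(max_tries):
        s = sample_base()
        if all(f(s) for f in required) and not any(f(s) for f in forbidden):
            return s
    raise RuntimeError("no instance satisfying the features was found")

# Example: an instance that does NOT contain the substring AATGC.
print(sample_with_features(forbidden=(contains_motif,)))
```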

13
Representation-Building An Example
  • Exploit background knowledge to choose effective
    feature-classes
  • E.g., motifs (variable-position substrings)
  • motifs may be prespecified
  • or learned via information-theoretic criteria
    (a crude sketch follows below)
  • Demonstrated performance gains with learned
    motifs (with respect to the BOA)
  • Looks, 2006 (in submission)
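As a rough illustration of what an information-theoretic criterion for motif learning could look like (the actual criterion used in the cited work may differ), one can rank candidate substrings by how well their presence separates selected strings from the rest:

```python
from math import log2

def motif_score(motif, selected, rest):
    # Information gain of the binary feature "contains motif" with respect
    # to whether a string was selected.
    def entropy(pos, neg):
        total = pos + neg
        if total == 0 or pos == 0 or neg == 0:
            return 0.0
        p = pos / total
        return -(p * log2(p) + (1 - p) * log2(1 - p))

    n_sel, n_rest = len(selected), len(rest)
    sel_hit = sum(motif in s for s in selected)
    rest_hit = sum(motif in s for s in rest)
    hit = sel_hit + rest_hit
    miss = (n_sel - sel_hit) + (n_rest - rest_hit)
    total = n_sel + n_rest
    base = entropy(n_sel, n_rest)
    cond = (hit / total) * entropy(sel_hit, rest_hit) + \
           (miss / total) * entropy(n_sel - sel_hit, n_rest - rest_hit)
    return base - cond  # higher = motif better separates selected strings

def best_motifs(selected, rest, length=5, top=3):
    # Candidate motifs: fixed-length substrings of the selected strings.
    candidates = {s[i:i + length]
                  for s in selected for i in range(len(s) - length + 1)}
    return sorted(candidates,
                  key=lambda m: motif_score(m, selected, rest),
                  reverse=True)[:top]
```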

14
Representation-Building - Observations
  • A superior decomposition may exist that cannot be
    compactly represented
  • Generalize the representational language?
  • Computationally intractable!
  • Representation-building mechanisms
  • Tractable if they incorporate inductive bias
  • Goal is to provide salient parameters to the
    optimization algorithm

15
Learning Open-Ended Hierarchical Structures
  • User selects (pre-representationally)
  • a set of functions
  • E.g., +, -, ×, log, sin
  • a set of terminals
  • E.g., x, y, z, 0, 1
  • a scoring function over trees
  • Decrease pre-representational effort
  • Solution structure and content must both be
    learned
  • Claim
  • Representation-building is thus correspondingly
    more instrumental in finding a compact problem
    decomposition

16
Current Evolutionary Approaches
  • Genetic Programming (GP)
  • Koza, 1992
  • Many variants
  • Population-based search with new instances
    generated via
  • swapping of subtrees (crossover)
  • random insertions/deletions/modifications
    (mutation)

17
Current Evolutionary Approaches
  • Probabilistic model building approaches without
    decomposition-learning
  • Probabilistic Incremental Program Evolution
  • Salustowicz & Schmidhuber, 1997
  • Hierarchical generalization, 1998
  • Based on absolute tree-position (address from the
    root)
  • Assumes complete independence
  • Estimation-of-Distribution Programming
  • Yanai & Iba, 2003
  • Assumes a fixed network of dependency
    relationships

18
Current Evolutionary Approaches
  • Probabilistic model building approaches with
    decomposition-learning
  • Grammar-learning methods
  • Shan et al., 2004
  • Bosman & de Jong, 2004
  • Based on relative tree-position
  • Methods from competent optimization algorithms
  • Extended Compact Genetic Programming
  • Sastry & Goldberg, 2003
  • Bayesian-Optimization-Algorithm Programming
  • Looks, Goertzel, & Pennachin, 2005

19
Claim
  • Compact problem decompositions rarely exist for
    non-trivial problems with generic representation
    of general expressions
  • generic representation
  • E.g., trees
  • E.g., grammars
  • general expressions
  • E.g., Boolean formulae
  • E.g., symbolic equations
  • E.g., finite automata

20
Justification
  • Solution scores are assumed to only vary based on
    semantics
  • Determining (semantic) equivalence of general
    expressions is NP-hard!
  • Says nothing about approx. decompositions
  • However, a compact decomposition derived from a
    generic representation is still implausible
  • assuming no knowledge of semantics
  • and no explicit computational effort towards
    specialized representational reduction

21
Thesis
  • General expressions may be organized so that
    compact decompositions may often be found for
    non-trivial problems, via representation-building
  • Representation-building will require
  • knowledge of semantics (i.e., domain knowledge)
  • explicit computational effort towards
    representational reduction
  • Comparable to the notion of a heuristic solver
    for an NP-hard problem

22
Meta-Adaptive Programming (MAP)
  1. Generate a random population of trees
  2. Select promising trees from the population for
    modeling
  3. Build a parameterized representation of these
    trees, and transform them into parameter
    assignments
  4. Model these assignments using a Bayesian network
    with local structure to discover the problem
    decomposition
  5. Sample the model to generate new parameter
    assignments, apply the inverse transformation to
    convert them into trees, and integrate them into
    the population
  6. Go to step 2 (a structural sketch of this loop follows).
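The loop can be summarized structurally as below; random_tree, score, build_representation, and model_and_sample are placeholder callables standing in for the components detailed on the following slides (including a hypothetical representation.to_tree inverse transformation), so this is an outline of the control flow rather than a working implementation.

```python
def map_loop(random_tree, score, build_representation, model_and_sample,
             pop_size=100, n_select=50, generations=50):
    # 1. Generate a random population of trees.
    population = [random_tree() for _ in range(pop_size)]
    for _ in range(generations):
        # 2. Select promising trees from the population for modeling.
        promising = sorted(population, key=score, reverse=True)[:n_select]
        # 3. Build a parameterized representation of these trees and
        #    transform them into parameter assignments.
        representation, assignments = build_representation(promising)
        # 4./5. Model the assignments (e.g., with a Bayesian network with
        #    local structure), sample new assignments, and apply the inverse
        #    transformation to convert them back into trees.
        new_trees = [representation.to_tree(a)
                     for a in model_and_sample(assignments,
                                               pop_size - n_select)]
        # Integrate the new trees into the population; 6. go to step 2.
        population = promising + new_trees
    return max(population, key=score)
```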

23
Constructing a Parameterized Representation
24
Simplification & Normalization
25
Alignment
26
Parameterization
27
Constructing a Parameterized Representation
  • Simplify trees via rewrite rules and convert them
    into a normal form
  • Incrementally align all trees
  • Based on an alignment scoring function
  • May be solved optimally via dynamic programming
    (the core of this is sketched below)
  • Unfortunately, it is NP-hard for
  • unordered (commutative) operators
  • multiple trees
  • Pairwise greedy alignment (agglomerative
    clustering)
  • quadratic in the number of trees
  • Feng & Doolittle, 1987
  • For unordered operators, do greedy alignment of
    children
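As a sketch of the dynamic-programming core, here is a global alignment of two ordered symbol sequences (for instance, the ordered children of two nodes); the match/mismatch/gap scores are illustrative assumptions, and the full procedure must additionally handle unordered operators and multiple trees greedily, as noted above.

```python
MATCH, MISMATCH, GAP = 2, -1, -1

def align(xs, ys):
    # Needleman-Wunsch-style global alignment; returns the optimal score and
    # one optimal alignment as (x, y) pairs, with None marking a gap.
    n, m = len(xs), len(ys)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * GAP
    for j in range(1, m + 1):
        score[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = MATCH if xs[i - 1] == ys[j - 1] else MISMATCH
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + GAP,
                              score[i][j - 1] + GAP)
    # Traceback to recover the alignment itself.
    i, j, pairs = n, m, []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + \
                (MATCH if xs[i - 1] == ys[j - 1] else MISMATCH):
            pairs.append((xs[i - 1], ys[j - 1])); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            pairs.append((xs[i - 1], None)); i -= 1
        else:
            pairs.append((None, ys[j - 1])); j -= 1
    return score[n][m], list(reversed(pairs))

print(align(["x", "+", "sin", "y"], ["x", "+", "sin", "z"]))
```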

28
Proposed Goals
  • Theoretical
  • Modeling tree growth
  • GP schema theory
  • Experimental (and Implementational)
  • Adversarial problems
  • Normal forms
  • Challenge problems
  • Conceptual
  • The role of representation-building in AI

29
Theoretical Goals
  • Modeling Tree Growth
  • How does the average / maximal tree size change
    over time?
  • GP is prone to bloat
  • Cf. Langdon & Poli, 2002
  • Probabilistic modeling approaches may avoid this
  • pressure toward solutions that are easy to model

30
Theoretical Goals
  • Tree growth in meta-adaptive programming
  • is constrained by the size of the representations
  • in turn constrained by the alignment scoring
    functions
  • Alignment scoring function
  • may lead to a completely bounded space
  • may lead to unbounded growth
  • Subject to the fitness functions
  • Goal is to analyze this theoretically
  • leading to speed limit results for scoring
    functions

31
Theoretical Goals
  • Exact GP schema theory
  • Recently developed
  • Cf. Poli & Langdon, 2002
  • Equivalent to Markov Chain Models
  • Provides exact distributional data for the next
    generation based on fitness
  • Intractable for real problems!
  • Goal is to analyze the differences in schema
    processing between GP and MAP
  • crossover (subcomponent mixing) is not random
  • Controlled by alignment and probabilistic
    modeling
  • no notion of problem semantics in GP
  • In GP, the schemata *(2, a) and +(a, a) are
    treated as completely separate (despite being
    semantically equivalent)

32
Theoretical Goals - Checklist
Goal                                         Status
Modeling Tree Growth
  • Content-Free Binary Trees                ?
  • Binary Trees With Content                ?
  • Effects of Rewrite Rules                 ?
Schema-Processing Comparative Analysis       ?
33
Experimental Goals
  • Design Benchmarking on Adversarial Problems
  • Decomposition should be known to the user, not
    the algorithm
  • Dimensions of Deceptiveness for Trees
  • Relative-position (subtree) deceptiveness
  • Absolute-position deceptiveness
  • Operator deceptiveness

34
Experimental Goals
  • Normal Forms
  • Heuristically remove redundancy
  • Preserve hierarchical structure
  • Domains
  • Simple Agent Control (Artificial Ants)
  • E.g., progn(turn-left, turn-right, move) → move
    (see the sketch after this list)
  • Boolean Formulae
  • CNF doesn't preserve hierarchical structure
  • Holman's normal form does
  • Advanced Agent Control
  • Including general programmatic constructs
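A minimal sketch of such a redundancy-removing rewrite in the ant domain, using an assumed nested-list tree encoding; it cancels adjacent turn-left/turn-right pairs inside a progn, so progn(turn-left, turn-right, move) reduces to move.

```python
CANCELLING = {("turn-left", "turn-right"), ("turn-right", "turn-left")}

def simplify_progn(tree):
    # Leaves are plain strings; internal nodes are [op, child, child, ...].
    if isinstance(tree, str):
        return tree
    op, children = tree[0], [simplify_progn(c) for c in tree[1:]]
    if op != "progn":
        return [op] + children
    out = []
    for c in children:
        if out and isinstance(c, str) and isinstance(out[-1], str) \
                and (out[-1], c) in CANCELLING:
            out.pop()          # the two turns cancel each other
        else:
            out.append(c)
    if len(out) == 1:
        return out[0]          # a one-step progn is just that step
    return ["progn"] + out

print(simplify_progn(["progn", "turn-left", "turn-right", "move"]))  # -> move
```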

35
Experimental Goals - Checklist
Goal                                         Status
Modeling and Sampling with Features          ?
Adversarial Problems                         ?
Domains
  • Simple Agent Control                     ?
  • Boolean Formulae (CNF)                   ?
  • Boolean Formulae (Hierarchical)          ?
  • Advanced Agent Control                   ?
Tree Alignment and Representation-Building   75%
36
Conceptual Goals
  • A central challenge of AI: create systems with
    representations that are
  • Dynamic
  • Informed by background knowledge
  • Built by the system, not humans
  • Facilitate effective problem decomposition for
    learning