Transcript and Presenter's Notes

Title: Learning Classifier Systems


1
Learning Classifier Systems
  • Navigating the fitness landscape?
  • Why use evolutionary computation?
  • What's the concept of LCS?
  • Early pioneers
  • Competitive vs Grouped Classifiers
  • Beware the Swampy bits!
  • Niching
  • Selection for mating and effecting
  • Balance exploration with exploitation.
  • Balance the pressures
  • Zeroth level classifier system
  • The X-factor
  • Alphabet soup
  • New Slants - piecewise linear approximators
  • Why don't LCS rule the world?
  • Simplification schemes
  • Cognitive Classifiers
  • Neuroscience Inspirations
  • Application Domains

2
Representation
  • Genetic information can be any symbol.
  • Require dictionary and rules to manipulate the
    symbols.
  • Some symbols make the search space easier to
    explore/exploit
  • Some symbols are easier to store, manipulate and
    test
  • Different problems may be suited to different
    symbol sets

3
States
  • Environmental state is passed to the LCS via the
    message
  • Often the environmental state is preprocessed to
    create the message
  • Values are normalised
  • Out of bound data removed
  • Known irrelevant conditions removed
    Missing values addressed (a preprocessing sketch
    follows below)
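A minimal sketch of such preprocessing in Python (the fixed sensor range, the default fill value and the replace-rather-than-drop policy are illustrative assumptions, not part of the slides):

    import math

    def preprocess(raw, lo=0.0, hi=10.0, default=0.5):
        """Turn raw sensor readings into an LCS message: normalise to
        [0, 1], handle out-of-bound readings, fill missing values."""
        message = []
        for v in raw:
            if v is None or (isinstance(v, float) and math.isnan(v)):
                message.append(default)            # missing value addressed
            elif not lo <= v <= hi:
                message.append(default)            # out-of-bound reading handled
            else:
                message.append((v - lo) / (hi - lo))  # normalised
        return message

    print(preprocess([3.0, None, 42.0, 10.0]))  # [0.3, 0.5, 0.5, 1.0]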

4
Multiple Representations
  • Binary, ternary, Gray or enumerated.
  • Integer, real, floating-point or mantissa
  • Rank, order, series, histogram or array
  • Bounded, ellipsoidal
  • Horn clauses and second-order logic
  • S-type expressions, Gene expressions
  • Hybrid fuzzy sets, neural networks
  • Piecewise linear approximation
  • Problem specific

5
Encoding
  • Binary (including Gray) is a very crude method for
    solving real-valued problems.
  • Simply divide the range of interest into the number
    of intervals encoded by the binary representation and
    determine the real number that each binary interval
    represents, e.g. with 4 bits, 0000 represents 0.0,
    1111 represents 6.23 and 0001 represents 6.23/15.
  • The more bits, the more accuracy in your solution,
    but the limitations of this encoding are obvious
    (see the decoding sketch below).
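A sketch of this decoding, using the slide's 4-bit example over the range [0, 6.23]:

    def decode_binary(bits, lo=0.0, hi=6.23):
        """Map a bit string to a real value: the integer value of the
        bits picks one of 2**len(bits) - 1 equal steps in [lo, hi]."""
        n = int(bits, 2)
        steps = 2 ** len(bits) - 1        # 15 intervals for 4 bits
        return lo + (hi - lo) * n / steps

    print(decode_binary('0000'))  # 0.0
    print(decode_binary('1111'))  # 6.23
    print(decode_binary('0001'))  # 6.23/15, about 0.415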

6
Schema
  • Schemata are expressed over the ternary alphabet
    0, 1, # where # is the don't-care symbol
  • e.g. the schema 1##1 matches 1001, 1011, 1101 and 1111
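Matching a ternary schema against a binary string is a character-wise test; a minimal sketch:

    def matches(schema, string):
        """True when a ternary schema over {0, 1, #} matches a binary
        string; '#' is the don't-care symbol and matches either bit."""
        return all(s == '#' or s == b for s, b in zip(schema, string))

    print(matches('1##1', '1011'))  # True
    print(matches('1##1', '1010'))  # False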

7
Non-optimum Niching
  • rule x < 4 → 0 is a single schema under a 3-bit
    binary encoding: 0##
  • rule x < 3 → 0 is not: it needs 00# plus 010, while
    0## wrongly covers 011
  • rule x < 5 → 0 is not: 0## misses 100, and no single
    ternary schema covers exactly 000 to 100
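The coverage sets can be checked by enumeration; a sketch (assuming the 3-bit encoding above):

    def covered(schema):
        """Integers in 0..7 whose 3-bit encoding the ternary schema matches."""
        ok = lambda s, b: all(c in ('#', x) for c, x in zip(s, b))
        return {v for v in range(8) if ok(schema, format(v, '03b'))}

    print(covered('0##'))  # {0, 1, 2, 3}: one schema captures x < 4
    # No single ternary schema yields {0, 1, 2} (x < 3) or {0, ..., 4} (x < 5).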

8
Binary enumeration for Niching
  • Can use enumeration (a unary, thermometer-style code)
  • 0 → 0000000
  • 1 → 0000001
  • 2 → 0000011
  • 3 → 0000111
  • 4 → 0001111
  • 5 → 0011111
  • 6 → 0111111
  • 7 → 1111111
  • rule x < 4 → 0 becomes the single schema ###0###
  • rule x < 3 → 0 becomes ####0##
  • rule x < 5 → 0 becomes ##0####
  • Useful in discretised environments
  • Trades an increased search space and less compactness
    for better niching (see the sketch below)
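A sketch of the enumeration and the resulting single-bit threshold tests:

    def thermometer(v, width=7):
        """Unary (thermometer) encoding: value v sets the v lowest bits."""
        return format((1 << v) - 1, '0%db' % width)

    for v in range(8):
        print(v, '->', thermometer(v))   # 0 -> 0000000 ... 7 -> 1111111

    # x < n now reads off a single bit: bit n-1 (counting from 0 at the
    # right) is 0, so each threshold rule is one ternary schema,
    # e.g. x < 4 is '###0###'.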

9
Integer and Real Encodings
  • Match encoding to the environmental message using
    upper and lower bounds
  • Could use centre and spread instead, but this assumes
    a Gaussian-like distribution and makes recombination
    more difficult to implement
  • For each allele, test lb ≤ x ≤ ub to give a match
    (see the sketch below)
  • Could use < instead of ≤, but the LCS determines the
    correct bound automatically:
  • 0 ≤ x ≤ 5 is equivalent to 0 ≤ x < 5.01, and
  • 0 ≤ x ≤ 4.99 is equivalent to 0 ≤ x < 5
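A sketch of interval matching over a real-valued message:

    def interval_match(condition, message):
        """condition: one (lb, ub) pair per message value; the rule
        matches when every value satisfies lb <= x <= ub."""
        return all(lb <= x <= ub for (lb, ub), x in zip(condition, message))

    rule = [(0.0, 5.0), (2.5, 10.0)]
    print(interval_match(rule, [4.99, 3.0]))  # True
    print(interval_match(rule, [5.01, 3.0]))  # False: first allele fails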

10
Mutating at the Limits
  • The crossover point can be either between alleles or
    in the middle of an allele.
  • Mutation increases/decreases either or both of the
    two bounds.
  • Repair is occasionally needed to ensure that the
    lower bound ≤ upper bound
  • Note that most bounds have a limit,
  • e.g. WBC: 0 ≤ a ≤ 10
  • Suppose we probabilistically decide to mutate the
    fully general allele 0 ≤ x ≤ 10 at its lower bound:
  • if we decrease it by 10% of the range to -1,
  • we then repair back to 0;
  • if we increase it by 10% of the range to 1,
  • we do not repair it, as it is valid!
  • Thus some alphabets have a specificity bias
    (see the mutation sketch below)
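A sketch of bound mutation with repair (the step size, mutation rate and limits are illustrative choices, not prescribed by the slides):

    import random

    def mutate_bounds(lb, ub, limit=(0.0, 10.0), step=0.1, rate=0.5):
        """Shift each bound by 10% of the legal range with probability
        `rate`, then repair: clamp to the limits and keep lb <= ub."""
        span = limit[1] - limit[0]
        if random.random() < rate:
            lb += random.choice((-1, 1)) * step * span
        if random.random() < rate:
            ub += random.choice((-1, 1)) * step * span
        lb = min(max(lb, limit[0]), limit[1])   # repair: clamp lower bound
        ub = min(max(ub, limit[0]), limit[1])   # repair: clamp upper bound
        if lb > ub:
            lb, ub = ub, lb                     # repair: reorder if crossed
        return lb, ub

    # At a limit, an outward move (0 -> -1) is repaired back to 0 while an
    # inward move (0 -> 1) stands, so mutation there can only hold or
    # shrink the interval: the specificity bias.
    print(mutate_bounds(0.0, 10.0))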

11
Hyper Partitioning
  • We have a sparse search space with only two classes
    to identify: 0 and 1
  • It's real-valued, so we decide to use bounds,
    e.g. 0 ≤ x ≤ 10
  • We form hypercubes with the number of dimensions equal
    to the number of conditions
  • This approximates the actual niches, which may cause
    problems
[Figure: points of classes 1 and 0 in search space S, covered by hyperrectangular niches N(x)]
12
Oblique domains
  • We have a search space with only two classes to
    identify: 0 and 1
  • It's real-valued, so we decide to use bounds,
    e.g. 0 ≤ x ≤ 10
  • The hypercubes/hyperrectangles we form are often not
    suited to oblique domains (see the staircase sketch
    below)
  • Imagine sine-wave domains...
[Figure: classes 1 and 0 separated by an oblique boundary in search space S]
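A sketch of why: axis-aligned boxes can only approximate an oblique boundary (here the region x < y, an assumed example) with a staircase, and points near the diagonal are missed:

    def box_predict(boxes, point):
        """Class 1 if any box (one (lb, ub) pair per dimension) matches."""
        return any(all(lb <= v <= ub for (lb, ub), v in zip(box, point))
                   for box in boxes)

    # staircase of 2-D boxes approximating x < y on [0, 10] x [0, 10]
    stairs = [[(0.0, i), (i, 10.0)] for i in range(1, 10)]
    print(box_predict(stairs, (1.0, 9.0)))    # True: deep inside x < y
    print(box_predict(stairs, (4.9, 4.95)))   # False, although x < y holds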
13
Hyper-ellipsoidal
  • The general ellipsoid, also called a triaxial
    ellipsoid, is a quadratic surface which is given in
    Cartesian coordinates by x^2/a^2 + y^2/b^2 + z^2/c^2 = 1,
  • where the semi-axes are of lengths a, b, and c.
    (Wolfram MathWorld)
  • N-dimensional ellipsoids can be used to represent
    oblique domains more effectively
  • Implementation and analysis become harder
    (see the matching sketch below)
  • Butz, M. V. (2005) Kernel-based, ellipsoidal conditions
    in the real-valued XCS classifier system. Proceedings of
    the Genetic and Evolutionary Computation Conference
    (GECCO-2005), pp. 1835-1842.
  • Butz, M. V., Lanzi, P.-L., Wilson, S. W. (2006)
    Hyper-ellipsoidal conditions in XCS: rotation, linear
    approximation, and solution structure. Proceedings of
    the Genetic and Evolutionary Computation Conference
    (GECCO-2006), pp. 1457-1464.
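A sketch of an axis-aligned hyper-ellipsoidal condition (rotated variants, as in the cited papers, replace the sum with (x-c)^T A (x-c) <= 1 for a positive-definite matrix A):

    def ellipsoid_match(center, semi_axes, point):
        """Match when sum_i ((x_i - c_i) / r_i)**2 <= 1."""
        return sum(((x - c) / r) ** 2
                   for x, c, r in zip(point, center, semi_axes)) <= 1.0

    cond = ([5.0, 5.0], [2.0, 1.0])            # ellipse centred at (5, 5)
    print(ellipsoid_match(*cond, [6.0, 5.5]))  # True:  0.25 + 0.25 <= 1
    print(ellipsoid_match(*cond, [8.0, 5.0]))  # False: 2.25 > 1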

14
Horn clause logic
  • A Horn clause is a clause with at most one positive
    literal.
  • A rule: 1 positive literal, at least 1 negative
    literal. A rule has the form ¬P1 ∨ ¬P2 ∨ ... ∨ ¬Pk ∨ Q.
    This is logically equivalent to P1 ∧ P2 ∧ ... ∧ Pk → Q,
    thus an if-then implication with any number of
    conditions but one conclusion. Example: ¬man(X) ∨
    mortal(X) (all men are mortal).
  • A fact or unit: 1 positive literal, 0 negative
    literals. Examples: man(socrates); ancestor(X,X)
    (everyone is an ancestor of themselves, in the
    trivial sense).
  • A negated goal: 0 positive literals, at least 1
    negative literal. In virtually all implementations of
    Horn clause logic, the negated goal is the negation of
    the statement to be proved; the knowledge base consists
    entirely of facts and rules. The statement to be
    proven, called the goal, is therefore a single unit or
    a conjunction of units; an existentially quantified
    variable in the goal turns into a free variable in the
    negated goal. E.g. if the goal to be proven is
    ∃X male(X) ∧ ancestor(elizabeth,X) (show that there
    exists a male descendant of Elizabeth), the negated
    goal will be ¬male(X) ∨ ¬ancestor(elizabeth,X).
  • The null clause: 0 positive and 0 negative literals.
    It appears only at the end of a resolution proof.

15
Horn clause logic
  • A Horn clause is a clause with at most one positive
    literal:
  • ¬a ∨ ¬b ∨ ¬c ∨ ... ∨ ¬t ∨ u (only u positive), or
  • (a ∧ b ∧ c ∧ ... ∧ t) → u (the equivalent implication)
  • A definite clause is a Horn clause that has exactly
    one positive literal.
  • A Horn clause without a positive literal is called a
    goal.
  • Horn clauses express a subset of statements of
    first-order logic.
  • Prolog is built on top of Horn clauses: Prolog
    programs consist of definite clauses, and any question
    in Prolog is a goal.
  • Strict set of operations and useful scaling properties
    (a forward-chaining sketch follows below).
  • FOIL produces Horn clauses from data expressed as
    relations:
  • Quinlan, J. R. (1990) Learning Logical Definitions
    from Relations. Machine Learning 5, 239-266.
  • http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/learning/systems/0.html
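A minimal sketch of reasoning with definite clauses, simplified here to propositional form (forward chaining; Prolog instead works backwards from the goal):

    def forward_chain(facts, rules):
        """Derive everything entailed by propositional definite clauses.
        rules: (body, head) pairs meaning body_1 ^ ... ^ body_k -> head."""
        known, changed = set(facts), True
        while changed:
            changed = False
            for body, head in rules:
                if head not in known and body <= known:
                    known.add(head)
                    changed = True
        return known

    rules = [({'man'}, 'mortal')]          # man -> mortal
    print(forward_chain({'man'}, rules))   # {'man', 'mortal'}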

16
Fuzzy
  • Use of fuzzy sets to encode the degree of membership
    of a message to a classifier
  • The LCS encodes the membership function of each allele
    (see the sketch below)
  • Casillas, J., Carse, B., Bull, L. (2007) Fuzzy-XCS:
    A Michigan Genetic Fuzzy System. IEEE Transactions on
    Fuzzy Systems 15(4), 536-550.

[Figure: fuzzy membership functions COLD, WARM and HOT over temperature, with membership between 0 and 1]
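A sketch of one common choice, a triangular membership function (the set boundaries are illustrative, not from the slides):

    def triangular(a, b, c):
        """Triangular fuzzy membership with feet at a and c, peak at b."""
        def mu(x):
            if x <= a or x >= c:
                return 0.0
            return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
        return mu

    warm = triangular(10.0, 20.0, 30.0)  # a hypothetical WARM set
    print(warm(15.0))  # 0.5: partly WARM (and perhaps partly COLD)
    print(warm(20.0))  # 1.0: fully WARM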
17
Neural
  • Use of neural networks to encode the membership of a
    message to a classifier
  • The LCS encodes one NN for each allele.
  • The number of rules and the composition of rules are
    learnt.
  • Example: three optimised NNs for avoid, seek and
    orientate behaviours (a single-neuron sketch follows
    below)
  • Jacob Hurst, Matt Studley (UWE)
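A single-neuron sketch of a neural condition (the weights and bias would normally be evolved by the LCS; these values are illustrative):

    import math

    def nn_match(weights, bias, message, threshold=0.5):
        """The classifier matches when the sigmoid of a weighted sum of
        the message exceeds a threshold."""
        z = sum(w * x for w, x in zip(weights, message)) + bias
        return 1.0 / (1.0 + math.exp(-z)) > threshold

    print(nn_match([2.0, -1.0], -0.5, [0.8, 0.3]))  # True  (activation ~0.69)
    print(nn_match([2.0, -1.0], -0.5, [0.1, 0.9]))  # False (activation ~0.23)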

18
S-Expressions
Lisp-like expressions, for people who speak in
brackets:

(/ (+ (- (sqrt (- (* b b) (* (* 2 2) (* a c))))) b) (* 2 a))

Genetic Programming inspired (see work by P-L Lanzi).
Logical functions: AND, OR, NOT
Terminals: 0, 1, #, NULL
Exponential/logs: e, ln, log
Hyperbolic/trigonometric: Sine, Cos, Tan
Mathematical: SQRT, POW
Tailored: Value at, Address of, ...
Need to tailor match and reproduction. NB: is the
ternary alphabet a subset?
http://www.genetic-programming.com/
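A sketch of evaluating such expressions, with the slide's formula (as reconstructed above) written as nested tuples:

    import math, operator as op

    OPS = {'+': op.add, '-': op.sub, '*': op.mul,
           '/': op.truediv, 'sqrt': math.sqrt}

    def eval_sexpr(expr, env):
        """Evaluate a nested-tuple s-expression against variable bindings."""
        if isinstance(expr, (int, float)):
            return expr
        if isinstance(expr, str):
            return env[expr]
        head, *args = expr
        vals = [eval_sexpr(a, env) for a in args]
        if head == '-' and len(vals) == 1:
            return -vals[0]                 # unary minus
        return OPS[head](*vals)

    # (b - sqrt(b*b - 4*a*c)) / (2*a), as on the slide
    expr = ('/', ('+', ('-', ('sqrt', ('-', ('*', 'b', 'b'),
                                       ('*', ('*', 2, 2), ('*', 'a', 'c'))))),
                  'b'),
            ('*', 2, 'a'))
    print(eval_sexpr(expr, {'a': 1.0, 'b': 5.0, 'c': 4.0}))  # (5 - 3)/2 = 1.0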
19
S-Expressions
Lisp-like expressions, for people who speak in
brackets:

(/ (+ (- (sqrt (- (* b b) (* (* 2 2) (* a c))))) b) (* 2 a))

Bloat? Is an LCS with S-expressions not just GP?
How to tailor functions without introducing bias?
How to identify building blocks of subexpressions?
When are two subexpressions equivalent?
Is the trade-off of a reduced problem search space
against an increased alphabet search space worth it?

http://www.genetic-programming.com/