1
Symbolic Machine Learning Ib
  • Preliminaries
  • Candidate Elimination Algorithm
  • Inductive Bias
  • Summary

2
Main Ideas
  • The algorithm that finds the maximally specific
    hypothesis is limited in that it only finds one
    of many hypotheses consistent with the training
    data.
  • The Candidate Elimination Algorithm (CEA) finds
    ALL hypotheses consistent with the training data.
  • CEA does that without explicitly enumerating all
    consistent hypotheses.
  • Applications:
    • Chemical Mass Spectroscopy
    • Control Rules for Heuristic Search

3
Consistency vs Coverage
(Figure: two hypotheses h1 and h2 drawn over training set D, with positive and negative examples marked.)
h1 covers a different set of examples than h2. h2 is consistent with training set D; h1 is not consistent with training set D.
4
Version Space VS
(Figure: the version space as a region inside the hypothesis space H.)
Version space: the subset of hypotheses from H consistent with training set D.
5
List-Then-Eliminate Algorithm
  • Algorithm:
  • 1. Version space VS ← all hypotheses in H
  • 2. For each training example X, remove from VS every hypothesis h inconsistent with X, i.e., h(x) ≠ c(x)
  • 3. Output the version space VS

Comment: This is infeasible. The size of H is unmanageable.
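To make the procedure concrete, here is a minimal Python sketch of List-Then-Eliminate, assuming hypotheses are represented as predicates h(x) and examples as (x, c(x)) pairs; the function and variable names are illustrative, not from the slides.

    def list_then_eliminate(H, D):
        """Return the version space: every h in H consistent with all of D."""
        VS = list(H)                           # 1. VS <- all hypotheses in H
        for x, c in D:                         # 2. for each training example
            VS = [h for h in VS if h(x) == c]  #    drop h where h(x) != c(x)
        return VS                              # 3. output the version space VS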
6
Previous Exercise: Mushrooms
Let's remember our exercise in which we tried to classify mushrooms as poisonous or not-poisonous.
Training set D:
((red,small,round,humid,low,smooth), poisonous)
((red,small,elongated,humid,low,smooth), poisonous)
((gray,large,elongated,humid,low,rough), not-poisonous)
((red,small,elongated,humid,high,rough), poisonous)
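For the sketches on the later slides, the same training set can be written as Python data; True/False stand in for poisonous/not-poisonous, and attributes are addressed by position since the slides do not name them.

    # Training set D as (attribute-tuple, label) pairs.
    D = [
        (("red",  "small", "round",     "humid", "low",  "smooth"), True),   # poisonous
        (("red",  "small", "elongated", "humid", "low",  "smooth"), True),   # poisonous
        (("gray", "large", "elongated", "humid", "low",  "rough"),  False),  # not-poisonous
        (("red",  "small", "elongated", "humid", "high", "rough"),  True),   # poisonous
    ]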
7
Consistent Hypotheses
Our first algorithm found only one out of six consistent hypotheses:

G (most general): (red,?,?,?,?,?)  (?,small,?,?,?,?)
(?,small,?,humid,?,?)  (red,?,?,humid,?,?)  (red,small,?,?,?,?)
S (most specific): (red,small,?,humid,?,?)
8
Symbolic Machine Learning Ib
  • Preliminaries
  • Candidate Elimination Algorithm
  • Inductive Bias
  • Summary

9
Candidate-Elimination Algorithm
The candidate elimination algorithm keeps two lists of hypotheses consistent with the training data: the list of most specific hypotheses S and the list of most general hypotheses G. This is enough to derive the whole version space VS.

G: (red,?,?,?,?,?)  (?,small,?,?,?,?)
S: (red,small,?,humid,?,?)
VS: every hypothesis between G and S.
10
Candidate-Elimination Algorithm
  • Initialize G to the set of maximally general hypotheses in H
  • Initialize S to the set of maximally specific hypotheses in H
  • For each training example X do:
    • If X is positive: generalize S if necessary
    • If X is negative: specialize G if necessary
  • Output G, S
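A Python skeleton of this loop, under the usual conjunctive representation where '?' matches any value and None matches nothing. The helpers update_S and update_G are sketched on the next two slides; all names here are illustrative, not from the slides.

    def matches(h, x):
        """h covers x if every constraint is '?' or equals x's value."""
        return all(c == "?" or c == xi for c, xi in zip(h, x))

    def candidate_elimination(D, domains):
        """domains[i] is the set of values attribute i can take."""
        n = len(domains)
        G = [tuple("?" for _ in range(n))]    # maximally general hypothesis
        S = [tuple(None for _ in range(n))]   # maximally specific hypothesis
        for x, positive in D:
            if positive:
                S, G = update_S(S, G, x)            # generalize S if necessary
            else:
                S, G = update_G(S, G, x, domains)   # specialize G if necessary
        return S, G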

11
Positive Examples
  • a) If X is positive:
    • Remove from G any hypothesis inconsistent with X
    • For each hypothesis h in S not consistent with X:
      • Remove h from S
      • Add all minimal generalizations of h consistent with X such that some member of G is more general than h
    • Remove from S any hypothesis more general than another hypothesis in S

(Figure: an inconsistent hypothesis h at the S boundary is replaced by its minimal generalizations toward G.)
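A sketch of this positive-example step for the conjunctive representation; min_generalize and more_general are hypothetical helpers (for pure conjunctions the minimal generalization is unique).

    def more_general(g, h):
        """True if g is at least as general as h (g subsumes h)."""
        return all(cg == "?" or cg == ch for cg, ch in zip(g, h))

    def min_generalize(s, x):
        """Minimally relax each constraint of s until it covers x."""
        return tuple(xi if c is None else (c if c == xi else "?")
                     for c, xi in zip(s, x))

    def update_S(S, G, x):
        G = [g for g in G if matches(g, x)]            # drop inconsistent g
        new_S = []
        for s in S:
            if matches(s, x):
                new_S.append(s)
            else:
                h = min_generalize(s, x)               # replace s by its minimal
                if any(more_general(g, h) for g in G): # generalization if some g
                    new_S.append(h)                    # is still more general
        # drop any member more general than another member of S
        new_S = [s for s in new_S
                 if not any(t != s and more_general(s, t) for t in new_S)]
        return new_S, G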
12
Negative Examples
  • b) If X is negative:
    • Remove from S any hypothesis inconsistent with X
    • For each hypothesis h in G not consistent with X:
      • Remove h from G
      • Add all minimal specializations of h consistent with X such that some member of S is more specific than h
    • Remove from G any hypothesis less general than another hypothesis in G

(Figure: an inconsistent hypothesis h at the G boundary is replaced by its minimal specializations toward S.)
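The corresponding sketch of the negative-example step; min_specialize is again a hypothetical helper. For conjunctions, a '?' is minimally specialized by pinning it to any attribute value that excludes X.

    def min_specialize(g, x, domains):
        """All minimal specializations of g that exclude x."""
        out = []
        for i, c in enumerate(g):
            if c == "?":                    # pin one '?' at a time to any
                for v in domains[i]:        # value differing from x[i]
                    if v != x[i]:
                        out.append(g[:i] + (v,) + g[i + 1:])
        return out

    def update_G(S, G, x, domains):
        S = [s for s in S if not matches(s, x)]            # drop inconsistent s
        new_G = []
        for g in G:
            if not matches(g, x):
                new_G.append(g)
            else:
                for h in min_specialize(g, x, domains):    # replace g by minimal
                    if any(more_general(h, s) for s in S): # specializations that
                        new_G.append(h)                    # still cover some s
        # drop any member less general than another member of G
        new_G = [g for g in new_G
                 if not any(h != g and more_general(h, g) for h in new_G)]
        return S, new_G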
13
An Exercise
Initialize the S and G sets:
S: (0,0,0,0,0,0)
G: (?,?,?,?,?,?)
Let's look at the first two examples:
((red,small,round,humid,low,smooth), poisonous)
((red,small,elongated,humid,low,smooth), poisonous)
14
An Exercise: two positives
The first two examples are positive:
((red,small,round,humid,low,smooth), poisonous)
((red,small,elongated,humid,low,smooth), poisonous)

G: (?,?,?,?,?,?)  (unchanged)
S: (0,0,0,0,0,0) → (red,small,round,humid,low,smooth) → (red,small,?,humid,low,smooth)
15
An Exercise: first negative
The third example is a negative example:
((gray,large,elongated,humid,low,rough), not-poisonous)

G: (?,?,?,?,?,?) → (red,?,?,?,?,?)  (?,small,?,?,?,?)  (?,?,?,?,?,smooth)
S: (red,small,?,humid,low,smooth)  (unchanged)

Why is (?,?,round,?,?,?) not a valid specialization of G?
16
An Exercise: another positive
The fourth example is a positive example:
((red,small,elongated,humid,high,rough), poisonous)

G: (red,?,?,?,?,?)  (?,small,?,?,?,?)   [(?,?,?,?,?,smooth) is removed: inconsistent with X]
S: (red,small,?,humid,low,smooth) → (red,small,?,humid,?,?)
17
The Learned Version Space VS
G: (red,?,?,?,?,?)  (?,small,?,?,?,?)
(red,?,?,humid,?,?)  (red,small,?,?,?,?)  (?,small,?,humid,?,?)
S: (red,small,?,humid,?,?)
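Running the earlier sketches end-to-end on D reproduces these boundary sets. Here each attribute's domain is inferred from the data, an assumption, since the slides never list the full value sets.

    domains = [sorted({x[i] for x, _ in D}) for i in range(6)]
    S, G = candidate_elimination(D, domains)
    print("S =", S)   # [('red', 'small', '?', 'humid', '?', '?')]
    print("G =", G)   # [('red', '?', '?', '?', '?', '?'),
                      #  ('?', 'small', '?', '?', '?', '?')]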
18
Points to Consider
  • Will the algorithm converge to the right hypothesis?
  • The algorithm is guaranteed to converge to the right hypothesis provided the following hold:
    • No errors exist in the examples.
    • The target concept is included in the hypothesis space H.
  • What happens if errors exist in the examples?
    • The right hypothesis would be inconsistent and thus eliminated.
    • If the S and G sets converge to an empty space, we have evidence that the true concept lies outside space H.
19
Classifying Examples
What if the version space VS has not collapsed into a single hypothesis and we are asked to classify a new instance? Suppose all hypotheses in set S agree that the instance is positive. Then we can be sure that all hypotheses in VS agree the instance is positive. (Why?) The same can be said if the instance is classified as negative by all members of set G. (Why?) In general, one can vote over all hypotheses in VS if there is no unanimous agreement.
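A sketch of this classification rule using only the boundary sets. The unanimity tests are exact; the fallback vote here polls just S and G as a cheap stand-in for the slide's vote over all of VS, which is an assumption of this sketch.

    def classify(x, S, G):
        """Classify x with a possibly-unconverged version space (S, G)."""
        if all(matches(s, x) for s in S):
            return True    # every hypothesis in VS covers x: unanimous positive
        if not any(matches(g, x) for g in G):
            return False   # no hypothesis in VS covers x: unanimous negative
        votes = sum(matches(h, x) for h in S + G)   # no unanimity: rough vote
        return 2 * votes >= len(S + G)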
20
Symbolic Machine Learning Ib
  • Preliminaries
  • Candidate Elimination Algorithm
  • Inductive Bias
  • Summary

21
Inductive Bias
Inductive bias is the preference for a hypothesis space H and a search mechanism over H. What would happen if we chose an H that contains all possible hypotheses? What would the size of H be?

|H| = the size of the power set of the input space X.

Example: with n Boolean features, |X| = 2^n, and the size of H is 2^(2^n).
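A quick numeric check of this double exponential (n = 6 is an arbitrary choice):

    n = 6
    size_X = 2 ** n          # |X| = 64 instances
    size_H = 2 ** size_X     # |H| = 2**64, about 1.8e19 hypotheses
    print(size_X, size_H)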
22
Inductive Bias
In this case, the candidate elimination algorithm could classify with certainty only the training examples it has already seen. This is because H is so large that every possible labeling of the unseen instances is still represented in the version space.

A property of any inductive algorithm: it must have some embedded assumptions about the nature of H. Without assumptions, learning is impossible.
23
Symbolic Machine Learning Ib
  • Preliminaries
  • Candidate Elimination Algorithm
  • Inductive Bias
  • Summary

24
Summary
  • The candidate elimination algorithm exploits the general-to-specific ordering of hypotheses to find all hypotheses consistent with the training data.
  • The version space contains all consistent hypotheses and is compactly represented by two lists, S and G.
  • The candidate elimination algorithm is not robust to noise and assumes the target concept is included in the hypothesis space.
  • Any inductive algorithm needs some assumptions about the hypothesis space; otherwise it would be impossible to make predictions.