Title: Symbolic Machine Learning Ib
1. Symbolic Machine Learning Ib
- Preliminaries
- Candidate Elimination Algorithm
- Inductive Bias
- Summary
2. Main Ideas
- The algorithm that finds the maximally specific hypothesis is limited in that it only finds one of many hypotheses consistent with the training data.
- The Candidate Elimination Algorithm (CEA) finds ALL hypotheses consistent with the training data.
- CEA does that without explicitly enumerating all consistent hypotheses.
- Applications
  - Chemical Mass Spectroscopy
  - Control Rules for Heuristic Search
3. Consistency vs Coverage
[Figure: a training set D of positive and negative examples, with two hypotheses h1 and h2 drawn as regions over the examples.]
h1 covers a different set of examples than h2. h2 is consistent with the training set D; h1 is not consistent with the training set D.
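The two notions can be stated compactly in code. Below is a minimal sketch in Python, assuming the attribute-vector hypothesis representation used on the following slides ('?' matches any value, '0' matches none); the function and label names are illustrative and anticipate the mushroom exercise recalled later.

# Coverage and consistency for conjunctive hypotheses.
# A hypothesis is a tuple of attribute values; '?' matches anything, '0' nothing.

def covers(h, x):
    """h classifies instance x as positive iff every attribute matches."""
    return all(hv == '?' or hv == xv for hv, xv in zip(h, x))

def consistent(h, D, positive='poisonous'):
    """h is consistent with D iff it reproduces the label of every example."""
    return all(covers(h, x) == (label == positive) for x, label in D)

covers captures "h covers a set of examples"; consistent is the stricter requirement that h agrees with the teacher's label on every example in D.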
4. Version Space VS
[Figure: the version space drawn as a region inside the hypothesis space H.]
Version space: the subset of hypotheses from H that are consistent with the training set D.
5. List-Then-Eliminate Algorithm
- Algorithm
  1. Version space VS ← all hypotheses in H
  2. For each training example X: remove from VS every hypothesis h that is inconsistent with X, i.e. h(x) ≠ c(x)
  3. Output the version space VS
Comment: this is infeasible in general, because the size of H is unmanageable. (For our small mushroom problem it can still be run; see the sketch after the next slide.)
6. Previous Exercise: Mushrooms
Let's recall the exercise in which we tried to classify mushrooms as poisonous or not-poisonous.
Training set D:
((red, small, round, humid, low, smooth), poisonous)
((red, small, elongated, humid, low, smooth), poisonous)
((gray, large, elongated, humid, low, rough), not-poisonous)
((red, small, elongated, humid, high, rough), poisonous)
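On a problem of this size, the List-Then-Eliminate idea from the previous slide can be sketched directly. The code below reuses the covers/consistent helpers from the earlier sketch; the variable names D and domains are illustrative.

from itertools import product

# The mushroom training set D above, encoded as (instance, label) pairs.
D = [
    (('red',  'small', 'round',     'humid', 'low',  'smooth'), 'poisonous'),
    (('red',  'small', 'elongated', 'humid', 'low',  'smooth'), 'poisonous'),
    (('gray', 'large', 'elongated', 'humid', 'low',  'rough'),  'not-poisonous'),
    (('red',  'small', 'elongated', 'humid', 'high', 'rough'),  'poisonous'),
]

# Attribute domains observed in D, each extended with the '?' wildcard.
domains = [sorted({x[i] for x, _ in D}) + ['?'] for i in range(6)]

def list_then_eliminate(D, domains):
    """Enumerate every conjunctive hypothesis and keep the consistent ones.
    (The always-negative hypothesis containing '0' is omitted; it cannot be
    consistent once a positive example exists.)"""
    return [h for h in product(*domains) if consistent(h, D)]

print(len(list_then_eliminate(D, domains)))   # -> 6

The printed count is 6: exactly the consistent hypotheses listed on the next slide. The enumeration only works because this toy problem has a few hundred hypotheses; in general the size of H makes this approach infeasible.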
7. Consistent Hypotheses
Our first algorithm found only one out of the six consistent hypotheses:
G (most general): (red,?,?,?,?,?), (?,small,?,?,?,?)
In between: (?,small,?,humid,?,?), (red,?,?,humid,?,?), (red,small,?,?,?,?)
S (most specific): (red,small,?,humid,?,?)
8. Symbolic Machine Learning Ib
- Preliminaries
- Candidate Elimination Algorithm
- Inductive Bias
- Summary
9. Candidate-Elimination Algorithm
The candidate elimination algorithm keeps two lists of hypotheses consistent with the training data: the list of most specific hypotheses S and the list of most general hypotheses G. These two boundary lists are enough to derive the whole version space VS.
G: (red,?,?,?,?,?), (?,small,?,?,?,?)
VS: every hypothesis between the two boundaries
S: (red,small,?,humid,?,?)
10. Candidate-Elimination Algorithm
- Initialize G to the set of maximally general hypotheses in H
- Initialize S to the set of maximally specific hypotheses in H
- For each training example X:
  - If X is positive: generalize S if necessary
  - If X is negative: specialize G if necessary
- Output G, S (a sketch of this loop follows)
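Below is a sketch of this top-level loop in Python, under the same tuple representation as before and reusing covers from the earlier sketch. update_S_positive and update_G_negative are illustrative helper names; they are sketched after the next two slides, which describe the two update steps.

def more_general_or_equal(h1, h2):
    """True if h1 covers every instance that h2 covers."""
    if '0' in h2:                         # h2 covers nothing
        return True
    return all(a == '?' or a == b for a, b in zip(h1, h2))

def candidate_elimination(D, domains, positive='poisonous'):
    n = len(domains)
    S = [('0',) * n]                      # maximally specific boundary
    G = [('?',) * n]                      # maximally general boundary
    for x, label in D:
        if label == positive:
            # positive example: prune G, then generalize S if necessary
            G = [g for g in G if covers(g, x)]
            S = update_S_positive(S, G, x)
        else:
            # negative example: prune S, then specialize G if necessary
            S = [s for s in S if not covers(s, x)]
            G = update_G_negative(S, G, x, domains)
    return S, G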
11. Positive Examples
a) If X is positive:
- Remove from G any hypothesis inconsistent with X
- For each hypothesis h in S not consistent with X:
  - Remove h from S
  - Add all minimal generalizations of h consistent with X such that some member of G is more general than the new hypothesis
  - Remove from S any hypothesis more general than another hypothesis in S
[Figure: an inconsistent hypothesis h is removed from the S boundary and replaced by its minimal generalizations, pushing S upward toward G; a sketch follows.]
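A sketch of this S-update under the same representation. It relies on the fact that a conjunctive hypothesis has a unique minimal generalization covering a given positive instance; the helper names are the ones assumed in the loop sketched above.

def min_generalizations(s, x):
    """The minimal generalization(s) of hypothesis s that also cover the
    positive instance x. For conjunctive hypotheses there is exactly one:
    fill empty ('0') slots with x's values and relax mismatches to '?'."""
    h = tuple(xv if sv == '0' else (sv if sv == xv else '?')
              for sv, xv in zip(s, x))
    return [h]

def update_S_positive(S, G, x):
    """Generalize the specific boundary S so that it covers the positive x."""
    new_S = list(S)
    for s in S:
        if not covers(s, x):
            new_S.remove(s)
            for h in min_generalizations(s, x):
                # keep h only if some member of G is more general than it
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.append(h)
    # remove any hypothesis more general than another hypothesis in S
    return [s for s in new_S
            if not any(s != t and more_general_or_equal(s, t) for t in new_S)]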
12. Negative Examples
b) If X is negative:
- Remove from S any hypothesis inconsistent with X
- For each hypothesis h in G not consistent with X:
  - Remove h from G
  - Add all minimal specializations of h consistent with X such that some member of S is more specific than the new hypothesis
  - Remove from G any hypothesis less general than another hypothesis in G
[Figure: an inconsistent hypothesis h is removed from the G boundary and replaced by its minimal specializations, pushing G downward toward S; a sketch follows.]
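A sketch of this G-update: each '?' in an offending hypothesis is replaced, one attribute at a time, by a value that excludes the negative instance. domains is the attribute-value table built in the earlier sketch.

def min_specializations(g, x, domains):
    """All minimal specializations of hypothesis g that exclude the negative
    instance x: replace one '?' at a time by an attribute value different
    from x's value in that position."""
    specs = []
    for i, (gv, xv) in enumerate(zip(g, x)):
        if gv == '?':
            for v in domains[i]:
                if v != xv and v != '?':
                    specs.append(g[:i] + (v,) + g[i + 1:])
    return specs

def update_G_negative(S, G, x, domains):
    """Specialize the general boundary G so that it excludes the negative x."""
    new_G = list(G)
    for g in G:
        if covers(g, x):                  # g wrongly covers the negative example
            new_G.remove(g)
            for h in min_specializations(g, x, domains):
                # keep h only if some member of S is more specific than it
                if any(more_general_or_equal(h, s) for s in S):
                    new_G.append(h)
    # remove any hypothesis less general than another hypothesis in G
    return [g for g in new_G
            if not any(g != t and more_general_or_equal(t, g) for t in new_G)]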
13. An Exercise
Initialize the S and G sets:
S = (0,0,0,0,0,0)
G = (?,?,?,?,?,?)
Let's look at the first two examples:
((red, small, round, humid, low, smooth), poisonous)
((red, small, elongated, humid, low, smooth), poisonous)
14. An Exercise: two positives
The first two examples are positive:
((red, small, round, humid, low, smooth), poisonous)
((red, small, elongated, humid, low, smooth), poisonous)
G remains (?,?,?,?,?,?): no specialization is needed for positive examples.
S is generalized step by step:
(0,0,0,0,0,0) → (red,small,round,humid,low,smooth) → (red,small,?,humid,low,smooth)
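The same two steps can be traced with the min_generalizations sketch from the "Positive Examples" slide:

s0 = ('0', '0', '0', '0', '0', '0')
x1 = ('red', 'small', 'round',     'humid', 'low', 'smooth')
x2 = ('red', 'small', 'elongated', 'humid', 'low', 'smooth')

[s1] = min_generalizations(s0, x1)
print(s1)   # ('red', 'small', 'round', 'humid', 'low', 'smooth')
[s2] = min_generalizations(s1, x2)
print(s2)   # ('red', 'small', '?', 'humid', 'low', 'smooth')
# G stays (?,?,?,?,?,?) because both examples are positive.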
15. An Exercise: first negative
The third example is a negative example:
((gray, large, elongated, humid, low, rough), not-poisonous)
S remains (red,small,?,humid,low,smooth): it does not cover the negative example.
G is specialized:
(?,?,?,?,?,?) → (red,?,?,?,?,?), (?,small,?,?,?,?), (?,?,?,?,?,smooth)
Why is (?,?,round,?,?,?) not a valid specialization of G?
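Tracing this step with the earlier sketches also answers the question above:

g0  = ('?',) * 6
neg = ('gray', 'large', 'elongated', 'humid', 'low', 'rough')
S   = [('red', 'small', '?', 'humid', 'low', 'smooth')]

candidates = min_specializations(g0, neg, domains)
kept = [h for h in candidates
        if any(more_general_or_equal(h, s) for s in S)]
print(kept)
# [('red', '?', '?', '?', '?', '?'),
#  ('?', 'small', '?', '?', '?', '?'),
#  ('?', '?', '?', '?', '?', 'smooth')]
# ('?','?','round','?','?','?') is generated but discarded: no member of S is
# more specific than it, because S already has '?' in the shape slot (the
# positives include both round and elongated mushrooms), so keeping it would
# exclude positive examples.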
16. An Exercise: another positive
The fourth example is a positive example:
((red, small, elongated, humid, high, rough), poisonous)
G is pruned to (red,?,?,?,?,?) and (?,small,?,?,?,?); the hypothesis (?,?,?,?,?,smooth) is removed because it is inconsistent with this positive example.
S is generalized: (red,small,?,humid,low,smooth) → (red,small,?,humid,?,?)
17. The Learned Version Space VS
G (most general): (red,?,?,?,?,?), (?,small,?,?,?,?)
In between: (red,?,?,humid,?,?), (red,small,?,?,?,?), (?,small,?,humid,?,?)
S (most specific): (red,small,?,humid,?,?)
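Running the candidate_elimination sketch on the training set D and the domains built earlier reproduces these boundary sets:

S, G = candidate_elimination(D, domains)
print(S)   # [('red', 'small', '?', 'humid', '?', '?')]
print(G)   # [('red', '?', '?', '?', '?', '?'), ('?', 'small', '?', '?', '?', '?')]
# Every hypothesis more general than S and more specific than some member of G
# belongs to the version space; the three "in between" hypotheses above are
# obtained by relaxing one non-'?' attribute of S at a time while staying
# below a member of G.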
18. Points to Consider
- Will the algorithm converge to the right hypothesis?
  - The algorithm is guaranteed to converge to the right hypothesis provided that:
    - No errors exist in the examples.
    - The target concept is included in the hypothesis space H.
- What happens if there are errors in the examples?
  - The right hypothesis would be inconsistent with the faulty example and thus eliminated.
  - If the S and G sets converge to an empty version space, we have evidence that the true concept lies outside the hypothesis space H.
19. Classifying Examples
What if the version space VS has not collapsed to a single hypothesis and we are asked to classify a new instance?
Suppose all hypotheses in the set S agree that the instance is positive. Then we can be sure that every hypothesis in VS classifies the instance as positive. Why?
The same holds if the instance is classified as negative by all members of the set G. Why?
In general, if there is no unanimous agreement, one can take a vote over all hypotheses in VS.
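This classification rule needs only the boundary sets. A sketch, reusing covers and the S, G computed from the exercise above:

def classify_with_boundaries(S, G, x):
    """Classify a new instance using only the boundary sets.
    - If every hypothesis in S covers x, every hypothesis in the version space
      covers x (each is more general than some member of S): positive.
    - If no hypothesis in G covers x, no hypothesis in the version space
      covers x (each is more specific than some member of G): negative.
    - Otherwise the version space disagrees; one could fall back to a vote
      over an explicit enumeration of VS."""
    if all(covers(s, x) for s in S):
        return 'positive'
    if not any(covers(g, x) for g in G):
        return 'negative'
    return 'no unanimous agreement'

print(classify_with_boundaries(S, G, ('red', 'small', 'round', 'humid', 'high', 'smooth')))
# positive: covered by S, hence by every hypothesis in VS
print(classify_with_boundaries(S, G, ('gray', 'large', 'round', 'dry', 'high', 'rough')))
# negative: covered by no member of G, hence by no hypothesis in VS
print(classify_with_boundaries(S, G, ('red', 'large', 'round', 'humid', 'low', 'smooth')))
# no unanimous agreement: some hypotheses in VS say positive, others negative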
20. Symbolic Machine Learning Ib
- Preliminaries
- Candidate Elimination Algorithm
- Inductive Bias
- Summary
21. Inductive Bias
Inductive bias is the preference for a hypothesis space H and a search mechanism over H. What would happen if we chose an H that contains all possible hypotheses? What would the size of H be?
|H| = size of the power set of the input space X.
Example: with n Boolean features, |X| = 2^n, and the size of H is 2^(2^n).
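A quick illustration of these sizes for small n, following the Boolean-feature example above:

# With n Boolean features there are 2**n distinct instances, and an unbiased
# hypothesis space (every subset of the instance space, i.e. its power set)
# contains 2**(2**n) hypotheses.
for n in (3, 5, 10):
    instances = 2 ** n
    hypotheses = 2 ** instances
    print(f"n={n:2d}  |X| = {instances:4d}  |H| = 2**{instances}  ({len(str(hypotheses))} digits)")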
22. Inductive Bias
In this case, the candidate elimination algorithm could classify as positive only the training examples it has already seen; it could not generalize to any unseen instance. This is because H is so large that it contains a hypothesis for every possible labeling of the instances, so the training data never rule out competing answers on new instances.
A property of any inductive algorithm: it must have some built-in assumptions, embodied in the choice of H. Without assumptions, generalizing beyond the training data is impossible.
23. Symbolic Machine Learning Ib
- Preliminaries
- Candidate Elimination Algorithm
- Inductive Bias
- Summary
24. Summary
- The candidate elimination algorithm exploits the general-to-specific ordering of hypotheses to find all hypotheses consistent with the training data.
- The version space contains all consistent hypotheses and is compactly represented by the two lists S and G.
- The candidate elimination algorithm is not robust to noise and assumes the target concept is included in the hypothesis space.
- Any inductive algorithm needs some assumptions about the hypothesis space; otherwise it would be impossible to make predictions.