Title: Deriving Classification Rules
1. Deriving Classification Rules
2. The Covering Approach for Deriving Classification Rules
- The covering algorithm recursively identifies a
new test to be added to the current rule to
further improve accuracy.
3. An Example of the Covering Algorithm
- Step 1: If x > 1.2, then class a.
- Step 2: If x > 1.2 and y > 2.6, then class a.
4. Continue to Derive More Comprehensive Rules
- The rule "if x > 1.2 and y > 2.6, then class a" covers all a's but one.
- A new rule, "if x > 1.4 and y < 2.4, then class a", may be added to cover all a's.
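Read together, the two rules form a rule set: an instance is labelled a as soon as either rule fires. A minimal Python sketch of that idea follows. The thresholds are the ones on the slide; the function names and the sample points are made up for illustration and do not come from the slide's figure.

```python
def rule_1(x: float, y: float) -> bool:
    # If x > 1.2 and y > 2.6, then class a.
    return x > 1.2 and y > 2.6


def rule_2(x: float, y: float) -> bool:
    # If x > 1.4 and y < 2.4, then class a.
    return x > 1.4 and y < 2.4


def is_class_a(x: float, y: float) -> bool:
    # A rule set is a disjunction of rules; each rule is a conjunction of tests.
    return rule_1(x, y) or rule_2(x, y)


# Hypothetical points, not taken from the original figure.
for point in [(1.3, 2.8), (1.5, 2.0), (1.0, 3.0)]:
    print(point, is_class_a(*point))
```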
5. The Covering Algorithm
- The covering algorithm operates by adding new tests to the rule under construction, always striving to create a rule with maximum accuracy.
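The rule-growing step can be sketched in a few lines of Python. This is an illustrative sketch, not the slides' own pseudocode. It assumes nominal attributes, instances stored as dictionaries, and ties broken in favour of the test that covers more positive instances. The toy data set and the names `grow_rule` and `label_key` are hypothetical.

```python
from fractions import Fraction


def grow_rule(instances, attributes, target, label_key="label"):
    """Greedily add the test with the highest accuracy p/t until the rule
    is perfect on the covered instances or no attribute is left to test."""
    rule = {}                  # attribute -> required value (a conjunction)
    covered = list(instances)  # instances that satisfy the rule so far
    while covered and len(rule) < len(attributes):
        if all(inst[label_key] == target for inst in covered):
            break  # the rule is already 100% accurate
        best = None
        for attr in attributes:
            if attr in rule:
                continue
            for value in sorted({inst[attr] for inst in covered}):
                subset = [i for i in covered if i[attr] == value]
                p = sum(i[label_key] == target for i in subset)
                # Accuracy first; assumed tie-break: more positives covered.
                score = (Fraction(p, len(subset)), p)
                if best is None or score > best[0]:
                    best = (score, attr, value, subset)
        _, attr, value, covered = best
        rule[attr] = value
    return rule


# Hypothetical toy data, for illustration only.
toy = [
    {"colour": "red", "size": "big", "label": "yes"},
    {"colour": "red", "size": "small", "label": "no"},
    {"colour": "blue", "size": "big", "label": "no"},
    {"colour": "blue", "size": "small", "label": "no"},
]
print(grow_rule(toy, ["colour", "size"], "yes"))
```

On the toy data the first pass cannot find a perfect single test, so a second test is added to the rule, which is the behaviour the slide describes.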
6. A More Comprehensive Example and the Prism Algorithm
- Assume we want to derive a rule for recommendation = hard based on the following dataset.
7. Insert Table 1.1 on page 4
8. The Candidate Tests and Their Accuracies
age = young                             2/8
age = pre-presbyopic                    1/8
age = presbyopic                        1/8
spectacle prescription = myope          3/12
spectacle prescription = hypermetrope   1/12
astigmatism = no                        0/12
astigmatism = yes                       4/12
tear production rate = reduced          0/12
tear production rate = normal           4/12
- Among the 9 candidates, the following two have the highest accuracy: astigmatism = yes (4/12) and tear production rate = normal (4/12).
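The counts above can be reproduced with a short Python sketch. Table 1.1 itself is not included in this transcript, so the data set here is an assumption: one instance for every combination of attribute values (24 in total, which matches the denominators above), with an instance labelled hard when it matches the final rules derived later in these slides. Only the hard / not hard distinction is modelled.

```python
from itertools import product

ATTRIBUTES = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle prescription": ["myope", "hypermetrope"],
    "astigmatism": ["no", "yes"],
    "tear production rate": ["reduced", "normal"],
}


def is_hard(inst):
    # Assumption: labels reconstructed from the final rules in these slides,
    # not copied from Table 1.1.
    return (inst["astigmatism"] == "yes"
            and inst["tear production rate"] == "normal"
            and (inst["spectacle prescription"] == "myope"
                 or inst["age"] == "young"))


# Assumption: exactly one instance per combination of attribute values.
DATA = [dict(zip(ATTRIBUTES, values)) for values in product(*ATTRIBUTES.values())]


def candidate_tests(instances):
    """Return {(attribute, value): (p, t)}: t instances pass the test and
    p of those have recommendation = hard."""
    result = {}
    for attr, values in ATTRIBUTES.items():
        for value in values:
            subset = [i for i in instances if i[attr] == value]
            result[(attr, value)] = (sum(map(is_hard, subset)), len(subset))
    return result


for (attr, value), (p, t) in candidate_tests(DATA).items():
    print(f"{attr} = {value}: {p}/{t}")
```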
9. The First Intermediate Rule
- Assume that we pick astigmatism = yes at random. Then we have the first intermediate rule:
- If astigmatism = yes, then recommendation = hard.
- Now, consider the remaining possible tests in order to refine the rule.
11. Tests to Refine the Intermediate Rule
age = young                             2/4
age = pre-presbyopic                    1/4
age = presbyopic                        1/4
spectacle prescription = myope          3/6
spectacle prescription = hypermetrope   1/6
tear production rate = reduced          0/6
tear production rate = normal           4/6
- The test tear production rate = normal is the apparent winner.
- Hence, the intermediate rule becomes
- If astigmatism = yes and tear production rate = normal, then recommendation = hard.
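Picking the winner is just a maximum over the accuracy ratios. The small Python sketch below uses the p/t counts listed on this slide; the dictionary layout and the helper name `best_test` are illustrative choices, not part of the original material.

```python
from fractions import Fraction

# (p, t) per candidate test, copied from the list on this slide:
# t instances with astigmatism = yes pass the test, p of them are hard.
candidates = {
    "age = young": (2, 4),
    "age = pre-presbyopic": (1, 4),
    "age = presbyopic": (1, 4),
    "spectacle prescription = myope": (3, 6),
    "spectacle prescription = hypermetrope": (1, 6),
    "tear production rate = reduced": (0, 6),
    "tear production rate = normal": (4, 6),
}


def best_test(tests):
    # Exact fractions keep equal ratios exactly equal (no rounding noise).
    return max(tests, key=lambda name: Fraction(*tests[name]))


winner = best_test(candidates)
print(f"If astigmatism = yes and {winner}, then recommendation = hard.")
```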
12. Insert Table 4.9 on page 102
13. More Tests to Get the Perfect Rule
age = young                             2/2
age = pre-presbyopic                    1/2
age = presbyopic                        1/2
spectacle prescription = myope          3/3
spectacle prescription = hypermetrope   1/3
- Both age = young (2/2) and spectacle prescription = myope (3/3) are perfectly accurate. We may include the test spectacle prescription = myope, which covers more instances, to get a perfect rule.
- The rule now is
- If astigmatism = yes and tear production rate = normal and spectacle prescription = myope, then recommendation = hard.
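When two tests are both perfectly accurate, a common tie-break is to prefer the one with the larger coverage. The Python sketch below checks the counts for this step and applies that tie-break. As Table 4.9 is not reproduced in this transcript, the data set is an assumption: one instance per combination of attribute values, labelled hard according to the final rules of these slides.

```python
from itertools import product

ATTRIBUTES = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle prescription": ["myope", "hypermetrope"],
    "astigmatism": ["no", "yes"],
    "tear production rate": ["reduced", "normal"],
}


def is_hard(inst):
    # Assumption: labels reconstructed from the final rules in these slides.
    return (inst["astigmatism"] == "yes"
            and inst["tear production rate"] == "normal"
            and (inst["spectacle prescription"] == "myope"
                 or inst["age"] == "young"))


# Assumption: exactly one instance per combination of attribute values.
DATA = [dict(zip(ATTRIBUTES, values)) for values in product(*ATTRIBUTES.values())]

rule = {"astigmatism": "yes", "tear production rate": "normal"}
covered = [i for i in DATA if all(i[a] == v for a, v in rule.items())]

scores = {}
for attr, values in ATTRIBUTES.items():
    if attr in rule:
        continue  # each attribute is tested at most once per rule
    for value in values:
        subset = [i for i in covered if i[attr] == value]
        scores[(attr, value)] = (sum(map(is_hard, subset)), len(subset))

# Highest accuracy first; among equally accurate tests, larger coverage wins.
best = max(scores, key=lambda k: (scores[k][0] / scores[k][1], scores[k][1]))

for (attr, value), (p, t) in scores.items():
    print(f"{attr} = {value}: {p}/{t}")
print("chosen:", best)
```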
14. Deriving More Rules to Get 100% Coverage
- The rule that we just derived covers 3 out of the 4 instances that have recommendation = hard.
- Therefore, we delete these 3 instances and start the process over again.
15. The Complete Rules List for Recommendation Hard
- Eventually, we will get the following list of rules:
- If astigmatism = yes and tear production rate = normal and spectacle prescription = myope, then recommendation = hard.
- If age = young and astigmatism = yes and tear production rate = normal, then recommendation = hard.
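The whole procedure (grow a rule, delete the instances it covers, repeat until no hard instance is left) can be sketched as follows. This is a hedged illustration rather than the reference Prism implementation. The data set is an assumption, rebuilt as one instance per attribute combination and labelled from the two rules above, so the run mainly shows that the greedy search recovers those rules. Ties are broken by coverage and then by attribute order instead of at random.

```python
from fractions import Fraction
from itertools import product

ATTRIBUTES = {
    "age": ["young", "pre-presbyopic", "presbyopic"],
    "spectacle prescription": ["myope", "hypermetrope"],
    "astigmatism": ["no", "yes"],
    "tear production rate": ["reduced", "normal"],
}


def is_hard(inst):
    # Assumption: labels reconstructed from the rule list on this slide.
    return (inst["astigmatism"] == "yes"
            and inst["tear production rate"] == "normal"
            and (inst["spectacle prescription"] == "myope"
                 or inst["age"] == "young"))


def covers(rule, inst):
    return all(inst[a] == v for a, v in rule.items())


def grow_rule(instances):
    """Add the most accurate test until every covered instance is hard."""
    rule, covered = {}, list(instances)
    while not all(map(is_hard, covered)) and len(rule) < len(ATTRIBUTES):
        best_score, best_test = None, None
        for attr, values in ATTRIBUTES.items():
            if attr in rule:
                continue
            for value in values:
                subset = [i for i in covered if i[attr] == value]
                if not subset:
                    continue  # a test that covers nothing has no accuracy
                p = sum(map(is_hard, subset))
                score = (Fraction(p, len(subset)), p)  # accuracy, then coverage
                if best_score is None or score > best_score:
                    best_score, best_test = score, (attr, value)
        attr, value = best_test
        rule[attr] = value
        covered = [i for i in covered if i[attr] == value]
    return rule


def rules_for_hard(instances):
    """Separate and conquer: derive a rule, remove what it covers, repeat."""
    remaining, rules = list(instances), []
    while any(map(is_hard, remaining)):
        rule = grow_rule(remaining)
        rules.append(rule)
        remaining = [i for i in remaining if not covers(rule, i)]
    return rules


# Assumption: exactly one instance per combination of attribute values.
DATA = [dict(zip(ATTRIBUTES, values)) for values in product(*ATTRIBUTES.values())]
for rule in rules_for_hard(DATA):
    tests = " and ".join(f"{a} = {v}" for a, v in rule.items())
    print(f"If {tests}, then recommendation = hard.")
```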
16. An Example of Overfitting
- Assume that we have derived the following rule.
- If A and B and C, then Ans = yes.
- Further assume that
- (1) 20 training samples satisfy condition A and B, and 17 of them give the answer yes.
- (2) 10 training samples satisfy condition A and B and C, and 9 of them give the answer yes.
17.
- The pessimistic error rates of (1) and (2) are compared at the 95% confidence level.
- Adding condition C does not lower the pessimistic error rate. Therefore, we may remove condition C from the rule.
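The formula used on the original slide did not survive in this transcript. One common choice for a pessimistic error estimate is the upper limit of the Wilson score interval on the observed error rate. The Python sketch below uses that formula as an assumption, with z = 1.645 (one-sided 95%) also assumed; the function name `pessimistic_error` is illustrative.

```python
import math


def pessimistic_error(errors: int, n: int, z: float = 1.645) -> float:
    """Upper limit of the Wilson score interval for an error rate.

    Assumptions: this particular formula and z = 1.645 (one-sided 95%);
    the slide's own formula is not available in the transcript.
    """
    f = errors / n  # observed error rate
    spread = z * math.sqrt(f / n - f * f / n + z * z / (4 * n * n))
    return (f + z * z / (2 * n) + spread) / (1 + z * z / n)


without_c = pessimistic_error(errors=20 - 17, n=20)  # (1) condition A and B
with_c = pessimistic_error(errors=10 - 9, n=10)      # (2) condition A and B and C
print(f"A and B: {without_c:.3f}   A and B and C: {with_c:.3f}")
```

Although the observed error rate drops from 3/20 to 1/10 when C is added, the smaller sample makes the pessimistic estimate for (2) larger under these assumptions, which is the pattern that justifies dropping C.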
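18.
- Similarly, we can apply the chi-square test. The corresponding contingency table, for the 20 samples that satisfy A and B, is as follows.
                    yes   no
  C satisfied         9    1
  C not satisfied     8    2
- The Pearson chi-square statistic for this table is about 0.39, well below the 95% critical value of 3.84 for one degree of freedom.
- Therefore, condition C should be deleted from the rule.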
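The chi-square check can be reproduced from the counts given earlier. Of the 20 samples satisfying A and B, 10 also satisfy C (9 yes, 1 no), so the other 10 give 17 - 9 = 8 yes and 2 no. The Python sketch below computes the Pearson statistic for that 2x2 table without a continuity correction; whether the original slide applied a correction is not known.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: C satisfied / C not satisfied (all within A and B); columns: yes / no.
statistic = chi_square_2x2(9, 1, 8, 2)

# Standard critical value for 1 degree of freedom at the 5% level.
CRITICAL_95_DF1 = 3.841
print(f"chi-square = {statistic:.3f}, significant: {statistic > CRITICAL_95_DF1}")
```

With expected counts this small (1.5 in each "no" cell) the chi-square approximation is rough, so the result is best read as "no evidence that C matters" rather than as a precise p-value.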
19. Final Remarks on Decision Trees and Decision Rules
- Given a decision tree, we can derive a set of decision rules from it. However, the reverse derivation is sometimes impossible.
- The derivation of a single decision rule focuses more on precision than on recall.
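The tree-to-rules direction is mechanical: every root-to-leaf path becomes one rule. The Python sketch below illustrates this on a made-up tree; the tree encoding (a leaf is a label string, an internal node is an attribute paired with a dictionary of branches) and the attribute names A and B are hypothetical.

```python
def tree_to_rules(tree, path=()):
    """Yield (tests, label) for every root-to-leaf path.

    A leaf is a string label; an internal node is a pair
    (attribute, {value: subtree}).
    """
    if isinstance(tree, str):
        yield list(path), tree
        return
    attribute, branches = tree
    for value, subtree in branches.items():
        yield from tree_to_rules(subtree, path + ((attribute, value),))


# Hypothetical tree, for illustration only.
tree = ("A", {
    "false": "no",
    "true": ("B", {
        "false": "no",
        "true": "yes",
    }),
})

for tests, label in tree_to_rules(tree):
    condition = " and ".join(f"{a} = {v}" for a, v in tests)
    print(f"If {condition}, then Ans = {label}.")
```

Every rule produced this way starts with the test at the root. An arbitrary rule set has no such shared first test, which is one reason the reverse derivation can fail.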
20.
- We have used 3 different measures in evaluating the effectiveness of a decision, among many more that have been proposed in the literature:
- Information gain
- Chi-square statistic
- Accuracy
- How these measures compare depends on the characteristics of the data set and the goal of the application.
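To see how the three measures can be computed side by side, the Python sketch below evaluates two of the candidate tests from the contact-lens example using only counts stated in these slides (4 hard instances in total; 24 instances overall, inferred from the denominators). The function names are illustrative, and the formulas are the standard two-class entropy and the 2x2 Pearson statistic.

```python
import math


def entropy(pos: int, total: int) -> float:
    """Two-class entropy in bits; 0 for an empty or pure set."""
    if total == 0 or pos in (0, total):
        return 0.0
    q = pos / total
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))


def measures(p: int, t: int, P: int, T: int) -> dict:
    """p of the t instances passing the test are positive; P of all T are.

    Assumes the test splits the data into two non-empty parts and that
    both classes occur, otherwise the chi-square denominator is zero.
    """
    rest_pos, rest = P - p, T - t
    gain = (entropy(P, T)
            - (t / T) * entropy(p, t)
            - (rest / T) * entropy(rest_pos, rest))
    a, b, c, d = p, t - p, rest_pos, rest - rest_pos
    chi2 = T * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return {"accuracy": p / t, "information gain": gain, "chi-square": chi2}


# tear production rate = normal: 4 of 12 are hard.
print(measures(p=4, t=12, P=4, T=24))
# age = young: 2 of 8 are hard.
print(measures(p=2, t=8, P=4, T=24))
```

In this example all three measures rank the first test above the second; on other data they can disagree.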