Chapter 3: Decision Tree Learning - PowerPoint PPT Presentation

Slides: 38
Provided by: csSungsh
1
Chapter 3 Decision Tree Learning
2
Decision Tree Learning
  • Introduction
  • Decision Tree Representation
  • Appropriate Problems for Decision Tree Learning
  • Basic Algorithm
  • Hypothesis Space Search in Decision Tree Learning
  • Inductive Bias in Decision Tree Learning
  • Issues in Decision Tree Learning
  • Summary

3
Introduction
  • A method for approximating discrete-valued target
    functions
  • Learned trees are easily converted into sets of
    if-then rules
  • Example systems: ID3, ASSISTANT, C4.5
  • Preference bias toward smaller trees
  • Searches a completely expressive hypothesis space

4
Decision Tree Representation
  • An instance is classified by sorting it from the
    root down to some leaf
  • Each node tests some attribute of the instance
  • Each branch corresponds to one value of that
    attribute
  • Disjunction of conjunctions of constraints on the
    attribute values of instances
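As a small illustration (a sketch assuming the chapter's usual PlayTennis tree; the function name is mine), each root-to-leaf path is a conjunction of attribute-value constraints, and the tree as a whole is the disjunction of those conjunctions:

```python
# The PlayTennis tree written as nested if/else. The paths that end in
# 'yes' read as the disjunction:
#   (Outlook=Sunny AND Humidity=Normal)
#   OR (Outlook=Overcast)
#   OR (Outlook=Rain AND Wind=Weak)
def play_tennis(outlook, humidity, wind):
    if outlook == 'sunny':
        return 'yes' if humidity == 'normal' else 'no'
    if outlook == 'overcast':
        return 'yes'
    return 'yes' if wind == 'weak' else 'no'  # outlook == 'rain'
```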

5
(No Transcript)
6
Appropriate Problems for Decision Tree Learning
  • Instances are represented by attribute-value
    pairs
  • The target function has discrete output values
  • Disjunctive descriptions may be required
  • The training data may contain errors
  • The training data may contain missing attribute
    values

7
Basic Algorithm
  • Top-down, greedy search through the space of
    possible decision trees
  • At each node, choose the attribute that best
    classifies the training examples.
  • Entropy, Information gain
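The top-down greedy loop can be sketched as follows (a minimal sketch, not the chapter's full algorithm; examples are tuples indexed by attribute position, and all names are mine):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def id3(examples, labels, attributes):
    """Top-down greedy induction: stop on a pure node, otherwise split
    on the attribute with the highest information gain and recurse."""
    if len(set(labels)) == 1:
        return labels[0]                              # pure leaf
    if not attributes:
        return Counter(labels).most_common(1)[0][0]   # majority leaf
    def gain(a):
        parts = {}
        for x, y in zip(examples, labels):
            parts.setdefault(x[a], []).append(y)
        return entropy(labels) - sum(
            len(p) / len(labels) * entropy(p) for p in parts.values())
    best = max(attributes, key=gain)
    branches = {}
    for v in sorted({x[best] for x in examples}):
        sub = [(x, y) for x, y in zip(examples, labels) if x[best] == v]
        xs, ys = [list(t) for t in zip(*sub)]
        branches[v] = id3(xs, ys, [a for a in attributes if a != best])
    return (best, branches)
```

Internal nodes come back as `(attribute_index, {value: subtree})` pairs and leaves as class labels, so the greedy choice at each level is visible in the returned structure.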

8
(No Transcript)
9
Entropy
  • Minimum number of bits of information needed to
    encode the classification of an arbitrary member
    of S
  • entropy = 0 if all members belong to the same
    class
  • entropy = 1 if there are equally many positive
    and negative examples
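Concretely (a small sketch; the 9+/5- counts below are the chapter's usual PlayTennis sample):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum_i p_i * log2(p_i): the minimum number of bits
    needed to encode the class of an arbitrary member of S."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())
```

A pure sample gives 0, an even positive/negative split gives 1, and the 9+/5- sample gives about 0.940.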

10
(No Transcript)
11
Information Gain
  • Expected reduction in entropy caused by
    partitioning the examples according to attribute
    A
  • How much the entropy decreases when the examples
    are partitioned by attribute A

12
(No Transcript)
13
(No Transcript)
14
Which Attribute is the Best Classifier? (1)
15
Which Attribute is the Best Classifier? (2)
Classifying examples by Humidity provides more
information gain than by Wind.
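This comparison can be reproduced with a short sketch, assuming the chapter's usual counts (Humidity splits the 9+/5- sample into 3+/4- and 6+/1-; Wind splits it into 6+/2- and 3+/3-); the function names are mine:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, partitions):
    """Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v),
    where `partitions` holds the label lists S_v, one per value of A."""
    n = len(parent)
    return entropy(parent) - sum(len(p) / n * entropy(p) for p in partitions)

s = ['+'] * 9 + ['-'] * 5
humidity = [['+'] * 3 + ['-'] * 4, ['+'] * 6 + ['-'] * 1]  # high, normal
wind = [['+'] * 6 + ['-'] * 2, ['+'] * 3 + ['-'] * 3]      # weak, strong
```

With these counts the gains come out to roughly 0.15 for Humidity versus roughly 0.05 for Wind, so Humidity is the better classifier.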
16
(No Transcript)
17
Hypothesis Space Search in Decision Tree Learning
(1)
  • Searches for a hypothesis that fits the training
    examples
  • ID3's hypothesis space
  • the set of possible decision trees
  • Simple-to-complex, hill-climbing search
  • Information gain guides the hill-climbing

18
(No Transcript)
19
Hypothesis Space Search in Decision Tree Learning
(2)
  • Complete space of finite discrete-valued
    functions
  • Maintains a single current hypothesis
  • No backtracking
  • Uses all training examples at each step, so the
    search is statistically based and less sensitive
    to errors in individual examples

20
Inductive Bias (1) - Case ID3
  • Of the many decision trees consistent with the
    examples, which one does ID3 choose?
  • Shorter trees are preferred over larger trees
  • Trees that place high information gain attributes
    close to the root are preferred

21
Inductive Bias (2)
22
Inductive Bias (3)
  • Occam's razor
  • Prefer the simplest hypothesis that fits the data
  • Major difficulty
  • The size of a hypothesis depends on the learner's
    internal representation

23
Issues in Decision Tree Learning
  • How deeply to grow the decision tree
  • Handling continuous attributes
  • Choosing an appropriate attribute selection
    measure
  • Handling the missing attribute values

24
Avoiding Overfitting the Data (1)
  • What can go wrong with a tree that perfectly
    classifies the training examples?
  • 1. The data contain noise
  • 2. There are too few training examples
  • A hypothesis h overfits the training data if
    there is an alternative hypothesis h' such that
  • error_train(h) < error_train(h') (over the
    training examples)
  • error_D(h) > error_D(h') (over the entire
    distribution of instances)

25
(No Transcript)
26
Avoiding Overfitting the Data (2)
  • Approaches
  • 1. Split the examples into a training set and a
    validation set
  • 2. Use all data for training, but apply a
    statistical test to estimate whether expanding
    (or pruning) a node is likely to improve
    performance beyond the training set
  • 3. Use an explicit measure of the complexity of
    encoding the training examples and the decision
    tree - chapter 6
  • Approach 1: the training and validation set
    approach
  • The validation set guides the pruning of the
    hypothesis

27
Reduced Error Pruning
  • Using the validation set, compare the accuracy
    of the original tree and of the tree with a node
    pruned, and keep the better one
  • Pruning a node means making it a leaf and
    assigning it the most common classification of
    the training examples at that node
  • Repeat, always pruning the node whose removal
    most improves validation accuracy, until further
    pruning is harmful
  • Requires splitting the data into training,
    validation, and test sets
  • A problem when data is limited
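A compact bottom-up variant can be sketched as follows, assuming the `(attribute_index, {value: subtree})` tree encoding (all names are mine): subtrees are pruned first, then a node is replaced by its majority-class leaf whenever that is at least as accurate on the validation examples that reach it:

```python
from collections import Counter

def classify(tree, x):
    # Follow branches until a leaf label is reached (assumes every
    # attribute value seen here has a branch).
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[x[attr]]
    return tree

def accuracy(tree, data):
    return sum(classify(tree, x) == y for x, y in data) / len(data)

def prune(tree, val):
    """Reduced-error pruning: prune subtrees first, then replace this
    node with a majority leaf if validation accuracy does not get worse."""
    if not isinstance(tree, tuple) or not val:
        return tree
    attr, branches = tree
    pruned = (attr, {v: prune(sub, [(x, y) for x, y in val if x[attr] == v])
                     for v, sub in branches.items()})
    leaf = Counter(y for _, y in val).most_common(1)[0][0]
    leaf_acc = sum(y == leaf for _, y in val) / len(val)
    return leaf if leaf_acc >= accuracy(pruned, val) else pruned
```

On ties the leaf wins, which encodes the preference for smaller trees.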

28
(No Transcript)
29
Rule Post-Pruning (1)
  • 1. Grow the decision tree (allowing overfitting)
  • 2. Convert the tree into one rule per
    root-to-leaf path
  • 3. Prune each rule by removing any precondition
    whose removal improves estimated accuracy
  • 4. Sort the pruned rules by estimated accuracy;
    apply them in that order when classifying
    subsequent instances
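Step 2 can be sketched as follows, again assuming the `(attribute_index, {value: subtree})` encoding (names are mine):

```python
def tree_to_rules(tree, path=()):
    """Return one (preconditions, class) pair per root-to-leaf path;
    each precondition is an (attribute_index, value) test."""
    if not isinstance(tree, tuple):        # leaf: emit the finished rule
        return [(list(path), tree)]
    attr, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules += tree_to_rules(subtree, path + ((attr, value),))
    return rules
```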

30
Rule Post-Pruning (2)
  • Advantages of converting the tree to rules
    before pruning
  • The distinct contexts in which a decision node
    is used can be pruned independently
  • There is no distinction between attribute tests
    near the root and those near the leaves

31
Incorporating Continuous-Valued Attributes
  • Pick the threshold that maximizes information
    gain
  • Sort the examples by the attribute value
  • Find adjacent pairs that differ in their target
    classification
  • Generate a candidate threshold midway between
    each such pair
  • Choose the candidate with the highest
    information gain
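The steps above can be sketched as (names are mine; the 40…90 values are the chapter's usual Temperature example):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Sort by value, take the midpoint of every adjacent pair whose
    labels differ as a candidate threshold, and keep the candidate
    with the highest information gain."""
    pairs = sorted(zip(values, labels))
    candidates = [(pairs[i][0] + pairs[i + 1][0]) / 2
                  for i in range(len(pairs) - 1)
                  if pairs[i][1] != pairs[i + 1][1]]
    def gain(t):
        left = [y for v, y in pairs if v <= t]
        right = [y for v, y in pairs if v > t]
        n = len(pairs)
        return (entropy(labels) - len(left) / n * entropy(left)
                - len(right) / n * entropy(right))
    return max(candidates, key=gain)
```

For Temperature values 40, 48, 60, 72, 80, 90 with classes no, no, yes, yes, yes, no, the candidates are 54 and 85, and 54 wins on information gain.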

32
Alternative Measures for Selecting Attributes (1)
  • The information gain measure favors attributes
    with many values
  • Extreme example
  • Attribute Date (e.g. March 4, 1979)
  • Perfectly predicts the target attribute over the
    training data
  • But is a poor predictor on unseen instances

33
Alternative Measures for Selecting Attributes (2)
  • SplitInformation(S, A) is the entropy of S with
    respect to the values of attribute A
  • It equals log2(n) when n examples take n
    distinct values
  • It equals 1 when the examples split evenly into
    two values
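Both facts follow directly from the SplitInformation formula; a small sketch (names are mine):

```python
import math

def split_information(sizes):
    """SplitInformation(S, A) = -sum_i (|S_i| / |S|) * log2(|S_i| / |S|):
    the entropy of S with respect to the values of attribute A, where
    `sizes` lists |S_i| for each value of A."""
    n = sum(sizes)
    return -sum(c / n * math.log2(c / n) for c in sizes)

def gain_ratio(gain, sizes):
    """GainRatio(S, A) = Gain(S, A) / SplitInformation(S, A): penalizes
    attributes that split the data many ways (e.g. Date)."""
    return gain / split_information(sizes)
```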

34
Alternative Measures for Selecting Attributes(3)
35
Handling Training Examples with Missing Attribute
Values
  • Suppose an example <x, C(x)> at node n is
    missing the value of the attribute being
    evaluated
  • Assign it the most common value of attribute A
    among the examples at node n
  • Or distribute it across branches in proportion
    to the frequency of A's values at node n
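The two fixes above can be sketched as (names are mine; `None` marks a missing value):

```python
from collections import Counter

def most_common_value(values):
    """Fix 1: substitute the most common observed value of the
    attribute among the examples at this node."""
    observed = [v for v in values if v is not None]
    return Counter(observed).most_common(1)[0][0]

def value_weights(values):
    """Fix 2: weights for distributing a missing-valued example across
    the branches in proportion to the observed value frequencies."""
    observed = [v for v in values if v is not None]
    n = len(observed)
    return {v: c / n for v, c in Counter(observed).items()}
```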

36
Handling Attributes with Differing Costs
37
Summary
  • The ID3 family grows a tree downward from the
    root, greedily choosing the next best attribute
  • Complete hypothesis space
  • Preference for smaller trees
  • Overfitting avoidance by Post-pruning