1
CS 236501 Introduction to AI
  • Tutorial 9
  • ID3

2
Learning
[Diagram: a Training Set is fed to a Learner, which produces a Classifier;
the Classifier maps an unlabeled example to a label (+/-).]
We aim to produce an accurate classifier
3
Example: Play Tennis
  • We want to learn the concept: "A good day to play tennis"
  • Examples to be used for learning:

    Attributes                                    Classification (label)
    Outlook   Temperature   Humidity   Wind       PlayTennis
    Sunny     Cold          High       Weak       YES
    Rain      Hot           High       Strong     NO
    Sunny     Hot           High       Strong     NO
4
Decision Trees
A node represents an attribute; the edges below it are the possible
attribute values; leaves contain classifications.

    Outlook
    ├─ Sunny    → Humidity
    │             ├─ High   → NO
    │             └─ Normal → YES
    ├─ Overcast → YES
    └─ Rain     → Wind
                  ├─ Strong → NO
                  └─ Weak   → YES

(Outlook = Sunny, Temperature = High, Humidity = High, Wind = Weak)
⇒ PlayTennis = NO
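The tree above can be encoded and walked programmatically. A minimal sketch; the nested-tuple layout, the attribute-name strings, and the `classify` helper are illustrative choices, not from the slides:

```python
# The PlayTennis tree: an inner node is (attribute, {value: subtree}),
# a leaf is just the classification string.
TREE = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "NO", "Normal": "YES"}),
    "Overcast": "YES",
    "Rain":     ("Wind", {"Strong": "NO", "Weak": "YES"}),
})

def classify(tree, example):
    """Walk from the root to a leaf, following the example's attribute values."""
    while isinstance(tree, tuple):      # inner node: (attribute, branches)
        attribute, branches = tree
        tree = branches[example[attribute]]
    return tree                         # leaf: the classification

example = {"Outlook": "Sunny", "Temperature": "High",
           "Humidity": "High", "Wind": "Weak"}
print(classify(TREE, example))          # -> NO, matching the slide's example
```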
5
Building a decision tree
  • Building a decision tree, given a group of labeled examples (training
    set):
  • Choose an attribute A
  • Split the examples according to the values of A
  • Build trees for the sons recursively; stop splitting when all examples
    at a node have the same label

[Diagram: a node A with one son per possible value, along the edges
A = a1, A = a2, A = a3.]



6
ID3
  • ID3 is an algorithm for building decision trees
  • ID3 uses information gain to select the best
    attribute for splitting
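The recursive procedure from the previous slide can be sketched with the attribute-selection heuristic left as a parameter (ID3 supplies information gain). The `(attributes-dict, label)` example layout, the `build_tree` name, and the majority-label fallback are illustrative assumptions:

```python
from collections import Counter

def build_tree(examples, attributes, choose_attribute):
    """Recursive splitting; examples are (attribute-dict, label) pairs.
    choose_attribute is the selection heuristic, e.g. ID3's information gain."""
    labels = [label for _, label in examples]
    if len(set(labels)) == 1:                    # stop: all labels agree
        return labels[0]
    if not attributes:                           # no attribute left: majority label
        return Counter(labels).most_common(1)[0][0]
    a = choose_attribute(examples, attributes)   # 1. choose an attribute A
    branches = {}
    for attrs, label in examples:                # 2. split by the values of A
        branches.setdefault(attrs[a], []).append((attrs, label))
    rest = [x for x in attributes if x != a]
    return (a, {value: build_tree(subset, rest, choose_attribute)
                for value, subset in branches.items()})   # 3. recurse on sons
```

With a trivial heuristic that just takes the first available attribute, two contradictory examples already split into a one-level tree.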

7
Decision trees and ID3
[Diagram: the learning scheme of slide 2 with ID3 as the Learner, the
Training Set as its input, and the decision tree it builds as the
Classifier.]
8
Information Gain
[Diagram: a good split takes a node with high uncertainty and produces
sons with low uncertainty.]
9
Information Gain
  Gain(A) = I(p, n) - sum over i = 1..Va of (Ei / (p + n)) * I(pi, ni)

  I(p, n) = -(p / (p + n)) * log2(p / (p + n))
            - (n / (p + n)) * log2(n / (p + n))

  • Where:
  • p, n - number of positive/negative examples at the node
  • I(p, n) - uncertainty (entropy) given p and n
  • Va - number of possible values for attribute A
  • Ei - number of examples at son i (pi, ni of them positive/negative)
  • ID3 chooses the attribute with the highest gain for splitting
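A minimal sketch of these two formulas (assuming the standard ID3 definitions, with Ei = pi + ni at son i):

```python
from math import log2

def uncertainty(p, n):
    """I(p, n): entropy of a node with p positive and n negative examples."""
    total = p + n
    result = 0.0
    for count in (p, n):
        if count:                       # 0 * log2(0) is taken as 0
            result -= (count / total) * log2(count / total)
    return result

def information_gain(p, n, sons):
    """Gain(A) = I(p, n) - sum over sons i of (Ei / (p + n)) * I(pi, ni).
    sons is a list of (pi, ni) pairs, one per value of attribute A."""
    return uncertainty(p, n) - sum(
        (pi + ni) / (p + n) * uncertainty(pi, ni) for pi, ni in sons)

# Gain of Wind on the well-known 14-example PlayTennis data set
# (an assumption; the slide above shows only a 3-example fragment):
print(round(information_gain(9, 5, [(6, 2), (3, 3)]), 3))  # -> 0.048
```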

10
Attribute Types: An Attribute with Discrete Values
  • The domain of attribute A is discrete
  • Domain(A) = {blue, green, yellow}
  • Splitting is simple: create a son for each possible value of A

11
Attribute Types: An Attribute with Continuous Values
  • The domain of attribute A is continuous
  • Domain(A) = [1, 100]
  • How to split?
  • Suggestion: make the domain discrete
  • Domain(A) = {[1, 30), [30, 40), [40, 100]}
  • Problems:
  • Which discretization is good?
  • We will not be able to distinguish between examples in the same range
  • Example: if A represents grades, there will be no difference between
    students with grades within the range 40 - 100

12
An Attribute with Continuous Values
  • A solution: dynamic split
  • Sort the examples according to the values of attribute A
  • For each possible value xi ∈ Domain(A):
  • Try to split into 2 sons: ≤ xi and > xi
  • Measure the information gain of the split
  • An example: a good temperature for playing tennis is between 20 and 28
    degrees
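The dynamic split can be sketched as a search over candidate thresholds (assumed details: binary labels, gain measured with the entropy formula of slide 9, made-up temperature data):

```python
from math import log2

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return -sum((labels.count(l) / total) * log2(labels.count(l) / total)
                for l in set(labels))

def best_threshold(values, labels):
    """Sort the examples by attribute value, try every split <= xi vs > xi,
    and return the threshold with the highest information gain."""
    pairs = sorted(zip(values, labels))
    best = (None, -1.0)
    for i in range(len(pairs) - 1):
        xi = pairs[i][0]
        if xi == pairs[i + 1][0]:
            continue                    # equal values: no split between them
        left = [l for v, l in pairs if v <= xi]
        right = [l for v, l in pairs if v > xi]
        gain = entropy(labels) \
            - (len(left) / len(pairs)) * entropy(left) \
            - (len(right) / len(pairs)) * entropy(right)
        if gain > best[1]:
            best = (xi, gain)
    return best

# Temperatures labeled "good to play" roughly between 20 and 28 degrees:
temps = [10, 15, 22, 25, 27, 33]
play = ["NO", "NO", "YES", "YES", "YES", "NO"]
threshold, gain = best_threshold(temps, play)   # threshold 15 splits off the cold days
```

One binary split cannot capture a whole interval; this is why, as the next slide notes, a continuous attribute remains available for further splits at the sons, where a second threshold (here, an upper one) can be chosen.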

13
Attribute Types: An Important Note
  • Let ATTRIB = {A1, A2, ..., An} be the group of attributes available for
    splitting at the current node
  • Let Ai be the attribute chosen for the split
  • If the domain of Ai is discrete:
  • We will choose from ATTRIB \ {Ai} for splitting at the sons
  • If the domain of Ai is continuous:
  • We will choose from ATTRIB for splitting at the sons

14
The Accuracy of a Classifier
  • We aim to produce an accurate classifier
  • How can we measure the accuracy of a classifier that was produced by
    our algorithm?
  • We could know the true accuracy of the classifier by testing it on all
    possible examples
  • This is usually impossible
  • We can get an estimate of the classifier's accuracy by testing it on a
    subset of all possible examples

15
Estimating the Accuracy of a Classifier
  • Assume that we have a labeled set of examples T
  • We can split T:
  • Use k% of T as a training set
  • Use the rest ((100 - k)%) of T for testing
  • The accuracy of the classifier on the test set will provide us with an
    estimate of the true accuracy
  • Note: it is important that the training set and the test set do not
    overlap; otherwise the accuracy estimate can be too optimistic
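The holdout estimate above can be sketched as follows; `holdout_accuracy`, the `(example, label)` data layout, and the `train` callback are illustrative names, not from the slides:

```python
import random

def holdout_accuracy(data, train, k=70, seed=0):
    """Estimate accuracy with a k% / (100 - k)% non-overlapping split.
    data: list of (example, label) pairs; train: a function mapping a
    training set to a classifier (a function example -> label)."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)      # shuffle a copy, keep data intact
    cut = len(shuffled) * k // 100
    training_set, test_set = shuffled[:cut], shuffled[cut:]
    classifier = train(training_set)
    correct = sum(classifier(x) == y for x, y in test_set)
    return correct / len(test_set)
```

Because the slicing is disjoint, no example appears in both sets, which is exactly the non-overlap condition the slide warns about.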

16
Cross Validation
  • A common method for estimating the accuracy of a classifier by
    splitting the labeled data into non-overlapping training and testing
    sets
  • N-fold cross validation:
  • Split the labeled data into N distinct groups
  • Run N experiments; in each:
  • Use N - 1 groups of examples for learning (training set)
  • Use the remaining group for testing
  • Average the results of the N experiments; this is the accuracy estimate
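The N experiments can be sketched directly; the fold-by-slicing scheme and the `train` callback are illustrative assumptions:

```python
def cross_validation(data, train, n=5):
    """N-fold cross validation: split the data into n distinct groups,
    run n experiments (train on n - 1 groups, test on the held-out one),
    and return the average of the n accuracies."""
    folds = [data[i::n] for i in range(n)]     # n non-overlapping groups
    accuracies = []
    for i, test_set in enumerate(folds):
        training_set = [ex for j, fold in enumerate(folds)
                        if j != i for ex in fold]
        classifier = train(training_set)
        correct = sum(classifier(x) == y for x, y in test_set)
        accuracies.append(correct / len(test_set))
    return sum(accuracies) / n                 # average of X1..Xn
```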

17
An Example: 5-Fold Cross Validation

Labeled Data: Group 1 | Group 2 | Group 3 | Group 4 | Group 5

  Run   Training Set         Test Set   Classifier accuracy
  1     Groups 2, 3, 4, 5    Group 1    X1
  2     Groups 1, 3, 4, 5    Group 2    X2
  3     Groups 1, 2, 4, 5    Group 3    X3
  4     Groups 1, 2, 3, 5    Group 4    X4
  5     Groups 1, 2, 3, 4    Group 5    X5

  Classifier accuracy estimate: average of X1 - X5
18
Learning Curves
  • Show the accuracy of the produced classifier as a
    function of the training set size
  • In simple words, show how classification accuracy behaves when
    learning with more and more examples
  • Note that the accuracies should be measured on
    the same test set, which does not overlap with
    any of the training sets
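A learning curve can be sketched by training on growing prefixes of the data while keeping one fixed, disjoint test set; `learning_curve`, the prefix scheme, and the demonstration majority-label learner are illustrative assumptions:

```python
from collections import Counter

def learning_curve(training_data, test_set, train, sizes):
    """Accuracy of the produced classifier as a function of training-set
    size. Every point is measured on the same test set, which must not
    overlap with any of the training sets."""
    points = []
    for size in sizes:
        classifier = train(training_data[:size])   # learn from a prefix
        correct = sum(classifier(x) == y for x, y in test_set)
        points.append((size, correct / len(test_set)))
    return points

def majority_train(ts):
    """A trivial learner for demonstration: always predict the majority label."""
    label = Counter(l for _, l in ts).most_common(1)[0][0]
    return lambda x: label

# With more examples, the majority learner flips to the right label:
curve = learning_curve([(0, "NO"), (1, "YES"), (2, "YES")],
                       [(9, "YES")], majority_train, [1, 3])
```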