Formal Evaluation Techniques - PowerPoint PPT Presentation

About This Presentation
Title:

Formal Evaluation Techniques

Description:

Curvilinear Relationship. Figure 7.4 A perfect positive correlation (r = 1) ... with no linear correlation (r = 0) but a substantial curvilinear relationship ... – PowerPoint PPT presentation

Slides: 38
Provided by: Richard1376

Transcript and Presenter's Notes

Title: Formal Evaluation Techniques


1
Formal Evaluation Techniques
  • Chapter 7

2
7.1 What Should Be Evaluated?
  • Supervised Model
  • Training Data
  • Attributes
  • Model Builder
  • Parameters
  • Test Set Evaluation

3
(No Transcript)
4
7.2 Tools for Evaluation
5
Single-Valued Summary Statistics
  • Mean
  • Variance
  • Standard deviation

6
The Normal Distribution
7
(No Transcript)
8
Normal Distributions and Sample Means
  • The distribution of means taken from random sets
    of independent samples of equal size is
    approximately normal.
  • Any sample mean will vary less than two
    standard errors from the population mean 95% of
    the time.

9
Computing the Standard Error
  • The variance of the sample mean is estimated by
    dividing the sample variance by the sample
    size.
  • The standard error is computed by taking the
    square root of this estimated variance.
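
A minimal Python sketch of the standard error computation described on this slide. The sample values are hypothetical and chosen only for illustration; the sketch assumes the unbiased (n - 1) sample variance, which the slide does not specify.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the mean: sqrt(sample variance / sample size)."""
    n = len(sample)
    # Assumption: unbiased sample variance (n - 1 in the denominator).
    variance = statistics.variance(sample)
    return math.sqrt(variance / n)


# Hypothetical sample, used only for illustration.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = statistics.mean(sample)
se = standard_error(sample)

print(f"mean = {mean:.3f}, SE = {se:.3f}")
# Rule of thumb from the previous slide: mean +/- 2 SE covers about 95%.
print(f"approx. 95% interval: [{mean - 2 * se:.3f}, {mean + 2 * se:.3f}]")
```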

10
(No Transcript)
11
A Classical Model for Hypothesis Testing
12
(No Transcript)
13
7.3 Computing Test Set Confidence Intervals
14
Computing 95% Test Set Confidence Intervals for
Whole Population
  • Given a test set sample S of size n and error
    rate E
  • Compute the sample variance as V = E(1 - E)
  • Compute the standard error (SE) as the square
    root of V divided by n: SE = sqrt(V / n)
  • Calculate an upper bound error as E + 2(SE)
  • Calculate a lower bound error as E - 2(SE)
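
The steps above can be sketched in Python as follows. The function name `error_rate_interval` is hypothetical, and clamping the bounds to [0, 1] is an added assumption that the slide does not mention.

```python
import math


def error_rate_interval(error_rate, n, width=2.0):
    """Approximate 95% interval for the error rate: E +/- 2 * SE."""
    variance = error_rate * (1.0 - error_rate)  # V = E(1 - E)
    se = math.sqrt(variance / n)                # SE = sqrt(V / n)
    # Assumption: clamp to [0, 1], since an error rate cannot leave that range.
    lower = max(0.0, error_rate - width * se)
    upper = min(1.0, error_rate + width * se)
    return lower, upper


# A 10% error rate measured on 100 test instances.
lower, upper = error_rate_interval(0.1, 100)
print(f"error rate between {lower:.2f} and {upper:.2f}")
print(f"accuracy between {1 - upper:.0%} and {1 - lower:.0%}")
```

Calling `error_rate_interval(0.1, 1000)` instead shows how the interval narrows as the test set grows.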

15
  • Error rate of 10%: E = 0.1, n = 100
  • Variance = 0.1(1 - 0.1) = 0.09
  • SE = (0.09/100)^(1/2) = 0.03
  • We can be 95% confident that the actual test set
    error rate (TSER) lies somewhere between 2SE above
    and 2SE below 0.1. The actual TSER is between 0.04
    and 0.16.
  • Test set accuracy is between 84% and 96%

16
  • If the number of instances is increased, the width of
    the confidence interval (for test set accuracy) decreases.
  • If n = 1000,
  • SE = (0.09/1000)^(1/2) ≈ 0.0095
  • Test set accuracy is between 88% and 92%

17
Cross-Validation
  • If the test data size is small, apply
    cross-validation
  • Cross-validation
  • Available data is partitioned into n equal-size
    units
  • For the ith unit, where i = 1, ..., n
  • The other n-1 units are used for training and the
    ith unit is used for testing, giving accuracy a_i
  • Model accuracy = average(a_i)
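
A minimal sketch of the cross-validation procedure in Python. The majority-class learner and the ten-instance dataset are hypothetical stand-ins, not part of the presentation; any model builder could take their place. The sketch does not shuffle the data and drops leftover instances when the data does not divide evenly.

```python
from collections import Counter


def majority_class_model(train_labels):
    """Hypothetical stand-in learner: always predicts the most common label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _instance: most_common


def cross_validate(instances, labels, n_units):
    """Return (average accuracy, per-unit accuracies) over n_units folds."""
    size = len(instances) // n_units  # equal-size units; leftovers are dropped
    if size == 0:
        raise ValueError("n_units exceeds the number of instances")
    used = size * n_units
    accuracies = []
    for i in range(n_units):
        start, stop = i * size, (i + 1) * size
        test_x, test_y = instances[start:stop], labels[start:stop]
        # Train on the other n_units - 1 units.
        train_y = labels[:start] + labels[stop:used]
        model = majority_class_model(train_y)
        correct = sum(model(x) == y for x, y in zip(test_x, test_y))
        accuracies.append(correct / size)  # a_i for the ith unit
    return sum(accuracies) / n_units, accuracies


# Hypothetical data, used only for illustration.
instances = list(range(10))
labels = ["a", "a", "b", "a", "a", "b", "a", "a", "a", "b"]
average, per_unit = cross_validate(instances, labels, n_units=5)
print(per_unit, average)
```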

18
Bootstrapping
  • Let the training set selection process choose the
    same training instance several times
  • Select n items from among the n instances with
    replacement, so duplicates are allowed
  • After n selections, the training set contains
    approx. 2/3 of the n distinct instances
  • The remaining approx. 1/3, never selected, is used
    for testing
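
A short Python sketch of the bootstrap split. The function name `bootstrap_split` and the seed are hypothetical. For large n the expected share of distinct training instances is 1 - 1/e, about 0.632, which is the "approx 2/3" figure on the slide.

```python
import random


def bootstrap_split(n, seed=None):
    """Draw n training indices with replacement; unselected indices form the test set."""
    rng = random.Random(seed)
    train_idx = [rng.randrange(n) for _ in range(n)]  # duplicates allowed
    chosen = set(train_idx)
    test_idx = [i for i in range(n) if i not in chosen]
    return train_idx, test_idx


n = 10_000
train_idx, test_idx = bootstrap_split(n, seed=0)
distinct_fraction = len(set(train_idx)) / n

print(f"distinct training instances: {distinct_fraction:.3f}")
print(f"held out for testing:        {len(test_idx) / n:.3f}")
```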

19
7.4 Comparing Supervised Learner Models
20
  • Null Hypothesis: There is no significant
    difference in the test set error rates of two
    supervised learner models built with the same
    training data.

21
Comparing Models with Independent Test Data
  • P = |E1 - E2| / sqrt(q(1 - q)(1/n1 + 1/n2))
  • where
  • E1 = the error rate for model M1
  • E2 = the error rate for model M2
  • q = (E1 + E2)/2
  • n1 = the number of instances in test set A
  • n2 = the number of instances in test set B
  • q(1 - q) = the variance

22
Comparing Models with a Single Test Dataset
  • P = |E1 - E2| / sqrt(q(1 - q)(2/n))
  • where
  • E1 = the error rate for model M1
  • E2 = the error rate for model M2
  • q = (E1 + E2)/2
  • n = the number of test set instances

23
Comparing Models with a Single Test Dataset
Example
  • Test M1 with A and M2 with B, 100 instances each. M1
    has 80% accuracy, M2 has 70%.
  • We wish to know if M1 has performed significantly
    better than M2.
  • E1 = 0.2, E2 = 0.3, q = 0.25, combined variance
    q(1 - q) = 0.1875,
  • P = 1.633, which is less than 2, so there is no
    significant difference between M1 and M2 at the
    95% confidence level
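
A minimal Python sketch of the two comparison statistics. It assumes the forms P = |E1 - E2| / sqrt(q(1 - q)(1/n1 + 1/n2)) for independent test sets and P = |E1 - E2| / sqrt(q(1 - q)(2/n)) for a single test set; the function names are hypothetical. With the figures from the example, the assumed form reproduces the reported P = 1.633.

```python
import math


def compare_independent(e1, e2, n1, n2):
    """Test statistic for two models evaluated on independent test sets."""
    q = (e1 + e2) / 2.0            # average error rate
    variance = q * (1.0 - q)       # combined variance q(1 - q)
    return abs(e1 - e2) / math.sqrt(variance * (1.0 / n1 + 1.0 / n2))


def compare_single(e1, e2, n):
    """Test statistic for two models evaluated on the same n test instances."""
    q = (e1 + e2) / 2.0
    return abs(e1 - e2) / math.sqrt(q * (1.0 - q) * (2.0 / n))


# Figures from the example: 80% and 70% accuracy, 100 instances each.
p = compare_independent(0.2, 0.3, 100, 100)
print(f"P = {p:.3f}")
# Assumption: P >= 2 is read as significant at roughly the 95% level.
print("significant" if p >= 2.0 else "no significant difference")
```

When n1 = n2 = n, the two functions return the same value, which is why either reading of the example gives 1.633.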

24
7.5 Attribute Evaluation
25
Locating Redundant Attributes with Excel
  • Correlation Coefficient
  • Positive Correlation
  • Negative Correlation
  • Curvilinear Relationship
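
The presentation computes correlations in Excel; the Python sketch below is a stand-in that illustrates the same three cases with hypothetical data. It computes the Pearson correlation coefficient r and shows that a purely curvilinear relationship (here y = x^2 on symmetric x values) can have r = 0.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical attribute values, used only for illustration.
xs = [-3, -2, -1, 0, 1, 2, 3]
r_positive = pearson_r(xs, [2 * x + 1 for x in xs])  # perfect positive
r_negative = pearson_r(xs, [-x for x in xs])         # perfect negative
r_curved = pearson_r(xs, [x * x for x in xs])        # curvilinear

print(r_positive, r_negative, r_curved)
```

An attribute pair with |r| close to 1 is a candidate for redundancy; r near 0 rules out only a linear relationship, so a scatterplot is still worth checking.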

26
(No Transcript)
27
(No Transcript)
28
(No Transcript)
29
Creating a Scatterplot Diagram with MS Excel
30
(No Transcript)
31
Hypothesis Testing for Numerical Attribute
Significance
32
(No Transcript)
33
7.6 Unsupervised Evaluation Techniques
  • Unsupervised Clustering for Supervised
    Evaluation
  • Supervised Evaluation for Unsupervised
    Clustering
  • Additional Methods

34
7.7 Evaluating Supervised Models with Numeric
Output
35
Mean Squared Error
  • MSE = ((a_1 - c_1)^2 + (a_2 - c_2)^2 + ... + (a_n - c_n)^2) / n
  • where for the ith instance,
  • a_i = actual output value
  • c_i = computed output value

36
Mean Absolute Error
  • MAE = (|a_1 - c_1| + |a_2 - c_2| + ... + |a_n - c_n|) / n
  • where for the ith instance,
  • a_i = actual output value
  • c_i = computed output value
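
A short Python sketch of both error measures, assuming the standard definitions (the average of squared differences and the average of absolute differences between actual and computed outputs). The output values are hypothetical.

```python
def mean_squared_error(actual, computed):
    """Average of (a_i - c_i)^2 over all instances."""
    return sum((a - c) ** 2 for a, c in zip(actual, computed)) / len(actual)


def mean_absolute_error(actual, computed):
    """Average of |a_i - c_i| over all instances."""
    return sum(abs(a - c) for a, c in zip(actual, computed)) / len(actual)


# Hypothetical actual and computed outputs, used only for illustration.
actual = [3.0, 5.0, 2.0, 7.0]
computed = [2.0, 5.0, 4.0, 8.0]

mse = mean_squared_error(actual, computed)
mae = mean_absolute_error(actual, computed)
print(f"MSE = {mse}, MAE = {mae}")
```

Because the differences are squared, MSE weights the single error of 2 more heavily than MAE does.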

37
(No Transcript)