Title: Formal Evaluation Techniques
7.1 What Should Be Evaluated?
- Supervised Model
- Training Data
- Attributes
- Model Builder
- Parameters
- Test Set Evaluation
7.2 Tools for Evaluation
Single-Valued Summary Statistics
- Mean
- Variance
- Standard deviation
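These three summary statistics can be computed in a few lines. A minimal Python sketch (the sample data below is hypothetical):

```python
import math

def summary_stats(values):
    """Return the mean, sample variance, and standard deviation of a sample."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: average squared deviation from the mean,
    # dividing by n - 1 for the unbiased estimate.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance, math.sqrt(variance)

# Hypothetical sample data.
mean, var, sd = summary_stats([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```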
The Normal Distribution
Normal Distributions and Sample Means
- A distribution of means taken from random sets of independent samples of equal size is distributed normally.
- Any sample mean will vary less than two standard errors from the population mean 95% of the time.
Computing the Standard Error
- The variance of the sampling distribution of the mean is estimated by dividing the sample variance by the sample size.
- The standard error is computed by taking the square root of this estimated variance.
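The two steps above can be sketched directly in Python (the sample variance and size below are hypothetical values):

```python
import math

def standard_error(sample_variance, n):
    """Divide the sample variance by the sample size, then take the square root."""
    estimated_variance = sample_variance / n  # variance of the sample mean
    return math.sqrt(estimated_variance)

# Hypothetical inputs: sample variance 0.09 measured on 100 instances.
se = standard_error(sample_variance=0.09, n=100)  # approximately 0.03
```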
A Classical Model for Hypothesis Testing
7.3 Computing Test Set Confidence Intervals
Computing 95% Test Set Confidence Intervals for the Whole Population
- Given a test set sample S of size n and error rate E
- Compute the sample variance as V = E(1 - E)
- Compute the standard error (SE) as the square root of V divided by n
- Calculate the upper bound error as E + 2(SE)
- Calculate the lower bound error as E - 2(SE)
- E = 10%, i.e. E = 0.1, with n = 100
- Variance = 0.1(1 - 0.1) = 0.09
- SE = (0.09/100)^(1/2) = 0.03
- We can be 95% confident that the actual test set error rate lies somewhere between 2 SE above and 2 SE below 0.1. The actual test set error rate is between 0.04 and 0.16.
- Test set accuracy is between 84% and 96%
- If the number of instances is increased, the width of the confidence interval (on test set accuracy) decreases.
- If n = 1000,
- SE = (0.09/1000)^(1/2) is approximately 0.0095
- Test set accuracy is between roughly 88% and 92%
Cross Validation
- If the test data size is small, apply cross-validation.
- Cross-validation
- Available data is partitioned into n equal-size units
- For the ith unit, where i = 1, ..., n, the other n - 1 units are used for training and the ith unit is used for testing, giving accuracy ai
- Model accuracy = average(ai)
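The partition-and-rotate procedure can be sketched as follows. The `evaluate` function is a hypothetical stand-in; a real run would build and test an actual model in each fold:

```python
def cross_validate(data, n_folds, evaluate):
    """n-fold cross-validation: each unit serves exactly once as the test set.

    `evaluate(train, test)` must return an accuracy a_i for one fold;
    here it is supplied by the caller as a stand-in for model building.
    """
    fold_size = len(data) // n_folds
    folds = [data[i * fold_size:(i + 1) * fold_size] for i in range(n_folds)]
    accuracies = []
    for i in range(n_folds):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        accuracies.append(evaluate(train, test))
    # Model accuracy = average of the per-fold accuracies.
    return sum(accuracies) / n_folds

# Hypothetical evaluator that always reports 80% accuracy.
acc = cross_validate(list(range(20)), 5, lambda train, test: 0.8)
```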
Bootstrapping
- Let the training set selection process choose the same training instance several times
- Select n items from among n items, with duplicates allowed
- After n selections, the training set contains approximately 2/3 of the n instances
- The remaining 1/3 is used for testing
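A minimal sketch of the bootstrap split using Python's random module (the roughly 2/3 coverage comes from 1 - 1/e, about 0.632):

```python
import random

def bootstrap_split(data, rng=random):
    """Select n items from n items with replacement (duplicates allowed).

    Items never chosen form the test set; on average the training set
    covers about 1 - 1/e (roughly 2/3) of the instances.
    """
    n = len(data)
    train = [rng.choice(data) for _ in range(n)]
    chosen = set(train)
    test = [x for x in data if x not in chosen]
    return train, test

random.seed(0)  # fixed seed so the split is reproducible
train, test = bootstrap_split(list(range(1000)))
```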
7.4 Comparing Supervised Learner Models
- Null Hypothesis: There is no significant difference in the test set error rates of two supervised learner models built with the same training data.
Comparing Models with Independent Test Data
- P = |E1 - E2| / sqrt(q(1 - q)(1/n1 + 1/n2))
- where
- E1 = the error rate for model M1
- E2 = the error rate for model M2
- q = (E1 + E2)/2
- n1 = the number of instances in test set A
- n2 = the number of instances in test set B
- q(1 - q) = the combined variance
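Assuming the statistic takes the classical two-proportion form P = |E1 - E2| / sqrt(q(1 - q)(1/n1 + 1/n2)), a Python sketch:

```python
import math

def compare_independent(e1, e2, n1, n2):
    """Test statistic for two error rates measured on independent test sets."""
    q = (e1 + e2) / 2        # combined error rate
    variance = q * (1 - q)   # combined variance
    return abs(e1 - e2) / math.sqrt(variance * (1 / n1 + 1 / n2))

p = compare_independent(0.2, 0.3, 100, 100)
significant = p >= 1.96  # compare against the 95% critical value
```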
Comparing Models with a Single Test Dataset
- P = |E1 - E2| / sqrt(q(1 - q)(2/n))
- where
- E1 = the error rate for model M1
- E2 = the error rate for model M2
- q = (E1 + E2)/2
- n = the number of test set instances
Comparing Models with a Single Test Dataset: Example
- Test M1 with set A and M2 with set B, 100 instances each. M1 has 80% accuracy; M2 has 70% accuracy.
- We wish to know if M1 has performed significantly better than M2.
- E1 = 0.2, E2 = 0.3, q = 0.25, combined variance q(1 - q) = 0.1875
- P = 1.633 < 1.96, so there is no significant difference between M1 and M2
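The arithmetic above can be checked with a short sketch, assuming the single-test-set form P = |E1 - E2| / sqrt(q(1 - q)(2/n)):

```python
import math

def compare_single_test_set(acc1, acc2, n):
    """Test statistic for two models evaluated on one test set of size n."""
    e1, e2 = 1 - acc1, 1 - acc2  # error rates from accuracies
    q = (e1 + e2) / 2
    variance = q * (1 - q)       # combined variance (0.1875 in this example)
    return abs(e1 - e2) / math.sqrt(variance * (2 / n))

p = compare_single_test_set(0.80, 0.70, 100)
# p falls below the 95% critical value of 1.96: no significant difference.
```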
7.5 Attribute Evaluation
Locating Redundant Attributes with Excel
- Correlation Coefficient
- Positive Correlation
- Negative Correlation
- Curvilinear Relationship
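Outside Excel, the same correlation coefficient (Pearson's r) can be computed directly. A sketch with hypothetical attribute values:

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient between two numeric attributes."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical attributes: a perfect positive and a perfect negative relationship.
r_pos = correlation([1, 2, 3, 4], [2, 4, 6, 8])  # close to +1
r_neg = correlation([1, 2, 3, 4], [8, 6, 4, 2])  # close to -1
```

Values near +1 or -1 flag a redundant attribute pair; note that r only captures linear association, so a curvilinear relationship can yield a small r despite a strong dependency.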
Creating a Scatterplot Diagram with MS Excel
Hypothesis Testing for Numerical Attribute Significance
7.6 Unsupervised Evaluation Techniques
- Unsupervised Clustering for Supervised Evaluation
- Supervised Evaluation for Unsupervised Clustering
- Additional Methods
7.7 Evaluating Supervised Models with Numeric Output
Mean Squared Error
- MSE = (1/n) * sum over i of (ai - ci)^2
- where for the ith instance,
- ai = actual output value
- ci = computed output value
Mean Absolute Error
- MAE = (1/n) * sum over i of |ai - ci|
- where for the ith instance,
- ai = actual output value
- ci = computed output value
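Both measures can be sketched in a few lines (the actual/computed output values below are hypothetical):

```python
def mean_squared_error(actual, computed):
    """Average of squared differences between actual and computed outputs."""
    return sum((a - c) ** 2 for a, c in zip(actual, computed)) / len(actual)

def mean_absolute_error(actual, computed):
    """Average of absolute differences between actual and computed outputs."""
    return sum(abs(a - c) for a, c in zip(actual, computed)) / len(actual)

# Hypothetical outputs for four test instances.
actual   = [3.0, 5.0, 2.5, 7.0]
computed = [2.5, 5.0, 4.0, 8.0]
mse = mean_squared_error(actual, computed)   # (0.25 + 0 + 2.25 + 1) / 4 = 0.875
mae = mean_absolute_error(actual, computed)  # (0.5 + 0 + 1.5 + 1) / 4 = 0.75
```

MSE penalizes large individual errors more heavily than MAE because the differences are squared before averaging.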