Title: Performance Evaluation: Estimation of Recognition rates
1 Performance Evaluation: Estimation of Recognition Rates
Machine Learning Performance Evaluation
- J.-S. Roger Jang
- CSIE Dept., National Taiwan Univ.
- http://mirlab.org/jang
- jang_at_mirlab.org
2 Outline
- Performance indices of a given classifier/model
- Accuracy (recognition rate)
- Computation load
- Methods to estimate the recognition rate
- Inside test
- One-sided holdout test
- Two-sided holdout test
- M-fold cross validation
- Leave-one-out cross validation
3 Synonyms
- The following sets of synonyms will be used interchangeably:
- Classifier, model
- Recognition rate, accuracy
4 Performance Indices
- Performance indices of a classifier
- Recognition rate
- Requires an objective procedure to derive it
- Computation load
- Design-time computation
- Run-time computation
- Our focus
- Recognition rate and the procedures to derive it
- The estimated accuracy depends on
- Dataset
- Model (types and complexity)
5 Methods for Deriving Recognition Rates
- Methods to derive the recognition rates
- Inside test (resubstitution recog. rate)
- One-sided holdout test
- Two-sided holdout test
- M-fold cross validation
- Leave-one-out cross validation
- Data partitioning
- Training set
- Training and test sets
- Training, validating, and test sets
6 Inside Test
- Dataset partitioning
- Use the whole dataset for both training and evaluation
- Recognition rate
- Inside-test recognition rate
- Resubstitution accuracy
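A minimal sketch (toy data and function names are illustrative, not from the slides) of why the inside test is optimistic: with a 1-nearest-neighbor classifier, every training point is its own nearest neighbor, so the resubstitution recognition rate is always 100%.

```python
# Toy 1-NNC on 1-D features; labels are arbitrary class tags.

def one_nn_predict(train, x):
    """Return the label of the training point nearest to x (1-NNC)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def inside_test_rr(train):
    """Resubstitution RR: train and test on the same dataset, in percent."""
    correct = sum(one_nn_predict(train, x) == y for x, y in train)
    return 100.0 * correct / len(train)

data = [(0.1, 'A'), (0.9, 'B'), (0.4, 'A'), (1.3, 'B')]
print(inside_test_rr(data))  # 100.0
```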
7 Inside Test (2)
- Characteristics
- Too optimistic, since the estimated RR tends to be higher than the true RR
- For instance, 1-NNC always has an inside-test RR of 100%!
- Can be used as an upper bound of the true RR
- Potential reasons for low inside-test RR
- Bad features of the dataset
- Bad method for model construction, such as
- Bad results from neural network training
- Bad results from k-means clustering
8 One-sided Holdout Test
- Dataset partitioning
- Training set for model construction
- Test set for performance evaluation
- Recognition rate
- Inside-test RR
- Outside-test RR
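A sketch of the data partitioning for a one-sided holdout test (function and variable names are illustrative): the dataset is shuffled and split once; the model is built on the training set and the outside-test RR is then measured on the held-out test set.

```python
import random

def holdout_split(data, test_ratio=0.3, seed=0):
    """Shuffle and split the dataset into (training set, test set)."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

data = [(i, 'A' if i < 5 else 'B') for i in range(10)]
train, test = holdout_split(data)
print(len(train), len(test))  # 7 3
```

Because a single random split decides which samples the model never sees, the resulting RR estimate varies from split to split, which is the sensitivity the next slide points out.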
9 One-sided Holdout Test (2)
- Characteristics
- Highly affected by how the data is partitioned
- Usually adopted when the design-time computation load is high
10 Two-sided Holdout Test
- Dataset partitioning
- Training set for model construction
- Test set for performance evaluation
- Role reversal
11 Two-sided Holdout Test (2)
- Two-sided holdout test (used in GMDH)
- Outside-test RR = (RR_A + RR_B) / 2
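A sketch of the two-sided holdout average (the majority-class "model" and all names are toy stand-ins, not from the slides): the dataset is split into halves A and B, the model is trained on A and tested on B (giving RR_A), then the roles are reversed (giving RR_B), and the two outside-test RRs are averaged.

```python
from collections import Counter

def train_majority(train):
    """Toy model: remember the majority label of the training set."""
    return Counter(label for _, label in train).most_common(1)[0][0]

def rr(model_label, test):
    """Outside-test recognition rate of the toy model, in percent."""
    correct = sum(model_label == y for _, y in test)
    return 100.0 * correct / len(test)

def two_sided_holdout(part_a, part_b):
    """Train on A / test on B, swap roles, and average the two RRs."""
    rr_a = rr(train_majority(part_a), part_b)
    rr_b = rr(train_majority(part_b), part_a)
    return (rr_a + rr_b) / 2

a = [(0, 'A'), (1, 'A'), (2, 'B')]
b = [(3, 'A'), (4, 'B'), (5, 'B')]
print(two_sided_holdout(a, b))
```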
12 Two-sided Holdout Test (3)
- Characteristics
- Better usage of the dataset
- Still highly affected by the partitioning
- Suitable for models/classifiers with high
design-time computation load
13 M-fold Cross Validation
- Data partitioning
- Partition the dataset into m disjoint folds
- Use one fold for test and the other m-1 folds for training
- Repeat m times, with each fold serving as the test set once
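The partitioning above can be sketched as follows (pure Python, names illustrative): the n sample indices are split into m disjoint folds, and each round yields the training/test index pair for one held-out fold.

```python
def m_fold_indices(n, m):
    """Yield (train_idx, test_idx) pairs for m disjoint folds of n samples."""
    # Spread the remainder over the first n % m folds.
    fold_sizes = [n // m + (1 if k < n % m else 0) for k in range(m)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

folds = list(m_fold_indices(10, 3))
print([len(te) for _, te in folds])  # [4, 3, 3]
```

The outside-test RR is then the average of the m per-fold recognition rates.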
14 M-fold Cross Validation (2)
[Figure: the dataset is partitioned into m disjoint sets; each model k is constructed on the other m-1 folds and evaluated (outside test) on its held-out fold.]
15 M-fold Cross Validation (3)
- Characteristics
- When m = 2 → two-sided holdout test
- When m = n → leave-one-out cross validation
- The value of m depends on the computation load
imposed by the selected model/classifier.
16 Leave-one-out Cross Validation
- Data partitioning
- When m = n, each fold S_i = {(x_i, y_i)} holds a single input/output pair
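A LOOCV sketch using a toy 1-NN classifier (data and names illustrative, not from the slides): with m = n, each round leaves out a single pair (x_i, y_i), so each per-fold outside-test RR is either 0% or 100%, and the final estimate is the average over all n rounds.

```python
def one_nn_label(train, x):
    """Return the label of the training point nearest to x (1-NNC)."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def loocv_rr(data):
    """Leave-one-out cross validation RR of 1-NNC, in percent."""
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]   # leave pair i out
        correct += one_nn_label(train, x) == y
    return 100.0 * correct / len(data)

data = [(0.0, 'A'), (0.2, 'A'), (1.0, 'B'), (1.2, 'B')]
print(loocv_rr(data))  # 100.0
```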
17 Leave-one-out Cross Validation (2)
[Figure: the dataset of n i/o pairs is partitioned into n singleton folds; each model k is constructed on the other n-1 pairs and evaluated (outside test) on the held-out pair, so each per-fold RR is either 0 or 100%.]
18 Leave-one-out Cross Validation (3)
- General method for LOOCV
- Perform model construction (as a black box) n times → slow!
- To speed up LOOCV
- Construct a common part that will be reused repeatedly, such as the global mean and covariance for the quadratic classifier (QC)
- More info on cross validation is available on Wikipedia
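A minimal illustration of the speed-up idea, for a model that only needs a class mean (the setting is assumed for illustration; the slides' QC example also reuses a covariance in the same spirit): precompute the common part, the global sum, once; each leave-one-out mean is then (total - x_i) / (n - 1) instead of a full recomputation per round.

```python
def loo_means(xs):
    """Leave-one-out means of xs, reusing one precomputed global sum."""
    total = sum(xs)                      # common part, computed once
    n = len(xs)
    return [(total - x) / (n - 1) for x in xs]

xs = [1.0, 2.0, 3.0, 4.0]
print(loo_means(xs))  # first entry: (10 - 1) / 3 = 3.0
```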
19 Applications and Misuse of CV
- Applications of CV
- Input (feature) selection
- Model complexity determination
- Performance comparison among different models
- Misuse of CV
- Do not try to boost the validation RR too much, or you run the risk of indirectly training on the left-out data!