Transcript and Presenter's Notes

Title: Pattern Recognition Lecture 9


1
Pattern Recognition Lecture 9
  • Methods for reducing the dimensionality of the
    feature-space

2
Topics
  • Last time
  • Evaluating features
  • Dependencies between features
  • Covariance and correlation
  • Today
  • Methods for exploiting the dependencies between
    features
  • Reduce the number of features => reduce the
    dimensionality

3
Reduce the number of features
  • Why?
  • The curse of dimensionality
  • Visualization
  • Remove noise (10 dependent features ≈ 1
    independent)
  • Faster processing
  • How?
  • If features are correlated => redundancy
  • Remove redundancy
  • Before the break
  • Methods where we DON'T consider the classes
  • Unsupervised
  • After the break
  • Methods where we DO consider the classes
  • Supervised

4
Methods where we DON'T consider the classes
  • Unsupervised
  • Ignore that samples come from different classes
  • Reduce the dimensionality (compression)
  • Methods
  • Hierarchical dimensionality reduction
  • Principal component analysis (PCA)

5
Hierarchical dimensionality reduction
  • Correlation matrix
  • Algorithm (see the MATLAB sketch at the end of
    this slide)
  1) Calc. the correlation matrix
  2) Find max Cij
  3) Merge feature Fi and Fj
  4) Save merged feature as Fi
  5) Delete Fj
  6) Stop or go to 1)
  • Stop criterion
  • Max Cij is too small
  • Number of dim. Ok
  • Others
  • Merge features
  • Keep Fi and delete Fj
  • (Fi + Fj) / 2
  • (w1·Fi + w2·Fj) / 2
  • Others
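A minimal MATLAB sketch of the loop above, assuming X is an N-by-D data matrix; it uses the simple-average merge rule and stops when the largest off-diagonal Cij falls below an illustrative threshold MINCOR.

```matlab
% Minimal sketch of hierarchical dimensionality reduction, assuming
% X is an N-by-D data matrix. Merges the most correlated pair of
% features (simple average) until max |Cij| drops below MINCOR.
function X = hier_reduce(X, MINCOR)
    while size(X, 2) > 1
        C = corrcoef(X);                    % 1) calc. the correlation matrix
        C(1:size(C, 1) + 1:end) = 0;        %    ignore the diagonal
        [maxC, idx] = max(abs(C(:)));       % 2) find max Cij
        if maxC < MINCOR, break; end        %    stop criterion: max Cij too small
        [i, j] = ind2sub(size(C), idx);
        X(:, i) = (X(:, i) + X(:, j)) / 2;  % 3)-4) merge Fi and Fj, save as Fi
        X(:, j) = [];                       % 5) delete Fj
    end                                     % 6) stop or go to 1)
end
```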

6
Principal Component Analysis (PCA)
7
Principal Component Analysis (PCA)
  • Combines features into new features, and then
    ignores some of the new features
  • PCA is used a lot, especially when you have many
    dimensions
  • Basic idea: features with a large variance
    separate the classes better
  • If both features have large variances, then what?
  • Transform the feature-space, so we get large
    variances and no correlation!
  • Variance = Information!

8
PCA Transform
  • Ignore y2 without losing info when classifying
  • y1 and y2 are the principal components

9
PCA: How to
  • Collect data (x)
  • Calc. the covariance matrix Cx
  • Matlab: Cx = cov(x)
  • Solve the Eigen-value problem => A and Cy
  • Matlab: [Evec, Eval] = eig(Cx)
  • Transform x => y: y = A(x - m)
    (a MATLAB sketch of these steps follows this list)
  • Analyze (PCA)
  • M-method
  • Variability measure from SEPCOR
  • J-measure
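A minimal MATLAB sketch of the steps above, assuming x is an N-by-D data matrix with one sample per row; the toy data and the variable names beyond those on the slide (m, order, var_per_component) are illustrative.

```matlab
% Sketch of the PCA steps: covariance, eigen-decomposition, transform.
x  = randn(100, 3) * [1 0.9 0; 0 1 0; 0 0 0.1];  % toy correlated data (illustrative)
m  = mean(x, 1);                 % mean of each feature
Cx = cov(x);                     % covariance matrix of x
[Evec, Eval] = eig(Cx);          % solve the eigenvalue problem
[~, order] = sort(diag(Eval), 'descend');
A  = Evec(:, order)';            % rows = eigenvectors => transform matrix
y  = (x - m) * A';               % transform: y = A * (x - m), per sample
Cy = cov(y);                     % diagonal => uncorrelated new features
var_per_component = diag(Cy)     % analyze: keep the high-variance components
```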

10
What to remember
  • Feature reduction where we don't use the fact
    that data comes from different classes
  • Unsupervised
  • Hierarchical dimensionality reduction
  • Correlation matrix
  • Merge features or delete features
  • Principal Component Analysis (PCA)
  • Combine features into new features
  • Ignore some of the new features
  • VARIANCE = INFORMATION
  • Transform the feature-space from the
    Eigen-vectors of the covariance matrix =>
    uncorrelated features!
  • Analyze
  • M-method (no class info.)
  • Use class info.
  • Variability measure from SEPCOR
  • J-measure

11
Break
12
Reduce the number of features
  • Why?
  • The curse of dimensionality
  • Too many parameters to tune/train
  • Visualization
  • Remove noise (10 dependent features ≈ 1
    independent)
  • Faster processing
  • How?
  • If features are correlated => redundancy
  • Remove redundancy
  • Before the break
  • Methods where we DON'T consider the classes
  • Unsupervised
  • After the break
  • Methods where we DO consider the classes
  • Supervised

13
Methods where we DO use the class information
  • Supervised
  • Use info. of classes and reduce the
    dimensionality
  • Methods
  • SEPCOR
  • Linear Discriminant Methods

14
SEPCOR
  • Inspired from Hierarchical dimensionality
    reduction
  • Method to choose the X best (most discriminative)
    features
  • Idea: combine Hierarchical dimensionality
    reduction with class info.
  • SEPCOR = separability + correlation
  • Principle
  • Calc. a measure for how good (discriminative)
    each feature, xi, is wrt. classification
  • Variability measure V(xi)
  • Keep the most discriminative features, which
    have a low correlation with the other features

15
SEPCOR: Variability measure
  • V(xi) = (variance of the class means of xi) /
    (mean of the class variances of xi)
  • V(xi) large => good feature wrt. classification
  • That is, large numerator and small denominator

[Figure: class distributions along features x1 and x2; V ≈ 1 for one
feature and V >> 1 for the other; V(x1) < V(x2) => x2 is the better feature]
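A minimal MATLAB sketch of this measure, assuming X is an N-by-D data matrix and labels is an N-by-1 vector of class labels (illustrative names).

```matlab
% Variability measure V(xi) for each feature:
% V(xi) = variance of the class means / mean of the class variances.
function V = variability_measure(X, labels)
    classes = unique(labels);
    [~, D] = size(X);
    C = numel(classes);
    classMeans = zeros(C, D);   % per-class mean of each feature
    classVars  = zeros(C, D);   % per-class variance of each feature
    for c = 1:C
        Xc = X(labels == classes(c), :);
        classMeans(c, :) = mean(Xc, 1);
        classVars(c, :)  = var(Xc, 0, 1);
    end
    V = var(classMeans, 0, 1) ./ mean(classVars, 1);
end
```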
16
SEPCOR: The algorithm
  • Make a list with the features ordered by V-value
    (a MATLAB sketch of the loop follows this list)
  • Repeat until we have the desired number of
    features or the list is empty
  • Remove and store the feature with largest V-value
  • Find the correlation between the removed feature
    and all the other features in the list
  • Ignore all features with correlation bigger than
    MAXCOR
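A minimal MATLAB sketch of the selection loop, assuming V holds the variability values, X is the N-by-D data matrix, and numKeep and MAXCOR are the desired number of features and the correlation threshold (illustrative names).

```matlab
% SEPCOR selection: keep discriminative features with low mutual correlation.
function selected = sepcor_select(X, V, numKeep, MAXCOR)
    R = abs(corrcoef(X));          % feature-feature correlation matrix
    remaining = 1:size(X, 2);      % candidate feature indices
    selected = [];
    while numel(selected) < numKeep && ~isempty(remaining)
        % Remove and store the feature with the largest V-value
        [~, idx] = max(V(remaining));
        best = remaining(idx);
        selected(end + 1) = best;  %#ok<AGROW>
        remaining(idx) = [];
        % Ignore all remaining features whose correlation with the
        % removed feature is bigger than MAXCOR
        remaining(R(best, remaining) > MAXCOR) = [];
    end
end
```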

17
Linear Discriminant Methods
18
Linear Discriminant Methods
  • Transform data to the new feature space
  • Linear transform (rotation): y = Ax
  • The transform is defined so that classification
    becomes as easy as possible =>
  • Info = discriminative power
  • Fisher Linear Discriminant method
  • Map data to one dimension
  • Multiple Discriminant analysis
  • Map data to an M-dimensional space

19
Fisher Linear Discriminant
  • Idea: Map data to a line, y
  • The orientation of the line is defined so that
    the classes are as separated as possible
  • Transform: y = wTx, where w is the direction of
    the line y
  • PCA: w is defined as the 1st eigen-vector of the
    covariance matrix (vis prob.)

20
Fisher Linear Discriminant
  • Example 4 classes in 2D

Transformation: y = wTx. Find w.
21
Fisher Linear Discriminant
  • Transform
  • y = wTx
  • Find w so that the following criterion function
    is maximum (see the sketch below)
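The criterion itself is not reproduced in this transcript; the sketch below assumes the standard two-class Fisher criterion, J(w) = (m1 - m2)^2 / (s1^2 + s2^2) on the projected data, whose maximizer is w ∝ Sw^(-1)(mu1 - mu2). X1 and X2 are assumed class-wise sample matrices (illustrative names).

```matlab
% Two-class Fisher Linear Discriminant, assuming X1 (N1-by-D) and
% X2 (N2-by-D) hold the samples of the two classes.
function [w, y1, y2] = fisher_direction(X1, X2)
    mu1 = mean(X1, 1)';                   % class means (column vectors)
    mu2 = mean(X2, 1)';
    S1  = (X1 - mu1')' * (X1 - mu1');     % class scatter matrices
    S2  = (X2 - mu2')' * (X2 - mu2');
    Sw  = S1 + S2;                        % within-class scatter
    w   = Sw \ (mu1 - mu2);               % direction of the line y
    w   = w / norm(w);
    y1  = X1 * w;                         % projections y = w' * x
    y2  = X2 * w;
end
```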

22
Multiple Discriminant Analysis
  • Generalized Fisher Linear Discriminant method
  • N classes
  • Mapped into an M-dimensional space (M < N)
  • E.g. 3 points will span a plane
  • Example with 3 classes in 3D mapped into two
    different sub-spaces
    (a MATLAB sketch follows)
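A minimal MATLAB sketch of one standard formulation, assuming X (N-by-D data), labels (N-by-1 class labels) and a target dimension M; the new features come from the generalized eigenvalue problem Sb·w = λ·Sw·w (illustrative names).

```matlab
% Multiple Discriminant Analysis: project onto the leading generalized
% eigenvectors of the between-class vs. within-class scatter matrices.
function [W, Y] = mda_project(X, labels, M)
    classes = unique(labels);
    mu = mean(X, 1)';                      % overall mean
    D  = size(X, 2);
    Sw = zeros(D); Sb = zeros(D);
    for c = 1:numel(classes)
        Xc  = X(labels == classes(c), :);
        muc = mean(Xc, 1)';
        Sw  = Sw + (Xc - muc')' * (Xc - muc');              % within-class scatter
        Sb  = Sb + size(Xc, 1) * (muc - mu) * (muc - mu)';  % between-class scatter
    end
    [V, E] = eig(Sb, Sw);                  % generalized eigenvalue problem
    [~, order] = sort(diag(E), 'descend');
    W = V(:, order(1:M));                  % projection matrix (D-by-M)
    Y = X * W;                             % data in the M-dimensional space
end
```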

23
What to remember
  • Feature reduction where we use the class info.
  • Discriminative power = information
  • SEPCOR (ignore some of the features)
  • Hierarchical dimensionality reduction
    (correlation)
  • Variability measure
  • The variance of the means / the mean of the
    variances
  • Linear Discriminant Methods (make new features
    and ignore some)
  • Fisher Linear Discriminant (map onto a line)
  • Transform: y = wTx, where w is the direction of
    the line y
  • Variability measure
  • Multiple Discriminant Analysis
  • Generalized Fisher Linear Discriminant method
  • N classes
  • Map data into an M-dimensional space (M < N)