Title: Chap.13 Prototypes and Nearest-Neighbors
1. Chap.13 Prototypes and Nearest-Neighbors
2. Model-free methods for classification
- They do not help in understanding the nature of the relationship between the features and the class outcome
- In real problems they are very effective, and often the best performers
3. Prototype Methods
- Dataset: N pairs of training data (x1, g1), ..., (xN, gN), where gi ∈ {1, 2, ..., K}
- Construct a set of prototype points for each class in the feature space to represent the training data
- On a query, find the prototype closest to the query point and classify the query with the class of that prototype (a minimal sketch follows)
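A minimal sketch of the query step, assuming the prototypes and their class labels are already available as NumPy arrays (the function name is illustrative):

```python
import numpy as np

def classify_by_prototype(x_query, prototypes, prototype_labels):
    """prototypes: (M, p) array; prototype_labels: length-M array of class labels."""
    dists = np.linalg.norm(prototypes - x_query, axis=1)  # Euclidean distance to each prototype
    return prototype_labels[np.argmin(dists)]             # class of the closest prototype
```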
4. Prototype Method: K-means clustering
- Begin with randomly chosen centers (prototypes); K-means then alternates two steps until convergence (sketch below):
  - Assignment: for each center, identify its cluster of closest training points
  - Update: compute the mean of each feature over the points in each cluster, and use it as the new center for that cluster
- Drawback: prototypes can end up near the class boundaries, leading to potential misclassification errors
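A rough K-means sketch, assuming the data are in a NumPy array and R centers are wanted; for classification the algorithm is typically run separately on the training data of each class, and each resulting center becomes a prototype labeled with that class:

```python
import numpy as np

def kmeans(X, R, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), R, replace=False)]      # random initial prototypes
    for _ in range(n_iter):
        # assignment step: each point joins the cluster of its closest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: each center moves to the mean of its cluster
        new_centers = np.array([X[labels == r].mean(axis=0) if np.any(labels == r)
                                else centers[r] for r in range(R)])
        if np.allclose(new_centers, centers):               # converged
            break
        centers = new_centers
    return centers, labels
```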
5. Prototype Method: K-means clustering vs. LVQ
6. Prototype Method: Learning Vector Quantization
- Algorithm 13.1 (LVQ)
- Training points attract prototypes of the correct class and repel prototypes of the other classes
- This moves prototypes away from the decision boundaries (see the sketch below)
- Drawback: LVQ is defined by an algorithm rather than by optimization of a fixed criterion, which makes its properties difficult to understand
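Algorithm 13.1 is the online LVQ1 update; the sketch below follows its attract/repel rule, while the initialization (e.g., K-means prototypes per class) and the learning-rate schedule are assumptions:

```python
import numpy as np

def lvq1(X, y, prototypes, proto_labels, epochs=20, eps0=0.05, seed=0):
    rng = np.random.default_rng(seed)
    P = np.asarray(prototypes, dtype=float).copy()
    for epoch in range(epochs):
        eps = eps0 * (1.0 - epoch / epochs)                 # learning rate decayed toward zero
        for i in rng.permutation(len(X)):
            j = np.argmin(np.linalg.norm(P - X[i], axis=1))  # closest prototype to the sample
            if proto_labels[j] == y[i]:
                P[j] += eps * (X[i] - P[j])                  # same class: attract toward the point
            else:
                P[j] -= eps * (X[i] - P[j])                  # other class: repel away from the point
    return P
```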
7. Prototype Method: Gaussian Mixtures
- EM steps for Gaussian mixtures:
  - E-step: each observation is assigned a weight (responsibility) for each cluster, based on the likelihood of the corresponding Gaussian
  - M-step: each observation contributes to the weighted mean of every cluster
- A soft, smooth clustering method (rough EM sketch below)
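A rough EM sketch for a spherical Gaussian mixture; the fixed common variance and the helper name are simplifying assumptions to keep the example short (in practice one would use something like scikit-learn's GaussianMixture):

```python
import numpy as np
from scipy.stats import multivariate_normal

def gmm_em(X, R, n_iter=50, var=1.0, seed=0):
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), R, replace=False)]
    weights = np.full(R, 1.0 / R)
    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each observation
        dens = np.column_stack([
            weights[r] * multivariate_normal.pdf(X, means[r], var * np.eye(X.shape[1]))
            for r in range(R)])
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: mixing weights and weighted means from the responsibilities
        weights = resp.mean(axis=0)
        means = (resp.T @ X) / resp.sum(axis=0)[:, None]
    return means, weights, resp
```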
8. Prototype Method: K-means clustering vs. Gaussian Mixtures
9. k-Nearest-Neighbor Classifiers
- Find the k nearest training points to the query (using Euclidean distance) and classify by majority vote (sketch below)
- Cover and Hart (1967): asymptotically, the error rate of the 1-nearest-neighbor classifier is never more than twice the Bayes rate
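A minimal brute-force k-NN sketch (in practice a library classifier such as scikit-learn's KNeighborsClassifier would be used):

```python
import numpy as np
from collections import Counter

def knn_predict(x_query, X_train, y_train, k=5):
    dists = np.linalg.norm(X_train - x_query, axis=1)       # Euclidean distance to all training points
    nearest = np.argsort(dists)[:k]                          # indices of the k closest points
    return Counter(y_train[nearest]).most_common(1)[0][0]    # majority vote among their classes
```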
10. 1-Nearest-Neighbor Classifier error
- Let k* = argmax_k p_k(x) denote the dominant class at x
- Bayes error: 1 − p_{k*}(x)
- Asymptotic 1-nearest-neighbor error: Σ_k p_k(x)(1 − p_k(x)) ≥ 1 − p_{k*}(x)
- For K = 2, the asymptotic 1-nearest-neighbor error is 2 p_{k*}(x)(1 − p_{k*}(x)) ≤ 2 (1 − p_{k*}(x))
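The K = 2 expression follows directly from the general sum, since the two class probabilities at x are p_{k*}(x) and 1 − p_{k*}(x); a short worked step:

```latex
\[
\sum_{k=1}^{2} p_k(x)\bigl(1 - p_k(x)\bigr)
= p_{k^*}(x)\bigl(1 - p_{k^*}(x)\bigr) + \bigl(1 - p_{k^*}(x)\bigr)\,p_{k^*}(x)
= 2\,p_{k^*}(x)\bigl(1 - p_{k^*}(x)\bigr)
\;\le\; 2\bigl(1 - p_{k^*}(x)\bigr).
\]
```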
11. Example: A Comparative Study
12. Image Scene Classification
- For each pixel, extract the 8-neighbor feature map separately in the four spectral bands: a (1 + 8) × 4 = 36-dimensional feature space (Figure 13.7; feature-extraction sketch below)
- Carry out five-nearest-neighbor classification in this space
- Result: Figure 13.8
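An illustrative sketch (array names are assumptions) of building the 36-dimensional feature vector for one interior pixel: its own value plus the 8 neighboring values, taken separately in each of the four spectral bands:

```python
import numpy as np

def pixel_features(bands, i, j):
    """bands: array of shape (4, H, W); (i, j) an interior pixel. Returns 36 features."""
    feats = []
    for band in bands:                              # four spectral bands
        window = band[i - 1:i + 2, j - 1:j + 2]     # 3x3 neighborhood around (i, j)
        feats.append(window.ravel())                # center pixel + 8 neighbors = 9 values
    return np.concatenate(feats)                    # 4 bands x 9 values = 36 features
```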
13. Invariant Metrics and Tangent Distance
- Rotated digits are close in meaning but far apart in feature space
- Use the rotation curve (invariance manifold) in the feature space?
  - Difficult to calculate
  - An over-rotated "6" can become a "9"
- Instead, calculate and use the tangent distance (Figure 13.11); a sketch follows
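A one-sided tangent-distance sketch: instead of the plain Euclidean distance, measure the distance from a query image x2 to the tangent line of the rotation manifold at x1. The two-sided version used in the book is analogous, and in practice the tangent vector t1 would be estimated by slightly rotating the image and taking a finite difference; the names here are illustrative:

```python
import numpy as np

def tangent_distance(x1, t1, x2):
    """x1, x2: flattened images; t1: rotation tangent vector at x1."""
    # best coefficient along the tangent line {x1 + alpha * t1} bringing it closest to x2
    alpha = t1 @ (x2 - x1) / (t1 @ t1)
    return np.linalg.norm(x1 + alpha * t1 - x2)     # distance from x2 to the tangent line
```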
14. Adaptive Nearest-Neighbor Methods
- Curse of dimensionality
- In some problems, the class probabilities in a high-dimensional feature space vary only in a low-dimensional subspace
- Local dimension reduction: adapt the metric locally (2-dimensional example: Figure 13.13)
15. Discriminant Adaptive Nearest-Neighbor (DANN)
- D(x, x0) = (x − x0)^T Σ (x − x0)
- Σ = W^{-1/2} [ W^{-1/2} B W^{-1/2} + εI ] W^{-1/2} = W^{-1/2} [ B* + εI ] W^{-1/2}
- W is the pooled within-class covariance matrix Σ_k π_k W_k, and B is the between-class covariance matrix Σ_k π_k (x̄_k − x̄)(x̄_k − x̄)^T
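A sketch of computing Σ from the points in a neighborhood of the query point x0 (ε = 1 by default; the neighborhood selection and the handling of very small classes are simplifying assumptions):

```python
import numpy as np

def dann_metric(X_nbr, y_nbr, eps=1.0):
    """X_nbr, y_nbr: points and class labels in a neighborhood of the query x0."""
    classes, counts = np.unique(y_nbr, return_counts=True)
    priors = counts / counts.sum()
    xbar = X_nbr.mean(axis=0)
    p = X_nbr.shape[1]
    W = np.zeros((p, p))                            # pooled within-class covariance
    B = np.zeros((p, p))                            # between-class covariance
    for k, pi_k in zip(classes, priors):
        Xk = X_nbr[y_nbr == k]
        xk = Xk.mean(axis=0)
        if len(Xk) > 1:
            W += pi_k * np.cov(Xk, rowvar=False)
        B += pi_k * np.outer(xk - xbar, xk - xbar)
    # whiten with W^{-1/2}, shrink the between-class component by eps*I, map back
    evals, evecs = np.linalg.eigh(W)
    W_inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-12))) @ evecs.T
    Bstar = W_inv_sqrt @ B @ W_inv_sqrt
    return W_inv_sqrt @ (Bstar + eps * np.eye(p)) @ W_inv_sqrt

def dann_distance(x, x0, Sigma):
    d = x - x0
    return float(d @ Sigma @ d)                     # D(x, x0) = (x - x0)^T Sigma (x - x0)
```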
16. DANN
- Figure 13.14
- Figure 13.15