1
Advanced Artificial Intelligence
  • Lecture 8
  • Advanced machine learning

2
Outline
  • Clustering
  • K-Means
  • EM
  • Spectral Clustering
  • Dimensionality Reduction

3
The unsupervised learning problem
Many data points, no labels
4
K-Means
Many data points, no labels
5
K-Means
  • Choose a fixed number of clusters
  • Choose cluster centers and point-cluster
    allocations to minimize error
  • Can't do this by exhaustive search, because
    there are too many possible allocations
  • Algorithm: alternate two steps (see the sketch below)
  • Fix the cluster centers; allocate each point to the
    closest cluster
  • Fix the allocation; compute the best cluster centers
  • x could be any set of features for which we can
    compute a distance (be careful about scaling)
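
As a concrete illustration of the two alternating steps, here is a minimal K-Means sketch in Python; the function name and defaults are my own, not from the lecture.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centers by picking k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 1: fix centers, allocate each point to its closest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: fix allocation, recompute each center as the mean of its points.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # converged: the centers stopped moving
        centers = new_centers
    return centers, labels
```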

6
K-Means
7
K-Means
From Marc Pollefeys, COMP 256, 2003
8
K-Means
  • Is an approximation to EM
  • Model (hypothesis space): a mixture of N Gaussians
  • Latent variables: the correspondence of data points
    and Gaussians
  • We notice:
  • Given the mixture model, it's easy to calculate the
    correspondence
  • Given the correspondence, it's easy to estimate the
    mixture model

9
Expectation Maximization: Idea
  • Data are generated from a mixture of Gaussians
  • Latent variables: the correspondence between data
    items and Gaussians

10
Generalized K-Means (EM)
11
Gaussians
12
ML Fitting Gaussians
13
Learning a Gaussian Mixture (with known covariance)
M-Step
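
The update formulas on this slide are images, so here is a hedged sketch of the two steps for the known-covariance case. Equal mixing weights and a shared spherical variance `sigma2` are my simplifying assumptions; only the means are learned.

```python
import numpy as np

def em_gmm_known_cov(X, k, sigma2=1.0, n_iters=50, seed=0):
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # initial means
    for _ in range(n_iters):
        # E-step: soft correspondence r[i, j] of point i to Gaussian j,
        # proportional to exp(-||x_i - mu_j||^2 / (2 sigma2)).
        sq = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        sq -= sq.min(axis=1, keepdims=True)        # stabilize the exponent
        r = np.exp(-sq / (2.0 * sigma2))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: each mean becomes the responsibility-weighted average.
        mu = (r.T @ X) / r.sum(axis=0)[:, None]
    return mu, r
```

With hard 0/1 responsibilities the E-step becomes nearest-center allocation and the M-step becomes a plain cluster mean, which is the sense in which K-Means approximates EM.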
14
Expectation Maximization
  • Converges!
  • Proof: Neal/Hinton, McLachlan/Krishnan
  • An E/M step never decreases the data likelihood
  • Converges to a local maximum of the likelihood,
    or a saddle point
  • But subject to local optima

15
Practical EM
  • The number of clusters is unknown
  • Suffers (badly) from local minima
  • Algorithm (a scoring sketch follows this list):
  • Start a new cluster center if many points are
    unexplained
  • Kill a cluster center that doesn't contribute
  • (Use an AIC/BIC criterion for all this, if you want
    to be formal)
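
For the "formal" route, one common BIC score for a K-Means / spherical-Gaussian fit looks like the following. This is my own sketch of the idea, not the lecture's exact criterion.

```python
import numpy as np

def bic_for_kmeans(X, centers, labels):
    n, d = X.shape
    k = len(centers)
    sse = ((X - centers[labels]) ** 2).sum()   # within-cluster squared error
    var = sse / (n * d)                        # pooled spherical variance
    log_lik = -0.5 * n * d * (np.log(2 * np.pi * var) + 1)
    n_params = k * d + 1                       # k means plus one shared variance
    return n_params * np.log(n) - 2 * log_lik  # lower BIC is better
```

Running the earlier `kmeans` sketch for several values of k and keeping the k with the lowest BIC is one way to decide when to start or kill clusters.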

16
Spectral Clustering
17
Spectral Clustering
18
Spectral Clustering Overview
(Figure: pipeline from Data to pairwise Similarities to Block-Detection in the similarity matrix)
19
Eigenvectors and Blocks
  • Block matrices have block eigenvectors
  • Near-block matrices have near-block eigenvectors
    (Ng et al., NIPS 2002); see the example and the
    numerical check below

A block matrix has block eigenvectors:

    A  = | 1 1 0 0 |    eigenvalues:  λ1 = 2, λ2 = 2, λ3 = 0, λ4 = 0
         | 1 1 0 0 |    eigenvectors: v1 = (.71, .71, 0, 0)
         | 0 0 1 1 |                  v2 = (0, 0, .71, .71)
         | 0 0 1 1 |

Feeding a near-block matrix to the same eigensolver gives near-block eigenvectors:

    A' = |  1   1  .2   0 |    eigenvalues:  λ1 = 2.02, λ2 = 2.02, λ3 = -0.02, λ4 = -0.02
         |  1   1   0 -.2 |    eigenvectors: v1 = (.71, .69, .14, 0)
         | .2   0   1   1 |                  v2 = (0, -.14, .69, .71)
         |  0 -.2   1   1 |
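
The claim is easy to verify numerically; here is a small numpy check (my own demo, not from the slides):

```python
import numpy as np

A = np.array([[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 0, 1, 1]], dtype=float)
vals, vecs = np.linalg.eigh(A)   # eigenvalues in ascending order
print(vals)                      # [0. 0. 2. 2.]: one large eigenvalue per block
print(vecs[:, -2:])              # leading eigenvectors live on the blocks
                                 # (up to rotation inside the repeated eigenvalue)

A_near = np.array([[1,  1, .2,  0],
                   [1,  1,  0, -.2],
                   [.2, 0,  1,  1],
                   [0, -.2,  1,  1]])
vals, vecs = np.linalg.eigh(A_near)
print(vals)                      # approximately [-0.02 -0.02  2.02  2.02]
print(vecs[:, -2:])              # near-block eigenvectors, as on the slide
```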
20
Spectral Space
  • Can put items into blocks by their eigenvector
    coordinates
  • Resulting clusters are independent of row ordering

    A  = |  1   1  .2   0 |    e1 = (.71, .69, .14, 0)
         |  1   1   0 -.2 |    e2 = (0, -.14, .69, .71)
         | .2   0   1   1 |
         |  0 -.2   1   1 |

Permuting rows and columns (here, swapping items 2 and 3) permutes the eigenvector entries the same way, so the recovered blocks do not change:

    A' = |  1  .2   1   0 |    e1 = (.71, .14, .69, 0)
         | .2   1   0   1 |    e2 = (0, .69, -.14, .71)
         |  1   0   1 -.2 |
         |  0   1 -.2   1 |
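
A minimal sketch of using spectral space for clustering follows; it is simplified relative to Ng et al., NIPS 2002, which also normalizes the affinity matrix and the embedded rows, and the function name is mine.

```python
import numpy as np

def spectral_embed(A, n_components=2):
    """Coordinates of each item in the space of the top eigenvectors of A."""
    vals, vecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    return vecs[:, -n_components:]

A = np.array([[1,  1, .2,  0],
              [1,  1,  0, -.2],
              [.2, 0,  1,  1],
              [0, -.2,  1,  1]])
E = spectral_embed(A)
print(np.round(E, 2))   # rows 0,1 land close together, as do rows 2,3;
                        # running K-Means on E recovers the blocks, and a
                        # row/column permutation of A just permutes E's rows
```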
21
The Spectral Advantage
  • The key advantage of spectral clustering is the
    spectral space representation
22
Measuring Affinity
(Affinity from intensity, distance, and texture; the formulas are shown as images on the slide)
23
Scale affects affinity
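
The affinity formulas here are image-only, so as a hedged illustration, assume the usual Gaussian form aff(x, y) = exp(-d(x, y)^2 / (2 sigma^2)); the scale sigma then controls how crisp the block structure of the affinity matrix is.

```python
import numpy as np

def affinity(X, sigma):
    # Pairwise Gaussian affinity from squared Euclidean distances.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

X = np.array([[0.0], [0.1], [5.0], [5.1]])   # two tight groups
print(np.round(affinity(X, sigma=0.5), 2))   # small sigma: crisp 2x2 blocks
print(np.round(affinity(X, sigma=5.0), 2))   # large sigma: blocks wash out
```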
25
Dimensionality Reduction
26
Dimensionality Reduction with PCA
27
Linear Principal Components
  • Fit a multivariate Gaussian
  • Compute the eigenvectors of its covariance matrix
  • Project onto the eigenvectors with the largest
    eigenvalues (see the sketch below)
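
A minimal sketch of those three bullets (the function name is my own):

```python
import numpy as np

def pca_project(X, n_components):
    # Fit a multivariate Gaussian: the data's mean and covariance.
    mu = X.mean(axis=0)
    C = np.cov(X, rowvar=False)
    # Eigen-decompose the covariance (eigh: ascending eigenvalues).
    vals, vecs = np.linalg.eigh(C)
    # Project onto the eigenvectors with the largest eigenvalues.
    W = vecs[:, -n_components:]
    return (X - mu) @ W
```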