1
Advanced Artificial Intelligence
  • Lecture 8
  • Advanced machine learning

2
Outline
  • Clustering
  • K-Means
  • EM
  • Spectral Clustering
  • Dimensionality Reduction

3
The unsupervised learning problem
Many data points, no labels
4
K-Means
Many data points, no labels
5
K-Means
  • Choose a fixed number of clusters
  • Choose cluster centers and point-cluster
    allocations to minimize error
  • Can't do this by exhaustive search, because there
    are too many possible allocations.
  • Algorithm (a sketch follows below):
  • Fix cluster centers; allocate points to the
    closest cluster
  • Fix allocations; compute the best cluster centers
  • x could be any set of features for which we can
    compute a distance (be careful about scaling)
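
A minimal NumPy sketch of these two alternating steps (the
function and parameter names are my own, not from the lecture;
empty clusters are left unhandled):

    import numpy as np

    def kmeans(X, k, n_iter=100, seed=0):
        # X: (n, d) data matrix; k: the fixed number of clusters
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Fix centers: allocate each point to its closest center
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
            labels = d2.argmin(axis=1)
            # Fix allocations: each center becomes its cluster's mean
            centers = np.stack([X[labels == j].mean(axis=0)
                                for j in range(k)])
        return centers, labels

Scaling matters because the squared Euclidean distance weighs
every feature equally, so features on large scales dominate the
allocation step.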

6
K-Means
7
K-Means
From Marc Pollefeys, COMP 256, 2003
8
K-Means
  • Is an approximation to EM
  • Model (hypothesis space): mixture of N Gaussians
  • Latent variables: correspondence of data points
    and Gaussians
  • We notice:
  • Given the mixture model, it's easy to calculate
    the correspondence
  • Given the correspondence, it's easy to estimate
    the mixture model

9
Expectation Maximization: Idea
  • Data are generated from a mixture of Gaussians
  • Latent variables: correspondence between data
    items and Gaussians

10
Generalized K-Means (EM)
11
Gaussians
12
ML Fitting Gaussians
13
Learning a Gaussian Mixture (with known covariance)
M-Step
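
The slide's update formulas are not in the transcript; for
equal-weight Gaussians with a known, shared spherical covariance
σ²I, the standard E- and M-steps can be sketched as follows (an
assumed reconstruction, not copied from the slide):

    import numpy as np

    def em_step(X, mu, sigma2):
        # E-step: responsibility r[i, j] that Gaussian j generated x_i
        # (equal mixing weights and shared covariance sigma2*I assumed)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
        r = np.exp(-d2 / (2.0 * sigma2))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: each mean is the responsibility-weighted data average
        mu = (r.T @ X) / r.sum(axis=0)[:, None]
        return mu, r

Replacing the soft responsibilities with a hard argmax recovers
the K-Means allocation step, which is the sense in which K-Means
approximates EM.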
14
Expectation Maximization
  • Converges!
  • Proof: Neal/Hinton, McLachlan/Krishnan
  • No E- or M-step ever decreases the data likelihood
  • Converges to a local optimum or saddle point of
    the likelihood
  • So it remains subject to local optima

15
Practical EM
  • Number of clusters is unknown
  • Suffers (badly) from local optima
  • Algorithm (a sketch follows below):
  • Start a new cluster center if many points are
    unexplained
  • Kill a cluster center that doesn't contribute
  • (Use an AIC/BIC criterion for all of this, if you
    want to be formal)
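
A sketch of the formal AIC/BIC route using scikit-learn's
GaussianMixture (the candidate range k_max is an arbitrary
choice of mine):

    from sklearn.mixture import GaussianMixture

    def pick_k_by_bic(X, k_max=10):
        # Fit one mixture per candidate k; keep the lowest-BIC model
        fits = [GaussianMixture(n_components=k, random_state=0).fit(X)
                for k in range(1, k_max + 1)]
        return min(fits, key=lambda gm: gm.bic(X))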

16
Spectral Clustering
17
Spectral Clustering
18
Spectral Clustering Overview
Data → Similarities → Block-Detection
19
Eigenvectors and Blocks
  • Block matrices have block eigenvectors
  • Near-block matrices have near-block eigenvectors
    (Ng et al., NIPS 02)

Example: the block matrix

    1 1 0 0
    1 1 0 0
    0 0 1 1
    0 0 1 1

has eigenvalues λ1 = λ2 = 2 and λ3 = λ4 = 0; its leading
eigenvectors are the block indicators (.71, .71, 0, 0) and
(0, 0, .71, .71). Running an eigensolver on the near-block
matrix

    1    1    .2   0
    1    1    0   -.2
    .2   0    1    1
    0   -.2   1    1

gives eigenvalues λ1 = λ2 ≈ 2.02 and λ3 = λ4 ≈ -0.02, with
near-block eigenvectors (.71, .69, .14, 0) and
(0, -.14, .69, .71).
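
The slide's numbers can be reproduced with NumPy's symmetric
eigensolver (eigenvector signs and ordering may differ from the
slide):

    import numpy as np

    A = np.array([[1.0, 1.0, 0.2, 0.0],
                  [1.0, 1.0, 0.0, -0.2],
                  [0.2, 0.0, 1.0, 1.0],
                  [0.0, -0.2, 1.0, 1.0]])
    vals, vecs = np.linalg.eigh(A)   # eigenvalues in ascending order
    print(vals[::-1])                # ~ [2.02, 2.02, -0.02, -0.02]
    print(vecs[:, -2:])              # the two near-block eigenvectors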
20
Spectral Space
  • Can put items into blocks by eigenvectors
  • Resulting clusters independent of row ordering

Example: the near-block matrix above has leading eigenvectors
e1 = (.71, .69, .14, 0) and e2 = (0, -.14, .69, .71), so plotting
item i at (e1_i, e2_i) in spectral space separates the two
blocks. Reordering the items (swapping rows and columns 2 and 3)
gives

    1    .2   1    0
    .2   1    0    1
    1    0    1   -.2
    0    1   -.2   1

whose eigenvectors are permuted the same way,
e1 = (.71, .14, .69, 0) and e2 = (0, .69, -.14, .71), so the
resulting spectral-space clusters are unchanged.
21
The Spectral Advantage
  • The key advantage of spectral clustering is the
    spectral space representation (a sketch follows
    below)
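
A minimal sketch of that representation, loosely following the
Ng et al. recipe (the row normalization is that paper's step,
and the function name is my own, not the lecture's):

    import numpy as np

    def spectral_embed(A, k):
        # Item i's spectral-space coordinates are the i-th row of
        # the matrix of the k leading eigenvectors of the affinity A.
        vals, vecs = np.linalg.eigh(A)
        E = vecs[:, -k:]                 # k largest eigenvalues
        return E / np.linalg.norm(E, axis=1, keepdims=True)

Clustering these rows (for instance with the kmeans sketch
above) then separates the blocks even when the raw affinities
are noisy.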

22
Measuring Affinity
Cues: intensity, distance, texture
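
Each cue typically enters through the same exponentiated-distance
template, aff(x_i, x_j) = exp(-d(x_i, x_j)² / (2σ²)), with d
measured on intensities, positions, or texture descriptors. A
sketch of that template (the function name is my own):

    import numpy as np

    def gaussian_affinity(F, sigma):
        # F: (n, d) per-item feature vectors (intensity, position,
        # or texture); returns the n x n affinity matrix
        d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))

The scale σ sets which distances still count as similar, which
is the point of the next slide.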
23
Scale affects affinity
24
(No Transcript)
25
Dimensionality Reduction
26
Dimensionality Reduction with PCA
27
Linear Principal Components
  • Fit a multivariate Gaussian
  • Compute the eigenvectors of its covariance matrix
  • Project onto the eigenvectors with the largest
    eigenvalues (a sketch follows below)
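
A sketch mirroring the three bullets (the function name and the
number of components m are my own):

    import numpy as np

    def pca_project(X, m):
        # Fit the Gaussian: mean and covariance of the data
        Xc = X - X.mean(axis=0)
        C = np.cov(Xc, rowvar=False)
        # Eigenvectors of the covariance, largest eigenvalues first
        vals, vecs = np.linalg.eigh(C)
        W = vecs[:, ::-1][:, :m]
        # Project onto the m leading eigenvectors
        return Xc @ W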