Title: EM Algorithm
EM Algorithm

Contents
- Introduction
- Example: Missing Data
- Example: Mixed Attributes
- Example: Mixture
- Main Body
- Mixture Model
- EM-Algorithm on GMM
EM Algorithm

Introduction
- EM is typically used to compute maximum likelihood estimates given incomplete samples.
- The EM algorithm estimates the parameters of a model iteratively.
- Starting from some initial guess, each iteration consists of
  - an E step (Expectation step)
  - an M step (Maximization step)
Applications
- Filling in missing data in samples
- Discovering the value of latent variables
- Estimating the parameters of HMMs
- Estimating parameters of finite mixtures
- Unsupervised learning of clusters
EM Algorithm

Univariate Normal Sample
Sampling from a univariate normal distribution.

Maximum Likelihood
Given x, the likelihood is a function of μ and σ². We want to maximize it.
Log-Likelihood Function
Maximize the log-likelihood instead,
by setting the derivatives with respect to μ and σ² to zero.

Maximizing the Log-Likelihood Function
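The derivation that belongs here follows the standard univariate-normal maximum-likelihood argument; a reconstruction:

```latex
\ell(\mu,\sigma^2)
  = -\frac{N}{2}\ln\!\left(2\pi\sigma^2\right)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{N}(x_i-\mu)^2,
\qquad
\frac{\partial \ell}{\partial \mu}=0,\
\frac{\partial \ell}{\partial \sigma^2}=0
\;\Longrightarrow\;
\hat\mu = \frac{1}{N}\sum_{i=1}^{N} x_i,\quad
\hat\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\hat\mu)^2
```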
Missing Data
Sampling: some of the sampled values are missing.
E-Step
Let θ(t) be the estimated parameters at the start of the t-th iteration.
E-Step
Let θ(t) be the estimated parameters at the start of the t-th iteration.
M-Step
Let θ(t) be the estimated parameters at the start of the t-th iteration.
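The lost equations on these slides replace each missing value by its conditional moments, E[x] = μ(t) and E[x²] = μ(t)² + σ²(t), in the E-step, then re-estimate μ and σ² in the M-step. A minimal runnable sketch, with illustrative data (the sample values and iteration count are assumptions, not from the slides):

```python
# Hedged sketch: EM for a univariate normal sample in which some
# observations are missing completely at random.
observed = [4.2, 3.9, 5.1, 4.7, 4.4, 5.3, 3.8]
n_missing = 3                      # three samples were lost
N = len(observed) + n_missing

mu, var = 0.0, 1.0                 # initial guess theta(0)
for _ in range(100):
    # E-step: under theta(t), each missing x contributes E[x] = mu
    # and E[x^2] = mu^2 + var to the sufficient statistics.
    sum_x  = sum(observed) + n_missing * mu
    sum_x2 = sum(v * v for v in observed) + n_missing * (mu ** 2 + var)
    # M-step: maximize the expected complete-data log-likelihood.
    mu  = sum_x / N
    var = sum_x2 / N - mu ** 2
print(mu, var)                     # converges to the observed-data MLE
```

As the fixed point shows, filling in uninformative missing values this way recovers exactly the maximum-likelihood estimate based on the observed data alone.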
Exercise
EM Algorithm

Example: Mixed Attributes
Multinomial Population
Sampling N samples
Maximum Likelihood
Sampling N samples
We want to maximize the likelihood.
Log-Likelihood
Mixed Attributes
Sampling N samples; the attribute x3 is not available.

E-Step
Given θ(t), what can you say about x3?
M-Step
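The slides' concrete multinomial and its update equations are not recoverable from this text. As a stand-in with the same E/M pattern (a multinomial count whose latent split EM must estimate), here is the classic genetic-linkage example; the cell probabilities and counts below are assumptions taken from that textbook example, not from the slides:

```python
# Hedged illustration: multinomial with cell probabilities
#   (1/2 + p/4, (1-p)/4, (1-p)/4, p/4).
# The first observed count y1 merges two latent cells with
# probabilities 1/2 and p/4; EM splits it.
y = [125, 18, 20, 34]              # textbook counts (assumption)

p = 0.5                            # initial guess
for _ in range(50):
    # E-step: expected part of y1 that falls in the p/4 cell.
    x2 = y[0] * (p / 4) / (0.5 + p / 4)
    # M-step: closed-form maximizer of the complete-data likelihood,
    # p = (x2 + y4) / (x2 + y2 + y3 + y4).
    p = (x2 + y[3]) / (x2 + y[1] + y[2] + y[3])
print(round(p, 4))                 # ~0.6268 for these counts
```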
Exercise
Estimate θ using different initial conditions.
EM Algorithm

Binomial/Poisson Mixture
M married obasans ("obasan": a middle-aged woman)
X: number of children
n0: number of obasans with no children
Binomial/Poisson Mixture
Unobserved data: of the n0 childless obasans, nA are married and nB are unmarried.
Binomial/Poisson Mixture
Complete data: the observed counts with n0 split into (nA, nB).
Complete Data Likelihood
Log-Likelihood

Maximization
E-Step
Given the current parameter estimates:
M-Step
Example
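A runnable sketch of what a "binomial indicator + Poisson count" mixture of this shape looks like in EM form. The model reading (an obasan is married with probability p; a married obasan's child count is Poisson(λ); an unmarried one has no children) and the counts are assumptions, since the slides' numbers are lost:

```python
import math

# n[k] = number of obasans observed with k children; n[0] mixes the
# nA married and nB unmarried obasans. Counts are illustrative.
n = {0: 3062, 1: 587, 2: 284, 3: 103, 4: 33, 5: 4, 6: 2}
N = sum(n.values())
children = sum(k * nk for k, nk in n.items())

p, lam = 0.5, 1.0                       # initial guess
for _ in range(200):
    # E-step: expected number of *married* obasans among the childless.
    nA = n[0] * p * math.exp(-lam) / (p * math.exp(-lam) + 1 - p)
    married = nA + (N - n[0])           # anyone with >= 1 child is married
    # M-step: closed-form maximizers of the complete-data likelihood.
    p = married / N
    lam = children / married
print(round(p, 3), round(lam, 3))
```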
EM Algorithm

Maximum Likelihood

Latent Variables
Incomplete data: X
Complete data: (X, Y), where Y is the latent variable.
Complete Data Likelihood
The complete-data likelihood p(X, Y | Θ) is a function of the latent variable Y and the parameter Θ.
- Viewed as a function of the parameter Θ, it is the quantity to maximize.
- If we are given Θ, it is a function of the random variable Y; the result is in terms of Y, and its expectation is computable.
Expectation Step
Let Θ(i−1) be the parameter vector obtained at the (i−1)-th step.
Define Q(Θ | Θ(i−1)) as the expected complete-data log-likelihood given X and Θ(i−1).

Maximization Step
Let Θ(i−1) be the parameter vector obtained at the (i−1)-th step.
Choose Θ(i) to maximize Q(Θ | Θ(i−1)).
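The two lost definitions, reconstructed in standard notation:

```latex
Q\!\left(\Theta \mid \Theta^{(i-1)}\right)
  = E\!\left[\,\log p(X, Y \mid \Theta) \,\middle|\, X, \Theta^{(i-1)}\right],
\qquad
\Theta^{(i)} = \arg\max_{\Theta} Q\!\left(\Theta \mid \Theta^{(i-1)}\right)
```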
EM Algorithm

Mixture Models
- If there is a reason to believe that a data set is comprised of several distinct populations, a mixture model can be used.
- It has the following form, with mixing weights that sum to one.
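In standard notation (a reconstruction of the lost formula):

```latex
p(x \mid \Theta) = \sum_{l=1}^{M} \alpha_l \, p_l(x \mid \theta_l),
\qquad \text{with}\quad \sum_{l=1}^{M} \alpha_l = 1,\ \alpha_l \ge 0
```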
Mixture Models
Let yi ∈ {1, …, M} represent the source that generates the data point xi.
Given x and Θ, the conditional density of y can be computed.
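The conditional density in question is the standard posterior "responsibility" (a reconstruction):

```latex
p(l \mid x_i, \Theta)
  = \frac{\alpha_l \, p_l(x_i \mid \theta_l)}
         {\sum_{k=1}^{M} \alpha_k \, p_k(x_i \mid \theta_k)}
```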
Complete-Data Likelihood Function
Expectation
Take Θg as the current guess.
The term inside the expectation is zero when yi ≠ l, and the posterior probabilities sum to one over the sources.
Maximization
Given the initial guess Θg, we want to find Θ to maximize the above expectation.
In fact, this is done iteratively.
The GMM (Gaussian Mixture Model)
Gaussian model of a d-dimensional source, say j; a GMM has M such sources.
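The two lost formulas, reconstructed: the d-dimensional Gaussian component density and the GMM it builds:

```latex
p_j(x \mid \mu_j, \Sigma_j)
  = \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma_j\rvert^{1/2}}
    \exp\!\left(-\tfrac{1}{2}(x-\mu_j)^{\top}\Sigma_j^{-1}(x-\mu_j)\right),
\qquad
p(x \mid \Theta) = \sum_{j=1}^{M} \alpha_j \, p_j(x \mid \mu_j, \Sigma_j)
```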
EM Algorithm

Goal
To maximize the expected complete-data log-likelihood of the mixture model, subject to the constraint that the αl sum to one.
One group of terms is correlated with the αl only; the other is correlated with the θl only, so the two groups can be maximized separately.
Finding αl
Due to the constraint on the αl's, we introduce a Lagrange multiplier λ and solve the resulting equation. The multiplier works out to λ = −N, giving

αl(new) = (1/N) Σi p(l | xi, Θg)
Finding θl
Only the term involving θl needs to be maximized; the part involving the αl is unrelated.
Consider the GMM: we therefore want to maximize the weighted Gaussian log-likelihood terms. How? Some knowledge of matrix algebra (derivatives with respect to vectors and matrices) is needed.
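Carrying out those matrix derivatives yields the standard GMM updates (a reconstruction):

```latex
\mu_l^{\text{new}}
  = \frac{\sum_{i=1}^{N} x_i \, p(l \mid x_i, \Theta^{g})}
         {\sum_{i=1}^{N} p(l \mid x_i, \Theta^{g})},
\qquad
\Sigma_l^{\text{new}}
  = \frac{\sum_{i=1}^{N} p(l \mid x_i, \Theta^{g})\,
          (x_i-\mu_l^{\text{new}})(x_i-\mu_l^{\text{new}})^{\top}}
         {\sum_{i=1}^{N} p(l \mid x_i, \Theta^{g})}
```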
Summary
EM algorithm for GMM: given an initial guess Θg, find Θnew as above.
If not converged, set Θg ← Θnew and iterate again.
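The whole loop in the summary can be sketched in code; a one-dimensional version for brevity, with synthetic data (the data, seed, and component count are assumptions):

```python
import math
import random

# Hedged sketch: EM for a 1-D GMM, alternating responsibilities
# (E-step) and the closed-form alpha, mu, sigma^2 updates (M-step).
random.seed(0)
data = [random.gauss(-2.0, 0.6) for _ in range(300)] + \
       [random.gauss(3.0, 1.0) for _ in range(300)]

M = 2
alpha = [0.5, 0.5]
mu = [-1.0, 1.0]
var = [1.0, 1.0]

def pdf(x, m, v):
    # Univariate normal density N(x; m, v).
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

for _ in range(100):
    # E-step: responsibility p(l | x_i, theta_g) for each point and source.
    resp = []
    for x in data:
        w = [alpha[l] * pdf(x, mu[l], var[l]) for l in range(M)]
        s = sum(w)
        resp.append([wl / s for wl in w])
    # M-step: weighted closed-form updates.
    for l in range(M):
        nl = sum(r[l] for r in resp)
        alpha[l] = nl / len(data)
        mu[l] = sum(r[l] * x for r, x in zip(resp, data)) / nl
        var[l] = sum(r[l] * (x - mu[l]) ** 2 for r, x in zip(resp, data)) / nl
print([round(m, 2) for m in mu])
```

With well-separated sources, the component means settle near the true centers of the two generating Gaussians.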
Demonstration
EM algorithm for mixture models
Exercises
- Write a program to generate samples from a multidimensional Gaussian distribution.
- Draw the distribution for 2-dim data.
- Write a program to generate GMM data.
- Write the EM algorithm to analyze GMM data.
- Study more about the EM algorithm for mixtures.
- Find applications of the EM algorithm.
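A possible starting point for the first exercises: a sketch that samples from an assumed two-component, 2-D GMM with axis-aligned covariances (all parameter values below are illustrative, not from the slides):

```python
import random

random.seed(1)
components = [
    # (weight, mean, per-axis standard deviations)
    (0.4, (-2.0, 0.0), (0.5, 0.5)),
    (0.6, (2.0, 1.0), (1.0, 0.7)),
]

def sample_gmm(n):
    points = []
    for _ in range(n):
        u = random.random()        # pick a component by its weight
        acc = 0.0
        for w, mean, sd in components:
            acc += w
            if u <= acc:
                # Draw each coordinate from its own 1-D Gaussian.
                points.append(tuple(random.gauss(m, s)
                                    for m, s in zip(mean, sd)))
                break
    return points

pts = sample_gmm(1000)
print(len(pts), pts[0])
```

Plotting the resulting points as a 2-D scatter is one way to do the "draw the distribution" exercise.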
References
- Jeff Bilmes, "A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models," 1998.
- Sean Borman, "The Expectation Maximization Algorithm: A Short Tutorial."