Title: Mixture Language Models and EM Algorithm
1. Mixture Language Models and EM Algorithm
- (Lecture for CS397-CXZ Intro Text Info Systems)
- Sept. 17, 2003
- ChengXiang Zhai
- Department of Computer Science
- University of Illinois, Urbana-Champaign
2. Rest of this Lecture
- Unigram mixture models
- Slightly more sophisticated unigram LMs
- Related to smoothing
- EM algorithm
- VERY useful for estimating parameters of a mixture
  model or when latent/hidden variables are involved
- Will occur again and again in the course
3. Modeling a Multi-topic Document
A document with 2 types of vocabulary:
  [text mining passage] [food nutrition passage] [text mining passage]
  [text mining passage] [food nutrition passage]
How do we model such a document? How do we generate such a document?
How do we estimate our model?
Solution: a mixture model + EM
4. Simple Unigram Mixture Model
Model/topic 1: p(w|θ1), weight λ = 0.7
  text 0.2, mining 0.1, association 0.01, clustering 0.02, food 0.00001
Model/topic 2: p(w|θ2), weight 1 - λ = 0.3
  food 0.25, nutrition 0.1, healthy 0.05, diet 0.02
p(w|θ1, θ2) = λ p(w|θ1) + (1 - λ) p(w|θ2)
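To make the generative story concrete, here is a minimal Python sketch of how a word's probability and a document's log-likelihood would be computed under this two-topic mixture. The probability tables mirror slide 4; the example document and the fallback probability for unlisted words are illustrative assumptions, not values from the lecture.

```python
import math

# Word distributions for the two topics (values from slide 4); words a topic
# does not list get a tiny assumed fallback probability so the sketch runs.
THETA1 = {"text": 0.2, "mining": 0.1, "association": 0.01,
          "clustering": 0.02, "food": 0.00001}
THETA2 = {"food": 0.25, "nutrition": 0.1, "healthy": 0.05, "diet": 0.02}
LAM = 0.7    # mixing weight lambda for topic 1
EPS = 1e-6   # assumed fallback probability for unlisted words

def p_mix(word):
    """p(w | theta1, theta2) = lambda * p(w|theta1) + (1 - lambda) * p(w|theta2)."""
    return LAM * THETA1.get(word, EPS) + (1 - LAM) * THETA2.get(word, EPS)

def log_likelihood(doc):
    """Log-likelihood of a document under the mixture (unigram independence)."""
    return sum(math.log(p_mix(w)) for w in doc)

# Hypothetical document mixing the two vocabularies.
doc = ["text", "mining", "clustering", "food", "nutrition"]
print(p_mix("text"), p_mix("food"))
print(log_likelihood(doc))
```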
5. Parameter Estimation
Likelihood: p(d|λ, θ1, θ2) = Π_{w∈d} [ λ p(w|θ1) + (1 - λ) p(w|θ2) ]
- Estimation scenarios
- p(w|θ1), p(w|θ2) are known; estimate λ
- p(w|θ1), λ are known; estimate p(w|θ2)
- p(w|θ1) is known; estimate λ, p(w|θ2)
- λ is known; estimate p(w|θ1), p(w|θ2)
- Estimate λ, p(w|θ1), p(w|θ2) (this is clustering)
6. Parameter Estimation Example: Given p(w|θ1) and p(w|θ2), Estimate λ
Maximum Likelihood: find the λ that maximizes the likelihood of the
observed document.
The Expectation-Maximization (EM) algorithm is a commonly used method.
Basic idea: start from some random guess of the parameter value, then
iteratively improve the estimate (hill climbing):
- E-step: compute the lower bound
- M-step: find a new λ that maximizes the lower bound
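The slide's update formulas do not survive in the extracted text, so the following is the standard form of the E-step and M-step for this two-component case, with c(w, d) denoting the count of word w in document d:

```latex
% E-step: posterior probability that word w was generated from topic 1,
% given the current estimate lambda^(n)
p(z_w = 1 \mid w) \;=\;
  \frac{\lambda^{(n)}\, p(w \mid \theta_1)}
       {\lambda^{(n)}\, p(w \mid \theta_1) + \bigl(1 - \lambda^{(n)}\bigr)\, p(w \mid \theta_2)}

% M-step: re-estimate lambda as the expected fraction of word tokens from topic 1
\lambda^{(n+1)} \;=\;
  \frac{\sum_{w} c(w, d)\, p(z_w = 1 \mid w)}{\sum_{w} c(w, d)}
```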
7. EM Algorithm Intuition
Observed Doc d
Model/topic 1: p(w|θ1), weight λ = ?
  text 0.2, mining 0.1, association 0.01, clustering 0.02, food 0.00001
Model/topic 2: p(w|θ2), weight 1 - λ = ?
  food 0.25, nutrition 0.1, healthy 0.05, diet 0.02
Suppose we know the identity of each word.
p(w|θ1, θ2) = λ p(w|θ1) + (1 - λ) p(w|θ2)
8. Can We Guess the Identity?
Identity (hidden) variable: z_w = 1 (w is from θ1), z_w = 0 (w is from θ2)
  the paper presents a text mining algorithm the paper ...
  z_w:  1   1     1      1  0    0      0        1   0   ...
What's a reasonable guess?
- depends on λ (why?)
- depends on p(w|θ1) and p(w|θ2) (how?)
Initially, set λ to some random value, then iterate.
9. An Example of EM Computation
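The table of numbers on this slide is not reproduced in the extracted text, so the following is a stand-in: a minimal, runnable Python sketch of one such EM computation for the scenario of slide 6 (p(w|θ1) and p(w|θ2) fixed, only λ estimated). The topic-1 probabilities follow slide 4, while the topic-2 probabilities for non-food words, the document counts, and the starting λ = 0.5 are illustrative assumptions rather than values from the lecture.

```python
# EM for estimating the mixing weight lambda in a two-topic unigram mixture.
P1 = {"text": 0.2, "mining": 0.1, "clustering": 0.02, "food": 0.00001}
P2 = {"text": 0.00001, "mining": 0.00001, "clustering": 0.00001, "food": 0.25}

doc_counts = {"text": 2, "mining": 2, "clustering": 1, "food": 1}  # c(w, d), hypothetical

lam = 0.5  # initial guess for lambda
for it in range(5):
    # E-step: posterior probability that each word token came from topic 1
    z1 = {w: lam * P1[w] / (lam * P1[w] + (1 - lam) * P2[w]) for w in doc_counts}
    # M-step: new lambda = expected fraction of word tokens generated by topic 1
    total = sum(doc_counts.values())
    lam = sum(doc_counts[w] * z1[w] for w in doc_counts) / total
    print(f"iteration {it + 1}: lambda = {lam:.4f}")
```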
10. Any Theoretical Guarantee?
- EM is guaranteed to reach a LOCAL maximum
- When the local maximum is also the global maximum, EM can
  find the global maximum
- But when there are multiple local maxima, special techniques
  are needed (e.g., try different initial values)
- In our case, there is one unique local maximum (why?)
11. A General Introduction to EM
Data: X (observed), H (hidden); parameter: θ
Incomplete likelihood: L(θ) = log p(X|θ)
Complete likelihood: L_c(θ) = log p(X, H|θ)
EM tries to iteratively maximize the complete likelihood.
Starting with an initial guess θ^(0):
1. E-step: compute the expectation of the complete likelihood (the Q-function)
2. M-step: compute θ^(n) by maximizing the Q-function
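For reference, the Q-function used in the M-step is the expected complete log-likelihood under the posterior over the hidden variables given the previous parameter estimate; the formula itself does not survive in the extracted text, so this is the standard definition:

```latex
Q(\theta;\, \theta^{(n-1)})
  \;=\; E_{\,p(H \mid X,\, \theta^{(n-1)})}\!\left[ L_c(\theta) \right]
  \;=\; \sum_{H} p(H \mid X,\, \theta^{(n-1)}) \, \log p(X, H \mid \theta)
```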
12. Convergence Guarantee
Goal: maximize the incomplete likelihood L(θ) = log p(X|θ),
i.e., choose θ^(n) so that L(θ^(n)) - L(θ^(n-1)) ≥ 0.
Note that, since p(X, H|θ) = p(H|X, θ) p(X|θ),
  L(θ) = L_c(θ) - log p(H|X, θ)
  L(θ^(n)) - L(θ^(n-1)) = L_c(θ^(n)) - L_c(θ^(n-1))
                          + log [ p(H|X, θ^(n-1)) / p(H|X, θ^(n)) ]
Taking the expectation w.r.t. p(H|X, θ^(n-1)):
  L(θ^(n)) - L(θ^(n-1)) = Q(θ^(n); θ^(n-1)) - Q(θ^(n-1); θ^(n-1))
                          + D( p(H|X, θ^(n-1)) || p(H|X, θ^(n)) )
EM chooses θ^(n) to maximize Q, and the KL-divergence term is
always non-negative.
Therefore, L(θ^(n)) ≥ L(θ^(n-1))!
13. Another Way of Looking at EM
[Figure: the likelihood p(X|θ) plotted against θ, showing the current guess
θ^(n-1), the next guess θ^(n), and the lower bound (Q function) touching the
likelihood curve at the current guess.]
L(θ) = L(θ^(n-1)) + Q(θ; θ^(n-1)) - Q(θ^(n-1); θ^(n-1))
       + D( p(H|X, θ^(n-1)) || p(H|X, θ) )
E-step: computing the lower bound
M-step: maximizing the lower bound
14. What You Should Know
- Why is the unigram language model so important?
- What is a unigram mixture language model?
- How to estimate the parameters of a simple unigram
  mixture model using EM
- Know the general idea of EM (EM will be covered
  again later in the course)