Title: Exact Maximum Likelihood Estimation for Word Mixtures
1. Exact Maximum Likelihood Estimation for Word Mixtures
- Yi Zhang Jamie Callan
- Carnegie Mellon University
- {yiz, callan}@cs.cmu.edu
- Wei Xu
- NEC CC Research Lab
- xw@ccrl.sj.nec.com
2. Outline
- Introduction
- Why this problem? Some retrieval applications
- Traditional solution: the EM algorithm
- New algorithm: exact MLE
- Experimental Results
3. Example 1: Model-Based Feedback in the Language Modeling Approach to IR
[Diagram: a query Q is run against documents D to produce results; the top-ranked feedback documents F = {d1, d2, ..., dn} are used to update the query model.]
Based on Zhai & Lafferty's slides at CIKM 2001
4. θF Estimation Based on a Generative Mixture Model
Given the feedback documents F, the collection model P(w|C), and the mixture weight λ, find the MLE of θF.
Based on Zhai & Lafferty's slides at CIKM 2001
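As a reference for this slide, the following LaTeX sketch writes out the feedback log-likelihood of Zhai & Lafferty's generative mixture model; the count notation c(w; F) for word w in F is our assumption for illustration.

```latex
% Sketch of the feedback mixture log-likelihood (after Zhai & Lafferty, CIKM 2001).
% c(w; F) denotes the count of word w in the feedback documents F (notation assumed).
\[
  \log P(F \mid \theta_F)
  = \sum_{w} c(w; F)\,
    \log\bigl( (1-\lambda)\, P(w \mid \theta_F) + \lambda\, P(w \mid C) \bigr)
\]
% The MLE \hat{\theta}_F maximizes this objective over the simplex:
% \sum_w P(w \mid \theta_F) = 1 and P(w \mid \theta_F) \ge 0.
```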
5. Example 2: Model-Based Approach for Novelty Detection in Adaptive Information Filtering
Given θE (general English) and θT (topic), with a document modeled as a mixture of θE, θT, and θnew, find the MLE of θnew.
Based on Zhang & Callan's paper at SIGIR 2002
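A hedged LaTeX sketch of the three-component mixture implied by this slide; the weights α, β, γ and the count notation c(w; d) are our assumptions for illustration, not taken from the paper.

```latex
% Hypothetical form of the novelty-detection mixture for a document d:
% each word is drawn from general English, the topic model, or the new-information model.
% Weights \alpha, \beta, \gamma with \alpha + \beta + \gamma = 1 (assumed notation).
\[
  \log P(d \mid \theta_{new})
  = \sum_{w} c(w; d)\,
    \log\bigl( \alpha\, P(w \mid \theta_E) + \beta\, P(w \mid \theta_T)
               + \gamma\, P(w \mid \theta_{new}) \bigr)
\]
```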
6. Problem Setting and Traditional Solution Using EM
- Observed data are generated by a mixture multinomial distribution r = (r1, r2, r3, ..., rk)
- Given the interpolation weights α and β and another multinomial distribution p = (p1, p2, p3, ..., pk), with ri = α qi + β pi
- Find the maximum likelihood estimate (MLE) of the multinomial distribution q = (q1, q2, q3, ..., qk)
- Traditional solution: the EM algorithm (sketched below)
  - An iterative process that can be computationally expensive
  - Only provides an approximate solution
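As a point of comparison for the exact algorithm on slide 9, here is a minimal, hypothetical Python sketch of the EM baseline for this two-component problem; the function and variable names are ours, not from the paper.

```python
import numpy as np

def em_mixture(f, p, alpha, beta, n_iter=1000, tol=1e-10):
    """Hypothetical EM baseline for the problem on this slide: observed counts f
    drawn from r = alpha*q + beta*p with p, alpha, beta known; estimate q."""
    f = np.asarray(f, dtype=float)
    p = np.asarray(p, dtype=float)
    q = f / f.sum()                      # initialize q at the empirical distribution
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability that each word token came from q.
        mix = alpha * q + beta * p
        t = np.divide(alpha * q, mix, out=np.zeros_like(q), where=mix > 0)
        # M-step: reassign probability mass in proportion to the expected counts.
        w = f * t
        q = w / w.sum()
        # Stop when the log-likelihood improvement falls below tol.
        seen = f > 0
        ll = np.sum(f[seen] * np.log(alpha * q[seen] + beta * p[seen]))
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return q
```

Each iteration touches every observed word, and many iterations may be needed before the log-likelihood change falls under the tolerance, which is the computational cost the slide refers to.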
7. Finding q (1)
Maximize the log-likelihood

\[ \log L(q) = \sum_{i=1}^{k} f_i \log(\alpha q_i + \beta p_i) \]

under the constraints

\[ \sum_{i=1}^{k} q_i = 1, \qquad q_i \ge 0, \]

where fi is the observed frequency of word i.
8. Finding q (2)
For all the qi such that qi > 0, apply the Lagrange multiplier method and set the derivative with respect to qi to zero:

\[ \frac{\partial}{\partial q_i}\Bigl[ \sum_j f_j \log(\alpha q_j + \beta p_j) - \mu \Bigl( \sum_j q_j - 1 \Bigr) \Bigr] = \frac{\alpha f_i}{\alpha q_i + \beta p_i} - \mu = 0 \quad\Longrightarrow\quad q_i = \frac{f_i}{\mu} - \frac{\beta}{\alpha} p_i \]

This is a closed-form solution for qi, if we know all i such that qi > 0.
Theorem: all the qi greater than 0 correspond to the smallest values of pi/fi. See the detailed proof in our paper.
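For completeness, a short LaTeX sketch of how the multiplier μ follows from the normalization constraint, in the notation above; S denotes the support {i : qi > 0} (our notation).

```latex
% Determining \mu from the normalization constraint (S = \{ i : q_i > 0 \}, notation assumed).
\[
  \sum_{i \in S} q_i
  = \sum_{i \in S} \Bigl( \frac{f_i}{\mu} - \frac{\beta}{\alpha} p_i \Bigr) = 1
  \quad\Longrightarrow\quad
  \mu = \frac{\alpha \sum_{i \in S} f_i}{\alpha + \beta \sum_{i \in S} p_i}
\]
```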
9. Algorithm for Finding the Exact MLE of q
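The algorithm itself appeared as a figure on this slide; below is a minimal Python sketch, assuming the closed form and the support theorem from slide 8 (sort words by pi/fi and take the largest prefix whose last word still gets positive probability). Function and variable names are ours.

```python
import numpy as np

def exact_mle_mixture(f, p, alpha, beta):
    """Sketch of the exact MLE of q, where observed counts f come from
    r = alpha*q + beta*p with p, alpha, beta known (see slides 6-8)."""
    f = np.asarray(f, dtype=float)
    p = np.asarray(p, dtype=float)

    # Only observed words (f_i > 0) can receive positive probability in q.
    idx = np.where(f > 0)[0]

    # By the theorem on slide 8, the support of q consists of the words with
    # the smallest p_i / f_i, so sort the candidates by that ratio.
    order = idx[np.argsort(p[idx] / f[idx])]

    F = np.cumsum(f[order])   # prefix sums of f over candidate supports
    P = np.cumsum(p[order])   # prefix sums of p over candidate supports

    # For support = first n sorted words, mu_n = alpha*F_n / (alpha + beta*P_n)
    # (slide 8); the n-th word itself gets q_n = f_n/mu_n - (beta/alpha)*p_n.
    mu = alpha * F / (alpha + beta * P)
    q_edge = f[order] / mu - (beta / alpha) * p[order]

    # Take the largest prefix whose last (worst-ratio) word stays positive.
    n = int(np.max(np.where(q_edge > 0)[0])) + 1

    q = np.zeros_like(f)
    support = order[:n]
    q[support] = f[support] / mu[n - 1] - (beta / alpha) * p[support]
    return q
```

The returned q sums to 1 by construction, since the closed form from slide 8 already incorporates the normalization constraint; one sort plus one pass replaces the EM iterations.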
10. Experimental Setting for Model-Based Feedback in IR
- 20 relevant documents for a topic (sampled from the AP Wire News and Wall Street Journal datasets, 1988-1990) serve as the observed training data. p is calculated directly from 119,823 documents, as described in (Zhai & Lafferty).
- There are 2,352 unique words in these 20 relevant documents, so at most 2,352 of the qi are nonzero, while 200,542 of the pi are nonzero.
11. The EM result converges to the result calculated directly by our algorithm.
12. Comparing the Speed of Our Algorithm With EM
- EM stops when the change in log-likelihood (LL) is less than 10^-?
- 50,000 runs on a Pentium III 500 MHz PC
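To make the comparison concrete, here is a hypothetical timing harness over the two sketches above (em_mixture and exact_mle_mixture, both our own illustrative names); the vocabulary size, sample size, and weights are arbitrary, not the paper's settings.

```python
import time
import numpy as np

# Hypothetical benchmark over the sketches above; sizes and weights are arbitrary.
rng = np.random.default_rng(0)
k = 5000
p = rng.dirichlet(np.ones(k))                                        # known background model
f = rng.multinomial(2000, rng.dirichlet(np.ones(k))).astype(float)   # observed word counts
alpha, beta = 0.5, 0.5

start = time.perf_counter()
q_exact = exact_mle_mixture(f, p, alpha, beta)
print(f"exact: {time.perf_counter() - start:.4f}s")

start = time.perf_counter()
q_em = em_mixture(f, p, alpha, beta)
print(f"EM:    {time.perf_counter() - start:.4f}s")

# The two estimates should agree up to the EM tolerance (cf. slide 11).
print("max |q_exact - q_em| =", np.abs(q_exact - q_em).max())
```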
13. Conclusion
- We developed a new training algorithm that provides the exact MLE for word mixtures
- It works well both theoretically and empirically
- It can be used in several language-model-based IR applications