Language Modeling Frameworks for Information Retrieval

Description:

Unified framework can be built on Bayesian decision theory: ... Studied recently in (Minka and Lafferty, 2002) September 11, 2002 ...


1
Language Modeling Frameworks for Information
Retrieval

John Lafferty
School of Computer Science
Carnegie Mellon University

2
Retrieval As Decision Making
Given a query,
- Which documents should be selected? (D)
- How should these docs be presented to the user? (π)
3
Decision Theory Framework
Unified framework can be built on Bayesian decision theory:
models, loss function, risk minimization (Zhai, 2002)
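As a toy illustration of the risk-minimization idea named above, the sketch below picks the action with the lowest posterior expected loss. The function name `bayes_action`, the two states, the two actions, and the loss values are all assumptions made for illustration; none of them come from the slides or from (Zhai, 2002).

```python
def bayes_action(actions, posterior, loss):
    """Return the action with the minimum posterior expected loss.

    actions:   iterable of candidate actions (hypothetical labels)
    posterior: dict mapping each state to its posterior probability
    loss:      function (action, state) -> non-negative cost
    """
    def risk(action):
        # Expected loss of this action, averaged over the posterior.
        return sum(p * loss(action, state) for state, p in posterior.items())

    return min(actions, key=risk)


# Toy setup (assumed values): is a single document relevant to the query?
posterior = {"relevant": 0.3, "nonrelevant": 0.7}

def toy_loss(action, state):
    # Showing a nonrelevant document costs 1; skipping a relevant one costs 5.
    if action == "show" and state == "nonrelevant":
        return 1.0
    if action == "skip" and state == "relevant":
        return 5.0
    return 0.0

print(bayes_action(["show", "skip"], posterior, toy_loss))
```

With these assumed numbers the expected loss of "show" is 0.7 and of "skip" is 1.5, so the document is shown even though it is more likely nonrelevant. The loss function, not the posterior alone, determines the decision.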
4
Example: Aspect Retrieval
Query: What are current applications of robotics? Find as many different applications as possible.
Aspect judgments:
      A1 A2 A3 ... Ak
d1    1 1 0 0 0 0
d2    0 1 1 1 0 0
d3    0 0 0 0 1 0
...
dk    1 0 1 0 ... 0 1
Example aspects:
A1 spot-welding robotics
A2 controlling inventory
A3 pipe-laying robots
A4 talking robot
A5 robots for loading and unloading memory tapes
A6 robot telephone operators
A7 robot cranes
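The aspect judgments above can be read as a binary document-by-aspect matrix. Under that reading, the sketch below collects the aspects covered by the top of a ranked list. The helper name `aspects_covered` is hypothetical, the three rows are the d1 to d3 values as they appear in the transcript, and treating column position i as aspect Ai is an assumption.

```python
def aspects_covered(judgments, ranking, cutoff):
    """Return the set of aspect indices covered by the top `cutoff` documents.

    judgments: dict mapping a document id to a list of 0/1 aspect flags
    ranking:   list of document ids, best first
    cutoff:    number of top-ranked documents to inspect
    """
    covered = set()
    for doc in ranking[:cutoff]:
        # Aspect indices are 1-based so that position i corresponds to Ai (assumed).
        covered.update(i for i, flag in enumerate(judgments[doc], start=1) if flag)
    return covered


# Rows as transcribed from the slide; the column alignment is an assumption.
judgments = {
    "d1": [1, 1, 0, 0, 0, 0],
    "d2": [0, 1, 1, 1, 0, 0],
    "d3": [0, 0, 0, 0, 1, 0],
}
ranking = ["d1", "d2", "d3"]

print(sorted(aspects_covered(judgments, ranking, 2)))  # [1, 2, 3, 4]
```

d2 repeats an aspect that d1 already covers, so a ranking judged by aspects rewards documents for the new aspects they contribute rather than for relevance alone.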
5
Aspect Models (Hofmann, 1999; Blei, Ng and Jordan, 2001)
[Figure: Aspect 1, Aspect 2]
Inference: Given aspects and document, what is the posterior for λ?
Learning: Given documents, what are the (ML) aspects?
Studied recently in (Minka and Lafferty, 2002)
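To make the inference question above concrete, here is a minimal sketch that assumes the inferred quantity is a document's vector of aspect mixing weights and that each aspect is a fixed word distribution. It computes only a maximum-likelihood point estimate with EM, which is a simpler swapped-in technique, not the full posterior the slide asks about and not the method of (Minka and Lafferty, 2002). The function name `estimate_mixing_weights` and the toy vocabulary are hypothetical.

```python
def estimate_mixing_weights(word_counts, aspects, iterations=200):
    """EM point estimate of a document's mixing weights over fixed aspects.

    word_counts: dict mapping word -> count in the document
    aspects:     list of dicts, each mapping word -> p(word | aspect)
    """
    k = len(aspects)
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(iterations):
        expected = [0.0] * k
        for word, count in word_counts.items():
            # E-step: share of this word's count attributed to each aspect.
            joint = [weights[a] * aspects[a].get(word, 0.0) for a in range(k)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # word has zero probability under every aspect
            for a in range(k):
                expected[a] += count * joint[a] / norm
        mass = sum(expected)
        if mass == 0.0:
            break  # no word is explained by any aspect; keep current weights
        # M-step: renormalize the expected counts into weights.
        weights = [e / mass for e in expected]
    return weights


# Toy aspects and document (assumed values for illustration only).
toy_aspects = [
    {"robot": 0.5, "weld": 0.5},
    {"robot": 0.5, "inventory": 0.5},
]
toy_document = {"robot": 2, "weld": 3, "inventory": 1}

print([round(w, 3) for w in estimate_mixing_weights(toy_document, toy_aspects)])
```

For this toy document the word "robot" is equally likely under both aspects, so only "weld" and "inventory" discriminate, and the weights approach 0.75 and 0.25.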
6
Evaluation Measures

What is the best measure?
Requires concrete specification of task
Several natural measures are computationally intractable, even assuming aspects known (e.g., aspect coverage, aspect uniqueness)
Defining aspects is difficult
Maximum likelihood cannot be expected to capture true semantic relationships in aspects

7
Aspect Retrieval Baselines
Aspect Precision
Aspect Recall
8
Challenges for IR Models
Probabilistic language models have proven to be
an effective way to reason about IR systems.
We now need