Title: Language Modeling Frameworks for Information Retrieval
1. Language Modeling Frameworks for Information Retrieval
- John Lafferty
- School of Computer Science
- Carnegie Mellon University
2. Retrieval As Decision Making
Given a query,
- Which documents should be selected? (D)
- How should these documents be presented to the user? (π)
3. Decision Theory Framework
A unified framework can be built on Bayesian decision theory: models, a loss function, and risk minimization (Zhai, 2002).
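
As a rough illustration of risk minimization, the sketch below picks the action with the smallest expected loss under a posterior over hidden states. The function names, the toy posterior, and the 0/1 loss are assumptions made for this example; they are not taken from Zhai (2002).

```python
def expected_loss(action, posterior, loss):
    """Expected loss of `action` under a posterior over hidden states."""
    return sum(p * loss(action, state) for state, p in posterior.items())


def bayes_action(actions, posterior, loss):
    """Return the action that minimizes expected loss (the Bayes action)."""
    return min(actions, key=lambda a: expected_loss(a, posterior, loss))


# Toy setup (all numbers are made up for illustration).
# Hidden state: which document is actually relevant to the query.
posterior = {"d1": 0.5, "d2": 0.3, "d3": 0.2}


def zero_one_loss(action, state):
    # Loss 0 if the document shown is the relevant one, otherwise 1.
    return 0.0 if action == state else 1.0


best = bayes_action(["d1", "d2", "d3"], posterior, zero_one_loss)
```

Under 0/1 loss the Bayes action is simply the most probable state; a different loss function (for example, one that rewards covering new aspects) would change the choice without changing the model.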
4. Example: Aspect Retrieval
Query: What are current applications of robotics? Find as many different applications as possible.
Aspect judgments:
      A1 A2 A3 ... Ak
d1    1 1 0 0 0 0
d2    0 1 1 1 0 0
d3    0 0 0 0 1 0
...
dk    1 0 1 0 ... 0 1
Example aspects:
- A1: spot-welding robotics
- A2: controlling inventory
- A3: pipe-laying robots
- A4: talking robot
- A5: robots for loading and unloading memory tapes
- A6: robot telephone operators
- A7: robot cranes
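
To make the judgment matrix concrete, here is a minimal sketch that counts how many distinct aspects a ranked list has covered after each document. It reuses the values shown for d1 to d3 and treats those six columns as the full aspect set, which is an assumption for illustration; `aspects_covered` is a hypothetical helper name.

```python
# Toy aspect judgments: judgments[doc][j] == 1 if the document covers aspect j.
# Only six aspect columns are used here (an assumption for this example).
judgments = {
    "d1": [1, 1, 0, 0, 0, 0],
    "d2": [0, 1, 1, 1, 0, 0],
    "d3": [0, 0, 0, 0, 1, 0],
}


def aspects_covered(ranking, judgments):
    """Cumulative number of distinct aspects covered after each rank."""
    seen = set()
    counts = []
    for doc in ranking:
        # Add the indices of all aspects this document is judged to cover.
        seen.update(j for j, hit in enumerate(judgments[doc]) if hit)
        counts.append(len(seen))
    return counts


coverage = aspects_covered(["d1", "d2", "d3"], judgments)
```

In this toy ranking, d2 contributes only two new aspects because A2 was already covered by d1; this redundancy is what distinguishes aspect retrieval from ranking by independent relevance.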
5. Aspect Models (Hofmann, 1999; Blei, Ng and Jordan, 2001)
[Figure: example aspects, labeled Aspect 1 and Aspect 2]
Inference: Given the aspects and a document, what is the posterior for λ?
Learning: Given documents, what are the maximum-likelihood (ML) aspects?
Studied recently in (Minka and Lafferty, 2002)
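
A simplified version of the inference question can be sketched as follows: with the aspects held fixed, estimate one document's mixing weights by EM. This gives a maximum-likelihood point estimate, not the full posterior, and it is not the method of Minka and Lafferty (2002). The function name, the toy aspects, and the toy document are assumptions for illustration.

```python
def estimate_mixing_weights(doc_counts, aspects, n_iters=50):
    """EM point estimate of aspect mixing weights for one document.

    doc_counts: dict mapping word -> count in the document.
    aspects: list of dicts, each mapping word -> probability under that aspect.
    """
    k = len(aspects)
    weights = [1.0 / k] * k  # start from uniform weights
    for _ in range(n_iters):
        expected = [0.0] * k
        for word, count in doc_counts.items():
            # E-step: responsibility of each aspect for this word.
            joint = [weights[a] * aspects[a].get(word, 0.0) for a in range(k)]
            norm = sum(joint)
            if norm == 0.0:
                continue  # word has zero probability under every aspect
            for a in range(k):
                expected[a] += count * joint[a] / norm
        # M-step: renormalize expected counts into mixing weights.
        mass = sum(expected)
        if mass == 0.0:
            break  # no word is explained by any aspect; keep current weights
        weights = [e / mass for e in expected]
    return weights


# Toy aspects and document (made-up numbers).
aspects = [
    {"robot": 0.7, "weld": 0.3},  # hypothetical "welding" aspect
    {"robot": 0.2, "tape": 0.8},  # hypothetical "tape loading" aspect
]
doc_counts = {"weld": 3, "robot": 2, "tape": 1}
weights = estimate_mixing_weights(doc_counts, aspects)
```

Here most of the weight goes to the first aspect, since "weld" can only be explained by it. A Bayesian treatment would place a prior on the weights and report a distribution over them rather than a single vector.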
6. Evaluation Measures
- What is the best measure?
  - Requires concrete specification of task
  - Several natural measures are computationally intractable, even assuming aspects known (e.g., aspect coverage, aspect uniqueness)
- Defining aspects is difficult
  - Maximum likelihood cannot be expected to capture true semantic relationships in aspects
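
One way to see the computational difficulty, assuming it lies in finding the best possible set of documents: choosing a fixed number of documents that together cover the most aspects is an instance of the maximum coverage problem, which is NP-hard. A common workaround is a greedy heuristic, sketched below with hypothetical names and toy data; it is an approximation, not an exact optimizer and not necessarily the approach used in the talk.

```python
def greedy_coverage_ranking(doc_aspects):
    """Rank documents by how many not-yet-covered aspects each one adds.

    doc_aspects: dict mapping doc id -> set of aspect ids the document covers.
    """
    remaining = dict(doc_aspects)
    covered = set()
    ranking = []
    while remaining:
        # Pick the document that adds the most new aspects; ties are
        # broken by document id so the output is deterministic.
        best = min(remaining, key=lambda d: (-len(remaining[d] - covered), d))
        ranking.append(best)
        covered |= remaining.pop(best)
    return ranking


# Toy judgments (made up); d4 duplicates d1 and so adds nothing new.
doc_aspects = {
    "d1": {"A1", "A2"},
    "d2": {"A2", "A3", "A4"},
    "d3": {"A5"},
    "d4": {"A1", "A2"},
}
ranking = greedy_coverage_ranking(doc_aspects)
```

The greedy ranking pushes the redundant document d4 to the bottom, even though it covers as many aspects as d1 when judged in isolation.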
7. Aspect Retrieval Baselines
[Figure: baseline results, Aspect Precision and Aspect Recall]
8. Challenges for IR Models
Probabilistic language models have proven to be
an effective way to reason about IR systems.
We now need:
- Better task specification and data
  - e.g., TREC interactive data inadequate
- More advanced models
  - Fewer independence assumptions, greater structure
- Improved inference and learning algorithms
  - Accuracy and efficiency
- To handle user preferences, background knowledge
  - Loss function and priors/constraints