Language Modeling Frameworks for Information Retrieval

Description:

Unified framework can be built on Bayesian decision theory: ... Studied recently in (Minka and Lafferty, 2002) September 11, 2002 ...

1
Language Modeling Frameworks for Information
Retrieval
  • John Lafferty
  • School of Computer Science
  • Carnegie Mellon University

2
Retrieval As Decision Making
Given a query,
  • Which documents should be selected? (D)
  • How should these documents be presented to the user? (π)
3
Decision Theory Framework
A unified framework can be built on Bayesian decision theory:
  • Models, loss function, risk minimization (Zhai, 2002)
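As a minimal sketch of the risk-minimization idea on this slide: given a posterior over unknown states and a loss function, the Bayes-optimal action is the one with the smallest expected loss. The states, actions, probabilities, and loss values below are made-up illustrations, not taken from the talk or from (Zhai, 2002).

```python
def expected_loss(action, posterior, loss):
    """Posterior-weighted average loss of taking `action`."""
    return sum(p * loss[(action, state)] for state, p in posterior.items())


def bayes_action(actions, posterior, loss):
    """Return the action with the smallest expected loss."""
    return min(actions, key=lambda a: expected_loss(a, posterior, loss))


# Hypothetical example: decide whether to show one document to the user.
posterior = {"relevant": 0.3, "nonrelevant": 0.7}
loss = {
    ("show", "relevant"): 0.0,
    ("show", "nonrelevant"): 1.0,
    ("skip", "relevant"): 4.0,   # missing a relevant document is costly
    ("skip", "nonrelevant"): 0.0,
}
best = bayes_action(["show", "skip"], posterior, loss)
print(best)
```

With these assumed numbers, showing the document has the lower expected loss (0.7 versus 1.2) even though the document is more likely nonrelevant, because the loss function penalizes a missed relevant document more heavily.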
4
Example Aspect Retrieval
Query: What are current applications of robotics? Find as many different applications as possible.

Aspect judgments (columns A1 A2 A3 ... Ak):
d1: 1 1 0 0 0 0
d2: 0 1 1 1 0 0
d3: 0 0 0 0 1 0
...
dk: 1 0 1 0 ... 0 1

Example aspects:
  • A1: spot-welding robotics
  • A2: controlling inventory
  • A3: pipe-laying robots
  • A4: talking robot
  • A5: robots for loading and unloading memory tapes
  • A6: robot telephone operators
  • A7: robot cranes
5
Aspect Models (Hofmann, 1999; Blei, Ng and Jordan, 2001)
[Figure: two aspects, labeled Aspect 1 and Aspect 2]
  • Inference: Given aspects and a document, what is the posterior for λ?
  • Learning: Given documents, what are the maximum likelihood (ML) aspects?
Studied recently in (Minka and Lafferty, 2002)
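The inference question can be illustrated with a toy two-aspect case. The sketch below assumes each word of a document is drawn from a mixture `lam * aspect1 + (1 - lam) * aspect2`, puts a uniform prior on the mixing weight `lam`, and approximates the posterior on a grid. The function name, vocabulary, and probabilities are all hypothetical, and brute-force gridding is only a stand-in for illustration; it is not the approach of the cited work.

```python
import math


def mixing_weight_posterior(doc_counts, aspect1, aspect2, grid_size=101):
    """Grid approximation to p(lam | document) under a uniform prior on lam."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_post = []
    for lam in grid:
        ll = 0.0
        for word, count in doc_counts.items():
            # Mixture probability of this word; assumes both aspects
            # assign nonzero probability to every word in the document.
            p = lam * aspect1[word] + (1.0 - lam) * aspect2[word]
            ll += count * math.log(p)
        log_post.append(ll)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


# Hypothetical aspects (word distributions) and a tiny document.
aspect1 = {"robot": 0.7, "weld": 0.2, "tape": 0.1}
aspect2 = {"robot": 0.3, "weld": 0.1, "tape": 0.6}
doc = {"robot": 3, "weld": 2}

grid, post = mixing_weight_posterior(doc, aspect1, aspect2)
post_mean = sum(lam * p for lam, p in zip(grid, post))
print(round(post_mean, 3))
```

A grid is only practical for two or three aspects; with many aspects the posterior over mixing weights lives on a high-dimensional simplex, which is why approximate inference becomes a research question.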
6
Evaluation Measures
  • What is the best measure?
  • Requires concrete specification of task
  • Several natural measures are computationally
    intractable, even assuming aspects known (e.g.,
    aspect coverage, aspect uniqueness)
  • Defining aspects is difficult
  • Maximum likelihood cannot be expected to capture
    true semantic relationships in aspects

7
Aspect Retrieval Baselines
Aspect Precision
Aspect Recall
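The slides do not define these measures, so the following is only one plausible reading: aspect recall at rank k as the fraction of all aspects covered by the top k documents. The sketch uses the three fully listed judgment rows from the aspect retrieval example, treated as six aspects (an assumption, since the slide elides columns), and adds a greedy ranking heuristic that repeatedly picks the document covering the most new aspects. All function names are hypothetical.

```python
def aspects_covered(ranking, judgments, k):
    """Set of aspect indices covered by the first k documents of `ranking`."""
    covered = set()
    for doc in ranking[:k]:
        covered.update(i for i, hit in enumerate(judgments[doc]) if hit)
    return covered


def aspect_recall(ranking, judgments, k, num_aspects):
    """Fraction of all aspects covered by the top k documents (assumed definition)."""
    return len(aspects_covered(ranking, judgments, k)) / num_aspects


def greedy_ranking(judgments):
    """Heuristic: repeatedly pick the document that adds the most new aspects."""
    remaining = sorted(judgments)
    covered, ranking = set(), []
    while remaining:
        def gain(doc):
            return sum(1 for i, hit in enumerate(judgments[doc])
                       if hit and i not in covered)
        best = max(remaining, key=gain)  # ties go to the earlier document
        ranking.append(best)
        remaining.remove(best)
        covered.update(i for i, hit in enumerate(judgments[best]) if hit)
    return ranking


judgments = {
    "d1": [1, 1, 0, 0, 0, 0],
    "d2": [0, 1, 1, 1, 0, 0],
    "d3": [0, 0, 0, 0, 1, 0],
}
ranking = greedy_ranking(judgments)
recall_at_1 = aspect_recall(ranking, judgments, 1, 6)
print(ranking, recall_at_1)
```

The greedy pass is a heuristic, not an exact optimizer: choosing a document set that maximizes coverage is a set-cover-style problem, which is consistent with the remark on the previous slide that several natural measures are intractable even when aspects are known.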
8
Challenges for IR Models
Probabilistic language models have proven to be
an effective way to reason about IR systems.
We now need
  • Better task specification and data
      • e.g., TREC interactive data is inadequate
  • More advanced models
      • Fewer independence assumptions, greater structure
  • Improved inference and learning algorithms
      • Accuracy and efficiency
  • To handle user preferences, background knowledge
      • Loss function and priors/constraints