Title: Other IR Models
1. Other IR Models
[Taxonomy of IR models: the user task splits into Retrieval (ad hoc, filtering) and Browsing. Classic models: Boolean, vector, probabilistic. Algebraic extensions: generalized vector, latent semantic indexing, neural networks. Probabilistic extensions: inference network, belief network.]
2. Another Vector Model: Motivation
- Index terms have synonyms.
- Use thesauri?
- Index terms have multiple meanings (polysemy).
- Use restricted vocabularies or more precise queries?
- Index terms are not independent; think of phrases.
- Use combinations of terms?
3. Latent Semantic Indexing/Analysis
- Basic idea: keywords in a query are just one way of specifying the information need. One really wants to specify the key concepts rather than words.
- Assume a latent semantic structure underlying the term-document data that is partially obscured by exact word choice.
4. LSI In Brief
- Map terms into a lower-dimensional space (via SVD) to remove noise and force clustering of similar words.
- Pre-process the corpus to create a reduced vector space.
- Match queries to docs in the reduced space.
5. SVD for Term-Doc Matrix
- Decompose the t × d term-document matrix X via the singular value decomposition:
  X = T S Dᵀ
- where m is the rank of X (m ≤ min(t, d)); T is the t × m orthonormal matrix of eigenvectors of the term-term correlation matrix X Xᵀ; S is the m × m diagonal matrix of singular values; and D is the d × m orthonormal matrix of eigenvectors of the doc-doc correlation matrix Xᵀ X. (A numpy sketch follows below.)
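A minimal sketch of this decomposition in numpy; the matrix X and its counts are made-up toy data for illustration, not from the slides:

```python
import numpy as np

# Toy t x d term-document matrix X (t = 5 terms, d = 4 docs); counts are made up.
X = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 0, 1],
    [0, 0, 3, 1],
    [1, 0, 0, 2],
], dtype=float)

# Economy-size SVD: X = T @ diag(s) @ Dt, with s the singular values.
T, s, Dt = np.linalg.svd(X, full_matrices=False)
print(T.shape, s.shape, Dt.shape)            # (5, 4) (4,) (4, 4)

# Columns of T are eigenvectors of X X^T (term-term correlation);
# rows of Dt are eigenvectors of X^T X (doc-doc correlation).
np.testing.assert_allclose(X, T @ np.diag(s) @ Dt, atol=1e-10)
```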
6. Reducing Dimensionality
- Order the singular values in S0 by size, keep the k largest, and delete the remaining rows/columns of S0, T0 and D0 to form the rank-k approximation X̂ = T S Dᵀ.
- The approximate model is the rank-k model with the best possible least-squares fit to X.
- Pick k large enough to fit the real structure, but small enough to eliminate noise; usually 100-300. (See the truncation sketch below.)
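Continuing the toy example above, truncation is a couple of lines; k = 2 here only because the example matrix is tiny (real corpora use the 100-300 range the slide mentions):

```python
# Keep the k largest singular values and the matching rows/columns.
k = 2
Tk, sk, Dtk = T[:, :k], s[:k], Dt[:k, :]

# Best rank-k least-squares approximation to X (Eckart-Young theorem).
X_hat = Tk @ np.diag(sk) @ Dtk
print(np.linalg.norm(X - X_hat))             # residual "noise" removed
```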
7. Computing Similarities in LSI
- How similar are two terms?
- Dot product between two row vectors of X̂ (i.e., an entry of X̂ X̂ᵀ = T S² Tᵀ).
- How similar are two documents?
- Dot product between two column vectors of X̂ (i.e., an entry of X̂ᵀ X̂ = D S² Dᵀ).
- How similar are a term and a document?
- The value of an individual cell of X̂. (All three are sketched below.)
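In code (still the toy example), all three similarities come from the scaled term and document vectors:

```python
term_vecs = Tk * sk           # rows of T_k S_k: one vector per term
doc_vecs = Dtk.T * sk         # rows of D_k S_k: one vector per document

term_term = term_vecs @ term_vecs.T    # = X_hat @ X_hat.T
doc_doc = doc_vecs @ doc_vecs.T        # = X_hat.T @ X_hat
term_doc = X_hat                       # cell (i, j): term i vs. doc j
```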
8. Query Retrieval
- As before, treat the query as a short document; make it column 0 of X.
- The first row of the doc-doc similarity matrix X̂ᵀ X̂ then gives the rank of the docs w.r.t. the query. (A cheaper fold-in shortcut is sketched below.)
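Rather than appending the query and recomputing the SVD, the usual shortcut (an assumption here, not stated on the slide) folds the query into the existing k-space as q̂ = qᵀ T_k S_k⁻¹ and ranks by cosine:

```python
q = np.array([1, 0, 1, 0, 0], dtype=float)   # toy query over the 5 terms
q_hat = (q @ Tk) / sk                        # fold-in: q^T T_k S_k^{-1}

# Cosine similarity of the query against each document vector.
scores = (doc_vecs @ q_hat) / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q_hat) + 1e-12)
print(np.argsort(-scores))                   # doc indices, best first
```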
9. LSI Issues
- Requires access to the corpus to compute the SVD.
- How to compute it efficiently for the Web?
- What is the right value of k?
- Can LSI be used for cross-language retrieval?
- Corpus size is limited: roughly one student's reading through high school (Landauer 2002).
10. Other Vector Model: Neural Network
- Basic idea:
- A 3-layer neural net: query terms, document terms, documents.
- Signal propagation based on the classic similarity computation.
- Tune the weights.
11. Neural Network Diagram
[Diagram of the query-term / document-term / document network, from Wilkinson and Hingston, SIGIR 1991]
12. Computing Document Rank
- Weight from query node to document-term node i:
  W_iq = w_iq / sqrt( Σ_i w_iq² )
- Weight from document-term node i to document node j:
  W_ij = w_ij / sqrt( Σ_i w_ij² )
  (see the propagation sketch below)
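A minimal sketch of one forward pass through the three layers (the function name and random tf-idf inputs are assumptions); with these normalized weights the document activations reduce to the classic cosine similarity:

```python
import numpy as np

def rank_docs(w: np.ndarray, w_q: np.ndarray) -> np.ndarray:
    """w[i, j]: weight of term i in doc j; w_q[i]: weight of term i in query."""
    W_q = w_q / np.sqrt(np.sum(w_q ** 2))                    # query -> term edges
    W = w / np.sqrt(np.sum(w ** 2, axis=0, keepdims=True))   # term -> doc edges
    # One pass of signal propagation: each document node sums its inputs.
    return W_q @ W

print(rank_docs(np.random.rand(6, 3), np.random.rand(6)))   # 3 doc scores
```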
13. Probabilistic Models
- Principle: given a user query q and a document d in the collection, estimate the probability that the user will find d relevant. (How?)
- The user rates a retrieved subset.
- The system uses the ratings to refine the subset.
- Over time, the retrieved subset should converge on the relevant set.
14. Computing Similarity I
- sim(d_j, q) = P(R | d_j) / P(R̄ | d_j), the odds that d_j is relevant, where
- P(R | d_j): probability that document d_j is relevant to query q
- P(R̄ | d_j): probability that d_j is non-relevant to query q
- By Bayes' rule, sim(d_j, q) ∝ P(d_j | R) P(R) / ( P(d_j | R̄) P(R̄) ), where
- P(d_j | R): probability of randomly selecting d_j from the set R of relevant documents
- P(R): probability that a randomly selected document is relevant (and symmetrically for the non-relevant set R̄)
15. Computing Similarity II
- P(k_i | R): probability that index term k_i is present in a document randomly selected from R.
- Assumes independence of index terms. (The resulting ranking formula is reconstructed below.)
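Under the independence assumption the odds ratio expands, term by term, into the classic log-odds sum; this is the standard form of the model, reconstructed here since the slide's formula did not survive extraction:

```latex
\mathrm{sim}(d_j, q) \;\propto\; \sum_{i=1}^{t} w_{iq}\, w_{ij}
\left( \log \frac{P(k_i \mid R)}{1 - P(k_i \mid R)}
     + \log \frac{1 - P(k_i \mid \bar{R})}{P(k_i \mid \bar{R})} \right)
```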
16. Initializing Probabilities
- Assume constant probabilities for index terms:
  P(k_i | R) = 0.5
- Assume the distribution of index terms in non-relevant documents matches the overall distribution:
  P(k_i | R̄) = n_i / N
  (n_i = number of documents containing k_i; N = total number of documents)
17. Improving Probabilities
- Assumptions:
- Approximate P(k_i | R) by the fraction of documents retrieved so far that contain index term k_i: P(k_i | R) = V_i / V.
- Approximate P(k_i | R̄) by assuming the not-yet-retrieved documents are non-relevant: P(k_i | R̄) = (n_i - V_i) / (N - V).
  (V = documents retrieved so far; V_i = those containing k_i; see the sketch below)
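A small sketch of this bootstrap; the 0.5 and 1.0 adjustment factors are the usual smoothing added to avoid zero counts, and the function names are made up:

```python
def init_probs(n_i: list[int], N: int):
    p_rel = [0.5] * len(n_i)               # P(k_i | R) = 0.5
    p_non = [n / N for n in n_i]           # P(k_i | non-R) = n_i / N
    return p_rel, p_non

def update_probs(V_i: list[int], V: int, n_i: list[int], N: int):
    # V: docs retrieved so far; V_i: retrieved docs containing term i.
    p_rel = [(v + 0.5) / (V + 1) for v in V_i]                 # ~ V_i / V
    p_non = [(n - v + 0.5) / (N - V + 1) for n, v in zip(n_i, V_i)]
    return p_rel, p_non
```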
18. Classic Probabilistic Model Summary
- Pros
- ranking based on assessed probability
- can be approximated without user intervention
- Cons
- really need user to determine set V
- ignores term frequency
- assumes independence of terms
19. Probabilistic Alternative: Bayesian (Belief) Networks
- A graphical structure to represent the dependence between variables, in which the following holds:
- a set of random variables for the nodes
- a set of directed links
- a conditional probability table for each node, indicating its relationship with its parents
- a directed acyclic graph
20. Belief Network Example
The classic burglary-alarm network, from Russell & Norvig: Burglary (B) and Earthquake (E) are parents of Alarm (A); Alarm is the parent of JohnCalls (J) and MaryCalls (M).

P(B) = .001    P(E) = .002

B E | P(A)
T T | .95
T F | .94
F T | .29
F F | .001

A | P(J)    A | P(M)
T | .90     T | .70
F | .05     F | .01
21. Belief Network Example (cont.)
(Same network and CPTs as on the previous slide.)
What is the probability of a false notification: the alarm sounded and both people call, but there was no burglary or earthquake? (Worked out below.)
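In a belief network the joint probability factors along the parent links, so this query is a product of five CPT entries; the arithmetic is worked out here (the final number does not appear on the slide):

```latex
P(J \wedge M \wedge A \wedge \neg B \wedge \neg E)
  = P(J \mid A)\, P(M \mid A)\, P(A \mid \neg B, \neg E)\, P(\neg B)\, P(\neg E)
  = 0.90 \times 0.70 \times 0.001 \times 0.999 \times 0.998
  \approx 0.00063
```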
22. Inference Networks for IR
- Random variables are associated with documents, index terms, and queries.
- An edge from a document node to a term node increases the belief in that term.
23. Computing Rank in Inference Networks for IR
- q is the keyword query, q1 is a Boolean query, and I is the information need.
- The rank of document d_j is computed as P(q ∧ d_j).
24. Where Do Probabilities Come From? (Boolean Model)
- Uniform priors on documents.
- Only the terms in the document are active.
- The query is matched to keywords à la the Boolean model. (A scoring sketch follows below.)
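A toy sketch of these three assumptions; conjunctive matching stands in for a full Boolean expression evaluator, and all names here are made up:

```python
def rank(docs: list[set[str]], query_terms: set[str]) -> list[float]:
    N = len(docs)                     # uniform prior: P(d_j) = 1/N
    scores = []
    for terms in docs:
        # Only terms present in the doc are active; the query node fires
        # iff its (here: conjunctive) expression is satisfied.
        satisfied = query_terms <= terms
        scores.append(1.0 / N if satisfied else 0.0)
    return scores

print(rank([{"ir", "svd"}, {"ir"}, {"nets"}], {"ir"}))   # [1/3, 1/3, 0.0]
```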
25. Belief Network Formulation
- Different network topology.
- Does not consider each document individually.
- Adopts a set-theoretic view.