Title: Inforadar @ UPRM
1 Inforadar @ UPRM
- Computing Systems Research Group
- Prof. Bienvenido Vélez-Rivera, Leader
- José Enseñat, Graduate student
- Juan Torres, Undergraduate student
- University of Puerto Rico
- Mayagüez Campus
- PRECISE Project
- Mayagüez, October 07, 2000
2 Problem Statement
3 Query-based Web Search
short query
large result-set
- BUT
- queries are hard to write
- sequential access to the result set is inadequate
4 Proposed Solution
5 Proposed Solution
Inforadars: Interactive query hierarchies
seed query
selected query
result set for selected query
dynamic categories are queries
6 Inforadars: Interactive query hierarchies
colors indicate node status
level 2 categories
icons mark documents read or in-basket
7 Theoretical Formulation
8 Coverage-based Category Evaluation Metric
Goal: Avoid redundancy and information loss
[Figure: three diagrams of a seed query q and categories q1, q2: (a) low information loss, high redundancy; (b) high information loss, low redundancy; (c) better]
Ideal: Select categories that best approximate a partition
But: This is an NP-complete problem
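The two quantities named on this slide can be made concrete with a small sketch. The definitions below are assumptions chosen for illustration, not the metric from the talk: information loss is taken as the fraction of the seed query's documents that no category covers, and redundancy as the fraction of covered documents that appear in more than one category. The function name `coverage_metrics` and the document-id sets are hypothetical.

```python
def coverage_metrics(seed_docs, categories):
    """Return (information_loss, redundancy) for a set of categories.

    seed_docs  -- set of document ids matching the seed query q
    categories -- dict mapping a category query to its set of document ids

    Both measures are illustrative assumptions, not the talk's definitions.
    """
    seed_docs = set(seed_docs)
    if not seed_docs:
        return 0.0, 0.0

    # Count how many categories contain each seed document.
    counts = {}
    for docs in categories.values():
        for doc in set(docs) & seed_docs:
            counts[doc] = counts.get(doc, 0) + 1

    covered = len(counts)
    # Information loss: share of seed documents not reachable via any category.
    loss = 1 - covered / len(seed_docs)
    # Redundancy: share of covered documents that sit in two or more categories.
    duplicated = sum(1 for c in counts.values() if c > 1)
    redundancy = duplicated / covered if covered else 0.0
    return loss, redundancy


seed = {1, 2, 3, 4, 5, 6}
overlapping = {"q1": {1, 2, 3, 4, 5}, "q2": {2, 3, 4, 5, 6}}  # like case (a)
sparse = {"q1": {1}, "q2": {6}}                               # like case (b)
partition = {"q1": {1, 2, 3}, "q2": {4, 5, 6}}                # like case (c)
```

Under these assumed definitions, a true partition of the seed result set scores zero on both measures, which is why it serves as the ideal.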
9 CTS: A greedy approximation algorithm for category selection
Goal: Pick the best term among t1, t2, t3
C: set of documents covered by previously selected terms
[Figure: result set D(q) with candidate sets D(q t1), D(q t2), D(q t3); callouts: "low coverage", "high redundancy", "winning category!"]
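The greedy idea on this slide can be sketched as a loop that repeatedly picks the term whose result set D(q t) adds the most documents outside C while overlapping C the least. This is a minimal sketch under stated assumptions, not the CTS algorithm itself: the scoring rule (new documents minus a penalty for already covered ones), the `overlap_penalty` parameter, and the names `term_score` and `greedy_select` are all hypothetical.

```python
def term_score(docs, covered, overlap_penalty):
    """Assumed score: reward new coverage, penalize overlap with C."""
    return len(docs - covered) - overlap_penalty * len(docs & covered)


def greedy_select(candidates, k, overlap_penalty=1.0):
    """Greedily pick up to k terms.

    candidates -- dict mapping a term t to D(q t), the set of document ids
                  matching the seed query q refined with t
    Returns (selected_terms, covered_documents).
    """
    covered = set()   # C: documents covered by previously selected terms
    selected = []
    remaining = {t: set(d) for t, d in candidates.items()}

    while remaining and len(selected) < k:
        # Sort first so ties are broken alphabetically and runs are repeatable.
        best = max(
            sorted(remaining),
            key=lambda t: term_score(remaining[t], covered, overlap_penalty),
        )
        if term_score(remaining[best], covered, overlap_penalty) <= 0:
            break  # no remaining term adds more than it repeats
        selected.append(best)
        covered |= remaining.pop(best)

    return selected, covered


candidates = {
    "t1": {1, 2, 3, 4},
    "t2": {3, 4, 5},
    "t3": {6, 7, 8},
}
selected, covered = greedy_select(candidates, k=3)
```

In this toy input, once t1 is chosen, t2 mostly repeats documents already in C, so t3 is preferred and t2 is never selected. A greedy pass like this trades optimality for speed, which is the usual motivation when the exact selection problem is NP-complete.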
10 Experimental Plan
- Implement Inforadar site indexing ALL website data at UPRM
- Make Inforadar the official search engine for the UPRM web site
- Conduct a usability study
- Analyze real user feedback
- Incorporate feedback into an improved design
11 References
- Query Lookahead for Query-Based Document Categorization
- Ph.D. Thesis
- Massachusetts Institute of Technology
- September 1999
- Fast and Effective Query Refinement
- Bienvenido Vélez, Ron Weiss, Mark Sheldon and David K. Gifford
- ACM Conference on Research and Development in Information Retrieval (SIGIR 97)
- HyPursuit: A network search engine exploiting content-link similarity
- R. Weiss, B. Vélez, M. Sheldon, C. Namprempre, P. Szilagyi and D. K. Gifford
- ACM Conference on Hypertext (HyperText 96)