Seesaw: Personalized Web Search

1
Seesaw: Personalized Web Search
  • Jaime Teevan, MIT
  • with Susan T. Dumais
  • and Eric Horvitz, MSR

2
(No Transcript)
3
Personalization Algorithms
  • Query expansion
  • Standard IR

Query
Server
Document
Client
User
4
Personalization Algorithms
  • Query expansion
  • Standard IR

Query
Server
Document
Client
User
vs. Result re-ranking
5
Result Re-Ranking
  • Ensures privacy
  • Good evaluation framework
  • Can look at rich user profile
  • Look at lightweight user models
  • Collected on server side
  • Sent as query expansion

6
Seesaw Search Engine
Seesaw
dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1
7
Seesaw Search Engine
query
dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1
8
Seesaw Search Engine
query
forest hiking walking gorp
dog cat monkey banana food
baby infant child boy girl
csail mit artificial research robot
baby infant child boy girl
web search retrieval ir hunt
dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1
9
Seesaw Search Engine
query
Search results page
6.0
1.6
0.2
2.7
0.2
1.3
dog 1, cat 10, india 2, mit 4, search 93, amherst 12, vegas 1
web search retrieval ir hunt
1.3
10
Calculating a Document's Score
  • Based on standard tf.idf

web search retrieval ir hunt
1.3
11
Calculating a Document's Score
  • Based on standard tf.idf

$$ w_i = \log \frac{(r_i + 0.5)(N - n_i - R + r_i + 0.5)}{(n_i - r_i + 0.5)(R - r_i + 0.5)} $$
  • User as relevance feedback
  • Stuff I've Seen index
  • More is better

0.1 + 0.5 + 0.05 + 0.35 + 0.3 = 1.3
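
The scoring idea on this slide can be sketched in Python. This is a minimal illustration, not the Seesaw implementation. It assumes the usual relevance-feedback reading of the symbols: N is the number of documents in the corpus, n_i the number of those containing term i, R the number of documents in the user's index, and r_i the number of those containing term i. It also assumes that a document's score is the sum of tf_i * w_i over its terms, and that the user's documents are counted within the corpus so the argument of the logarithm stays positive. The function names and the mapping of the five example weights to terms are hypothetical.

```python
import math


def term_weight(N, n_i, R, r_i):
    """Relevance-feedback weight w_i for one term (symbols as assumed above)."""
    numerator = (r_i + 0.5) * (N - n_i - R + r_i + 0.5)
    denominator = (n_i - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(numerator / denominator)


def document_score(term_freqs, weights):
    """Assumed scoring rule: sum of tf_i * w_i over the document's terms."""
    return sum(tf * weights.get(term, 0.0) for term, tf in term_freqs.items())


def rerank(results, weights):
    """Sort (title, term_freqs) pairs by descending personalized score."""
    return sorted(results, key=lambda r: document_score(r[1], weights), reverse=True)


# Hypothetical assignment of the slide's five weights to the example terms.
weights = {"web": 0.1, "search": 0.5, "retrieval": 0.05, "ir": 0.35, "hunt": 0.3}
doc = {"web": 1, "search": 1, "retrieval": 1, "ir": 1, "hunt": 1}
score = document_score(doc, weights)  # approximately 1.3, matching the slide
```

A term that appears often in the user's index (large r_i) receives a larger weight than one that never appears there, which is what pushes results resembling the user's own content toward the top.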
12
Finding the Score Efficiently
  • Corpus representation (N, ni)
  • Web statistics
  • Result set
  • Document representation
  • Download document
  • Use result set snippet
  • Efficiency hacks generally OK!
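
One of the shortcuts listed above, approximating the corpus statistics (N, n_i) from the result set and representing each document by its snippet, might look like the following. This is a sketch under stated assumptions: whitespace tokenization and lowercasing are simplifications chosen here, and the function name is hypothetical.

```python
from collections import Counter


def corpus_stats_from_snippets(snippets):
    """Approximate (N, n_i) using only the snippets in the result set.

    N is the number of results; n_i is the number of snippets that
    contain term i at least once.
    """
    N = len(snippets)
    doc_freq = Counter()
    for snippet in snippets:
        # Count each term once per snippet: document frequency, not term frequency.
        doc_freq.update(set(snippet.lower().split()))
    return N, doc_freq


N, doc_freq = corpus_stats_from_snippets(
    ["web search retrieval", "Personalized Web search", "dog cat"]
)
```

No extra downloads or Web-scale statistics are needed with this shortcut, at the cost of noisier estimates.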

13
Evaluating Personalized Search
  • 15 evaluators
  • Evaluate 50 results for a query
  • Highly relevant
  • Relevant
  • Irrelevant
  • Measure algorithm quality
  • DCG(i)

$$ DCG(i) = \begin{cases} Gain(i), & \text{if } i = 1 \\ DCG(i-1) + Gain(i)/\log(i), & \text{otherwise} \end{cases} $$
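
The DCG recurrence can be written as a short loop. In this sketch the gain values (2, 1, 0 for the three judgment levels) and the base-2 logarithm are assumptions; the slide states neither.

```python
import math

# Assumed gain per judgment level; the slide does not give the values.
GAIN = {"highly relevant": 2, "relevant": 1, "irrelevant": 0}


def dcg(judgments):
    """Discounted cumulative gain over a ranked list of judgment labels."""
    total = 0.0
    for i, label in enumerate(judgments, start=1):
        gain = GAIN[label]
        # Rank 1 is undiscounted; later ranks are divided by log2(rank).
        total += gain if i == 1 else gain / math.log2(i)
    return total


good_order = dcg(["highly relevant", "irrelevant", "relevant"])
bad_order = dcg(["relevant", "irrelevant", "highly relevant"])
```

Placing the highly relevant result first yields the larger value, so a ranking algorithm that moves relevant results up scores higher.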
14
Evaluating Personalized Search
  • Query selection
  • Chose from 10 pre-selected queries
  • Previously issued query

Pre-selected: cancer, Microsoft, traffic, bison frise, Red Sox, airlines, Las Vegas, rice, McDonalds
Mary
Joe
Total: 137
53 pre-selected (2-9/query)
15
Seesaw Improves Text Retrieval
  • Random
  • Relevance Feedback
  • Seesaw

16
Text Features Not Enough
17
Take Advantage of Web Ranking
18
Further Exploration
  • Explore larger parameter space
  • Learn parameters
  • Based on individual
  • Based on query
  • Based on results
  • Give user control?

19
Making Seesaw Practical
  • Learn most about personalization by deploying a
    system
  • Best algorithm reasonably efficient
  • Merging server and client
  • Query expansion
  • Get more relevant results in the set to be
    re-ranked
  • Design snippets for personalization

20
User Interface Issues
  • Make personalization transparent
  • Give user control over personalization
  • Slider between Web and personalized results
  • Allows for background computation
  • Creates problem with re-finding
  • Results change as user model changes
  • Thesis research: Re:Search Engine
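
One possible reading of the slider between Web and personalized results is a linear blend of the two orderings. The sketch below is purely illustrative: the function name, the alpha parameter, and both normalizations are assumptions made here, and personalized scores are assumed to be non-negative.

```python
def blend_rankings(results, alpha):
    """Order results by a mix of Web rank and personalized score.

    results: list of (title, web_rank, personal_score), web_rank starting at 1.
    alpha: slider position in [0, 1]; 0 keeps the Web order, 1 uses only
    the personalized score.
    """
    n = len(results)
    max_score = max((score for _, _, score in results), default=0.0) or 1.0

    def combined(item):
        _, web_rank, score = item
        web_component = (n - web_rank + 1) / n    # rank 1 -> 1.0, rank n -> 1/n
        personal_component = score / max_score    # best score -> 1.0
        return (1 - alpha) * web_component + alpha * personal_component

    return [title for title, _, _ in sorted(results, key=combined, reverse=True)]


results = [("a", 1, 0.2), ("b", 2, 6.0), ("c", 3, 1.3)]
web_order = blend_rankings(results, 0.0)       # ["a", "b", "c"]
personal_order = blend_rankings(results, 1.0)  # ["b", "c", "a"]
```

Intermediate slider positions interpolate between the two orders, which also makes the effect of personalization visible to the user.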

21
Thank you!
  • teevan_at_csail.mit.edu