Title: Personalizing Information Search: Understanding Users and their Interests
1. Personalizing Information Search: Understanding Users and their Interests
Diane Kelly, School of Information and Library Science, University of North Carolina
dianek_at_email.unc.edu
IPAM, 04 October 2007
2. Background: IR and TREC
- What is IR?
- Who works on problems in IR?
- Where can I find the most recent work in IR?
- A TREC primer
3. Background: Personalization
- Personalization is a process where retrieval is customized to the individual (not one-size-fits-all searching)
- Hans Peter Luhn was one of the first people to personalize IR through selective dissemination of information (SDI), now called filtering
- Profiles and user models are often employed to house data about users and represent their interests
- Figuring out how to populate and maintain the profile or user model is a hard problem
4. Major Approaches
- Explicit Feedback
- Implicit Feedback
- User's desktop
5. Explicit Feedback
6. Explicit Feedback
- Term relevance feedback is one of the most widely used and studied explicit feedback techniques
- Typical relevance feedback scenarios (examples)
- Systems-centered research has found that relevance feedback works (including pseudo-relevance feedback)
- User-centered research has found mixed results about its effectiveness
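As a concrete illustration of how a system can use explicit term relevance feedback, the sketch below applies the classic Rocchio re-weighting formula to a sparse query vector. The parameter values, vectors and example documents are illustrative assumptions, not material from these slides.

```python
from collections import defaultdict

def rocchio(query_vec, relevant_docs, nonrelevant_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance feedback: move the query toward the centroid of
    judged-relevant documents and away from non-relevant ones.
    All vectors are sparse dicts mapping term -> weight."""
    new_query = defaultdict(float)
    for term, w in query_vec.items():
        new_query[term] += alpha * w
    if relevant_docs:
        for doc in relevant_docs:
            for term, w in doc.items():
                new_query[term] += beta * w / len(relevant_docs)
    if nonrelevant_docs:
        for doc in nonrelevant_docs:
            for term, w in doc.items():
                new_query[term] -= gamma * w / len(nonrelevant_docs)
    # Negative weights are usually dropped before the query is re-run.
    return {t: w for t, w in new_query.items() if w > 0}

# Hypothetical example: one relevant and one non-relevant judgment
query = {"personalization": 1.0, "search": 1.0}
rel = [{"personalization": 0.8, "profile": 0.6, "user": 0.5}]
nonrel = [{"search": 0.4, "engine": 0.9}]
print(rocchio(query, rel, nonrel))
```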
7. Explicit Feedback
- Terms are not presented in context, so it may be hard for users to understand how they can help
- Quality of terms suggested is not always good
- Users don't have the additional cognitive resources to engage in explicit feedback
- Users are too lazy to provide feedback
- Questions about the sustainability of explicit feedback for long-term modeling
8. Examples
9. Examples
10. Query Elicitation Study
- Users typically pose very short queries
- This may be because
  - users have a difficult time articulating their information needs
  - traditional search interfaces encourage short queries
- Polyrepresentative extraction of information needs suggests obtaining multiple representations of a single information need (reference interview)
11. Motivation
- Research has demonstrated that a positive relationship exists between query length and performance in batch-mode experimental IR
- Query expansion is an effective technique for increasing query length, but research has demonstrated that users have some difficulty with traditional term relevance feedback features
12. Elicitation Form
[Screenshot of the elicitation form with three open-ended fields: Already Know, Why Know, Keywords]
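A minimal sketch of how responses from a form like this could be folded into a single, longer query for expansion; the field names and the simple concatenate-and-filter scheme are assumptions for illustration, not the study's exact procedure.

```python
def build_expanded_query(already_know, why_know, keywords):
    """Combine free-text answers from an elicitation form into one long
    query string; terms that recur across fields simply appear more often."""
    stopwords = {"the", "a", "an", "of", "and", "to", "i", "about", "is"}
    terms = []
    for text in (keywords, already_know, why_know):
        terms.extend(t.lower().strip(".,;:") for t in text.split())
    return " ".join(t for t in terms if t and t not in stopwords)

# Hypothetical form responses
print(build_expanded_query(
    already_know="I know antibiotics treat bacterial infections",
    why_know="To write a review about antibiotic resistance",
    keywords="antibiotic resistance treatment"))
```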
13. Results: Number of Terms
[Chart: mean number of terms by question (Already Know, Why, Keywords); values shown: 16.18, 10.67, 9.33 and 2.33; N = 45]
14. Experimental Runs
15. Overall Performance
[Chart comparing overall performance of the runs: 0.3685 vs. 0.2843]
16. Query Length and Performance
[Scatter plot with fitted regression line: y = 0.263 + 0.000265(x), p < .001]
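The fitted line above is an ordinary least-squares regression of retrieval performance on query length. A sketch of how such a fit can be produced is below; the (length, average precision) pairs are made-up placeholders, not the study's data.

```python
from scipy.stats import linregress

# Hypothetical (query length in terms, average precision) pairs,
# standing in for the study's per-query observations.
lengths = [3, 8, 15, 22, 30, 41, 55, 70]
avg_precision = [0.26, 0.27, 0.27, 0.28, 0.27, 0.28, 0.29, 0.30]

fit = linregress(lengths, avg_precision)
print(f"AP = {fit.intercept:.3f} + {fit.slope:.6f} * length, p = {fit.pvalue:.4f}")
```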
17. Major Findings
- Users provided lengthy responses to some of the questions
- There were large differences in the length of users' responses to each question
- In most cases responses significantly improved retrieval
- Query length and performance were significantly related
18. Implicit Feedback
19. Implicit Feedback
- What is it?
  - Information about users, their needs and document preferences that can be obtained unobtrusively, by watching users' interactions and behaviors with systems
- What are some examples?
  - Examine: Select, View, Listen, Scroll, Find, Query, Cumulative measures
  - Retain: Print, Save, Bookmark, Purchase, Email
  - Reference: Link, Cite
  - Annotate/Create: Mark up, Type, Edit, Organize, Label
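To make these behavior categories concrete, here is a small sketch of a client-side event record a logger might emit; the field names and the Examine/Retain/Reference/Annotate grouping follow the list above, but the exact schema is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class BehaviorEvent:
    """One logged implicit-feedback event for a user and a document."""
    user_id: str
    url: str
    behavior: str                  # e.g. "view", "scroll", "print", "save", "cite"
    category: str                  # "examine", "retain", "reference", "annotate"
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    display_time_secs: Optional[float] = None  # filled in for examine events

# Hypothetical log entries
log = [
    BehaviorEvent("u01", "http://example.org/paper.pdf", "view", "examine",
                  display_time_secs=42.5),
    BehaviorEvent("u01", "http://example.org/paper.pdf", "print", "retain"),
]
total_display = sum(e.display_time_secs or 0.0 for e in log if e.category == "examine")
print(f"{len(log)} events, {total_display:.1f}s total display time")
```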
20. Implicit Feedback
- Why is it important?
  - It is generally believed that users are unwilling to engage in explicit relevance feedback
  - It is unlikely that users can maintain their profiles over time
  - Users generate large amounts of data each time they engage in online information-seeking activities, and the things in which they are interested are somewhere in these data
21. Implicit Feedback
- What do we know about it?
  - There seems to be a positive correlation between selection (click-through) and relevance
  - There seems to be a positive correlation between display time and relevance
- What is problematic about it?
  - Much of the research has been based on incomplete data and general behavior
  - It has not considered the impact on behaviors of contextual variables such as task and a user's familiarity with a topic
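The associations referred to above are typically checked with a rank correlation between an observed behavior and the user's relevance rating for the same page; a minimal sketch follows, with placeholder display times and 1-7 usefulness ratings.

```python
from scipy.stats import spearmanr

# Hypothetical per-page observations: display time (seconds) and the
# usefulness rating (1-7) the user later assigned to the same page.
display_times = [5, 12, 48, 7, 95, 30, 3, 60]
usefulness = [2, 4, 6, 3, 7, 5, 1, 6]

rho, p = spearmanr(display_times, usefulness)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```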
22. Implicit Feedback Study
- To investigate
  - the relationship between behaviors and relevance
  - the relationship between behaviors and context
- To develop a method for studying and measuring behaviors, context and relevance in a natural setting, over time
23. Method
- Approach: naturalistic and longitudinal, but with some control
- Subjects/Cases: 7 Ph.D. students
- Study period: 14 weeks
- Compensation: new laptops and printers
24. Data Collection
- Context
  - Tasks: Endurance, Frequency, Stage, Persistence
  - Topics: Familiarity
- Relevance
  - Document: Usefulness
- Behaviors
  - Display Time, Printing, Saving
25. Protocol
[Timeline: client- and server-side logging ran continuously across the 14-week study period; context evaluations and document evaluations were administered near the start (Week 1) and again near the end (Week 13)]
27. Results: Description of Data
28. Relevance: Usefulness
[Chart: mean (SD) usefulness ratings for the seven subjects: 6.1 (2.00), 6.0 (0.80), 5.3 (2.40), 5.3 (2.20), 5.0 (2.40), 4.8 (1.65), 4.6 (0.80)]
29. Relevance: Usefulness
30. Display Time
31. Display Time and Usefulness
32. Display Time and Task
33. Major Findings
- Behaviors differed for each subject, but in general
  - most display times were low
  - most usefulness ratings were high
  - not much printing or saving
- No direct relationship between display time and usefulness
34. Major Findings
- Main effects for display time and all contextual variables
  - Task (5 subjects)
  - Topic (6 subjects)
  - Familiarity (5 subjects)
  - Lower levels of familiarity associated with higher display times
- No clear interaction effects among behaviors, context and relevance
35. Personalizing Search
- Using the display time, task and relevance information from the study, we evaluated the effectiveness of a set of personalized retrieval algorithms
- Four algorithms for using display time as implicit feedback were tested
  - User
  - Task
  - User + Task
  - General
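A sketch of the thresholding idea behind these four variants: a page view is treated as implicit positive feedback when its display time exceeds a threshold, and the threshold is computed over different groupings of the logged data (all users, one user, one task, or one user-task pair). The median-based threshold and the data layout here are illustrative assumptions, not the algorithms' exact definitions.

```python
import statistics
from typing import Iterable

def display_time_threshold(times: Iterable[float]) -> float:
    """One simple threshold choice: the median display time observed
    in the group (general, per-user, per-task, or per user+task)."""
    return statistics.median(times)

def implicitly_relevant(display_time, observations, key):
    """Treat a page view as implicit positive feedback when its display
    time exceeds the threshold computed over observations[key]."""
    return display_time > display_time_threshold(observations[key])

# Hypothetical logged display times (seconds), grouped four ways
general = {"all": [5, 9, 14, 30, 52, 70]}
per_user = {"u01": [4, 6, 9, 12], "u02": [20, 45, 60]}
per_task = {"writing": [30, 40, 55], "browsing": [3, 5, 8]}
per_user_task = {("u01", "writing"): [25, 35, 50]}

print(implicitly_relevant(18, general, "all"))                      # General
print(implicitly_relevant(18, per_user, "u01"))                     # User
print(implicitly_relevant(18, per_task, "writing"))                 # Task
print(implicitly_relevant(18, per_user_task, ("u01", "writing")))   # User + Task
```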
36. Results
[Chart: MAP (y-axis) by iteration (x-axis)]
37. Major Findings
- Tailoring display time thresholds based on task information improved performance, but doing so based on user information did not
- There was a lot of variability between subjects, with the user-centered algorithms performing well for some and poorly for others
- The effectiveness of most of the algorithms increased with time (and more data)
38. Some Problems
39. Relevance
- What are we modeling? Does click = relevance?
- Relevance is multi-dimensional and dynamic
- A single measure does not adequately reflect relevance
- Most pages are likely to be rated as useful, even if the value or importance of the information differs
40. Definition
Recipe
41. Weather Forecast
Information about Rocky Mountain Spotted Fever
42. Paper about Personalization
43. Page Structure
- Some behaviors are more likely to occur on some types of pages
- A more intelligent modeling function would know when and what to observe and expect
- The structure of pages encourages/inhibits certain behaviors
- Not all pages are equally useful for modeling a user's interests
44. What types of behaviors do you expect here?
And here?
45. And here?
And here?
46. The Future
47. Future
- New interaction styles and systems create new opportunities for explicit and implicit feedback
  - Collaborative search features and query recommendation
  - Features/systems that support the entire search process (e.g., saving, organizing, etc.)
  - QA systems
- New types of feedback
  - Negative
  - Physiological
48. Thank You
- Diane Kelly (dianek_at_email.unc.edu)
- Web: http://ils.unc.edu/dianek/research.html
- Collaborators: Nick Belkin, Xin Fu, Vijay Dollu, Ryen White
49. TREC: Text REtrieval Conference
50. What is TREC?
- TREC is a workshop series sponsored by the National Institute of Standards and Technology (NIST) and the US Department of Defense.
- Its purpose is to build infrastructure for large-scale evaluation of text retrieval technology.
- TREC collections and evaluation measures are the de facto standard for evaluation in IR.
- TREC comprises different tracks, each of which focuses on different issues (e.g., question answering, filtering).
52. TREC Collections
- Central to each TREC Track is a collection, which consists of three major components
  - A corpus of documents (typically newswire)
  - A set of information needs (called topics)
  - A set of relevance judgments
- Each Track also adopts particular evaluation measures
  - Precision and Recall; F-measure
  - Average Precision (AP) and Mean AP (MAP)
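For reference, the two rank-based measures named above can be computed as in the sketch below; the ranked lists and relevance judgment sets are made-up placeholders.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """AP: mean of the precision values at the rank of each relevant
    document retrieved; unretrieved relevant documents contribute zero."""
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """MAP: the mean of AP over all topics in a run."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical two-topic run: (ranked results, set of relevant doc ids)
runs = [
    (["d3", "d1", "d7", "d2"], {"d1", "d2"}),   # topic 1
    (["d5", "d9", "d4"], {"d9"}),               # topic 2
]
print(f"MAP = {mean_average_precision(runs):.4f}")
```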
53. Comparison of Measures
54. Learn more about TREC
- http://trec.nist.gov
- Voorhees, E. M., & Harman, D. K. (2005). TREC: Experiment and Evaluation in Information Retrieval. Cambridge, MA: MIT Press.
55. Example Topic
56. Learn more about IR
- ACM SIGIR Conference
- Sparck Jones, K., & Willett, P. (1997). Readings in Information Retrieval. Morgan Kaufmann Publishers.
- Baeza-Yates, R., & Ribeiro-Neto, B. (1999). Modern Information Retrieval. New York, NY: ACM Press.
- Grossman, D. A., & Frieder, O. (2004). Information Retrieval: Algorithms and Heuristics. The Netherlands: Springer.