Personalizing Information Search: Understanding Users and their Interests - PowerPoint PPT Presentation

1
Personalizing Information Search: Understanding
Users and their Interests
Diane Kelly, School of Information and Library
Science, University of North Carolina
dianek_at_email.unc.edu
IPAM 04 October 2007
2
Background: IR and TREC
  • What is IR?
  • Who works on problems in IR?
  • Where can I find the most recent work in IR?
  • A TREC primer

3
Background: Personalization
  • Personalization is a process where retrieval is
    customized to the individual (not
    one-size-fits-all searching)
  • Hans Peter Luhn was one of the first people to
    personalize IR through selective dissemination of
    information (SDI) (now called filtering)
  • Profiles and user models are often employed to
    house data about users and represent their
    interests (a minimal profile sketch follows this
    list)
  • Figuring out how to populate and maintain the
    profile or user model is a hard problem

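The slides do not show a concrete profile representation; as a minimal sketch (an assumption, not from the talk), a profile can be kept as a weighted term vector that is updated from the documents a user views:

```python
from collections import Counter

def update_profile(profile: Counter, document_terms: list[str], weight: float = 1.0) -> None:
    """Add evidence from one viewed document to a term-weight user profile."""
    for term in document_terms:
        profile[term] += weight

# Hypothetical usage: accumulate a profile from two viewed documents.
profile = Counter()
update_profile(profile, ["retrieval", "personalization", "feedback"])
update_profile(profile, ["retrieval", "evaluation"], weight=0.5)
print(profile.most_common(3))  # strongest interests so far
```
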
4
Major Approaches
  • Explicit Feedback
  • Implicit Feedback
  • User's desktop

5
Explicit Feedback
6
Explicit Feedback
  • Term relevance feedback is one of the most widely
    used and studied explicit feedback techniques
    (see the Rocchio sketch after this list)
  • Typical relevance feedback scenarios (examples)
  • Systems-centered research has found that
    relevance feedback works (including
    pseudo-relevance feedback)
  • User-centered research has found mixed results
    about its effectiveness

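Term relevance feedback is often formalized with the Rocchio update, which moves the query vector toward judged-relevant documents and away from non-relevant ones. The sketch below is a generic illustration; the alpha, beta, and gamma values are conventional defaults, not parameters taken from any system discussed in the talk.

```python
import numpy as np

def rocchio(query_vec, relevant_docs, nonrelevant_docs, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance feedback: re-weight the query using judged documents."""
    q = alpha * np.asarray(query_vec, dtype=float)
    if relevant_docs:
        q += beta * np.mean(relevant_docs, axis=0)
    if nonrelevant_docs:
        q -= gamma * np.mean(nonrelevant_docs, axis=0)
    return np.maximum(q, 0.0)  # keep term weights non-negative

# Hypothetical 4-term vocabulary: the relevant judgments pull in two new terms.
query = [1, 0, 0, 0]
relevant = [np.array([1, 1, 0, 0]), np.array([1, 0, 1, 0])]
nonrelevant = [np.array([0, 0, 0, 1])]
print(rocchio(query, relevant, nonrelevant))
```
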
7
Explicit Feedback
  • Terms are not presented in context so it may be
    hard for users to understand how they can help
  • Quality of terms suggested is not always good
  • Users don't have the additional cognitive
    resources to engage in explicit feedback
  • Users are too lazy to provide feedback
  • Questions about the sustainability of explicit
    feedback for long-term modeling

8
Examples
9
Examples
BACK
10
Query Elicitation Study
  • Users typically pose very short queries
  • This may be because
  • users have a difficult time articulating their
    information needs
  • traditional search interfaces encourage short
    queries
  • Polyrepresentative extraction of information
    needs suggests obtaining multiple representations
    of a single information need (reference
    interview), as sketched after this list

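One simple way to act on multiple representations of a need is to concatenate the elicited responses into a single long query; this is only a hedged illustration (the function, stopword list, and responses below are invented), not the query construction actually used in the study.

```python
# Hypothetical stopword list and elicitation responses for illustration.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "i", "on", "about"}

def build_long_query(responses: dict[str, str]) -> str:
    """Concatenate elicitation responses, dropping stopwords, to form an expanded query."""
    terms = []
    for text in responses.values():
        terms.extend(word for word in text.lower().split() if word not in STOPWORDS)
    return " ".join(terms)

responses = {
    "already_know": "prior work on relevance feedback and user profiles",
    "why_know": "to design a personalized search interface",
    "keywords": "personalization implicit feedback",
}
print(build_long_query(responses))
```
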
11
Motivation
  • Research has demonstrated that a positive
    relationship exists between query length and
    performance in batch-mode experimental IR
  • Query expansion is an effective technique for
    increasing query length, but research has
    demonstrated that users have some difficulty with
    traditional term relevance feedback features

12
Elicitation Form
[Screenshot of the elicitation form with its three prompts: Already Know, Why Know, Keywords]
13
Results: Number of Terms
[Chart: mean number of terms per elicitation response (Already Know, Why, Keywords); values shown: 16.18, 10.67, 9.33, 2.33; N = 45]
14
Experimental Runs
15
Overall Performance
[Chart: overall retrieval performance; values shown: 0.3685 and 0.2843]
16
Query Length and Performance
y = 0.263 + 0.000265(x), p = .000
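Reading the reconstructed line above as a regression of retrieval performance on query length, a 40-term query would be predicted to score roughly 0.263 + 0.000265 × 40 ≈ 0.274, about 0.011 above the intercept (the 40-term figure is an illustrative value, not one reported on the slide).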
17
Major Findings
  • Users provided lengthy responses to some of the
    questions
  • There were large differences in the length of
    users' responses to each question
  • In most cases responses significantly improved
    retrieval
  • Query length and performance were significantly
    related

18
Implicit Feedback
19
Implicit Feedback
  • What is it?
  • Information about users, their needs and document
    preferences that can be obtained unobtrusively,
    by watching users' interactions and behaviors
    with systems
  • What are some examples? (a logging sketch follows
    this list)
  • Examine: Select, View, Listen, Scroll, Find,
    Query, Cumulative measures
  • Retain: Print, Save, Bookmark, Purchase, Email
  • Reference: Link, Cite
  • Annotate/Create: Mark up, Type, Edit, Organize,
    Label

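As a minimal sketch of how such behaviors might be captured (an assumed structure; the study's actual logging software is not described on this slide), each observed action can be recorded together with its behavior category:

```python
import time
from dataclasses import dataclass, field

# Behavior categories from the slide; the lower-cased action names are illustrative.
CATEGORIES = {
    "examine": {"select", "view", "listen", "scroll", "find", "query"},
    "retain": {"print", "save", "bookmark", "purchase", "email"},
    "reference": {"link", "cite"},
    "annotate_create": {"mark_up", "type", "edit", "organize", "label"},
}

@dataclass
class ImplicitEvent:
    user: str
    url: str
    action: str
    timestamp: float = field(default_factory=time.time)

    @property
    def category(self) -> str:
        """Map the raw action onto one of the behavior categories above."""
        return next((c for c, actions in CATEGORIES.items() if self.action in actions), "other")

# Hypothetical log of two events on the same page.
log = [ImplicitEvent("u1", "http://example.org/doc1", "view"),
       ImplicitEvent("u1", "http://example.org/doc1", "print")]
print([(event.action, event.category) for event in log])
```
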
20
Implicit Feedback
  • Why is it important?
  • It is generally believed that users are unwilling
    to engage in explicit relevance feedback
  • It is unlikely that users can maintain their
    profiles over time
  • Users generate large amounts of data each time
    they engage in online information-seeking
    activities, and the things in which they are
    interested are in this data somewhere

21
Implicit Feedback
  • What do we know about it?
  • There seems to be a positive correlation between
    selection (click-through) and relevance
  • There seems to be a positive correlation between
    display time and relevance (a correlation sketch
    follows this list)
  • What is problematic about it?
  • Much of the research has been based on incomplete
    data and general behavior
  • And has not considered the impact of contextual
    variables such as task and a user's familiarity
    with a topic on behaviors

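A hedged sketch of how such a correlation can be checked once paired observations are available; the display times and ratings below are invented for illustration and imply nothing about the actual findings.

```python
from statistics import correlation  # Pearson correlation, Python 3.10+

# Hypothetical paired observations: display time in seconds and usefulness rating (1-7).
display_times = [12.0, 45.0, 8.0, 90.0, 30.0, 5.0]
usefulness_ratings = [3, 6, 2, 7, 5, 4]

print(round(correlation(display_times, usefulness_ratings), 3))
```
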
22
Implicit Feedback Study
  • To investigate
  • the relationship between behaviors and relevance
  • the relationship between behaviors and context
  • To develop a method for studying and measuring
    behaviors, context and relevance in a natural
    setting, over time

23
Method
  • Approach: naturalistic and longitudinal, but
    some control
  • Subjects/Cases: 7 Ph.D. students
  • Study period: 14 weeks
  • Compensation: new laptops and printers

24
Data Collection
[Diagram: data collected during the study. Context: tasks (endurance, frequency, stage) and topics (persistence, familiarity); Relevance: document usefulness; Behaviors: display time, printing, saving]
25
Protocol
[Timeline: 14-week study with client- and server-side logging throughout; context evaluations and document evaluations administered from Week 1 (START) through Week 13 (END)]
26
(No Transcript)
27
Results: Description of Data
28
Relevance: Usefulness
[Chart: mean (SD) usefulness ratings: 6.1 (2.00), 6.0 (0.80), 5.3 (2.40), 5.3 (2.20), 5.0 (2.40), 4.8 (1.65), 4.6 (0.80)]
29
Relevance: Usefulness
30
Display Time
31
Display Time and Usefulness
32
Display Time and Task
33
Major Findings
  • Behaviors differed for each subject, but in
    general
  • most display times were low
  • most usefulness ratings were high
  • not much printing or saving
  • No direct relationship between display time and
    usefulness

34
Major Findings
  • Main effects for display time and all contextual
    variables
  • Task (5 subjects)
  • Topic (6 subjects)
  • Familiarity (5 subjects)
  • Lower levels of familiarity associated with
    higher display times
  • No clear interaction effects among behaviors,
    context and relevance

35
Personalizing Search
  • Using the display time, task and relevance
    information from the study, we evaluated the
    effectiveness of a set of personalized retrieval
    algorithms
  • Four algorithms for using display time as
    implicit feedback were tested (a thresholding
    sketch follows this list)
  • User
  • Task
  • User + Task
  • General

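A minimal sketch of the thresholding idea, under the assumption (not stated on this slide) that a page is inferred relevant when its display time exceeds a threshold fitted per group; fitting one threshold over all observations corresponds to the General variant, and grouping by user, task, or both gives the other three. The median-based fit and the 30-second default are illustrative choices, not the study's algorithms.

```python
from statistics import median

def fit_thresholds(observations, key):
    """Fit a display-time threshold per group as the median time of pages rated useful."""
    groups = {}
    for obs in observations:
        if obs["useful"]:
            groups.setdefault(key(obs), []).append(obs["display_time"])
    return {group: median(times) for group, times in groups.items()}

def infer_relevant(obs, thresholds, key, default=30.0):
    """Infer relevance when display time meets the group's threshold (hypothetical default)."""
    return obs["display_time"] >= thresholds.get(key(obs), default)

# Hypothetical log entries; keying on (user, task) would give the User + Task variant.
log = [{"user": "u1", "task": "shopping", "display_time": 40.0, "useful": True},
       {"user": "u1", "task": "research", "display_time": 120.0, "useful": True},
       {"user": "u2", "task": "shopping", "display_time": 15.0, "useful": False}]
by_task = fit_thresholds(log, key=lambda o: o["task"])
print(infer_relevant({"user": "u2", "task": "research", "display_time": 90.0},
                     by_task, key=lambda o: o["task"]))
```
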
36
Results
[Chart: MAP by feedback iteration for each algorithm]
37
Major Findings
  • Tailoring display time thresholds based on task
    information improved performance, but doing so
    based on user information did not
  • There was a lot of variability between subjects,
    with the user-centered algorithms performing well
    for some and poorly for others
  • The effectiveness of most of the algorithms
    increased with time (and more data)

38
Some Problems
39
Relevance
  • What are we modeling? Does click = relevance?
  • Relevance is multi-dimensional and dynamic
  • A single measure does not adequately reflect
    relevance
  • Most pages are likely to be rated as useful, even
    if the value or importance of the information
    differs

40
Definition
Recipe
41
Weather Forecast
Information about Rocky Mountain Spotted Fever
42
Paper about Personalization
43
Page Structure
  • Some behaviors are more likely to occur on some
    types of pages
  • A more intelligent modeling function would know
    when and what to observe and expect
  • The structure of pages encourages/inhibits
    certain behaviors
  • Not all pages are equally useful for modeling a
    user's interests

44
What types of behaviors do you expect here?
And here?
45
And here?
And here?
46
The Future
47
Future
  • New interaction styles and systems create new
    opportunities for explicit and implicit feedback
  • Collaborative search features and query
    recommendation
  • Features/Systems that support the entire search
    process (e.g., saving, organizing, etc.)
  • QA systems
  • New types of feedback
  • Negative
  • Physiological

48
Thank You
  • Diane Kelly (dianek_at_email.unc.edu)
  • Web: http://ils.unc.edu/dianek/research.html
  • Collaborators: Nick Belkin, Xin Fu, Vijay Dollu,
    Ryen White

49
TREC: Text REtrieval Conference
  • It's not this

50
What is TREC?
  • TREC is a workshop series sponsored by the
    National Institute of Standards and Technology
    (NIST) and the US Department of Defense.
  • Its purpose is to build infrastructure for
    large-scale evaluation of text retrieval
    technology.
  • TREC collections and evaluation measures are the
    de facto standard for evaluation in IR.
  • TREC comprises different tracks, each of which
    focuses on different issues (e.g., question
    answering, filtering).

51
(No Transcript)
52
TREC Collections
  • Central to each TREC Track is a collection, which
    consists of three major components
  • A corpus of documents (typically newswire)
  • A set of information needs (called topics)
  • A set of relevance judgments.
  • Each Track also adopts particular evaluation
    measures (an AP/MAP sketch follows this list)
  • Precision, Recall, and F-measure
  • Average Precision (AP) and Mean AP (MAP)

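A short sketch of average precision and MAP as they are conventionally computed over ranked output; this is the standard textbook definition, not code from any TREC track.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """AP: mean of precision values at the ranks where relevant documents appear."""
    hits, precisions = 0, []
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant_ids) if relevant_ids else 0.0

def mean_average_precision(runs):
    """MAP: mean of AP over topics; `runs` is a list of (ranking, relevant_set) pairs."""
    return sum(average_precision(ranking, relevant) for ranking, relevant in runs) / len(runs)

# Hypothetical two-topic run: APs are 0.833 and 0.5, so MAP is about 0.667.
runs = [(["d1", "d2", "d3"], {"d1", "d3"}), (["d4", "d5"], {"d5"})]
print(round(mean_average_precision(runs), 3))
```
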
53
Comparison of Measures
54
Learn more about TREC
  • http://trec.nist.gov
  • Voorhees, E. M., & Harman, D. K. (2005). TREC:
    Experiment and Evaluation in Information
    Retrieval. Cambridge, MA: MIT Press.

BACK
55
Example Topic
BACK
56
Learn more about IR
  • ACM SIGIR Conference
  • Sparck Jones, K., & Willett, P. (1997). Readings
    in Information Retrieval. Morgan Kaufmann
    Publishers.
  • Baeza-Yates, R., & Ribeiro-Neto, B. (1999).
    Modern Information Retrieval. New York, NY: ACM
    Press.
  • Grossman, D. A., & Frieder, O. (2004).
    Information Retrieval: Algorithms and Heuristics.
    The Netherlands: Springer.

BACK