Title: Dealing with heterogeneity in profiles for Personalized Information Retrieval
1. Dealing with heterogeneity in profiles for Personalized Information Retrieval
- Pavel Serdyukov
- Twente University
2. Outline
- Personalization basics
- The need for dynamic preferences
- Context-aware user profiles
- Methods
- Summary
3. Personalization paradigm
[Diagram labels: Faceless user, Query, A, B, C]
4. Profiles
- Personality is expressed in the profile
  - Contains the order in which specific features are liked or disliked
- Implicitly built
  - Most likely quantitative (e.g. user language model)
  - Music Style <Brit-pop 0.13, Techno 0.21>, Country of Origin <France 0.1, USA 0.4>
- Explicitly built
  - Most likely qualitative: "I like A better than B"
  - POS/(Music Style, Techno Brit-pop)
  - POS/(Country of Origin, USA France)
  - W. Kießling, VLDB 2002
5. Profile usage
- Content-based
  - E.g. using cross-entropy of the retrieved Object and Profile Language Models
- Collaborative
  - Use preferences from similar profiles
  - E.g. last.fm
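The content-based option can be illustrated with a minimal sketch. It assumes unigram language models with additive smoothing and ranks objects by the cross-entropy H(profile, object), where lower means a better match. The direction of the cross-entropy, the smoothing parameter `mu`, and all names and toy data (`language_model`, `track_a`, `track_b`) are assumptions made for illustration, not details from the slides.

```python
import math
from collections import Counter

def language_model(tokens, vocab, mu=1.0):
    """Unigram model with add-mu smoothing, so no term gets zero probability."""
    counts = Counter(tokens)
    total = len(tokens) + mu * len(vocab)
    return {t: (counts[t] + mu) / total for t in vocab}

def cross_entropy(profile_lm, object_lm):
    """H(profile, object) = -sum_t P(t | profile) * log P(t | object)."""
    return -sum(p * math.log(object_lm[t]) for t, p in profile_lm.items())

# Hypothetical toy data: attribute values the user consumed, and two candidate objects.
profile_tokens = ["techno", "techno", "usa", "brit-pop"]
objects = {
    "track_a": ["techno", "usa"],
    "track_b": ["brit-pop", "france"],
}

vocab = set(profile_tokens) | {t for toks in objects.values() for t in toks}
profile_lm = language_model(profile_tokens, vocab)

# Rank objects from best to worst match (ascending cross-entropy).
ranking = sorted(
    objects,
    key=lambda name: cross_entropy(profile_lm, language_model(objects[name], vocab)),
)
print(ranking)
```

Smoothing matters here: without it, a single attribute value missing from an object would make the logarithm undefined.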
6. Heterogeneity in profiles
- User preferences are not necessarily static
- For most domains
  - Multimedia search: music, movies, TV
  - Text search
  - Product search (u-commerce)
- Preferences should be situational!
  - and hence context-aware
7. Example: Music preferences
- The number of situations of multimedia search and consumption has increased over the last decade
  - 50% of all personal activities have a soundtrack!
- Situational context plays a great role in music preferences (A. North, Music Perception, 2004)
8. Evolution of context-awareness
- Low-level context
  - Spatial: location, proximity, speed, body position
  - Temporal: daytime, weekday
  - Physical: weather, temperature, light, humidity, noise
  - Personal: heart beat, blood pressure
- High-level context
  - Activity, social intercourse, mood
[Figure labels: hands washing, web surfing, whiteboard drawing; axis: time]
9. Activity recognition
- using only location, time and duration
- using cameras, microphones
- devices set on
- RFIDs of objects involved!
[Figure labels: RFID tags, RFID reader]
D. Patterson, D. Fox, H. Kautz, M. Philipose. Fine-Grained Activity Recognition by Aggregating Abstract Object Usage. 2005
10. Social awareness
- Vicinity of people is an important context
  - "When I am with girls I prefer jazz"
- Through location recognition, or
- mobile interconnections
  - Nokia Sensor, 10 meter awareness by means of Bluetooth technology
11. Context-aware profiles (1)
- The goal is to find a context-aware language model of user preferences
- Data is a user context history, consisting of (Context, Object) pairs
- Context and Object are vectors of attributes
- Clustering is the principal approach
  - Hard clustering using object similarities
  - Soft clustering using similarities of pairs
12. Context-aware profiles (1)
- Hard clustering based algorithm
  - Cluster objects into K clusters using some object similarity function
  - Use context variables to describe clusters
  - Classify a new situation, characterized by its Context, using non-discriminative classification
  - Get the new language model
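The hard-clustering steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: k-means stands in for "some similarity function", a smoothed naive Bayes model describes clusters by context, and "non-discriminative classification" is read as keeping a probability for every cluster rather than picking one. All names (`kmeans`, `describe_clusters`, `cluster_posterior`) and the toy history are hypothetical.

```python
import math
from collections import Counter, defaultdict

def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every object to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def describe_clusters(contexts, labels, k, alpha=1.0):
    """Describe clusters by context: prior P(k) and smoothed P(attribute=value | k)."""
    sizes = Counter(labels)
    counts = [defaultdict(Counter) for _ in range(k)]
    values = defaultdict(set)
    for ctx, lab in zip(contexts, labels):
        for attr, val in ctx.items():
            counts[lab][attr][val] += 1
            values[attr].add(val)

    def likelihood(j, attr, val):
        return (counts[j][attr][val] + alpha) / (sizes[j] + alpha * len(values[attr]))

    priors = [sizes[j] / len(labels) for j in range(k)]
    return priors, likelihood

def cluster_posterior(ctx, priors, likelihood):
    """Soft classification: P(k | Context) for every cluster, normalized to sum to 1."""
    scores = [
        prior * math.prod(likelihood(j, a, v) for a, v in ctx.items())
        for j, prior in enumerate(priors)
    ]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical history of (Context, Object) pairs; Object = [jazz weight, techno weight].
history = [
    ({"activity": "work", "daytime": "morning"}, [0.9, 0.1]),
    ({"activity": "party", "daytime": "night"}, [0.1, 0.9]),
    ({"activity": "work", "daytime": "morning"}, [0.8, 0.2]),
    ({"activity": "party", "daytime": "night"}, [0.2, 0.8]),
    ({"activity": "work", "daytime": "night"}, [0.7, 0.3]),
]
contexts = [c for c, _ in history]
objects = [o for _, o in history]

labels, centroids = kmeans(objects, k=2)
priors, likelihood = describe_clusters(contexts, labels, k=2)
post = cluster_posterior({"activity": "party", "daytime": "night"}, priors, likelihood)

# New model for this situation: cluster centroids mixed by P(k | Context).
model = [sum(post[j] * centroids[j][d] for j in range(2)) for d in range(2)]
print(labels, post, model)
```

Mixing the centroids by the cluster probabilities keeps some weight on every cluster, so a rarely seen situation still yields a usable model.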
13. Context-aware profiles (2)
- Soft clustering based algorithm
  - Probabilistic Latent Semantic Analysis principle
  - The choice of objects is driven by latent intentions
  - Likelihood of data
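As a sketch of what the data likelihood may look like, assume the standard symmetric PLSA aspect model, in which a latent intention z generates both the context c and the object o. The symbols z, c, o and the co-occurrence count n(c, o) are notation assumed here, not taken from the slides, and treating each context as a single discrete symbol is a simplification.

```latex
\log \mathcal{L}
  = \sum_{c} \sum_{o} n(c, o) \,
    \log \sum_{z} P(z) \, P(c \mid z) \, P(o \mid z)
```

Under this model, the object distribution for a given context would follow as P(o | c) = sum over z of P(o | z) P(z | c).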
14. EM algorithm
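A minimal EM sketch for soft clustering, assuming the standard PLSA aspect model P(c, o) = sum_z P(z) P(c | z) P(o | z) fitted to co-occurrence counts of (Context, Object) pairs. Contexts and objects are simplified to integer ids. The function names, random initialization, iteration count, and toy counts are assumptions for illustration and not the author's formulation.

```python
import random

def plsa(counts, n_contexts, n_objects, n_latent, iters=100, seed=0):
    """Fit P(z), P(c|z), P(o|z) by EM. counts maps (c, o) -> number of observations."""
    rng = random.Random(seed)

    def rand_dist(n):
        w = [rng.random() + 0.1 for _ in range(n)]
        s = sum(w)
        return [x / s for x in w]

    p_z = rand_dist(n_latent)
    p_c_z = [rand_dist(n_contexts) for _ in range(n_latent)]
    p_o_z = [rand_dist(n_objects) for _ in range(n_latent)]

    for _ in range(iters):
        new_z = [0.0] * n_latent
        new_c = [[0.0] * n_contexts for _ in range(n_latent)]
        new_o = [[0.0] * n_objects for _ in range(n_latent)]
        for (c, o), n in counts.items():
            # E-step: posterior P(z | c, o) over latent intentions for this pair.
            joint = [p_z[z] * p_c_z[z][c] * p_o_z[z][o] for z in range(n_latent)]
            total = sum(joint)
            for z in range(n_latent):
                r = n * joint[z] / total  # expected count credited to intention z
                new_z[z] += r
                new_c[z][c] += r
                new_o[z][o] += r
        # M-step: renormalize the expected counts into probabilities.
        grand = sum(new_z)
        p_z = [x / grand for x in new_z]
        p_c_z = [[x / new_z[z] for x in new_c[z]] for z in range(n_latent)]
        p_o_z = [[x / new_z[z] for x in new_o[z]] for z in range(n_latent)]
    return p_z, p_c_z, p_o_z

def object_model_for_context(c, p_z, p_c_z, p_o_z):
    """P(o | c) = sum_z P(o | z) P(z | c), with P(z | c) proportional to P(z) P(c | z)."""
    weights = [p_z[z] * p_c_z[z][c] for z in range(len(p_z))]
    total = sum(weights)
    return [
        sum(weights[z] / total * p_o_z[z][o] for z in range(len(p_z)))
        for o in range(len(p_o_z[0]))
    ]

# Hypothetical toy counts. Contexts: 0 = work, 1 = party.
# Objects: 0 = jazz, 1 = classical, 2 = techno, 3 = house.
counts = {(0, 0): 5, (0, 1): 4, (0, 2): 1, (1, 2): 6, (1, 3): 5}

p_z, p_c_z, p_o_z = plsa(counts, n_contexts=2, n_objects=4, n_latent=2)
party_model = object_model_for_context(1, p_z, p_c_z, p_o_z)
print(party_model)
```

Each iteration cannot decrease the data likelihood, but EM may stop in a local optimum, so in practice several random restarts are commonly compared.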
15. Important applications
- In mobile multimedia
  - Cell phones already contain MP3 players and a persistent Internet connection
- In desktop search
  - Search for saved/browsed documents
  - Music recommendation again
  - Context (metadata) is
    - Time, Location (in case of laptop or PDA)
    - Running applications
    - Opened documents (emails, web-pages, etc.)
    - Played MP3 files
    - User status in messengers
    - Agenda records
16. Context-enriched dataset
- Context-aware IR is desperate for a publicly available dataset!
- Context acquisition and aggregation infrastructures are imperfect
- Additional user interaction at this stage
  - Initiative fully belongs to the user
  - Initiative partly belongs to the user
  - Initiative fully belongs to the system
17. Summary
- Personalization must be situational
  - Context cannot be ignored then
- Unsupervised learning of context-aware preferences is to be utilized
- Semi-supervised methods are to be studied
  - User feedback
  - Explicit preferences
- Dataset is the primary stumbling block and our short-term goal
18. Questions?