Title: Web Personalization and Recommender Systems
1. Web Personalization and Recommender Systems
Bamshad Mobasher, School of Computing, DePaul University
2. What is Web Personalization?
- Web Personalization: personalizing the browsing experience of a user by dynamically tailoring the look, feel, and content of a Web site to the user's needs and interests
- Related phrases
- mass customization, one-to-one marketing, site customization, target marketing
- Why personalize?
- broaden and deepen customer relationships
- provide continuous relationship marketing to build customer loyalty
- help automate the process of proactively marketing products to customers ("lights-out marketing")
- cross-sell/up-sell products
- provide the ability to measure customer behavior and track how well customers are responding to marketing efforts
3. Personalization vs. Customization
- It's a question of who controls the user's browsing experience
- Customization
- the user controls and customizes the site or the product based on his/her preferences
- usually manual, but sometimes semi-automatic based on a given user profile
- Personalization
- done automatically based on the user's actions, the user's profile, and (possibly) the profiles of other users with similar profiles
6. Challenges and Pitfalls
- Technical Challenges
- data collection and data preprocessing
- discovering actionable knowledge from the data
- which personalization algorithms to use
- Implementation/Deployment Challenges
- what to personalize
- when to personalize
- degree of personalization or customization
- how to target information without being intrusive
7. Web Personalization and Recommender Systems
- Dynamically serve customized content (pages, products, recommendations, etc.) to users based on their profiles, preferences, or expected interests
- Most common type of personalization: recommender systems
(Diagram) User profile -> Recommendation algorithm
8. Common Recommendation Techniques
- Collaborative Filtering
- give recommendations to a user based on the preferences of similar users
- preferences on items may be explicit or implicit
- Content-Based Filtering
- give recommendations to a user based on items with content similar to the items in the user's profile
- Rule-Based (Knowledge-Based) Filtering
- provide recommendations to users based on predefined (or learned) rules
- age(x, 25-35) and income(x, 70-100K) and children(x, >3) => recommend(x, Minivan)
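A rule of the form above can be sketched as a predicate over a user record. This is an illustrative sketch only: the `rule_based_recs` helper and field names such as `income_k` are hypothetical, not from the slides.

```python
def rule_based_recs(user, rules):
    """Fire every rule whose condition matches the user record.

    rules: list of (predicate, recommended_item) pairs."""
    return [item for predicate, item in rules if predicate(user)]

# Encoding of the slide's rule:
# age(x, 25-35) and income(x, 70-100K) and children(x, >3) => recommend(x, Minivan)
rules = [
    (lambda u: 25 <= u["age"] <= 35
               and 70 <= u["income_k"] <= 100
               and u["children"] > 3,
     "Minivan"),
]
```

A rule engine like this fires all matching rules, so adding more (predicate, item) pairs extends the catalog of recommendations without changing the matching logic.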
9. The Recommendation Task
- Basic formulation as a prediction problem: given a profile Pu for a user u and a target item it, predict the preference score of user u on item it
- Typically, the profile Pu contains preference scores by u on some other items, i1, ..., ik, different from it
- preference scores on i1, ..., ik may have been obtained explicitly (e.g., movie ratings) or implicitly (e.g., time spent on a product page or a news article)
10. Notes on User Profiling
- Utilizing user profiles for personalization assumes:
- 1) past behavior is a useful predictor of future behavior
- 2) there is a wide variety of behaviors amongst users
- Basic task in user profiling: preference elicitation
- may be based on explicit judgments from users (e.g., ratings)
- may be based on implicit measures of user interest
- Automatic user profiling
- use machine learning or data mining techniques to learn models of user behavior and preferences
- may build a model for each specific user, or build group profiles
- usually based on passive observation of user behavior
- Advantages
- less work for the user and the application writer
- adaptive behavior
- user and system build a trust relationship gradually
11. Consequences of Passiveness
- Weak heuristics
- example: clicking through multiple uninteresting pages en route to something interesting
- example: the user browses to an uninteresting page, then goes for a coffee
- example: hierarchies tend to get more hits near the root
- Cold start
- no ability to fine-tune the profile or express interest without visiting the appropriate pages
- Some possible alternatives/extensions to internally maintained profiles
- expose the profile to the user (e.g., to fine-tune it)?
- expose it to other users/agents (e.g., collaborative filtering)?
- expose it to the Web server (e.g., cnn.com custom news)?
12. Content-Based Filtering Systems
- Track which pages/items the user visits and give as recommendations other pages with similar content
- Often involves the use of client-side learning interface agents
- May require the user to enter a profile or to rate pages/objects as interesting or uninteresting
- Advantages
- useful for large information-based sites (e.g., portals) or for domains where items have content-rich features
- can be easily integrated with content servers
- Disadvantages
- may miss important pragmatic relationships among items (based on usage)
- not effective for small, domain-specific sites or sites which are not content-oriented
13. Content-Based Recommenders
- Predictions for unseen (target) items are computed based on their similarity (in terms of content) to items in the user profile.
- E.g., given the items in user profile Pu, items whose content is very similar are recommended highly, and items whose content is somewhat similar are recommended mildly
14. Content-Based Recommender Systems
15. Content-Based Recommenders: Personalized Search
- How can the search engine determine the user's context?
- Query: "Madonna and Child"
- Need to learn the user profile:
- Is the user an art historian?
- Is the user a pop music fan?
16. Content-Based Recommenders
- Music recommendations
- Playlist generation
- Example: Pandora
17. Example Recommender Systems
- Collaborative filtering recommenders
- Predictions for unseen (target) items are computed based on the other users with similar interest scores on items in user u's profile
- i.e., users with similar tastes (aka nearest neighbors)
- requires computing correlations between user u and other users according to interest scores or ratings
18. Collaborative Recommender Systems
21. Basic Collaborative Filtering Process
(Diagram) In the neighborhood formation phase, the current user record <user, item1, item2, ...> is matched against historical user records <user, item, rating> to find the nearest neighbors. In the recommendation phase, the recommendation engine applies a combination function to the neighbors' ratings to produce recommendations.
- Both the neighborhood formation and recommendation phases are real-time components
22. Collaborative Filtering: Measuring Similarities
- Pearson correlation
- weights neighbors by the degree of correlation between user U and user J
- 1 means very similar, 0 means no correlation, -1 means dissimilar
- works well in the case of user ratings (where there is at least a range, e.g., 1-5)
- not always possible (in some situations we may only have implicit binary values, e.g., whether a user did or did not select a document)
- alternatively, a variety of distance or similarity measures can be used
(In the correlation formula, the average rating of user J on all items is subtracted from each of J's ratings.)
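The Pearson measure can be sketched as follows; this is a minimal illustration assuming ratings are stored as per-user dicts (the `pearson` function name and data layout are assumptions, not from the slides).

```python
import math

def pearson(ratings_u, ratings_j):
    """Pearson correlation between two users over their co-rated items.

    ratings_u, ratings_j: dicts mapping item -> rating."""
    common = set(ratings_u) & set(ratings_j)
    if not common:
        return 0.0
    # Means are taken over the co-rated items only
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_j = sum(ratings_j[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_j[i] - mean_j) for i in common)
    den = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common)) * \
          math.sqrt(sum((ratings_j[i] - mean_j) ** 2 for i in common))
    return num / den if den else 0.0
```

Two users whose ratings rise and fall together score near 1; users with opposed tastes score near -1, matching the interpretation on the slide.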
23. Collaborative Filtering: Making Predictions
- When generating predictions from the nearest neighbors, neighbors can be weighted based on their distance to the target user
- To generate a prediction for a target user a on an item i:
  pred(a,i) = r̄(a) + [ Σu sim(a,u) * (r(u,i) - r̄(u)) ] / [ Σu |sim(a,u)| ]
  (the sums run over the k nearest neighbors u1, ..., uk)
- r̄(a) = mean rating for user a
- u1, ..., uk are the k nearest neighbors of a
- r(u,i) = rating of user u on item i
- sim(a,u) = Pearson correlation between a and u
- This is a weighted average of deviations from the neighbors' mean ratings (and closer neighbors count more)
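The weighted average of mean deviations can be sketched as below; the `predict` signature and the (similarity, mean, ratings) neighbor representation are illustrative assumptions.

```python
def predict(mean_a, neighbors, item):
    """Predict a rating as the target user's mean plus a similarity-weighted
    average of the neighbors' deviations from their own mean ratings.

    mean_a: mean rating of the target user a
    neighbors: list of (sim, mean_u, ratings_dict) for the k nearest neighbors."""
    num = den = 0.0
    for sim, mean_u, ratings in neighbors:
        if item in ratings:
            num += sim * (ratings[item] - mean_u)
            den += abs(sim)
    if den == 0:
        return mean_a  # no neighbor rated the item: fall back to the user's mean
    return mean_a + num / den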
24. Example: Collaborative System
Item1 Item 2 Item 3 Item 4 Item 5 Item 6 Correlation with Alice
Alice 5 2 3 3 ?
User 1 2 4 4 1 -1.00
User 2 2 1 3 1 2 0.33
User 3 4 2 3 2 1 0.90
User 4 3 3 2 3 1 0.19
User 5 3 2 2 2 -1.00
User 6 5 3 1 3 2 0.65
User 7 5 1 5 1 -1.00
Best match: User 3 (correlation 0.90)
Prediction for Alice on Item 6: using k-nearest-neighbor with k = 1
25. Item-Based Collaborative Filtering
- Find similarities among the items based on ratings across users
- often measured using a variation of the cosine measure
- Prediction of item i for user a is based on the past ratings of user a on items similar to i
- Suppose sim(Star Wars, Indep. Day) > sim(Jur. Park, Indep. Day) > sim(Termin., Indep. Day)
- then the predicted rating for Karen on Indep. Day will be 7, because she rated Star Wars 7
- that is, if we only use the most similar item
- otherwise, we can use the k most similar items and again use a weighted average
26. Item-Based Collaborative Filtering
Prediction ?
Item1 Item 2 Item 3 Item 4 Item 5 Item 6
Alice 5 2 3 3 ?
User 1 2 4 4 1
User 2 2 1 3 1 2
User 3 4 2 3 2 1
User 4 3 3 2 3 1
User 5 3 2 2 2
User 6 5 3 1 3 2
User 7 5 1 5 1
Item similarity (cosine) to the target item: 0.76, 0.79, 0.60, 0.71, 0.75
Best match: Item 2 (0.79)
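The slides only say "a variation of the cosine measure"; one common variant is the adjusted cosine, which offsets each rating by that user's mean before taking the cosine over co-rating users. A sketch under that assumption (the function name and data layout are mine):

```python
import math

def adjusted_cosine(ratings, i, j):
    """Adjusted-cosine similarity between items i and j.

    ratings: dict mapping user -> {item: rating}. Each rating is centered
    on the rating user's mean, then a cosine is taken over co-rating users."""
    num = den_i = den_j = 0.0
    for user_ratings in ratings.values():
        if i in user_ratings and j in user_ratings:
            mean_u = sum(user_ratings.values()) / len(user_ratings)
            num += (user_ratings[i] - mean_u) * (user_ratings[j] - mean_u)
            den_i += (user_ratings[i] - mean_u) ** 2
            den_j += (user_ratings[j] - mean_u) ** 2
    den = math.sqrt(den_i) * math.sqrt(den_j)
    return num / den if den else 0.0
```

Centering on user means compensates for users who rate systematically high or low, which a plain cosine over raw ratings would not.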
27. Collaborative Filtering: Evaluation
- split users into train/test sets
- for each user a in the test set:
- split a's votes into observed (I) and to-predict (P)
- measure the average absolute deviation between predicted and actual votes in P
- MAE: mean absolute error
- average over all test users
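The per-user MAE computation described above can be sketched as:

```python
def mae(predicted, actual):
    """Mean absolute error between predicted and actual votes
    (paired element-wise) for one test user."""
    if not predicted:
        return 0.0
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

The overall score is then the average of `mae(...)` over all users in the test set.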
28. Other Forms of Collaborative and Social Filtering
- Social Tagging (Folksonomy)
- people add free-text tags to their content
- where people happen to use the same terms, their content is linked
- frequently used terms float to the top, creating a kind of positive feedback loop for popular tags
- Examples
- Del.icio.us
- Flickr
- Last.fm
29. Social Tagging
- By allowing loose coordination, tagging systems allow social exchange of conceptual information.
- Facilitates a similar but richer information exchange than collaborative filtering.
- If I comment that a movie is "romantic" or "a good holiday movie", everyone who overhears me has access to this metadata about the movie.
- The social exchange goes beyond collaborative filtering, facilitating the transfer of more abstract, conceptual information about the movie.
- Note: the preference information is transferred implicitly; we are more likely to tag items we like than items we don't.
- No algorithm mediates the connection between individuals: when we navigate by tags, we are directly connecting with others.
30. Social Tagging
- Deviating from standard mental models
- no browsing of topical, categorized navigation, and no searching for an explicit term or phrase
- instead, I use the language I use to define my world (tagging)
- sharing my language and contexts will create community
- Tagging creates community through the overlap of perspectives
- this leads to the creation of social networks, which may further develop and evolve
- But does this lead to dynamic evolution of complex concepts or knowledge? Collective intelligence?
31. Folksonomies
32. Hybrid Recommender Systems
33. Semantically Enhanced Collaborative Filtering
- Basic idea
- extend item-based collaborative filtering to incorporate both similarity based on ratings (or usage) and semantic similarity based on domain knowledge
- Semantic knowledge about items
- can be extracted automatically from the Web based on domain-specific reference ontologies
- used in conjunction with user-item mappings to create a combined similarity measure for item comparisons
- singular value decomposition is used to reduce noise in the semantic data
- Semantic combination threshold
- used to determine the proportion of semantic and rating (or usage) similarities in the combined measure
34. Semantically Enhanced Hybrid Recommendation
- An extension of the item-based algorithm
- use a combined similarity measure to compute item similarities:
  CombinedSim(ip, iq) = α * RateSim(ip, iq) + (1 - α) * SemSim(ip, iq)
- where
- SemSim is the similarity of items ip and iq based on semantic features (e.g., keywords, attributes, etc.)
- RateSim is the similarity of items ip and iq based on user ratings (as in standard item-based CF)
- α is the semantic combination parameter
- α = 1: only user ratings, no semantic similarity
- α = 0: only semantic features, no collaborative similarity
35. Semantically Enhanced CF
- Movie data set
- movie ratings from the MovieLens data set
- semantic info extracted from IMDb based on the following ontology
36. Semantically Enhanced CF
- Used 10-fold cross-validation on randomly selected test and training data sets
- Each user in the training set has at least 20 ratings (scale 1-5)
37. Semantically Enhanced CF
- Dealing with new items and sparse data sets
- for new items, select all movies with only one rating as the test data
- degrees of sparsity simulated using different ratios for the training data
38. Collaborative Filtering: Problems
- Problems with standard CF
- the major problem with CF is scalability
- neighborhood formation is done in real time
- a small number of users relative to items may result in poor performance
- data become too sparse to provide accurate predictions
- the new item problem
- vulnerability to attacks (we will come back to this later)
- Problems in the context of clickstream / e-commerce data
- explicit user ratings are not available
- features are binary (a visit or non-visit for a particular item) or a function of the time spent on a particular item
- a visit to a page is not necessarily an indication of interest in that item
- the number of user records (and items) is far larger than in the standard domains for CF, where users are limited to purchasers or people who rated items
- need to rely on very short user histories
39. Web Mining Approach to Personalization
- Basic idea
- generate aggregate user models (usage profiles) by discovering user access patterns through Web usage mining (offline process)
- clustering user transactions
- clustering items/pageviews
- association rule mining
- sequential pattern discovery
- match a user's active session against the discovered models to provide dynamic content (online process)
- Advantages
- no explicit user ratings or interaction with users
- helps preserve user privacy by making effective use of anonymous data
- enhances the effectiveness and scalability of collaborative filtering
- more accurate and broader recommendations than content-only approaches
40. Automatic Web Personalization: Offline Process
41. Automatic Web Personalization: Online Process
42. Conceptual Representation of User Transactions or Sessions
(Diagram) Session/user data represented as a matrix over pageviews/objects.
- Raw weights are usually based on time spent on a page, but in practice they need to be normalized and transformed.
43. Real-Time Recommendation Engine
- Keep track of the user's navigational history through the site
- a fixed-size sliding window over the active session captures the current user's short-term history depth
- Match the current user's activity against the discovered profiles
- profiles can either be based on aggregate usage profiles, or be obtained directly from association rules or sequential patterns
- Dynamically generated recommendations are added to the returned page
- each pageview can be assigned a recommendation score based on:
- its matching score to user profiles (e.g., aggregate usage profiles)
- the "information value" of the pageview based on domain knowledge (e.g., the link distance of the candidate recommendation from the active session)
44. Recommendations Based on Aggregate Profiles
- Matching score computed using cosine similarity
- the user's active session (pageviews in the current window) is compared to each aggregate profile (both are viewed as pageview vectors)
- the weight of an item in the profile vector is the significance weight of the item for that profile
- the weights of items in the session vector can be all 1's, or based on some method for determining their significance in the current session
- Generating recommendations based on matching profiles
- from each matching profile, recommend the items not already in the user's session window and not directly linked from the pages in the current session window
- the recommendation score for an item is based on a combination of the profile matching score (similarity to the session window) and the weight of the item in that profile
- additionally, we can weight items farther away from the user's current location higher (i.e., consider them better recommendations)
45. Discovery of Aggregate Profiles
- Transaction clusters as aggregate profiles
- each transaction is viewed as a pageview vector
- each cluster contains a set of transaction vectors with a centroid
- each centroid acts as an aggregate profile, with its components representing the weights of the pageviews pi in the profile
- personalization involves computing the similarity between the current user's profile (or the active user session) and the cluster centroids
46. Web Usage Mining: Clustering Example
- Transaction clusters
- clustering similar user transactions and using the centroid of each cluster as a usage profile (representative of a user segment)
Sample cluster centroid from a department Web site (cluster size = 330):

Support | URL | Pageview Description
1.00 | /courses/syllabus.asp?course=450-96-303&q=3&y=2002&id=290 | SE 450 Object-Oriented Development class syllabus
0.97 | /people/facultyinfo.asp?id=290 | Web page of the lecturer who taught the above course
0.88 | /programs/ | Current Degree Descriptions 2002
0.85 | /programs/courses.asp?depcode=96&deptmne=se&courseid=450 | SE 450 course description in the SE program
0.82 | /programs/2002/gradds2002.asp | M.S. in Distributed Systems program description
47. Using Clusters for Personalization
Original session/user data is clustered; the result is a set of profiles.
Given an active session A -> B, the best matching profile is Profile 1. This may result in a recommendation for page F.html, since it appears with high weight in that profile.
Result of clustering:

PROFILE 0 (Cluster Size = 3)
1.00 C.html
1.00 D.html

PROFILE 1 (Cluster Size = 4)
1.00 B.html
1.00 F.html
0.75 A.html
0.25 C.html

PROFILE 2 (Cluster Size = 3)
1.00 A.html
1.00 D.html
1.00 E.html
0.33 C.html
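The profile-matching scheme (cosine match score times item weight in the profile) can be sketched with the three centroids above. This is a simplified sketch: it omits the "not directly linked" filter, and the function names are mine.

```python
import math

def cosine(v1, v2):
    """Cosine similarity between two sparse pageview vectors (dicts)."""
    dot = sum(w * v2.get(page, 0.0) for page, w in v1.items())
    n1 = math.sqrt(sum(w * w for w in v1.values()))
    n2 = math.sqrt(sum(w * w for w in v2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0

def recommend(session, profiles, top_n=3):
    """Score each page by (profile match score) x (page weight in profile),
    skipping pages already in the session; keep each page's best score."""
    scores = {}
    for profile in profiles:
        match = cosine(session, profile)
        for page, weight in profile.items():
            if page not in session:
                scores[page] = max(scores.get(page, 0.0), match * weight)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# The three cluster centroids from the example
profiles = [
    {"C.html": 1.00, "D.html": 1.00},                                  # PROFILE 0
    {"B.html": 1.00, "F.html": 1.00, "A.html": 0.75, "C.html": 0.25},  # PROFILE 1
    {"A.html": 1.00, "D.html": 1.00, "E.html": 1.00, "C.html": 0.33},  # PROFILE 2
]
session = {"A.html": 1.0, "B.html": 1.0}  # active session A -> B
```

Running `recommend(session, profiles)` ranks F.html first, consistent with the example: Profile 1 matches the session best, and F.html carries full weight in it.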
48. Association Rules for Personalization
- Approach of Fu, Budzik, and Hammond (2000)
- proposed a solution to the problem of reduced coverage due to sparse data
- rank all discovered rules by the degree of intersection between the left-hand side of the rule and the user's active session, then generate the top k recommendations
- problem: requires the generation of all association rules, and a search of the full space of rules during the recommendation process
- Approach of Lin, Alvarez, and Ruiz (2000)
- basic approach:
- find an appropriate number of rules for each target user by automatically selecting the minimum support
- the recommendation engine generates association rules among both users and articles
- problem: requires online generation of relevant rules for each user
49. Association Rules for Personalization
- Approach of Mobasher et al. (2001)
- discovered frequent itemsets are stored in an itemset graph (an extension of the lexicographic tree structure of Agrawal et al., 1999)
- each node at depth d in the graph corresponds to an itemset I of size d and is linked to the itemsets of size d+1, at level d+1, that contain I; the single root node at level 0 corresponds to the empty itemset
- frequent itemsets are matched against a user's active session S by performing a search of the graph to depth |S|
- recommendation generation can be done in constant time
- does not require the a priori generation of association rules from frequent itemsets
- a recommendation r is an item at level |S|+1 whose recommendation score is the confidence of the rule S => r
50. Sequential Patterns for Personalization
- Sequential / navigational patterns as aggregate profiles
- similar to association rules, but the ordering of accessed items is taken into account
- Two basic approaches
- use contiguous sequential patterns (CSP) (e.g., Web navigational patterns)
- use general sequential patterns (SP)
- Contiguous sequential patterns are often modeled as Markov chains and used for prefetching (i.e., predicting the immediate next user access based on previously accessed pages)
- In the context of recommendations, they can achieve high accuracy, but it may be difficult to obtain reasonable coverage
51. Sequential Patterns for Personalization
- Sequential / navigational patterns (continued)
- representation as Markov chains often leads to high space complexity due to model sizes
- some approaches have focused on reducing model size
- selective Markov models (Deshpande and Karypis, 2000)
- use various pruning strategies to reduce the number of states (e.g., support or confidence pruning, error pruning)
- longest repeating subsequences (Pitkow and Pirolli, 1999)
- similar to support pruning; used to focus only on significant navigational paths
- increased coverage can be achieved by using all-Kth-order models (i.e., using all possible sizes for user histories)
52. Sequential Patterns for Personalization (Mobasher et al., 2002)
- A Frequent Sequence Trie (FST) is used to store both the sequential and contiguous sequential patterns
- organized into levels from 0 to k, where k is the maximal size among all sequential (respectively, contiguous sequential) patterns
- each non-root node N at depth d contains an item sd and represents a frequent sequence <s1, s2, ..., sd>
- along with each node, the support (or frequency) value of the corresponding pattern is stored
- for each active session window w = <w1, w2, ..., wn>:
- perform a depth-first search of the FST to level n
- if a match is found, then the children of the matching node N are used to generate candidate recommendations
- given a sequence S = <w1, w2, ..., wn, p>, the item p is added to the recommendation set if the confidence of S is greater than or equal to the confidence threshold
53. Example: Frequent Itemsets
Sample transactions
Frequent itemsets (using minimum support frequency = 4)
54. Example: Sequential Patterns
Sample transactions
CSP (minimum support frequency = 4)
SP (minimum support frequency = 4)
55. Example: An Itemset Graph
Frequent itemset graph for the example.
Given an active session window <B,E>, the algorithm finds items A and C with recommendation scores of 1 and 4/5 (corresponding to the confidences of the rules {B,E} => A and {B,E} => C).
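The scoring step can be sketched with plain itemset support counts. The counts below are chosen so the sketch reproduces the 1 and 4/5 scores in the example; the actual sample transactions are not shown in this transcript, and the function name is mine.

```python
def itemset_recommendations(session, supports, threshold=0.0):
    """Score single-item extensions of the session itemset by the
    confidence of the rule session => r.

    supports: dict mapping frozenset(itemset) -> support count, as would
    be read off the frequent itemset graph."""
    s = frozenset(session)
    if s not in supports:
        return {}
    recs = {}
    for itemset, count in supports.items():
        if len(itemset) == len(s) + 1 and s < itemset:
            (r,) = itemset - s  # the single extra item
            confidence = count / supports[s]
            if confidence >= threshold:
                recs[r] = confidence
    return recs

# Supports consistent with the example: conf({B,E} => A) = 1, conf({B,E} => C) = 4/5
supports = {
    frozenset({"B", "E"}): 5,
    frozenset({"A", "B", "E"}): 5,
    frozenset({"B", "C", "E"}): 4,
}
```

Because only the session's node and its children are touched, the lookup cost is independent of the total number of discovered itemsets, which is the constant-time property claimed on the earlier slide.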
56. Example: Frequent Sequence Trie
Frequent sequence trie for the example.
Given an active session window <A,B>, the algorithm finds item E with a recommendation score of 1 (corresponding to the confidence of the rule <A,B> => E).
57. Quantitative Evaluation of Recommendation Effectiveness
- Two important factors in evaluating recommendations
- Precision: the ratio of correct recommendations to all recommendations produced by the system
- low precision would result in angry or frustrated users
- Coverage: the ratio of correct recommendations to all pages/items that will be accessed by the user
- low coverage would inhibit the ability of the system to give relevant recommendations at critical points in the user's navigation
- Transactions divided into training and evaluation sets
- the training set is used to build models (generation of aggregate profiles, neighborhood formation)
- the evaluation set is used to measure precision and coverage
- 10-fold cross-validation generally used in the experiments
58. Evaluation Methodology
- Each transaction t in the evaluation set is divided into two parts:
- ast: the portion containing the first n items of t, used as the user session to generate recommendations (n is the maximum allowable window size)
- Evalt: the remaining portion of t, used to evaluate the recommendations
- R(ast, τ): the recommendation set, which contains all items whose recommendation score is greater than or equal to the threshold τ
- Example: t = {A,B,C,D,E,F,G,H}
- use {A,B,C,D} to generate recommendations, say {E,G,K}
- match {E,G,K} with {E,F,G,H}
- number of matches = 2
- size of Evalt = 4
- size of the recommendation set = 3
- Coverage = 2/4 = 50%
- Precision = 2/3 ≈ 67%
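The worked example above can be reproduced in a short sketch (the helper name is mine):

```python
def evaluate_transaction(t, n, recommendations):
    """Split transaction t into the first n items (ast) and the rest (Evalt),
    then compute precision and coverage of the recommendation set."""
    eval_t = set(t[n:])
    recs = set(recommendations)
    hits = recs & eval_t
    precision = len(hits) / len(recs) if recs else 0.0
    coverage = len(hits) / len(eval_t) if eval_t else 0.0
    return precision, coverage

# The slide's example: t = A..H, window size 4, recommendations {E, G, K}
precision, coverage = evaluate_transaction(list("ABCDEFGH"), 4, ["E", "G", "K"])
```

With the slide's numbers this yields precision 2/3 and coverage 2/4, matching the example.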
59. Impact of Window Size
- Increasing window sizes (using a larger portion of the user's history) generally leads to improvement in precision
- This example is based on the association rule approach
60. Associations vs. Sequences
- Comparison of recommendations based on association rules, sequential patterns, contiguous sequential patterns, and standard k-nearest neighbor
- Support threshold for Association, SP, and CSP: 0.04
61. Problems with Web Usage Mining
- New item problem
- patterns will not capture items recently added to the site
- bad for dynamic Web sites
- Poor machine interpretability
- hard to generalize and reason about patterns
- no domain knowledge used to enhance results
- e.g., knowing a user is interested in a program, we could recommend the prerequisites, core courses, or popular courses in this program to the user
- Poor insight into the patterns themselves
- the nature of the relationships among items or users in a pattern is not directly available
62. Solution: Integrate Semantic Knowledge with Web Usage Mining
- Information retrieval/extraction approach
- represent semantic knowledge in pageviews as keyword vectors
- keywords extracted from text or meta-data
- text mining can be used to capture higher-level concepts or associations among concepts
- cannot capture deeper relationships among objects based on their inherent properties or attributes
- Ontology-based approach
- represent domain knowledge using a relational model or ontology representation languages
- process Web usage data together with the structured domain knowledge
- requires the extraction of ontology instances from Web pages
- challenge: performing the underlying mining operations on structured objects (e.g., computing similarities or performing aggregations)
63. Integration of Content Features
- Pre-mining
- initial transaction vector: t = <weight(p1,t), ..., weight(pn,t)>
- transform into a content-enhanced transaction over content features: t' = <weight(w1,t), ..., weight(wk,t)>
- now transaction clustering can be performed based on content similarity among user transactions
- Post-mining
- first perform mining operations on usage and content data independently
- integrate usage and content patterns in the recommendation phase
- Example: content profiles
- perform clustering on the term-pageview matrix
- each cluster centroid represents pages with some similar content
- use both content and usage profiles to generate recommendations
64. User Transaction and Feature-Pageview Matrices
A.html B.html C.html D.html E.html
user1 1 0 1 0 1
user2 1 1 0 0 1
user3 0 1 1 1 0
user4 1 0 1 1 1
user5 1 1 0 0 1
user6 1 0 1 1 1
User transaction matrix UT
A.html B.html C.html D.html E.html
web 0 0 1 1 1
data 0 1 1 1 0
mining 0 1 1 1 0
business 1 1 0 0 0
intelligence 1 1 0 0 1
marketing 1 1 0 0 1
ecommerce 0 1 1 0 0
search 1 0 1 0 0
information 1 0 1 1 1
retrieval 1 0 1 1 1
Feature-Pageview Matrix FP
65. Content-Enhanced Transactions
User-feature matrix UF. Note that UF = UT x FP^T (the transpose of the feature-pageview matrix).
web data mining business intelligence marketing ecommerce search information retrieval
user1 2 1 1 1 2 2 1 2 3 3
user2 1 1 1 2 3 3 1 1 2 2
user3 2 3 3 1 1 1 2 1 2 2
user4 3 2 2 1 2 2 1 2 4 4
user5 1 1 1 2 3 3 1 1 2 2
user6 3 2 2 1 2 2 1 2 4 4
Example: users 4 and 6 are more interested in concepts related to Web information retrieval, while user 3 is more interested in data mining.
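The construction UF = UT x FP^T can be verified directly from the two matrices on the previous slide:

```python
pages = ["A.html", "B.html", "C.html", "D.html", "E.html"]
features = ["web", "data", "mining", "business", "intelligence",
            "marketing", "ecommerce", "search", "information", "retrieval"]

# User transaction matrix UT (users x pages)
UT = {
    "user1": [1, 0, 1, 0, 1],
    "user2": [1, 1, 0, 0, 1],
    "user3": [0, 1, 1, 1, 0],
    "user4": [1, 0, 1, 1, 1],
    "user5": [1, 1, 0, 0, 1],
    "user6": [1, 0, 1, 1, 1],
}

# Feature-pageview matrix FP (features x pages)
FP = {
    "web":          [0, 0, 1, 1, 1],
    "data":         [0, 1, 1, 1, 0],
    "mining":       [0, 1, 1, 1, 0],
    "business":     [1, 1, 0, 0, 0],
    "intelligence": [1, 1, 0, 0, 1],
    "marketing":    [1, 1, 0, 0, 1],
    "ecommerce":    [0, 1, 1, 0, 0],
    "search":       [1, 0, 1, 0, 0],
    "information":  [1, 0, 1, 1, 1],
    "retrieval":    [1, 0, 1, 1, 1],
}

# UF = UT x FP^T: entry (u, f) sums feature f's weight over the pages u visited
UF = {
    u: {f: sum(UT[u][p] * FP[f][p] for p in range(len(pages)))
        for f in features}
    for u in UT
}
```

Each UF entry counts how often a user's visited pages carry a given feature, reproducing the user-feature matrix shown above (e.g., user1 scores 2 on "web" and 3 on "information").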
66. Integrating Content and Usage for Personalization
67. Example: Content Profiles
Examples of feature (word) clusters (cluster centroids) from the Association for Consumer Research Web site; terms are stemmed:

CLUSTER 0: anthropologi, associ, behavior, ...
CLUSTER 4: consum, journal, market, psychologi
CLUSTER 10: ballot, result, vote, ...
CLUSTER 11: advisori, appoint, committe, council, ...
68. Example: Usage Profiles
- Example usage profiles from the ACR site:

1.00 Call for Papers
0.67 ACR News Special Topics
0.67 CFP: Journal of Psychology and Marketing I
0.67 CFP: Journal of Psychology and Marketing II
0.67 CFP: Journal of Consumer Psychology II
0.67 CFP: Journal of Consumer Psychology I

1.00 CFP: Winter 2000 SCP Conference
1.00 Call for Papers
0.36 CFP: ACR 1999 Asia-Pacific Conference
0.30 ACR 1999 Annual Conference
0.25 ACR News Updates
0.24 Conference Update

- Generated by clustering user transactions directly
- Usage profiles represent groups of users commonly accessing certain pages together
- Content profiles represent groups of pages with similar content
69. Comparison of Recommendations
70. Ontology-Based Usage Mining
- Approach 1: ontology-enhanced transactions
- initial transaction vector: t = <weight(p1,t), ..., weight(pn,t)>
- transform into an ontology-enhanced transaction: t' = <weight(o1,t), ..., weight(or,t)>
- the structured objects o1, ..., or are instances of ontology entities extracted from pages p1, ..., pn in the transaction
- now mining tasks can be performed based on ontological similarity among user transactions
- Approach 2: ontology-enhanced patterns
- discover usage patterns in the standard way
- transform the patterns by creating an aggregate representation of them based on the ontology
- requires the categorization of similar objects into ontology classes
- also requires the specification of a different aggregation/combination function for each attribute of each class in the ontology
71. Example: Ontology for a Movie Site
An example of a Movie object instance.
72. Ontology-Based Pattern Aggregation
(Diagram) A usage profile over pages (0.50 Movie1.html, 0.35 Movie2.html, 0.15 Movie3.html) is transformed by object extraction into Movie instances with Name, Year, Genre, and Actor attributes:
- Movie 1: name A, year 2002, actors {S 0.7, T 0.2, U 0.1}, genre Romantic Comedy (under Romance and Comedy in the Genre-All hierarchy)
- Movie 2: name B, year 1999, actors {S 0.5, T 0.5}, genres under Romance and Comedy
- Movie 3: name C, year 2000, actors {S 0.6, W 0.4}, genre Comedy (the hierarchy also includes Kid & Family)
Ontology-based aggregation then combines these instances into a single semantic usage pattern.
73. Personalization with Semantic Usage Patterns
(Diagram) The current user profile is matched against the aggregate semantic usage patterns to produce an extended user profile, which is then instantiated to real Web objects to yield recommendations of items.
- Note that the matching between the semantic representations of the user's profile and the patterns requires computation of similarities at the ontological level (which may be defined based on domain-specific characteristics)
74. Profile Injection Attacks
- Consist of a number of "attack profiles"
- added to the system by providing ratings for various items
- engineered to bias the system's recommendations
- Two basic types:
- push attack ("shilling"): designed to promote an item
- nuke attack: designed to demote an item
- Prior work has shown that CF recommender systems are highly vulnerable to such attacks
- Attack models
- strategies for assigning ratings to items based on knowledge of the system, products, or users
- examples of attack models: random, average, bandwagon, segment, love-hate
75. A Successful Push Attack
Item1 Item 2 Item 3 Item 4 Item 5 Item 6 Correlation with Alice
Alice 5 2 3 3 ?
User 1 2 4 4 1 -1.00
User 2 2 1 3 1 2 0.33
User 3 4 2 3 2 1 0.90
User 4 3 3 2 3 1 0.19
User 5 3 2 2 2 -1.00
User 6 5 3 1 3 2 0.65
User 7 5 1 5 1 -1.00
Attack 1 2 3 2 5 -1.00
Attack 2 3 2 3 2 5 0.76
Attack 3 3 2 2 2 5 0.93
Best match: Attack 3 (correlation 0.93)
Prediction: user-based algorithm using k-nearest-neighbor with k = 1
76. A Generic Attack Profile
(Diagram) A generic attack profile partitions the items into four parts: IS (k selected items, with ratings), IF (l filler items, with ratings), I∅ (unrated items, null), and the target item it (with its rating).
- Attack models differ based on the ratings assigned to the filler and selected items
77. Average and Random Attack Models
(Diagram) The profile consists of l filler items IF with random ratings, a set of unrated items I∅ (null), and the target item it rated rmax.
- Random attack: filler items are assigned random ratings drawn from the overall distribution of ratings on all items across the whole DB
- Average attack: the rating for each filler item is drawn from a distribution defined by the average rating for that item in the DB
- The percentage of filler items determines the amount of knowledge (and effort) required by the attacker
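An average-attack profile generator can be sketched as follows. This is an illustrative sketch: the function name, the Gaussian draw with `sigma`, and the clipping to the rating scale are assumptions, not details from the slides.

```python
import random

def average_attack_profile(item_means, target, filler_fraction,
                           r_min=1, r_max=5, sigma=1.1):
    """Build one average-attack profile: the target item gets r_max, and each
    filler item gets a rating drawn around that item's mean rating in the DB.

    item_means: dict mapping item -> average rating across all users."""
    candidates = [i for i in item_means if i != target]
    n_fillers = max(1, int(len(candidates) * filler_fraction))
    profile = {target: r_max}
    for item in random.sample(candidates, n_fillers):
        r = random.gauss(item_means[item], sigma)
        profile[item] = min(max(round(r), r_min), r_max)  # clip to rating scale
    return profile
```

A random attack is the same construction with a single global mean for every filler item; the average attack needs per-item means, which is why it requires more knowledge of the system.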
78. Bandwagon Attack Model
(Diagram) The k frequently rated items IS are rated rmax, the l filler items IF get random ratings, the I∅ items are unrated, and the target item it is rated rmax.
- What if the system's rating distribution is unknown?
- identify products that are frequently rated (e.g., blockbuster movies)
- associate the pushed product with them
- ratings for the filler items are centered on the overall system average rating (similar to the random attack)
- frequently rated items can be guessed or obtained externally
79Segment Attack Model
IF
IS
IÆ
it
rmax rmax rmin rmin null null null rmax
Ratings for k favorite items in user segment
Rating for the target item
Ratings for l filler items
Unrated items in the attack profile
- Assume attacker wants to push product to a target
segment of users - those with preference for similar products
- fans of Harrison Ford
- fans of horror movies
- like bandwagon but for semantically-similar items
- originally designed for attacking item-based CF
algorithms - maximize sim(target item, segment items)
- minimize sim(target item, non-segment items)
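The segment profile follows directly from that similarity goal: segment favorites at the maximum rating, fillers at the minimum. A hypothetical sketch with assumed names:

```python
import random

def segment_attack_profile(target, segment_items, catalog, n_filler, r_max=5, r_min=1):
    """Segment attack: in-segment favorites at r_max, fillers at r_min, so the
    pushed target looks similar to the segment and dissimilar to everything else."""
    profile = {s: r_max for s in segment_items}                  # segment items I_S
    pool = [i for i in catalog if i != target and i not in profile]
    for i in random.sample(pool, n_filler):                      # filler items I_F
        profile[i] = r_min
    profile[target] = r_max                                      # target item i_t
    return profile
```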
80Nuke Attacks Love/Hate Attack Model
IF
IÆ
it
rmax rmax null null null rmin
Min rating for the target item
Unrated items in the attack profile
Max rating for l filler items
- A limited-knowledge attack in its simplest form
- Target item given the minimum rating value
- All other ratings in the filler item set are
given the maximum rating value - Note
- Variations of this (an the other models) can also
be used as a push or nuke attacks, essentially by
switching the roles of rmin and rmax.
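Because it needs no knowledge of the rating database, the love/hate nuke is trivial to generate. A hypothetical sketch, assuming a 1-5 scale:

```python
import random

def love_hate_attack_profile(target, catalog, n_filler, r_max=5, r_min=1):
    """Love/hate nuke attack: random fillers at the maximum rating,
    the nuked target at the minimum rating; no DB knowledge required."""
    pool = [i for i in catalog if i != target]
    profile = {i: r_max for i in random.sample(pool, n_filler)}  # filler items I_F
    profile[target] = r_min                                      # target item i_t
    return profile
```

Swapping `r_min` and `r_max` in the sketch turns the same profile shape into a push attack, as the note above says.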
81How Effective Can Attacks Be?
- First A Methodological Note
- Using MovieLens 100K data set
- 50 different "pushed" movies
- selected randomly but mirroring overall
distribution - 50 users randomly pre-selected
- Results were averages over all runs for each
movie-user pair - K 20 in all experiments
- Evaluating results
- prediction shift
- how much the rating of the pushed movie differs
before and after the attack - hit ratio
- how often the pushed movie appears in a
recommendation list before and after the attack
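The two evaluation metrics can be sketched directly from their definitions. An illustrative sketch; the function names and data shapes (per-user prediction dicts, per-user top-n lists) are assumptions, not from the slides.

```python
def prediction_shift(before, after):
    """Average change in the pushed item's predicted rating across test users
    (dicts user -> predicted rating, before and after the attack)."""
    users = [u for u in before if u in after]
    return sum(after[u] - before[u] for u in users) / len(users)

def hit_ratio(top_n_lists, pushed_item, n=10):
    """Fraction of users whose top-n recommendation list contains the pushed item."""
    return sum(1 for lst in top_n_lists if pushed_item in lst[:n]) / len(top_n_lists)
```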
82Example Results Average Attack
- The average attack is very effective against the user-based
algorithm (the random attack is not as effective)
- Item-based CF is more robust (but vulnerable to other attack types
such as the segment attack) [Burke & Mobasher, 2005]
83Example Results Bandwagon Attack
- Only a small attack profile is needed (3-7)
- Only a few (< 10) popular movies are needed
- As effective as the more data-intensive average attack (but still
not effective against item-based algorithms)
84Results Impact of Profile Size
Only a small number of filler items need to be
assigned ratings. An attacker, therefore, only
needs to use part of the product space to make
the attack effective.
In the item-based algorithm we dont see the same
drop-off, but prediction shift shows a
logarithmic behavior near maximum at about 7
filler size.
85Example Results Segmented Attack Against Item-Based CF
- Very effective against the targeted group
- Best against item-based CF
- Also effective against user-based CF
- Requires low knowledge
86Possible Solutions
- Explicit trust calculation?
- select peers through network of trust
relationships - law of large numbers
- hard to achieve numbers needed for CF to work
well - Hybrid recommendation
- Some indications that some hybrids may be more
robust - Model-based recommenders
- Certain recommenders using clustering are more
robust, but generally at the cost of less
accuracy - But a probabilistic approach has been shown to be
relatively accurate - Detection and Response
87Results Semantically Enhanced Hybrid
Semantic features extracted for movies: top actors, director, genre,
synopsis (top keywords), etc.
Alpha = 0.0: 100% semantic item-based similarity
Alpha = 1.0: 100% collaborative item-based similarity
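The alpha-weighted combination of the two item-item similarities can be sketched in one line. An illustrative sketch; the function name is hypothetical, and the linear-blend form is an assumption consistent with the alpha endpoints described above.

```python
def hybrid_similarity(collab_sim, semantic_sim, alpha):
    """Semantically enhanced hybrid item similarity:
    alpha = 1.0 -> purely collaborative, alpha = 0.0 -> purely semantic."""
    return alpha * collab_sim + (1 - alpha) * semantic_sim
```

Because the semantic component does not depend on user ratings at all, pushing alpha below 1.0 dilutes whatever influence injected profiles have on the item-item similarities.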
88Approaches to Detection and Response
- Profile Classification
- a classification model identifies attack profiles and excludes them when computing predictions
- uses the characteristic features of the most successful attack models
- designed to increase the cost of attacks by detecting the most effective ones
- Anomaly Detection
- classifies items (as being possibly under attack)
- not dependent on known attack models
- can shed some light on which types of items are most vulnerable to which types of attacks
But what if the attack does not closely correspond to a known attack signature?
In practice, a comprehensive framework combining both approaches is needed.
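One well-known profile-classification feature is RDMA (Rating Deviation from Mean Agreement), which flags profiles whose ratings deviate strongly from item means, with extra weight on rarely rated items. A minimal sketch, assuming precomputed per-item means and rating counts; the exact weighting here is illustrative:

```python
def rdma(profile, item_means, item_counts):
    """RDMA-style detection feature: mean absolute deviation of a profile's
    ratings from each item's average, inversely weighted by item popularity.
    Attack profiles that push unusual ratings onto obscure items score high."""
    terms = [abs(r - item_means[i]) / item_counts[i] for i, r in profile.items()]
    return sum(terms) / len(terms)
```

A detector would compute such features for every profile and feed them to a classifier, dropping profiles flagged as attacks before prediction, which is the profile-classification approach described above.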
89Conclusions
- Why recommender systems?
- many algorithmic advances → more accurate and reliable systems → more confidence by users
- Assist users in
- finding more relevant information, items, and products
- giving users alternatives → broadening user knowledge
- building communities
- Help companies to
- better engage users and customers → building loyalty
- increase sales (on average 5-10%)
- Problems and challenges
- more complex Web-based applications → more complex user interactions → need for more sophisticated models
- need to further explore the impact of recommendations on (a) user behavior and (b) the evolution of Web communities
- privacy, security, trust