1
Web Personalization and Recommender Systems
Bamshad Mobasher, School of Computing, DePaul University
2
What is Web Personalization?
  • Web Personalization: personalizing the browsing
    experience of a user by dynamically tailoring the
    look, feel, and content of a Web site to the
    user's needs and interests.
  • Related Phrases
  • mass customization, one-to-one marketing, site
    customization, target marketing
  • Why Personalize?
  • broaden and deepen customer relationships
  • provide continuous relationship marketing to
    build customer loyalty
  • help automate the process of proactively marketing
    products to customers
  • lights-out marketing
  • cross-sell/up-sell products
  • provide the ability to measure customer behavior
    and track how well customers are responding to
    marketing efforts

3
Personalization v. Customization
  • It's a question of who controls the user's
    browsing experience
  • Customization
  • user controls and customizes the site or the
    product based on his/her preferences
  • usually manual, but sometimes semi-automatic
    based on a given user profile
  • Personalization
  • done automatically based on the user's actions,
    the user's profile, and (possibly) the profiles
    of other, similar users

6
Challenges and Pitfalls
  • Technical Challenges
  • data collection and data preprocessing
  • discovering actionable knowledge from the data
  • which personalization algorithms
  • Implementation/Deployment Challenges
  • what to personalize
  • when to personalize
  • degree of personalization or customization
  • how to target information without being intrusive

7
Web Personalization and Recommender Systems
  • Dynamically serve customized content (pages,
    products, recommendations, etc.) to users based
    on their profiles, preferences, or expected
    interests
  • Most common type of personalization: recommender
    systems

[Diagram labels: user profile, recommendation algorithm]
8
Common Recommendation Techniques
  • Collaborative Filtering
  • Give recommendations to a user based on
    preferences of similar users
  • Preferences on items may be explicit or implicit
  • Content-Based Filtering
  • Give recommendations to a user based on items
    with similar content in the user's profile
  • Rule-Based (Knowledge-Based) Filtering
  • Provide recommendations to users based on
    predefined (or learned) rules
  • age(x, 25-35) and income(x, 70-100K) and
    children(x, >3) ⇒ recommend(x, Minivan)
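A rule like the one above can be read as a condition/action pair. The following is a minimal sketch only: the record fields (`age`, `income`, `children`), the `RULES` list, and the `recommend` helper are illustrative assumptions, not part of the slides.

```python
# Each rule pairs a predicate over a user record with an item to recommend.
# Thresholds mirror the slide's example; the record layout is assumed.
RULES = [
    (
        lambda u: 25 <= u["age"] <= 35
        and 70_000 <= u["income"] <= 100_000
        and u["children"] > 3,
        "Minivan",
    ),
]


def recommend(user):
    """Return the item of every rule whose condition holds for this user."""
    return [item for condition, item in RULES if condition(user)]


print(recommend({"age": 30, "income": 85_000, "children": 4}))
```

Learned rules (e.g., from association rule mining) could be appended to the same list without changing the matching logic.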

9
The Recommendation Task
  • Basic formulation as a prediction problem:
    given a profile P_u for a user u and a target
    item i_t, predict the preference score of user u
    on item i_t
  • Typically, the profile P_u contains preference
    scores by u on some other items, i_1, ..., i_k,
    different from i_t
  • preference scores on i_1, ..., i_k may have been
    obtained explicitly (e.g., movie ratings) or
    implicitly (e.g., time spent on a product page or
    a news article)
10
Notes on User Profiling
  • Utilizing user profiles for personalization
    assumes
  • 1) past behavior is a useful predictor of
    future behavior
  • 2) there is a wide variety of behaviors among users
  • Basic task in user profiling: preference
    elicitation
  • May be based on explicit judgments from users
    (e.g., ratings)
  • May be based on implicit measures of user
    interest
  • Automatic user profiling
  • Use machine learning or data mining techniques to
    learn models of user behavior and preferences
  • May build a model for each specific user or build
    group profiles
  • Usually based on passive observation of user
    behavior
  • Advantages
  • less work for user and application writer
  • adaptive behavior
  • user and system build trust relationship gradually

11
Consequences of passiveness
  • Weak heuristics
  • example: clicking through multiple uninteresting
    pages en route to an interesting one
  • example: user browses to an uninteresting page, then
    goes for a coffee
  • example: hierarchies tend to get more hits near
    the root
  • Cold start
  • No ability to fine-tune the profile or express
    interest without visiting appropriate pages
  • Some possible alternatives/extensions to
    internally maintained profiles
  • expose to the user (e.g., fine-tune profile)?
  • expose to other users/agents (e.g., collaborative
    filtering)?
  • expose to the Web server (e.g., cnn.com custom news)?

12
Content-Based Filtering Systems
  • Track which pages/items the user visits and give
    as recommendations other pages with similar
    content
  • Often involves the use of client-side learning
    interface agents
  • May require the user to enter a profile or to
    rate pages/objects as interesting or
    uninteresting
  • Advantages
  • useful for large information-based sites (e.g.,
    portals) or for domains where items have
    content-rich features
  • can be easily integrated with content servers
  • Disadvantages
  • may miss important pragmatic relationships among
    items (based on usage)
  • not effective for small, specialized sites or sites
    that are not content-oriented

13
Content-Based Recommenders
  • Predictions for unseen (target) items are
    computed based on their similarity (in terms of
    content) to items in the user profile.
  • E.g., user profile P_u contains [items shown as images on the slide]
  • recommend highly [image] and
    recommend mildly [image]
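As a minimal sketch of how content similarity to a profile can rank unseen items, the example below builds a profile by summing the feature vectors of liked items and scores candidates by cosine similarity. The items, features, and weights are hypothetical; the slides do not prescribe this particular representation.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse feature-weight vectors (dicts)."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_profile(liked_items):
    """Sum the feature vectors of the items in the user's profile."""
    profile = {}
    for vec in liked_items:
        for f, w in vec.items():
            profile[f] = profile.get(f, 0.0) + w
    return profile


# Hypothetical data: two liked items and three unseen candidates.
liked = [{"jazz": 1.0, "piano": 1.0}, {"jazz": 1.0, "vocal": 1.0}]
candidates = {
    "jazz piano trio": {"jazz": 1.0, "piano": 1.0},
    "vocal pop": {"vocal": 1.0, "pop": 1.0},
    "death metal": {"metal": 1.0},
}

profile = build_profile(liked)
ranked = sorted(candidates, key=lambda n: cosine(profile, candidates[n]), reverse=True)
print(ranked)
```

Items near the top of `ranked` would be recommended highly and those further down only mildly, if at all.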

14
Content-Based Recommender Systems
15
Content-Based Recommenders: Personalized Search
  • How can the search engine determine the user's
    context?

Query: "Madonna and Child"
  • Need to learn the user profile
  • User is an art historian?
  • User is a pop music fan?

16
Content-Based Recommenders
  • Music recommendations
  • Play list generation

Example: Pandora
17
Example Recommender Systems
  • Collaborative filtering recommenders
  • Predictions for unseen (target) items are
    computed based on other users with similar
    interest scores on items in user u's profile
  • i.e., users with similar tastes (aka "nearest
    neighbors")
  • requires computing correlations between user u
    and other users according to interest scores or
    ratings

18
Collaborative Recommender Systems
19
Collaborative Recommender Systems
20
Collaborative Recommender Systems
21
Basic Collaborative Filtering Process
[Diagram: Neighborhood formation phase: the current user
record <user, item1, item2, ...> is compared with the
historical user records (user, item, rating) to find the
nearest neighbors. Recommendation phase: the recommendation
engine applies a combination function to the neighbors'
records to produce recommendations.]
Both the neighborhood formation and the
recommendation phases are real-time components
22
Collaborative Filtering: Measuring Similarities
  • Pearson Correlation
  • weight by degree of correlation between user U
    and user J
  • 1 means very similar, 0 means no correlation, -1
    means dissimilar
  • Works well in case of user ratings (where there
    is at least a range of 1-5)
  • Not always possible (in some situations we may
    only have implicit binary values, e.g., whether a
    user did or did not select a document)
  • Alternatively, a variety of distance or
    similarity measures can be used

(Formula annotation: r̄_J is the average rating of user J on all items.)
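The correlation formula itself did not survive in the transcript. The standard Pearson correlation between users U and J, written with the rating notation of the next slide, is a reasonable reconstruction. Here I_UJ (an assumed symbol) is the set of items rated by both users; the means are user averages, which some variants compute over the co-rated items only.

```latex
w_{U,J} =
\frac{\sum_{i \in I_{UJ}} (r_{U,i} - \bar{r}_U)\,(r_{J,i} - \bar{r}_J)}
     {\sqrt{\sum_{i \in I_{UJ}} (r_{U,i} - \bar{r}_U)^2}\;
      \sqrt{\sum_{i \in I_{UJ}} (r_{J,i} - \bar{r}_J)^2}}
```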
23
Collaborative Filtering: Making Predictions
  • When generating predictions from the nearest
    neighbors, neighbors can be weighted based on
    their distance to the target user
  • To generate predictions for a target user a on an
    item i:
  • r̄_a: mean rating for user a
  • u_1, ..., u_k are the k nearest neighbors of a
  • r_{u,i}: rating of user u on item i
  • sim(a,u): Pearson correlation between a and u
  • This is a weighted average of deviations from the
    neighbors' mean ratings (closer neighbors
    count more)
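The prediction formula is missing from the transcript. A standard form that matches the symbols and the description above (a mean-centered weighted average over the k nearest neighbors) is sketched below. The absolute value in the denominator is an assumption; some presentations omit it.

```latex
p_{a,i} = \bar{r}_a +
\frac{\sum_{u \in \{u_1,\dots,u_k\}} \mathrm{sim}(a,u)\,(r_{u,i} - \bar{r}_u)}
     {\sum_{u \in \{u_1,\dots,u_k\}} \lvert \mathrm{sim}(a,u) \rvert}
```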

24
Example Collaborative System
(- = not rated)
          Item 1  Item 2  Item 3  Item 4  Item 5  Item 6  Correlation with Alice
Alice       5       2       3       3       -       ?
User 1      2       -       4       -       4       1       -1.00
User 2      2       1       3       -       1       2        0.33
User 3      4       2       3       2       -       1        0.90
User 4      3       3       2       -       3       1        0.19
User 5      -       3       -       2       2       2       -1.00
User 6      5       3       -       1       3       2        0.65
User 7      -       5       -       1       5       1       -1.00
Best match: User 3 (correlation 0.90)
Prediction: Alice's rating on Item 6 ("?"),
using k-nearest neighbor with k = 1
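The example can be checked with a short sketch. Two things here are assumptions: the transcript does not preserve which items each user left unrated, so the placement of `None` below is one arrangement that reproduces the listed correlations; and the correlations are reproduced when each user's mean is taken over the co-rated items. The prediction step uses a mean-centered weighted average with each user's overall mean, which is also an assumed choice.

```python
from math import sqrt

# Ratings for Item 1..Item 6; None marks an unrated item (placement assumed).
ratings = {
    "Alice":  [5, 2, 3, 3, None, None],
    "User 1": [2, None, 4, None, 4, 1],
    "User 2": [2, 1, 3, None, 1, 2],
    "User 3": [4, 2, 3, 2, None, 1],
    "User 4": [3, 3, 2, None, 3, 1],
    "User 5": [None, 3, None, 2, 2, 2],
    "User 6": [5, 3, None, 1, 3, 2],
    "User 7": [None, 5, None, 1, 5, 1],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation over the items both users rated."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    ma = mean(x for x, _ in pairs)
    mb = mean(y for _, y in pairs)
    num = sum((x - ma) * (y - mb) for x, y in pairs)
    den = sqrt(sum((x - ma) ** 2 for x, _ in pairs)) * sqrt(
        sum((y - mb) ** 2 for _, y in pairs)
    )
    return num / den if den else 0.0


def predict(target, item, k=1):
    """Target's mean plus the weighted average deviation of the k nearest neighbors."""
    t = ratings[target]
    sims = [
        (pearson(t, r), name)
        for name, r in ratings.items()
        if name != target and r[item] is not None
    ]
    top = sorted(sims, reverse=True)[:k]
    t_mean = mean(x for x in t if x is not None)
    num = sum(
        s * (ratings[n][item] - mean(x for x in ratings[n] if x is not None))
        for s, n in top
    )
    den = sum(abs(s) for s, _ in top)
    return t_mean + num / den if den else t_mean


correlations = {
    name: round(pearson(ratings["Alice"], r), 2)
    for name, r in ratings.items()
    if name != "Alice"
}
prediction = predict("Alice", item=5, k=1)  # item index 5 is Item 6
print(correlations)
print(prediction)
```

With k = 1 the only neighbor is User 3, so the prediction is Alice's mean shifted by User 3's deviation on Item 6.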
25
Item-based Collaborative Filtering
  • Find similarities among the items based on
    ratings across users
  • Often measured based on a variation of Cosine
    measure
  • Prediction of item i for user a is based on the
    past ratings of user a on items similar to i.
  • Suppose sim(Star Wars, Indep. Day) > sim(Jur. Park,
    Indep. Day) > sim(Termin., Indep. Day)
  • Predicted rating for Karen on Indep. Day will be
    7, because she rated Star Wars 7
  • That is, if we only use the most similar item
  • Otherwise, we can use the k most similar items
    and again use a weighted average
26
Item-Based Collaborative Filtering
(- = not rated)
                 Item 1  Item 2  Item 3  Item 4  Item 5  Item 6
Alice              5       2       3       3       -       ?
User 1             2       -       4       -       4       1
User 2             2       1       3       -       1       2
User 3             4       2       3       2       -       1
User 4             3       3       2       -       3       1
User 5             -       3       -       2       2       2
User 6             5       3       -       1       3       2
User 7             -       5       -       1       5       1
Item similarity   0.76    0.79    0.60    0.71    0.75
Item similarity = cosine similarity to the target item (Item 6)
Best match: Item 2 (0.79)
Prediction: Alice's rating on Item 6 ("?")
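A minimal sketch of the item-based computation follows. It assumes that unrated cells count as 0 in the cosine and that the unrated cells sit in the columns shown in the matrix below; the transcript does not preserve either detail. Under those assumptions the similarities for Items 1, 2, 4, and 5 come out as listed on the slide, while Item 3 comes out as 0.61 rather than 0.60.

```python
from math import sqrt

# Rows are User 1..User 7, columns Item 1..Item 6; 0 stands for "not rated".
# The placement of the zeros is an assumption.
R = [
    [2, 0, 4, 0, 4, 1],
    [2, 1, 3, 0, 1, 2],
    [4, 2, 3, 2, 0, 1],
    [3, 3, 2, 0, 3, 1],
    [0, 3, 0, 2, 2, 2],
    [5, 3, 0, 1, 3, 2],
    [0, 5, 0, 1, 5, 1],
]
alice = [5, 2, 3, 3, 0, 0]
TARGET = 5  # column index of Item 6


def column(j):
    return [row[j] for row in R]


def cosine(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def predict(user_ratings, target, k=1):
    """Similarity-weighted average of the user's ratings on the k most similar items."""
    sims = [
        (cosine(column(j), column(target)), j)
        for j in range(len(user_ratings))
        if j != target and user_ratings[j]
    ]
    top = sorted(sims, reverse=True)[:k]
    den = sum(s for s, _ in top)
    return sum(s * user_ratings[j] for s, j in top) / den if den else 0.0


item_sims = [round(cosine(column(j), column(TARGET)), 2) for j in range(5)]
prediction = predict(alice, TARGET, k=1)
print(item_sims)
print(prediction)
```

With k = 1 the prediction is simply Alice's rating on the most similar item she has rated (Item 2).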
27
Collaborative Filtering: Evaluation
  • split users into train/test sets
  • for each user a in the test set
  • split a's votes into observed (I) and to-predict
    (P)
  • measure average absolute deviation between
    predicted and actual votes in P
  • MAE = mean absolute error
  • average over all test users
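The evaluation loop above can be sketched as follows. The test users, their ratings, and the `mean_predictor` baseline are hypothetical stand-ins; any predictor with the same signature could be plugged in.

```python
def mean_absolute_error(predicted, actual):
    """Average |predicted - actual| over the items present in both dicts."""
    common = [i for i in actual if i in predicted]
    if not common:
        raise ValueError("no items to compare")
    return sum(abs(predicted[i] - actual[i]) for i in common) / len(common)


def evaluate(test_users, predict):
    """test_users maps user -> (observed votes I, held-out votes P)."""
    per_user = []
    for observed, held_out in test_users.values():
        preds = {item: predict(observed, item) for item in held_out}
        per_user.append(mean_absolute_error(preds, held_out))
    return sum(per_user) / len(per_user)  # average over all test users


def mean_predictor(observed, item):
    """Hypothetical baseline: predict the user's mean observed rating."""
    return sum(observed.values()) / len(observed)


test_users = {
    "a": ({"i1": 4, "i2": 2}, {"i3": 5, "i4": 3}),
    "b": ({"i1": 1, "i3": 3}, {"i2": 2}),
}
mae = evaluate(test_users, mean_predictor)
print(mae)
```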

28
Other Forms of Collaborative and Social Filtering
  • Social Tagging (Folksonomy)
  • people add free-text tags to their content
  • where people happen to use the same terms,
    their content is linked
  • frequently used terms float to the top,
    creating a kind of positive feedback loop for
    popular tags.
  • Examples
  • Del.icio.us
  • Flickr
  • Last.fm

29
Social Tagging
  • By allowing loose coordination, tagging systems
    allow social exchange of conceptual information.
  • Facilitates a similar but richer information
    exchange than collaborative filtering.
  • I comment that a movie is "romantic", or "a good
    holiday movie". Everyone who overhears me has
    access to this metadata about the movie.
  • The social exchange goes beyond collaborative
    filtering - facilitating transfer of more
    abstract, conceptual information about the movie.
  • Note: the preference information is transferred
    implicitly; we are more likely to tag items we
    like than items we don't like
  • No algorithm mediates the connection between
    individuals: when we navigate by tags, we are
    directly connecting with others.

30
Social Tagging
  • Deviating from standard mental models
  • No browsing of topical, categorized navigation or
    searching for an explicit term or phrase
  • Instead, I use the language I use to define my world
    (tagging)
  • Sharing my language and contexts will create
    community
  • Tagging creates community through the overlap of
    perspectives
  • This leads to the creation of social networks
    which may further develop and evolve
  • But, does this lead to dynamic evolution of
    complex concepts or knowledge? Collective
    intelligence?

31
Folksonomies
32
Hybrid Recommender Systems
33
Semantically Enhanced Collaborative Filtering
  • Basic Idea
  • Extend item-based collaborative filtering to
    incorporate both similarity based on ratings (or
    usage) as well as semantic similarity based on
    domain knowledge
  • Semantic knowledge about items
  • Can be extracted automatically from the Web based
    on domain-specific reference ontologies
  • Used in conjunction with user-item mappings to
    create a combined similarity measure for item
    comparisons
  • Singular value decomposition used to reduce noise
    in the semantic data
  • Semantic combination threshold
  • Used to determine the proportion of semantic and
    rating (or usage) similarities in the combined
    measure

34
Semantically Enhanced Hybrid Recommendation
  • An extension of the item-based algorithm
  • Use a combined similarity measure to compute item
    similarities:
    CombinedSim(i_p, i_q) = (1 − α) · SemSim(i_p, i_q) + α · RateSim(i_p, i_q)
  • where,
  • SemSim is the similarity of items i_p and i_q based
    on semantic features (e.g., keywords, attributes,
    etc.) and
  • RateSim is the similarity of items i_p and i_q
    based on user ratings (as in the standard
    item-based CF)
  • α is the semantic combination parameter
  • α = 1 ⇒ only user ratings, no semantic similarity
  • α = 0 ⇒ only semantic features, no collaborative
    similarity
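A minimal sketch of the combined measure, assuming the combination is linear, i.e. (1 − α)·SemSim + α·RateSim, which matches the two limiting cases listed above. The function and parameter names are illustrative.

```python
def combined_sim(sem_sim, rate_sim, alpha):
    """Blend semantic and rating-based item similarity.

    alpha = 1 uses only rating similarity; alpha = 0 uses only semantic similarity.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * sem_sim + alpha * rate_sim


# Hypothetical similarities for one item pair.
for alpha in (0.0, 0.5, 1.0):
    print(alpha, combined_sim(sem_sim=0.8, rate_sim=0.2, alpha=alpha))
```

For a new item with no ratings, RateSim is unavailable or unreliable, so a smaller α lets the semantic part carry the comparison.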

35
Semantically Enhanced CF
  • Movie data set
  • Movie ratings from the movielens data set
  • Semantic info. extracted from IMDB based on the
    following ontology

36
Semantically Enhanced CF
  • Used 10-fold cross-validation on randomly selected
    test and training data sets
  • Each user in training set has at least 20 ratings
    (scale 1-5)

37
Semantically Enhanced CF
  • Dealing with new items and sparse data sets
  • For new items, select all movies with only one
    rating as the test data
  • Degrees of sparsity simulated using different
    ratios for training data

38
Collaborative Filtering Problems
  • Problems with standard CF
  • major problem with CF is scalability
  • neighborhood formation is done in real-time
  • small number of users relative to items may
    result in poor performance
  • data become too sparse to provide accurate
    predictions
  • new item problem
  • Vulnerability to attacks (will come back to this
    later)
  • Problems in context of clickstream / e-commerce
    data
  • explicit user ratings are not available
  • features are binary (visit or a non-visit for a
    particular item) or a function of the time spent
    on a particular item
  • a visit to a page is not necessarily an
    indication of interest in that item
  • number of user records (and items) is far larger
    than the standard domains for CF where users are
    limited to purchasers or people who rated items
  • need to rely on very short user histories

39
Web Mining Approach to Personalization
  • Basic Idea
  • generate aggregate user models (usage profiles)
    by discovering user access patterns through Web
    usage mining (offline process)
  • Clustering user transactions
  • Clustering items / pageviews
  • Association rule mining
  • Sequential pattern discovery
  • match a user's active session against the
    discovered models to provide dynamic content
    (online process)
  • Advantages
  • no explicit user ratings or interaction with
    users
  • helps preserve user privacy, by making effective
    use of anonymous data
  • enhance the effectiveness and scalability of
    collaborative filtering
  • more accurate and broader recommendations than
    content-only approaches

40
Automatic Web Personalization
Offline Process
41
Automatic Web Personalization
Online Process
42
Conceptual Representation of User Transactions or
Sessions
[Matrix: rows are sessions/users, columns are pageviews/objects]
Raw weights are usually based on time spent on a
page, but in practice they need to be normalized and
transformed.
43
Real-Time Recommendation Engine
  • Keep track of the user's navigational history through
    the site
  • a fixed-size sliding window over the active
    session to capture the current user's
    short-term history depth
  • Match the current user's activity against the
    discovered profiles
  • profiles either can be based on aggregate usage
    profiles, or are obtained directly from
    association rules or sequential patterns
  • Dynamically generated recommendations are added
    to the returned page
  • each pageview can be assigned a recommendation
    score based on
  • matching score to user profiles (e.g., aggregate
    usage profiles)
  • information value of the pageview based on
    domain knowledge (e.g., link distance of the
    candidate recommendation to the active session)

44
Recommendations Based on Aggregate Profiles
  • Matching score computed using cosine similarity
  • User's active session (pageviews in the current
    window) is compared to each aggregate profile
    (both are viewed as pageview vectors)
  • Weight of items in the profile vector is the
    significance weight of the item for that profile
  • Weight of items in the session vector can be all
    1s, or based on some method for determining
    their significance in the current session
  • Generating recommendations based on matching
    profiles
  • from each matching profile recommend the items
    not already in the user session window, and not
    directly linked from the pages in the current
    session window
  • the recommendation score for an item is based on
    a combination of profile matching score
    (similarity to session window) and the weight of
    the item in that profile
  • additionally, we can weight items farther away
    from the current location of user higher (i.e.,
    consider them better recommendations)

45
Discovery of Aggregate Profiles
  • Transaction clusters as Aggregate Profiles
  • Each transaction is viewed as a pageview vector
  • Each cluster contains a set of transaction
    vectors with a centroid
  • Each centroid acts as an aggregate profile, with
    each component representing the weight for pageview p_i in
    the profile
  • Personalization involves computing similarity
    between a current user's profile (or the active
    user session) and the cluster centroids

46
Web Usage Mining: Clustering Example
  • Transaction Clusters
  • Clustering similar user transactions and using
    centroid of each cluster as a usage profile
    (representative for a user segment)

Sample cluster centroid from dept. Web site
(cluster size = 330)
Support | URL | Pageview Description
1.00 | /courses/syllabus.asp?course=450-96-303&q=3&y=2002&id=290 | SE 450 Object-Oriented Development class syllabus
0.97 | /people/facultyinfo.asp?id=290 | Web page of a lecturer who taught the above course
0.88 | /programs/ | Current Degree Descriptions 2002
0.85 | /programs/courses.asp?depcode=96&deptmne=se&courseid=450 | SE 450 course description in SE program
0.82 | /programs/2002/gradds2002.asp | M.S. in Distributed Systems program description
47
Using Clusters for Personalization
Original Session/user data
Given an active session A → B, the best matching
profile is Profile 1. This may result in a
recommendation for page F.html, since it appears
with high weight in that profile.
Result of Clustering
PROFILE 0 (Cluster Size = 3)
  1.00 C.html
  1.00 D.html
PROFILE 1 (Cluster Size = 4)
  1.00 B.html
  1.00 F.html
  0.75 A.html
  0.25 C.html
PROFILE 2 (Cluster Size = 3)
  1.00 A.html
  1.00 D.html
  1.00 E.html
  0.33 C.html
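The matching step in this example can be sketched with the three profiles from the slide. The session is treated as a binary pageview vector and matched by cosine similarity. The recommendation score used here, the square root of (item weight × match score), is one assumed way to combine the two quantities; the slides only say that the score is based on a combination of them.

```python
from math import sqrt

profiles = {
    "Profile 0": {"C.html": 1.00, "D.html": 1.00},
    "Profile 1": {"B.html": 1.00, "F.html": 1.00, "A.html": 0.75, "C.html": 0.25},
    "Profile 2": {"A.html": 1.00, "D.html": 1.00, "E.html": 1.00, "C.html": 0.33},
}


def match_score(session, profile):
    """Cosine similarity between a binary session vector and a weighted profile."""
    dot = sum(profile.get(page, 0.0) for page in session)
    norm = sqrt(len(session)) * sqrt(sum(w * w for w in profile.values()))
    return dot / norm if norm else 0.0


def recommend(session, profiles):
    """Pick the best matching profile and rank its pages not yet in the session."""
    scores = {name: match_score(session, prof) for name, prof in profiles.items()}
    best = max(scores, key=scores.get)
    # Assumed scoring: geometric mean of item weight and profile match score.
    candidates = {
        page: sqrt(weight * scores[best])
        for page, weight in profiles[best].items()
        if page not in session
    }
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return best, ranked


best, recs = recommend({"A.html", "B.html"}, profiles)
print(best, recs)
```

In a deployed engine the session would be the current sliding window rather than the whole visit, and several matching profiles could contribute candidates.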
48
Association Rules Personalization
  • Approach of Fu, Budzik, Hammond, 2000
  • Proposed solution to the problem of reduced
    coverage due to sparse data
  • rank all discovered rules by the degree of
    intersection between the left-hand-side of rule
    and a user's active session and then generate the
    top k recommendations
  • Problem: requires the generation of all
    association rules, requiring a search in the full
    space of rules during the recommendation process
  • Approach of Lin, Alvarez, Ruiz, 2000
  • Basic Approach
  • find an appropriate number of rules for each
    target user by automatically selecting the
    minimum support
  • the recommendation engine generates association
    rules among both users and articles
  • Problem: requires online generation of relevant
    rules for each user

49
Association Rules Personalization
  • Approach of Mobasher, et al., 2001
  • discovered frequent itemsets are stored in
    an itemset graph (an extension of the lexicographic
    tree structure of Agrawal, et al. 1999)
  • each node at depth d in the graph corresponds to
    an itemset, I, of size d and is linked to
    itemsets of size d+1 that contain I at level d+1.
    The single root node at level 0 corresponds to
    the empty itemset.
  • frequent itemsets are matched against a user's
    active session S by performing a search of the
    graph to depth |S|
  • recommendation generation can be done in constant
    time
  • does not require generating association
    rules from frequent itemsets a priori
  • a recommendation r is an item at level |S|+1
    whose recommendation score is the confidence of
    rule S ⇒ r
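A minimal sketch of the recommendation step: a dictionary keyed by frozensets stands in for the itemset graph, so the child links of the real structure are replaced here by a simple scan. The support counts are hypothetical. The score of a candidate item is the confidence of the rule from the session to that item, i.e. support(S ∪ {r}) / support(S).

```python
# Hypothetical frequent itemsets with their support counts.
support = {
    frozenset({"B"}): 6,
    frozenset({"E"}): 5,
    frozenset({"B", "E"}): 5,
    frozenset({"A", "B", "E"}): 5,
    frozenset({"B", "C", "E"}): 4,
}


def recommend(session, support, min_conf=0.0):
    """Return {item: confidence} for itemsets one level below the session itemset."""
    s = frozenset(session)
    if s not in support:
        return {}  # the active session is not a frequent itemset
    recs = {}
    for itemset, count in support.items():
        # Children of node s: itemsets of size |s| + 1 that contain s.
        if len(itemset) == len(s) + 1 and s < itemset:
            (item,) = itemset - s
            conf = count / support[s]
            if conf >= min_conf:
                recs[item] = conf
    return recs


print(recommend({"B", "E"}, support))
```

No rules are materialized in advance; confidences are computed from the stored supports at lookup time.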

50
Sequential Patterns Personalization
  • Sequential / Navigational Patterns as Aggregate
    Profiles
  • similar to association rules, but the ordering of
    accessed items is taken into account
  • Two basic approaches
  • use contiguous sequential patterns (CSP) (e.g.,
    Web navigational patterns)
  • use general sequential patterns (SP)
  • Contiguous sequential patterns are often modeled
    as Markov chains and used for prefetching (i.e.,
    predicting the immediate next user access based
    on previously accessed pages)
  • In context of recommendations, they can achieve
    high accuracy, but may be difficult to obtain
    reasonable coverage

51
Sequential Patterns Personalization
  • Sequential / Navigational Patterns (continued)
  • representation as Markov chains often leads to
    high space complexity due to model sizes
  • some approaches have focused on reducing model
    size
  • selective Markov Models (Deshpande, Karypis,
    2000)
  • use various pruning strategies to reduce the
    number of states (e.g., support or confidence
    pruning, error pruning)
  • longest repeating subsequences (Pitkow, Pirolli,
    1999)
  • similar to support pruning, used to focus only on
    significant navigational paths
  • increased coverage can be achieved by using
    all-Kth-order models (i.e., using all possible
    sizes for user histories)

52
Sequential Patterns Personalization (Mobasher,
et al. 2002)
  • A Frequent Sequence Trie (FST) is used to store
    both the sequential and contiguous sequential
    patterns
  • organized into levels from 0 to k, where k is the
    maximal size among all sequential (respectively,
    contiguous sequential) patterns
  • each non-root node N at depth d contains an item
    s_d and represents the frequent sequence
    <s_1, s_2, ..., s_d>
  • along with each node, the support (or frequency)
    value of the corresponding pattern is stored
  • for each active session window w = <w_1, w_2, ..., w_n>:
  • perform a depth-first search of the FST to level
    n
  • If a match is found, then the children of the
    matching node N are used to generate candidate
    recommendations
  • given a sequence S = <w_1, w_2, ..., w_n, p>, the item p
    is added to the recommendation set if the
    confidence of S is greater than or equal to the
    confidence threshold
53
Example: Frequent Itemsets
Sample Transactions
Frequent itemsets (using min. support frequency =
4)
54
Example: Sequential Patterns
Sample Transactions
CSP (min. support frequency = 4)
SP (min. support frequency = 4)
55
Example: An Itemset Graph
Frequent Itemset Graph for the Example
Given an active session window <B,E>, the
algorithm finds items A and C with recommendation
scores of 1 and 4/5 (corresponding to confidences
of the rules {B,E} ⇒ A and {B,E} ⇒ C).
56
Example: Frequent Sequence Trie
Frequent Sequence Trie for the Example
Given an active session window <A,B>, the
algorithm finds item E with recommendation score
of 1 (corresponding to the confidence of the rule
<A,B> ⇒ E).
57
Quantitative Evaluation of Recommendation
Effectiveness
  • Two important factors in evaluating
    recommendations
  • Precision: measures the ratio of correct
    recommendations to all recommendations produced
    by the system
  • low precision would result in angry or frustrated
    users
  • Coverage: measures the ratio of correct
    recommendations to all pages/items that will be
    accessed by the user
  • low coverage would inhibit the ability of the
    system to give relevant recommendations at
    critical points in user navigation
  • Transactions Divided into Training & Evaluation
    Sets
  • training set is used to build models (generation
    of aggregate profiles, neighborhood formation)
  • evaluation set is used to measure precision &
    coverage
  • 10-fold cross-validation generally used in the
    experiments

58
Evaluation Methodology
  • Each transaction t in the evaluation set is
    divided into two parts
  • as_t: portion of the first n items in t, used as
    the user session to generate recommendations (n
    is the maximum allowable window size)
  • Eval_t: the remaining portion of t, used to
    evaluate the recommendations (|Eval_t| = |t| − n)
  • R(as_t, τ): the recommendation set, which contains
    all items whose recommendation score is greater
    than or equal to the threshold τ.
  • Example: t = <A,B,C,D,E,F,G,H>
  • Use <A,B,C,D> to generate recommendations, say
    {E,G,K}
  • Match {E,G,K} with {E,F,G,H}
  • No. of matches = 2
  • Size of Eval_t = 4
  • Size of recommendation set = 3
  • Coverage = 2/4 = 50%
  • Precision = 2/3 = 67%
59
Impact of Window Size
  • Increasing window sizes (using a larger portion of
    the user's history) generally leads to improvement in
    precision

This example is based on the association rule
approach
60
Associations vs. Sequences
  • Comparison of recommendations based on
    association rules, sequential patterns,
    contiguous sequential patterns, and standard
    k-nearest neighbor

Support threshold for Association, SP, CSP = 0.04
61
Problems with Web Usage Mining
  • New item problem
  • Patterns will not capture new items recently
    added
  • Bad for dynamic Web sites
  • Poor machine interpretability
  • Hard to generalize and reason about patterns
  • No domain knowledge used to enhance results
  • E.g., Knowing a user is interested in a program,
    we could recommend the prerequisites, core or
    popular courses in this program to the user
  • Poor insight into the patterns themselves
  • The nature of the relationships among items or
    users in a pattern is not directly available

62
Solution: Integrate Semantic Knowledge with Web
Usage Mining
  • Information Retrieval/Extraction Approach
  • Represent semantic knowledge in pageviews as
    keyword vectors
  • Keywords extracted from text or meta-data
  • Text mining can be used to capture higher-level
    concepts or associations among concepts
  • Cannot capture deeper relationships among objects
    based on their inherent properties or attributes
  • Ontology-based approach
  • Represent domain knowledge using relational model
    or ontology representation languages
  • Process Web usage data with the structured domain
    knowledge
  • Requires the extraction of ontology instances
    from Web pages
  • Challenge: performing underlying mining
    operations on structured objects (e.g., computing
    similarities or performing aggregations)

63
Integration of Content Features
  • Pre-Mining
  • Initial transaction vector t = <weight(p_1,t), ...,
    weight(p_n,t)>
  • Transform into content-enhanced transaction
  • t = <weight(w_1,t), ..., weight(w_k,t)>
  • Now transaction clustering can be performed based
    on content similarity among user transactions
  • Post-Mining
  • First perform mining operations on usage and
    content data independently
  • Integrate usage and content patterns in the
    recommendation phase
  • Example: Content Profiles
  • Perform clustering on the term-pageview matrix
  • Each cluster centroid represents pages with some
    similar content
  • Use both content and usage profiles to generate
    recommendations

64
User transaction matrix UT
              A.html  B.html  C.html  D.html  E.html
user1           1       0       1       0       1
user2           1       1       0       0       1
user3           0       1       1       1       0
user4           1       0       1       1       1
user5           1       1       0       0       1
user6           1       0       1       1       1

Feature-Pageview Matrix FP
              A.html  B.html  C.html  D.html  E.html
web             0       0       1       1       1
data            0       1       1       1       0
mining          0       1       1       1       0
business        1       1       0       0       0
intelligence    1       1       0       0       1
marketing       1       1       0       0       1
ecommerce       0       1       1       0       0
search          1       0       1       0       0
information     1       0       1       1       1
retrieval       1       0       1       1       1
65
Content Enhanced Transactions
User-Feature Matrix UF
Note that UF = UT × FP^T
       web  data  mining  business  intelligence  marketing  ecommerce  search  information  retrieval
user1  2    1     1       1         2             2          1          2       3            3
user2  1    1     1       2         3             3          1          1       2            2
user3  2    3     3       1         1             1          2          1       2            2
user4  3    2     2       1         2             2          1          2       4            4
user5  1    1     1       2         3             3          1          1       2            2
user6  3    2     2       1         2             2          1          2       4            4
Example: users 4 and 6 are more interested in
concepts related to Web information retrieval,
while user 3 is more interested in data mining.
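The product UF = UT × FP^T can be checked directly with the two matrices from the previous slide. This is a plain-Python sketch with nested lists; the variable names are illustrative.

```python
features = ["web", "data", "mining", "business", "intelligence",
            "marketing", "ecommerce", "search", "information", "retrieval"]

# User transaction matrix UT: rows user1..user6, columns A.html..E.html.
UT = [
    [1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 1, 1],
]

# Feature-pageview matrix FP: one row per feature, columns A.html..E.html.
FP = [
    [0, 0, 1, 1, 1],  # web
    [0, 1, 1, 1, 0],  # data
    [0, 1, 1, 1, 0],  # mining
    [1, 1, 0, 0, 0],  # business
    [1, 1, 0, 0, 1],  # intelligence
    [1, 1, 0, 0, 1],  # marketing
    [0, 1, 1, 0, 0],  # ecommerce
    [1, 0, 1, 0, 0],  # search
    [1, 0, 1, 1, 1],  # information
    [1, 0, 1, 1, 1],  # retrieval
]

# UF[u][f] = sum over pages of UT[u][page] * FP[f][page], i.e. UT x FP^T.
UF = [[sum(u * f for u, f in zip(user, feat)) for feat in FP] for user in UT]

for i, row in enumerate(UF, start=1):
    print(f"user{i}", dict(zip(features, row)))
```

Each entry counts how many of the user's visited pages carry the feature, so clustering the rows of UF groups users by content interest rather than by the specific pages visited.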
66
Integrating Content and Usage for Personalization
67
Example Content Profiles
Examples of feature (word) clusters (Association
for Consumer Research Web site)
CLUSTER 10: ballot result vote ...
CLUSTER 0: anthropologi associ behavior ...
CLUSTER 4: consum journal market psychologi
CLUSTER 11: advisori appoint committe council ...
Cluster Centroids
68
Example Usage Profiles
  • Example Usage Profiles from the ACR Site

1.00 Call for Papers
0.67 ACR News Special Topics
0.67 CFP Journal of Psychology and Marketing I
0.67 CFP Journal of Psychology and Marketing II
0.67 CFP Journal of Consumer Psychology II
0.67 CFP Journal of Consumer Psychology I

1.00 CFP Winter 2000 SCP Conference
1.00 Call for Papers
0.36 CFP ACR 1999 Asia-Pacific Conference
0.30 ACR 1999 Annual Conference
0.25 ACR News Updates
0.24 Conference Update
  • Generated by clustering user transactions
    directly
  • Usage profiles represent groups of users commonly
    accessing certain pages together
  • Content profiles represent groups of pages with
    similar content

69
Comparison of Recommendations
70
Ontology-Based Usage Mining
  • Approach 1: Ontology-Enhanced Transactions
  • Initial transaction vector t = <weight(p_1,t), ...,
    weight(p_n,t)>
  • Transform into content-enhanced transaction
  • t = <weight(o_1,t), ..., weight(o_r,t)>
  • The structured objects o_1, ..., o_r are instances of
    ontology entities extracted from pages p_1, ..., p_n
    in the transaction
  • Now mining tasks can be performed based on
    ontological similarity among user transactions
  • Approach 2: Ontology-Enhanced Patterns
  • Discover usage patterns in the standard way
  • Transform patterns by creating an aggregate
    representation of the patterns based on the
    ontology
  • Requires the categorization of similar objects
    into ontology classes
  • Also requires the specification of different
    aggregation/combination function for each
    attribute of each class in the ontology

71
Example Ontology for a Movie Site
An example of a Movie object instance
72
Ontology-Based Pattern Aggregation
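[Diagram: ontology-based pattern aggregation.
Usage profile: 0.50 Movie1.html, 0.35 Movie2.html, 0.15 Movie3.html
Object extraction yields Movie objects with attributes Name, Year, Genre, Actor:
  Movie 1: Name A, Year 2002, Actor {S: 0.7, T: 0.2, U: 0.1}
  Movie 2: Name B, Year 1999, Actor {S: 0.5, T: 0.5}
  Movie 3: Name C, Year 2000, Actor {S: 0.6, W: 0.4}
Genre nodes shown: Genre-All, Romance, Comedy, Romantic Comedy, Kid & Family
Ontology-based aggregation combines the objects into a semantic usage pattern.]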
73
Personalization with Semantic Usage Patterns
[Diagram: the current user profile is matched against the
aggregate semantic usage patterns (Match Profiles) to
produce an extended user profile, which is then
instantiated to real Web objects to yield
recommendations of items.]
Note that the matching between the semantic
representations of the user's profile and the patterns
requires computation of similarities at the
ontological level (may be defined based on
domain-specific characteristics)
74
Profile Injection Attacks
  • Consist of a number of "attack profiles"
  • added to the system by providing ratings for
    various items
  • engineered to bias the system's recommendations
  • Two basic types
  • Push attack (Shilling) designed to promote
    an item
  • Nuke attack designed to demote a item
  • Prior work has shown that CF recommender systems
    are highly vulnerable to such attacks
  • Attack Models
  • strategies for assigning ratings to items based
    on knowledge of the system, products, or users
  • examples of attack models: random, average,
    bandwagon, segment, love-hate

75
A Successful Push Attack
Item 1 Item 2 Item 3 Item 4 Item 5 Item 6 Correlation with Alice
Alice 5 2 3 3 ?
User 1 2 4 4 1 -1.00
User 2 2 1 3 1 2 0.33
User 3 4 2 3 2 1 0.90
User 4 3 3 2 3 1 0.19
User 5 3 2 2 2 -1.00
User 6 5 3 1 3 2 0.65
User 7 5 1 5 1 -1.00
Attack 1 2 3 2 5 -1.00
Attack 2 3 2 3 2 5 0.76
Attack 3 3 2 2 2 5 0.93
Best match: Attack 3 (correlation 0.93)
Prediction for Alice on Item 6: ?
User-based algorithm using k-nearest neighbor
with k = 1
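The mechanism can be sketched with a small user-based predictor that uses k = 1. The ratings below are hypothetical (the slide's cell positions did not survive extraction), Pearson correlation is computed over co-rated items, and the prediction is simply the nearest neighbor's rating, which is a simplifying assumption.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)
               * sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict_1nn(active, others, item):
    """Return (best neighbor, that neighbor's rating) among users who rated item."""
    candidates = {name: p for name, p in others.items() if item in p}
    best = max(candidates, key=lambda name: pearson(active, candidates[name]))
    return best, candidates[best][item]

# Hypothetical ratings: item id -> rating on a 1-5 scale.
alice = {1: 5, 2: 2, 3: 3, 4: 3}
genuine = {
    "User 1": {1: 4, 2: 2, 3: 3, 4: 2, 6: 1},
    "User 2": {1: 2, 2: 4, 3: 3, 4: 1, 6: 2},
}
# Attack profile: mimics Alice's tastes and gives the pushed item (6) the top rating.
attack = {"Attack": {1: 5, 2: 1, 3: 3, 4: 3, 6: 5}}

before = predict_1nn(alice, genuine, 6)
after = predict_1nn(alice, {**genuine, **attack}, 6)
```

Here the injected profile correlates more strongly with Alice than any genuine user, so it becomes the single neighbor and the prediction for item 6 moves from 1 to 5.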
76
A Generic Attack Profile
Profile layout: IS | IF | unrated items | it
  IS: ratings for k selected items
  IF: ratings for l filler items
  Unrated items in the attack profile: null
  it: rating for the target item
  • Attack models differ based on ratings assigned to
    filler and selected items

77
Average and Random Attack Models
Profile layout: IF | unrated items | it
  IF: random ratings for l filler items
  Unrated items in the attack profile: null
  it: rating for the target item = rmax
  • Random Attack: filler items are assigned random
    ratings drawn from the overall distribution of
    ratings on all items across the whole DB
  • Average Attack: the rating for each filler item
    is drawn from a distribution defined by the
    average rating for that item in the DB
  • The percentage of filler items determines the
    amount of knowledge (and effort) required by the
    attacker
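The two models differ only in where the filler ratings come from. A minimal sketch, assuming a 1-5 rating scale and normal distributions (the slide says only "distribution"); `build_attack_profile` and the sample data are hypothetical.

```python
import random
from statistics import mean, pstdev

R_MIN, R_MAX = 1, 5  # assumed rating scale

def clip(rating):
    """Round to an integer rating and keep it inside the scale."""
    return max(R_MIN, min(R_MAX, round(rating)))

def build_attack_profile(ratings_by_item, target, filler_count, model, rng):
    """Push-attack profile: filler ratings per the chosen model, target at R_MAX."""
    all_ratings = [r for rs in ratings_by_item.values() for r in rs]
    candidates = [item for item in ratings_by_item if item != target]
    fillers = rng.sample(candidates, filler_count)
    profile = {}
    for item in fillers:
        if model == "random":
            # Random attack: overall rating distribution of the whole DB.
            mu, sigma = mean(all_ratings), pstdev(all_ratings)
        elif model == "average":
            # Average attack: distribution around this item's own mean.
            mu, sigma = mean(ratings_by_item[item]), pstdev(ratings_by_item[item])
        else:
            raise ValueError(f"unknown attack model: {model}")
        profile[item] = clip(rng.gauss(mu, sigma))
    profile[target] = R_MAX
    return profile

ratings_by_item = {"A": [4, 5, 4], "B": [2, 1, 2], "C": [3, 3, 4], "D": [5, 4], "T": [2, 3]}
rng = random.Random(0)
random_profile = build_attack_profile(ratings_by_item, "T", 3, "random", rng)
average_profile = build_attack_profile(ratings_by_item, "T", 3, "average", rng)
```

The average attack needs per-item means, which is the extra knowledge the last bullet refers to; the random attack needs only the global distribution.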

78
Bandwagon Attack Model
Profile layout: IS | IF | unrated items | it
  IS: ratings for k frequently rated items = rmax
  IF: random ratings for l filler items
  Unrated items in the attack profile: null
  it: rating for the target item = rmax
  • What if the system's rating distribution is
    unknown?
  • Identify products that are frequently rated
    (e.g., blockbuster movies)
  • Associate the pushed product with them
  • Ratings for the filler items are centered on the
    overall system average rating (similar to the
    Random attack)
  • Frequently rated items can be guessed or obtained
    externally
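A bandwagon profile can be sketched as follows, assuming a 1-5 scale, a list of (user, item, rating) triples as input, and filler ratings drawn from a normal distribution around the system mean with an assumed standard deviation of 1.0. The function name and data are hypothetical.

```python
import random
from collections import Counter

def bandwagon_profile(ratings, target, k, filler_count, rng, r_min=1, r_max=5):
    """k most frequently rated items and the target get r_max; fillers are random."""
    counts = Counter(item for _, item, _ in ratings if item != target)
    popular = [item for item, _ in counts.most_common(k)]
    overall_mean = sum(r for *_, r in ratings) / len(ratings)
    rest = sorted(set(counts) - set(popular))
    fillers = rng.sample(rest, min(filler_count, len(rest)))
    profile = {item: r_max for item in popular}
    for item in fillers:
        # Assumed spread of 1.0 around the overall system average.
        profile[item] = max(r_min, min(r_max, round(rng.gauss(overall_mean, 1.0))))
    profile[target] = r_max
    return profile

ratings = [
    ("u1", "Blockbuster", 5), ("u2", "Blockbuster", 4), ("u3", "Blockbuster", 5),
    ("u1", "Hit", 4), ("u2", "Hit", 3),
    ("u1", "Niche", 2), ("u3", "Indie", 3),
]
profile = bandwagon_profile(ratings, "NewMovie", k=2, filler_count=1,
                            rng=random.Random(1))
```

In a real attack the popular items would be guessed from outside the system; counting them from the ratings here only keeps the example self-contained.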

79
Segment Attack Model
Profile layout: IS | IF | unrated items | it
  IS: ratings for k favorite items in user segment = rmax
  IF: ratings for l filler items = rmin
  Unrated items in the attack profile: null
  it: rating for the target item = rmax
  • Assume attacker wants to push product to a target
    segment of users
  • those with preference for similar products
  • fans of Harrison Ford
  • fans of horror movies
  • like bandwagon but for semantically-similar items
  • originally designed for attacking item-based CF
    algorithms
  • maximize sim(target item, segment items)
  • minimize sim(target item, non-segment items)
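A segment profile, sketched under the assumption of a 1-5 scale: segment items and the target get the maximum rating and filler items get the minimum, which pulls the target toward the segment items and away from everything else in an item-based similarity computation. The function name and item titles are hypothetical.

```python
import random

def segment_profile(segment_items, catalog, target, filler_count, rng,
                    r_min=1, r_max=5):
    """Segment items and target rated r_max; random non-segment fillers rated r_min."""
    pool = sorted(set(catalog) - set(segment_items) - {target})
    fillers = rng.sample(pool, min(filler_count, len(pool)))
    profile = {item: r_max for item in segment_items}
    profile.update({item: r_min for item in fillers})
    profile[target] = r_max
    return profile

catalog = ["Horror A", "Horror B", "Comedy A", "Drama A", "Drama B", "New Horror"]
profile = segment_profile(["Horror A", "Horror B"], catalog, "New Horror",
                          filler_count=2, rng=random.Random(2))
```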

80
Nuke Attacks: Love/Hate Attack Model
Profile layout: IF | unrated items | it
  IF: max rating for l filler items = rmax
  Unrated items in the attack profile: null
  it: min rating for the target item = rmin
  • A limited-knowledge attack in its simplest form
  • Target item given the minimum rating value
  • All other ratings in the filler item set are
    given the maximum rating value
  • Note
  • Variations of this (and the other models) can also
    be used as push or nuke attacks, essentially by
    switching the roles of rmin and rmax.
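The love/hate model and its push variant can be sketched in a few lines, assuming a 1-5 scale; the `nuke` flag and function name are hypothetical and simply swap the roles of rmin and rmax as the note describes.

```python
def love_hate_profile(filler_items, target, nuke=True, r_min=1, r_max=5):
    """Nuke: target gets r_min, fillers r_max. Push variant: roles swapped."""
    target_rating, filler_rating = (r_min, r_max) if nuke else (r_max, r_min)
    profile = {item: filler_rating for item in filler_items if item != target}
    profile[target] = target_rating
    return profile

nuke_profile = love_hate_profile(["A", "B", "C"], "T")
push_profile = love_hate_profile(["A", "B", "C"], "T", nuke=False)
```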

81
How Effective Can Attacks Be?
  • First: A Methodological Note
  • Using MovieLens 100K data set
  • 50 different "pushed" movies
  • selected randomly but mirroring overall
    distribution
  • 50 users randomly pre-selected
  • Results were averages over all runs for each
    movie-user pair
  • k = 20 in all experiments
  • Evaluating results
  • prediction shift
  • how much the predicted rating of the pushed movie
    differs before and after the attack
  • hit ratio
  • how often the pushed movie appears in a
    recommendation list before and after the attack
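Both measures are simple to compute once pre-attack and post-attack outputs are available. A minimal sketch with hypothetical data structures: predictions keyed by (user, movie) pairs, and top-N recommendation lists keyed by user.

```python
def prediction_shift(pre, post):
    """Mean change in predicted rating over the (user, movie) pairs in both runs."""
    pairs = pre.keys() & post.keys()
    return sum(post[p] - pre[p] for p in pairs) / len(pairs)

def hit_ratio(top_n_lists, target):
    """Fraction of users whose top-N recommendation list contains the target."""
    hits = sum(target in recs for recs in top_n_lists.values())
    return hits / len(top_n_lists)

# Hypothetical predictions for one pushed movie "m" and two users.
pre = {("u1", "m"): 2.0, ("u2", "m"): 3.0}
post = {("u1", "m"): 4.5, ("u2", "m"): 4.0}
shift = prediction_shift(pre, post)

lists_before = {"u1": ["a", "b"], "u2": ["c", "d"]}
lists_after = {"u1": ["m", "a"], "u2": ["c", "d"]}
hits_before = hit_ratio(lists_before, "m")
hits_after = hit_ratio(lists_after, "m")
```

Comparing `hits_before` with `hits_after` gives the before-and-after view of hit ratio that the slide describes.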

82
Example Results: Average Attack
  • Average attack is very effective against the
    user-based algorithm (Random not as effective)
  • Item-based CF is more robust (but vulnerable to
    other attack types such as the segment attack)
    [Burke & Mobasher, 2005]

83
Example Results: Bandwagon Attack
  • Only a small profile needed (3%-7%)
  • Only a few (< 10) popular movies needed
  • As effective as the more data-intensive average
    attack (but still not effective against
    item-based algorithms)

84
Results: Impact of Profile Size
Only a small number of filler items need to be
assigned ratings. An attacker, therefore, only
needs to use part of the product space to make
the attack effective.
In the item-based algorithm we don't see the same
drop-off, but prediction shift shows logarithmic
behavior, nearing its maximum at about 7% filler
size.
85
Example Results: Segmented Attack Against
Item-Based CF
  • Very effective against targeted group
  • Best against item-based
  • Also effective against user-based
  • Low knowledge

86
Possible Solutions
  • Explicit trust calculation?
  • select peers through network of trust
    relationships
  • law of large numbers
  • hard to achieve numbers needed for CF to work
    well
  • Hybrid recommendation
  • Some indications that some hybrids may be more
    robust
  • Model-based recommenders
  • Certain recommenders using clustering are more
    robust, but generally at the cost of less
    accuracy
  • But a probabilistic approach has been shown to be
    relatively accurate
  • Detection and Response

87
Results: Semantically Enhanced Hybrid
Semantic features extracted for movies: top
actors, director, genre, synopsis (top
keywords), etc.
Alpha = 0.0: 100% semantic item-based similarity
Alpha = 1.0: 100% collaborative item-based
similarity
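One way to read the alpha parameter is as a linear blend of two item-item similarities. The slide gives only the two endpoints, so the linear form, the use of cosine similarity, and the names below are assumptions.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def hybrid_similarity(ratings_i, ratings_j, features_i, features_j, alpha):
    """alpha = 1.0: purely collaborative; alpha = 0.0: purely semantic."""
    collaborative = cosine(ratings_i, ratings_j)
    semantic = cosine(features_i, features_j)
    return alpha * collaborative + (1 - alpha) * semantic

# Hypothetical items: rating vectors over three users, binary semantic features.
ratings_i, ratings_j = [5, 0, 3], [4, 0, 0]
features_i, features_j = [1, 1, 0], [1, 0, 0]
sim_semantic = hybrid_similarity(ratings_i, ratings_j, features_i, features_j, 0.0)
sim_collab = hybrid_similarity(ratings_i, ratings_j, features_i, features_j, 1.0)
sim_mixed = hybrid_similarity(ratings_i, ratings_j, features_i, features_j, 0.5)
```

Because attack profiles can only change the ratings, the semantic term is unaffected by them, which is one plausible reason a hybrid of this kind would be harder to manipulate.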
88
Approaches to Detection & Response
  • Profile Classification
  • Classification model to identify attack profiles
    and exclude these profiles in computing
    predictions
  • Uses the characteristic features of most
    successful attack models
  • Designed to increase cost of attacks by detecting
    most effective attacks
  • Anomaly Detection
  • Classify Items (as being possibly under attack)
  • Not dependent on known attack models
  • Can shed some light on which type of items are
    most vulnerable to which types of attacks

But what if the attack does not closely
correspond to a known attack signature?
In practice: need a comprehensive framework
combining both approaches
89
Conclusions
  • Why recommender systems?
  • Many algorithmic advances → more accurate and
    reliable systems → more confidence by users
  • Assist users in
  • Finding more relevant information, items,
    products
  • Give users alternatives → broaden user knowledge
  • Building communities
  • Help companies to
  • Better engage users and customers → building
    loyalty
  • Increase sales (on average 5-10%)
  • Problems and challenges
  • More complex Web-based applications → more
    complex user interactions → need more
    sophisticated models
  • Need to further explore the impact of
    recommendations on (a) user behavior and (b) on
    the evolution of Web communities
  • Privacy, security, trust