Title: ECommerce WS 0607 S. 1
1 E-Commerce: Personalization and Recommender Systems
2 Definition: Recommender Systems
- The term was coined in 1997 by Varian and Resnick (Communications of the ACM, Vol. 40, No. 3)
- "In everyday life, we rely on recommendations of other people, either by word of mouth, recommendation letters, movie and book reviews... Recommender systems assist and augment this natural social process"
- ⇒ Recommender systems are applications that provide recommendations to an actor in a decision process
Hal R. Varian
Paul Resnick
3 (Pre)history
- 1982: In the President's Letter of the journal Communications of the ACM, Peter J. Denning suggests tagging e-mails with an indicator of their importance. The weighting of this indicator is determined by all members.
- 1986: At the XEROX Palo Alto Research Center (XPARC), the Tapestry system is created to channel the flood of information. In this system, content-based filtering can be performed, and document recommendations from other employees can be viewed (collaborative filtering).
Peter J. Denning
4 Recommended References
- Shardanand, U., and Maes, P. (1995). Social Information Filtering: Algorithms for Automating "Word of Mouth". In Proceedings of CHI95, 210-217. http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/us_bdy.htm
- Billsus, D., and Pazzani, M. J. (1998). Learning Collaborative Information Filters. In The 15th International Conference on Machine Learning, ICML-98. http://www.ics.uci.edu/dbillsus/papers/icml98.pdf
- Smyth, B., and Cotter, P. Surfing the Digital Wave: Generating Personalised TV Listings using Collaborative, Case-Based Recommendation. In Proceedings of the Third International Conference on Case-Based Reasoning, ICCBR99, Springer.
- Berkeley School of Information Systems, Link Collection on Collaborative Filtering. http://www.sims.berkeley.edu/resources/collab/
5 Content-based Filtering
6 Content-based Filtering/Selection
- Filtering and selection basically mean the same thing
  - Filtering: removing certain objects from a universe
  - Selection: picking certain objects from a universe
- A representation of products is required, together with a notion of similarity between demands and products
- Roots of content-based filtering: INFORMATION RETRIEVAL
7 Information Retrieval
- Analyzing the textual content of individual Web pages
  - given a user's query
  - determine a maximally related subset of documents
- Retrieval
  - index a collection of documents (access efficiency)
  - rank documents by importance (accuracy)
- Categorization (classification)
  - assign a document to one or more categories
8 Inverted Index
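The slide itself carried only a figure. As a minimal sketch, assuming an inverted index in the usual sense (each term maps to a sorted bucket of the document IDs, DIDs, that contain it), the structure could look as follows. The toy documents and the function names `build_inverted_index` and `boolean_and` are illustrative assumptions, not part of the lecture.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to a sorted list (bucket) of the DIDs that contain it."""
    index = defaultdict(set)
    for did, text in docs.items():
        for term in text.lower().split():
            index[term].add(did)
    return {term: sorted(dids) for term, dids in index.items()}


def boolean_and(index, *terms):
    """Return the DIDs of the documents that contain every query term."""
    postings = [set(index.get(t.lower(), [])) for t in terms]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical toy collection (DID -> text), chosen only for illustration.
docs = {1: "web graph", 2: "web graph net", 3: "web page complex"}
index = build_inverted_index(docs)
print(index["graph"])                      # bucket for the term "graph"
print(boolean_and(index, "web", "graph"))  # boolean query: web AND graph
```

Keeping each bucket sorted by DID is what makes the gap-based compression on the next slide possible.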
9 Bucket Compression
- Reduce the memory needed for each pointer in the buckets
  - for each term, sort occurrences by DID
  - store them as a list of gaps: the sequence of differences between successive DIDs
- Advantage: significant memory saving
  - frequent terms produce many small gaps
  - small integers are encoded by short variable-length codewords
- Example
  - the sequence of DIDs: (14, 22, 38, 42, 66, 122, 131, 226)
  - the sequence of gaps: (14, 8, 16, 4, 24, 56, 9, 95)
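The gap transformation from the example can be sketched in a few lines. The slide does not say which variable-length code is meant; the variable-byte code below (7 payload bits per byte, high bit set on the last byte) is one common choice and is an assumption here, as are the function names.

```python
def to_gaps(dids):
    """Turn a list of DIDs into the differences between successive DIDs."""
    gaps, prev = [], 0
    for did in sorted(dids):
        gaps.append(did - prev)
        prev = did
    return gaps


def from_gaps(gaps):
    """Recover the original DIDs by accumulating the gaps."""
    dids, total = [], 0
    for gap in gaps:
        total += gap
        dids.append(total)
    return dids


def vbyte_encode(n):
    """Variable-byte code (assumed scheme): 7 bits per byte, high bit marks the last byte."""
    out = []
    while True:
        out.insert(0, n % 128)
        if n < 128:
            break
        n //= 128
    out[-1] += 128
    return bytes(out)


dids = [14, 22, 38, 42, 66, 122, 131, 226]
gaps = to_gaps(dids)
print(gaps)
print(sum(len(vbyte_encode(d)) for d in dids), "bytes for raw DIDs")
print(sum(len(vbyte_encode(g)) for g in gaps), "bytes for gaps")
```

On a bucket this short the saving is small; the effect grows for frequent terms, whose long buckets consist mostly of small gaps.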
10 Content Based Ranking
- A boolean query
  - results in several matching documents
  - e.g., a user query in Google, "Web AND graphs", results in 4,040,000 matches
- Problem
  - the user can examine only a fraction of the results
- Content-based ranking
  - arrange the results in order of relevance to the user
11 Choice of Weights
Which weights retrieve the most relevant pages?
12 Vector-space Model
- Text documents are mapped to a high-dimensional vector space
- Each document $d$
  - is represented as a sequence of terms $\omega(t)$
  - $d = (\omega(1), \omega(2), \omega(3), \dots, \omega(|d|))$
- Unique terms in a set of documents
  - determine the dimension of the vector space
13 Example
Boolean representation of vectors:
V  = {web, graph, net, page, complex}
V1 = [1 1 0 0 0]
V2 = [1 1 1 0 0]
V3 = [1 0 0 1 1]
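A short sketch of how such boolean vectors arise from the vocabulary. The term lists for the three documents are read back from the vectors on the slide; the function name `boolean_vector` is an illustrative assumption.

```python
vocabulary = ["web", "graph", "net", "page", "complex"]


def boolean_vector(terms, vocab):
    """Return 1 for every vocabulary term that occurs in the document, else 0."""
    present = set(terms)
    return [1 if term in present else 0 for term in vocab]


# Term lists inferred from the vectors V1..V3 on the slide.
documents = {
    "V1": ["web", "graph"],
    "V2": ["web", "graph", "net"],
    "V3": ["web", "page", "complex"],
}
vectors = {name: boolean_vector(terms, vocabulary) for name, terms in documents.items()}
for name, vec in vectors.items():
    print(name, vec)
```

The length of every vector equals the number of unique terms, which is why the vocabulary determines the dimension of the space.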
14 Vector-space Model
- $\omega_1$, $\omega_2$ and $\omega_3$ are terms in the document; $x$ and $x'$ are document vectors
- Vector-space representations are sparse: $|V| \gg |d|$
15 Document Similarity
- Ranks documents by measuring the similarity between each document and the query
- Similarity between two documents $d$ and $d'$ is a function $s(d, d') \in \mathbb{R}$
- In a vector-space representation, the cosine coefficient of two document vectors is a measure of similarity
16 Cosine Coefficient
- The cosine of the angle formed by two document vectors $x$ and $x'$ is
  $\cos(x, x') = \frac{x^T x'}{\|x\| \, \|x'\|}$
- Documents with many common terms will have vectors closer to each other than documents with fewer overlapping terms
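As a minimal sketch, the cosine coefficient can be computed directly from its standard definition (dot product divided by the product of the vector lengths). The input vectors are the boolean vectors from the earlier example slide; returning 0.0 for an all-zero vector is a convention assumed here.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equally long document vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumed convention for empty documents
    return dot / (norm_x * norm_y)


v1 = [1, 1, 0, 0, 0]
v2 = [1, 1, 1, 0, 0]
v3 = [1, 0, 0, 1, 1]
print(round(cosine(v1, v2), 3))  # two shared terms
print(round(cosine(v1, v3), 3))  # one shared term
```

V1 shares two terms with V2 but only one with V3, so V2 comes out as the closer document.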
17 Example of the Cosine Similarity Function
- Four documents on the topic of astronomy are available.
- The actor has read text 1 and wants to know which of the three other texts is most similar to it.
- To this end, the actor has defined three keywords and assigned each a weighting factor.
18 Latent Semantic Analysis
- Why is it needed?
  - serious problems for retrieval methods based on term matching
    - the vector-space similarity approach works only if the terms of the query are explicitly present in the relevant documents
  - rich expressive power of natural language
    - queries often contain terms that express concepts related to the text to be retrieved
19 Synonymy and Polysemy
- Synonymy
  - the same concept can be expressed using different sets of terms
  - e.g. bandit, brigand, thief
  - negatively affects recall
- Polysemy
  - identical terms can be used in very different semantic contexts
  - e.g. bank
    - repository where important material is saved
    - the slope beside a body of water
  - negatively affects precision
20 Latent Semantic Indexing (LSI)
- A statistical technique
- Uses a linear algebra technique called singular value decomposition (SVD)
  - attempts to estimate the hidden structure
  - discovers the most important associative patterns between words and concepts
- Data driven
21 LSI and Text Documents
- Let $X$ denote a term-document matrix
  - $X = [x_1 \dots x_n]^T$
  - each row is the vector-space representation of a document
  - each column contains the occurrences of a term in each document in the dataset
- Latent semantic indexing
  - compute the SVD of $X$: $X = U \Sigma V^T$
  - $\Sigma$: singular value matrix
  - set to zero all but the largest $K$ singular values: $\hat{\Sigma}$
  - obtain the reconstruction of $X$ by $\hat{X} = U \hat{\Sigma} V^T$
22 SVD: Mathematical Background
23 SVD for Filtering
[Figure: m × n matrix]
24 LSI Example
- A collection of documents
  - d1: Indian government goes for open-source software
  - d2: Debian 3.0 Woody released
  - d3: Wine 2.0 released with fixes for Gentoo 1.4 and Debian 3.0
  - d4: gnuPOD released: iPOD on Linux with GPLed software
  - d5: Gentoo servers running at open-source mySQL database
  - d6: Dolly the sheep not totally identical clone
  - d7: DNA news: introduced low-cost human genome DNA chip
  - d8: Malaria-parasite genome database on the Web
  - d9: UK sets up genome bank to protect rare sheep breeds
  - d10: Dolly's DNA damaged
25 LSI Example
- The term-document matrix $X^T$

term         d1  d2  d3  d4  d5  d6  d7  d8  d9  d10
open-source   1   0   0   0   1   0   0   0   0   0
software      1   0   0   1   0   0   0   0   0   0
Linux         0   0   0   1   0   0   0   0   0   0
released      0   1   1   1   0   0   0   0   0   0
Debian        0   1   1   0   0   0   0   0   0   0
Gentoo        0   0   1   0   1   0   0   0   0   0
database      0   0   0   0   1   0   0   1   0   0
Dolly         0   0   0   0   0   1   0   0   0   1
sheep         0   0   0   0   0   1   0   0   0   0
genome        0   0   0   0   0   0   1   1   1   0
DNA           0   0   0   0   0   0   2   0   0   1
26 LSI Example
- The reconstructed term-document matrix $\hat{X}^T$ after projecting on a subspace of dimension $K = 2$
- $\Sigma$ = diag(2.57, 2.49, 1.99, 1.9, 1.68, 1.53, 0.94, 0.66, 0.36, 0.10)

term            d1     d2     d3     d4     d5     d6     d7     d8     d9    d10
open-source   0.34   0.28   0.38   0.42   0.24   0.00   0.04   0.07   0.02   0.01
software      0.44   0.37   0.50   0.55   0.31  -0.01  -0.03   0.06   0.00  -0.02
Linux         0.44   0.37   0.50   0.55   0.31  -0.01  -0.03   0.06   0.00  -0.02
released      0.63   0.53   0.72   0.79   0.45  -0.01  -0.05   0.09  -0.00  -0.04
Debian        0.39   0.33   0.44   0.48   0.28  -0.01  -0.03   0.06   0.00  -0.02
Gentoo        0.36   0.30   0.41   0.45   0.26   0.00   0.03   0.07   0.02   0.01
database      0.17   0.14   0.19   0.21   0.14   0.04   0.25   0.11   0.09   0.12
Dolly        -0.01  -0.01  -0.01  -0.02   0.03   0.08   0.45   0.13   0.14   0.21
sheep        -0.00  -0.00  -0.00  -0.01   0.03   0.06   0.34   0.10   0.11   0.16
genome        0.02   0.01   0.02   0.01   0.10   0.19   1.11   0.34   0.36   0.53
DNA          -0.03  -0.04  -0.04  -0.06   0.11   0.30   1.70   0.51   0.55   0.81
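The LSI procedure (compute the SVD, keep only the K largest singular values, reconstruct) can be sketched with NumPy's `numpy.linalg.svd`. The matrix below is the term-document matrix as transcribed from the slides, and the function name `lsi_reconstruct` is an illustrative assumption. The printed numbers are not guaranteed to match the slide values digit for digit, since the transcription may differ from the data the lecturer used.

```python
import numpy as np

terms = ["open-source", "software", "Linux", "released", "Debian", "Gentoo",
         "database", "Dolly", "sheep", "genome", "DNA"]

# Rows are terms, columns are documents d1..d10 (matrix as transcribed).
Xt = np.array([
    [1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 2, 0, 0, 1],
], dtype=float)


def lsi_reconstruct(matrix, k):
    """Rank-k reconstruction: zero all but the k largest singular values."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    s_trunc = s.copy()
    s_trunc[k:] = 0.0
    # U * s_trunc scales column j of U by the j-th singular value.
    return (U * s_trunc) @ Vt, s


Xt_hat, singular_values = lsi_reconstruct(Xt, k=2)
print(np.round(singular_values, 2))
for term, row in zip(terms, np.round(Xt_hat, 2)):
    print(f"{term:12s}", row)
```

After the projection, terms that never co-occur directly can still receive non-zero weights in related documents, which is how LSI eases the synonymy problem.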
27 Advantages and Disadvantages of Content-Based Filtering
- Advantages
  - Simple technique
  - New objects (e.g., new CDs) can be integrated into the system quickly if their attributes are easy to capture
- Disadvantages
  - Content-based filtering cannot give truly novel recommendations
  - Dissimilar objects that have little in common with other objects are hard to recommend
  - The choice of attributes and their weighting must be right for a good recommendation to be made
28 Collaborative Filtering
29 Collaborative Filtering Approach: Basic Idea
- Select items based on aggregated user ratings of those items
- You buy an item only because many of your friends (who share the same interests as you) bought it and like it, although you don't really know anything about the product.
- Consider ratings of similar users (customers) only
- Requires stored user profiles of the kind
  - Customer C1 likes (buys) products p1, p4, p8
  - Customer C2 likes (buys) products p1, p2, p8
  - ...
30 Collaborative Filtering Approach
- Users 1, 2 and 3 are similar since they all bought products A, B, and C
- D and E can be recommended to User 1 based on this shared interest
- Recommendation based on observations
  - no detailed representation of D or E
  - users must be identified, i.e., a user profile must be available
[Figure: Users 1, 2 and 3 and the products A,...,F they bought]
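The idea on this slide can be sketched without any description of the products themselves, using only who bought what. The exact baskets in the figure are not recoverable, so the baskets below, the extra "User 4", the `min_shared` threshold, and the function name `recommend` are all hypothetical.

```python
from collections import Counter

# Hypothetical purchase profiles; only "all three bought A, B, C" is from the slide.
baskets = {
    "User 1": {"A", "B", "C"},
    "User 2": {"A", "B", "C", "D"},
    "User 3": {"A", "B", "C", "E"},
    "User 4": {"F"},
}


def recommend(target, baskets, min_shared=2):
    """Recommend items bought by users who share at least min_shared items with target."""
    own = baskets[target]
    votes = Counter()
    for user, items in baskets.items():
        if user == target:
            continue
        if len(own & items) >= min_shared:  # similar user
            votes.update(items - own)       # count only items the target lacks
    # Most frequently co-bought items first, ties broken alphabetically.
    return [item for item, _ in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))]


print(recommend("User 1", baskets))
```

F is not recommended to User 1 in this sketch because the only buyer of F shares no purchases with User 1.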
31 Collaborative Filtering
- Representation of input data
- Neighborhood formation
- Prediction/Top-N recommendation
32 Challenges of RS
- Sparsity
  - May hide good neighbors
  - Results in poor quality and reduced coverage
- Scalability
  - Enormous size of the customer-product matrix
  - Slow neighborhood search
  - Slow prediction generation
34 Collaborative Filtering: Memory-Based Approach vs. Model-Based Approach
- There are two approaches to computing the best neighborhood: the memory-based approach and the model-based approach.
- The memory-based approach uses all available data to compute the neighborhood.
- The model-based approach builds a probabilistic model from a subset of the data.
35 Collaborative Filtering: Memory-Based Approach vs. Model-Based Approach
- A memory-based approach can incorporate new data immediately and incrementally, whereas a model-based approach can do so only after a learning phase.
- Computing a recommendation takes longer in a memory-based approach than in a model-based approach that has completed its learning phase.
- As the amount of data grows, a memory-based approach becomes slower, because memory requirements and computation time increase linearly.
- For model-based approaches, very large data sets carry the risk of a prohibitively long learning phase.
36 Advantages and Disadvantages of Collaborative Filtering
- Advantages
  - The actor is also recommended unfamiliar objects whose content differs from those acquired so far
  - Dissimilar objects can also be recommended (if they have been bought or rated by another participant)
- Disadvantages
  - Cold-start problem: new objects cannot be recommended until they have been sold for the first time
  - Computing the neighborhood is expensive and memory-intensive (memory-based approach)
  - With few purchases/ratings, recommendation quality drops
  - Sparsity problem: the participant-object matrix is not sufficiently filled
37 Standard approach (1)
- Customer $U$ gives ratings $U_x$ for certain products $x \in P_U$
- A rating $U_x$ is a value from an ordered set, e.g., an integer value 1..7: 1 = don't like at all ... 4 = neutral ... 7 = great stuff
- Note: not every customer rates every product
- Determine the similarity of customers $U$ and $V$ based on the similarity of their ratings of those products both have rated, i.e., $P_{U \cap V}$.
38 Standard approach (2): Distance/Similarity Measures for Customers
- Given two customers $U$ and $V$
- Mean Squared Difference (distance measure)
- Pearson correlation coefficient $r_{UV}$ may be better
  - $r_{UV} > 0$: positively related
  - $r_{UV} = 0$: not related
  - $r_{UV} < 0$: negatively related
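The formulas themselves did not survive on this slide, so the sketch below falls back on the standard textbook definitions, restricted to the products both customers have rated. Computing each customer's mean over the co-rated products only is an assumption (the lecture may have used the mean over all of a customer's ratings), and the ratings and function names are hypothetical.

```python
import math


def co_rated(u, v):
    """Products rated by both customers."""
    return sorted(set(u) & set(v))


def mean_squared_difference(u, v):
    """Average squared rating difference over co-rated products (a distance)."""
    items = co_rated(u, v)
    if not items:
        return None  # no overlap, distance undefined
    return sum((u[x] - v[x]) ** 2 for x in items) / len(items)


def pearson(u, v):
    """Pearson correlation over co-rated products; None if undefined."""
    items = co_rated(u, v)
    if len(items) < 2:
        return None
    mean_u = sum(u[x] for x in items) / len(items)  # assumption: mean over co-rated items
    mean_v = sum(v[x] for x in items) / len(items)
    cov = sum((u[x] - mean_u) * (v[x] - mean_v) for x in items)
    var_u = sum((u[x] - mean_u) ** 2 for x in items)
    var_v = sum((v[x] - mean_v) ** 2 for x in items)
    if var_u == 0 or var_v == 0:
        return None  # a constant rater has no defined correlation
    return cov / math.sqrt(var_u * var_v)


# Hypothetical ratings on the 1..7 scale.
U = {"p1": 1, "p2": 4, "p3": 7}
V = {"p1": 2, "p2": 4, "p3": 6, "p4": 7}
W = {"p1": 7, "p2": 4, "p3": 1}
print(mean_squared_difference(U, V), pearson(U, V), pearson(U, W))
```

U and V agree in tendency but not in magnitude, so their mean squared difference is non-zero while their correlation is maximal. This is one reason the correlation coefficient may be the better measure.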
39 Standard approach (3): Determining Recommendations
- The profile of a new customer $W$ is compared to the profiles of all known users $U$, and the similarity/distance $r_{WU}$ is determined
- Users whose profile similarity exceeds a certain threshold are selected
- The rating for an item is a weighted average of the ratings of similar users for that item
- Products with the highest rating $W_x$ are recommended to $W$
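These steps can be sketched as follows, taking the similarities $r_{WU}$ as given. Using the similarity itself as the weight, the threshold of 0.5, and all names and numbers are assumptions made for illustration. A positive threshold also keeps the sum of weights positive.

```python
def predict_ratings(similarities, ratings, threshold=0.5):
    """Similarity-weighted average rating per item over sufficiently similar users."""
    weighted_sum, weight_sum = {}, {}
    for user, sim in similarities.items():
        if sim <= threshold:  # keep only users above the (positive) threshold
            continue
        for item, rating in ratings.get(user, {}).items():
            weighted_sum[item] = weighted_sum.get(item, 0.0) + sim * rating
            weight_sum[item] = weight_sum.get(item, 0.0) + sim
    return {item: weighted_sum[item] / weight_sum[item] for item in weighted_sum}


def top_n(predictions, already_rated, n=3):
    """Highest predicted items that the customer has not rated yet."""
    candidates = [(item, score) for item, score in predictions.items()
                  if item not in already_rated]
    return sorted(candidates, key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical similarities r_WU of known users to the new customer W.
similarities = {"U1": 0.9, "U2": 0.6, "U3": 0.2}
ratings = {
    "U1": {"p1": 7, "p2": 2},
    "U2": {"p1": 4, "p3": 6},
    "U3": {"p2": 7},
}
predictions = predict_ratings(similarities, ratings)
print(top_n(predictions, already_rated=set(), n=2))
```

U3 falls below the threshold, so its high rating for p2 does not influence the prediction.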
40 Shortcomings of the standard approach
- Correlation is based only on items which two customers have in common
  - When thousands of items are available, there is only little overlap!
  - Recommendations are then based on only a few observations
- "Lucky" customers who have bought/rated only good items get pulled down by subtracting the customer's average rating
- The correlation coefficient is not transitive, but customer similarity is at least to some degree transitive
  - If A and B are correlated and B and C are correlated, then A and C should also be correlated
41 Two-Layer Graph Model (Huang 04)
42 Two-Layer Graph Model
43 Two-Layer Graph Model
- 2-degree association C1-P1: 0.5 × 0.6 = 0.3
- 3-degree association C1-P1: 0.3 + 0.21 + 0.12 + 0.28 = 0.91