ECommerce WS 0607 S. 1 - PowerPoint PPT Presentation


Provided by: stoc158


1
E-Commerce: Personalization and Recommender
Systems
2
Definition of Terms: Recommender Systems

The term was coined in 1997 by Varian and Resnick
(Communications of the ACM, Vol. 40, No. 3):
"In everyday life, we rely on recommendations of
other people, either by word of mouth,
recommendation letters, movie and book reviews...
Recommender systems assist and augment this
natural social process."
→ Recommender systems are applications that
provide an actor with recommendations in a
decision process.

Hal R. Varian
Paul Resnick
3
(Pre)history

1982: In the President's Letter of the journal
Communications of the ACM, Peter J. Denning
suggests tagging e-mails with an indicator of
their importance. The weighting of this indicator
is determined by all members.
1986: At the XEROX Palo Alto Research Center
(XPARC), the Tapestry system is created to
channel the flood of information. In this system,
users can apply content-based filters (Content
Based Filtering) and can also consult document
recommendations from other staff members
(Collaborative Filtering).

Peter J. Denning
4
Recommended References

Shardanand, U., and Maes, P. (1995). Social
Information Filtering: Algorithms for Automating
"Word of Mouth". In Proceedings of CHI '95,
210-217.
http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/us_bdy.htm
Billsus, D., and Pazzani, M.J. (1998). Learning
Collaborative Information Filters. In The 15th
International Conference on Machine Learning,
ICML-98.
http://www.ics.uci.edu/dbillsus/papers/icml98.pdf
Smyth, B., and Cotter, P. (1999). Surfing the
Digital Wave: Generating Personalised TV Listings
using Collaborative, Case-Based Recommendation.
In Proceedings of the Third International
Conference on Case-Based Reasoning, ICCBR-99,
Springer.
Berkeley School of Information Systems, Link
Collection on Collaborative Filtering.
http://www.sims.berkeley.edu/resources/collab/

5
Content-based Filtering
6
Content-based Filtering/Selection

Filtering and selection basically mean the same
thing
Filtering: removing certain objects from a
universe
Selection: picking certain objects from a
universe
Both require a representation of products and a
notion of similarity between demands and products
Roots of content-based filtering: INFORMATION
RETRIEVAL

7
Information Retrieval

Analyzing the textual content of individual Web
pages:
given a user's query,
determine a maximally related subset of documents
Retrieval:
index a collection of documents (access
efficiency)
rank documents by importance (accuracy)
Categorization (classification):
assign a document to one or more categories

8
Inverted Index
9
Bucket Compression

Reduce the memory needed for each pointer in the
buckets:
for each term, sort occurrences by document ID
(DID)
store them as a list of gaps, i.e. the sequence
of differences between successive DIDs
Advantage: significant memory saving
frequent terms produce many small gaps
small integers are encoded by short
variable-length codewords
Example:
the sequence of DIDs (14, 22, 38, 42, 66, 122,
131, 226)
becomes the sequence of gaps (14, 8, 16, 4, 24,
56, 9, 95)
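
The gap idea above can be sketched in a few lines of Python. The slide does not name a specific codeword scheme, so the variable-byte code used here is an assumption chosen for illustration, and the helper names `to_gaps`, `from_gaps` and `vbyte_encode` are hypothetical.

```python
from typing import List


def to_gaps(doc_ids: List[int]) -> List[int]:
    """Turn a sorted DID list into the first DID followed by successive differences."""
    gaps = []
    previous = 0
    for did in doc_ids:
        gaps.append(did - previous)
        previous = did
    return gaps


def from_gaps(gaps: List[int]) -> List[int]:
    """Invert to_gaps by accumulating the differences."""
    doc_ids = []
    total = 0
    for gap in gaps:
        total += gap
        doc_ids.append(total)
    return doc_ids


def vbyte_encode(numbers: List[int]) -> bytes:
    """Variable-byte code (an assumed scheme): 7 payload bits per byte,
    with the high bit set on the last byte of each number."""
    out = bytearray()
    for n in numbers:
        chunk = [n & 0x7F]
        n >>= 7
        while n > 0:
            chunk.append(n & 0x7F)
            n >>= 7
        chunk.reverse()      # most significant 7-bit group first
        chunk[-1] |= 0x80    # mark the final byte of this number
        out.extend(chunk)
    return bytes(out)


dids = [14, 22, 38, 42, 66, 122, 131, 226]
gaps = to_gaps(dids)
print(gaps)                                              # [14, 8, 16, 4, 24, 56, 9, 95]
print(len(vbyte_encode(dids)), len(vbyte_encode(gaps)))  # 10 8
```

With this particular code, every gap in the example fits in one byte, whereas the two raw DIDs above 127 need two bytes each, so the saving is small here and grows with longer posting lists.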

10
Content Based Ranking

A Boolean query
results in several matching documents
e.g., the Google query "Web AND graphs"
results in 4,040,000 matches
Problem:
the user can examine only a fraction of the
results
Content-based ranking:
arrange the results in order of relevance to the
user

11
Choice of Weights
What weights retrieve most relevant pages?
12
Vector-space Model

Text documents are mapped to a high-dimensional
vector space
Each document d is
represented as a sequence of terms $\omega(t)$:
$d = (\omega(1), \omega(2), \omega(3), \ldots, \omega(|d|))$
The unique terms in a set of documents
determine the dimension of the vector space

13
Example
Boolean representation of vectors
V  = [web, graph, net, page, complex]
V1 = [1, 1, 0, 0, 0]
V2 = [1, 1, 1, 0, 0]
V3 = [1, 0, 0, 1, 1]
14
Vector-space Model

$\omega_1$, $\omega_2$ and $\omega_3$ are terms in the document; x and
x' are document vectors
Vector-space representations are sparse:
$|V| \gg |d|$

15
Document Similarity

Ranks documents by measuring the similarity
between each document and the query
Similarity between two documents d and d' is a
function $s(d, d') \in \mathbb{R}$
In a vector-space representation, the cosine
coefficient of two document vectors is a measure
of similarity

16
Cosine Coefficient

The cosine of the angle formed by two document
vectors x and x' is
$\cos(x, x') = \frac{x^T x'}{\|x\| \, \|x'\|}$
Documents with many common terms will have
vectors closer to each other than documents with
fewer overlapping terms
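
As a small illustration of the cosine coefficient, the following Python sketch applies it to the three Boolean vectors from the earlier example slide. The function name `cosine` and the zero-vector convention are assumptions made for this sketch, not part of the slides.

```python
import math
from typing import Sequence


def cosine(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between two equally long vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention assumed here for all-zero vectors
    return dot / (norm_x * norm_y)


# Boolean vectors over the vocabulary [web, graph, net, page, complex]
v1 = [1, 1, 0, 0, 0]
v2 = [1, 1, 1, 0, 0]
v3 = [1, 0, 0, 1, 1]

print(round(cosine(v1, v2), 3))  # 0.816 (two shared terms)
print(round(cosine(v1, v3), 3))  # 0.408 (one shared term)
```

V1 shares two terms with V2 but only one with V3, so V2 ranks as the more similar document.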

17
Example of the Cosine Similarity Function

Four documents on the topic of astronomy are
available.
The actor has read text 1 and wants to know
which of the other three texts is most similar to
the first.
To find out, the actor has defined three key
terms and assigned each a weighting factor.

18
Latent Semantic Analysis

Why is it needed?
retrieval methods based on term matching have
serious problems
the vector-space similarity approach works only
if the terms of the query are explicitly present
in the relevant documents
natural language has rich expressive power
queries often contain terms that express concepts
related to the text to be retrieved

19
Synonymy and Polysemy

Synonymy
the same concept can be expressed using different
sets of terms
e.g. bandit, brigand, thief
negatively affects recall
Polysemy
identical terms can be used in very different
semantic contexts
e.g. bank
repository where important material is saved
the slope beside a body of water
negatively affects precision

20
Latent Semantic Indexing(LSI)

A statistical technique
Uses a linear algebra technique called singular
value decomposition (SVD)
attempts to estimate the hidden structure
discovers the most important associative patterns
between words and concepts
Data driven

21
LSI and Text Documents

Let X denote a term-document matrix
$X = [x_1 \; \ldots \; x_n]^T$
each row is the vector-space representation of a
document
each column contains the occurrences of a term in
each document in the dataset
Latent semantic indexing:
compute the SVD of X: $X = U \Sigma V^T$
$\Sigma$ is the singular value matrix
set to zero all but the largest K singular values
to get $\hat{\Sigma}$
obtain the reconstruction of X by
$\hat{X} = U \hat{\Sigma} V^T$

22
SVD Mathematical Background
23
SVD for Filtering
24
LSI Example

A collection of documents
d1: Indian government goes for open-source
software
d2: Debian 3.0 Woody released
d3: Wine 2.0 released with fixes for Gentoo 1.4
and Debian 3.0
d4: gnuPOD released: iPOD on Linux with GPLed
software
d5: Gentoo servers running at open-source mySQL
database
d6: Dolly the sheep not totally identical clone
d7: DNA news: introduced low-cost human genome
DNA chip
d8: Malaria-parasite genome database on the Web
d9: UK sets up genome bank to protect rare sheep
breeds
d10: Dolly's DNA damaged

25
LSI Example

The term-document matrix X^T (rows: terms, columns: documents)
             d1  d2  d3  d4  d5  d6  d7  d8  d9  d10
open-source   1   0   0   0   1   0   0   0   0   0
software      1   0   0   1   0   0   0   0   0   0
Linux         0   0   0   1   0   0   0   0   0   0
released      0   1   1   1   0   0   0   0   0   0
Debian        0   1   1   0   0   0   0   0   0   0
Gentoo        0   0   1   0   1   0   0   0   0   0
database      0   0   0   0   1   0   0   1   0   0
Dolly         0   0   0   0   0   1   0   0   0   1
sheep         0   0   0   0   0   1   0   0   0   0
genome        0   0   0   0   0   0   1   1   1   0
DNA           0   0   0   0   0   0   2   0   0   1

26
LSI Example

The reconstructed term-document matrix
after projecting on a subspace of dimension K = 2
Σ = diag(2.57, 2.49, 1.99, 1.9, 1.68, 1.53, 0.94,
0.66, 0.36, 0.10)
                d1     d2     d3     d4     d5     d6     d7     d8     d9    d10
open-source   0.34   0.28   0.38   0.42   0.24   0.00   0.04   0.07   0.02   0.01
software      0.44   0.37   0.50   0.55   0.31  -0.01  -0.03   0.06   0.00  -0.02
Linux         0.44   0.37   0.50   0.55   0.31  -0.01  -0.03   0.06   0.00  -0.02
released      0.63   0.53   0.72   0.79   0.45  -0.01  -0.05   0.09  -0.00  -0.04
Debian        0.39   0.33   0.44   0.48   0.28  -0.01  -0.03   0.06   0.00  -0.02
Gentoo        0.36   0.30   0.41   0.45   0.26   0.00   0.03   0.07   0.02   0.01
database      0.17   0.14   0.19   0.21   0.14   0.04   0.25   0.11   0.09   0.12
Dolly        -0.01  -0.01  -0.01  -0.02   0.03   0.08   0.45   0.13   0.14   0.21
sheep        -0.00  -0.00  -0.00  -0.01   0.03   0.06   0.34   0.10   0.11   0.16
genome        0.02   0.01   0.02   0.01   0.10   0.19   1.11   0.34   0.36   0.53
DNA          -0.03  -0.04  -0.04  -0.06   0.11   0.30   1.70   0.51   0.55   0.81
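
The rank-K projection used in this example can be sketched with NumPy as follows. The input matrix is transcribed from the term-document slide (terms as rows, documents d1 to d10 as columns); the helper name `truncate_svd` is hypothetical, and the sketch makes no claim to reproduce the printed figures digit for digit, since the preprocessing behind the slide is not known.

```python
import numpy as np

# Rows are terms, columns are documents d1..d10 (transcribed from the slide).
Xt = np.array([
    [1, 0, 0, 0, 1, 0, 0, 0, 0, 0],  # open-source
    [1, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # software
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],  # Linux
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # released
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],  # Debian
    [0, 0, 1, 0, 1, 0, 0, 0, 0, 0],  # Gentoo
    [0, 0, 0, 0, 1, 0, 0, 1, 0, 0],  # database
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 1],  # Dolly
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],  # sheep
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0],  # genome
    [0, 0, 0, 0, 0, 0, 2, 0, 0, 1],  # DNA
], dtype=float)


def truncate_svd(matrix: np.ndarray, k: int):
    """Return the rank-k reconstruction and the full list of singular values."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_k = s.copy()
    s_k[k:] = 0.0                  # keep only the k largest singular values
    return (u * s_k) @ vt, s       # u * s_k scales the columns of u


X_hat, singular_values = truncate_svd(Xt, k=2)
print(np.round(singular_values, 2))
print(np.round(X_hat, 2))
```

The point to look for in the output is that terms which never co-occur with a document can still receive a non-zero weight for it, because the two retained dimensions group related terms together.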

27
Advantages and Disadvantages of Content-Based Filtering

Advantages
Simple technique
New objects (e.g. new CDs) can be integrated
into the system quickly if their attributes are
easy to capture

Disadvantages
Content-based filtering cannot give truly novel
recommendations
Dissimilar objects that have little in common
with other objects are hard to recommend
The choice of attributes and their weighting
must be right for a good recommendation to be
made

28
Collaborative Filtering
29
Collaborative Filtering Approach: Basic Idea

Select items based on aggregated user ratings of
those items
You buy an item only because many of your
friends (who share your interests) bought it and
like it, although you don't really know anything
about the product.
Consider ratings of similar users (customers)
only
Requires stored user profiles of the kind
Customer C1 likes (buys) products p1, p4, p8
Customer C2 likes (buys) products p1, p2, p8
...

30
Collaborative Filtering Approach
Products A,...,F

Users 1, 2 and 3 are similar since they all
bought products A, B and C
D and E can be recommended to User 1 based on
this shared interest
Recommendation is based on observations:
no detailed representation of D or E is needed
users must be identified, i.e., a user profile
must be available
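
The idea on this slide can be sketched in Python as a simple overlap-based recommender. The purchase profiles below are hypothetical (the slide's figure does not survive in the transcript), and the function `recommend` and its `min_shared` parameter are names invented for this sketch.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical purchase profiles, chosen so that Users 1-3 share A, B and C.
purchases: Dict[str, Set[str]] = {
    "User 1": {"A", "B", "C"},
    "User 2": {"A", "B", "C", "D"},
    "User 3": {"A", "B", "C", "D", "E"},
    "User 4": {"F"},
}


def recommend(target: str, profiles: Dict[str, Set[str]], min_shared: int = 2) -> List[str]:
    """Recommend products bought by users who share at least min_shared purchases."""
    owned = profiles[target]
    votes = Counter()
    for user, items in profiles.items():
        if user == target:
            continue
        if len(owned & items) >= min_shared:   # this user counts as similar
            votes.update(items - owned)        # vote for products the target lacks
    # Most-voted products first; ties broken alphabetically.
    return [item for item, _ in sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))]


print(recommend("User 1", purchases))  # ['D', 'E']
```

Nothing in the code inspects what D or E actually are; only the observed purchases matter, which is the point the slide makes.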

[Figure: User 1, User 2 and User 3 linked to the products A to F that they bought]
31
Collaborative Filtering

Representation of input data
Neighborhood formation
Prediction/Top-N recommendation

32
Challenges of RS

Sparsity
May hide good neighbors
Results in poor quality and reduced coverage
Scalability
Enormous size of customer-product matrix
Slow neighborhood search
Slow prediction generation


34
Collaborative Filtering: Memory-Based Approach
vs. Model-Based Approach

There are two approaches to computing the best
neighborhood: the memory-based approach and the
model-based approach.
The memory-based approach uses all available
data to compute the neighborhood.
The model-based approach builds a probabilistic
model from a subset of the data.

35
Collaborative Filtering: Memory-Based Approach
vs. Model-Based Approach

A memory-based approach can incorporate new data
immediately and incrementally, whereas a
model-based approach can do so only after a
learning phase.
Computing a recommendation takes longer in a
memory-based approach than in a model-based
approach that has completed its learning phase.
As the amount of data grows, a memory-based
approach becomes slower, because memory
requirements and computation time increase
linearly.
For model-based approaches, very large data sets
carry the risk of a prohibitively long learning
phase.

36
Advantages and Disadvantages of Collaborative Filtering

Advantages
The actor is also recommended unknown objects
whose content differs from the objects acquired
so far
Dissimilar objects can also be recommended
(if another participant has bought or rated
them)

Disadvantages
Cold-start problem: new objects cannot be
recommended until they have been sold for the
first time
Computing the neighborhood is expensive and
memory-intensive (memory-based approach)
With few purchases/ratings, recommendation
quality drops
Sparsity problem: the participant-object matrix
is not sufficiently filled

37
Standard approach (1)

Customer U gives ratings $U_x$ for certain products
$x \in P_U$
A rating $U_x$ is a value from an ordered set,
e.g., an integer value 1..7: 1 = don't like at
all, ..., 4 = neutral, ..., 7 = great stuff
Note: not every customer rates every product
Determine the similarity of customers U and V
based on the similarity of their ratings of those
products both have rated, i.e., $P_{U \cap V} = P_U \cap P_V$.

38
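Standard approach (2): Distance/Similarity
Measures for Customers

Given two customers U and V
Mean Squared Difference (distance measure):
$MSD(U,V) = \frac{1}{|P_U \cap P_V|} \sum_{x \in P_U \cap P_V} (U_x - V_x)^2$
The Pearson correlation coefficient may be better:
$r_{Pearson}(U,V) = \frac{\sum_{x \in P_U \cap P_V} (U_x - \bar{U})(V_x - \bar{V})}{\sqrt{\sum_{x \in P_U \cap P_V} (U_x - \bar{U})^2 \; \sum_{x \in P_U \cap P_V} (V_x - \bar{V})^2}}$
where $\bar{U}$ and $\bar{V}$ are the customers' average ratings
$r_{UV} > 0$: positively related
$r_{UV} = 0$: not related
$r_{UV} < 0$: negatively related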
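
Both measures can be sketched in Python over the products two customers have both rated. The rating profiles are hypothetical, the function names are invented for this sketch, and taking each customer's mean over the co-rated products only is an assumption (the slides do not say which mean is meant).

```python
import math
from typing import Dict, List

Ratings = Dict[str, float]  # product id -> rating on the 1..7 scale


def common_items(u: Ratings, v: Ratings) -> List[str]:
    """Products that both customers have rated."""
    return sorted(set(u) & set(v))


def mean_squared_difference(u: Ratings, v: Ratings) -> float:
    items = common_items(u, v)
    if not items:
        raise ValueError("no co-rated products")
    return sum((u[x] - v[x]) ** 2 for x in items) / len(items)


def pearson(u: Ratings, v: Ratings) -> float:
    items = common_items(u, v)
    if len(items) < 2:
        return 0.0  # convention assumed here: too little overlap means "not related"
    mean_u = sum(u[x] for x in items) / len(items)  # mean over co-rated items (assumption)
    mean_v = sum(v[x] for x in items) / len(items)
    cov = sum((u[x] - mean_u) * (v[x] - mean_v) for x in items)
    var_u = sum((u[x] - mean_u) ** 2 for x in items)
    var_v = sum((v[x] - mean_v) ** 2 for x in items)
    if var_u == 0 or var_v == 0:
        return 0.0  # a constant rater carries no correlation information
    return cov / math.sqrt(var_u * var_v)


# Hypothetical profiles
U = {"p1": 7, "p2": 4, "p3": 1, "p4": 5}
V = {"p1": 6, "p2": 4, "p3": 2, "p5": 3}
T = {"p1": 1, "p2": 4, "p3": 7}

print(round(mean_squared_difference(U, V), 3))  # 0.667
print(round(pearson(U, V), 3))                  # 1.0
print(round(pearson(U, T), 3))                  # -1.0
```

U and V differ in absolute values but order the shared products identically, so their correlation is 1 even though the distance is not 0; this is one reason the correlation can be the better choice.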

39
Standard approach (3): Determining Recommendations

The profile of a new customer W is compared to
the profiles of all known users U, and the
similarity/distance $r_{WU}$ is determined
Users whose profile similarity exceeds a certain
threshold are selected
The predicted rating for an item is a weighted
average of the ratings of similar users for that
item
Products with the highest rating $W_x$ are
recommended to W
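
These steps can be sketched in Python as follows. The similarities are taken as given inputs, all data is hypothetical, the names `predict_ratings` and `top_n` are invented for this sketch, and the plain similarity-weighted average is only one possible reading of "weighted average" on the slide (it assumes a non-negative threshold so that all weights are positive).

```python
from typing import Dict, List, Set, Tuple

Ratings = Dict[str, float]


def predict_ratings(similarities: Dict[str, float],
                    profiles: Dict[str, Ratings],
                    already_rated: Set[str],
                    threshold: float) -> Dict[str, float]:
    """Similarity-weighted average rating per product, over users above the threshold."""
    neighbours = {u: s for u, s in similarities.items() if s > threshold}
    numerators: Dict[str, float] = {}
    denominators: Dict[str, float] = {}
    for user, sim in neighbours.items():
        for product, rating in profiles[user].items():
            if product in already_rated:
                continue  # only predict for products W has not rated yet
            numerators[product] = numerators.get(product, 0.0) + sim * rating
            denominators[product] = denominators.get(product, 0.0) + sim
    return {p: numerators[p] / denominators[p] for p in numerators}


def top_n(predictions: Dict[str, float], n: int) -> List[Tuple[str, float]]:
    """The n products with the highest predicted rating."""
    return sorted(predictions.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical similarities r_WU of new customer W to known users
similarities = {"U1": 0.9, "U2": 0.6, "U3": 0.1}
profiles = {
    "U1": {"p1": 7, "p2": 6, "p3": 2},
    "U2": {"p2": 3, "p3": 5},
    "U3": {"p3": 7, "p4": 7},
}

predictions = predict_ratings(similarities, profiles, already_rated={"p1"}, threshold=0.5)
print({p: round(r, 2) for p, r in predictions.items()})  # {'p2': 4.8, 'p3': 3.2}
print(top_n(predictions, 1)[0][0])                       # p2
```

U3 falls below the threshold, so its enthusiastic rating of p4 never reaches W; raising or lowering the threshold trades coverage against neighbor quality.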

40
Shortcomings of the standard approach

Correlation is based only on items which two
customers have in common
When thousands of items are available, there is
only little overlap!
Recommendations are then based on only a few
observations
"Lucky" customers who have bought/rated only good
items get pulled down by subtracting the
customer's average rating
The correlation coefficient is not transitive,
whereas customer similarity is transitive at
least to some degree:
if A and B are correlated and B and C are
correlated, then A and C should also be
correlated

41
Two-Layer Graph Model [Huang 04]
42
Two-Layer Graph Model
43
Two-Layer Graph Model
2-degree association C1-P1: 0.5 × 0.6 = 0.3
3-degree association C1-P1: 0.3 + 0.21 + 0.12 + 0.28 = 0.91
