ECommerce WS 0607: Transcript and Presenter's Notes
1
E-Commerce: Personalization and Recommender Systems
2
Definition of Terms: Recommender Systems
  • The term was coined in 1997 by Varian and Resnick (Communications of the ACM, Vol. 40, No. 3)
  • "In everyday life, we rely on recommendations of other people, either by word of mouth, recommendation letters, movie and book reviews... Recommender systems assist and augment this natural social process." → Recommender systems are applications that provide recommendations to an actor in a decision-making process

Hal R. Varian
Paul Resnick
3
(Pre-)History
  • 1982: In the President's Letter of the journal Communications of the ACM, Peter J. Denning suggests tagging e-mails with an indicator of their importance. This indicator is weighted by all members.
  • 1986: At the XEROX Palo Alto Research Center (XPARC), the Tapestry system is built to channel the flood of information. The system supports both content-based filtering and viewing document recommendations made by other co-workers (collaborative filtering).

Peter J. Denning
4
Recommended References
  • Shardanand, U., and Maes, P. (1995) Social Information Filtering: Algorithms for Automating "Word of Mouth". In Proceedings of CHI'95, 210-217. http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/us_bdy.htm
  • Billsus, D., and Pazzani, M.J. (1998) Learning Collaborative Information Filters. In The 15th International Conference on Machine Learning, ICML-98. http://www.ics.uci.edu/dbillsus/papers/icml98.pdf
  • Smyth, B., and Cotter, P. Surfing the Digital Wave: Generating Personalised TV Listings using Collaborative, Case-Based Recommendation. In Proceedings of the Third International Conference on Case-Based Reasoning, ICCBR'99, Springer.
  • Berkeley School of Information Systems, Link Collection on Collaborative Filtering. http://www.sims.berkeley.edu/resources/collab/

5
Content-based Filtering
6
Content-based Filtering/Selection
  • Filtering and selection basically mean the same thing
  • Filtering: removing certain objects from a universe
  • Selection: picking certain objects from a universe
  • A representation of products is required, together with a notion of similarity between demands and products
  • Roots of content-based filtering: INFORMATION RETRIEVAL

7
Information Retrieval
  • Analyzing the textual content of individual Web
    pages
  • given a user's query
  • determine a maximally related subset of documents
  • Retrieval
  • index a collection of documents (access
    efficiency)
  • rank documents by importance (accuracy)
  • Categorization (classification)
  • assign a document to one or more categories

8
Inverted Index
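
The inverted index itself appears on the slide only as a figure; a minimal Python sketch of the idea (each term mapped to a sorted postings list of document IDs; the data and helper name are illustrative, not from the slides) could look like this:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs (DIDs) that contain it.
    `docs` is assumed to be a dict {did: text}."""
    index = defaultdict(set)
    for did, text in docs.items():
        for term in text.lower().split():
            index[term].add(did)
    return {term: sorted(dids) for term, dids in index.items()}

docs = {1: "web graphs", 2: "web pages and graphs", 3: "complex web pages"}
index = build_inverted_index(docs)
print(index["web"])     # [1, 2, 3]
print(index["graphs"])  # [1, 2]
```
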
9
Bucket Compression
  • Reduce the memory needed for each pointer in the buckets
  • for each term, sort occurrences by DID
  • store them as a list of gaps, i.e. the sequence of differences between successive DIDs
  • Advantage: significant memory saving
  • frequent terms produce many small gaps
  • small integers are encoded by short variable-length codewords
  • Example (see the sketch after this list)
  • the sequence of DIDs (14, 22, 38, 42, 66, 122, 131, 226)
  • becomes the sequence of gaps (14, 8, 16, 4, 24, 56, 9, 95)

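A minimal sketch of gap encoding for the example above (function names are illustrative, not from the slides):

```python
def encode_gaps(dids):
    """Convert a sorted postings list of DIDs into gaps between successive DIDs."""
    return [dids[0]] + [b - a for a, b in zip(dids, dids[1:])]

def decode_gaps(gaps):
    """Recover the original DIDs by accumulating the gaps."""
    dids, total = [], 0
    for g in gaps:
        total += g
        dids.append(total)
    return dids

dids = [14, 22, 38, 42, 66, 122, 131, 226]
gaps = encode_gaps(dids)            # [14, 8, 16, 4, 24, 56, 9, 95]
assert decode_gaps(gaps) == dids
```

In practice the gaps would then be stored with a variable-length code (e.g. variable-byte or gamma codes) so that the many small values take fewer bits.
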
10
Content-Based Ranking
  • A boolean query
  • results in several matching documents
  • e.g., the Google query "Web AND graphs" results in 4,040,000 matches
  • Problem
  • the user can examine only a fraction of the results
  • Content-based ranking
  • arranges the results in order of relevance to the user

11
Choice of Weights
Which weights retrieve the most relevant pages?
12
Vector-space Model
  • Text documents are mapped to a high-dimensional
    vector space
  • Each document d
  • is represented as a sequence of terms τ(t)
  • d = (τ(1), τ(2), τ(3), ..., τ(|d|))
  • The unique terms in a set of documents
  • determine the dimension of the vector space

13
Example
Boolean representation of document vectors over the vocabulary V = {web, graph, net, page, complex}:

        web  graph  net  page  complex
  V1     1     1     0     0      0
  V2     1     1     1     0      0
  V3     1     0     0     1      1
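
A small sketch of how such boolean term vectors could be built (the vocabulary order follows the example above; names are illustrative):

```python
vocabulary = ["web", "graph", "net", "page", "complex"]

def boolean_vector(doc_terms, vocabulary):
    """1 if the vocabulary term occurs in the document, else 0."""
    terms = set(doc_terms)
    return [1 if t in terms else 0 for t in vocabulary]

v1 = boolean_vector(["web", "graph"], vocabulary)            # [1, 1, 0, 0, 0]
v2 = boolean_vector(["web", "graph", "net"], vocabulary)     # [1, 1, 1, 0, 0]
v3 = boolean_vector(["web", "page", "complex"], vocabulary)  # [1, 0, 0, 1, 1]
```
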
14
Vector-space Model
  • τ1, τ2 and τ3 are terms in a document; x and x' are document vectors
  • Vector-space representations are sparse: |V| >> |d|

15
Document Similarity
  • Ranks documents by measuring the similarity
    between each document and the query
  • The similarity between two documents d and d' is a function s(d, d') ∈ R
  • In a vector-space representation the cosine
    coefficient of two document vectors is a measure
    of similarity

16
Cosine Coefficient
  • The cosine of the angle formed by two document vectors x and x' is cos(x, x') = (x · x') / (||x|| ||x'||)
  • Documents with many common terms will have
    vectors closer to each other than documents with
    fewer overlapping terms

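A minimal sketch of the cosine coefficient applied to the boolean vectors from the earlier example (plain Python, no external libraries; the function name is illustrative):

```python
import math

def cosine(x, y):
    """Cosine of the angle between two equally long term vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

v1 = [1, 1, 0, 0, 0]
v2 = [1, 1, 1, 0, 0]
v3 = [1, 0, 0, 1, 1]
print(round(cosine(v1, v2), 3))  # 0.816  (two shared terms)
print(round(cosine(v1, v3), 3))  # 0.408  (one shared term)
```

As the slide states, the more terms two documents share, the closer their vectors and the larger the cosine.
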
17
Example of the Cosine Similarity Function
  • Four documents on the topic of astronomy are available.
  • The actor has read Text 1 and wants to know which of the three other texts is most similar to the first one.
  • For this purpose, the actor has defined three key terms and assigned a weighting factor to each of them.

18
Latent Semantic Analysis
  • Why is it needed?
  • serious problems arise for retrieval methods based on pure term matching
  • the vector-space similarity approach works only if the terms of the query are explicitly present in the relevant documents
  • natural language has rich expressive power
  • queries often contain terms that express concepts related to the text to be retrieved

19
Synonymy and Polysemy
  • Synonymy
  • the same concept can be expressed using different
    sets of terms
  • e.g. bandit, brigand, thief
  • negatively affects recall
  • Polysemy
  • identical terms can be used in very different
    semantic contexts
  • e.g. bank
  • repository where important material is saved
  • the slope beside a body of water
  • negatively affects precision

20
Latent Semantic Indexing (LSI)
  • A statistical technique
  • Uses a linear algebra technique called singular value decomposition (SVD)
  • attempts to estimate the hidden structure
  • discovers the most important associative patterns
    between words and concepts
  • Data driven

21
LSI and Text Documents
  • Let X denote the term-document matrix
  • X = [x1 . . . xn]^T
  • each row is the vector-space representation of a document
  • each column contains the occurrences of a term in each document of the dataset
  • Latent semantic indexing
  • compute the SVD of X: X = U Σ V^T
  • Σ is the matrix of singular values
  • set to zero all but the K largest singular values, giving Σ_K
  • obtain the reconstruction of X as X_K = U Σ_K V^T (see the sketch below)
22
SVD Mathematical Background
23
SVD for Filtering
[Figure: SVD decomposition of an m x n matrix]
24
LSI Example
  • A collection of documents
  • d1: Indian government goes for open-source software
  • d2: Debian 3.0 Woody released
  • d3: Wine 2.0 released with fixes for Gentoo 1.4 and Debian 3.0
  • d4: gnuPOD released iPOD on Linux with GPLed software
  • d5: Gentoo servers running at open-source mySQL database
  • d6: Dolly the sheep not totally identical clone
  • d7: DNA news introduced low-cost human genome DNA chip
  • d8: Malaria-parasite genome database on the Web
  • d9: UK sets up genome bank to protect rare sheep breeds
  • d10: Dolly's DNA damaged

25
LSI Example
  • The term-document matrix X^T

                   d1  d2  d3  d4  d5  d6  d7  d8  d9  d10
    open-source     1   0   0   0   1   0   0   0   0    0
    software        1   0   0   1   0   0   0   0   0    0
    Linux           0   0   0   1   0   0   0   0   0    0
    released        0   1   1   1   0   0   0   0   0    0
    Debian          0   1   1   0   0   0   0   0   0    0
    Gentoo          0   0   1   0   1   0   0   0   0    0
    database        0   0   0   0   1   0   0   1   0    0
    Dolly           0   0   0   0   0   1   0   0   0    1
    sheep           0   0   0   0   0   1   0   0   0    0
    genome          0   0   0   0   0   0   1   1   1    0
    DNA             0   0   0   0   0   0   2   0   0    1

26
LSI Example
  • The reconstructed term-document matrix after projecting on a subspace of dimension K = 2
  • Σ = diag(2.57, 2.49, 1.99, 1.9, 1.68, 1.53, 0.94, 0.66, 0.36, 0.10)

                    d1     d2     d3     d4     d5     d6     d7     d8     d9    d10
    open-source    0.34   0.28   0.38   0.42   0.24   0.00   0.04   0.07   0.02   0.01
    software       0.44   0.37   0.50   0.55   0.31  -0.01  -0.03   0.06   0.00  -0.02
    Linux          0.44   0.37   0.50   0.55   0.31  -0.01  -0.03   0.06   0.00  -0.02
    released       0.63   0.53   0.72   0.79   0.45  -0.01  -0.05   0.09  -0.00  -0.04
    Debian         0.39   0.33   0.44   0.48   0.28  -0.01  -0.03   0.06   0.00  -0.02
    Gentoo         0.36   0.30   0.41   0.45   0.26   0.00   0.03   0.07   0.02   0.01
    database       0.17   0.14   0.19   0.21   0.14   0.04   0.25   0.11   0.09   0.12
    Dolly         -0.01  -0.01  -0.01  -0.02   0.03   0.08   0.45   0.13   0.14   0.21
    sheep         -0.00  -0.00  -0.00  -0.01   0.03   0.06   0.34   0.10   0.11   0.16
    genome         0.02   0.01   0.02   0.01   0.10   0.19   1.11   0.34   0.36   0.53
    DNA           -0.03  -0.04  -0.04  -0.06   0.11   0.30   1.70   0.51   0.55   0.81

27
Advantages and Disadvantages of Content-Based Filtering
  • Advantages
  • Simple technique
  • New objects (e.g. new CDs) can be integrated into the system quickly if their attributes are easy to capture
  • Disadvantages
  • Content-based filtering cannot give truly different kinds of recommendations
  • Objects of a different kind, with little similarity to other objects, are hard to recommend
  • The choice of attributes and their weighting must be right for a good recommendation to be made

28
Collaborative Filtering
29
Collaborative Filtering Approach: Basic Idea
  • Select items based on aggregated user ratings of those items
  • "You buy an item only because many of your friends (who share the same interests as you) bought it and like it, although you don't really know anything about the product."
  • Consider ratings of similar users (customers) only
  • Requires stored user profiles of the kind
  • Customer C1 likes (buys) products p1, p4, p8
  • Customer C2 likes (buys) products p1, p2, p8
  • ...

30
Collaborative Filtering Approach
Products A,...,F
  • Users 1, 2 and 3 are similar since they all bought products A, B, and C
  • D and E can be recommended to User 1 based on this shared interest (see the sketch below)
  • Recommendation is based on observations
  • no detailed representation of D or E is needed
  • users must be identified, i.e., a user profile must be available

[Figure: products A-F distributed over the purchase sets of User 1, User 2 and User 3]
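
A minimal sketch of this purchase-overlap idea. The assignment of D and E to Users 2 and 3 is a hypothetical reading of the figure, and plain set overlap is used as the similarity, which is an assumption of this sketch rather than something the slides prescribe:

```python
# purchase profiles consistent with the slide: all three users bought A, B, C
profiles = {
    "User 1": {"A", "B", "C"},
    "User 2": {"A", "B", "C", "D"},
    "User 3": {"A", "B", "C", "E"},
}

def recommend(target, profiles, min_overlap=2):
    """Recommend products bought by users whose purchases overlap the target's."""
    own = profiles[target]
    candidates = set()
    for user, bought in profiles.items():
        if user != target and len(own & bought) >= min_overlap:
            candidates |= bought - own
    return sorted(candidates)

print(recommend("User 1", profiles))   # ['D', 'E']
```
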
31
Collaborative Filtering
  • Representation of input data
  • Neighborhood formation
  • Prediction/Top-N recommendation





32
Challenges of RS
  • Sparsity
  • May hide good neighbors
  • Results in poor quality and reduced coverage
  • Scalability
  • Enormous size of customer-product matrix
  • Slow neighborhood search
  • Slow prediction generation

34
Collaborative Filtering: Memory-Based Approach vs. Model-Based Approach
  • There are two approaches for computing the best neighborhood: the memory-based approach and the model-based approach.
  • The memory-based approach uses all available data to compute the neighborhood.
  • The model-based approach builds a probabilistic model from a subset of the data.

35
Collaborative Filtering: Memory-Based Approach vs. Model-Based Approach
  • A memory-based approach can incorporate new data immediately and incrementally, whereas a model-based approach can do so only after a learning phase.
  • Computing a recommendation takes longer with a memory-based approach than with a model-based approach that has completed its learning phase.
  • With growing amounts of data, a memory-based approach becomes slower, since memory requirements and computation time grow linearly.
  • For model-based approaches, very large amounts of data carry the risk of a prohibitively long learning phase.

36
Advantages and Disadvantages of Collaborative Filtering
  • Advantages
  • The actor is also recommended unfamiliar objects whose content differs from the objects acquired so far
  • Objects of a different kind can also be recommended (provided they were bought or rated by another participant)
  • Disadvantages
  • Cold-start problem: new objects cannot be recommended until they have been sold for the first time
  • Computing the neighborhood is computationally expensive and memory-intensive (memory-based approach)
  • With few purchases/ratings, recommendation quality drops
  • Sparsity problem: the participant-object matrix is not sufficiently filled

37
Standard approach (1)
  • Customer U gives ratings Ux for certain products x ∈ PU
  • A rating Ux is a value from an ordered set, e.g., an integer value 1..7, with 1 = "don't like at all", ..., 4 = "neutral", ..., 7 = "great stuff"
  • Note: not every customer rates every product
  • Determine the similarity of customers U and V based on the similarity of the ratings of those products both have rated, i.e., PU ∩ PV.

38
Standard approach (2): Distance/Similarity Measures for Customers
  • Given two customers U and V
  • Mean squared difference (a distance measure), see the sketch below
  • The Pearson correlation coefficient may be better: rPearson(U,V)
  • rUV > 0: positively related
  • rUV = 0: not related
  • rUV < 0: negatively related

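The formulas themselves appear on the slide only as images; the sketch below uses the standard definitions of the mean squared difference and the Pearson correlation, computed over the co-rated products PU ∩ PV. The dictionary-based profiles and example ratings are assumptions of this sketch.

```python
import math

def co_rated(u, v):
    """Products rated by both customers; u and v map product -> rating."""
    return sorted(set(u) & set(v))

def mean_squared_difference(u, v):
    """Average squared rating difference over the co-rated products (assumes at least one)."""
    common = co_rated(u, v)
    return sum((u[x] - v[x]) ** 2 for x in common) / len(common)

def pearson(u, v):
    """Pearson correlation of the two customers' ratings on the co-rated products."""
    common = co_rated(u, v)
    mu_u = sum(u[x] for x in common) / len(common)
    mu_v = sum(v[x] for x in common) / len(common)
    num = sum((u[x] - mu_u) * (v[x] - mu_v) for x in common)
    den = math.sqrt(sum((u[x] - mu_u) ** 2 for x in common)
                    * sum((v[x] - mu_v) ** 2 for x in common))
    return num / den if den else 0.0

U = {"p1": 7, "p2": 3, "p4": 5, "p8": 6}
V = {"p1": 6, "p2": 2, "p3": 4, "p8": 7}
print(mean_squared_difference(U, V))   # 1.0, computed over p1, p2, p8
print(round(pearson(U, V), 2))         # 0.91 > 0: positively related
```
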
39
Standard approach (3): Determining Recommendations
  • The profile of a new customer W is compared to the profiles of all known users U, and the similarity/distance rWU is determined
  • Users whose profile similarity exceeds a certain threshold are selected
  • The rating for an item is a weighted average of the ratings of similar users for that item (see the sketch below)
  • Products with the highest rating Wx are recommended to W

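A minimal sketch of this prediction step. A compact Pearson helper is repeated here so the sketch runs on its own; the similarity threshold, the normalization by the sum of similarities, and the example ratings are assumptions of this sketch:

```python
from statistics import mean

def pearson(u, v):
    """Pearson correlation over the products both customers have rated."""
    common = sorted(set(u) & set(v))
    mu, mv = mean(u[x] for x in common), mean(v[x] for x in common)
    num = sum((u[x] - mu) * (v[x] - mv) for x in common)
    den = (sum((u[x] - mu) ** 2 for x in common)
           * sum((v[x] - mv) ** 2 for x in common)) ** 0.5
    return num / den if den else 0.0

def predict_rating(w, others, item, threshold=0.5):
    """Predict W's rating for `item` as a similarity-weighted average of the
    ratings given by sufficiently similar users."""
    num = den = 0.0
    for profile in others.values():
        if item not in profile:
            continue
        sim = pearson(w, profile)
        if sim > threshold:                  # keep only sufficiently similar users
            num += sim * profile[item]
            den += sim
    return num / den if den else None        # None: no similar user rated the item

others = {"U1": {"p1": 7, "p2": 3, "p8": 2}, "U2": {"p1": 2, "p2": 6, "p8": 7}}
print(predict_rating({"p1": 7, "p8": 3}, others, "p2"))   # 3.0, driven by the similar user U1
```

Repeating this prediction for all unrated items and keeping the highest-rated ones yields the top-N recommendation for W.
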
40
Shortcomings of the standard approach
  • Correlation is based only on the items which two customers have in common
  • When thousands of items are available, there is only little overlap!
  • Then recommendations are based on only a few observations
  • "Lucky" customers who have bought/rated only good items get pulled down by subtracting the customer's average rating
  • The correlation coefficient is not transitive; customer similarity, however, is at least to some degree transitive
  • If A and B are correlated and B and C are correlated, then A and C should also be correlated

41
Two-Layer Graph Model Huang 04
42
Two-Layer Graph Model
43
Two-Layer Graph Model
2-degree association C1-P1: 0.5 · 0.6 = 0.3
3-degree association C1-P1: 0.3 + 0.21 + 0.12 + 0.28 = 0.91
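
A small sketch of how such degree associations could be computed in the two-layer (customer/product) graph: each path contributes the product of its edge weights, and a k-degree association sums over all k-edge paths between the two nodes. The graph, its weights, the intermediate node and the helper names are illustrative, not the figures from the slides.

```python
def path_association(path, weights):
    """Association contributed by one path: the product of its edge weights."""
    w = 1.0
    for a, b in zip(path, path[1:]):
        w *= weights[(a, b)]
    return w

def degree_association(paths, weights):
    """k-degree association between two nodes: sum over all k-edge paths."""
    return sum(path_association(p, weights) for p in paths)

# hypothetical 2-degree example reproducing the slide's arithmetic: 0.5 * 0.6 = 0.3
weights = {("C1", "C2"): 0.5, ("C2", "P1"): 0.6}
print(degree_association([("C1", "C2", "P1")], weights))   # 0.3
```

The 3-degree value on the slide (0.3 + 0.21 + 0.12 + 0.28 = 0.91) reads as the same kind of sum taken over several longer paths.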