E-Commerce - PowerPoint PPT Presentation

Title:

E-Commerce

Slides: 34
Provided by: cjen2
Learn more at: http://ibook.ics.uci.edu

Transcript and Presenter's Notes

Title: E-Commerce


1
E-Commerce
2
Outline
  • Introduction
  • Customer Data on the Web
  • Automated Recommender Systems
  • Networks and Recommendations
  • Web Path Analysis for Purchase Prediction

3
Introduction
  • Some Motivating Questions
  • Can we design algorithms to help recommend new
    products to visitors based on their browsing
    behavior?
  • Can we better understand factors influencing how
    customers make purchases on a website?
  • Can we predict in real time who will make
    purchases based on their observed navigation
    patterns?

4
Customer Data on the Web
  • Data collection on client, server sides and
    anywhere in between
  • Goal: determine who is purchasing which products
  • Tracking customer data
  • Web logs, E-Commerce logs, cookies, explicit
    login
  • Data is then used to provide personalized content to
    site users, in order to:
  • Assist customers in locating their target
    selections
  • Encourage customers to make certain selections

5
Automated Recommender Systems
  • Problem framed in two ways
  • Users vote for pages/items (binary)
  • Users rank pages/items (multivalued)
  • Results are captured in a generally sparse matrix
    (users x items)
  • Complication: missing votes can occur because users
    tend not to vote on items they do not like (Breese et al.,
    1998)
  • Ignored by most recommender systems

6
Automated Recommender Systems
7
Evaluating Recommender Systems
  • Cautions in data interpretation
  • Users may purchase items regardless of
    recommendations
  • Users may also avoid purchases they might have
    made based on recommendations
  • Approaches to recommender algorithms
  • Nearest-neighbor
  • Model-based collaborative filtering
  • Others?

8
Nearest-Neighbor Collaborative Filtering
  • Basic principle: use users' vote history to
    predict future votes/recommendations
  • Find the users most similar to the target user in the
    training matrix and fill in the target user's
    missing vote values based on these
    nearest neighbors
  • A typical normalized prediction scheme
  • Goal: predict the vote for item j based on the votes of
    other users, weighted towards those whose past
    votes are similar to those of target user a
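
A minimal sketch of one such normalized scheme, assuming the common form p(a, j) = mean(v_a) + (1 / sum |w(a, i)|) * sum over i of w(a, i) * (v(i, j) - mean(v_i)). The Pearson correlation used for the weights w(a, i), the function names, and the toy votes are illustrative assumptions, not taken from the slides.

```python
from math import sqrt

def mean_vote(votes):
    """Mean of a user's observed votes (votes maps item -> value)."""
    return sum(votes.values()) / len(votes)

def pearson_weight(votes_a, votes_i):
    """Pearson correlation over co-voted items; 0.0 when it is undefined."""
    common = set(votes_a) & set(votes_i)
    if len(common) < 2:
        return 0.0
    mean_a = sum(votes_a[j] for j in common) / len(common)
    mean_i = sum(votes_i[j] for j in common) / len(common)
    num = sum((votes_a[j] - mean_a) * (votes_i[j] - mean_i) for j in common)
    den = sqrt(sum((votes_a[j] - mean_a) ** 2 for j in common)
               * sum((votes_i[j] - mean_i) ** 2 for j in common))
    return num / den if den > 0 else 0.0

def predict_vote(target, item, all_votes):
    """Predict the target user's vote on `item` from the other users' votes."""
    votes_a = all_votes[target]
    weighted_sum = 0.0
    norm = 0.0
    for user, votes_i in all_votes.items():
        if user == target or item not in votes_i:
            continue  # only users who voted on the item can contribute
        w = pearson_weight(votes_a, votes_i)
        weighted_sum += w * (votes_i[item] - mean_vote(votes_i))
        norm += abs(w)
    if norm == 0.0:
        return mean_vote(votes_a)  # no usable neighbors: fall back to the mean
    return mean_vote(votes_a) + weighted_sum / norm

# Hypothetical toy data: user "a" has not voted on item "w".
votes = {
    "a": {"x": 5, "y": 3, "z": 4},
    "b": {"x": 5, "y": 3, "z": 4, "w": 5},
    "c": {"x": 1, "y": 5, "z": 2, "w": 1},
}
prediction = predict_vote("a", "w", votes)
```

In this toy data user b agrees with a and liked item w, while user c disagrees with a and disliked it, so both neighbors push the prediction above a's mean vote.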

9
Nearest-Neighbor Collaborative Filtering
  • Another challenge: defining weights
  • What is the best weight calculation to
    use?
  • Requires fine-tuning of the weighting algorithm for
    the particular data set
  • What do we do when the target user has not voted
    enough to provide a reliable set of
    nearest neighbors?
  • One approach: use default votes (popular items)
    to populate the matrix for items that neither the target
    user nor the nearest neighbor has voted on
  • A different approach: model-based prediction
    using Dirichlet priors to smooth the votes (see
    chapter 7)
  • Other factors include relative vote counts for
    all items between users, thresholding, clustering
    (see Sarwar, 2000)

10
Nearest-Neighbor Collaborative Filtering
  • Structure-based recommendations
  • Recommendations based on similarities between
    items with positive votes (as opposed to votes of
    other users)
  • Structure of item dependencies modeled through
    dimensionality reduction via singular value
    decomposition (SVD) aka latent semantic indexing
    (see chapter 4)
  • Approximate the set of row-vector votes as a
    linear combination of basis column-vectors
  • i.e., find the set of basis vectors that minimizes the
    least-squares difference between the row
    estimates and their true values
  • Perform nearest-neighbor calculations to project
    predictions for all items
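
As a rough illustration of the least-squares idea, the sketch below computes only the best rank-1 approximation of a small vote matrix by power iteration; a full SVD would keep several basis vectors. The function name is hypothetical, the toy matrix is invented, and treating the matrix as fully observed is an assumption made here for simplicity.

```python
from math import sqrt

def rank_one_approximation(matrix, iterations=100):
    """Best rank-1 least-squares approximation via power iteration on A^T A."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    v = [1.0] * n_cols  # initial guess for the leading right singular vector
    for _ in range(iterations):
        # u = A v, then v = A^T u, renormalized to unit length
        u = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
        v = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break  # zero matrix: nothing to approximate
        v = [x / norm for x in v]
    # Each row is approximated by its projection onto the basis vector v.
    coeffs = [sum(row[j] * v[j] for j in range(n_cols)) for row in matrix]
    return [[c * x for x in v] for c in coeffs]

# Hypothetical vote matrix (users x items) whose rows are multiples of one pattern.
vote_matrix = [
    [2.0, 4.0, 2.0],
    [1.0, 2.0, 1.0],
    [3.0, 6.0, 3.0],
]
approx = rank_one_approximation(vote_matrix)
```

Because every row of the toy matrix is a multiple of the same pattern, a single basis vector reproduces it; real vote matrices would need more components and some treatment of missing votes.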

11
Model Based Collaborative Filtering
  • Recommendations based on a model of
    relationships between items based on historical
    voting patterns in the training set
  • Better performance than nearest-neighbor analysis
  • Joint distribution modeling
  • Uses one model as basis for predictions
  • Conditional distribution modeling
  • A model for each item predicting future vote
    based on votes for each of the other items

12
Model Based Collaborative Filtering
  • Joint distribution modeling: a practical approach
  • Model joint distribution as a finite mixture of
    simpler distributions
  • Additional simplification is achieved by assuming
    that votes are independent of others within a
    component
  • Limitation: assumes that each user can be described
    by a single one of the K mixture components
  • Hofmann and Puzicha (1999) propose a workaround,
    asserting that each row of votes represents up to
    K mixture components, rather than a single
    component
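
One way to write the finite mixture with within-component independence described above. The notation is assumed here rather than taken from the slides: v_j is the vote on item j, M is the number of items, c is the component label, and π_k is the weight of component k.

```latex
P(v_1, \ldots, v_M) \;=\; \sum_{k=1}^{K} \pi_k \prod_{j=1}^{M} P(v_j \mid c = k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1
```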

13
Model Based Collaborative Filtering
  • Another limitation: all predictions are based on
    the (static) training set
  • Conditional distribution modeling
  • Better results by creating a model for each item
    conditioned on the others rather than using a
    single joint density model
  • Decision trees: Heckerman et al. (2000)
  • Greedy approach to approximate tree structure
  • Predictions are made for each item not purchased
    or visited
  • Performance
  • Accuracy nearly equal to Bayesian networks
  • Offline memory usage significantly less than
    Bayesian networks
  • Offline computation time complexity better than
    Bayesian networks

14
Model-Based Combining of Votes and Content
  • Combine content-specific information with other
    information (e.g. structure, vote)
  • Useful for determining item similarity (Mooney
    and Roy 2000) and creating user models
  • Useful when there is no vote history
  • Implementation (Popescul et al. 2000)
  • Extension of (Hofmann and Puzicha 1999)
  • Joint density is determined by assuming a hidden
    (latent) variable that makes users, documents, and
    words conditionally independent
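
Under this conditional-independence assumption the joint density would presumably factor as follows. This is a sketch with assumed symbols: u is a user, d a document, w a word, and z the hidden variable.

```latex
P(u, d, w) \;=\; \sum_{z} P(z)\, P(u \mid z)\, P(d \mid z)\, P(w \mid z)
```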

15
Model-Based Combining of Votes and Content
  • The hidden variable represents multiple (hidden)
    topics of a document
  • Conditional probabilities of the hidden parameter
    are calculated using EM
  • Sparsity still remains a problem for
    content-based modeling

16
Challenges
  • Noisy Data
  • The same user may use multiple IP
    addresses/logins
  • Different users may use the same IP address/login
  • Privacy
  • No cookies!
  • Changing user habits
  • Previous history may not accurately predict
    present purchase selection
  • Continuous updating of user activities

17
Networks and Recommendations
  • Word-of-Mouth
  • Needs little explicit advertising
  • Products are recommended to friends, family,
    co-workers, etc.
  • This is the primary form of advertising behind
    the growth of Google

18
Email Product Recommendation
  • Hotmail
  • Very little direct advertising in the beginning
  • Launched in July 1996
  • 20,000 subscribers after a month
  • 100,000 subscribers after 3 months
  • 1,000,000 subscribers after 6 months
  • 12,000,000 subscribers after 18 months
  • By April 2002 Hotmail had 110 million subscribers

19
Email Product Recommendation
  • What was Hotmail's primary form of advertising?
  • Small link to the sign up page at the bottom of
    every email sent by a subscriber
  • Spreading Activation
  • Implicit recommendation

20
Spreading Activation
  • Network effects
  • Even if only a small fraction of the people who receive the
    message subscribe (0.1%), the service will
    spread rapidly
  • This can be contrasted with the current practice
    of SPAM
  • SPAM is not sent by friends, family, co-workers
  • No implicit recommendation
  • SPAM is often viewed as not providing a good
    service

21
Modeling Spreading Activation
  • Diffusion Model
  • Montgomery (2002)
  • Applied models from the marketing literature (Bass,
    1969) to the Hotmail phenomenon
  • Similar word-of-mouth networks used in selling
    consumer electronics such as refrigerators and
    televisions
  • We want to predict at time t how many individuals
    k(t) will adopt the product out of a population
    of N possible adopters

22
Modeling Spreading Activation
  • Diffusion Model
  • Two ways individuals will subscribe
  • Direct Advertising
  • At time t, N − k(t) individuals have not
    subscribed
  • A fraction α ≥ 0 of these individuals will subscribe
    due to direct advertising
  • Word-of-Mouth
  • At time t, there are k(t)(N − k(t)) possible
    connections between subscribers and
    non-subscribers
  • A fraction β ≥ 0 of these connections will cause a
    non-subscriber to subscribe
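
The two mechanisms can be sketched as a discrete-time simulation. This is only an illustration: the unit time step, the starting point k(0) = 0, the function name, and the parameter values are assumptions, not figures from Montgomery (2002).

```python
def simulate_adoption(population, alpha, beta, steps):
    """Iterate k <- k + alpha*(N - k) + beta*k*(N - k), starting from k = 0."""
    k = 0.0
    history = [k]
    for _ in range(steps):
        remaining = population - k                   # people who have not subscribed
        from_advertising = alpha * remaining         # direct advertising
        from_word_of_mouth = beta * k * remaining    # subscriber/non-subscriber contacts
        k = min(population, k + from_advertising + from_word_of_mouth)
        history.append(k)
    return history

# Hypothetical parameters: weak advertising, stronger word of mouth.
curve = simulate_adoption(population=1_000_000, alpha=0.0001, beta=5e-7, steps=40)
```

With these made-up values the first subscribers come entirely from advertising, after which the word-of-mouth term dominates and produces the familiar S-shaped adoption curve.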

23
Modeling Spreading Activation
  • Combine these and we get the following
    expression
  • Solve this and we get
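
Combining the direct-advertising and word-of-mouth terms suggests the following reconstruction. It is a sketch: the continuous-time form, the initial condition k(0) = 0, and the requirement α > 0 are assumptions; α and β denote the direct-advertising and word-of-mouth rates.

```latex
\frac{dk(t)}{dt} \;=\; \alpha\,\bigl(N - k(t)\bigr) + \beta\,k(t)\,\bigl(N - k(t)\bigr)
               \;=\; \bigl(\alpha + \beta\,k(t)\bigr)\bigl(N - k(t)\bigr)

k(t) \;=\; N\,\frac{1 - e^{-(\alpha + \beta N)\,t}}{1 + \dfrac{\beta N}{\alpha}\,e^{-(\alpha + \beta N)\,t}}
```

This solution satisfies k(0) = 0 and approaches N as t grows.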

24
Modeling Spreading Activation
25
Modeling Spreading Activation
26
Modeling Spreading Activation
  • Diffusion Model
  • This does not completely model what actually
    occurred
  • However, it is simple and provides a lot of
    interesting (useful) information
  • Other work
  • Domingos and Richardson (2001): Markov random field
    model
  • Daley and Gani (1999): various deterministic and
    stochastic models

27
Purchase Prediction
  • We want to predict whether or not a shopper will
    make a purchase
  • We know demographics
  • We know page view patterns
  • Can we accurately predict if the user will make a
    purchase or not?

28
Purchase Prediction
  • Li et al. (2002)
  • Studied 1,160 shoppers at www.barnesandnoble.com
    between April 1 and April 30, 2002
  • The data was collected client-side, so they knew
    exactly which pages were displayed to the user
  • They also knew the demographics (predominantly
    well-educated and affluent)

29
Purchase Prediction
  • Li et al. (2002)
  • There were 14,512 page views, which they divided
    into 1,659 sessions
  • Mean 8.75 page views per session
  • Median 5
  • Standard deviation 16.4
  • Min 1
  • Max 570
  • 7% of sessions contained a purchase

30
Purchase Prediction
  • Li et al. (2002)
  • Divided the pages into 8 classes
  • Home (H), main page
  • Account (A), account information pages
  • List (L), pages with lists of items
  • Product (P), page with a single item
  • Information (I), informational pages (shipping
    etc.)
  • Shopping cart (S)
  • Order (O), indicates a completed order
  • Entry or Exit (E), entering or leaving the site

31
Purchase Prediction
  • Li et al. (2002)
  • Each session was represented by a string of the
    form I H H I I L I I E
  • A session containing an O is considered to
    include a purchase
  • The average length of a session with a purchase
    was 34.5 and without was only 6.8

32
Purchase Prediction
  • Markov transition matrix
  • For sessions with no purchase
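
A minimal sketch of how such a first-order transition matrix could be estimated from session strings. Only the first session string comes from the slides; the other two are invented, and the function name is hypothetical.

```python
from collections import defaultdict

def transition_matrix(sessions):
    """Estimate P(next page class | current page class) from session strings."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        pages = session.split()
        # Count each consecutive pair of page classes in the session.
        for current, nxt in zip(pages, pages[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        matrix[current] = {nxt: n / total for nxt, n in row.items()}
    return matrix

sessions = [
    "I H H I I L I I E",   # example string from the slides
    "H L P L P S O E",     # invented session that ends in an order
    "H L P E",             # invented session without an order
]
# Keep only sessions without an order page (O), i.e. no purchase.
no_purchase = [s for s in sessions if "O" not in s.split()]
probs = transition_matrix(no_purchase)
```

Each row of `probs` is normalized by the number of transitions leaving that page class, so the listed probabilities in a row sum to 1. Page classes that never appear as a source have no row.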

33
Purchase Prediction
  • Li et al. (2002)
  • They built several models based on this data
  • Tested on predicting the next page and predicting a
    purchase
  • Best models were 64% accurate at predicting the next page
  • After 2 page views the best models predicted 12%
    true positives and 5.3% false positives
  • After 6 page views: 13.1% true positives and 2.9%
    false positives