Integrating E-Commerce and Data Mining: Architecture and Challenges

1
Integrating E-Commerce and Data
Mining: Architecture and Challenges
  • Suhail Ansari, Ron Kohavi, Llew Mason, Zijian
    Zheng
  • Blue Martini Software
  • Presented by Drew LaMar

2
Contents
  • Introduction
  • Integrated Architecture
  • Challenges
  • Questions

3
Introduction
  • Companies spending more on e-commerce
  • Personalization is important because companies
    need to differentiate themselves
  • Wider recognition of data mining techniques
  • E-commerce is a natural domain for data mining:
      • Plenty of data records
      • Reliable data through electronic collection
      • Return on investment can be measured
      • Insight can be quickly turned into action
  • Taking full advantage requires an integrated data
    mining architecture

4
Integrated Architecture

5
Business Data Definition
  • The e-commerce business user defines the data and
    metadata
  • The key is defining a rich set of attributes for
    any type of data (personalization)
  • Merchandising information (products, assortments,
    pricing)
  • Content information (web page templates, images,
    articles, multimedia)
  • Business rules (personalized content,
    cross-sells, up-sells)

6
Stage Data
  • Connects Business Data Definition to Customer
    Interaction
  • Advantages:
      • Test changes before implementation
      • Change data structures between components
      • Zero downtime

7
Customer Interaction
  • General interaction (web sites, customer service,
    wireless apps, brick-and-mortar stores)
  • The variety of sources makes data collection
    integral
  • Data collection is implemented using On-Line
    Transaction Processing (OLTP)
  • E.g. web banner advertisements:
      • Cost of an advertisement is based on
        click-throughs
      • Effectiveness: click-throughs vs. sales generated
      • Need to attract buyers, not browsers
      • Measuring effectiveness requires multiple data
        sources
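
The banner example can be sketched in a few lines of Python. This is only an illustration of joining two data sources (click-throughs and orders); the function name, the record layouts, and the rule that a sale is credited to any banner clicked in the same session are assumptions, not something the slides specify.

```python
from collections import defaultdict

def banner_effectiveness(clicks, orders):
    """Compare click-throughs with sales generated, per banner.

    clicks: iterable of (session_id, banner_id) pairs (hypothetical layout)
    orders: iterable of (session_id, amount) pairs (hypothetical layout)
    """
    # Total sales per session, from the order data source.
    sales_by_session = defaultdict(float)
    for session_id, amount in orders:
        sales_by_session[session_id] += amount

    stats = defaultdict(lambda: {"clicks": 0, "buyers": 0, "sales": 0.0})
    # De-duplicate so a repeated click in one session is counted once.
    for session_id, banner_id in set(clicks):
        entry = stats[banner_id]
        entry["clicks"] += 1
        if session_id in sales_by_session:
            entry["buyers"] += 1
            entry["sales"] += sales_by_session[session_id]
    return dict(stats)

clicks = [("s1", "A"), ("s2", "A"), ("s3", "B"), ("s4", "A")]
orders = [("s2", 40.0), ("s3", 15.0), ("s3", 5.0)]
report = banner_effectiveness(clicks, orders)
```

In this toy data, banner A draws more clicks but banner B converts every click into a sale, which is the "buyers, not browsers" distinction.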

8
Customer Interaction
  • Data collection (web site)
  • Clickstream data
  • Problems with web server logs:
      • Difficult to identify users since requests are
        independent
      • Dynamic content, e.g. personalization, may not be
        captured in the log
      • The mechanism used to send requests to the server
        affects the server log information
      • Most log entries are requests for image files,
        which are of little use for analysis
  • Problems with packet sniffers (which look at data
    on the wire):
      • Have problems identifying users and sessions
      • Can't see encrypted data sent over secure
        connections, e.g. SSL (Secure Sockets Layer)
  • Solution: collect data at the application server
    level
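
To illustrate the image-request problem with web server logs, here is a minimal sketch that drops image requests from log lines. It assumes the logs are in Common Log Format and that images can be recognized by file suffix; both the suffix list and the function name are illustrative choices.

```python
import re

# Pulls the request path out of a Common Log Format line (assumed format).
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+)')
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")

def page_requests(log_lines):
    """Return only the log lines that are not requests for image files."""
    kept = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like requests
        # Ignore any query string and compare case-insensitively.
        path = match.group(1).split("?", 1)[0].lower()
        if not path.endswith(IMAGE_SUFFIXES):
            kept.append(line)
    return kept

sample_log = [
    '10.0.0.1 - - [01/Jan/2001:10:00:00 +0000] "GET /products/42 HTTP/1.0" 200 5120',
    '10.0.0.1 - - [01/Jan/2001:10:00:01 +0000] "GET /images/logo.gif HTTP/1.0" 200 840',
    '10.0.0.2 - - [01/Jan/2001:10:00:05 +0000] "GET /search?q=shoes HTTP/1.0" 200 2048',
    '10.0.0.2 - - [01/Jan/2001:10:00:06 +0000] "GET /images/banner.JPG?v=2 HTTP/1.0" 200 9000',
]
pages = page_requests(sample_log)
```

Filtering like this only removes noise. It does not recover users, sessions, or dynamic content, which is why the slides recommend collecting data at the application server instead.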

9
Customer Interaction
  • Data collection
  • Clickstream data (cont'd)
  • Application server level:
      • Detailed knowledge of images, products, and
        articles, even if dynamically generated or
        encoded
      • Uses cookies or URL encoding to track a user's
        session
      • Keeps track of data absent from web server logs:
          • Aborted pages
          • Local time of user
          • Speed of connection
          • Cookies on/off
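
A rough sketch of application-server-level logging follows. The class and field names are hypothetical; the point is only that the application server already knows the session (via cookie or URL-encoded ID) and the products shown, so it can record them directly instead of inferring them from a web server log.

```python
import uuid
from dataclasses import dataclass

@dataclass
class PageView:
    session_id: str
    page: str
    product_ids: list
    cookies_enabled: bool
    aborted: bool = False  # could be set if the client drops the connection

class ClickstreamLogger:
    """Illustrative logger; not the API of any real application server."""

    def __init__(self):
        self.page_views = []

    def log_request(self, cookies, page, product_ids, url_session_id=None):
        # Prefer the session cookie, fall back to a URL-encoded session ID.
        session_id = cookies.get("session_id") or url_session_id
        if session_id is None:
            session_id = uuid.uuid4().hex  # first request: start a session
        view = PageView(
            session_id=session_id,
            page=page,
            product_ids=list(product_ids),
            cookies_enabled="session_id" in cookies,
        )
        self.page_views.append(view)
        return session_id

logger = ClickstreamLogger()
sid = logger.log_request({}, "/home", [])
same = logger.log_request({"session_id": sid}, "/products/42", ["sku-42"])
```

The second call carries the cookie issued after the first, so both page views land in the same session.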

10
Customer Interaction
  • Data collection
  • Business event logging
  • A business event is a subset of requests
    considered as one logical event or episode
  • Examples:
      • Shopping cart info (contents of abandoned carts,
        add/remove items, initiate/finish checkout, etc.)
      • Search event (keywords and number of results can
        be logged)
      • Register event
  • Business events are used to track the effect of
    deploying business rules (personalization) to
    users, e.g. offering of promotions
  • Can use control groups with personalization
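
As a small illustration of what logged business events make possible, the sketch below reconstructs the contents of abandoned carts. The event names and the (session, event type, item) tuple layout are assumptions made for the example.

```python
def abandoned_carts(events):
    """Return cart contents for sessions that never finished checkout.

    events: iterable of (session_id, event_type, item) tuples, in time
    order. Event type names here are hypothetical.
    """
    carts = {}
    finished = set()
    for session_id, event_type, item in events:
        cart = carts.setdefault(session_id, [])
        if event_type == "add_item":
            cart.append(item)
        elif event_type == "remove_item" and item in cart:
            cart.remove(item)
        elif event_type == "finish_checkout":
            finished.add(session_id)
    # A cart is abandoned if it is non-empty and checkout never finished.
    return {sid: cart for sid, cart in carts.items()
            if cart and sid not in finished}

events = [
    ("s1", "add_item", "book"),
    ("s1", "add_item", "lamp"),
    ("s1", "remove_item", "lamp"),
    ("s2", "add_item", "shoes"),
    ("s2", "initiate_checkout", None),
    ("s2", "finish_checkout", None),
    ("s3", "add_item", "hat"),
    ("s3", "initiate_checkout", None),
]
abandoned = abandoned_carts(events)
```

Session s3 started checkout but never finished, so its cart counts as abandoned alongside s1.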

11
Build Data Warehouse
  • Transfers data collected within Customer
    Interaction to Analysis, as well as business data
    defined within Business Data Definition
  • Builds a data warehouse for analysis
  • Data is transformed to optimize it for analysis

12
Analysis
  • Tools:
      • Data transformations
      • Reporting
      • Data mining algorithms
      • Visualization
      • OLAP
  • Richness of metadata is very important to the
    success of the analysis component

13
Analysis
  • Data transformations
  • Convert source data to a format more suited to data
    mining
  • Types: create new attributes, add hierarchy
    attributes, filter, sample, delete columns,
    score, etc.
  • Attribute examples:
      • What percentage of each customer's orders used
        VISA?
      • How much money does each customer spend on books?
      • What is the frequency of each customer's
        purchases?
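
The three attribute examples can be sketched as a transformation from order records to one row per customer. The record layout is hypothetical, and "frequency" is interpreted here as the mean number of days between consecutive orders, which is one possible reading.

```python
from collections import defaultdict
from datetime import date

def customer_attributes(orders):
    """Derive per-customer attributes from order records (assumed layout)."""
    by_customer = defaultdict(list)
    for order in orders:
        by_customer[order["customer"]].append(order)

    attributes = {}
    for customer, cust_orders in by_customer.items():
        dates = sorted(o["date"] for o in cust_orders)
        visa_orders = sum(1 for o in cust_orders if o["payment"] == "VISA")
        book_spend = sum(amount
                         for o in cust_orders
                         for category, amount in o["items"]
                         if category == "books")
        span_days = (dates[-1] - dates[0]).days
        attributes[customer] = {
            "pct_visa_orders": 100.0 * visa_orders / len(cust_orders),
            "book_spend": book_spend,
            # Undefined for customers with a single order.
            "mean_days_between_orders":
                span_days / (len(dates) - 1) if len(dates) > 1 else None,
        }
    return attributes

orders = [
    {"customer": "alice", "date": date(2001, 1, 1), "payment": "VISA",
     "items": [("books", 20.0), ("music", 10.0)]},
    {"customer": "alice", "date": date(2001, 1, 11), "payment": "AMEX",
     "items": [("books", 5.0)]},
    {"customer": "bob", "date": date(2001, 2, 1), "payment": "VISA",
     "items": [("books", 12.5)]},
]
attrs = customer_attributes(orders)
```

The resulting attributes sit at the customer level, so they can be fed to a mining algorithm alongside other customer data.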

14
Analysis
  • Reporting
  • For business users to understand web site
    performance
  • Questions:
      • What are the top selling products?
      • What are the top failed searches?
      • What is the distribution of web browsers?
      • What are the top referrers by visit count?
      • What are the top referrers by sales amount?
      • What are the top abandoned products?
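
Two of these questions, top referrers by visit count and by sales amount, can be answered with simple counting. The sketch below assumes each visit is available as a (referrer, sales amount) pair; the function name and the example domains are placeholders.

```python
from collections import Counter

def top_referrers(visits, n=3):
    """Rank referrers by visit count and by sales amount.

    visits: iterable of (referrer, sales_amount) pairs, one per visit
    (hypothetical layout; sales_amount is 0.0 for visits with no order).
    """
    by_visits = Counter()
    by_sales = Counter()
    for referrer, sales_amount in visits:
        by_visits[referrer] += 1
        by_sales[referrer] += sales_amount
    return by_visits.most_common(n), by_sales.most_common(n)

visits = [
    ("search.example", 0.0),
    ("search.example", 30.0),
    ("ads.example", 0.0),
    ("ads.example", 0.0),
    ("ads.example", 10.0),
    ("partner.example", 50.0),
]
by_visits, by_sales = top_referrers(visits)
```

In this toy data the two rankings disagree: the referrer sending the most visits generates the least revenue, which is why both reports are useful.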

15
Analysis
  • Data mining algorithms
  • Model generation (i.e. classification and
    personalization)
  • Classification examples:
      • What characterizes heavy spenders?
      • What characterizes customers that prefer
        promotion X over Y?
      • What characterizes customers that buy quickly?
      • What characterizes visitors who browse but do not
        buy?
  • Interactive model modification:
      • View a segment defined by a subset of rules
        (e.g. age < 30 AND session time < 10 min)
      • Delete, add, or change a rule
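
Viewing a segment defined by a set of rules can be sketched as a conjunction of simple comparisons. The rule representation as (attribute, operator, value) triples is an assumption for illustration, not the format of any particular tool.

```python
import operator

OPS = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "==": operator.eq,
}

def matches(record, rules):
    """True if the record satisfies every rule (rules are ANDed together)."""
    return all(OPS[op](record[attr], value) for attr, op, value in rules)

def segment(records, rules):
    """Return the records that fall in the segment defined by the rules."""
    return [r for r in records if matches(r, rules)]

customers = [
    {"id": 1, "age": 25, "session_minutes": 5},
    {"id": 2, "age": 25, "session_minutes": 20},
    {"id": 3, "age": 45, "session_minutes": 5},
]

# age < 30 AND session time < 10 min
rules = [("age", "<", 30), ("session_minutes", "<", 10)]
young_and_quick = segment(customers, rules)

# Deleting a rule widens the segment; adding one narrows it.
young = segment(customers, rules[:1])
```

Because the rules are plain data, a business user's delete, add, or change operation is just an edit to the list followed by re-running the segment view.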

16
Deploy Results
  • Transfers models, scores, results, and new
    attributes constructed using data transformations
  • Contains new business rules for personalization
  • Results are immediately incorporated into the
    e-commerce company's business data definition

17
Challenges
  • Make data mining models comprehensible to
    business users
  • Make data transformation and model building
    accessible to business users
  • Support multiple granularity levels
    (customer attributes -> sessions -> page views)
  • Utilize hierarchies
  • Handle large amounts of data
  • Support and model external events
  • Support slowly changing dimensions
  • Identify bots and crawlers
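
For the last challenge, a very rough heuristic sketch is shown below. It flags a session as a likely bot if the user agent contains a telltale keyword or the page-view rate is implausibly high. The keyword list, the threshold, and the session field names are all assumptions; real bot detection needs considerably more than this.

```python
BOT_MARKERS = ("bot", "crawler", "spider")

def looks_like_bot(session, max_pages_per_minute=60):
    """Heuristic check for bot or crawler sessions (illustrative only)."""
    agent = session["user_agent"].lower()
    if any(marker in agent for marker in BOT_MARKERS):
        return True
    # Treat very short sessions as lasting one second to avoid dividing
    # by zero, then compare the page rate using integer arithmetic.
    seconds = max(session["duration_seconds"], 1)
    return session["page_views"] * 60 > max_pages_per_minute * seconds

sessions = [
    {"user_agent": "Mozilla/4.0", "page_views": 12, "duration_seconds": 300},
    {"user_agent": "ExampleBot/1.0", "page_views": 3, "duration_seconds": 30},
    {"user_agent": "Mozilla/4.0", "page_views": 500, "duration_seconds": 60},
]
flags = [looks_like_bot(s) for s in sessions]
```

Sessions flagged this way would typically be filtered out before analysis so that they do not distort reports and models.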

18
Questions