Capabilities Apollo and SQL Server Data Mining - PowerPoint PPT Presentation

About This Presentation
Title:

Capabilities Apollo and SQL Server Data Mining


Slides: 38
Provided by: SivaSab5

Transcript and Presenter's Notes


1
Capabilities: Apollo and SQL Server Data Mining
  • Presented by
  • Jeff Kaplan, Principal Client Services
  • Paul Bradley, Ph.D., Principal Data Mining
    Technology
  • 312.787.7376

2
Agenda
  • Apollo Overview
  • Data Mining 101
  • Project REAL Case Study
  • SQL Server 2005 Data Mining Demo
  • Real-life Examples

3
PART ONE
Apollo Overview
4
Company Background
overview
  • First company delivering true predictive analytic
    solutions
  • 10 plus years in data mining and data warehousing
  • Premier Partner for SQL Server 2005 Data Mining
  • Caters to a wide range of businesses including
    Microsoft, Sprint, Wal-Mart, Barnes & Noble,
    Seattle Times, Knight Ridder
  • Variety of Industries
      • Retail and Consumer Goods
      • Media
      • Financial Services
      • Manufacturing
      • Public Services

5
Industry Recognition
overview
6
Testimonials
overview
7
Testimonials
overview
8
Testimonials
overview
9
Analytic Landscape
overview
10
Capabilities
overview
Marketing
Sales & Distribution
Market Research
Operations
  • Claim Analysis
  • Call Center Analytics
  • Data Warehousing
  • Dashboard Reporting
  • Inventory Forecasting
  • Sales Forecasting
  • Pricing Optimization
  • Next Best Offer
  • Market Basket Analysis
  • Recency Frequency Modeling
  • Customer Acquisition
  • Campaign Targeting
  • Cross-sell/Up-sell
  • Customer Segmentation
  • Retention Modeling
  • Behavioral Targeting
  • Personalization
  • Correlation Analysis
  • Key Driver Analysis
  • Verbatim Summarization

11
overview
[Diagram: customer data sources and models around SQL-Server 2005]
  • Data Sources and Channels: Red Card, Phone, Booking, Web, Call
    Center, Email, Stores, Direct Mail
  • Models: Customer Targeting Models, Customer Clustering Models,
    Predictive Models
  • Steps: Join Customer Data Sources, Run Predictive Algorithms,
    Score Model Results, Deliver Targeted Predictions
  • Outcomes: Automate Predictions for Targeting, Forecasting,
    Detection, etc.; Dashboard Ad-hoc Reporting; Measure Promotion
    Success

12
PART TWO
MS Data Mining
13
Background
ms data mining
  • Fastest Growing BI Segment (IDC)
  • Data Mining Tools: $1.85B in 2006
  • Predictive Analytic projects yield a high median
    ROI of 145%
  • Uses
      • Marketing: Customer Acquisition and Targeting,
        Cross-Sell/Up-Sell
      • Retail: Inventory Forecasting, Price Optimization
      • Market Research: Driver Analysis, Verbatim
        Summarization
      • Operations: Call Center Analytics
      • Finance: Fraud Detection, Risk Models
  • Mainstream Emergence
      • E-commerce (e.g. Amazon.com)
      • Search (e.g. Vivisimo.com)
      • Behavioral Advertising
  • SQL-Server is in a Unique Position to Service
    Market Needs

14
Evolution of SQL Server Data Mining
ms data mining
SQL 2000
  • Enter the Game
  • Create industry standard
  • Target developer audience
  • V1.0 product with 2 algorithms
SQL 2005

15
Value of Data Mining
ms data mining
[Chart: Relative Business Value against difficulty (Easy to Difficult); labels: Business Knowledge, SQL-Server 2005]
16
SQL-Server 2005 BI Platform
ms data mining
17
SQL Server 2005 BI Platform
ms data mining
  • Embed Data Mining Development Tool Integration
  • Make Decisions Without Coding
  • Customized Logic Based on Client Data
  • Logic Updated by Model Reprocessing:
    Applications Do Not Need to be Re-Written,
    Re-Compiled, and Re-Deployed
  • Data Mining Key Points
  • Price Point to Achieve Market Penetration
  • Database Metaphors for Building, Managing,
    Utilizing Extracted Patterns and Trends
  • APIs for Embedding Data Mining Functionality into
    Applications

18
SQL-Server 2005 Algorithms
ms data mining
Decision Trees
Time Series
Neural Net
Clustering
Sequence Clustering
Association
Naïve Bayes
Linear and Logistic Regression
19
PART THREE
Project REAL
20
Client Profile: Inventory Forecasting
project real
  • Create a Reference Implementation of a BI System
    Using Real Retail Data.
  • Partners - Barnes & Noble, Microsoft, Scalability
    Experts, EMC, Unisys, Panorama, Apollo
  • Forecast Out-of-Stock for 5 Book Titles Across
    Entire Chain (800 Stores)
  • Predictive Models to Flag Items That Are Going to
    be Out-of-Stock
  • Model on 48 Weeks of Data, Predictions for Month
    of December
  • Models Predicted Out-of-Stock Occurrences with > 90%
    Accuracy
  • Conservative Sales Opportunity for just 5 Titles:
    $6,800 per year
  • Extrapolate Across Millions of Titles - Million
    Dollar Sales Opportunity

21
Predictive Modeling Process
project real
Diagram boxes: ITEM, STORE, CATEGORY
STEP 1 Each item belongs to a category. For the category,
create a set of store clusters predictive of sales in the
category.
STEP 2 Identify the cluster which the store belongs to, for
the category of that item.
STEP 3 Utilize sales data to predict item sales 2 weeks out.
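
The first two steps above can be sketched in a few lines. This is only an illustration: it swaps in a plain one-dimensional k-means for the SQL Server clustering algorithm, and the names `cluster_stores`, `cluster_for`, the store ids, and the sales figures are all hypothetical. Step 3 (the actual prediction) is left to a separate model.

```python
from statistics import mean


def cluster_stores(avg_sales, k=2, iters=20):
    """Group stores by average category sales with 1-D k-means.

    avg_sales: dict of store id -> average sales in one category.
    Returns dict of store id -> cluster index (0 = lowest sales).
    """
    values = sorted(avg_sales.values())
    # Spread the initial centers across the sorted values.
    centers = [values[(2 * i + 1) * len(values) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the previous center if a cluster received no stores.
        centers = sorted(mean(g) if g else centers[i] for i, g in enumerate(groups))
    return {
        store: min(range(k), key=lambda c: abs(v - centers[c]))
        for store, v in avg_sales.items()
    }


# Hypothetical data: item -> category, and category -> store -> average sales.
item_category = {"title_1": "fiction"}
category_sales = {
    "fiction": {"store_a": 100, "store_b": 120, "store_c": 2000, "store_d": 2100}
}

# STEP 1: one set of store clusters per category.
store_clusters = {cat: cluster_stores(s) for cat, s in category_sales.items()}


def cluster_for(item, store):
    """STEP 2: cluster of the store, for the category of the item."""
    return store_clusters[item_category[item]][store]
```

The cluster index returned by `cluster_for` is what a downstream model would take as an input attribute in Step 3.
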
22
Store Clustering Demo
project real
23
Store Clustering Overview
project real
Average Category Sales 120, 2,081
Average Category Sales 685, 14,366
Average Category Sales 1,320, 22,805
Average Category Sales 8,936, 188,921
Average Category Sales 2,532, 45,153
24
Out-of-Stock Data Preparation Summary
project real
  • Apollo Explored 3 Data Preparation Strategies
  • Strategy 1: Use Sales, On-Hand, On-Order History Data for All
    Stores in the Same Cluster
      • Build One Mining Structure per Cluster, For All
        Stores in that Cluster for Each Title
      • Build One Mining Model per Store, per Cluster for
        Each Title
      • Negative: Few OOS Examples per Store;
        Computation to Deploy One Mining Model per
        Store/Title Combination
  • Strategy 2: Use Sales, On-Hand, On-Order History for All
    Stores, Across All Clusters
      • Build One Mining Structure per Book, Use Cluster
        Membership of Store as Input Attribute
      • Positive: Optimizes OOS Examples per Title by
        Considering All Stores
      • Negative: Does Not Capture Derivative Sales
        Information
  • Strategy 3: Removed Negative of Strategy 2
      • Included Historical Week-on-Week Sales Derivative
        Information for Each Title
      • Increase the Information Content of the Source
        Data for Modeling

25
Creating Variables for Success
project real
  • Using
      • Sales and Inventory History from January 2004 to
        end of November 2004
      • Recommend two (2) years of Historical Data to
        Increase accuracy for training model
  • Key
      • Store, Fiscal Year, WeekID
  • Predicted Variables
      • 1 Week Ahead OOS Boolean
      • 1 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, > 4)
      • 2 Week Ahead OOS Boolean
      • 2 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, > 4)
  • Input Attributes
      • Store Cluster Membership (Derived from Store
        Cluster Model)
      • Current Week Sales, On-Hand, On-Order
      • Preceding 1-5 Week Sales, On-Hand, On-Order
      • Sales Derivative Attributes
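
A rough sketch of how the variables listed above could be derived from one store/title weekly history. Several things here are assumptions rather than anything stated on the slide: the record layout (`week`, `sales`, `on_hand`, `on_order`), treating out-of-stock as `on_hand == 0`, reading the top sales bin as "more than 4", and using the week-on-week sales change as the only derivative attribute. The function names are hypothetical.

```python
def sales_bin(units):
    """Map a weekly unit count to the slide's bins (top bin assumed to be > 4)."""
    if units <= 0:
        return "None"
    if units <= 2:
        return "1 to 2"
    if units <= 4:
        return "3 to 4"
    return "> 4"


def build_rows(history, store_cluster, lags=5, horizons=(1, 2)):
    """Turn an ordered weekly history into one training row per usable week.

    history: list of dicts with keys week, sales, on_hand, on_order.
    store_cluster: cluster membership of the store (an input attribute).
    """
    rows = []
    # A week is usable when it has `lags` weeks behind it and
    # max(horizons) weeks ahead of it.
    for t in range(lags, len(history) - max(horizons)):
        now = history[t]
        row = {
            "week": now["week"],
            "store_cluster": store_cluster,
            "sales": now["sales"],
            "on_hand": now["on_hand"],
            "on_order": now["on_order"],
            # Assumed derivative attribute: change versus the previous week.
            "sales_delta": now["sales"] - history[t - 1]["sales"],
        }
        for lag in range(1, lags + 1):
            past = history[t - lag]
            row[f"sales_lag{lag}"] = past["sales"]
            row[f"on_hand_lag{lag}"] = past["on_hand"]
            row[f"on_order_lag{lag}"] = past["on_order"]
        for h in horizons:
            future = history[t + h]
            # Assumed OOS definition: nothing on hand in the future week.
            row[f"oos_{h}wk"] = future["on_hand"] == 0
            row[f"sales_bin_{h}wk"] = sales_bin(future["sales"])
        rows.append(row)
    return rows
```

Each row pairs the current and preceding five weeks of inputs with the one-week and two-week-ahead targets, so the first five and last two weeks of a history produce no rows.
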

26
Model Training and Testing Scenarios
project real
  • Purpose: Intelligence on Model Training
    Frequency
  • Scenario 1: Train Models Every 2 Weeks
      • Training Dataset: All Data Prior to Last 2
        Fiscal Weeks in December 2004
      • Test Dataset: Last 2 Fiscal Weeks in December
        2004
  • Scenario 2: Train Models Monthly
      • Training Dataset: All Data Prior to End of
        Fiscal November 2004
      • Test Dataset: Fiscal Month of December 2004
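
Both scenarios are time-based splits, which could be sketched as below. The week numbering is an assumption made for illustration: a 52-week fiscal year with fiscal December taken to be weeks 49 to 52. `split_by_week` is a hypothetical helper, not part of any SQL Server API.

```python
def split_by_week(rows, test_weeks):
    """Split rows so training only sees weeks before the first test week."""
    test_weeks = set(test_weeks)
    first_test = min(test_weeks)
    train = [r for r in rows if r["week"] < first_test]
    test = [r for r in rows if r["week"] in test_weeks]
    return train, test


# One placeholder row per fiscal week (assumed numbering 1..52).
rows = [{"week": w} for w in range(1, 53)]

# Scenario 1: test on the last 2 fiscal weeks of December.
s1_train, s1_test = split_by_week(rows, [51, 52])

# Scenario 2: test on the whole fiscal month of December.
s2_train, s2_test = split_by_week(rows, range(49, 53))
```

Splitting on the week rather than at random keeps future weeks out of the training data, which is what makes the test accuracy a fair estimate of forward-looking performance.
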

27
Balancing Training Data
project real
  • When Considering All Stores, Still Have
    Un-Balanced Datasets
      • Store/Week Combinations Where OOS is False >>
        Store/Week Combinations Where OOS is True
      • Common in Many Data Mining Applications
  • Training Datasets were Balanced
      • Sample Store/Week Combinations Where OOS is False
        to Obtain Equal Proportion of True/False Values
  • Costs of Predictive Errors are Equal
      • Requested by Client
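
The balancing step described above amounts to downsampling the OOS-is-False rows. A minimal sketch, assuming each training row is a dict with a Boolean `oos` field (the field name and the function name are hypothetical):

```python
import random


def balance_by_downsampling(rows, label="oos", seed=0):
    """Sample the False rows down to the number of True rows."""
    positives = [r for r in rows if r[label]]
    negatives = [r for r in rows if not r[label]]
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    keep = min(len(negatives), len(positives))
    balanced = positives + rng.sample(negatives, keep)
    rng.shuffle(balanced)
    return balanced
```

Only the training data would be balanced this way; the test data keeps its natural True/False proportion so that reported accuracy reflects real conditions.
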

28
Prediction Methods
project real
  • Algorithm Selection
  • Microsoft Decision Trees for Predicting OOS
    Boolean flags
  • Consistently High Overall Accuracy
  • Straightforward Interpretation
  • Data Preparation
  • Scenario 2
  • Rebuild models monthly
  • Predictive Models are Contextual and Optimized
    for Behavior in the Coming Month

29
Prediction Methods
project real
  • Modeling Methodology Benefits
  • Scalability (Titles and Stores)
  • Saves 4x to 5x on Computational Cost when
    Rebuilding Models (versus Neural Networks)
  • 5 Minutes for All 5 Titles, or 1 Minute per Title
    for All Stores

30
Out-of-Stock Prediction Demo
project real
31
Predictive Models
project real
  • Identify Opportunities to Improve Forecasting
    Rules
  • Save Scored Results to Database and Leverage UI
    to View KPIs and Alerts for Store Managers and
    Inventory Analysts

32
Inventory Prediction Results
project real
  • 1 week and 2 week prediction accuracies

33
Sales Opportunity
project real
  • Data Mining created revenue generating
    opportunity
  • Based on 55 titles for Jan 2004 - Dec 2004
  • (# of weeks OOS across all stores) x (Apollo Boolean
    Predicted Accuracy) x (actual of actual sales across
    all stores) x (retail price)
  • Yearly Increase in Sales Opportunity using
    Apollo OOS Predictions

Sales bins produced $3.4K, $6.8K potential lift
in sales
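
The slide's calculation is a product of four factors, which can be written out as a small function. The third factor is garbled in the transcript; the sketch below assumes it means the units typically sold per store-week, and that reading may be wrong. The function name, parameter names, and the example figures are all hypothetical and are not the project's data.

```python
def sales_opportunity(oos_weeks, predicted_accuracy, units_per_week, retail_price):
    """Multiply the four factors from the slide.

    oos_weeks: number of store-weeks out of stock, across all stores.
    predicted_accuracy: fraction of OOS events the model flags correctly.
    units_per_week: assumed reading of the garbled third factor.
    retail_price: price per unit.
    """
    return oos_weeks * predicted_accuracy * units_per_week * retail_price


# Made-up inputs, purely to show the arithmetic.
example = sales_opportunity(
    oos_weeks=100, predicted_accuracy=0.9, units_per_week=2, retail_price=20.0
)
```
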
34
PART FOUR
Client Profiles
35
Client Profile: Customer Acquisition
client profiles
  • Decrease Subscriber Churn
  • Increase New Subscriptions
  • Segment Geo-Demographic and Attitudinal Behaviors
    for Subscribers and Non-Subscribers
  • Build Predictive Models to Identify Likely New
    Subscribers
  • Using Analysis to Deliver Targeted Marketing
    Campaigns for Acquisition
  • Increased Stop Saves by 2

36
Client Profile: Cross-sell / Up-sell (Global
Catalog Retailer)
client profiles
  • Increase Average Purchase Size
  • Deploy Product Recommendations on their Website
  • Modeling Historical Sales to Determine Product
    Affinities
  • Incorporate Business Logic into Modeling Process
    (e.g. Same category recommendation)
  • Increase Average Shopping Cart Size
  • Increase Sales Lift
  • Data Mining Driven Product Recommendation
    Performed Better than Manual Recommendations
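
Product affinities of the kind described above can be approximated by counting how often items appear in the same order. This sketch uses simple co-occurrence counts rather than the SQL Server association algorithm, and the optional same-category filter stands in for the business logic mentioned on the slide. All names and data shapes here are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(item, baskets, category=None, top_n=3):
    """Rank items most often bought together with `item`.

    category: optional dict of item -> category; when given, only
    items in the same category as `item` are recommended.
    """
    scores = Counter()
    for (a, b), n in pair_counts(baskets).items():
        if item == a:
            scores[b] += n
        elif item == b:
            scores[a] += n
    if category is not None:
        scores = Counter(
            {other: n for other, n in scores.items()
             if category[other] == category[item]}
        )
    return [other for other, _ in scores.most_common(top_n)]
```
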

37
Client Profile: Customer Support Automation
client profiles
  • Increase Visibility into Customer Service Center
  • Increase Speed of Customer Support
  • Utilizing Text Mining Engines to Automate
    Processing of Customer Support (Email, Web
    Inquiries, etc.)
  • Automating the Process of Rolling up Keywords
    into Concepts
  • Customer Support Center has the Ability to View
    Trends in Minutes versus Weeks
  • Improved Accuracy - Text Mining Engines Removed
    the Bias and Inaccuracies Often Occurring in Call
    Center Representative Notes and Tagging.
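
Rolling keywords up into concepts can be pictured as a lookup from words to concept names, followed by a count per concept. The concept dictionary below is invented for illustration, and a real text mining engine would do far more (stemming, phrases, learned vocabularies); this is only a sketch of the roll-up idea.

```python
import re
from collections import Counter

# Hypothetical concept dictionary: concept name -> trigger keywords.
CONCEPTS = {
    "billing": {"invoice", "charge", "refund", "bill"},
    "delivery": {"late", "shipping", "delivery", "package"},
}


def concepts_in(message, concepts=CONCEPTS):
    """Return the set of concepts whose keywords appear in the message."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return {name for name, keywords in concepts.items() if words & keywords}


def concept_trend(messages, concepts=CONCEPTS):
    """Count how many messages mention each concept."""
    counts = Counter()
    for message in messages:
        counts.update(concepts_in(message, concepts))
    return counts
```

Counting per message rather than per keyword means a message that mentions both "invoice" and "charge" still adds one to the billing concept.
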

38
Client Profile: Key Driver Analysis
client profiles
  • Evaluate Customer Satisfaction Metrics
  • Increase Customer Satisfaction
  • Partnered with Apollo to Develop Market Research
    Database and Reporting
  • Developed Models to Identify Key Satisfaction
    Drivers
  • Successfully Identified Drivers to Increase
    Customer Satisfaction
  • Delivered Driver Recommendations to Field
    Operations - Insight into Action
  • Company Wide (sales, marketing, executive level)
    Visibility into Customer Satisfaction Metrics

39
  • Presented by
  • Jeff Kaplan
  • Principal Client Services
  • jeff_at_apollodatatech.com
  • 312.787.7376