Title: Capabilities Apollo and SQL Server Data Mining
1Capabilities Apollo and SQL Server Data Mining
- Presented by
- Jeff Kaplan, Principal Client Services
- Paul Bradley, Ph.D., Principal Data Mining Technology
- 312.787.7376
2Agenda
- Apollo Overview
- Data Mining 101
- Project REAL Case Study
- SQL Server 2005 Data Mining Demo
- Real-life Examples
3PART ONE
Apollo Overview
4Company Background
overview
- First company delivering true predictive analytic solutions
- 10+ years in data mining and data warehousing
- Premier Partner for SQL Server 2005 Data Mining
- Cater to a wide range of businesses including Microsoft, Sprint, Wal-Mart, Barnes & Noble, Seattle Times, Knight Ridder
- Variety of Industries
- Retail and Consumer Goods
- Media
- Financial Services
- Manufacturing
- Public Services
5Industry Recognition
overview
6Testimonials
overview
7Testimonials
overview
8Testimonials
overview
9Analytic Landscape
overview
10Capabilities
overview
Marketing
Sales & Distribution
Market Research
Operations
- Claim Analysis
- Call Center Analytics
- Data Warehousing
- Dashboard Reporting
- Inventory Forecasting
- Sales Forecasting
- Pricing Optimization
- Next Best Offer
- Market Basket Analysis
- Recency Frequency Modeling
- Customer Acquisition
- Campaign Targeting
- Cross-sell/Up-sell
- Customer Segmentation
- Retention Modeling
- Behavioral Targeting
- Personalization
- Correlation Analysis
- Key Driver Analysis
- Verbatim Summarization
11overview
Customer Targeting Models
- Join Customer Data Sources
- Deliver Targeted Predictions
- Run Predictive Algorithms
Red Card
Customer Clustering Models
Phone
Predictive Models
Booking
SQL-Server 2005
Web
Call Center
Automate Predictions for Targeting, Forecasting,
Detection, etc.
Email
Dashboard Ad-hoc Reporting
Stores
Direct Mail
Measure Promotion Success
12MS Data Mining
PART TWO
13Background
ms data mining
- Fastest Growing BI Segment (IDC)
- Data Mining Tools: $1.85B in 2006
- Predictive Analytic projects yield a high median ROI of 145%
- Uses
- Marketing: Customer Acquisition and Targeting, Cross-Sell/Up-Sell
- Retail: Inventory Forecasting, Price Optimization
- Market Research: Driver Analysis, Verbatim Summarization
- Operations: Call Center Analytics
- Finance: Fraud Detection, Risk Models
- Mainstream Emergence
- E-commerce (e.g. Amazon.com)
- Search (e.g. Vivisimo.com)
- Behavioral Advertising
- SQL-Server is in a Unique Position to Service Market Needs
14Evolution of SQL Server Data Mining
ms data mining
SQL 2005
SQL 2000
- Enter the Game
- Create industry standard
- Target developer audience
- V1.0 product with 2 algorithms
15Value of Data Mining
ms data mining
Business Knowledge
SQL-Server 2005
Relative Business Value
Easy
Difficult
16SQL-Server 2005 BI Platform
ms data mining
17SQL Server 2005 BI Platform
ms data mining
- Embed Data Mining Development Tool Integration
- Make Decisions Without Coding
- Customized Logic Based on Client Data
- Logic Updated by Model Reprocessing
- Applications Do Not Need to be Re-Written, Re-Compiled, and Re-Deployed
- Data Mining Key Points
- Price Point to Achieve Market Penetration
- Database Metaphors for Building, Managing, Utilizing Extracted Patterns and Trends
- APIs for Embedding Data Mining Functionality into Applications
18SQL-Server 2005 Algorithms
ms data mining
Decision Trees
Time Series
Neural Net
Clustering
Sequence Clustering
Association
Naïve Bayes
Linear and Logistic Regression
19Project REAL
PART THREE
20Client Profile: Inventory Forecasting
project real
- Create a Reference Implementation of a BI System Using Real Retail Data
- Partners: Barnes & Noble, Microsoft, Scalability Experts, EMC, Unisys, Panorama, Apollo
- Forecast Out-of-Stock for 5 Book Titles Across Entire Chain (800 Stores)
- Predictive Models to Flag Items That Are Going to be Out-of-Stock
- Model on 48 Weeks of Data, Predictions for Month of December
- Models Predicted Out-of-Stock Occurrences with > 90% Accuracy
- Conservative Sales Opportunity for just 5 Titles: $6,800 per year
- Extrapolate Across Millions of Titles: Million Dollar Sales Opportunity
21Predictive Modeling Process
project real
STEP 1
ITEM
STORE
STEP 2: Identify the cluster to which the store belongs, for the category of that item.
Each item belongs to a category
Category
CATEGORY
For the category, create a set of store clusters predictive of sales in the category
STEP 3: Utilize sales data to predict item sales 2 weeks out.
22Store Clustering Demo
project real
23Store Clustering Overview
project real
Average Category Sales 120, 2,081
Average Category Sales 685, 14,366
Average Category Sales 1,320, 22,805
Average Category Sales 8,936, 188,921
Average Category Sales 2,532, 45,153
24Out-of-Stock Data Preparation Summary
project real
- Apollo Explored 3 Data Preparation Strategies
- Strategy 1: Use Sales, On-Hand, On-Order History Data for All Stores in the Same Cluster
- Build One Mining Structure per Cluster, For All Stores in that Cluster for Each Title
- Build One Mining Model per Store, per Cluster for Each Title
- Negative: Few OOS Examples per Store, Computation to Deploy One Mining Model per Store/Title Combination
- Strategy 2: Use Sales, On-Hand, On-Order History for All Stores, Across All Clusters
- Build One Mining Structure per Book, Use Cluster Membership of Store as Input Attribute
- Positive: Optimizes OOS Examples per Title by Considering All Stores
- Negative: Does Not Capture Derivative Sales Information
- Strategy 3: Removed Negative of Strategy 2
- Included Historical Week-on-Week Sales Derivative Information for Each Title
- Increase the Information Content of the Source Data for Modeling
25Creating Variables for Success
project real
- Using
- Sales and Inventory History from January 2004 to end of November 2004
- Recommend two (2) years of Historical Data to Increase Accuracy of the Trained Model
- Key
- Store Fiscal Year WeekID
- Predicted Variables
- 1 Week Ahead OOS Boolean
- 1 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4)
- 2 Week Ahead OOS Boolean
- 2 Week Ahead Sales Bin (None, 1 to 2, 3 to 4, 4)
- Input Attributes
- Store Cluster Membership (Derived from Store Cluster Model)
- Current Week Sales, On-Hand, On-Order
- Preceding 1-5 Week Sales, On-Hand, On-Order
- Sales Derivative Attributes
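A minimal sketch of how the input attributes and 1- and 2-week-ahead OOS flags listed above could be assembled for a single store/title pair. The field names (`sales`, `on_hand`, `on_order`), the function name `build_rows`, and the rule that a week counts as out-of-stock when on-hand inventory is zero are assumptions made for illustration; they are not taken from the Project REAL implementation.

```python
from typing import Dict, List


def build_rows(history: List[Dict[str, int]], n_lags: int = 5) -> List[Dict[str, object]]:
    """Build one modeling row per week from weekly records (oldest first).

    Each record is assumed to hold 'sales', 'on_hand' and 'on_order'.
    """
    rows = []
    # Start once n_lags preceding weeks exist; stop while 2 future weeks remain.
    for t in range(n_lags, len(history) - 2):
        row: Dict[str, object] = {}
        for field in ("sales", "on_hand", "on_order"):
            row[f"{field}_cur"] = history[t][field]
            for k in range(1, n_lags + 1):
                row[f"{field}_lag{k}"] = history[t - k][field]
        # Week-on-week sales derivative (current week minus preceding week).
        row["sales_delta_1"] = history[t]["sales"] - history[t - 1]["sales"]
        # Assumed OOS definition: nothing on hand in the target week.
        row["oos_1wk"] = history[t + 1]["on_hand"] == 0
        row["oos_2wk"] = history[t + 2]["on_hand"] == 0
        rows.append(row)
    return rows
```

With eight weeks of history and five lags, only one week has both a full lag window and two future weeks, so a single row is produced.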
26Model Training and Testing Scenarios
project real
- Purpose: Intelligence on Model Training Frequency
- Scenario 1: Train Models Every 2 Weeks
- Training Dataset: All Data Prior to Last 2 Fiscal Weeks in December 2004
- Test Dataset: Last 2 Fiscal Weeks in December 2004
- Scenario 2: Train Models Monthly
- Training Dataset: All Data Prior to End of Fiscal November 2004
- Test Dataset: Fiscal Month of December 2004
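The two scenarios differ only in where the time-based cut between training and test data falls. A rough sketch follows, assuming each row carries a hypothetical integer `week_id` and that the fiscal year is numbered 1 to 52 with fiscal December covering weeks 49 to 52; both are illustrative assumptions, not details from the project.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def split_by_week(rows: List[Row], first_test_week: int) -> Tuple[List[Row], List[Row]]:
    """Train on every week before first_test_week, test on the rest."""
    train = [r for r in rows if r["week_id"] < first_test_week]
    test = [r for r in rows if r["week_id"] >= first_test_week]
    return train, test


# Hypothetical fiscal calendar: one row per week, weeks 1..52.
weeks = [{"week_id": w} for w in range(1, 53)]

# Scenario 1: hold out the last 2 fiscal weeks of December.
train_1, test_1 = split_by_week(weeks, first_test_week=51)

# Scenario 2: hold out the whole fiscal month of December.
train_2, test_2 = split_by_week(weeks, first_test_week=49)
```

Splitting on time rather than at random keeps future weeks out of the training set, which matters when the inputs are lagged sales and inventory values.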
27Balancing Training Data
project real
- When Considering All Stores, Still Have Un-Balanced Datasets
- Store/Week Combinations Where OOS is False >> Store/Week Combinations Where OOS is True
- Common in Many Data Mining Applications
- Training Datasets were Balanced
- Sample Store/Week Combinations Where OOS is False to Obtain Equal Proportion of True/False Values
- Costs of Predictive Errors are Equal
- Requested by Client
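Balancing by sampling the majority class, as described above, can be sketched as follows. The function name `balance_by_downsampling`, the `oos` label key, and the fixed random seed are assumptions for illustration only.

```python
import random
from typing import Dict, List


def balance_by_downsampling(rows: List[Dict[str, object]],
                            label: str = "oos",
                            seed: int = 0) -> List[Dict[str, object]]:
    """Return a shuffled subset with equal numbers of True and False labels."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    positives = [r for r in rows if r[label]]
    negatives = [r for r in rows if not r[label]]
    # Keep as many rows per class as the smaller class can supply.
    k = min(len(positives), len(negatives))
    balanced = rng.sample(positives, k) + rng.sample(negatives, k)
    rng.shuffle(balanced)
    return balanced
```

Only the training set would be balanced this way; evaluating on data with the natural OOS rate gives a more realistic accuracy figure.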
28Prediction Methods
project real
- Algorithm Selection
- Microsoft Decision Trees for Predicting OOS Boolean Flags
- Consistently High Overall Accuracy
- Straightforward Interpretation
- Data Preparation
- Scenario 2
- Rebuild models monthly
- Predictive Models are Contextual and Optimized for Behavior in the Coming Month
29Prediction Methods
project real
- Modeling Methodology Benefits
- Scalability (Titles and Stores)
- Saves 4x to 5x on Computational Cost when Rebuilding Models (versus Neural Networks)
- 5 Minutes for All 5 Titles => 1 Minute per Title for All Stores
30Out-of-Stock Prediction Demo
project real
31Predictive Models
project real
- Identify Opportunities to Improve Forecasting Rules
- Save Scored Results to Database and Leverage UI to View KPI and Alerts for Store Managers and Inventory Analysts
32Inventory Prediction Results
project real
- 1 week and 2 week prediction accuracies
33Sales Opportunity
project real
- Data Mining created revenue generating opportunity
- Based on 55 titles for Jan 2004 - Dec 2004
- (# of weeks OOS across all stores) x (Apollo Boolean Predicted Accuracy) x (actual # of actual sales across all stores) x (retail price)
- Yearly Increase in Sales Opportunity using Apollo OOS Predictions
Sales bins produced $3.4K, $6.8K potential lift in sales
34PART FOUR
Client Profiles
35Client Profile: Customer Acquisition
client profiles
- Decrease Subscriber Churn
- Increase New Subscriptions
- Segment Geo-Demographic and Attitudinal Behaviors for Subscribers and Non-Subscribers
- Build Predictive Models to Identify Likely New Subscribers
- Using Analysis to Deliver Targeted Marketing Campaigns for Acquisition
- Increased Stop Saves by 2
36Client Profile: Cross-sell / Up-sell (Global Catalog Retailer)
client profiles
- Increase Average Purchase Size
- Deploy Product Recommendations on their Website
- Modeling Historical Sales to Determine Product Affinities
- Incorporate Business Logic into Modeling Process (e.g. Same category recommendation)
- Increase Average Shopping Cart Size
- Increase Sales Lift
- Data Mining Driven Product Recommendation Performed Better than Manual Recommendations
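One simple way to derive product affinities from historical sales is to count how often two products appear in the same order, then apply a business rule as a filter. The sketch below uses plain co-occurrence counts and a same-category rule; it illustrates the general idea only and is not the model used for this client. The names `pair_counts`, `recommend`, and `category_of` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple


def pair_counts(orders: List[List[str]]) -> Counter:
    """Count how many orders contain each unordered product pair."""
    counts: Counter = Counter()
    for order in orders:
        # sorted(set(...)) gives each pair one canonical (a, b) form per order.
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(product: str,
              orders: List[List[str]],
              category_of: Dict[str, str],
              top_n: int = 3) -> List[str]:
    """Products most often bought with `product`, limited to its own category."""
    scores: Counter = Counter()
    for (a, b), n in pair_counts(orders).items():
        if a == product:
            other = b
        elif b == product:
            other = a
        else:
            continue
        # Business rule from the slide: same-category recommendations only.
        if category_of.get(other) == category_of.get(product):
            scores[other] = n
    return [p for p, _ in scores.most_common(top_n)]
```

Raw co-occurrence favors best sellers, so a production model would typically normalize by product popularity (for example with lift or confidence) before ranking.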
37Client Profile: Customer Support Automation
client profiles
- Increase Visibility into Customer Service Center
- Increase Speed of Customer Support
- Utilizing Text Mining Engines to Automate Processing of Customer Support (Email, Web Inquiries, etc.)
- Automating the Process of Rolling up Keywords into Concepts
- Customer Support Center has the Ability to View Trends in Minutes versus Weeks
- Improved Accuracy: Text Mining Engines Removed the Bias and Inaccuracies Often Occurring in Call Center Representative Notes and Tagging.
38Client Profile: Key Driver Analysis
client profiles
- Evaluate Customer Satisfaction Metrics
- Increase Customer Satisfaction
- Partnered with Apollo to Develop Market Research Database and Reporting
- Developed Models to Identify Key Satisfaction Drivers
- Successfully Identified Drivers to Increase Customer Satisfaction
- Delivered Driver Recommendations to Field Operations: Insight into Action
- Company Wide (sales, marketing, executive level) Visibility into Customer Satisfaction Metrics
39- Presented by
- Jeff Kaplan
- Principal Client Services
- jeff_at_apollodatatech.com
- 312.787.7376