Title: MKT 700 Business Intelligence and Decision Models
1. MKT 700 Business Intelligence and Decision Models: Algorithms and Customer Profiling (1)
2. Classification and Prediction
3. Classification: Unsupervised Learning
4. Prediction: Supervised Learning
5. SPSS Direct Marketing

                       Classification                                 Predictive
Unsupervised Learning  RFM, Cluster analysis, Postal Code Responses   NA
Supervised Learning    Customer Profiling                             Propensity to buy
6. SPSS Analysis

                       Classification                                           Predictive
Unsupervised Learning  Hierarchical Cluster, Two-Step Cluster, K-Means Cluster  NA
Supervised Learning    Classification Trees (CHAID, CART)                       Linear Regression, Logistic Regression, Artificial Neural Nets
7. Major Algorithms

                       Classification                                                                    Predictive
Unsupervised Learning  Euclidean Distance, Log Likelihood                                                NA
Supervised Learning    Chi-square Statistics, Log Likelihood, GINI Impurity Index, F-Statistics (ANOVA)  Log Likelihood, F-Statistics (ANOVA)

Nominal variables: Chi-square, Log Likelihood
Continuous variables: F-Statistics, Log Likelihood
8. Euclidean Distance

9. Euclidean Distance for Continuous Variables
- Pythagorean distance: d = √(a² + b²)
- Euclidean space: d = √(a² + b² + c²)
- Euclidean distance: d = (Σ dᵢ²)^(1/2) (cluster analysis with continuous variables)
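As a small illustration of the formulas above (Python is used here for illustration only; the course itself works in SPSS and Excel), a minimal Euclidean distance function that handles the 2-D Pythagorean case, 3-D space, and the general n-dimensional case used in cluster analysis:

```python
import math

def euclidean(p, q):
    """Euclidean distance d = sqrt(sum_i (p_i - q_i)^2) between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# 2-D (Pythagorean) case: legs 3 and 4 give hypotenuse 5
print(euclidean((0, 0), (3, 4)))     # -> 5.0
# 3-D (Euclidean space) case
print(euclidean((0, 0, 0), (1, 2, 2)))  # -> 3.0
```

The same function works for any number of continuous variables, which is exactly how distance-based clustering (e.g. K-Means) compares observations.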
10. Pearson's Chi-Square

11. Contingency Table

        North  South  East  West  Tot.
Yes      68     75     57    79   279
No       32     45     33    31   141
Tot.    100    120     90   110   420
12. Observed and Theoretical Frequencies

Observed frequencies, with expected (theoretical) frequencies in parentheses:

        North     South     East      West      Tot.
Yes     68 (66)   75 (80)   57 (60)   79 (73)   279 (66%)
No      32 (34)   45 (40)   33 (30)   31 (37)   141 (34%)
Tot.    100       120       90        110       420
13. Chi-Square

Cell  fo   fe   fo-fe  (fo-fe)²  (fo-fe)²/fe
1,1   68   66     2      4       .0606
1,2   75   80    -5     25       .3125
1,3   57   60    -3      9       .1500
1,4   79   73     6     36       .4932
2,1   32   34    -2      4       .1176
2,2   45   40     5     25       .6250
2,3   33   30     3      9       .3000
2,4   31   37    -6     36       .9730

χ² = Σ (fo-fe)²/fe = 3.032
14Statistical Inference
- DF (4 col 1) (2 rows 1) 3
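The chi-square test above can be reproduced with SciPy (an illustrative sketch outside the course's SPSS/Excel toolset). Note that the slide's χ² = 3.032 was computed from expected frequencies rounded to integers; using exact expected frequencies gives a slightly smaller statistic, with the same conclusion:

```python
from scipy.stats import chi2_contingency

# Observed frequencies from the Region x Response contingency table
observed = [[68, 75, 57, 79],   # Yes
            [32, 45, 33, 31]]   # No

chi2, p, df, expected = chi2_contingency(observed, correction=False)
print(round(chi2, 3), df)  # chi-square ~ 2.76 with exact expected values, df = 3
```

Because p > .05, independence between region and response is not rejected, matching the inference on slide 14.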
15. Log-Likelihood Chi-Square

16. Log Likelihood
- Based on probability distributions rather than contingency (frequency) tables.
- Applicable to both categorical and continuous variables, unlike chi-square, which requires variables to be discretized.
17. Contingency Table (Observed Frequencies)

        Cluster 1  Cluster 2  Total
Male    10         30         40

18. Contingency Table (Expected Frequencies)

        Cluster 1  Cluster 2  Total
Male    10 (20)    30 (20)    40 (40)
19. Chi-Square

Cell  fo   fe   fo-fe  (fo-fe)²  (fo-fe)²/fe
1,1   10   20   -10    100       5.00
1,2   30   20    10    100       5.00

χ² = 10.00
p < 0.05 (DF = 1, critical value 3.84)
20. Log-Likelihood Distance (Probability)

             Cluster 1       Cluster 2
O            10              30
E            20              20
O/E          10/20 = 0.50    30/20 = 1.50
ln(O/E)      -.693           .405
O·ln(O/E)    -6.93           12.16

G² = 2 × Σ O·ln(O/E) = 2 × (-6.93 + 12.16) = 10.46
p < 0.05 (critical value 3.84)
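The G statistic from slide 20 can be checked with SciPy's `power_divergence`, whose `lambda_="log-likelihood"` option computes exactly G = 2 Σ O·ln(O/E) (again an illustrative Python sketch, not part of the SPSS workflow):

```python
from scipy.stats import power_divergence

# Observed vs. expected male counts in the two clusters (slide 20)
observed = [10, 30]
expected = [20, 20]

# lambda_="log-likelihood" gives the G statistic: G = 2 * sum(O * ln(O/E))
g, p = power_divergence(observed, f_exp=expected, lambda_="log-likelihood")
print(round(g, 2))  # ~ 10.46, in line with the hand calculation above
```

As with the hand calculation, G ≈ 10.46 exceeds the critical value of 3.84 (DF = 1), so p < 0.05.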
21. Variance, ANOVA, and F-Statistics

22. F-Statistics
- For metric or continuous variables
- Compares explained variance (in the model) with unexplained variance (errors)
23. Variance

VALUE   MEAN   SQUARED DIFFERENCE
 20     43.6   556.96
 34     43.6    92.16
 34     43.6    92.16
 38     43.6    31.36
 38     43.6    31.36
 40     43.6    12.96
 41     43.6     6.76
 41     43.6     6.76
 41     43.6     6.76
 42     43.6     2.56
 43     43.6     0.36
 47     43.6    11.56
 47     43.6    11.56
 48     43.6    19.36
 49     43.6    29.16
 49     43.6    29.16
 55     43.6   129.96
 55     43.6   129.96
 55     43.6   129.96
 55     43.6   129.96

COUNT = 20   MEAN = 43.6   SS = 1460.8   DF = 19   VAR = 76.88   SD = 8.768

SS is the Sum of Squares; DF = N - 1; VAR = SS/DF; SD = √VAR
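The variance table above maps directly onto Python's standard-library `statistics` module (shown for illustration; the course computes the same quantities in Excel or SPSS), which also uses the sample formulas SS/(N-1) and √VAR:

```python
import statistics

# The 20 observations from the variance table above
values = [20, 34, 34, 38, 38, 40, 41, 41, 41, 42,
          43, 47, 47, 48, 49, 49, 55, 55, 55, 55]

mean = statistics.mean(values)             # MEAN = 43.6
ss = sum((x - mean) ** 2 for x in values)  # SS (sum of squares) ~ 1460.8
var = statistics.variance(values)          # VAR = SS / (N - 1) ~ 76.88
sd = statistics.stdev(values)              # SD = sqrt(VAR) ~ 8.768
print(mean, round(ss, 1), round(var, 2), round(sd, 3))
```

`statistics.variance` and `statistics.stdev` divide by N - 1 (the DF = 19 of the slide); `statistics.pvariance` would divide by N instead.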
24. ANOVA
- Two groups: t-test
- Three or more groups: are the errors (discrepancies between observations and the overall mean) explained by group membership or by some other (random) effect?
25. One-way ANOVA

Grand mean = 5.042

Group 1: 6, 5, 4, 5, 4, 6, 5, 4   (group mean = 4.875)
Group 2: 8, 9, 7, 8, 9, 7, 8, 9   (group mean = 8.125)
Group 3: 3, 2, 1, 3, 2, 1, 3, 2   (group mean = 2.125)

Within-group squared deviations (X - group mean)², e.g. (6 - 4.875)² = 1.266, (5 - 4.875)² = 0.016, (4 - 4.875)² = 0.766: each group contributes 4.875, so SS Within = 14.625.

Total squared deviations (X - grand mean)², e.g. (6 - 5.042)² = 0.918, (8 - 5.042)² = 8.752, (3 - 5.042)² = 4.168: summed over all 24 observations, Total SS = 158.958.
26. F = MSS(Between) / MSS(Within)

                 SS        DF           Mean SS
Within Groups    14.625    24 - 3 = 21   0.696
Between Groups   144.333   3 - 1 = 2    72.167
Total (Errors)   158.958   24 - 1 = 23   6.911

F = Between Groups Mean SS / Within Groups Mean SS = 72.167 / 0.696 = 103.624, p < .05
27. ONEWAY (Excel or SPSS)

Anova: Single Factor

SUMMARY
Groups    Count  Sum  Average  Variance
Group 1   8      39   4.875    0.696
Group 2   8      65   8.125    0.696
Group 3   8      17   2.125    0.696

ANOVA
Source of Variation   SS        df   MS       F         P-value     F crit
Between Groups        144.333   2    72.167   103.624   1.318E-11   3.467
Within Groups         14.625    21   0.696
Total                 158.958   23
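The same one-way ANOVA can be reproduced outside Excel or SPSS; a minimal SciPy sketch using the three groups from slides 25-27:

```python
from scipy.stats import f_oneway

# The three groups from slides 25-27
group1 = [6, 5, 4, 5, 4, 6, 5, 4]
group2 = [8, 9, 7, 8, 9, 7, 8, 9]
group3 = [3, 2, 1, 3, 2, 1, 3, 2]

# One-way ANOVA: F = MSS(Between) / MSS(Within)
f, p = f_oneway(group1, group2, group3)
print(round(f, 3))  # ~ 103.624, matching the Excel output above
```

The tiny p-value (on the order of 1E-11) says the between-group differences are far too large to attribute to random error.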
28. Profiling

29. Customer Profiling: Documenting or Describing
- Who is likely to buy or not respond?
- Who is likely to buy what product or service?
- Who is in danger of lapsing?
30. CHAID or CART
- CHAID (Chi-Square Automatic Interaction Detector)
  - Based on chi-square
  - All variables discretized
  - Dependent variable nominal
- CART (Classification and Regression Tree)
  - Variables can be discrete or continuous
  - Based on GINI or F-test
  - Dependent variable nominal or continuous
31. Use of Decision Trees
- Classify observations from a target binary or nominal variable → Segmentation
- Predictive response analysis from a target numerical variable → Behaviour
- Decision support rules → Processing
32. Decision Tree

33. Example: dmdata.sav
34CHAID AlgorithmSelecting Variables
- Example
- Regions (4), Gender (3, including Missing)Age
(6, including Missing) - For each variable, collapse categories to
maximize chi-square test of independence
Ex Region (N, S, E, W,) ? (WSE, N) - Select most significant variable
- Go to next branch and next level
- Stop growing if estimated X2 lt theoretical X2
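The "select the most significant variable" step can be sketched in a few lines. This is not SPSS's full CHAID (no category merging, no Bonferroni adjustment, no tree growing) and the records below are hypothetical toy data, not dmdata.sav; it only illustrates picking the predictor with the smallest chi-square p-value against the target:

```python
from collections import Counter
from scipy.stats import chi2_contingency

# Hypothetical toy records: (region, gender, responded)
records = [
    ("N", "M", "yes"), ("N", "F", "yes"), ("N", "M", "yes"), ("N", "F", "no"),
    ("S", "M", "no"),  ("S", "F", "no"),  ("S", "M", "yes"), ("S", "F", "no"),
    ("E", "M", "no"),  ("E", "F", "no"),  ("W", "M", "no"),  ("W", "F", "yes"),
]

def chi2_p(records, var_idx, target_idx=2):
    """Cross-tabulate one predictor against the target; return the chi-square p-value."""
    counts = Counter((r[var_idx], r[target_idx]) for r in records)
    rows = sorted({r[var_idx] for r in records})
    cols = sorted({r[target_idx] for r in records})
    table = [[counts[(row, col)] for col in cols] for row in rows]
    return chi2_contingency(table)[1]

# CHAID-style selection: the predictor with the smallest p-value wins the split
p_values = {name: chi2_p(records, i) for i, name in enumerate(["region", "gender"])}
best = min(p_values, key=p_values.get)
print(best, p_values)
```

Real CHAID would first try merging categories of each predictor (the Region (N, S, E, W) → (WSE, N) step above) before comparing p-values.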
35. CART (Nominal Target)
- Nominal targets
- GINI (impurity reduction or entropy)
  - Squared probability of node membership
  - Gini = 0 when targets are perfectly classified
  - Gini index = 1 - Σ pᵢ²
- Example
  - Prob: Bus = 0.4, Car = 0.3, Train = 0.3
  - Gini = 1 - (0.4² + 0.3² + 0.3²) = 0.660
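The Gini index is a one-liner; a small Python sketch reproducing the bus/car/train example above:

```python
def gini(probs):
    """Gini impurity: 1 - sum(p_i^2); 0 when one class holds all the probability."""
    return 1 - sum(p * p for p in probs)

print(round(gini([0.4, 0.3, 0.3]), 2))  # slide example -> 0.66
print(gini([1.0, 0.0, 0.0]))            # perfectly classified node -> 0.0
```

CART chooses the split that most reduces this impurity between a parent node and its children.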
36. CART (Metric Target)
- Continuous variables
- Variance reduction (F-test)
37. Comparative Advantages (From Wikipedia)
- Simple to understand and interpret
- Requires little data preparation
- Able to handle both numerical and categorical data
- Uses a white-box model easily explained by Boolean logic
- Possible to validate a model using statistical tests
- Robust