Title: Marketing Research
1Marketing Research
- Aaker, Kumar, Day and Leone
- Tenth Edition
- Instructor's Presentation Slides
2Chapter Twenty
Discriminant and Canonical Analysis
3Discriminant Analysis
- Used to classify individuals into one of two or more alternative groups on the basis of a set of measurements
- Used to identify variables that discriminate between naturally occurring groups
4Objectives of Discriminant Analysis
- Determining linear combinations of the predictor variables to separate groups by measuring between-group variation relative to within-group variation
- Developing procedures for assigning new objects, firms, or individuals, whose profiles, but not group identity, are known, to one of the two groups
- Testing whether significant differences exist between the two groups based on the group centroids
- Determining which variables count most in explaining inter-group differences
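The first objective, separating groups by comparing between-group variation with within-group variation, can be illustrated for a single composite score. This is a minimal sketch, not the textbook's procedure: the function name `between_within_ratio` and the two small samples are hypothetical, chosen only to show how the ratio behaves.

```python
from statistics import mean


def between_within_ratio(group_a, group_b):
    """Ratio of between-group to within-group sum of squares for one score."""
    grand_mean = mean(group_a + group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    # Between-group variation: how far each group mean sits from the grand mean.
    between = (len(group_a) * (mean_a - grand_mean) ** 2
               + len(group_b) * (mean_b - grand_mean) ** 2)
    # Within-group variation: spread of the scores around their own group mean.
    within = (sum((x - mean_a) ** 2 for x in group_a)
              + sum((x - mean_b) ** 2 for x in group_b))
    return between / within


# Hypothetical composite scores for two groups (illustrative values only).
group_one = [6.0, 7.0, 8.0]
group_two = [2.0, 3.0, 4.0]
ratio = between_within_ratio(group_one, group_two)
print(ratio)
```

A larger ratio means the composite separates the groups more cleanly; discriminant analysis looks for the weights that make this ratio as large as possible.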
5Basic Concept
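If we can assume that the two populations have the same variance, then the usual value of C is the midpoint between the two group means, $C = (\bar{X}_I + \bar{X}_{II})/2$, where $\bar{X}_I$ and $\bar{X}_{II}$ are the mean values for the two groups, respectively.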
Distribution of two populations
6Discriminant Function
$Z_i = b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_n X_n$
- where Z = discriminant score
- b = discriminant weights
- X = predictor (independent) variables
In a particular group, each individual i has a discriminant score ($z_i$). The centroid (group mean) is the average of the $z_i$ over the individuals in that group.
The centroid indicates the most typical location of an individual from a particular group.
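The discriminant score and the group centroid can be sketched in a few lines. The weights and predictor values below are hypothetical (they are not estimates from the export data set), and `discriminant_score` is an illustrative helper name.

```python
from statistics import mean


def discriminant_score(weights, predictors):
    """Z = b1*X1 + b2*X2 + ... + bn*Xn for one individual."""
    return sum(b * x for b, x in zip(weights, predictors))


# Hypothetical discriminant weights for two predictors.
weights = [0.5, -0.25]

# Hypothetical predictor profiles for three individuals in one group.
group_a = [[4.0, 2.0], [6.0, 4.0], [5.0, 2.0]]

scores_a = [discriminant_score(weights, row) for row in group_a]
# The centroid is the mean discriminant score of the group.
centroid_a = mean(scores_a)
print(scores_a, centroid_a)
```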
7Discriminant Function: A Graphical Illustration
8Cut-off Score
- Criterion against which each individual's discriminant score is judged to determine into which group the individual should be classified
For equal group sizes
For unequal group sizes
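The slide's formulas for the two cases did not survive extraction. As a hedged sketch, the code below uses the forms commonly given in multivariate textbooks, which may differ from the slide: the simple midpoint of the two centroids for equal group sizes, and a size-weighted average for unequal group sizes. The function names and the numbers are assumptions made for illustration.

```python
def cutoff_equal(centroid_a, centroid_b):
    """Assumed form for equal group sizes: midpoint of the two centroids."""
    return (centroid_a + centroid_b) / 2


def cutoff_unequal(centroid_a, centroid_b, n_a, n_b):
    """Assumed form for unequal sizes: each centroid weighted by the other group's size."""
    return (n_a * centroid_b + n_b * centroid_a) / (n_a + n_b)


def classify(score, cutoff, centroid_a, centroid_b):
    """Assign a score to the group whose centroid lies on the same side of the cut-off."""
    if centroid_a >= centroid_b:
        return "A" if score >= cutoff else "B"
    return "B" if score >= cutoff else "A"


# Hypothetical centroids and group sizes.
z_a, z_b = 2.0, -1.0
equal_cut = cutoff_equal(z_a, z_b)
unequal_cut = cutoff_unequal(z_a, z_b, n_a=30, n_b=90)
print(equal_cut, unequal_cut)
print(classify(1.0, equal_cut, z_a, z_b), classify(1.0, unequal_cut, z_a, z_b))
```

Under this weighted form the cut-off moves toward the centroid of the smaller group, so the same score can be assigned to different groups depending on which cut-off is used.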
9Determination of Significance
- Null hypothesis: in the population, the group means of the discriminant function are equal
- H0: µA = µB
- Generally, predictors with relatively large standardized coefficients contribute more to the discriminating power of the function
- Canonical or discriminant loadings show the variance that the predictor shares with the function
10Classification and Validation
- Holdout Method
- Uses part of the sample to construct the classification rule; the other subsample is used for validation
- Uses a classification matrix and hit ratio to evaluate group classification
- Uses discriminant weights to generate discriminant scores for cases in the validation subsample
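The classification matrix and hit ratio used in the holdout method can be sketched as follows. The actual and predicted group labels are hypothetical; in practice the predictions would come from applying the estimated discriminant function to the holdout cases.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Counts of (actual group, predicted group) pairs."""
    return Counter(zip(actual, predicted))


def hit_ratio(actual, predicted):
    """Share of holdout cases assigned to their actual group."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical holdout sample: actual groups and groups assigned by the rule.
actual = ["A", "A", "A", "B", "B", "B", "B", "A"]
predicted = ["A", "A", "B", "B", "B", "A", "B", "A"]

matrix = classification_matrix(actual, predicted)
ratio = hit_ratio(actual, predicted)
print(dict(matrix), ratio)
```

The diagonal entries of the matrix, here `("A", "A")` and `("B", "B")`, are the correctly classified cases, and the hit ratio is their share of the holdout sample.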
11Classification and Validation (Contd.)
- U-method or Cross Validation
- Uses all available data without serious bias in estimating error rates
- Estimated classification error rates
- P1 = m1/n1, P2 = m2/n2
- where m1 and m2 = number of sample observations misclassified in groups G1 and G2, and n1 and n2 = number of sample observations in G1 and G2
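The U-method can be sketched by holding out one observation at a time, rebuilding the rule from the remaining data, and counting misclassifications to obtain P1 = m1/n1 and P2 = m2/n2. To stay short, the sketch below swaps in a one-variable midpoint rule as a stand-in for a full discriminant function; the data and the name `loo_error_rates` are hypothetical.

```python
from statistics import mean


def loo_error_rates(group1, group2):
    """Leave-one-out error rates (P1, P2) for a one-variable midpoint rule."""

    def assign(x, g1, g2):
        m1, m2 = mean(g1), mean(g2)
        cutoff = (m1 + m2) / 2
        # Group 1 occupies the side of the cut-off on which its mean lies.
        in_g1 = x >= cutoff if m1 >= m2 else x < cutoff
        return 1 if in_g1 else 2

    # Each observation is classified by a rule estimated without it.
    m1 = sum(assign(x, group1[:i] + group1[i + 1:], group2) != 1
             for i, x in enumerate(group1))
    m2 = sum(assign(x, group1, group2[:i] + group2[i + 1:]) != 2
             for i, x in enumerate(group2))
    return m1 / len(group1), m2 / len(group2)


# Hypothetical single-predictor observations for groups G1 and G2.
g1 = [5.0, 6.0, 7.0, 3.5]
g2 = [1.0, 2.0, 3.0, 5.5]
p1, p2 = loo_error_rates(g1, g2)
print(p1, p2)
```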
12Steps in Discriminant Analysis
13Export Data Set
Respid  Will(y1)  Govt(y2)  Train(x5)  Size(x1)  Exp(x6)  Rev(x2)  Years(x3)  Prod(x4)
1       4         5         1          49        1        1000     5.5        6
2       3         4         1          46        1        1000     6.5        4
3       5         4         1          54        1        1000     6.0        7
4       2         3         1          31        0        3000     6.0        5
5       4         3         1          50        1        2000     6.5        7
6       5         4         1          69        1        1000     5.5        9
...     ...       ...       ...        ...       ...      ...      ...        ...
115     4         3         1          45        1        2000     6.0        6
116     5         4         1          44        1        2000     5.8        11
117     3         4         1          46        0        1000     7.0        3
118     3         4         1          54        1        1000     7.0        4
119     4         3         1          49        1        1000     6.5        7
120     4         5         1          54        1        4000     6.5        7
Marketing Research 8th Edition Aaker, Kumar, Day
14Description of Variables
Variable Description | Corresponding Name in Output | Scale Values
Willingness to Export (Y1) | Will | 1 (definitely not interested) to 5 (definitely interested)
Level of Interest in Seeking Govt Assistance (Y2) | Govt | 1 (definitely not interested) to 5 (definitely interested)
Employee Size (X1) | Size | Greater than zero
Firm Revenue (X2) | Rev | In millions of dollars
Years of Operation in the Domestic Market (X3) | Years | Actual number of years
Number of Products Currently Produced by the Firm (X4) | Prod | Actual number
Training of Employees (X5) | Train | 0 (no formal program) or 1 (existence of a formal program)
Management Experience in International Operation (X6) | Exp | 0 (no experience) or 1 (presence of experience)
15Export Data Set: Discriminant Analysis Results
16Discriminant Analysis Results (Contd.)
17Discriminant Analysis Results (Contd.)
18Multiple Discriminant Analysis
- Number of possible discriminant functions
- Min(p, m - 1)
- where m = number of groups
- p = number of predictor variables
- Assumptions Underlying the Discriminant Function
- 1. The p independent variables must have a multivariate normal distribution
- 2. The p x p variance-covariance matrix of the independent variables must be the same in each of the groups
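The count of possible discriminant functions, Min(p, m - 1), is simple enough to check directly. The helper name `max_discriminant_functions` is hypothetical; p = 6 matches the six firm characteristics in the export data set, while the group counts are illustrative.

```python
def max_discriminant_functions(p, m):
    """Upper bound on the number of discriminant functions: min(p, m - 1)."""
    if p < 1 or m < 2:
        raise ValueError("need at least one predictor and two groups")
    return min(p, m - 1)


# Six predictors, as in the export data set, with different numbers of groups.
print(max_discriminant_functions(6, 2))  # two groups
print(max_discriminant_functions(6, 4))  # four groups
print(max_discriminant_functions(2, 5))  # fewer predictors than groups minus one
```

With two groups there is always a single discriminant function, however many predictors are available.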
19Canonical Correlation Analysis
- Canonical correlation analysis is a multivariate statistical model that helps the study of interrelationships among sets of multiple dependent variables and multiple independent variables.
- Sets of variables on each side are combined to form linear composites such that the correlation between these linear composites (canonical variates) is maximized.
[Diagram: dependent set Y1, Y2, ..., Yn related to independent set X1, X2, ..., Xn]
20Objectives of Canonical Correlation Analysis
- To determine whether two sets of variables are independent of one another and estimate the magnitude of the relationship between the two sets.
- Derive a set of weights for each set (dependent and independent) of variables so that the linear combinations of each set are maximally correlated.
- Explain the nature of relationships among sets of variables by measuring the relative importance of each variable to the canonical functions (relationships).
21Canonical Loadings and Roots
- Canonical loadings, or canonical structure coefficients, measure the simple correlation between an original observed variable in the dependent or independent set and the set's canonical variate (linear composite).
- They reflect the variance that the original variable shares with the canonical variate, or the relative contribution of each variable to the canonical function.
- Canonical roots, or eigenvalues, are the squared canonical correlations (i.e., the squared correlation between the dependent and independent canonical variates).
- They reflect the percentage of variance in the dependent canonical variate that can be explained by the independent canonical variate.
22Interpreting Canonical Functions
- The sign and magnitude of the canonical weights (standardized coefficients) on each canonical function help to identify the relative importance of each variable in deriving the canonical relationships.
- The maximum number of canonical functions that can be extracted equals the number of variables in the smaller set (independent set or dependent set).
- The redundancy index is the amount of variance in one canonical variate explained by the other canonical variate in the canonical function. It is obtained by multiplying the shared variance of the variate by the squared canonical correlation, and it helps to overcome the bias and uncertainty in using canonical roots as a measure of shared variance.
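The redundancy index can be sketched numerically. The sketch assumes that the "shared variance of the variate" is the average squared canonical loading of the variables in the set, which is the usual reading but is not spelled out on the slide. The loadings (0.9891 and 0.7798) and the squared canonical correlation (0.735649) are taken from the first canonical function in the CANCORR output for the export data set in these slides; the function name is hypothetical.

```python
def redundancy_index(loadings, squared_canonical_correlation):
    """Average squared loading of a set times the squared canonical correlation."""
    # Assumption: shared variance = mean of the squared canonical loadings.
    shared_variance = sum(value ** 2 for value in loadings) / len(loadings)
    return shared_variance * squared_canonical_correlation


# Loadings of y1 and y2 on the first attitude variate, from the slides' output.
attitude_loadings = [0.9891, 0.7798]
redundancy = redundancy_index(attitude_loadings, 0.735649)
print(round(redundancy, 4))
```

Read this way, a little under 60 percent of the variance in the two attitude variables is accounted for by the first canonical variate of the firm characteristics.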
23Limitations of Canonical Correlation Analysis
- Procedures that maximize the correlation do not necessarily maximize interpretation of the pairs of canonical variates; therefore, canonical solutions are not easily interpretable.
- Rotation of canonical variates (as in factor analysis) to improve interpretability is not a common practice and is not available in most computer programs.
- If a non-linear relationship between dimensions in a pair is suspected, use of canonical correlation may be inappropriate unless the variables are transformed or combined to capture the non-linear relationship.
- Only an orthogonal solution is normally available.
- Changing a variable in one set significantly alters the composition of the canonical variate in the other set.
- It is only a correlational technique; it does not establish a causal relationship.
24Export Data Set: Canonical Correlation Results
The CANCORR Procedure
Attitude to Exporting: 2
Firm characteristics: 6
Observations: 120
    Canonical     Adjusted Canonical   Approximate      Squared Canonical
    Correlation   Correlation          Standard Error   Correlation
1   0.857700      0.850646             0.024233         0.735649
2   0.434392      0.405915             0.074372         0.188697
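Two features of this output can be checked against the definitions given earlier in the slides: the number of canonical functions equals the size of the smaller variable set, and each squared canonical correlation is the square of the canonical correlation. The values below are copied from the output above; small differences in the last digit come from rounding in the printed correlations.

```python
# Sizes of the two variable sets, as reported by the procedure.
n_attitude, n_firm = 2, 6
n_functions = min(n_attitude, n_firm)

# Canonical correlations and their reported squares, copied from the output.
canonical_correlations = [0.857700, 0.434392]
reported_squares = [0.735649, 0.188697]

squares = [r ** 2 for r in canonical_correlations]
for r, computed, reported in zip(canonical_correlations, squares, reported_squares):
    print(f"r = {r:.6f}  r^2 = {computed:.6f}  reported = {reported:.6f}")
```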
25Canonical Correlation Results (Contd.)
Raw Canonical Coefficients for the Attitude to Exporting
      attitude1       attitude2
y1    0.663025751     -0.825828605
y2    0.1747547312    1.1757781282
Raw Canonical Coefficients for the Firm characteristics
      demographics1   demographics2
x1    0.0590789526    0.03138617
x2    0.0001734106    0.0009537723
x3    -0.372885396    0.1278689212
x4    0.1427469498    -0.150119835
x5    0.1194923096    0.4450507388
x6    0.0015015543    -0.164606455
26Canonical Correlation Results (Contd.)
Standardized Canonical Coefficients for the Attitude to Exporting
      attitude1   attitude2
y1    0.8531      -1.0625
y2    0.2003      1.3478
Standardized Canonical Coefficients for the Firm characteristics
      demographics1   demographics2
x1    0.6122          0.3252
x2    0.1641          0.9028
x3    -0.3222         0.1105
x4    0.3605          -0.3791
x5    0.0550          0.2048
x6    0.0007          -0.0738
27Canonical Correlation Results (Contd.)
Correlations Between the Attitude to Exporting and Their Canonical Variables
      attitude1   attitude2
y1    0.9891      -0.1470
y2    0.7798      0.6261
Correlations Between the Firm characteristics and Their Canonical Variables
      demographics1   demographics2
x1    0.8771          0.0208
x2    0.0223          0.9038
x3    -0.4618         0.4067
x4    0.7944          -0.1369
x5    0.4331          0.3525
x6    0.5672          -0.1114
Correlations Between the Attitude to Exporting and the Canonical Variables of the Firm characteristics
      demographics1   demographics2
y1    0.8484          -0.0639
y2    0.6688          0.2720
Correlations Between the Firm characteristics and the Canonical Variables of the Attitude to Exporting
      attitude1   attitude2
x1    0.7523      0.0090
x2    0.0191      0.3926
x3    -0.3961     0.1767
x4    0.6814      -0.0595
x5    0.3714      0.1531
x6    0.4865      -0.04
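The cross-set correlations above are tied to the within-set loadings: a variable's correlation with the opposite set's canonical variate equals its loading on its own variate multiplied by the canonical correlation for that function. The sketch below checks this for the first function, using four loadings and the first canonical correlation (0.857700) copied from the output in these slides; agreement is to the four printed decimals.

```python
# First canonical correlation, from the CANCORR output.
r1 = 0.857700

# Loadings on each variable's own first canonical variate (from the output).
own_loadings = {"y1": 0.9891, "y2": 0.7798, "x1": 0.8771, "x4": 0.7944}

# Reported correlations with the opposite set's first canonical variate.
reported_cross = {"y1": 0.8484, "y2": 0.6688, "x1": 0.7523, "x4": 0.6814}

# Cross-loading = own loading times the canonical correlation.
computed_cross = {name: value * r1 for name, value in own_loadings.items()}
for name in own_loadings:
    print(name, round(computed_cross[name], 4), reported_cross[name])
```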