Title: Multiple Discriminant Analysis
Slide 1: Multiple Discriminant Analysis
Slide 2: Discriminant Analysis
- Discriminant analysis involves deriving a variate,
  the linear combination of two or more independent
  variables that will discriminate best between a
  priori defined groups:

  Z_jk = a + W1*X1k + W2*X2k + ... + Wn*Xnk

  where Z_jk = discriminant Z score of discriminant
  function j for object k, a = intercept, Wi =
  discriminant weight for independent variable i,
  and Xik = independent variable i for object k
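The formula above can be sketched in a few lines of Python (the intercept, weights, and variable values below are made up for illustration):

```python
def z_score(intercept, weights, x):
    """Discriminant score for one object: Z = a + W1*X1 + W2*X2 + ... + Wn*Xn."""
    return intercept + sum(w * xi for w, xi in zip(weights, x))

# Hypothetical two-variable function with a = 0.5, W1 = 1.2, W2 = -0.8,
# evaluated at X1 = 2.0, X2 = 1.0:
print(z_score(0.5, [1.2, -0.8], [2.0, 1.0]))  # 0.5 + 2.4 - 0.8 = 2.1
```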
Slide 3: Discriminant Analysis
- Discriminant analysis is used when we know the
  classes that exist in the population and we want
  to estimate a function that will classify
  instances into the proper classes
- Discriminant analysis is used when we have a
  categorical dependent variable (the classes) and
  continuous independent variables
Slide 4: Assumptions
- Independent variables are normally distributed
- Relationships are linear
- No multicollinearity
- Equal variance in groups
- Should have at least 20 instances for each
independent variable
Slide 5: Two Groups with Two Independent Variables
(Figure) Classes are clearly distinct, but neither
independent variable discriminates between the classes.
Slide 6: Discriminant Function
The goal is to find a linear function D(x1, x2) of
the independent variables that discriminates
between the groups.
Another way to look at it: we rotate the axes in
space until they maximally separate the groups.
Slide 7: Can't Always Find a Clean Function
(Figure) The black line represents the optimum
discriminant function, but clearly there remains
considerable overlap.
Slide 8: Can't Always Find a Clean Function
After projecting the distribution of the data onto
the discriminant function, it's easy to see the overlap.
However, if we select a cutoff point of about -1.7,
we can discriminate fairly well.
Slide 9: Objectives of Discriminant Analysis
- Determine whether statistically significant
  differences exist between the average score
  profiles for two or more a priori defined classes
- Determine which of the independent variables
  account most for the differences in the average
  score profiles
Slide 10: Objectives of Discriminant Analysis
- Establish procedures for classifying instances
  into groups
- Establish the number and composition of the
  dimensions of discrimination between classes
  formed from the set of independent variables
Slide 11: Dependent Variable
- The dependent variable should represent two or more
  mutually exclusive, collectively exhaustive groups
- If the dependent variable is interval or metric
  data, then it must be transformed into a
  categorical variable
Slide 12: Division of Sample
- The sample should be divided into an analysis
  sample and a holdout sample
- The size of classes in the holdout sample should
  be proportionate to the overall sample
- The model is estimated with the analysis sample
  and verified on the holdout sample
Slide 13: Computational Method
- Simultaneous estimation
- All independent variables are included at once
- Stepwise estimation
- The algorithm selects the independent variable that
  best discriminates, then selects the next best, etc.
- A variable may be removed if a combination of
  variables is found that includes the information
  of that variable
- Continues until no variable can be added that
  increases discrimination
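As a rough sketch of forward stepwise selection (not the exact statistic any particular package uses), one can greedily add the variable with the largest univariate separation between two groups; the data and threshold below are invented for illustration:

```python
import statistics

def separation(a, b):
    """Crude separation measure for one variable: |mean difference| divided
    by the pooled standard deviation (a stand-in for the Wilks' lambda or
    F-to-enter statistic that real stepwise routines use)."""
    pooled = ((statistics.pstdev(a) ** 2 + statistics.pstdev(b) ** 2) / 2) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / pooled if pooled else 0.0

def forward_stepwise(group_a, group_b, threshold=0.1):
    """Greedily add the remaining variable with the largest separation,
    stopping once no candidate exceeds the threshold.
    group_a, group_b: dicts mapping variable name -> list of values."""
    remaining, selected = set(group_a), []
    while remaining:
        best = max(remaining, key=lambda v: separation(group_a[v], group_b[v]))
        if separation(group_a[best], group_b[best]) < threshold:
            break
        selected.append(best)
        remaining.remove(best)
    return selected

# x1 separates the groups; x2 carries no information:
a = {"x1": [1.0, 2.0, 3.0], "x2": [5.0, 5.0, 6.0]}
b = {"x1": [10.0, 11.0, 12.0], "x2": [5.0, 6.0, 5.0]}
print(forward_stepwise(a, b))  # ['x1']
```

A real stepwise routine also re-tests already-entered variables and removes any that have become redundant; that step is omitted here for brevity.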
Slide 14: Statistical Testing
- Must test the overall significance of the model
- Must also test the significance of each of the
  discriminant functions
- If a function is not significant, the model should
  be re-estimated with the number of functions to be
  derived limited to the number that are significant
Slide 15: Assessing Overall Fit
- A model may be statistically significant but may
  not adequately discriminate between classes
- This is especially true with very large sample sizes
- Can use graphs of discriminant functions
- Best to use classification matrices
Slide 16: Classification Matrix
Performed on the holdout sample.
t-test of classification accuracy, where P = proportion
correctly classified and N = sample size.
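A classification matrix and the accuracy t statistic can be sketched as follows (the chance rate of 0.5 assumes two equal-sized classes, and the labels are invented):

```python
import math

def classification_matrix(actual, predicted, classes):
    """Counts of actual class (rows) versus predicted class (columns)."""
    m = {a: {p: 0 for p in classes} for a in classes}
    for a, p in zip(actual, predicted):
        m[a][p] += 1
    return m

def accuracy_t(actual, predicted, chance=0.5):
    """t statistic for whether accuracy P exceeds the chance rate:
    t = (P - chance) / sqrt(chance * (1 - chance) / N)."""
    n = len(actual)
    p = sum(a == b for a, b in zip(actual, predicted)) / n
    return (p - chance) / math.sqrt(chance * (1 - chance) / n)

actual = ["A", "A", "B", "B"]
predicted = ["A", "A", "B", "A"]
print(classification_matrix(actual, predicted, ["A", "B"]))
print(accuracy_t(actual, predicted))  # P = 0.75, N = 4 -> t = 1.0
```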
Slide 17: Cutting Score Determination
- The cutting score is the criterion against which
  each instance's discriminant score is compared to
  determine which class the instance belongs to
- Optimum cutting score:

  Z_CE = (Z_A + Z_B) / 2

  where Z_CE = critical cutting score, Z_A = centroid
  for class A, and Z_B = centroid for class B
Slide 18: Cutting Score for Unequal Class Sizes
- If the classes are not the same size (number of
  instances), then the cutting score is calculated as:

  Z_CU = (N_A*Z_B + N_B*Z_A) / (N_A + N_B)

  where Z_CU = critical cutting score, N_A = number in
  class A, N_B = number in class B, Z_A = centroid for
  class A, and Z_B = centroid for class B
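Both cutting-score formulas are one-liners (the centroids and class sizes below are illustrative):

```python
def cutting_score_equal(z_a, z_b):
    """Optimum cutting score for equal-sized classes: Z_CE = (Z_A + Z_B) / 2."""
    return (z_a + z_b) / 2

def cutting_score_unequal(n_a, n_b, z_a, z_b):
    """Cutting score for unequal classes: Z_CU = (N_A*Z_B + N_B*Z_A) / (N_A + N_B).
    Each centroid is weighted by the *other* class's size, shifting the
    cutoff toward the smaller class."""
    return (n_a * z_b + n_b * z_a) / (n_a + n_b)

print(cutting_score_equal(-2.0, 4.0))            # 1.0
print(cutting_score_unequal(30, 10, -2.0, 4.0))  # (120 - 20) / 40 = 2.5
```

With 30 instances in class A and 10 in class B, the cutoff moves from 1.0 to 2.5, i.e. closer to the smaller class's centroid, so fewer instances are assigned to the smaller class.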
Slide 19: Chance-Based Criteria
- If class sizes are equal, then the chance
  classification rate is 1/G, where G is the number
  of classes
- If we have a sample in which 75% of the instances
  are in one class, then simply classifying all
  instances into the larger class would achieve 75%
  accuracy
- Should be able to predict at least 25% higher than
  the chance-based classification rate
Slide 20: Chance-Based Criteria
- Maximum chance criterion
- Proportion in the largest class
- Proportional chance criterion
- Considers the fact that classifying instances into
  smaller classes is riskier than classifying
  instances into larger classes:

  C_PRO = p^2 + (1-p)^2

  where p = proportion of individuals in class 1 and
  (1-p) = proportion of individuals in class 2
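Both chance criteria are easy to compute (the 75/25 split echoes the example on the previous slide):

```python
def maximum_chance(proportions):
    """Maximum chance criterion: the proportion in the largest class."""
    return max(proportions)

def proportional_chance(p):
    """Proportional chance criterion for two classes: C_PRO = p^2 + (1-p)^2."""
    return p ** 2 + (1 - p) ** 2

print(maximum_chance([0.75, 0.25]))  # 0.75
print(proportional_chance(0.75))     # 0.5625 + 0.0625 = 0.625
```

Note that with a 75/25 split the proportional criterion (0.625) is a lower bar than the maximum chance criterion (0.75).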