Title: Multivariate statistics and Market segmentation: Principal Components Analysis and Cluster Analysis
1. Multivariate statistics and Market segmentation: Principal Components Analysis and Cluster Analysis
- AE B37 - Week 7, 19 February 2003, MM
2. Further readings
- Malhotra, Chapters 19 and 20
- Churchill and Iacobucci, Chapter 17
- Aaker et al., Chapter 21
3. Lecture outline
- Basic statistical concepts
- Factor analysis and Principal Components Analysis
  - Data reduction and summarisation
- Cluster Analysis
  - Grouping similar statistical units
- Joint application of PCA and CA
- SPSS application
4. Basic statistical concepts
- Variance
- Covariance
- Correlation and covariance
- Standardisation
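As a minimal illustration (not part of the original deck), these four concepts map directly onto NumPy operations; the data values below are invented:

```python
import numpy as np

# Invented example data: two variables observed on four individuals
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

var_x = x.var(ddof=1)                     # sample variance of x
cov_xy = np.cov(x, y)[0, 1]               # sample covariance of x and y
corr_xy = np.corrcoef(x, y)[0, 1]         # correlation = covariance of the standardised variables
z = (x - x.mean()) / x.std(ddof=1)        # standardisation: mean 0, variance 1

print(var_x, cov_xy, corr_xy, z.mean(), z.var(ddof=1))
```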
5. Factor Analysis
- A statistical procedure for data reduction, i.e. summarising a given set of variables into a reduced set of unrelated variables that explains most of the original variability
- Objectives of Factor Analysis:
  - Identification of a smaller set of unrelated variables replacing the original set
  - Identification of underlying factors explaining the correlation among variables
  - Selection of a smaller set of salient variables
6. Factor Analysis and marketing research
- Identification of customer characteristics prior to clustering into groups (market segmentation)
- Identification of product/brand attributes that influence consumer choice
- Understanding the correlation between target consumers and media consumption habits
7. Some notation
- p variables have been recorded on n individuals
- Xj indicates the generic variable j
- xij refers to the value of the j-th variable as recorded on the i-th individual
- Xj = {xij}, i = 1, 2, …, n; j = 1, 2, …, p
- ΣX denotes the variance-covariance matrix of X
8. Week7.sav: variable view
- p = 9 (all variables but the first, custid)
9. Week7.sav: data view (columns X1 to X9)
10. The correlation matrix
11. Factor analysis model
- X1 = μ1 + γ11 F1 + γ12 F2 + … + γ1m Fm + e1
- X2 = μ2 + γ21 F1 + γ22 F2 + … + γ2m Fm + e2
- …
- Xj = μj + γj1 F1 + γj2 F2 + … + γjm Fm + ej
- …
- Xp = μp + γp1 F1 + γp2 F2 + … + γpm Fm + ep
In matrix form: X = μ + ΓF + e, where
- Fi (i = 1, 2, …, m) are uncorrelated random variables (the common factors), with m ≤ p
- μi (i = 1, 2, …, p) are constants unique to each variable (the means)
- ei (i = 1, 2, …, p) are error random variables, uncorrelated with each other and with F; they represent the residual error due to the use of common factors
12. Factor analysis model (factors view)
- F1 = b11 X1 + b12 X2 + … + b1p Xp
- F2 = b21 X1 + b22 X2 + … + b2p Xp
- …
- Fj = bj1 X1 + bj2 X2 + … + bjp Xp
- …
- Fm = bm1 X1 + bm2 X2 + … + bmp Xp
In matrix form: F = BX
The common factors are linear combinations of the original variables.
13. Estimation
- There is no unique solution (set of common factors): any orthogonal rotation of a solution is also acceptable (factor rotation)
- Variables in X need to be standardised prior to analysis
- Factor analysis estimates the following quantities:
  - The simple correlations (covariances) between each factor i and each original variable j, i.e. the coefficients γij (the factor loadings, collected in the factor or component matrix)
  - The values of each common factor for each of the statistical units (the factor scores)
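A hedged sketch of these estimation steps, assuming scikit-learn's FactorAnalysis as the estimator and random stand-in data in place of Week7.sav; m = 4 factors is an illustrative choice, and varimax is one orthogonal rotation among the many acceptable ones:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 9))             # stand-in for the 9 Week7.sav variables

Z = StandardScaler().fit_transform(X)     # standardise prior to analysis
fa = FactorAnalysis(n_components=4, rotation="varimax")
scores = fa.fit_transform(Z)              # factor scores: one row per statistical unit
loadings = fa.components_.T               # p x m factor loadings (the gamma_ij)
```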
14. Summarising covariance
- The original set of variables X is characterised by a p × p variance-covariance matrix
15. Covariance matrix of the residual variables
- By summarising the original data through m factors we commit an error, measured by the residuals ei, whose variance-covariance matrix is diagonal (the residual variables are uncorrelated)
16. The fundamental relationship of Factor analysis
- Original variance = communality + residual variance; for standardised variables, Var(Xi) = hi² + ψi
- The communality of Xi, hi² = γi1² + γi2² + … + γim², is the portion of the variance of Xi explained by the m factors
- Communalities make it possible to identify which of the variables are best explained by the selected factors
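Continuing the FactorAnalysis sketch above, the communalities follow directly from the loadings matrix:

```python
import numpy as np

# `loadings` is the p x m matrix from the previous sketch
communalities = (loadings ** 2).sum(axis=1)       # h_i^2, one value per variable
residual_var = 1.0 - communalities                # unique variance (data are standardised)
ranked = np.argsort(communalities)[::-1]          # variables, best explained first
```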
17. Principal Component Analysis
- A special case / estimation method of factor analysis
- The components are built so that the first component has the maximum possible amount of explained variance
- All of the original variance is considered, whereas in factor analysis the estimates are based only on the common variance
- Component scores can be computed exactly, whilst factor scores are estimated: there is no guarantee that estimated factor scores will actually be uncorrelated with each other
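A PCA counterpart of the sketch above, reusing the standardised matrix Z; the component scores are exact, and the explained-variance shares decrease from the first component onwards:

```python
from sklearn.decomposition import PCA

pca = PCA()                               # keep all p components for now
pc_scores = pca.fit_transform(Z)          # exact component scores, no estimation step
print(pca.explained_variance_ratio_)      # first entry is the largest by construction
```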
18. Choice of the number of principal components
- Level of explained variance
  - Usually the m components explaining 70-80% of the total variability
- Eigenvalues of the data correlation matrix
  - The eigenvalue corresponding to each component represents the amount of variance it explains; the sum of the eigenvalues equals the original number of variables
  - Retain eigenvalues larger than 1 (components explaining more variance than the average original variable)
- Scree diagram
19. Scree diagram (elbow rule)
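The three selection rules can be checked numerically; this sketch (again reusing Z from the running example) is illustrative, not the deck's SPSS procedure:

```python
import numpy as np
import matplotlib.pyplot as plt

R = np.corrcoef(Z, rowvar=False)          # p x p correlation matrix of the data
eigvals = np.linalg.eigvalsh(R)[::-1]     # eigenvalues, largest first
print(eigvals.sum())                      # equals p, the number of variables
print(np.cumsum(eigvals) / eigvals.sum()) # cumulative share: apply the 70-80% rule
m_kaiser = int((eigvals > 1).sum())       # Kaiser rule: eigenvalues larger than 1

plt.plot(range(1, len(eigvals) + 1), eigvals, marker="o")   # scree diagram
plt.xlabel("Component"); plt.ylabel("Eigenvalue"); plt.show()
```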
20. The component scores
- F1 = b11 X1 + b12 X2 + … + b1p Xp
- The component scores are computed for each case and for each of the m principal components
- The values of the component scores (standardised to have mean 0 and variance 1) can be used to summarise the data (in plots or subsequent analyses)
- The essential characteristic of the components is their lack of correlation with each other
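A short check of that essential characteristic, continuing the PCA sketch (m = 4 is assumed here, matching the four components interpreted later): standardise the retained scores and inspect their correlation matrix:

```python
import numpy as np

F = pc_scores[:, :4]                      # retain the first m = 4 components
F = (F - F.mean(axis=0)) / F.std(axis=0)  # standardise: mean 0, variance 1
print(np.round(np.corrcoef(F, rowvar=False), 2))  # approximately the identity matrix
```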
21. SPSS
22. SPSS output (1)
23. (Screenshot, no transcript)
24. The factor scores
25. Interpreting the component matrix
1. Family supermarket shopper
2. Family quality shopper
3. Single frequent cost-conscious shopper
4. Vegetarian shopper
26. Cluster Analysis
- A class of techniques used to classify cases into groups that are relatively homogeneous within themselves and heterogeneous between each other, on the basis of a defined set of variables. These groups are called clusters.
27. Cluster Analysis and marketing research
- Market segmentation, e.g. clustering consumers according to their attribute preferences
- Understanding buyers' behaviours: consumers with similar behaviours/characteristics are clustered
- Identifying new product opportunities: clusters of similar brands/products can help identify competitors and market opportunities
- Reducing data, e.g. in preference mapping
28. Steps to conduct a Cluster Analysis
- Select a distance measure
- Select a clustering algorithm
- Determine the number of clusters
- Validate the analysis
29. (Screenshot, no transcript)
30. Defining distance: the Euclidean distance
- Dij = √(Σk (xki − xkj)²) is the distance between cases i and j
- xki is the value of variable Xk for case i
- Problems:
  - Variables measured on different scales carry different weights
  - Correlation between variables (double counting)
- Solution: Principal Component Analysis
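A sketch of the distance computation with SciPy, reusing the component scores F from the PCA example, which is exactly the solution suggested above (uncorrelated inputs on a common scale):

```python
from scipy.spatial.distance import pdist, squareform

D = squareform(pdist(F, metric="euclidean"))   # D[i, j] = Euclidean distance between cases i and j
```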
31. Clustering procedures
- Hierarchical procedures
  - Agglomerative (start from n clusters and merge down to 1 cluster)
  - Divisive (start from 1 cluster and split up to n clusters)
- Non-hierarchical procedures
  - K-means clustering
32. Agglomerative clustering
33. Agglomerative clustering
- Linkage methods
  - Single linkage (minimum distance)
  - Complete linkage (maximum distance)
  - Average linkage
- Ward's method
  - Compute the sum of squared distances within clusters
  - Aggregate the pair of clusters giving the minimum increase in the overall sum of squares
- Centroid method
  - The distance between two clusters is defined as the distance between their centroids (cluster averages)
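These linkage methods correspond one-to-one to method names in scipy.cluster.hierarchy; a sketch, again on the component scores F from the running example:

```python
from scipy.cluster.hierarchy import linkage

# Each call returns the full agglomeration history: n - 1 merges, from n clusters down to 1
for method in ("single", "complete", "average", "ward", "centroid"):
    merge_tree = linkage(F, method=method)
```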
34. K-means clustering
1. The number k of clusters is fixed in advance
2. An initial set of k seeds (aggregation centres) is provided, e.g. the first k elements, or other seeds
3. Given a certain threshold, all units are assigned to the nearest cluster seed
4. New seeds are computed
5. Go back to step 3 until no reclassification is necessary
- Units can be reassigned in successive steps (optimising partitioning)
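A k-means sketch with scikit-learn following the same steps; k = 4 is an arbitrary illustrative choice, and n_init asks for several different initial seed sets:

```python
from sklearn.cluster import KMeans

km = KMeans(n_clusters=4, n_init=10, random_state=0)  # k fixed in advance
labels = km.fit_predict(F)                # each case assigned to its nearest seed
centres = km.cluster_centers_             # final seeds = cluster averages
```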
35. Hierarchical vs non-hierarchical methods
- Hierarchical clustering
  - No decision needed about the number of clusters
  - Problems when the data contain a high level of error
  - Can be very slow
  - Initial decisions are more influential (one-step only)
- Non-hierarchical clustering
  - Faster, more reliable
  - Need to specify the number of clusters (arbitrary)
  - Need to set the initial seeds (arbitrary)
36. Suggested approach
- First perform a hierarchical method to define the number of clusters
- Then use the k-means procedure to actually form the clusters
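The two-step approach as a hedged sketch, continuing the running example; k = 4 stands in for the number of clusters read off the dendrogram:

```python
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import KMeans

merge_tree = linkage(F, method="ward")    # step 1: hierarchical run to choose k
dendrogram(merge_tree)                    # inspect the tree (or use the elbow rule below)

k = 4                                     # assumed choice read off the dendrogram
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(F)  # step 2
```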
37. Defining the number of clusters: the elbow rule (1)
38. Elbow rule (2): the scree diagram
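A common way to draw this diagram (an assumption on our part, not the deck's SPSS output) is to plot the within-cluster sum of squares, exposed by k-means as inertia_, against the number of clusters:

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

ks = range(1, 11)
wss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(F).inertia_ for k in ks]

plt.plot(list(ks), wss, marker="o")       # look for the elbow / scree point
plt.xlabel("Number of clusters k"); plt.ylabel("Within-cluster sum of squares"); plt.show()
```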
39. Validating the analysis
- Impact of the initial seeds / order of cases
- Impact of the selected method
- Consider the relevance of the chosen set of variables
40. SPSS example
41. (Screenshot, no transcript)
42. Number of clusters: 10, 6, 4
43. (Screenshot, no transcript)