Title: PLS Multinomial Logit in Satisfaction Surveys
1PLS Multinomial Logit in Satisfaction Surveys
- 6th International Conference on Partial Least
Squares and Related Methods, Beijing, Sep. 4 to
Sep. 7
2Introduction
- Satisfaction surveys measure value judgments of
customers on products or services.
- The objective of such studies is twofold:
- To deliver an accurate and robust measurement of
a Global Satisfaction Index of customers.
- To prioritize the quality improvements that could
be undertaken, in order to get a significant
impact on this Global Satisfaction Index.
3Introduction
- Typically, these studies entail implementing
Correlations, Multiple Regression or Logit
Multinomial Regression in order to relate Global
Satisfaction to more specific components of
product quality.
- However, a common pitfall of such studies is that
the various quality assessments considered as
explanatory variables of the Global Satisfaction
Index tend to be highly correlated.
4Introduction
- Building on the research work of M. Tenenhaus,
V. Vinzi and P. Bastien, we developed a software
solution implementing...
- PLS Logit Multinomial Regression
- The present communication is intended to bring to
light the benefits of this methodology over the
classical Logit Multinomial Regression on a
sample of ten real surveys selected in various
domains.
5Hypothesis to check in this test
- The regular Logit Multinomial Regression suffers
heavily from missing data and multicollinearity
between explanatory variables.
- Multicollinearity leads to a lack of robustness
in the model estimations.
- Another outcome could be the so-called
suppressive effect, which leads parameters to be
erroneously non-significant.
- A third effect is that some parameters may
display counter-intuitive results (e.g. an
unexpected sign of a parameter bh).
- Missing data in explanatory variables are not
uncommon in Market Research surveys. The Logit
MNL method deals with missing data listwise,
i.e. each row of data with at least one missing
value is discarded from the analysis, which may
turn out to be very costly, as we will see later.
6Hypothesis to check in this test
We would like to check whether the PLS Logit
Multinomial approach improves on the regular
Logit Multinomial model on each of these points.
7Summary
- Real issues of Satisfaction Surveys
- Choice of an appropriate MNL Regression model
- The PLS Logit Multinomial Regression model
- Description of the data sets sample used in the
test
- Main results of the Test
- Conclusion
8Real issues of Satisfaction Surveys
9Real issues of Satisfaction Surveys
Satisfaction Surveys consist in measuring the
personal level of consumers' satisfaction.
The relevance of such studies is based on a
strong belief that consumer satisfaction is a
necessary condition for true growth and long term
profitability of the firm.
10Real issues of Satisfaction Surveys
- As consumer satisfaction depends partly on the
personal involvement of managers and employees,
the credibility of such a measurement tool should
be unquestionable.
- First, it appears necessary to depend on a
reliable Global Criterion of Satisfaction. This
criterion may refer either to
- Global Satisfaction with the Brand or
- Loyalty to the Brand or
- "Would Recommend the Brand".
11Real issues of Satisfaction Surveys
- "Would Recommend the Brand"?
- The latter, which tests for both the rational and
the emotional dimensions, appeared to be the
best predictor of customer behavior across a
range of industries.
12The "Net Promoter Score"
- In the following test, we will go through ten
surveys based on the "Net Promoter Score"
paradigm.
- The selected criterion is a zero-to-ten scale,
according to the degree of agreement with the
single selected statement
- "Would you recommend Brand X to a friend or a
colleague?"
- from 0 "not at all likely"
- to 10 "extremely likely".
13Definition of the "Net Promoter Score"
- Then, we interpret these degrees on a behavioral
scale made of three clusters
- the first, including grades of nine and ten, is
called "Promoters",
- seven and eight being the "Passively Satisfied",
and
- the remaining segment - zero to six - being
qualified as "Detractors".
- The frequency of the Promoters is called the
Promoter score or P score.
- The frequency of the Detractors is called the
Detractor score or D score.
- Then the unique criterion is the "Net Promoter
Score": NPS = P - D. NPS was created by Fred
Reichheld (Bain & Co); the whole system is called
Satmetrix.
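The definition above can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes that grade zero also counts as a Detractor, since the scale runs from zero to ten.

```python
from collections import Counter

def nps_segment(grade: int) -> str:
    """Map a 0-10 recommendation grade to its behavioral cluster."""
    if not 0 <= grade <= 10:
        raise ValueError("grade must lie between 0 and 10")
    if grade >= 9:
        return "Promoter"
    if grade >= 7:
        return "Passive"
    return "Detractor"  # assumption: grades 0 to 6

def net_promoter_score(grades) -> float:
    """Return NPS = P - D, expressed in percentage points."""
    counts = Counter(nps_segment(g) for g in grades)
    return 100.0 * (counts["Promoter"] - counts["Detractor"]) / len(grades)

# Illustrative grades only, not data from the surveys.
sample = [10, 9, 9, 8, 7, 6, 3, 10, 8, 2]
print(net_promoter_score(sample))  # 4 Promoters, 3 Detractors out of 10
```

With these ten made-up grades, P is 40% and D is 30%, so the score is 10 points.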
14From the "Net Promoter Score" to Action Plan
- Once this Global NPS score is available, the key
issue in consumer satisfaction management is to
specify and implement an appropriate Action Plan.
- A first approach, purely managerial, is based on
the internal development of what could be called
an NPS corporate culture, for instance a
remuneration plan based on the NPS score.
- The other, more Market Research oriented,
consists in identifying a series of significant
satisfaction drivers to prioritize.
15CHOICE OF AN APPROPRIATE REGRESSION MODEL
16"Drivers"
Besides the "Global Satisfaction" criterion, the
survey also includes other questions. Those
questions aim at measuring the degree
of satisfaction with specific characteristics of
the Product or Service. Those criteria are also
called "Drivers" because they are supposed to
influence, more or less, the Global
Satisfaction score.
17A two-dimensional scheme
- The one-dimensional scheme assumes that each
satisfaction driver operates along a continuum
from "dissatisfaction" to "satisfaction".
Further research, based on deduction from F.
Herzberg's "Motivator-Hygiene Theory", shows that
things are not symmetrical, and that a
two-dimensional scheme should be considered.
18"Attractive" and "Must be" drivers
- Satisfaction drivers are then classified in 3
categories
- "Attractive" qualities are able to bring about a
high level of satisfaction,
- "Must be" qualities are those that, if deficient,
may lead to a strong feeling of dissatisfaction,
- "Symmetrical" qualities are in between.
As an example of a "Must be" quality, a ballpoint
pen user may be dissatisfied when the ink flow is
insufficient, but the same user won't be highly
satisfied if it is sufficient.
19The Category Base Logit Multinomial Model
- We meet again our dependent variable with the 3
clusters that allow us to compute the NPS score.
- Detractors
- Neutral (Passively Satisfied)
- Promoters
Basically, the model should explain a Global
Satisfaction score in a two-dimensional scheme.
This may be done with a Logit Multinomial
Regression model where the base category is the
middle "Passive" cell.
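As a minimal sketch of what a base-category model implies, the Python function below turns two linear predictors into the three category probabilities. It assumes the standard baseline-category form, in which each predictor is the log-odds of a category against "Passive". The names and the orientation of the two logits are illustrative and need not match the sign convention used for the b parameters in this presentation.

```python
import math

def category_probabilities(eta_detractor: float, eta_promoter: float) -> dict:
    """Probabilities of the 3 clusters, with 'Passive' as base category.

    eta_k = a_k + sum_j b_kj * x_j is assumed to be the log-odds of
    category k versus 'Passive'; the base has linear predictor 0.
    """
    # Subtract the largest predictor before exponentiating, for stability.
    shift = max(0.0, eta_detractor, eta_promoter)
    e_passive = math.exp(0.0 - shift)
    e_detractor = math.exp(eta_detractor - shift)
    e_promoter = math.exp(eta_promoter - shift)
    total = e_passive + e_detractor + e_promoter
    return {
        "Detractor": e_detractor / total,
        "Passive": e_passive / total,
        "Promoter": e_promoter / total,
    }

# Hypothetical predictor values for one respondent.
probs = category_probabilities(eta_detractor=-0.5, eta_promoter=1.2)
print(probs)
```

Two predictors are enough for three categories because the base category is pinned at zero, which is why the model yields two sets of parameters.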
20Meaning of the 2 sets of b parameters
- This two-dimensional arrangement yields
two sets of b parameters.
- The meaning of those 2 sets of b parameters is
made explicit in the following logit expressions.
We also find again the 4 categories of drivers
according to the respective values of b1 and b2:
- b1 < b2 ⇒ "Attractive" driver
- b1 > b2 ⇒ "Must be" driver
- b1 ≈ b2 ⇒ "Symmetrical" driver
- b1 ≈ b2 ≈ 0 ⇒ "Ineffective" driver
N.B. indices j are omitted
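The classification of a driver from its two parameters can be sketched in Python as follows. The function name and the tolerance `tol` are hypothetical: how "approximately equal" is decided is not specified here (it could, for instance, rest on significance tests), so the fixed threshold is only an assumption made for illustration.

```python
def classify_driver(b1: float, b2: float, tol: float = 0.1) -> str:
    """Classify a driver from its two b parameters.

    tol is an illustrative threshold for 'approximately equal'.
    """
    if abs(b1) <= tol and abs(b2) <= tol:
        return "Ineffective"   # both parameters close to zero
    if abs(b1 - b2) <= tol:
        return "Symmetrical"   # parameters close to each other
    return "Attractive" if b1 < b2 else "Must be"

# Made-up parameter values, one per category.
for b1, b2 in [(0.2, 0.9), (0.9, 0.2), (0.5, 0.55), (0.02, -0.03)]:
    print(b1, b2, classify_driver(b1, b2))
```

The "Ineffective" test comes first because two near-zero parameters are also near each other and would otherwise be labeled "Symmetrical".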
21Estimation of the b parameters
- We first estimated these b parameters using a
Logit Multinomial Regression model, namely the
NOMREG procedure in the SPSS package.
- In this Logit Multinomial model, the criterion
used to estimate the parameters is Maximum
Likelihood. The parameter estimates are computed
using the Newton-Raphson iterative algorithm.
- We recall here that this procedure deals with
missing data listwise, which means that every
record with at least one missing value in the
explanatory variables is discarded. We will see, in
the analysis of the real survey data sets, the
importance of this feature.
22The PLS Logit Multinomial Regression
23The PLS Logit Multinomial: an iterative process
This iterative process follows the PLS1 algorithm
(see Tenenhaus)
24Step 1 To compute new weights
Initial explanatory variables
Residual explanatory variables
Regression weights
PLS components
Independent variables are orthogonal, so there is
no damage due to collinearity
25Step 2 To compute a new PLS Component
The logit estimation will be expressed as a
linear function of the original variables
Components as functions of the original
independent variables
26Step 3 To compute new X residuals
Stop?
- Maximum number of Components reached
- Others
Regression and OLS Regression have to deal with
orthogonal independent variables, and thus are not
exposed to the potential damage of collinearity.
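The three steps can be sketched in Python for the classical linear PLS1 case, with a continuous response. This is not the PLS logit version: there, the step 1 weights would come from logit regressions rather than from simple slopes. Missing cells are marked with `None` and handled by summing over available values only, which is an assumption about how the missing data are treated. The function name and data layout are hypothetical.

```python
import math

def pls1_components(X, y, n_components):
    """Linear PLS1 sketch. X: list of rows, None marks a missing cell.

    Assumes no constant column and at least one observed cell per row.
    Returns one list of component scores per extracted component.
    """
    n, p = len(X), len(X[0])
    # Standardize each column using its observed values only.
    stats = []
    for j in range(p):
        vals = [row[j] for row in X if row[j] is not None]
        mean = sum(vals) / len(vals)
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        stats.append((mean, sd))
    R = [[None if row[j] is None else (row[j] - stats[j][0]) / stats[j][1]
          for j in range(p)] for row in X]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Step 1: weight of variable j = slope of its residual on y.
        w = []
        for j in range(p):
            rows = [i for i in range(n) if R[i][j] is not None]
            w.append(sum(R[i][j] * yc[i] for i in rows)
                     / sum(yc[i] ** 2 for i in rows))
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Step 2: component score of each row, from its observed cells.
        t = []
        for i in range(n):
            cols = [j for j in range(p) if R[i][j] is not None]
            t.append(sum(R[i][j] * w[j] for j in cols)
                     / sum(w[j] ** 2 for j in cols))
        # Step 3: regress each variable on t and keep the residuals.
        for j in range(p):
            rows = [i for i in range(n) if R[i][j] is not None]
            pj = sum(R[i][j] * t[i] for i in rows) / sum(t[i] ** 2 for i in rows)
            for i in rows:
                R[i][j] -= pj * t[i]
        components.append(t)
    return components

# Made-up data: 6 respondents, 3 drivers, one missing cell.
X_demo = [[1, None, 0], [2, 1, 1], [3, 5, 0], [4, 3, 2], [5, 6, 1], [6, 4, 3]]
y_demo = [1, 2, 2, 4, 5, 5]
t1, t2 = pls1_components(X_demo, y_demo, n_components=2)
print(t1, t2)
```

Every row receives a score on each component as long as it has at least one observed cell, so no respondent is discarded. On complete data, the successive components are orthogonal by construction.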
27Datasets Sample
28Description of the ten datasets
The selected data sets come from real
Satisfaction surveys conducted by the GN Research
company. These surveys were chosen to be
heterogeneous in several respects
- First of all, the studies were addressed to
different populations, in different fields, which
implied different levels of satisfaction scores.
- They also have different forms, with very
different sample sizes or different numbers of
explanatory variables.
29Selected Sectors and Target Groups
30Number of variables
31Missing values
The graph shows the % of missing cells (used in
PLS MNL) as compared to the % of missing data
listwise (used in regular MNL Regression).
Values displayed on the graph: missing listwise
69%, missing cells 23%.
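The two missing-data rates compared on this slide can be computed as in the following Python sketch. The function name and the toy data are hypothetical, and `None` is assumed to mark a missing answer.

```python
def missing_rates(rows):
    """Return (% of missing cells, % of rows lost under listwise deletion)."""
    n_rows = len(rows)
    n_cells = sum(len(row) for row in rows)
    missing_cells = sum(value is None for row in rows for value in row)
    # A row is lost listwise as soon as one of its cells is missing.
    rows_lost = sum(any(value is None for value in row) for row in rows)
    return 100.0 * missing_cells / n_cells, 100.0 * rows_lost / n_rows

# Toy answers of 4 respondents to 3 driver questions.
answers = [
    [7, None, 8],
    [9, 9, 10],
    [None, 6, 5],
    [8, 7, None],
]
cells_pct, listwise_pct = missing_rates(answers)
print(cells_pct, listwise_pct)
```

In this toy example a quarter of the cells are missing, yet three rows out of four would be discarded listwise, which shows how scattered missing answers inflate the listwise loss.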
32Dependent variable frequencies
Frequencies of the 3 categories of customers
according to their Promoter score
N.B. they are classified below in decreasing
order of NPS scores
33Net Promoter Score
NPS score = Promoters - Detractors
34RESULTS OF THE TEST
354. Results of the test
- The following results compare Logit Multinomial
Regression with SPSS and PLS Logit Multinomial
Regression. For the latter, we used the Logycs
software developed jointly by Interstat and GN
Research.
- We are interested here in two global criteria
- Scores of prediction of the dependent variable,
summarizing the "confusion table".
- Scores of conformity of the weight signs to
experienced analysts' expectations
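For the first criterion, a minimal Python sketch is given below. It assumes that the prediction score is the share of respondents on the diagonal of the confusion table, i.e. those whose predicted cluster equals their observed cluster. The function name and the counts are hypothetical.

```python
def recognition_score(confusion):
    """% of correctly classified respondents in a square confusion table.

    confusion[i][j] = number of respondents observed in cluster i
    and predicted in cluster j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return 100.0 * correct / total

# Made-up table; rows and columns: Detractors, Passives, Promoters.
table = [
    [30, 10, 5],
    [8, 25, 12],
    [2, 5, 53],
]
print(recognition_score(table))  # 108 correct out of 150
```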
36Scores of recognition
N.B. These scores are computed on the learning
sample (i.e. without cross validation).
At first glance, scores of recognition obtained
with the regular Logit MNL algorithm seem better
(on average 6.9% better) than those obtained with
the PLS model.
Explanation?
37Scores of recognition
The reason why the recognition scores of the
regular MNL regression seem better is that they
are computed on very small samples.
The % of observations actually used by the PLS
MNL model is always 100%. The % used by the
regular MNL model, which discards incomplete rows
listwise, is between 5% and 51%. N.B. "H" is a
special case, where there is no missing data at
all.
38Data sets features revisited
- Columns (7) and (8) show the number of
observations actually used by each model
- Then, columns (9) and (10) show the average
number of observations per estimated parameter,
the number of parameters being approximated as
twice the number of independent variables, given
in column (1).
39How much more efficient the PLS MNL model is
- This finally allows us to compute an "efficiency
ratio" (11) = (10)/(9) between both methods.
- The average of these ratios is 8.3, i.e. the PLS
method allows the use of 8.3 times more
observations than the regular method.
- Beyond the loss of precision, such a huge amount
of eliminated data implies a large number of
uncontrollable biases.
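The arithmetic behind these columns can be sketched in Python as follows. The function names and figures are hypothetical, and the sketch assumes that column (9) refers to the regular MNL model and column (10) to the PLS MNL model.

```python
def observations_per_parameter(n_used: int, n_variables: int) -> float:
    """Observations per estimated parameter, with the number of
    parameters approximated as twice the number of variables."""
    return n_used / (2 * n_variables)

def efficiency_ratio(n_used_pls: int, n_used_regular: int, n_variables: int) -> float:
    """Ratio of observations per parameter, PLS MNL over regular MNL."""
    return (observations_per_parameter(n_used_pls, n_variables)
            / observations_per_parameter(n_used_regular, n_variables))

# Made-up survey: 400 respondents, 20 drivers, 100 complete rows.
print(observations_per_parameter(400, 20))   # PLS MNL uses every row
print(observations_per_parameter(100, 20))   # regular MNL, listwise
print(efficiency_ratio(400, 100, 20))
```

Since both models estimate the same number of parameters, the ratio reduces to the ratio of the numbers of observations used.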
40Scores of conformity to b sign expectations
Each parameter of the model should also be
meaningful, as far as its sign is concerned. This
is especially critical for diagnosis and
simulation purposes. So, we need to know what
the right sign must be. This information is
provided either by the expectations of
experienced analysts or by the signs of the
simple correlations.
The average conformity ratios are respectively
67% for regular MNL and 82% for PLS MNL, i.e. 15
points better in absolute terms, or a relative
increase of about 20%.
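A conformity ratio of this kind can be sketched in Python as below. The function name and the values are hypothetical. The expected signs are passed as +1 or -1, and a parameter estimated at exactly zero is counted as non-conforming, which is an assumption of this sketch.

```python
def sign_conformity(estimates, expected_signs) -> float:
    """% of parameters whose estimated sign matches the expected sign.

    expected_signs holds +1 or -1 for each parameter, e.g. the analyst's
    expectation or the sign of the simple correlation.
    """
    if len(estimates) != len(expected_signs):
        raise ValueError("one expected sign is needed per parameter")
    # The product is positive only when both signs agree.
    conforming = sum(b * s > 0 for b, s in zip(estimates, expected_signs))
    return 100.0 * conforming / len(estimates)

# Made-up estimates for 4 drivers, all expected to be positive.
print(sign_conformity([0.4, -0.2, 0.1, 0.3], [1, 1, 1, 1]))
```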
41CONCLUSION
42Conclusion
- The two pitfalls of the regular MNL method
considered in the premise of this test were
missing data and multicollinearity.
- Missing data management appeared to be the main
contribution of the PLS method, due to the
frequency of missing data in our Satisfaction
surveys sample. The listwise manner in which the
regular method deals with missing data leads to
huge reductions in the data available. It appears
to be totally inappropriate and brings
considerable biases in the parameter estimates.
- Multicollinearity leads to counter-intuitive
results. We tested the frequency of unexpected
signs of bh parameters. Here again, the PLS
method brings a significant contribution. We
estimate a 20% decrease in the number of
counter-intuitive parameters.
43gnresearch
gnresearch tunisia: Zone industrielle Sidi Daoud, 2046 La Marsa - Tunis, 216 71 77 77 88, info.tn_at_gnresearch.com
gnresearch germany: Opitzstrasse 12, 40474 Düsseldorf, 49 (0)211 47 84 70, Info.de_at_gnresearch.com
gnresearch italy: Via di Priscilla, 101, 00199 Roma; Corso Garibaldi, 86, 20121 Milan; Via Demetrio Marin, 3, 70125 Bari; 39 06 86 51 71, Info.it_at_gnresearch.com
gnresearch albania: Rruga Abdyl Frasheri, 31 Blloku, Tirana, 355 42 25 82 61, Info.al_at_gnresearch.com
gnresearch france: 3, rue Henri Rol Tanguy, 93100 Montreuil, 33 (0) 1 45 30 72 00, info.fr_at_gnresearch.com