Discrete Choice Modeling presentation

1
Discrete Choice Modeling

William Greene
Stern School of Business
New York University

2
Part 11

Modeling Heterogeneity
Latent Class Models
The Mixed Logit Model

3
Heterogeneity

Observational: Observable differences across
choice makers
Choice strategy: How consumers make decisions
(omitted attributes)
Structure: Model frameworks
Preferences: Model parameters

4
Accommodating Heterogeneity

Observed? Enter in the model in familiar (and
unfamiliar) ways.
Unobserved?

5
Observable (Quantifiable) Heterogeneity in
Utility Levels
Choice, e.g., among brands of cars
$x_{itj}$ = attributes: price, features
$z_{it}$ = observable characteristics: age, sex, income
6
Observable Heterogeneity in Preference Weights
7
Quantifiable Heterogeneity in Scaling
$w_{it}$ = observable characteristics: age, sex,
income, etc.
8
Attention to Heterogeneity

Modeling heterogeneity is important
Scaling is extremely important
Attention to heterogeneity: an informal survey
of four literatures

9
Heterogeneity in Choice Strategy

Consumers avoid complexity
Lexicographic preferences eliminate certain
choices ⇒ choice set may be endogenously
determined
Simplification strategies may eliminate certain
attributes
Information processing strategy is a source of
heterogeneity in the model.

10
Structural Heterogeneity

Marketing literature
Latent class structures
Yang/Allenby: latent class random parameters
models
Kamakura et al.: latent class nested logit models
with fixed parameters

11
Heteroscedasticity in the MNL Model

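Motivation: Scaling in utility functions
If ignored, distorts coefficients
Random utility basis
$U_{ij} = \alpha_j + \beta' x_{ij} + \gamma' z_i + \sigma_j \varepsilon_{ij}$
$i = 1,\dots,N$; $j = 1,\dots,J(i)$
$F(\varepsilon_{ij}/\sigma_j) = 1 - \exp(-\exp(\varepsilon_{ij}/\sigma_j))$, now
scaled
Extensions: Relaxes IIA
Allows heteroscedasticity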

12
Latent Heterogeneity

Limitation of the MNL model: Fundamental tastes
are the same across all individuals
How to adjust the model to allow variation across
individuals?
Full random variation
Latent clustering: allow some variation

13
Heterogeneity

Modeling individual heterogeneity
Latent class: Discrete approximation
Mixed logit: Continuous
The mixed logit model (generalities)
Data structure: RP and SP data
Induces heterogeneity
Induces heteroscedasticity: the scaling problem

14
A Latent Class Model

Within a class
Class sorting is probabilistic (to the analyst)
determined by individual characteristics
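
The equations for this slide did not survive in the transcript. As a hedged reconstruction, a standard latent class logit with Q classes can be written as below. The symbols $\beta_q$, $\theta_q$, $x_{itj}$, $z_i$ and $H_{iq}$ are assumed notation chosen to agree with the rest of the deck, not a copy of the original slide.

```latex
% Within class q: multinomial logit with class-specific taste parameters
P(j \mid i, t, q) = \frac{\exp(\beta_q' x_{itj})}{\sum_{m=1}^{J(i)} \exp(\beta_q' x_{itm})}

% Class sorting: probabilistic, driven by individual characteristics z_i
H_{iq} = \frac{\exp(\theta_q' z_i)}{\sum_{s=1}^{Q} \exp(\theta_s' z_i)}, \qquad \theta_Q = 0

% Contribution of person i to the likelihood (j_{it} is the observed choice)
P_i = \sum_{q=1}^{Q} H_{iq} \prod_{t=1}^{T_i} P(j_{it} \mid i, t, q)
```

The normalization $\theta_Q = 0$ matches the fixed THETA(3) row in the estimation output later in the deck.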

15
Latent Classes and Random Parameters
16
Latent Class Probabilities

Ambiguous at face value: classical or Bayesian
model?
Equivalent to random parameters models with
discrete parameter variation
Using nested logits, etc. does not change this
Precisely analogous to continuous random
parameter models
Not always equivalent: zero inflation models

17
Estimates from the LCM

Taste parameters within each class, $\beta_q$
Parameters of the class probability model, $\theta_q$
For each person
Posterior estimates of the class they are in, $q \mid i$
Posterior estimates of their taste parameters,
$E[\beta_q \mid i]$
Posterior estimates of their behavioral
parameters, elasticities, marginal effects, etc.

18
Using the Latent Class Model

Computing Posterior (individual specific) class
probabilities
Computing posterior (individual specific) taste
parameters
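
A minimal sketch of both computations, assuming the usual Bayes-rule form: the posterior class probability is the prior class probability times the person's choice likelihood in that class, normalized over classes, and the posterior taste vector is the probability-weighted average of the class vectors. The function names and the two-class toy numbers are hypothetical, not taken from the slides.

```python
import math

def mnl_prob(beta, X, chosen):
    """Multinomial logit probability of the chosen alternative.
    X holds one attribute vector per alternative."""
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in X]
    m = max(utilities)  # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utilities]
    return expu[chosen] / sum(expu)

def posterior_class_probs(prior, betas, situations):
    """prior[q]: prior class probability; betas[q]: class taste vector;
    situations: list of (X, chosen) pairs observed for one person."""
    joint = []
    for pq, beta in zip(prior, betas):
        like = 1.0
        for X, chosen in situations:
            like *= mnl_prob(beta, X, chosen)
        joint.append(pq * like)
    total = sum(joint)
    return [j / total for j in joint]

def posterior_taste(post, betas):
    """Probability-weighted average of the class taste vectors."""
    k = len(betas[0])
    return [sum(p * b[j] for p, b in zip(post, betas)) for j in range(k)]

# Hypothetical example: class 1 cares about attribute 1, class 2 about attribute 2.
betas = [[3.0, 0.0], [0.0, 3.0]]
prior = [0.5, 0.5]
# The person twice picks the alternative that is high on attribute 1.
situations = [([[1.0, 0.0], [0.0, 1.0]], 0)] * 2
post = posterior_class_probs(prior, betas, situations)
taste = posterior_taste(post, betas)
print(post, taste)
```

Two consistent choices are enough to move this person almost entirely into class 1, which is why the posterior taste vector ends up close to that class's parameters.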

19
Application: Brand Choice

True underlying model is a three class LCM
NLOGIT ; Lhs = choice
       ; Choices = Brand1,Brand2,Brand3,None
       ; Rhs = Fash,Qual,Price,ASC4
       ; LCM = Male,Age25,Age39
       ; Pts = 3 ; Pds = 8 ; Par $

20
MNL Starting Values and Basis
Normal exit from iterations. Exit status=0.
---------------------------------------------
Discrete choice (multinomial logit) model
Log likelihood function      -4158.503
Number of parameters                 4
Akaike IC = 8325.006   Bayes IC = 8349.289
Finite sample corrected AIC = 8325.018
R2=1-LogL/LogL*  Log-L fncn   R-sqrd   RsqAdj
Constants only   -4391.1804   .05299   .05101
Response data are given as ind. choice.
Number of obs. = 3200, skipped 0 bad obs.
---------------------------------------------
Notes: No coefficients => P(i,j)=1/J(i).
       Constants only  => P(i,j) uses ASCs only.
                          N(j)/N if fixed choice set.
       N(j) = total sample frequency for j
       N    = total sample frequency.
       These 2 models are simple MNL models.
       R-sqrd = 1 - LogL(model)/logL(other)
       RsqAdj = 1 - [nJ/(nJ-nparm)]*(1-R-sqrd)
       nJ     = sum over i, choice set sizes
---------------------------------------------
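
The fit statistics in this output can be checked by hand. The sketch below assumes the textbook definitions (AIC = -2 logL + 2K, BIC = -2 logL + K ln N, and the small-sample AIC correction 2K(K+1)/(N-K-1)). These reproduce the reported values to rounding, although the output itself does not state which formulas it uses.

```python
import math

log_l = -4158.503    # log likelihood reported above
log_l0 = -4391.1804  # constants-only log likelihood reported above
k = 4                # number of parameters
n = 3200             # number of observations

aic = -2.0 * log_l + 2.0 * k
bic = -2.0 * log_l + k * math.log(n)
aicc = aic + 2.0 * k * (k + 1) / (n - k - 1)
r_sqrd = 1.0 - log_l / log_l0  # pseudo R-squared against constants only

print(round(aic, 3), round(bic, 3), round(aicc, 3), round(r_sqrd, 5))
```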
21
One Class MNL Estimates
--------------------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]
--------------------------------------------------------------
FASH1       1.47890473     .06776814       21.823     .0000
QUAL1       1.01372755     .06444532       15.730     .0000
PRICE1    -11.8023376      .80406103      -14.678     .0000
ASC41        .03679254     .07176387         .513     .6082
22
Three Class LCM
Normal exit from iterations. Exit status=0.
---------------------------------------------
Latent Class Logit Model
Log likelihood function       -3649.132
Number of parameters                 20
Restricted log likelihood     -4436.142
Chi squared                    1574.019
Degrees of freedom                   20
Prob[ChiSqd > value] =         .0000000
R2=1-LogL/LogL*  Log-L fncn   R-sqrd   RsqAdj
No coefficients  -4436.1420   .17741   .17569
Constants only   -4391.1804   .16899   .16725
At start values  -4158.5428   .12250   .12067
Response data are given as ind. choice.
---------------------------------------------
Latent Class Logit Model
Number of latent classes = 3
LCM model with panel has 400 groups.
Fixed number of obsrvs./group = 8
Discrete parameter variation specified.
---------------------------------------------
Number of obs. = 3200, skipped 0 bad obs.
---------------------------------------------
LogL for one class MNL = -4158.503
23
Estimated LCM Utilities
--------------------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]
--------------------------------------------------------------
Utility parameters in latent class -->> 1
FASH1       3.02569837     .14335927       21.106     .0000
QUAL1       -.08781664     .12271563        -.716     .4742
PRICE1     -9.69638056    1.40807055       -6.886     .0000
ASC41       1.28998874     .14533927        8.876     .0000
Utility parameters in latent class -->> 2
FASH2       1.19721944     .10652336       11.239     .0000
QUAL2       1.11574955     .09712630       11.488     .0000
PRICE2    -13.9345351     1.22424326      -11.382     .0000
ASC42       -.43137842     .10789864       -3.998     .0001
Utility parameters in latent class -->> 3
FASH3       -.17167791     .10507720       -1.634     .1023
QUAL3       2.71880759     .11598720       23.441     .0000
PRICE3     -8.96483046    1.31314897       -6.827     .0000
ASC43        .18639318     .12553591        1.485     .1376
24
Estimated LCM Class Probability Model
--------------------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]
--------------------------------------------------------------
This is THETA(1) in class probability model.
Constant    -.90344530     .34993290       -2.582     .0098
_MALE1       .64182630     .34107555        1.882     .0599
_AGE251     2.13320852     .31898707        6.687     .0000
_AGE391      .72630019     .42693187        1.701     .0889
This is THETA(2) in class probability model.
Constant     .37636493     .33156623        1.135     .2563
_MALE2     -2.76536019     .68144724       -4.058     .0000
_AGE252     -.11945858     .54363073        -.220     .8261
_AGE392     1.97656718     .70318717        2.811     .0049
This is THETA(3) in class probability model.
Constant     .000000     ......(Fixed Parameter).......
_MALE3       .000000     ......(Fixed Parameter).......
_AGE253      .000000     ......(Fixed Parameter).......
_AGE393      .000000     ......(Fixed Parameter).......
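
To see what these THETA estimates imply, the sketch below turns them into prior class probabilities for a given person using a multinomial logit form with class 3 as the normalized base. It assumes Male, Age25 and Age39 are 0/1 indicators, which the output does not state, and the function name is hypothetical.

```python
import math

# Coefficients copied from the output above, ordered
# [constant, male, age25, age39]. Class 3 is fixed at zero.
theta = [
    [-0.90344530, 0.64182630, 2.13320852, 0.72630019],
    [0.37636493, -2.76536019, -0.11945858, 1.97656718],
    [0.0, 0.0, 0.0, 0.0],
]

def class_probs(male, age25, age39):
    """Prior class probabilities for one person (assumes 0/1 dummies)."""
    z = [1.0, male, age25, age39]
    scores = [sum(t * v for t, v in zip(row, z)) for row in theta]
    m = max(scores)  # subtract the max for numerical stability
    expv = [math.exp(s - m) for s in scores]
    total = sum(expv)
    return [e / total for e in expv]

young_male = class_probs(male=1, age25=1, age39=0)
woman_age39_group = class_probs(male=0, age25=0, age39=1)
print(young_male)
print(woman_age39_group)
```

Under these assumptions a man in the Age25 group is most likely in class 1 and a woman in the Age39 group is most likely in class 2, in line with the signs and sizes of the coefficients.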
25
Estimated LCM Conditional Parameter Estimates
26
Estimated LCM Conditional Class Probabilities
27
Average Estimated Class Probabilities

MATRIX ; list ; 1/400 * classp_i'1 $
Matrix Result has 3 rows and 1 columns.
            1
      --------------
   1     .50555
   2     .23853
   3     .25593
This is how the data were simulated. Class
probabilities are .5, .25, .25. The model
worked.

28
Application: Long Distance Drivers' Preference
for Road Environments

New Zealand survey, 2000, 274 drivers
Mixed revealed and stated choice experiment
4 Alternatives in choice set
The current road the respondent is/has been
using
A hypothetical 2-lane road
A hypothetical 4-lane road with no median
A hypothetical 4-lane road with a wide grass
median.
16 stated choice situations for each with 2
choice profiles
choices involving all 4 choices
choices involving only the last 3 (hypothetical)

Hensher and Greene, "A Latent Class Model for
Discrete Choice Analysis: Contrasts with Mixed
Logit," Transportation Research B, 2003
29
Attributes

Time on the open road which is free flow (in
minutes)
Time on the open road which is slowed by other
traffic (in minutes)
Percentage of total time on open road spent with
other vehicles close behind (i.e., tailgating) (%)
Curviness of the road (A four-level attribute -
almost straight, slight, moderate, winding)
Running costs (in dollars)
Toll cost (in dollars).

30
Experimental Design

The four levels of the six attributes that were
chosen are as follows:
Free flow travel time: -20, -10, 10, 20
Time slowed down: -20, -10, 10, 20
Percent of time with vehicles close behind: -50,
-25, 25, 50
Curviness: almost straight, slight, moderate,
winding
Running costs: -10, -5, 5, 10
Toll cost for car and double for truck if trip
duration is
1 hour or less: 0, 0.5, 1.5, 3
between 1 hour and 2 hours 30 minutes: 0, 1.5,
4.5, 9
more than 2 and a half hours: 0, 2.5, 7.5, 15

31
Survey
32
Estimated Latent Class Model
33
Estimated Value of Time Saved

34
Distribution of Parameters: Value of Time on 2
Lane Road

35
Continuous Random Variation in Preference Weights
36
Classical Estimation Platform: The Likelihood
Expected value over all possible realizations of
$\beta_i$ (according to the estimated asymptotic
distribution). I.e., over all possible samples.
37
Maximum Simulated Likelihood
True log likelihood
Simulated log likelihood
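
The two formulas named on this slide are missing from the transcript. A hedged reconstruction in standard form follows. $f(\beta_i \mid \Omega)$ denotes the assumed population density of the random parameters, $\beta_{ir}$ is the $r$-th simulated draw, and $R$ here counts draws (it is not the autocorrelation matrix $R$ used elsewhere in the deck).

```latex
% True log likelihood: integrate the random parameters out of each
% person's sequence of choice probabilities
\log L = \sum_{i=1}^{N} \log \int_{\beta_i}
  \left[ \prod_{t=1}^{T_i} P(\text{choice}_{it} \mid \beta_i) \right]
  f(\beta_i \mid \Omega) \, d\beta_i

% Simulated log likelihood: replace the integral with an average over R draws
\log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R}
  \prod_{t=1}^{T_i} P(\text{choice}_{it} \mid \beta_{ir})
```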
38
Computational Difficulty?

Outside of normal linear models with normal
random coefficient distributions, performing the
integral can be computationally challenging.
(AR, p. 62)
(No longer even remotely true)
MSL with dozens of parameters is simple
Multivariate normal (multinomial probit) is no
longer the benchmark alternative. (See McFadden
and Train)
Intelligent methods of integration (Halton
sequences) speed up integration by factors of as
much as 10. (These could be used by Bayesians.)
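
For readers unfamiliar with Halton sequences, here is a minimal sketch of the standard radical-inverse construction, plus a conversion to standard normal draws through the inverse CDF. The function names are hypothetical, and estimation software may also scramble or discard initial elements, which this sketch does not do.

```python
from statistics import NormalDist

def halton(index, base):
    """Return element number `index` (index >= 1) of the Halton
    sequence for a prime `base`, a value strictly between 0 and 1."""
    result, f, i = 0.0, 1.0, index
    while i > 0:
        f /= base
        result += f * (i % base)
        i //= base
    return result

def halton_normal_draws(n, base):
    """Map the first n Halton points to standard normal draws."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf(halton(i, base)) for i in range(1, n + 1)]

uniform_points = [halton(i, 2) for i in range(1, 4)]
normal_draws = halton_normal_draws(127, 2)
print(uniform_points)
```

Because the points fill the unit interval evenly instead of clustering at random, far fewer draws are needed for a given simulation accuracy, which is the source of the speedup mentioned on the slide.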

39
Random Parameters Model

Allow model parameters as well as constants to be
random
Allow multiple observations with persistent
effects
Allow a hierarchical structure for parameters
not completely random
$U_{itj} = \beta_1' x_{i1tj} + \beta_{2it}' x_{i2tj} + \alpha_i' z_{it} + \varepsilon_{ijt}$
Random parameters in multinomial logit model
$\beta_1$ = nonrandom (fixed) parameters
$\beta_{2it}$ = random parameters that may vary across
individuals and across time
Maintain I.I.D. assumption for $\varepsilon_{ijt}$ (given $\beta$)

40
Random Parameters Logit Model
Multiple choice situations: independent,
conditioned on the individual specific parameters
41
Random Parameters Specification
$\beta_{2it}(k)$ = parameter on kth attribute
$\beta_{2it}(k) = \beta_{2k} + \delta_k' z_i + \gamma_k' v_{it}$
Mean: $\beta_{2k} + \delta_k' z_i$, may depend on characteristics
Variance: $\gamma_k' \gamma_k$; may be correlated with other
parameters
Distribution: depends on specification of $v_{it}$
$v_{it}$ may be a random effect or correlated across time
to capture persistence of preferences across choice
settings
Elements of $\beta$ and/or choice specific constants
$\alpha$ may also be random
42
Modeling Variations

Parameter specification
Nonrandom: variance = 0
Fixed mean: not to be estimated; free variance
Fixed range: mean estimated, triangular from 0
to $2\beta$
Hierarchical structure: $\beta_i = \beta + \delta(k)' z_i$
Stochastic specification
Normal, uniform, triangular (tent) distributions
Strictly positive: lognormal parameters (e.g.,
on income)
Autoregressive: $v(i,t,k) = u(i,t,k) + r(k)\,v(i,t-1,k)$;
this picks up time effects in
multiple choice situations, e.g., fatigue.
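
The fixed range specification can be illustrated with a short sketch: drawing from a triangular (tent) distribution on the interval from 0 to twice the mean keeps every draw on the same side of zero as the mean. The function name and the value -1.0 are hypothetical.

```python
import random

def draw_triangular_coefficient(beta, rng):
    """Draw from a tent distribution with mode beta whose range runs
    from 0 to 2*beta, so the draw always has the sign of beta."""
    low, high = sorted((0.0, 2.0 * beta))
    return rng.triangular(low, high, beta)

rng = random.Random(0)
beta = -1.0  # hypothetical mean, e.g. for a price coefficient
draws = [draw_triangular_coefficient(beta, rng) for _ in range(10000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean, min(draws), max(draws))
```

Only the mean is a free parameter in this specification, since the spread is tied to it. That is the price of guaranteeing the sign.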

43
Estimating the Model

Denote by $\beta_1$ all fixed parameters in the model
Denote by $\beta_{2i,t}$ all random and hierarchical
parameters in the model
44
Estimating the RPL Model

Denote by $\beta_1$ all fixed parameters in the model
Denote by $\beta_{2i,t}$ all random and hierarchical
parameters in the model
Estimation: $\beta_1$
$\beta_{2it} = \beta_2 + \Delta z_i + \Gamma v_{i,t}$
Uncorrelated: $\Gamma$ is diagonal
Autocorrelated: $v_{i,t} = R v_{i,t-1} + u_{i,t}$
(1) Estimate structural parameters
(2) Estimate individual specific utility
parameters
(3) Estimate elasticities, etc.

45
Simulation Based Estimation

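Choice probability: $P[\text{data} \mid (\beta_1,\beta_2,\Delta,\Gamma,R,v_{i,t})]$
Need to integrate out the unobserved random term
$E\{P[\text{data} \mid (\beta_1,\beta_2,\Delta,\Gamma,R,v_{i,t})]\}
= \int_{v_{i,t}} P[\text{data} \mid (\beta_1,\beta_2,\Delta,\Gamma,R,v_{i,t})]\, f(v_{i,t})\, dv_{i,t}$
Integration is done by simulation
Draw values of $v$ and compute $\beta$, then probabilities
Average many draws
Maximize the sum of the logs of the averages
(See Train [Cambridge, 2003] on simulation
methods.)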
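
The steps listed above (draw, compute the parameters, compute probabilities, average, sum the logs) can be sketched as follows. This is a simplified illustration that assumes independent normal random coefficients and no covariates in the means, i.e. a diagonal $\Gamma$ and $\Delta = 0$. All names and the toy data are hypothetical.

```python
import math
import random

def logit_prob(beta, X, chosen):
    """Multinomial logit probability of the chosen alternative."""
    utilities = [sum(b * x for b, x in zip(beta, row)) for row in X]
    m = max(utilities)  # subtract the max for numerical stability
    expu = [math.exp(u - m) for u in utilities]
    return expu[chosen] / sum(expu)

def simulated_loglik(mean, sd, people, draws):
    """people[i]: list of (X, chosen) situations for person i.
    draws[i][r]: vector of standard normal draws, person i, replication r."""
    total = 0.0
    for situations, person_draws in zip(people, draws):
        sim = 0.0
        for v in person_draws:
            # Same draw across all of the person's choice situations,
            # so the effect is persistent within the person.
            beta = [m + s * e for m, s, e in zip(mean, sd, v)]
            like = 1.0
            for X, chosen in situations:
                like *= logit_prob(beta, X, chosen)
            sim += like
        total += math.log(sim / len(person_draws))  # log of the average
    return total

rng = random.Random(1)
people = [
    [([[1.0, 0.0], [0.0, 1.0]], 0), ([[1.0, 1.0], [0.0, 0.0]], 0)],
    [([[1.0, 0.0], [0.0, 1.0]], 1), ([[1.0, 1.0], [0.0, 0.0]], 1)],
]
draws = [[[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(200)]
         for _ in people]
print(simulated_loglik([0.5, -0.2], [1.0, 1.0], people, draws))
```

An optimizer would then search over the means and standard deviations while holding the draws fixed between iterations, so that the simulated objective is a smooth function of the parameters.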

46
Customers' Choice of Energy Supplier

California, Stated Preference Survey
361 customers presented with 8-12 choice
situations each
Supplier attributes
Fixed price: cents per kWh
Length of contract
Local utility
Well-known company
Time-of-day rates (11¢ in day, 5¢ at night)
Seasonal rates (10¢ in summer, 8¢ in winter, 6¢
in spring/fall)

47
Population Distribution

Normal for
Contract length
Local utility
Well-known company
Log-normal for
Time-of-day rates
Seasonal rates
Price coefficient held fixed

48
Estimated Model

                      Estimate   Std error
Price                   -.883      0.050
Contract   mean         -.213      0.026
           std dev       .386      0.028
Local      mean         2.23       0.127
           std dev      1.75       0.137
Known      mean         1.59       0.100
           std dev       .962      0.098
TOD        mean         2.13       0.054
           std dev       .411      0.040
Seasonal   mean         2.16       0.051
           std dev       .281      0.022
Parameters of underlying normal.
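
A hedged sketch of how these estimates translate into population summaries. It computes the share of customers on either side of zero for the normally distributed coefficients, the mean of each lognormal coefficient from its underlying normal, and each mean divided by the magnitude of the price coefficient. The results line up closely with figures quoted on the following slides (for example about 10% disliking the local utility, and about 2.5 and 10.4 in price units), but that link is an inference from the arithmetic and is not stated in the source.

```python
import math
from statistics import NormalDist

price = -0.883  # fixed price coefficient from the table above

# (mean, std dev) of the normally distributed coefficients
normal_coefs = {
    "contract": (-0.213, 0.386),
    "local": (2.23, 1.75),
    "known": (1.59, 0.962),
}
# (mean, std dev) of the underlying normal for the lognormal coefficients
lognormal_coefs = {"tod": (2.13, 0.411), "seasonal": (2.16, 0.281)}

def share_below_zero(mean, sd):
    """Population share with a negative coefficient under normality."""
    return NormalDist(mean, sd).cdf(0.0)

def lognormal_mean(mu, sigma):
    """Mean of exp(N(mu, sigma^2))."""
    return math.exp(mu + sigma ** 2 / 2.0)

shares = {k: share_below_zero(m, s) for k, (m, s) in normal_coefs.items()}
scaled_means = {k: m / abs(price) for k, (m, s) in normal_coefs.items()}
scaled_lognormal = {k: lognormal_mean(m, s) / abs(price)
                    for k, (m, s) in lognormal_coefs.items()}
print(shares, scaled_means, scaled_lognormal)
```

Dividing by the price coefficient expresses each attribute in cents per kWh, which is what makes the values comparable across attributes.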
49
Distribution of Brand Value
[Figure: distribution of the brand value of local utility; mean 2.5, standard deviation 2.0; 10% dislike local utility]

[Figure panels: distributions of the random coefficients]
Contract Length: 29%; mean -.24; standard deviation .55
Local Utility: 10%; mean 2.5; standard deviation 2.0
Well-known Company: 5%; mean 1.8; standard deviation 1.1
51
Time of Day Rates (Customers do not like.)

[Figure panels: distributions of the rate coefficients]
Time-of-day Rates: axis marks at -10.4 and 0
Seasonal Rates: axis marks at -10.2 and 0
52
Expected Preferences of Each Customer

                      Population Mean   Customer A's Conditional Mean
Contract length            -0.24                 2.20
Local utility               2.50                 3.30
Well-known company          1.80                 2.00
Time-of-day rates         -10.40                -6.30
Seasonal rates            -10.20                -6.60

Customer likes long-term contract, local
utility, and non-fixed rates. Local utility
can retain and make profit from this customer by
offering a long-term contract with time-of-day
or seasonal rates.
53
A General Extension of the RPL

54
Other Model extensions

AR(1): $w_{i,k,t} = \rho_k w_{i,k,t-1} + v_{i,k,t}$
Dynamic effects in the model
Restricting sign
Restricting range and sign: using triangular
distribution and range 0 to $2\beta$.

55
Heteroscedasticity and Heterogeneity
Why is heteroscedasticity important? Why should
only the means of the random parameters be
heterogeneous?
56
Estimating Individual Parameters

Model estimates structural parameters
Objective, model of individual specific
parameters
Can individual specific parameters be estimated?

57
Estimating Individual Distributions

Posterior estimates of $E[\beta_i]$
Use the same methodology to estimate $E[\beta_i^2]$ and
$\mathrm{Var}[\beta_i]$.
Plot individual confidence intervals (assuming
near normality)
Sample from the distribution and plot kernel
density estimates

58
Posterior Estimation of $\beta_i$
Estimate by simulation
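
The formula for this slide is missing from the transcript. A common simulation estimator, sketched here under that assumption, weights each simulated parameter draw by the likelihood of the person's observed choices at that draw. The same weights give the second moment and hence a conditional variance. The function name and toy numbers are hypothetical.

```python
def conditional_moments(beta_draws, likelihoods):
    """beta_draws[r]: r-th simulated parameter vector for one person.
    likelihoods[r]: likelihood of that person's observed choices at draw r.
    Returns the likelihood-weighted mean and variance, element by element."""
    total = sum(likelihoods)
    weights = [l / total for l in likelihoods]
    k = len(beta_draws[0])
    mean = [sum(w * b[j] for w, b in zip(weights, beta_draws))
            for j in range(k)]
    second = [sum(w * b[j] ** 2 for w, b in zip(weights, beta_draws))
              for j in range(k)]
    var = [s - m ** 2 for s, m in zip(second, mean)]
    return mean, var

# Hypothetical draws of a single coefficient.
beta_draws = [[1.0], [2.0], [3.0]]
# Uninformative data: equal likelihoods give the plain average of the draws.
mean_flat, var_flat = conditional_moments(beta_draws, [0.2, 0.2, 0.2])
# Informative data: the likelihood favors the largest draw.
mean_tilt, var_tilt = conditional_moments(beta_draws, [0.01, 0.04, 0.45])
print(mean_flat, var_flat, mean_tilt, var_tilt)
```

The more choice situations a person contributes, the more sharply the likelihood weights discriminate among draws and the further the conditional mean can move away from the population mean.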
59
Application: Shoe Brand Choice

Simulated data: stated choice, 400 respondents, 8
choice situations
3 choice/attributes + NONE
Fashion = High / Low
Quality = High / Low
Price = 25/50/75,100 coded 1,2,3,4
Heterogeneity: Sex, Age (<25, 25-39, 40+)
Underlying data generated by a 3 class latent
class process (100, 200, 100 in classes)
Thanks to www.statisticalinnovations.com (Latent
Gold and Jordan Louviere)

60
Error Components Logit Modeling

Alternative approach to building cross choice
correlation
Common effects
Example

61
Implied Covariance Matrix
62
Error Components Logit Model
Correlation 0.2837 / 1.6449 0.2837 0.1468
63
Extending the Basic MNL Model
64
Error Components Logit Model
65
Random Parameters Model
66
Heterogeneous (in the Means) Random Parameters
Model
67
Heterogeneity in Both Means and Variances
68
Individual Effects Model
70
Individual $E[\beta_i \mid \text{data}_i]$ Estimates
The intervals could be made wider to account for
the sampling variability of the underlying
(classical) parameter estimators.
71
What is the Individual Estimate?

Point estimate of mean, variance and range of
random variable $\beta_i \mid \text{data}_i$.
Value is NOT an estimate of $\beta_i$; it is an
estimate of $E[\beta_i \mid \text{data}_i]$
What would be the best estimate of the actual
realization $\beta_i \mid \text{data}_i$?
An interval estimate would account for the
sampling variation in the estimator of $\Omega$ that
enters the computation.
Bayesian counterpart to the preceding? Posterior
mean and variance? Same kind of plot could be
done.

72
WTP Application (Value of Time Saved)

Estimating Willingness to Pay for Increments to
an Attribute in a Discrete Choice Model

Random
73
Extending the RP Model to WTP

Use the model to estimate conditional
distributions for any function of parameters
Willingness to pay = $\beta_{i,\text{time}} / \beta_{i,\text{cost}}$
Use same method

74
Estimation of WTP from $\beta_i$
Estimate by simulation
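
The estimator itself is not in the transcript. One standard form, given here as a hedged sketch, applies likelihood weighting over simulated draws to the ratio of the time and cost coefficients. $L_i(\beta_{ir})$ is the likelihood of person $i$'s observed choices at draw $r$, and $R$ is the number of draws. Both are assumed notation.

```latex
% Conditional (individual specific) willingness to pay, estimated by simulation
\hat{E}[\mathrm{WTP}_i \mid \text{data}_i] =
  \frac{\frac{1}{R} \sum_{r=1}^{R}
        \dfrac{\beta_{ir,\text{time}}}{\beta_{ir,\text{cost}}} \, L_i(\beta_{ir})}
       {\frac{1}{R} \sum_{r=1}^{R} L_i(\beta_{ir})},
\qquad
L_i(\beta_{ir}) = \prod_{t=1}^{T_i} P(\text{choice}_{it} \mid \beta_{ir})
```

If the cost coefficient can take values near zero, the ratio has very heavy tails. This is one reason the deck discusses restricting sign and range.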
75
Stated Choice Experiment: Travel Mode by Sydney
Commuters
76
Would You Use a New Mode?
77
Value of Travel Time Saved
