Models with limited dependent variables - PowerPoint PPT Presentation

1
Models with limited dependent variables
  • Doctoral Program 2006-2007
  • Katia Campo

2
Introduction
3
Limited dependent variables
Discrete dependent variable
Continuous dependent variable
Truncated, Censored
4
Discrete choice models
  • Choice between different options (j)
  • Single choice (binary choice models)
  • e.g. buy a product or not, follow higher
    education or not, ...
  • j = 1 (yes/accept) or 0 (no/reject)
  • Multiple choice (multinomial choice models)
  • e.g. cars, stores, transportation modes
  • j = 1 (option 1), 2 (option 2), ..., J (option J)

5
Truncated/censored regression models
  • Truncated variable
  • observed only beyond a certain threshold level
    (truncation point)
  • e.g. store expenditures, income
  • Censored variables
  • values in a certain range are all transformed to
    (or reported as) a single value (Greene, p.761)
  • e.g. demand (stockouts, unfulfilled demand),
    hours worked
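
A minimal Python sketch of the difference between the two cases. The lower threshold of 0 and the latent values are illustrative assumptions, not data from the course.

```python
def truncate(values, threshold):
    """Truncation: observations at or below the threshold are not observed at all."""
    return [v for v in values if v > threshold]


def censor(values, threshold):
    """Censoring: observations at or below the threshold are all reported as the threshold."""
    return [max(v, threshold) for v in values]


# Hypothetical latent values (e.g. desired expenditures); threshold assumed to be 0.
latent = [-3.0, -1.0, 0.5, 2.0, 4.0]
truncated = truncate(latent, 0.0)   # the sample shrinks
censored = censor(latent, 0.0)      # the sample size is kept, low values pile up at 0
```

In both cases the sample mean differs from the mean of the latent values, which is why an unadjusted regression on such data is problematic.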

6
Duration/Hazard models
  • Time between two events, e.g.
  • Time between two purchases
  • Time until a consumer becomes inactive/cancels a
    subscription
  • Time until a consumer responds to direct mail/ a
    questionnaire
  • ...

7
Need to use adjusted models: illustration
Franses and Paap (2001)
8
Overview
  • Part I. Discrete Choice Models
  • Part II. Censored and Truncated Regression Models
  • Part III. Duration Models

9
Recommended Literature
  • Kenneth Train, Discrete Choice Methods with
    Simulation, Cambridge University Press, 2003
    (Part I)
  • Ph. H. Franses and R. Paap, Quantitative Models in
    Market Research, Cambridge University Press, 2001
    (Parts I-II-III; data: www.few.eur.nl/few/people/paap)
  • D.A.Hensher, J.M.Rose and W.H.Greene, Applied
    Choice Analysis, Cambridge University Press, 2005
    (Part I)

10
Part I. Discrete Choice Models
11
Overview Part I, DCM
  • Properties of DCM
  • Estimation of DCM
  • Types of Discrete Choice Models
  • Binary Logit Model
  • Multinomial Logit Model
  • Nested logit model
  • Probit Model
  • Ordered Logit Model
  • Heterogeneity

12
Notation
  • n decision maker
  • i,j choice options
  • y decision outcome
  • x explanatory variables
  • $\beta$ parameters
  • $\varepsilon$ error term
  • $I[\cdot]$ indicator function, equal to 1 if
    expression within brackets is true, 0 otherwise
  • e.g. $I[y = j \mid x] = 1$ if j was selected (given x),
    equal to 0 otherwise

13
A. Properties of DCM
Kenneth Train
  • Characteristics of the choice set
  • Alternatives must be mutually exclusive
  • no combination of choice alternatives
  • (e.g. different brands, combination of diff.
    transportation modes)
  • Choice set must be exhaustive
  • i.e., include all relevant alternatives
  • Finite number of alternatives

14
A. Properties of DCM
Kenneth Train
  • Random utility maximization
  • Assumption: the decision maker selects the alternative that
    provides the highest utility,
  • i.e. selects i if $U_{ni} > U_{nj} \;\; \forall j \neq i$
  • Decomposition of utility into a deterministic
    (observed) and random (unobserved) part
  • $U_{nj} = V_{nj} + \varepsilon_{nj}$
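
The decomposition can be illustrated with a small simulation. This is a sketch under assumptions: the representative utilities are made up, and the errors are drawn from a standard Gumbel distribution (the assumption the logit slides later in the deck rely on), so the simulated choice shares should approach the logit probabilities.

```python
import math
import random


def gumbel_draw(rng):
    """Draw from the standard Gumbel (extreme value) distribution via the inverse CDF."""
    u = rng.random()
    while u == 0.0:          # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choice_shares(v, n_draws, seed=0):
    """Share of draws in which each alternative has the highest U = V + error."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_draws):
        utilities = [v_j + gumbel_draw(rng) for v_j in v]
        counts[utilities.index(max(utilities))] += 1
    return [c / n_draws for c in counts]


def logit_probabilities(v):
    """Closed-form logit probabilities exp(V_i) / sum_j exp(V_j)."""
    denom = sum(math.exp(v_j) for v_j in v)
    return [math.exp(v_j) / denom for v_j in v]


V = [1.0, 0.5, 0.0]                      # hypothetical representative utilities
shares = simulate_choice_shares(V, n_draws=50000, seed=42)
probs = logit_probabilities(V)
```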

15
A. Properties of DCM
Kenneth Train
  • Random utility maximization

16
A. Properties of DCM
Kenneth Train
  • Identification problems
  • Only differences in utility matter
  • Choice probabilities do not change when a
    constant is added to each alternative's utility
  • Implication
  • Some parameters cannot be identified/estimated:
    alternative-specific constants; coefficients of
    variables that change over decision makers but
    not over alternatives
  • Normalization of parameter(s)
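
A short check of the "only differences matter" property, sketched with logit probabilities and made-up utilities (both are assumptions for illustration): adding the same constant to every alternative leaves the probabilities unchanged.

```python
import math


def logit_probabilities(v):
    """Logit probabilities; subtracting max(v) only improves numerical stability."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


V = [0.2, 1.0, -0.5]                 # hypothetical utilities
shifted = [x + 3.0 for x in V]       # same constant added to every alternative
p_base = logit_probabilities(V)
p_shifted = logit_probabilities(shifted)
```

Because the shift cannot be detected in the data, one alternative-specific constant has to be fixed (typically at 0).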

17
A. Properties of DCM
Kenneth Train
  • Identification problems
  • Overall scale of utility is irrelevant
  • Choice probabilities do not change when the
    utilities of all alternatives are multiplied by the
    same factor
  • Implication
  • Coefficients of different models (data sets) are not
    directly comparable
  • Normalization (variance of error terms)

18
A. Properties of DCM
Kenneth Train
  • Aggregation
  • Biased estimates when aggregate values of the
    explanatory variables are used as input
  • Consistent estimates can be obtained by sample
    enumeration
  • - compute probability/elasticity for each decision maker
  • - compute (weighted) average of these values
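
A minimal sketch of why aggregation matters, assuming a binary logit with a single coefficient and three made-up decision makers: the probability evaluated at the average x is not the average of the individual probabilities.

```python
import math


def logit(v):
    return 1.0 / (1.0 + math.exp(-v))


beta = 1.0                    # hypothetical coefficient
x = [-2.0, 0.0, 4.0]          # hypothetical explanatory variable, one value per decision maker

# Aggregate input: plug the average x into the model.
prob_at_mean_x = logit(beta * sum(x) / len(x))

# Sample enumeration: compute each probability, then average.
individual_probs = [logit(beta * x_n) for x_n in x]
sample_enumeration = sum(individual_probs) / len(individual_probs)
```

With weights, the last line would become a weighted average of `individual_probs`.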

Swait and Louviere (1993), Andrews and Currim
(2002)
19
A. Properties of DCM
Kenneth Train
  • Aggregation

20
B. Estimation DCM
  • Numerical maximization (ML-estimation)
  • Simulation-assisted estimation
  • Bayesian estimation

(see Train)
21
B. ML-estimation
  • Objective: find those parameter values most
    likely to have produced the sample observations
    (Judge et al.)
  • Likelihood for one observation: $P_n(X, \beta)$
  • Likelihood function
  • $L(\beta) = \prod_n P_n(X, \beta)$
  • Log-likelihood
  • $LL(\beta) = \sum_n \ln P_n(X, \beta)$

22
B. ML Estimation
  • Determine $\beta$ for which $LL(\beta)$ reaches its maximum
  • First derivative = 0 → no closed-form solution
  • Iterative procedure
  • (i) Starting values $\beta_0$
  • (ii) Determine new value $\beta_{t+1}$ for which $LL(\beta_{t+1}) >
    LL(\beta_t)$
  • (iii) Repeat step (ii) until convergence (small
    change in $LL(\beta)$)

23
B. ML Estimation
24
B. ML Estimation
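  • Direction and step size $\beta_t \to \beta_{t+1}$?
  • based on Taylor approximation of $LL(\beta_{t+1})$ (with
    base $\beta_t$)
  • $LL(\beta_{t+1}) \approx LL(\beta_t) + (\beta_{t+1} - \beta_t)' g_t + \frac{1}{2} (\beta_{t+1} - \beta_t)' H_t (\beta_{t+1} - \beta_t)$   [1]
  • with $g_t$ the gradient and $H_t$ the Hessian of $LL(\beta)$, both evaluated at $\beta_t$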

25
B. ML Estimation
  • Direction and step size $\beta_t \to \beta_{t+1}$?
  • Optimization of [1] leads to $\beta_{t+1} = \beta_t + (-H_t)^{-1} g_t$
  • → Computation of the Hessian may cause problems
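
A minimal Newton-Raphson sketch of this iteration. It assumes a binary logit with an intercept and one regressor, and uses six made-up observations; the names and data are illustrative and not taken from the course material. Each step solves $(-H_t)\,\text{step} = g_t$ for the 2x2 case.

```python
import math


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def log_likelihood(beta, x, y):
    b0, b1 = beta
    ll = 0.0
    for x_n, y_n in zip(x, y):
        p = logistic(b0 + b1 * x_n)
        ll += y_n * math.log(p) + (1 - y_n) * math.log(1.0 - p)
    return ll


def gradient_and_hessian(beta, x, y):
    """Gradient g and the three distinct entries of the symmetric Hessian H."""
    b0, b1 = beta
    g0 = g1 = 0.0
    h00 = h01 = h11 = 0.0
    for x_n, y_n in zip(x, y):
        p = logistic(b0 + b1 * x_n)
        g0 += y_n - p
        g1 += (y_n - p) * x_n
        w = p * (1.0 - p)
        h00 -= w
        h01 -= w * x_n
        h11 -= w * x_n * x_n
    return (g0, g1), (h00, h01, h11)


def newton_raphson(x, y, beta=(0.0, 0.0), tol=1e-10, max_iter=50):
    ll_old = log_likelihood(beta, x, y)
    ll_new = ll_old
    for _ in range(max_iter):
        (g0, g1), (h00, h01, h11) = gradient_and_hessian(beta, x, y)
        a, b, d = -h00, -h01, -h11          # entries of -H
        det = a * d - b * b
        step0 = (d * g0 - b * g1) / det     # (-H)^{-1} g, first element
        step1 = (a * g1 - b * g0) / det     # (-H)^{-1} g, second element
        beta = (beta[0] + step0, beta[1] + step1)
        ll_new = log_likelihood(beta, x, y)
        if abs(ll_new - ll_old) < tol:      # convergence: small change in LL
            break
        ll_old = ll_new
    return beta, ll_new


# Hypothetical data; the outcomes overlap in x, so a finite maximum exists.
x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [0, 0, 1, 0, 1, 1]
beta_hat, ll_hat = newton_raphson(x, y)
```

For larger models the explicit 2x2 inverse would be replaced by a linear solver, which is where computing and inverting the Hessian becomes costly.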

26
B. ML Estimation
  • Alternative procedures
  • Approximations to the Hessian
  • Other procedures, such as steepest-ascent
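
A steepest-ascent sketch that avoids the Hessian altogether. It assumes a binary logit with a single coefficient and no intercept, made-up data, and a simple step-halving rule chosen here for illustration; it is not the specific procedure described by Train or Judge et al.

```python
import math


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def ll_and_gradient(beta, x, y):
    """Log-likelihood and its first derivative for a one-coefficient binary logit."""
    ll = g = 0.0
    for x_n, y_n in zip(x, y):
        p = logistic(beta * x_n)
        ll += y_n * math.log(p) + (1 - y_n) * math.log(1.0 - p)
        g += (y_n - p) * x_n
    return ll, g


def steepest_ascent(x, y, beta=0.0, tol=1e-12, max_iter=500):
    ll, g = ll_and_gradient(beta, x, y)
    for _ in range(max_iter):
        step = 1.0
        new_beta = beta + step * g                    # move in the gradient direction
        new_ll, new_g = ll_and_gradient(new_beta, x, y)
        while new_ll < ll and step > 1e-12:           # halve the step until LL improves
            step /= 2.0
            new_beta = beta + step * g
            new_ll, new_g = ll_and_gradient(new_beta, x, y)
        if new_ll < ll:                               # no improving step found
            break
        converged = new_ll - ll < tol
        beta, ll, g = new_beta, new_ll, new_g
        if converged:
            break
    return beta, ll


x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]   # hypothetical data
y = [0, 0, 1, 0, 1, 1]
beta_sa, ll_sa = steepest_ascent(x, y)
```

Steepest ascent needs only first derivatives, but typically takes more iterations than Newton-Raphson.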

See e.g. Train, Judge et al.(1985)
27
B. ML Estimation
  • Properties ML estimator
  • Consistency
  • Asymptotic Normality
  • Asymptotic Efficiency

See e.g. Greene (ch.17), Judge et al.
28
B. Diagnostics and Model Selection
  • Goodness-of-Fit
  • Joint significance of explanatory vars
  • LR-test: $LR = -2(LL(\beta_0) - LL(\beta))$
  • $LR \sim \chi^2(k)$
  • Pseudo $R^2 = 1 - \frac{LL(\beta)}{LL(\beta_0)}$
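
As a worked check, the two measures can be computed from the log-likelihood values in the Franses and Paap binary logit output shown later in this deck (log likelihood -696.1344, restricted log likelihood -967.918, 8 degrees of freedom). The function names are illustrative; 15.507 is the 5% critical value of the chi-square distribution with 8 degrees of freedom.

```python
def lr_statistic(ll_null, ll_model):
    """LR = -2 (LL(beta_0) - LL(beta))."""
    return -2.0 * (ll_null - ll_model)


def pseudo_r2(ll_null, ll_model):
    """Pseudo R-squared = 1 - LL(beta) / LL(beta_0)."""
    return 1.0 - ll_model / ll_null


ll_null = -967.918      # restricted (intercept-only) log likelihood
ll_model = -696.1344    # log likelihood of the full model

lr = lr_statistic(ll_null, ll_model)
r2 = pseudo_r2(ll_null, ll_model)

CHI2_CRIT_8DF_5PCT = 15.507
reject_null = lr > CHI2_CRIT_8DF_5PCT   # explanatory variables jointly significant
```

Both values match the "LR statistic" (543.5673) and "McFadden R-squared" (0.280792) lines of that output.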

29
B. Diagnostics and Model Selection
  • Goodness-of-Fit
  • Akaike Information Criterion
  • $AIC = \frac{1}{N}(-2\,LL(\beta) + 2k)$
  • $CAIC = -2\,LL(\beta) + k(\log(N) + 1)$
  • $BIC = \frac{1}{N}(-2\,LL(\beta) + k \log(N))$
  • sometimes conflicting results
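
A small sketch of the three criteria, using the natural logarithm. The inputs are taken from the Franses and Paap binary logit output later in this deck (log likelihood -696.1344, 2798 observations); k = 9 is read off the nine coefficients in that output. The function names are illustrative.

```python
import math


def aic(ll, k, n):
    return (-2.0 * ll + 2.0 * k) / n


def caic(ll, k, n):
    return -2.0 * ll + k * (math.log(n) + 1.0)


def bic(ll, k, n):
    return (-2.0 * ll + k * math.log(n)) / n


ll, k, n = -696.1344, 9, 2798
aic_value = aic(ll, k, n)
caic_value = caic(ll, k, n)
bic_value = bic(ll, k, n)
```

The AIC and BIC values agree with the "Akaike info criterion" (0.504027) and "Schwarz criterion" (0.523123) lines of that output.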

30
B. Diagnostics and Model Selection
  • Model selection based on GoF
  • Nested models: LR-test
  • $LR = -2(LL(\beta_r) - LL(\beta_{ur}))$
  • r = restricted model, ur = unrestricted (full) model
  • $LR \sim \chi^2(k)$ (k = difference in number of parameters)
  • Non-nested models
  • AIC, CAIC, BIC → lowest value

31
C. Discrete Choice Models
  1. Binary Logit Model
  2. Multinomial Logit Model
  3. Nested logit model
  4. Probit Model
  5. Ordered Logit Model

32
1. Binary Logit Model
  • Choice between 2 alternatives
  • Often accept/reject or yes/no decisions
  • E.g. purchase incidence: make a purchase in the
    category or not
  • Dep. var.: $y_n = 1$ if option is selected,
    0 if option is not selected
  • Model: $P(y_n = 1 \mid x_n)$

33
1. Binary Logit Model
  • Based on the general RUM-model
  • Assumption: error terms are iid and follow an extreme
    value or Gumbel distribution

34
1. Binary Logit Model
  • Based on the general RUM-model
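  • $P_n = \int I[\beta x_n + \varepsilon_n > 0] \, f(\varepsilon) \, d\varepsilon$
  • $= \int I[\varepsilon_n > -\beta x_n] \, f(\varepsilon) \, d\varepsilon$
  • $= \int_{\varepsilon = -\beta x_n}^{\infty} f(\varepsilon) \, d\varepsilon$
  • $= 1 - F(-\beta x_n)$
  • $= 1 - 1/(1 + \exp(\beta x_n))$
  • $= \exp(\beta x_n)/(1 + \exp(\beta x_n))$
  • Assumption: error terms are iid and follow an extreme
    value/Gumbel distribution, so that their difference
    $\varepsilon_n$ is logistic with CDF $F$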
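
The steps of this derivation can be checked numerically. The sketch below assumes a logistic density for the error term and made-up values for the coefficient and the regressor; the midpoint integration is only for illustration.

```python
import math


def logistic_cdf(e):
    return 1.0 / (1.0 + math.exp(-e))


def logistic_pdf(e):
    return math.exp(-e) / (1.0 + math.exp(-e)) ** 2


def choice_probability(beta, x):
    """Closed form: exp(beta x) / (1 + exp(beta x))."""
    v = beta * x
    return math.exp(v) / (1.0 + math.exp(v))


def integrated_probability(beta, x, upper=40.0, steps=20000):
    """Midpoint-rule integral of f(e) from -beta x to a large upper limit."""
    lower = -beta * x
    width = (upper - lower) / steps
    return sum(logistic_pdf(lower + (k + 0.5) * width) for k in range(steps)) * width


beta, x_n = 0.8, 1.5                       # hypothetical values
p_closed = choice_probability(beta, x_n)
p_via_cdf = 1.0 - logistic_cdf(-beta * x_n)
p_integral = integrated_probability(beta, x_n)
```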

35
1. Binary Logit Model
  • Leads to the following expression for the logit
    choice probability

36
1. Binary Logit Model
  • Properties
  • Nonlinear effect of explanatory vars on
    dependent variable
  • Logistic curve with inflection point at P = 0.5

37
1. Binary Logit Model
38
1. Binary Logit Model
  • Effect of explanatory variables ?
  • For
  • Quasi-elasticity

39
1. Binary Logit Model
  • Effect of explanatory variables ?
  • For
  • Odds ratio is equal to
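
The formulas on these two slides did not survive in the transcript. As a hedged sketch, the code below uses the standard binary logit results, assuming these are what the slides showed: marginal effect $\partial P/\partial x = \beta P(1-P)$, quasi-elasticity $(\partial P/\partial x)\,x$, and odds ratio $P/(1-P) = \exp(\beta_0 + \beta_1 x)$. Coefficient values are made up.

```python
import math


def probability(b0, b1, x):
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def marginal_effect(b0, b1, x):
    """dP/dx = b1 * P * (1 - P); largest where P = 0.5."""
    p = probability(b0, b1, x)
    return b1 * p * (1.0 - p)


def quasi_elasticity(b0, b1, x):
    """(dP/dx) * x: change in P for a relative change in x."""
    return marginal_effect(b0, b1, x) * x


def odds_ratio(b0, b1, x):
    """P / (1 - P), which equals exp(b0 + b1 * x) in the logit model."""
    p = probability(b0, b1, x)
    return p / (1.0 - p)


b0, b1, x0 = -1.0, 0.8, 2.0     # hypothetical coefficients and evaluation point
me = marginal_effect(b0, b1, x0)
qe = quasi_elasticity(b0, b1, x0)
odds = odds_ratio(b0, b1, x0)
```

A one-unit increase in x multiplies the odds ratio by exp(b1), regardless of the starting value of x.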

40
1. Binary Logit Model
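  • Estimation: ML
  • Likelihood function $L(\beta)$
  • $= \prod_n P(y_n = 1 \mid x, \beta)^{y_n} \, (1 - P(y_n = 1 \mid x, \beta))^{1 - y_n}$
  • Log-likelihood $LL(\beta)$
  • $= \sum_n [\, y_n \ln P(y_n = 1 \mid x, \beta)
    + (1 - y_n) \ln(1 - P(y_n = 1 \mid x, \beta)) \,]$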

41
1. Binary Logit Model
  • Forecasting accuracy
  • Predictions: $y_n = 1$ if $F(X_n \beta) > c$ (e.g. c = 0.5)
  • $y_n = 0$ if $F(X_n \beta) \le c$
  • Compute hit rate (share of correct predictions)
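
A minimal sketch of the prediction rule and the hit rate. The fitted probabilities and observed outcomes are made up for illustration.

```python
def predict(probabilities, cutoff=0.5):
    """Predict 1 when the fitted probability exceeds the cutoff c, else 0."""
    return [1 if p > cutoff else 0 for p in probabilities]


def hit_rate(observed, predicted):
    """Share of observations for which prediction and outcome coincide."""
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return hits / len(observed)


fitted = [0.9, 0.2, 0.6, 0.4, 0.5]     # hypothetical F(X_n beta)
observed = [1, 0, 0, 1, 0]             # hypothetical outcomes
predicted = predict(fitted)
rate = hit_rate(observed, predicted)
```

With unbalanced outcomes (as in the Franses and Paap example later in the deck, where about 89% of observations equal 1), a cutoff other than 0.5 may be more informative.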

42
1. Binary Logit Model
  • Example: purchase incidence model
  • $p_{tn}(inc)$: probability that household n
    engages in a category purchase in the store
    on purchase occasion t
  • $W_{tn}$: the utility of the purchase option

Bucklin and Gupta (1992)
43
1. Binary Logit Model
  • Example: purchase incidence model

With
$CR_n$: rate of consumption for household n
$INV_{nt}$: inventory level for household n, time t
$CV_{nt}$: category value for household n, time t
Bucklin and Gupta (1992)
44
1. Binary Logit Model
  • Data
  • A.C.Nielsen scanner panel data
  • 117 weeks: 65 for initialization, 52 for
    estimation
  • 565 households: 300 selected randomly for
    estimation, remaining households form the holdout
    sample for validation
  • Data set for estimation: 30,966 shopping trips,
    2,275 purchases in the category (liquid laundry
    detergent)
  • Estimation limited to the 7 top-selling brands
    (80% of category purchases), representing 28
    brand-size combinations (= level of analysis for
    the choice model)

Bucklin and Gupta (1992)
45
1. Binary Logit Model
Goodness-of-Fit
Model         No. of param.   LL          U² (pseudo R²)   BIC
Null model    1               -13614.4    -                13619.6
Full model    4               -11234.5    .175             11255.2
46
1. Binary Logit Model
Parameter estimates
Parameter              Estimate (t-statistic)
Intercept $\beta_0$    -4.521 (-27.70)
CR $\beta_1$           .549 (4.18)
INV $\beta_2$          -.520 (-8.91)
CV $\beta_3$           .410 (8.00)
47
Binary Logit Model (Franses and Paap
www.few.eur.nl/few/people/paap)
Variable          Coefficient   Std. Error   z-Statistic   Prob.
C                 0.222121      0.668483     0.332277      0.7397
DISPLHEINZ        0.573389      0.239492     2.394186      0.0167
DISPLHUNTS        -0.557648     0.247440     -2.253674     0.0242
FEATHEINZ         0.505656      0.313898     1.610896      0.1072
FEATHUNTS         -1.055859     0.349108     -3.024445     0.0025
FEATDISPLHEINZ    0.428319      0.438248     0.977344      0.3284
FEATDISPLHUNTS    -1.843528     0.468883     -3.931748     0.0001
PRICEHEINZ        -135.1312     10.34643     -13.06066     0.0000
PRICEHUNTS        222.6957      19.06951     11.67810      0.0000
48
Binary Logit Model (Franses and Paap
www.few.eur.nl/few/people/paap)
Mean dependent var       0.890279     S.D. dependent var       0.312598
S.E. of regression       0.271955     Akaike info criterion    0.504027
Sum squared resid        206.2728     Schwarz criterion        0.523123
Log likelihood           -696.1344    Hannan-Quinn criter.     0.510921
Restr. log likelihood    -967.918     Avg. log likelihood      -0.248797
LR statistic (8 df)      543.5673     McFadden R-squared       0.280792
Probability(LR stat)     0.000000
Obs with Dep=0           307          Total obs                2798
Obs with Dep=1           2491
49
Binary Logit Model (Franses and Paap
www.few.eur.nl/few/people/paap)
50
Binary Logit Model (Franses and Paap
www.few.eur.nl/few/people/paap)
51
2. Multinomial Logit Model
  • Choice between J > 2 categories
  • Dependent variable $y_n = 1, 2, 3, \ldots, J$
  • Explanatory variables
  • Different across individuals, not across
    categories (standard MNL model)
  • Different across (individuals and) categories
    (conditional MNL model)
  • Model: $P(y_n = j \mid X_n)$

52
2. Multinomial Logit Model
  • Based on the general RUM-model
  • Assumption: error terms are iid following an extreme
    value or Gumbel distribution

53
2. Multinomial Logit Model
  • Identification problem → select a reference
    category and set its coefficients equal to 0
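
A sketch of why the normalization is harmless, assuming the standard MNL form $P(y_n = j \mid x_n) = \exp(\beta_j x_n) / \sum_l \exp(\beta_l x_n)$ (the formula itself is not in the transcript) and made-up coefficients: subtracting the reference category's coefficients from every category leaves all probabilities unchanged.

```python
import math


def mnl_probabilities(x, betas):
    """P(y = j | x) for each category j; betas[j] is the coefficient vector of category j."""
    utilities = [sum(b * x_k for b, x_k in zip(beta_j, x)) for beta_j in betas]
    m = max(utilities)                       # for numerical stability only
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def normalize_to_reference(betas, ref=-1):
    """Subtract the reference category's coefficients from every category."""
    beta_ref = betas[ref]
    return [[b - r for b, r in zip(beta_j, beta_ref)] for beta_j in betas]


x = [1.0, 2.5]                                       # hypothetical: intercept, one characteristic
betas = [[0.4, -0.2], [1.0, 0.3], [-0.5, 0.1]]       # hypothetical coefficients, J = 3
normalized = normalize_to_reference(betas)           # last category becomes the reference

p_raw = mnl_probabilities(x, betas)
p_normalized = mnl_probabilities(x, normalized)
```

The remaining coefficients are then interpreted relative to the reference category.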

54
2. Multinomial Logit Model
  • Conditional MNL model

55
2. Multinomial Logit Model
  • Interpretation of parameters
  • Derivative (marginal effect)
  • Cross-effects

(Traditional MNL model, see Franses and Paap p.80)
56
2. Multinomial Logit Model
  • Interpretation of parameters
  • Overall effect

57
2. Multinomial Logit Model
  • Interpretation of parameters
  • Probability-ratio
  • Does not depend on the other alternatives!

58
2. Multinomial Logit Model
  • Estimation
  • ML estimation

($z_{nj} = 1$ if j is selected, 0 otherwise)
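
The likelihood expression on this slide is missing from the transcript. A hedged sketch, assuming the usual MNL log-likelihood $LL = \sum_n \sum_j z_{nj} \ln P_{nj}$ with logit probabilities, and made-up representative utilities:

```python
import math


def mnl_log_likelihood(utilities, chosen):
    """utilities[n][j] = V_nj; chosen[n] = index of the alternative with z_nj = 1."""
    ll = 0.0
    for v_n, j in zip(utilities, chosen):
        m = max(v_n)
        log_denominator = m + math.log(sum(math.exp(v - m) for v in v_n))
        ll += v_n[j] - log_denominator      # ln P_nj for the chosen alternative
    return ll


# Hypothetical utilities for 3 decision makers and 3 alternatives.
utilities = [[1.0, 0.0, -0.5], [0.2, 0.2, 0.2], [0.0, 1.5, 0.3]]
chosen = [0, 2, 1]
ll = mnl_log_likelihood(utilities, chosen)
```

In an estimation routine the utilities would be functions of the parameters, and this function would be the objective to maximize.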
59
2. Multinomial Logit Model
  • Estimation
  • Alternative estimation procedures
  • Simulation-assisted estimation (Train, Ch.10)
  • Bayesian estimation (Train, Ch.12)

60
2. Multinomial Logit Model
  • Example

Bucklin and Gupta (1992)
61
2. Multinomial Logit Model
  • Variables
  • $U_i$: constant for brand-size i
  • $BL_{hi}$: loyalty of household h to brand of
    brand-size i
  • $LBP_{hit}$ = 1 if i was last brand purchased, 0
    otherwise
  • $SL_{hi}$: loyalty of household h to size of
    brand-size i
  • $LSP_{hit}$ = 1 if i was last size purchased, 0
    otherwise
  • $Price_{it}$: actual shelf price of brand-size i at
    time t
  • $Promo_{it}$: promotional status of brand-size i at
    time t

Bucklin and Gupta (1992)
62
2. Multinomial Logit Model
  • Data
  • A.C.Nielsen scanner panel data
  • 117 weeks: 65 for initialization, 52 for
    estimation
  • 565 households: 300 selected randomly for
    estimation, remaining households form the holdout
    sample for validation
  • Data set for estimation: 30,966 shopping trips,
    2,275 purchases in the category (liquid laundry
    detergent)
  • Estimation limited to the 7 top-selling brands
    (80% of category purchases), representing 28
    brand-size combinations (= level of analysis for
    the choice model)

Bucklin and Gupta (1992)
63
2. Multinomial Logit Model
Goodness-of-Fit
Model         No. of param.   LL         U² (pseudo R²)   BIC
Null model    27              -5957.3    -                6061.6
Full model    33              -3786.9    .364             3914.3
Bucklin and Gupta (1992)
64
2. Multinomial Logit Model
Estimation Results
Parameter          Estimate (t-statistic)
BL $\beta_1$       3.499 (22.74)
LBP $\beta_2$      .548 (6.50)
SL $\beta_3$       2.043 (13.67)
LSP $\beta_4$      .512 (7.06)
Price $\beta_5$    -.696 (-13.66)
Promo $\beta_6$    2.016 (21.33)
Bucklin and Gupta (1992)
65
2. Multinomial Logit Model
  • Scale parameter
  • Variance of the extreme value distribution: $\pi^2/6$
  • If true utility is $U_{nj} = \beta x_{nj} + \varepsilon_{nj}$ with
    $var(\varepsilon_{nj}) = \sigma^2 (\pi^2/6)$, the estimated
    representative utility $V_{nj} = \beta^* x_{nj}$ involves a
    rescaling of $\beta$: $\beta^* = \beta / \sigma$
  • $\beta$ and $\sigma$ cannot be estimated separately
  • take into account that the estimated coefficients
    indicate the variable's effect relative to the
    variance of unobserved factors
  • Include scale parameters if subsamples in a
    pooled estimation (may) have different error
    variances

66
2. Multinomial Logit Model
  • Scale parameter in case of pooled estimation of
    subsamples with different error variance
  • For each subsample s, multiply utility by $\mu_s$,
    which is estimated simultaneously with $\beta$
  • Normalization: set $\mu_s$ equal to 1 for one subsample
  • Values of $\mu_s$ reflect differences in error variation
  • $\mu_s > 1$: error variance is smaller in s than in the
    reference subsample
  • $\mu_s < 1$: error variance is larger in s than in the
    reference subsample
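
A sketch of what the scale factor does, assuming it enters the logit formula as $\exp(\mu_s V)$ and that the implied error variance is $\pi^2/(6\mu_s^2)$ (the standard Gumbel result; the utilities are made up). A larger scale factor means less noise and therefore more clear-cut choice probabilities.

```python
import math


def scaled_logit(v, mu):
    """Logit probabilities with utilities multiplied by the scale factor mu."""
    exps = [math.exp(mu * v_j) for v_j in v]
    total = sum(exps)
    return [e / total for e in exps]


def implied_error_variance(mu):
    """Variance of a Gumbel error with scale 1/mu."""
    return math.pi ** 2 / (6.0 * mu ** 2)


V = [1.0, 0.0, -1.0]                 # hypothetical representative utilities
p_reference = scaled_logit(V, 1.0)   # reference subsample, mu = 1
p_less_noise = scaled_logit(V, 2.0)  # mu > 1: smaller error variance
p_more_noise = scaled_logit(V, 0.5)  # mu < 1: larger error variance
```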

Swait and Louviere (1993), Andrews and Currim
(2002)
67
2. Multinomial Logit Model
  • Example
  • Data from online experiment, 2 product categories
  • Three different assortments, assigned to
    different respondent groups
  • Assortment 1: small assortment
  • Assortment 2: assortment 1 extended with additional brands
  • Assortment 3: assortment 1 extended with additional types
  • Explanatory variables are the same (household
    characteristics, marketing mix), with the exception of the constants
  • A scale factor is introduced for assortments 2 and
    3 (assortment 1 is the reference, with scale factor 1)

Breugelmans et al (2005)
68
2. Multinomial Logit Model
Table 1: Descriptives for each assortment
(margarine and cereals)
Breugelmans et al (2005)
a: "common" refers to attribute levels that are
present in all three assortments
69
2. Multinomial Logit Model
  • MNL model: pooled estimation
  • $P_{hit,a}$: the probability that household h chooses
    item i at time t, facing assortment a
  • $u_{hit,a}$: the choice utility of item i for
    household h facing assortment a
  • = f(household variables, MM-variables)
  • $C_{ha}$: set of category items available to household
    h within assortment a
  • $\mu_a$: Gumbel scale factor

Breugelmans et al., based on Andrews and Currim
(2002) and Swait and Louviere (1993)
70
2. Multinomial Logit Model
  • Estimation results
  • Goodness-of-Fit (M = margarine, C = cereals)
  • (average) LL: -0.045 (M), -0.040 (C)
  • BIC: 2929 (M), 4763 (C)
  • CAIC: 2871 (M), 4699 (C)
  • Scale factors
  • M: 1.2498 (ass. 2), 1.2627 (ass. 3)
  • C: 1.0562 (ass. 2), 0.7573 (ass. 3)

Breugelmans et al (2005)
71
2. Multinomial Logit Model
Margarine
Variable           Assortment 1   Assortment 2   Assortment 3
Scale factor       1.00b          1.2498         1.2627
Mean
Last purchase      2.0675         2.5840c        2.6106c
Item preference    2.8310         3.5382c        3.5747c
Brand asymmetry    0.2805         0.4228         0.5400
Size asymmetry     -0.0841        -0.0880        0.0169
Sequence           - d            0.3672         -0.1190
Proximity          0.8332         1.0303         0.6235

Cereals
Variable           Assortment 1   Assortment 2   Assortment 3
Scale factor       1.00b          1.0562         0.7573
Mean
Last purchase      0.6441         0.6803c        0.4888c
Item preference    5.2011         5.4934c        3.9109c
Brand asymmetry    0.0077         0.6130         0.0969
Taste asymmetry    -0.0260        0.2938         -0.1596
Type asymmetry     0.3119         -0.0614        0.3816
Sequence           -0.3311        -0.0695        0.6190
Proximity          2.0041         0.7214         4.1140
(Excluding brand/size constants)
Breugelmans et al (2005)
72
2. Multinomial Logit Model
  • Limitations of the MNL model
  • Independence of Irrelevant Alternatives
    (proportional substitution pattern)
  • Order (where relevant) is not taken into account
  • Systematic taste variation can be represented,
    not random taste variation
  • No correlation between error terms (iid errors)

73
2. Multinomial Logit Model
  • Independence of irrelevant alternatives
  • Ratio of choice probabilities for 2 alternatives
    i and j does not depend on other alternatives
    (see above)
  • Implication: proportional substitution patterns
  • Cf. blue bus/red bus example
  • T1: Blue bus (P = 50%), Car (P = 50%)
  • T2: Blue bus (P = 33%), Car (P = 33%), Red bus (P = 33%)
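
The example can be reproduced with a few lines of Python, assuming equal representative utilities for all alternatives (an assumption that yields the 50/50 split at T1):

```python
import math


def mnl_probabilities(utilities):
    """utilities: dict mapping alternative name -> representative utility V."""
    total = sum(math.exp(v) for v in utilities.values())
    return {name: math.exp(v) / total for name, v in utilities.items()}


t1 = mnl_probabilities({"blue bus": 0.0, "car": 0.0})
t2 = mnl_probabilities({"blue bus": 0.0, "car": 0.0, "red bus": 0.0})

# IIA: the blue bus / car ratio is the same before and after the red bus enters.
ratio_t1 = t1["blue bus"] / t1["car"]
ratio_t2 = t2["blue bus"] / t2["car"]
```

Intuitively the red bus should mainly take share from the blue bus, leaving the car near 50%; the MNL model instead takes share proportionally from both.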

74
2. Multinomial Logit Model
  • Independence of irrelevant alternatives
  • New alternatives, or alternatives for which
    utility has increased, draw proportionally from
    all other alternatives
  • Elasticity of $P_{ni}$ with respect to variable $x_{nj}$
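
The elasticity formula is missing from the transcript. As a hedged sketch, the code below assumes the standard logit cross-elasticity $-\beta \, x_{nj} \, P_{nj}$ for $i \neq j$ and checks it numerically on made-up values; the key point is that it is identical for every alternative i.

```python
import math


def mnl_probabilities(beta, x):
    """Conditional logit with one attribute x_j per alternative and a common beta."""
    exps = [math.exp(beta * x_j) for x_j in x]
    total = sum(exps)
    return [e / total for e in exps]


def cross_elasticity_numeric(beta, x, i, j, h=1e-6):
    """Numerical elasticity of P_i with respect to x_j (central difference)."""
    x_up = list(x)
    x_up[j] += h
    x_down = list(x)
    x_down[j] -= h
    dp = (mnl_probabilities(beta, x_up)[i] - mnl_probabilities(beta, x_down)[i]) / (2 * h)
    return dp * x[j] / mnl_probabilities(beta, x)[i]


beta = -0.7                      # hypothetical price coefficient
x = [2.0, 3.0, 4.0]              # hypothetical prices of three alternatives
p = mnl_probabilities(beta, x)

analytic = -beta * x[2] * p[2]                        # same value for i = 0 and i = 1
numeric_0 = cross_elasticity_numeric(beta, x, 0, 2)
numeric_1 = cross_elasticity_numeric(beta, x, 1, 2)
```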

75
2. Multinomial Logit Model
  • Independence of irrelevant alternatives
  • Hausman-McFadden specification test

Basic idea: if a subset of the choice set is
truly irrelevant, omitting it should not
significantly affect the estimates.
76
2. Multinomial Logit Model
  • Independence of irrelevant alternatives
  • Hausman-McFadden specification test
  • Procedure
  • - Estimate logit model twice
  • a. on full set of alternatives
  • b. on subset of alternatives
  • (and subsample with choices from this set)
  • - When IIA is true,
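
The test statistic that completed this slide did not survive in the transcript. A hedged reconstruction, assuming the slide showed the standard Hausman-McFadden form: $\hat\beta_a$ and $\hat\beta_b$ are the estimates from the full set (a) and the subset (b), $\hat V_a$ and $\hat V_b$ their estimated covariance matrices, and k the number of parameters compared (this notation is an assumption).

```latex
HM = (\hat\beta_b - \hat\beta_a)' \, [\hat V_b - \hat V_a]^{-1} \, (\hat\beta_b - \hat\beta_a) \;\sim\; \chi^2(k)
```

A large value of the statistic indicates that dropping alternatives changes the estimates, which is evidence against IIA.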


77
2. Multinomial Logit Model
  • Independence of irrelevant alternatives
  • Alternative Procedure
  • - Estimate logit model twice
  • a. on full set of alternatives
  • b. on subset of alternatives
  • (and subsample with choices from this set)
  • - Compute LL for subset b with parameters
    obtained for set a
  • - Compare with $LL_b$: GoF should be similar


78
2. Multinomial Logit Model
  • Solutions to IIA
  • Model with attribute-specific constants
    (intrinsic preferences)
  • Nested Logit Model
  • Models that allow for correlation among the error
    terms, such as Probit Models
