Title: Limited Dependent Variable Models
1. Limited Dependent Variable Models
- Course: Applied Econometrics
- Lecturer: Zhigang Li
2. Limited Dependent Variable Models
- Examples
- Discrete dependent variable models
- Binary dependent variable models
- Corner solution response models
- Censored and truncated variable models
- Count variable (nonnegative integer values) models
3. Linear Probability Model (LPM) (Section 7.5)
- $y = \beta X + u$,
- where $y$ is a binary variable: one for success and zero for failure.
- In this model, $\beta_j$ measures the change in the probability of success when $x_j$ changes by one unit, holding the other regressors fixed.
- Shortcomings
- Predictions (probability of success) can be less than zero or greater than one.
- Probability of success is linearly related to the independent variables for all values.
- Heteroskedasticity must be present, because $\mathrm{Var}(y \mid X) = P(y = 1 \mid X)[1 - P(y = 1 \mid X)]$ depends on $X$.
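The first and third shortcomings can be seen in a few lines of arithmetic. The sketch below uses made-up coefficients (`beta0` and `beta1` are assumptions, not estimates from any data set) to show fitted probabilities leaving the unit interval and the error variance p(1 - p) changing with x.

```python
# Hypothetical LPM coefficients, chosen only for illustration.
beta0, beta1 = 0.2, 0.15

def lpm_probability(x):
    """Fitted 'probability' of success from the linear probability model."""
    return beta0 + beta1 * x

def lpm_error_variance(x):
    """Var(u | x) = p(x) * (1 - p(x)) for a binary outcome."""
    p = lpm_probability(x)
    return p * (1.0 - p)

for x in (-2.0, 0.0, 2.0, 6.0):
    # Fitted values below 0 or above 1 are not valid probabilities,
    # and the implied variance is then negative.
    print(x, lpm_probability(x), lpm_error_variance(x))
```

With these assumed values the fitted probability is negative at x = -2 and above one at x = 6, and the variance differs across the in-range points, so the error cannot be homoskedastic.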
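4. Binary Response Models
- A latent variable model
- $Y = 1$ if $Y^* = \beta X + e > 0$
- $Y = 0$ if $Y^* = \beta X + e \le 0$
- This implies
- $P(Y = 1 \mid X) = G(\beta X)$, where $G$ is the CDF of $e$ (assumed symmetric about zero).
- Logit Model: $e$ follows a logistic distribution
- $P(Y = 1 \mid X) = e^{\beta X}/(1 + e^{\beta X})$
- Probit Model: $e$ follows a standard normal distribution
- $P(Y = 1 \mid X) = \int_{-\infty}^{\beta X} \phi(v)\,dv$
- The magnitude of $\beta$ is not meaningful by itself because the latent variable $Y^*$ does not have a well-defined unit of measurement. Nevertheless, we can measure the effect of $X$ on the probability that $Y$ equals one.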
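The two response probabilities can be evaluated directly. This is a minimal sketch using only the standard library; the coefficient values in `beta` are hypothetical, and the probit CDF is computed from `math.erf`.

```python
import math

def logit_probability(index):
    """P(Y = 1 | X) when e is logistic: exp(index) / (1 + exp(index))."""
    return math.exp(index) / (1.0 + math.exp(index))

def probit_probability(index):
    """P(Y = 1 | X) when e is standard normal: the normal CDF at the index."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

# Hypothetical coefficients (intercept first) and regressor values.
beta = [-1.0, 0.8]
for x1 in (-2.0, 0.0, 1.25, 4.0):
    index = beta[0] + beta[1] * x1
    print(x1, round(logit_probability(index), 4), round(probit_probability(index), 4))
```

Both functions stay strictly between zero and one and equal one half when the index is zero, which is the point of replacing the linear specification with G.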
5. Binary Response Models: Interpretation I
- The partial effect of (continuous) $x_j$ is
- $\partial p(X)/\partial x_j = G'(\beta X)\beta_j = g(\beta X)\beta_j$
- where $g(\cdot)$ is the density function of $e$.
- Implications
- The effect of $x_j$ depends on the value of $X$.
- The relative effect of $x_i$ and $x_j$ is fixed at $\beta_i/\beta_j$, whatever the value of $X$.
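Both implications can be checked numerically. The sketch below assumes two hypothetical slopes and evaluates the logit partial effects at two different points; `partial_effects` is an illustrative helper, not a library function.

```python
import math

def logistic_density(z):
    """Density of the logistic distribution, g(z) = G(z) * (1 - G(z))."""
    p = 1.0 / (1.0 + math.exp(-z))
    return p * (1.0 - p)

def partial_effects(beta, x, density):
    """Return g(beta'x) * beta_j for each slope beta_j."""
    index = sum(b * xi for b, xi in zip(beta, x))
    scale = density(index)
    return [scale * b for b in beta]

beta = [0.5, -1.0]  # hypothetical slopes, no intercept
at_origin = partial_effects(beta, [0.0, 0.0], logistic_density)
elsewhere = partial_effects(beta, [2.0, -1.0], logistic_density)

print(at_origin, elsewhere)
# The ratio of the two effects is the same at both points.
print(at_origin[0] / at_origin[1], elsewhere[0] / elsewhere[1])
```

The individual effects shrink as the index moves away from zero, while their ratio stays at the ratio of the coefficients.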
6. Binary Response Models: Interpretation II
- Probit: $g(0) \approx .4$
- Logit: $g(0) = .25$
- Linear probability model: $g(0) = 1$
- To make the logit and probit slope estimates comparable, we can multiply the probit estimates by $.4/.25 = 1.6$.
- The logit slope estimates should be divided by 4 to make them roughly comparable to the LPM (Linear Probability Model) estimates.
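The scale factors quoted above follow from the two densities evaluated at zero. A small sketch, with the function names `probit_to_logit_scale` and `logit_to_lpm_scale` introduced here purely for illustration:

```python
import math

# Standard normal density at zero: 1 / sqrt(2 * pi), roughly .4.
probit_g0 = 1.0 / math.sqrt(2.0 * math.pi)
# Logistic density at zero: G(0) * (1 - G(0)) = .5 * .5 = .25.
logit_g0 = 0.5 * (1.0 - 0.5)

def probit_to_logit_scale(probit_slope):
    """Rescale a probit slope so it is roughly comparable to a logit slope."""
    return probit_slope * probit_g0 / logit_g0

def logit_to_lpm_scale(logit_slope):
    """Rescale a logit slope so it is roughly comparable to an LPM slope."""
    return logit_slope * logit_g0

print(probit_g0, logit_g0, probit_g0 / logit_g0)
```

The exact ratio is slightly below the rounded 1.6 because the normal density at zero is a little under .4. These are rules of thumb that compare effects near an index of zero, not exact conversions.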
7. Binary Response Models: Evaluation
- A rough measure of the performance of binary models is the percent correctly predicted, i.e. the percentage of times the predicted $y_i$ matches the actual $y_i$ (the prediction is typically set to one when the fitted probability is at least .5).
- It is important to report the percentage correctly predicted for each outcome (0 and 1), not only the overall figure.
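A short sketch of the measure, assuming the usual .5 cutoff for turning a fitted probability into a predicted outcome. The data are invented to show why the per-outcome figures matter.

```python
def percent_correctly_predicted(y_actual, p_hat, threshold=0.5):
    """Percent correctly predicted for outcome 0, outcome 1, and overall."""
    preds = [1 if p >= threshold else 0 for p in p_hat]
    result = {}
    for outcome in (0, 1):
        idx = [i for i, y in enumerate(y_actual) if y == outcome]
        correct = sum(1 for i in idx if preds[i] == outcome)
        result[outcome] = 100.0 * correct / len(idx) if idx else float("nan")
    hits = sum(1 for p, y in zip(preds, y_actual) if p == y)
    result["overall"] = 100.0 * hits / len(y_actual)
    return result

# Invented example: mostly zeros, so predicting zero is usually "right".
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
p = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2, 0.3, 0.6]
scores = percent_correctly_predicted(y, p)
print(scores)
```

In this example the overall figure is high even though only half of the successes are predicted correctly, which is the situation the per-outcome breakdown is meant to expose.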
8. Tobit Model for Corner Solution Responses
- Corner Solution Response: A variable is zero for a nontrivial fraction of the population but is roughly continuously distributed over positive values.
- E.g., monthly earnings
- A linear model is conceptually wrong because it can predict negative values for the dependent variable.
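9. A Tobit Model
- Latent variable: $y^* = \beta X + u$, $u \mid x \sim \text{Normal}(0, \sigma^2)$.
- Observed response: $y = \max(0, y^*)$
- Likelihood of $y_i$
- $y_i > 0$: $\frac{1}{\sigma}\phi\left(\frac{y_i - \beta X_i}{\sigma}\right)$
- $y_i = 0$: $P(y_i^* < 0 \mid X_i) = 1 - \Phi(\beta X_i/\sigma)$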
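The two pieces of the likelihood combine into a log-likelihood as sketched below. The index values passed in are placeholders for the fitted linear index; no estimation is performed here.

```python
import math

def normal_pdf(c):
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def normal_cdf(c):
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def tobit_loglik_i(y, xb, sigma):
    """Log-likelihood contribution of one observation with index xb."""
    if y > 0:
        # Continuous part: normal density of the latent variable at y.
        return math.log(normal_pdf((y - xb) / sigma) / sigma)
    # Corner: probability that the latent variable is below zero.
    return math.log(1.0 - normal_cdf(xb / sigma))

def tobit_loglik(ys, xbs, sigma):
    """Sum of contributions over the sample."""
    return sum(tobit_loglik_i(y, xb, sigma) for y, xb in zip(ys, xbs))

# Hypothetical observations and index values.
print(tobit_loglik([0.0, 1.2, 0.0, 2.5], [-0.3, 0.9, 0.1, 2.0], sigma=1.0))
```

A Tobit MLE would maximize this function over the coefficients and sigma; the optimizer is left out of the sketch.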
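10. What if OLS is used?
- Conditional expectation $E(y \mid y > 0, x)$
- $E(y \mid y > 0, x) = \beta X + \sigma\lambda(\beta X/\sigma)$
- where $\lambda(c) = \phi(c)/\Phi(c)$ is called the inverse Mills ratio.
- Unconditional expectation $E(y \mid x)$
- $E(y \mid x) = P(y > 0 \mid x)\,E(y \mid y > 0, x) = \Phi(\beta X/\sigma)[\beta X + \sigma\lambda(\beta X/\sigma)]$, which is a nonlinear function of $x$ and $\beta$.
- Simple OLS cannot consistently estimate $\beta$ in either case, whether it is run on the subsample with $y > 0$ or on the full sample.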
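The two expectations are easy to tabulate. The sketch below evaluates them at a few hypothetical index values with sigma fixed at one, to show how far both are from the linear index itself.

```python
import math

def normal_pdf(c):
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def normal_cdf(c):
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def inverse_mills(c):
    """lambda(c) = phi(c) / Phi(c)."""
    return normal_pdf(c) / normal_cdf(c)

def mean_given_positive(xb, sigma):
    """E(y | y > 0, x) = xb + sigma * lambda(xb / sigma)."""
    return xb + sigma * inverse_mills(xb / sigma)

def unconditional_mean(xb, sigma):
    """E(y | x) = Phi(xb / sigma) * E(y | y > 0, x)."""
    return normal_cdf(xb / sigma) * mean_given_positive(xb, sigma)

sigma = 1.0
for xb in (-1.0, 0.0, 1.0, 3.0):
    print(xb, mean_given_positive(xb, sigma), unconditional_mean(xb, sigma))
```

Both means stay positive even when the index is negative and only approach the index when it is large relative to sigma, so neither is a linear function of the regressors.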
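11. What if OLS is used (continued)?
- The partial effects of $x_j$ on $E(y \mid y > 0, x)$ and $E(y \mid x)$ have the same sign as the coefficient $\beta_j$, but the magnitude depends on the values of all explanatory variables and parameters.
- $\partial E(y \mid y > 0, x)/\partial x_j = \beta_j\{1 - \lambda(\beta X/\sigma)[\beta X/\sigma + \lambda(\beta X/\sigma)]\}$
- $\partial E(y \mid x)/\partial x_j = \beta_j\Phi(\beta X/\sigma)$
- To make the Tobit coefficient comparable to OLS estimates, we must multiply the Tobit estimate by the adjustment factor $\Phi(\beta X/\sigma)$, evaluated at chosen values of $X$ (for example, the sample means).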
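The partial-effect formulas can be checked against numerical derivatives of the two conditional means. This is a sketch with a hypothetical coefficient and index value; the finite-difference step `h` is an arbitrary choice.

```python
import math

def normal_pdf(c):
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def normal_cdf(c):
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def inverse_mills(c):
    return normal_pdf(c) / normal_cdf(c)

def mean_given_positive(xb, sigma):
    return xb + sigma * inverse_mills(xb / sigma)

def unconditional_mean(xb, sigma):
    return normal_cdf(xb / sigma) * mean_given_positive(xb, sigma)

def pe_conditional(beta_j, xb, sigma):
    """Partial effect of x_j on E(y | y > 0, x)."""
    c = xb / sigma
    lam = inverse_mills(c)
    return beta_j * (1.0 - lam * (c + lam))

def pe_unconditional(beta_j, xb, sigma):
    """Partial effect of x_j on E(y | x): beta_j times the adjustment factor."""
    return beta_j * normal_cdf(xb / sigma)

# Hypothetical values; central differences with respect to the index.
beta_j, xb, sigma, h = 0.7, 0.4, 1.5, 1e-5
num_cond = beta_j * (mean_given_positive(xb + h, sigma) - mean_given_positive(xb - h, sigma)) / (2 * h)
num_uncond = beta_j * (unconditional_mean(xb + h, sigma) - unconditional_mean(xb - h, sigma)) / (2 * h)
print(pe_conditional(beta_j, xb, sigma), num_cond)
print(pe_unconditional(beta_j, xb, sigma), num_uncond)
```

Both scale factors lie between zero and one, so each partial effect is smaller in magnitude than the Tobit coefficient itself.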
12. A Tobit Model: Specification Issues
- The Tobit model relies crucially on normality and homoskedasticity. If either assumption fails, it is hard to know what the Tobit MLE is estimating.
- Nevertheless, for moderate departures from the assumptions, the Tobit model may still provide good estimates.
- In a Tobit model, $x_j$ affects the selection decision, $P(y > 0 \mid x)$, and the magnitude decision, $E(y \mid y > 0, x)$, in the same direction, because both partial effects carry the sign of $\beta_j$. This restriction may be unrealistic and can be tested. (See p. 573.)
- This problem may be addressed with two-part models, in which $P(y > 0 \mid x)$ and $E(y \mid y > 0, x)$ depend on different parameters.
13. Censored Regression Model I
- $y = \beta X + u$, $u \mid x, c \sim \text{Normal}(0, \sigma^2)$
- $w = \min(y, c)$
- Note that $u$ is independent of $c$.
- With censored data, OLS of $w$ on $X$ is simply wrong: censoring acts like a nonrandom measurement error in the dependent variable, which creates endogeneity.
- With corner solution data, by contrast, OLS is right on average.
14. Censored Regression Model II
- An OLS regression using only the uncensored observations produces inconsistent estimators of $\beta$.
- If there is heteroskedasticity or nonnormality, the MLEs are generally inconsistent.
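The OLS problems with censored data can be illustrated by simulation. The sketch below assumes a single regressor, a made-up data-generating process, and censoring from above at a fixed threshold `c`. It compares the OLS slope on the latent variable, on the censored variable, and on the uncensored observations only.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs (with intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

rng = random.Random(0)
n = 20_000
beta0, beta1, c = 1.0, 2.0, 3.0  # hypothetical parameters and threshold

xs = [rng.uniform(0.0, 2.0) for _ in range(n)]
ys = [beta0 + beta1 * x + rng.gauss(0.0, 1.0) for x in xs]  # latent y
ws = [min(y, c) for y in ys]                                # observed w

slope_latent = ols_slope(xs, ys)
slope_censored = ols_slope(xs, ws)
kept = [(x, y) for x, y in zip(xs, ys) if y < c]
slope_uncensored_only = ols_slope([x for x, _ in kept], [y for _, y in kept])

print(slope_latent, slope_censored, slope_uncensored_only)
```

Under these assumptions the slope on the latent variable is close to the true value of 2, while both feasible regressions give a visibly flatter line.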
15. Truncated Regression Models
- In a truncated regression model, we do not observe any information about a certain segment of the population (so the sample is selected nonrandomly on the dependent variable). In a censored regression model, we still have some information on censored observations.
- OLS tends to flatten the estimated line relative to the true regression line in the whole population.
- Likelihood of $y_i$ is $f(y_i \mid x_i\beta)/F(c \mid x_i\beta)$ for $y_i \le c$, where $f$ and $F$ are the normal density and CDF of $y$ given $x$ and $c$ is the truncation point.
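The truncated likelihood is the usual normal density rescaled by the probability of being in the sample. A minimal sketch, assuming truncation from above at `c` and hypothetical parameter values:

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def truncated_density(y, xb, sigma, c):
    """Density of y given x when only observations with y <= c are sampled."""
    if y > c:
        return 0.0
    f = normal_pdf((y - xb) / sigma) / sigma   # f(y | xb)
    big_f = normal_cdf((c - xb) / sigma)       # F(c | xb)
    return f / big_f

# Hypothetical index, scale, and truncation point.
xb, sigma, c = 0.0, 1.0, 0.5
step = 0.001
lower = xb - 10.0 * sigma
n_cells = int(round((c - lower) / step))
# Midpoint-rule check that the density integrates to one below c.
total = sum(truncated_density(lower + (k + 0.5) * step, xb, sigma, c) * step
            for k in range(n_cells))
print(total)
```

Dividing by F(c | xb) is what makes the density integrate to one over the observable range; dropping that term would reproduce the flattening problem of OLS on the truncated sample.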
16. Poisson Regression Model
- Dependent variable is a count variable, which takes on nonnegative integer values 0, 1, 2, ...
- Likelihood of $y_i$
- $P(y = h \mid x) = \exp[-\exp(\beta x)]\,[\exp(\beta x)]^h / h!$
- As with the probit, logit, and Tobit models, we cannot directly compare the magnitudes of the Poisson estimates of an exponential function with the OLS estimates of a linear model. Some rough comparison is possible after some adjustment (see Section 17.3).
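The Poisson probabilities can be computed directly from the index. The sketch below uses a hypothetical index value and sums the probabilities over a finite range of counts to recover the mean and variance.

```python
import math

def poisson_probability(h, xb):
    """P(y = h | x) with conditional mean exp(xb)."""
    mu = math.exp(xb)
    return math.exp(-mu) * mu ** h / math.factorial(h)

xb = 0.5                    # hypothetical value of the linear index
mu = math.exp(xb)
counts = range(0, 60)       # the tail beyond 59 is negligible for this mean
total = sum(poisson_probability(h, xb) for h in counts)
mean = sum(h * poisson_probability(h, xb) for h in counts)
variance = sum((h - mean) ** 2 * poisson_probability(h, xb) for h in counts)

print(total, mean, variance, mu)
```

The mean and the variance both equal exp(xb). This equality is the restriction that makes the Poisson assumption strong in applied work.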
17. Issues with the Poisson Model
- The Poisson distribution may be too strict an assumption for the distribution of $y$ given $x$.
- All moments of the Poisson distribution are determined by the mean; in particular, the variance equals the mean.
- Fortunately, whether or not the Poisson distribution holds, we still get consistent, asymptotically normal estimates of $\beta$, as long as the conditional mean $E(y \mid x) = \exp(\beta x)$ is correctly specified.
- The estimated standard errors, however, may be inconsistent and need to be adjusted. (p. 576)
18. Nonrandom Sample Selection (in dependent variables)
- Truncated regression is a special case of nonrandom sample selection.
- Incidental Truncation
- We do not observe $y$ because of the outcome of another variable.
- For example, wage offers are observed only for those who are working. Labor force participation may be affected by unobserved variables that also affect the wage offer. This would produce biased estimates in the wage offer equation.
19. Consistency of OLS with Selected Sample
- If sample selection is entirely random, then OLS estimates are unbiased.
- If sample selection depends on the explanatory variables and additional random terms that are independent of $x$ and $u$, then OLS is also consistent.
- If sample selection is correlated with the error term, then OLS is inconsistent. This is the case with:
- Truncated data
- Incidental truncation
20. Modeling Incidental Truncation
- Population model: $y = X\beta + u$, $X$ exogenous
- Incidental truncation
- $sy = sX\beta + su$
- $s = 1[Z\gamma + v > 0]$, $Z$ exogenous
- $s = 1$ if $y$ is observed, 0 otherwise
- Correlation between $u$ and $v$ generally causes a sample selection (endogeneity) problem.
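21. Consistency of Incidental Truncation Model
- What are we estimating with incidentally truncated data?
- $E(y \mid Z, s = 1) = X\beta + \rho\lambda(Z\gamma)$
- $\rho = 0$ when $u$ and $v$ are uncorrelated.
- $\lambda(\cdot)$ is the inverse Mills ratio
- Since $Z$ may include $X$, the estimate of $\beta$ may be biased if the term $\rho\lambda(Z\gamma)$ is omitted from the OLS regression.
- Solution: Heckit method
- Estimate $\gamma$ by a probit of $s$ on $Z$ using the full sample, calculate $\lambda(Z\gamma)$, and include it in the OLS regression on the selected sample.
- $X$ should be a strict subset of $Z$: $Z$ should contain at least one variable that is not in $X$. Otherwise, $\lambda(Z\gamma)$ can be highly collinear with $X$ and a multicollinearity problem may result.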
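The correction term can be checked by simulation. The sketch below is not a Heckit estimator; it only verifies that the mean of u among selected observations equals rho times the inverse Mills ratio. It assumes v is standard normal and E(u | v) = rho * v, and the values of `z_gamma` and `rho` are hypothetical.

```python
import math
import random

def normal_pdf(c):
    return math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)

def normal_cdf(c):
    return 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))

def inverse_mills(c):
    return normal_pdf(c) / normal_cdf(c)

def selected_error_mean(z_gamma, rho, n=200_000, seed=0):
    """Average of u over draws with s = 1, i.e. z_gamma + v > 0."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)
        # u is built so that E(u | v) = rho * v and Var(u) = 1.
        u = rho * v + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        if z_gamma + v > 0:
            total += u
            count += 1
    return total / count

z_gamma, rho = 0.3, 0.6  # hypothetical selection index and correlation
simulated = selected_error_mean(z_gamma, rho)
theoretical = rho * inverse_mills(z_gamma)
print(simulated, theoretical)
```

Because the selected-sample error mean is nonzero and varies with the selection index, leaving the term out acts like an omitted variable. The second Heckit step adds it back as a regressor.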