1
Estimation taking account of sample selection
with Stata

Cheti Nicoletti
ISER, University of Essex
2009

2
Estimation commands

truncreg, tobit, heckman, heckprobit, treatreg, ivreg
Other useful commands: ivprobit, ivtobit
Useful option in the estimation commands: pweights

3
truncreg

The truncreg command estimates regression models
from a truncated sample, i.e. a sample from which
observations are missing entirely whenever y falls
outside the observed range.
Ex.: health insurance claims observed only when the
amount claimed is higher than a fixed threshold c.
truncreg y x1 x2 xk, ll(c)
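
A minimal sketch of the truncation case on simulated data. The data-generating process, the variable names and the threshold 0.5 are assumptions made for illustration; they are not from the presentation.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 2000
generate x1 = rnormal()
generate x2 = rnormal()
generate y  = 1 + 0.5*x1 - 0.3*x2 + rnormal()

* Truncation: units with y <= 0.5 never enter the sample
drop if y <= 0.5

* OLS ignores the truncation; truncreg models it through ll()
regress  y x1 x2
truncreg y x1 x2, ll(0.5)
```

Comparing the two sets of estimates with the coefficients used in the simulation (0.5 and -0.3) gives a rough idea of how much the truncation matters.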

4
tobit

The tobit command estimates regression models with
a censored dependent variable (deterministic
censoring)
Three different types of models:
Tobit with fixed censoring value (tobit)
Censored regression with varying censoring value
(cnreg)
Regression with interval data (intreg)

5
tobit

Tobit of the first type (ex.: consumption of a good)
Left-censoring at 0:
tobit y x1 x2 xk, ll(0)
Right-censoring at a fixed value c:
tobit y x1 x2 xk, ul(c)
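
A minimal sketch of left-censoring at zero on simulated data. The latent variable ystar and all coefficients are assumptions made for illustration only.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 2000
generate x1 = rnormal()
generate x2 = rnormal()

* Latent outcome; what we observe is censored from below at 0
generate ystar = 0.5 + x1 - 0.5*x2 + rnormal()
generate y     = max(ystar, 0)

* Share of observations piled up at the censoring point
count if y == 0

* Tobit with a fixed lower censoring value
tobit y x1 x2, ll(0)
```

Unlike the truncated case, censored observations stay in the sample: their regressors are observed and only y is recorded at the limit.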

6
cnreg

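Tobit of the first type with a censoring value
that varies across observations
Ex.: minimum wage with different levels in
different years
cnreg y x1 x2 xk, censored(d)
where d is 0 for uncensored, -1 for left-censored
and 1 for right-censored observations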
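
A minimal sketch of varying censoring points on simulated data; the years, minimum-wage levels and coefficients are hypothetical. As far as I am aware, cnreg has not been an official command since Stata 11, so the sketch also fits the same model through intreg, which should be the safer route on recent releases.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 2000
generate x1   = rnormal()
generate year = 2000 + mod(_n, 3)

* Hypothetical (log) minimum wage that differs by year
generate minw  = 1.0 + 0.1*(year - 2000)
generate ystar = 1.2 + 0.4*x1 + 0.5*rnormal()
generate y     = max(ystar, minw)

* d = -1 for left-censored observations, 0 for uncensored ones
generate d = cond(ystar <= minw, -1, 0)
cnreg y x1, censored(d)

* Same model via intreg: missing lower bound marks left-censoring
generate y1 = cond(d == -1, ., y)
generate y2 = y
intreg y1 y2 x1
```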

7
intreg

Interval data regression (ex.: bracket information
on income for people refusing to give the exact
value)
When y_i is not declared we observe the range to
which y_i belongs:
(0, 5000], (5000, 15000], (15000, 30000],
(30000, ∞), say (a_i, b_i]

8
Estimating the regression with interval data in
Stata

The command intreg needs two variables to define
the dependent variable, say y1 and y2
intreg y1 y2 x1 x2 xk

Individuals giving                                     y1      y2
An exact value of their income (ex.: y = 5980)         5980    5980
A range (a, b] for their income (ex.: (5000, 15000])   5000    15000
An open-ended range (ex.: (30000, ∞))                  30000   .

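
A minimal sketch of how the two bound variables could be built before calling intreg. The income process and the 30% share of bracket-only respondents are assumptions for illustration; the brackets are the ones used in the example above.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 2000
generate x1     = rnormal()
generate income = exp(9.5 + 0.5*x1 + 0.6*rnormal())

* Assume about 30% of respondents report only a bracket
generate byte bracket_only = runiform() < 0.3

* Exact reporters: lower bound = upper bound = reported value
generate y1 = income
generate y2 = income

* Bracket reporters: replace the bounds by the bracket limits
replace y1 = 0     if bracket_only & income <= 5000
replace y2 = 5000  if bracket_only & income <= 5000
replace y1 = 5000  if bracket_only & income > 5000  & income <= 15000
replace y2 = 15000 if bracket_only & income > 5000  & income <= 15000
replace y1 = 15000 if bracket_only & income > 15000 & income <= 30000
replace y2 = 30000 if bracket_only & income > 15000 & income <= 30000
replace y1 = 30000 if bracket_only & income > 30000
* Open-ended top bracket: upper bound set to missing
replace y2 = .     if bracket_only & income > 30000

intreg y1 y2 x1
```
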
9
heckman

The heckman command estimates the generalized
Tobit (Tobit of the second type) by ML (the
default) or by two-step estimation (option
twostep)
If y is missing for the non-selected observations:
heckman y x1 x2 xk, select(z1 z2 zs)
If the selection dummy d is given explicitly:
heckman y x1 x2 xk, select(d = z1 z2 zs)
Two-step estimation:
heckman y x1 x2 xk, select(z1 z2 zs) twostep
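
A minimal sketch of the three calls on simulated data. The correlated errors u and e, the excluded variable z1 and all coefficients are assumptions made for illustration.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 5000
generate x1 = rnormal()
generate z1 = rnormal()

* Errors of the selection (u) and outcome (e) equations, correlation 0.5
generate u = rnormal()
generate e = 0.5*u + sqrt(1 - 0.5^2)*rnormal()

* Selection rule; y is observed only for selected units
generate d = (0.5 + x1 + z1 + u > 0)
generate y = 1 + 2*x1 + e if d == 1

* ML, selection inferred from missing y
heckman y x1, select(x1 z1)
* ML, explicit selection dummy
heckman y x1, select(d = x1 z1)
* Two-step estimator
heckman y x1, select(d = x1 z1) twostep
```

Here z1 enters only the selection equation, so identification does not rest on functional form alone.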

10
heckprobit

The heckprobit command estimates a probit model
with sample selection (there is no twostep option
because the two-step estimator would be
inconsistent)
heckprobit p x1 x2 xk, select(z1 z2 zs)
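
A minimal sketch of a probit with selection on simulated data. The binary outcome p, the selection dummy d and all coefficients are assumptions made for illustration.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 5000
generate x1 = rnormal()
generate z1 = rnormal()

* Correlated errors of the selection (u) and outcome (e) equations
generate u = rnormal()
generate e = 0.5*u + sqrt(1 - 0.5^2)*rnormal()

* Selection dummy and binary outcome observed only when d == 1
generate d = (0.5 + x1 + z1 + u > 0)
generate p = (0.3 + x1 + e > 0) if d == 1

* Probit with sample selection, estimated by ML
heckprobit p x1, select(d = x1 z1)
```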

11
Impact of an endogenous dummy: homogeneous
treatment effect

y_1: earnings for trained people
y_0: earnings for non-trained people
d: dummy indicating participation in the training
program
$y = y_1 d + y_0 (1 - d)$
$y = x\beta + \alpha d + \varepsilon$
$d^* = z\gamma + u$, where $d = 1(d^* > 0)$
We have a selection problem because of the
correlation between $u$ and $\varepsilon$. This
implies that $d$ is not independent of
$\varepsilon$.

12
treatreg

The treatreg command is used to evaluate the
effect of an endogenous binary variable
(treatment, program, ...) on a continuous variable
of interest (see previous slide).
treatreg y x1 x2 xk, treat(d = z1 z2 zs)
Ex.: sample of graduates with and without
a master's degree
y = log earnings; d = 1 if master's degree, 0 otherwise
x: age, age squared, d, sex, type of first degree
z: mother's level of education, father's level
of education, sex, type of first degree
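
A minimal sketch of the endogenous-dummy model on simulated data. The treatment effect of 0.8, the excluded variable z1 and the error structure are assumptions made for illustration. As far as I am aware, treatreg was renamed etregress in Stata 13, so on recent releases the last command may need the newer name.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 5000
generate x1 = rnormal()
generate z1 = rnormal()

* Correlated errors make the treatment dummy endogenous
generate u = rnormal()
generate e = 0.5*u + sqrt(1 - 0.5^2)*rnormal()
generate d = (0.2 + 0.5*x1 + z1 + u > 0)
generate y = 1 + x1 + 0.8*d + e

* OLS treats d as exogenous
regress y x1 d

* Treatment-effects model with endogenous d
* (on Stata 13 or later: etregress y x1, treat(d = x1 z1))
treatreg y x1, treat(d = x1 z1)
```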

13
How to use weights in Stata

Most Stata commands can deal with weighted data.
Stata allows four kinds of weights:
fweights, or frequency weights, are weights that
indicate the number of duplicated observations.
pweights, or sampling weights, are weights that
denote the inverse of the probability that the
observation is included due to the sampling
design and/or nonresponse.
aweights, or analytic weights, are weights that
are inversely proportional to the variance of an
observation, i.e., the variance of the j-th
observation is assumed to be sigma^2/w_j, where
w_j are the weights.
iweights, or importance weights, are weights that
indicate the "importance" of the observation in
some vague sense.

14
Option pweights

Sample surveys usually provide weights to take
account of sampling design and nonresponse.
Let p be the individual weight.
Then we can run a regression with weighted
observations:
regress y x1 x2 xk [pweight=p]
Suppose we have a sample with a sample
selection problem (due to observables); then we
can use propensity score weighting.
A possible simplified way to estimate your own
weights is the following, where d = 1 if the
observation is selected:
probit d z1 z2 zs
predict prop
gen invprop = 1/prop
reg y x1 x2 xk [pweight=invprop]
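
A minimal sketch of inverse propensity score weighting on simulated data, following the four steps above. The response process (selection depends only on the observed z1) and all coefficients are assumptions made for illustration.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 5000
generate x1 = rnormal()
generate z1 = rnormal()

* Selection on observables: d depends on z1 and independent noise
generate d = (0.5 + z1 + rnormal() > 0)
generate y = 1 + x1 + z1 + rnormal() if d == 1

* Step 1: propensity score (pr is the default prediction after probit)
probit d z1
predict prop, pr

* Step 2: inverse-probability weights
generate invprop = 1/prop

* Step 3: unweighted versus weighted regression on the selected sample
regress y x1 if d == 1
regress y x1 [pweight=invprop] if d == 1
```

In this setup z1 is omitted from the outcome regression, so the unweighted constant picks up the selected sample's higher average z1, while the weighted constant should be closer to the simulated value of 1.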

15
For complex survey designs it is better to use

svyset [pweight=p]
svy: regress y x1 x2 xk
svyset has options for cluster sampling designs
and other complex designs
Ex.: declare a stratified survey design for the dataset
svyset [pweight=p], strata(stratid)
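
A minimal sketch of declaring a stratified cluster design and running a survey regression. The design variables stratid and psuid, the weight p and the data are all hypothetical.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 1000

* 4 strata of 250 units, each containing 10 clusters (PSUs) of 25 units
generate stratid = ceil(_n/250)
generate psuid   = ceil(_n/25)

* Hypothetical sampling weight and outcome
generate p  = 1 + 3*runiform()
generate x1 = rnormal()
generate y  = 1 + 0.5*x1 + rnormal()

* Declare the design: PSU, sampling weight and strata
svyset psuid [pweight=p], strata(stratid)
svydescribe

* Design-based regression
svy: regress y x1
```

Once the design is declared with svyset, every command prefixed with svy: uses it, so the weights do not have to be repeated in each estimation command.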

16
ivreg

The ivreg command estimates regression models
using instrumental variables for potentially
endogenous explanatory variables.
Ex.: evaluation of the impact of years of schooling
on earnings
$y = x\beta + \alpha d + \varepsilon$
Problem: $d$ and $\varepsilon$ are correlated
Solution 1: IV estimation (instruments z: parental
interest in the child's education, adverse financial
shock to the family when the child is aged 11-16,
presence of older siblings; Blundell et al. 2003)
ivreg y x1 x2 xk (d = z1 z2 zs)
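
A minimal sketch of IV estimation on simulated data. The instruments z1 and z2, the return to schooling of 0.1 and the error structure are assumptions made for illustration. Since Stata 10 the official command is ivregress, so the sketch shows that form next to the ivreg call used above.

```stata
* Hypothetical simulated data (not from the presentation)
clear
set seed 12345
set obs 5000
generate x1 = rnormal()
generate z1 = rnormal()
generate z2 = rnormal()

* u enters both schooling d and the outcome error e, so d is endogenous
generate u = rnormal()
generate e = 0.6*u + 0.8*rnormal()
generate d = 12 + 0.3*x1 + 0.5*z1 + 0.5*z2 + u
generate y = 1 + 0.5*x1 + 0.1*d + e

* OLS ignores the endogeneity of d
regress y x1 d

* IV with z1 and z2 as excluded instruments
ivreg y x1 (d = z1 z2)
ivregress 2sls y x1 (d = z1 z2)
```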

17
Stata programs for evaluation

Abadie A., Drukker D., Herr J.L., Imbens G.W.
(2001), Implementing Matching Estimators for
Average Treatment Effects in Stata, The Stata
Journal, 1, 1-18,
http://ksghome.harvard.edu/.aabadie.academic.ksg/software.html
Becker S.O., Ichino A. (2002), Estimation of
average treatment effects based on propensity
scores, The Stata Journal, 2, 358-377,
http://www.lrz-muenchen.de/sobecker/pscore.html
Sianesi B. (2001), Implementing Propensity Score
Matching Estimators with STATA, UK Stata Users
Group, VII Meeting London,
http://ideas.repec.org/c/boc/bocode/s432001.html

18
Text Book References

Amemiya T. (1985), Advanced Econometrics, Basil
Blackwell, Oxford.
Gourieroux C. (2000), Econometrics of
Qualitative Dependent Variables, Cambridge
University Press, Cambridge.
Greene W.H. (2000), Econometric Analysis, Third
edition, Prentice Hall, London.
Maddala G. S. (1983), Limited-Dependent and
Qualitative Variables in Econometrics, Cambridge
University Press, Cambridge.
Wooldridge J.M. (2002), Econometric Analysis of
Cross-Section and Panel Data, MIT Press.
Lee M. (2005), Micro-Econometrics for Policy,
Program and Treatment Effects, Advanced Texts in
Econometrics, Oxford University Press, Oxford.

19
Survey Articles

Angrist J. (2001), Estimation of
Limited-Dependent Variable Models with Binary
Endogenous Regressors: Simple Strategies for
Empirical Practice, Journal of Business and
Economic Statistics, 19, 2-28.
Angrist J.D., Krueger A.B. (1999), Empirical
strategies in labor economics, published as
working paper Princeton University, 401, and in
O. Ashenfelter and D. Card (eds.), Handbook of
Labor Economics, Volume 3A, Amsterdam, 1277-1366.
Blundell R., Costa-Dias M. (2002), Alternative
approaches to evaluation in empirical
microeconomics, published as IFS, Cemmap working
paper, 10, and in Portuguese Economic Journal,
Vol. 1, 91-115, 2002.
Blundell R., Powell J.L. (2001), Endogeneity in
nonparametric and semiparametric regression
models, IFS, Cemmap working paper, CWP09/01,
Chapter 8 in Advances in Economics and
Econometrics, M. Dewatripont, L. Hansen and S.
J. Turnovsky (eds.), Cambridge University Press,
ESM 36, pp. 312-357, 2003.
Heckman J.J., Ichimura H., Smith J.A., Todd P.
(1998), Characterization of Selection Bias Using
Experimental Data, Econometrica, 66, 1017-1098.
Heckman J.J., LaLonde R.J., Smith J.A. (2000),
The economics and econometrics of active labor
market programs, in O. Ashenfelter and D. Card,
(eds.), Handbook of Labor Economics, vol. 3,
North Holland, Amsterdam.
Moffitt R. (2004), An introduction to the
symposium of matching econometrics, Review of
Economics and Statistics, vol. 1, a collection
of articles on matching by various authors.
Vella F. (1998), Estimating models with sample
selection bias: a survey, The Journal of Human
Resources, vol. 33, 127-169.