Limited Dependent Variables - PowerPoint PPT Presentation

About This Presentation
Title:

Limited Dependent Variables

Description:

Ordinal. Nominal. Categorical and Limited Dependent Variables. Binary Variables. Ordinal Variables. Nominal Variables. Censored Variables. Count Variables ... – PowerPoint PPT presentation

Slides: 40
Provided by: vincen53

Transcript and Presenter's Notes

Title: Limited Dependent Variables


1
Limited Dependent Variables
  • Lecture 2
  • Main Reading Gujarati Chapter 15

2
The goals of today's lecture
  • To introduce binary choice models
  • To discuss the problems with using OLS to
    estimate binary choice models
  • To examine two binary models
      • Logit Model
      • Probit Model
  • To work through some examples

3
Introduction: Levels of Measurement
  • Cardinal
  • Ordinal
  • Nominal

4
Categorical and Limited Dependent Variables
  • Binary Variables
  • Ordinal Variables
  • Nominal Variables
  • Censored Variables
  • Count Variables

5
Introduction 2
  • Many economic variables are discrete (e.g.
    employment status, marital status) rather than
    continuous (e.g. income, government expenditure).
  • Discrete variables can be modeled by including a
    dummy variable in the regression
  • Example: a model of sexual discrimination

6
  • This model can be estimated using OLS. The
    presence of a discrete variable on the right-hand
    side of the equation makes no difference to the
    estimation procedure
  • The discrete variable can also appear on the
    left-hand side of the equation (i.e. as the
    dependent variable), as in the following labour
    force participation model

7
There are three ways to estimate a model of this
type:
  • Linear Probability Model (estimated by OLS)
  • Probit Model
  • Logit Model

8
Linear Probability Model (LPM)
  • Transport Decisions
  • Estimate by OLS

9
Linear Probability Model
  • A regression model with a dummy dependent
    variable is a linear probability model
  • To see its properties note the following
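  • Since the mean error is zero,
    $E(Y_i) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki}$
  • If $P_i = \mathrm{Prob}(Y_i = 1)$ and $1 - P_i = \mathrm{Prob}(Y_i = 0)$,
    then $E(Y_i) = 1 \cdot P_i + 0 \cdot (1 - P_i) = P_i$
  • The model is
    $P_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki}$
  • The estimated slope coefficients tell the impact
    of a unit change in that explanatory variable on
    the probability that $Y = 1$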

10
  • We interpret the fitted value $\hat{Y}_i$ as the
    predicted probability that person i drives to
    work, and b as the marginal probability
  • Note that individual i either drives or does not
    drive (i.e. $Y_i$ is either zero or one)
  • The fitted value is interpreted as being either
      • the probability that a person with the same
        value for X as person i would drive
    or
      • the proportion of people with the same X as
        person i who would drive

11
(No Transcript)
12
There are a number of problems with the LPM
  • First, there is no guarantee that all of the
    predicted probabilities will be between zero and
    one.
  • Second, the marginal effects don't make sense. b
    cannot really be interpreted as the marginal
    effect of cost on the probability of driving.
  • This is a serious problem for policy evaluation

13
  • Suppose that the coefficients estimated by OLS
    are a = 0.5 and b = 0.2
  • If the cost of public transport is 2, the fitted
    value is 0.5 + 0.2 × 2 = 0.9, i.e. 90% of people
    will drive to work
  • Evaluate the effect of increasing the cost of
    public transport to 4.
  • b = 0.2 implies that each extra unit of cost will
    increase the proportion of individuals driving by
    20 percentage points
  • Thus the predicted proportion driving is
    0.5 + 0.2 × 4 = 1.3, i.e. 130%, which is nonsense.
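
The arithmetic on this slide can be checked with a short Python sketch. The coefficients a = 0.5 and b = 0.2 and the costs 2 and 4 come from the slide; the function name `lpm_fitted` is only an illustrative choice.

```python
# Coefficients and cost values as given on the slide.
a, b = 0.5, 0.2

def lpm_fitted(cost):
    """Fitted value of the linear probability model: a + b * cost."""
    return a + b * cost

for cost in (2, 4):
    p = lpm_fitted(cost)
    # A valid probability must lie in [0, 1]; flag fitted values that do not.
    flag = "" if 0.0 <= p <= 1.0 else "  <- outside [0, 1]"
    print(f"cost = {cost}: fitted probability = {p:.2f}{flag}")
```

Because the fitted value is a straight line in cost, nothing stops it from passing 1 once the cost is high enough.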

14
Both problems are caused by the linear nature of
the model.
15
Further problems with LPM
  • R-squared is no longer a good measure of fit
  • Since the dependent variable takes only two
    values, the error term takes only two values
  • This implies that the errors can no longer be
    viewed as normal
  • The errors are also heteroscedastic
  • See Gujarati exercise 15.10
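
The two-valued error and its heteroscedasticity can be made explicit with a short derivation. This is a sketch using the notation $P_i = \mathrm{Prob}(Y_i = 1)$ from the Linear Probability Model slide; the symbol $u_i$ for the error term is an assumption, as the transcript does not name it.

```latex
u_i = Y_i - P_i =
\begin{cases}
1 - P_i & \text{with probability } P_i \quad (Y_i = 1) \\
-P_i    & \text{with probability } 1 - P_i \quad (Y_i = 0)
\end{cases}

E(u_i) = P_i (1 - P_i) + (1 - P_i)(-P_i) = 0

\mathrm{Var}(u_i) = P_i (1 - P_i)^2 + (1 - P_i) P_i^2 = P_i (1 - P_i)
```

Since $P_i$ depends on the explanatory variables, the error variance differs across observations, which is what heteroscedasticity means here.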

16
We need a model where the probability never goes
above 1. The slope of the curve must diminish
as the probability gets closer to one, i.e. we
need a non-linear model
17
Estimation
  • Cannot use OLS, because the model is non-linear
  • Use Maximum Likelihood (ML)
  • The likelihood function is the probability that
    an econometric model could generate the actual
    data seen by the econometrician
  • By choosing the parameters of the model so that
    the likelihood is as large as possible, we get
    the ML estimates of the parameters of the model.

18
  • As a simple example, suppose there is a sample of
    just three observations Drive, Drive, Bus.
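  • i.e. $Y_1 = 1$, $Y_2 = 1$, $Y_3 = 0$
  • The probability that such an outcome could be
    generated by the econometric model (i.e. the
    likelihood) is
  • $L = \mathrm{Prob}(Y_1 = 1 \text{ AND } Y_2 = 1 \text{ AND } Y_3 = 0)$
  • $L = \mathrm{Prob}(Y_1 = 1) \cdot \mathrm{Prob}(Y_2 = 1) \cdot \mathrm{Prob}(Y_3 = 0)$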

19
  • Usually we take the log of the likelihood (it is
    maximized at the same parameter values) to get
  • $\ln L = \ln P(Y_1 = 1) + \ln P(Y_2 = 1) + \ln P(Y_3 = 0)$
  • We can substitute in the expression for
    probability in the Probit model that we got
    earlier
  • Then we get the computer to try different values
    for a and b. The values that maximize lnL are the
    ML estimates of a and b.
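
As a rough illustration of the three-observation example (Drive, Drive, Bus), the sketch below evaluates the likelihood and the log-likelihood for one trial pair of parameters. The values of a and b and the three cost values are hypothetical, since the slides do not give them; the probit probability is taken to be F(a + bX) with F the standard normal c.d.f.

```python
import math

def norm_cdf(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical trial parameters and hypothetical costs for the three people.
a, b = -1.0, 0.5
costs = [3.0, 4.0, 1.0]   # X for Drive, Drive, Bus
ys = [1, 1, 0]            # Y = 1 means Drive, Y = 0 means Bus

probs = []
for x, y in zip(costs, ys):
    p_drive = norm_cdf(a + b * x)
    # Probability of the outcome actually observed for this person.
    probs.append(p_drive if y == 1 else 1.0 - p_drive)

L = math.prod(probs)                     # likelihood: product of probabilities
lnL = sum(math.log(p) for p in probs)    # log-likelihood: sum of logs
print(f"L = {L:.4f}, lnL = {lnL:.4f}")
```

Trying other values of a and b and keeping the pair with the highest lnL is exactly what the computer does during estimation.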

20
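  • In general, for a sample of N observations, the
    log-likelihood of the sample will be
    $\ln L = \sum_{i=1}^{N} \left[ y_i \ln F(a + bX_i) + (1 - y_i) \ln\bigl(1 - F(a + bX_i)\bigr) \right]$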

21
Algorithm
  • Choose starting values for the parameters of the
    model i.e. a and b in this simple example.
    (Could start with the LPM estimates or with
    zeros)
  • Calculate the value of the log likelihood
      • Evaluate $F(a + bX_i)$ for every observation
      • If $y_i = 1$ then $\ln P(y_i) = \ln F(a + bX_i)$
      • If $y_i = 0$ then $\ln P(y_i) = \ln[1 - F(a + bX_i)]$
      • Calculate $\ln L$ by adding up all the $\ln P(y_i)$

22
  • Try another combination of parameters
  • There are various different methods of deciding
    which should be the next set of parameters to
    try.
  • Keep repeating the procedure until no new choice
    of parameters gives a higher $\ln L$.
  • The set of parameters that generates the largest
    $\ln L$ is known as the Maximum Likelihood
    estimates of the model.
  • A PC will run through this algorithm in seconds
    even with thousands of observations and many
    variables.
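
The algorithm on the last two slides can be sketched in a few lines of Python. This is a deliberately crude illustration, not how econometric packages do it: the data are made up, and the rule for choosing the next parameters (try a step in each direction, halve the step when nothing improves) is just one simple possibility among the "various different methods" mentioned above.

```python
import math

def norm_cdf(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(a, b, xs, ys):
    """Add up ln F(a + b*x) when y = 1 and ln[1 - F(a + b*x)] when y = 0."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = norm_cdf(a + b * x)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against ln(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit_probit(xs, ys, step=1.0, tol=1e-6):
    """Try nearby (a, b); keep any improvement; halve the step when stuck."""
    a, b = 0.0, 0.0  # starting values: zeros
    best = log_likelihood(a, b, xs, ys)
    while step > tol:
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            trial = log_likelihood(a + da, b + db, xs, ys)
            if trial > best:
                a, b, best = a + da, b + db, trial
                improved = True
        if not improved:
            step /= 2.0
    return a, b, best

# Made-up data: x = cost of public transport, y = 1 if the person drives.
xs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 1, 0, 1, 1, 0, 1]

a_hat, b_hat, lnL = fit_probit(xs, ys)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, lnL = {lnL:.3f}")
```

The made-up sample is chosen so that drivers and bus users overlap in cost; if one cost value separated them perfectly, the likelihood would keep rising as b grows and no finite maximum would exist.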

23
Latent Variable Model
24
Latent Variable Models
  • The probit and logit can be motivated by basic
    utility theoretic models

25
Logit and Probit Models
  • The likelihood function is

26
Logit Model
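  • For the logit model we specify
    $\mathrm{Prob}(Y_i = 1) = \dfrac{1}{1 + e^{-(\beta_0 + \beta_1 X_{1i})}}$
  • $\mathrm{Prob}(Y_i = 1) \to 0$ as $\beta_0 + \beta_1 X_{1i} \to -\infty$
  • $\mathrm{Prob}(Y_i = 1) \to 1$ as $\beta_0 + \beta_1 X_{1i} \to \infty$
  • Thus, probabilities from the logit model will be
    between 0 and 1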

27
Logit Model
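  • A complication arises in interpreting the
    estimated $\beta$s
  • With a linear probability model, a $\beta$
    estimate measures the ceteris paribus effect of a
    change in the explanatory variable on the
    probability that Y equals 1
  • In the logit model, the effect depends on the
    level of the probability:
    $\partial P_i / \partial X_{1i} = \beta_1 P_i (1 - P_i)$, where $P_i = \mathrm{Prob}(Y_i = 1)$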
28
Probit Model
  • Probit is a non-linear model
  • The fitted value is guaranteed to be between zero
    and one
  • The marginal effect is such that the model will
    never predict a probability above 1. The marginal
    effect of increasing the cost of transport will
    be smaller when the probability of driving is
    close to 1
  • Non-linearity implies that OLS cannot be used

29
  • More formally, we say that the probability that
    $Y = 1$ (i.e. that an individual drives) is a
    non-linear function, F, of the variables.
  • We choose the function to ensure that it has the
    desired shape as in the previous diagram
  • In the case of Probit we use F, the cumulative
    distribution function of a normal random
    variable.

30
Probit Model
  • In the probit model, we assume the error in the
    utility index model is normally distributed
  • $\varepsilon_i \sim N(0, \sigma^2)$
  • Where F is the standard normal cumulative
    distribution function (c.d.f.)

31
Probit Model
  • The c.d.f. of the logit and the probit look quite
    similar
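  • Calculating the derivative:
    $\partial \mathrm{Prob}(Y_i = 1) / \partial X_{1i} = \phi(\beta_0 + \beta_1 X_{1i} + \dots + \beta_k X_{ki}) \, \beta_1$
  • Where $\phi$ is the density function of the
    standard normal distribution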

32
Probit Model
  • The derivative is nonlinear
  • Often evaluated at the mean of the explanatory
    variables
  • For a dummy variable, it is common to estimate
    the derivative as the probability that $Y = 1$
    when the dummy variable is 1 minus the
    probability that $Y = 1$ when the dummy variable
    is 0
  • Calculate how the predicted probability changes
    when the dummy variable switches from 0 to 1
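
A minimal sketch of this calculation for a probit with one continuous variable and one dummy is given below. All numbers (the three coefficients and the sample mean) are hypothetical, and the continuous variable is held at its mean while the dummy switches from 0 to 1.

```python
import math

def norm_cdf(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical probit estimates: constant, coefficient on a continuous
# variable X, and coefficient on a dummy variable D.
a, b, d = -0.5, 0.3, 0.6
x_mean = 2.0  # hypothetical sample mean of X

p_d0 = norm_cdf(a + b * x_mean)        # predicted Prob(Y = 1) with D = 0
p_d1 = norm_cdf(a + b * x_mean + d)    # predicted Prob(Y = 1) with D = 1
effect = p_d1 - p_d0                   # discrete change in probability

print(f"P(D=0) = {p_d0:.4f}, P(D=1) = {p_d1:.4f}, effect = {effect:.4f}")
```

The result is a difference of two probabilities, so unlike the raw coefficient d it can be read directly as a change in the predicted probability.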

33
  • The mathematical expression for F is
    $F(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt$
  • i.e. F(z) is the area under the standard normal
    density curve to the left of z

34
  • Because F is itself a probability distribution
    function, all its values will be between zero and
    one.
  • Therefore the estimated probabilities are
    guaranteed to be between zero and one.
  • The function F has a shape similar to that in the
    previous diagram
  • The marginal effect is smaller when the
    probability is close to 1, i.e. the curve is
    flat close to 1
  • Probit solves these two problems but creates two
    others: i) an inconvenient expression for
    marginal probabilities; ii) the model cannot be
    estimated using OLS

35
Marginal Probability
  • Tempting to think that b is equal to the marginal
    probability
  • This is not true for Probit precisely because it
    is a non-linear model
  • Because of the shape of the function F the
    marginal probability will diminish as X
    increases.

36
(No Transcript)
37
  • The marginal probability is affected by b but it
    is a non-linear function and is not equal to b.
  • Wrong to say that b equal to 0.2 implies a 20%
    increase in travel by car for every 1-unit
    increase in the bus fare
  • This is a consequence of ensuring that marginal
    probability is low when probability is high and
    vice-versa i.e. as in the diagram.
  • The marginal probability will have the same sign
    as b. This is often all that we want.
  • Often report marginal probability evaluated at
    the means
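
The points above can be illustrated numerically. For a probit with Prob(Y = 1) = F(a + bX), the marginal probability is b times the standard normal density evaluated at a + bX. In the sketch below, b = 0.2 echoes the earlier example, while the intercept a and the X values (including the "mean" of 5) are hypothetical.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# b = 0.2 echoes the slides; a = -1.0 is a hypothetical intercept.
a, b = -1.0, 0.2

def marginal_probability(x):
    """Derivative of F(a + b*x) with respect to x: b * density(a + b*x)."""
    return b * norm_pdf(a + b * x)

x_mean = 5.0  # hypothetical sample mean of X
for x in (x_mean, 10.0, 15.0):
    print(f"x = {x:5.1f}  marginal probability = {marginal_probability(x):.4f}")
```

Every value printed has the same sign as b but is well below 0.2, and it falls as X rises and the predicted probability approaches 1.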

38
Likelihood Ratio Test
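  • Cannot use the F-test because there is no sum of
    squared residuals (SSR)
  • The LR test is the equivalent
  • Intuition: see if the restriction changes the
    likelihood significantly
  • Test statistic: $LR = 2(\ln L_U - \ln L_R)$, where
    $\ln L_U$ and $\ln L_R$ are the maximized
    log-likelihoods of the unrestricted and
    restricted models
  • Critical value: $\chi^2$ with d.f. equal to the
    number of restrictions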
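
A small sketch of the test follows, using the usual statistic of twice the difference between the unrestricted and restricted maximized log-likelihoods. The two log-likelihood values and the number of restrictions are hypothetical; the 5% chi-squared critical values for 1 to 3 degrees of freedom are standard table values.

```python
# Hypothetical maximized log-likelihoods from two fitted models.
lnL_unrestricted = -50.2
lnL_restricted = -54.9
n_restrictions = 2  # hypothetical number of coefficients set to zero

# LR statistic: twice the gap between the two log-likelihoods.
lr_stat = 2.0 * (lnL_unrestricted - lnL_restricted)

# 5% critical values of the chi-squared distribution for 1, 2, 3 d.f.
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}
critical = CHI2_CRIT_5PCT[n_restrictions]
reject = lr_stat > critical

print(f"LR = {lr_stat:.2f}, critical value = {critical}, reject = {reject}")
```

With these made-up numbers the statistic exceeds the critical value, so the restrictions would be rejected at the 5% level.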

39
Empirical Examples
  • Discrimination in loan approvals in the United
    States
  • Non-Voting in Ireland