Nonlinear Regression

Transcript and Presenter's Notes
1
Chapter 6
  • Nonlinear Regression Functions

2
Types of Nonlinearity
  • Intrinsically linear regression models vs. models
    that are not intrinsically linear
  • Intrinsically linear models are nonlinear models
    that can be linearized with simple transformations.
    These are the only models that we will study.
  • Methods of detecting or modeling nonlinearities:
  • When nonlinearity in the relationship between Y
    and X depends on X alone
  • When nonlinearity in the relationship between Y
    and X depends on another variable, Z.

3
Examples
  • Nonlinear relationship between Y and X depends on
    X
  • Wages increase at an increasing (or decreasing)
    rate with years of schooling
  • Sales increase at a decreasing rate with
    advertising expenditure
  • Output rises at a decreasing rate with input of
    labor
  • Nonlinear relationship between Y and X depends on
    Z
  • Effect of years of schooling on wages depends on
    gender
  • Effect of advertising on sales depends on the
    type of business
  • Effect of additional labor on output depends on
    how much capital and other inputs are available.

4
Logarithms
  • Suppose that we want to estimate the relationship
    between wages, years of schooling, and (possibly)
    other variables
  • We might estimate Wagesᵢ = β₀ + β₁Schoolingᵢ + uᵢ
  • In this case, β₁ is interpreted as the increase
    in wages per additional year of schooling
  • Suppose, however, that additional years of
    schooling increase wages, not by a constant
    amount, but instead by a constant percentage
  • This means that wages increase with years of
    schooling at an increasing rate; e.g., another
    year of college adds more to wages than another
    year of high school

5
Diagram
[Figure: Wages plotted against Schooling, with wages rising at an increasing rate]
6
Logarithms (Continued)
  • An easy way to model this is to use logarithms.
  • Instead of using the linear model on the previous
    slide, we might instead estimate
  • Wagesᵢ = exp(β₀ + β₁Schoolingᵢ + uᵢ)
  • where exp denotes the exponential function with base e ≈ 2.71828
  • In this case, we could take the natural log of
    both sides of the equation to obtain
    ln(Wagesᵢ) = β₀ + β₁Schoolingᵢ + uᵢ
  • Natural log transformations are available in
    LIMDEP using the create command, e.g.
  • Create ; lnwage = Log(wage)
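A minimal sketch of the same idea outside LIMDEP, written in Python with simulated data (the variable names and the coefficient values used to generate the data are illustrative, not from the course dataset):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Simulate wages that grow by a roughly constant percentage per year of schooling
    rng = np.random.default_rng(0)
    schooling = rng.integers(8, 21, size=500)
    log_wage = 1.5 + 0.08 * schooling + rng.normal(0, 0.3, size=500)
    df = pd.DataFrame({"wage": np.exp(log_wage), "schooling": schooling})

    # Equivalent of LIMDEP's Create ; lnwage = Log(wage)
    df["lnwage"] = np.log(df["wage"])

    # Log-linear model: ln(Wage_i) = b0 + b1*Schooling_i + u_i
    res = smf.ols("lnwage ~ schooling", data=df).fit()
    print(res.params)  # slope should be close to 0.08, i.e. about an 8% wage gain per year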

7
Logarithms (Continued)
  • In this model, how should we interpret the slope
    coefficient, β₁?
  • Let W = Wages and S = Schooling
  • β₁ = d ln(W)/dS
  • = [d ln(W)/dW] × (dW/dS)
  • = (1/W) × (dW/dS)
  • ≈ (ΔW/ΔS)(1/W)
  • = the proportional (percentage) change in W given a one-unit
    change in S
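As a quick numerical check (assuming an illustrative value of β₁, not one estimated from data), the exact percentage effect of one more year is exp(β₁) - 1, which the slope itself approximates well when β₁ is small:

    import numpy as np

    # If ln(W) = b0 + b1*S + u, one more year of schooling multiplies expected
    # wages by exp(b1); for small b1 this is roughly a b1*100 percent increase.
    b1 = 0.08
    print((np.exp(b1) - 1) * 100)  # exact: about 8.33%
    print(b1 * 100)                # approximation: 8%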

8
Logarithms (Continued)
  • Suppose that we wish to estimate a relationship
    like Yᵢ = αXᵢ^β uᵢ
  • We cannot estimate this relationship directly,
    but can take natural logs to obtain
    ln Yᵢ = ln(α) + β ln(Xᵢ) + ln(uᵢ)
  • How is the slope coefficient interpreted in this
    case?
  • β = d ln(Y)/d ln(X)
  • = {[d ln(Y)/dY] / [d ln(X)/dX]} × (dY/dX)
  • = (dY/dX) × (X/Y) ≈ (ΔY/Y)/(ΔX/X)
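A short Python sketch of estimating a constant-elasticity relationship by the log-log regression above (the data are simulated and the true elasticity of 0.7 is chosen only for illustration):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Simulate Y = a * X^beta * u with a = 2 and beta = 0.7
    rng = np.random.default_rng(1)
    x = rng.uniform(1, 100, size=400)
    y = 2.0 * x**0.7 * np.exp(rng.normal(0, 0.1, size=400))
    df = pd.DataFrame({"lny": np.log(y), "lnx": np.log(x)})

    # ln(Y_i) = ln(a) + beta*ln(X_i) + ln(u_i): the slope estimates the elasticity
    res = smf.ols("lny ~ lnx", data=df).fit()
    print(res.params["lnx"])  # should be close to 0.7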

9
Logarithms (Continued)
  • So, in this case β is interpreted as the elasticity
    of Y with respect to X
  • Example: we want to know the relationship between
    average variable cost (AVC) and marginal cost
    (MC) for a railroad hauling coal
  • Let total variable cost = VC and let output = Q
  • Then, the elasticity of VC with respect to Q is
    (dVC/VC)/(dQ/Q) = (dVC/dQ)(Q/VC) = MC/AVC

10
Logarithms (Continued)
  • I have data for 353 coal transportation routes
    from Wyoming to electric utilities in the East
    and Midwest
  • For each, I know total variable cost and
    ton-miles of coal transported. Let Q = ton-miles
    (TM)
  • I estimate the relationship VCᵢ = αTMᵢ^β uᵢ by
    estimating ln(VCᵢ) = ln(α) + β ln(TMᵢ) + ln(uᵢ)
  • The estimate of β is 0.98 with s.e. = 0.003
  • Thus, MC ≈ AVC

11
Hypothesis Testing and Confidence Intervals
  • Hypothesis testing about one coefficient would be
    carried out using a t-test in the same way that
    we have previously studied this topic
  • Confidence intervals for a coefficient would be
    developed in the same way too.

12
Use of Logarithmic Models
  • Prediction is not as simple as in the linear
    model
  • Suppose we want to estimate Yᵢ = αXᵢ^β uᵢ using a
    log-log model. We would run
    ln Yᵢ = ln(α) + β ln(Xᵢ) + ln(uᵢ)
  • How do we get a confidence interval for the
    change in Y (not ln Y) for a given change in X?
    (One approximate approach is sketched below.)
  • Comparison of R2 for a given model estimated in
    linear, log-linear, and log-log form.
  • TSS is the same in the log-linear and log-log
    forms (both have ln Y as the dependent variable), but will be
    different in the linear model, so R² is not directly comparable
    between the linear form and the log forms
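One rough route to an interval for Y itself is to form the interval for ln Y and exponentiate its endpoints; this ignores the retransformation (Jensen's inequality) issue, so treat it only as a sketch. A self-contained Python illustration, reusing the simulated log-log example:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Refit the simulated log-log model from the earlier sketch
    rng = np.random.default_rng(1)
    x = rng.uniform(1, 100, size=400)
    y = 2.0 * x**0.7 * np.exp(rng.normal(0, 0.1, size=400))
    df = pd.DataFrame({"lny": np.log(y), "lnx": np.log(x)})
    res = smf.ols("lny ~ lnx", data=df).fit()

    # Approximate 95% interval for Y at a new X: exponentiate the interval for ln(Y)
    new = pd.DataFrame({"lnx": [np.log(50.0)]})
    lo, hi = res.get_prediction(new).conf_int(alpha=0.05)[0]
    print(np.exp(lo), np.exp(hi))  # rough 95% CI for Y (ignores retransformation bias)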

13
Polynomials
  • Another way to deal with a nonlinear relationship
    between X and Y is to assume that the
    relationship can be modeled as a polynomial
  • A simple example is
  • Wᵢ = β₀ + β₁Schoolingᵢ + β₂Schoolingᵢ² + uᵢ
  • What is the effect of a one-unit change in
    schooling on wages? dW/dS = β₁ + 2β₂S
  • The percentage change in wages for a one-unit change in
    schooling is (β₁ + 2β₂S)/W
  • Elasticity of W with respect to S = (β₁ + 2β₂S) × (S/W)
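A Python sketch of the quadratic specification (simulated data; the coefficients and the evaluation point of 12 years of schooling are illustrative):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # W_i = b0 + b1*S_i + b2*S_i^2 + u_i on simulated data
    rng = np.random.default_rng(2)
    s = rng.integers(8, 21, size=500).astype(float)
    w = 5 + 2.5 * s - 0.05 * s**2 + rng.normal(0, 2, size=500)
    df = pd.DataFrame({"wage": w, "schooling": s})
    df["schooling2"] = df["schooling"] ** 2  # like LIMDEP's Create ; schooling2 = schooling^2

    res = smf.ols("wage ~ schooling + schooling2", data=df).fit()
    b1, b2 = res.params["schooling"], res.params["schooling2"]

    # Marginal effect and elasticity evaluated at S = 12 years
    s0 = 12.0
    w0 = float(res.predict(pd.DataFrame({"schooling": [s0], "schooling2": [s0**2]}))[0])
    marginal = b1 + 2 * b2 * s0          # dW/dS
    elasticity = marginal * s0 / w0      # (dW/dS)*(S/W)
    print(marginal, elasticity)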

14
Possible Outcomes
[Figure: two panels plotting Wages against Schooling; one with β₁ > 0, β₂ > 0 (wages rising at an increasing rate), the other with β₁ > 0, β₂ < 0 (wages rising at a decreasing rate)]
15
(Unlikely) Possible Outcomes
[Figure: two panels plotting Wages against Schooling; one with β₁ < 0, β₂ > 0, the other with β₁ < 0, β₂ < 0]
16
Polynomials (Continued)
  • Advantages of specifying wages as a quadratic
    function of education
  • Simplicity: in LIMDEP, Create ; schooling2 = schooling^2
  • Allows greater flexibility than does a natural
    log transformation (log transformation envisions
    an exponential relationship)
  • Disadvantages
  • Resulting estimated relationship may be
    implausible
  • Example: the marginal effect of additional schooling
    could be < 0

17
Polynomials (Continued)
  • Cubic specification: Wᵢ = β₀ + β₁Schoolingᵢ +
    β₂Schoolingᵢ² + β₃Schoolingᵢ³ + uᵢ
  • Might be useful in getting more flexibility in
    modeling a relationship between two variables
  • Not used much
  • Multicollinearity can be a problem
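A quick illustration (using an arbitrary range of schooling values) of why the powers of a single regressor tend to be nearly collinear:

    import numpy as np

    # Correlations among S, S^2 and S^3 over a typical schooling range
    s = np.arange(8, 21, dtype=float)
    powers = np.column_stack([s, s**2, s**3])
    print(np.corrcoef(powers, rowvar=False))  # off-diagonal correlations near 0.99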

18
Polynomials or Dummy Variables?
  • Suppose we want to estimate Wagesᵢ = β₀ + β₁Schoolᵢ +
    β₂Experienceᵢ + uᵢ
  • We suspect that the relationship between wages
    and schooling is nonlinear
  • An alternative to using a polynomial
    specification to model this complication would be
    to use dummy variables: split years of schooling
    into n categories and include dummy variables
    for n-1 of these categories in the regression

19
Example
  • Suppose we distinguish between three levels of
    schooling: less than high school (<HS), high
    school (HS), and more than high school (>HS).
  • Then, instead of estimating Wagesᵢ = β₀ + β₁Schoolᵢ +
    β₂Experienceᵢ + uᵢ
  • We instead estimate Wagesᵢ = β₀ + β₁HSᵢ +
    β₂(>HSᵢ) + β₃Experienceᵢ + uᵢ
  • Notice that the omitted category is <HS and that
    the wage increase for an additional year of
    schooling may not be a constant amount
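A Python sketch of the dummy-variable specification (simulated data; the category cutoffs and coefficient values are illustrative):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Schooling entered as category dummies; omitted category is less than high school
    rng = np.random.default_rng(3)
    n = 500
    school_years = rng.integers(8, 21, size=n)
    df = pd.DataFrame({
        "experience": rng.uniform(0, 30, size=n),
        "hs": (school_years == 12).astype(int),     # exactly high school
        "gt_hs": (school_years > 12).astype(int),   # more than high school
    })
    df["wage"] = 10 + 3 * df["hs"] + 8 * df["gt_hs"] + 0.3 * df["experience"] + rng.normal(0, 2, n)

    # Wages_i = b0 + b1*HS_i + b2*(>HS)_i + b3*Experience_i + u_i
    res = smf.ols("wage ~ hs + gt_hs + experience", data=df).fit()
    print(res.params)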

20
Determination of Wages
[Figure: Wages plotted against Experience, with separate wage profiles for the >HS, HS, and <HS schooling categories (higher schooling shifting the profile upward)]
21
Interaction Variables
  • Suppose that in the regression
    Wagesᵢ = β₀ + β₁Schoolᵢ + β₂Experienceᵢ + uᵢ, we suspect
    that the effect of work experience on wages
    depends on the level of schooling.
  • In this case, we may wish to estimate
    Wagesᵢ = β₀ + β₁Schoolᵢ + β₂Experienceᵢ +
    β₃(Experienceᵢ × Schoolᵢ) + uᵢ

22
Interaction Variables
  • In this situation, ∂Wagesᵢ/∂Experienceᵢ = β₂ + β₃Schoolᵢ
  • If β₃ > 0, this would mean that an additional year
    of work experience raises wages by more for a
    person with more schooling.
  • Also, because ∂Wagesᵢ/∂Schoolᵢ = β₁ + β₃Experienceᵢ,
    this specification suggests that if β₃ > 0, then
    the returns to schooling rise with years of work
    experience
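A Python sketch of the interaction specification (simulated data; coefficient values and the schooling levels at which the marginal effect is evaluated are illustrative):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Wages_i = b0 + b1*School_i + b2*Exper_i + b3*(School_i x Exper_i) + u_i
    rng = np.random.default_rng(4)
    n = 500
    school = rng.integers(8, 21, size=n).astype(float)
    exper = rng.uniform(0, 30, size=n)
    wage = 2 + 1.5 * school + 0.2 * exper + 0.03 * school * exper + rng.normal(0, 2, n)
    df = pd.DataFrame({"wage": wage, "school": school, "exper": exper})

    res = smf.ols("wage ~ school + exper + school:exper", data=df).fit()
    b2, b3 = res.params["exper"], res.params["school:exper"]

    # Marginal effect of experience depends on schooling: dW/dExper = b2 + b3*School
    print(b2 + b3 * 12, b2 + b3 * 16)  # larger at 16 years than at 12 if b3 > 0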

23
Examples
  • Railroad costs
  • Switch to aluminum cars
  • Declining fuel prices
  • Wage determination
  • Gender
  • Schooling
  • Student test scores
  • Percentage of English learners
  • Class size

24
Hypothesis Testing
  • In linear, log-linear, and log-log models,
    methods for testing H₀: βᵢ = 0 are the same and can
    be carried out using a t-test.
  • Note that the t-test involves just one parameter
  • Suppose that we wanted to test whether two slope
    parameters are zero: H₀: βᵢ = βⱼ = 0. How would we
    carry out the test in this case?

25
Examples
  • When would we want to do this?
  • Suppose that we regress Wᵢ = β₀ + β₁Schoolingᵢ +
    β₂Schoolingᵢ² + uᵢ
  • Then we might want to test whether schooling has
    a significant effect on wages.
  • In this case, the test would be H₀: β₁ = β₂ = 0
  • Or, in the wage regression in the handout, the
    test of whether schooling affects wages would be
    whether the coefficients of the five schooling
    dummy variables are equal to zero
  • Or, test whether an effect is zero when
    interaction variables are used

26
Three Approaches
  • Again, suppose that the test to be carried out
    is H₀: βᵢ = βⱼ = 0
  • Three ways to do the test:
  • Sequential t-tests: use t-tests to first test one
    parameter, then test the other. This is a weak method
    because the null hypothesis says that both
    parameters are zero at the same time
  • Equation (5.21), p.168 in the textbook. This is
    the best method because it corrects for
    heteroskedasticity, but the test statistic is
    complicated to compute using LIMDEP or other
    software.

27
Three Approaches (Continued)
  • An approximate method that does not correct for
    heteroskedasticity but is simple to implement
    (the Rule-of-Thumb F-statistic, see p. 193 of
    text). This statistic is calculated as the
    product of ratios
  • F = [(SSR_R - SSR_U)/SSR_U] × [(n - k - 1)/q]
  • SSR_U is the sum of squared residuals from the
    unrestricted regression in which variables i and
    j are included as explanatory variables
  • SSR_R is the sum of squared residuals from the
    restricted regression in which variables i and j
    are excluded
  • q is the number of restrictions tested
  • n-k-1 is the number of degrees of freedom
    available after estimating slope coefficients in
    the unrestricted model
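A Python sketch of computing this Rule-of-Thumb F-statistic by hand from the restricted and unrestricted regressions (simulated data, reusing the quadratic wage example; the specific numbers are illustrative):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf
    from scipy import stats

    # Simulated quadratic wage data
    rng = np.random.default_rng(5)
    s = rng.integers(8, 21, size=400).astype(float)
    w = 5 + 2.5 * s - 0.05 * s**2 + rng.normal(0, 2, size=400)
    df = pd.DataFrame({"wage": w, "s": s, "s2": s**2})

    # H0: the coefficients on s and s2 are both zero
    unrestricted = smf.ols("wage ~ s + s2", data=df).fit()
    restricted = smf.ols("wage ~ 1", data=df).fit()   # both schooling terms excluded

    q = 2                                 # number of restrictions
    dof = unrestricted.df_resid           # n - k - 1
    F = ((restricted.ssr - unrestricted.ssr) / unrestricted.ssr) * (dof / q)
    p_value = stats.f.sf(F, q, dof)
    print(F, p_value)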

28
Properties of F-statistic
  • (SSR_R - SSR_U)/SSR_U ≥ 0. Why? (Imposing restrictions
    cannot reduce the sum of squared residuals, so SSR_R ≥ SSR_U.)
  • (SSR_R - SSR_U)/SSR_U tells the percentage increase
    in the sum of squared residuals that results from
    excluding two or more variables from a regression
  • (n - k - 1)/q adjusts this percentage for degrees
    of freedom
  • If there is no heteroskedasticity, then under the null
    hypothesis H₀: βᵢ = βⱼ = 0 the resulting statistic
    is distributed as F with q degrees of freedom in
    the numerator and n - k - 1 degrees of freedom in the
    denominator.
  • Recall that an F-distributed variable is the
    ratio of two independent chi-square variables, each
    divided by its degrees of freedom.