Lecture 3: Regressions (Chapters 1-5)

1
Lecture 3: Regressions (Chapters 1-5)
  • Eco420Z
  • Dr. S. Chen

2
Steps in empirical economic analysis
  • Example 1: the effect of job training on
    productivity
  • Write down an empirical model:
  • wage = β0 + β1educ + β2exper + β3training + u
  • where β0, β1, β2, β3 are parameters
  • u is the error term.
  • Construct hypotheses of interest
  • (e.g. job training would increase wages):
  • H0: β3 = 0
  • H1: β3 > 0

3
Steps in empirical economic analysis
  • Formulate questions of interest
  • Based on an economic model or on common sense
  • Write down an empirical model
  • Construct hypotheses of interest based on the
    empirical model

4
  • Example 2: economic model of crime
  • Economic model: cost-benefit analysis of an
    individual's participation in crime
  • Cost: current wage in legal employment, prob. of
    being caught/convicted, average sentence length
  • Benefit: wage from criminal activity
  • Write down an empirical model:
  • crime = β0 + β1wage + β2freqarr + β3freqconv
    + β4avesen + u
  • Construct hypotheses of interest (e.g. a higher
    wage in legal employment reduces crime):
  • H0: β1 = 0    H1: β1 < 0

5
Data structure
  • Cross-sectional data
  • By individuals (WAGE1.DTA)
  • By countries
  • Time-series data
  • Annually (GDP data)
  • Weekly (money supply data)
  • Daily (stock price)
  • Pooled cross sections
  • Combine two cross-sectional data sets (Table 1.4)
  • Panel/longitudinal data (Table 1.5)
  • Trace a given set of cross-sectional units over
    time

6
(No Transcript)
7
(No Transcript)
8
Ceteris Paribus: other things being equal
  • Most economic questions are ceteris paribus by
    nature.
  • Ex. 1 (demand curve): holding other factors
    fixed, how does quantity demanded vary with
    price?
  • Ex. 2: policy analysis of job training and
    productivity
  • Need to control for enough covariates to
    answer questions about causal effects of x on y.
  • How many are enough?
  • Issues of omitted variables; instrumental-
    variable methods

9
Example (1.4)
  • Estimate the return to education
  • If a person is randomly chosen from the
    population and given another year of education,
    by how much will his or her wage increase?
  • log(wage) = β0 + β1educ + β2exper + β3exper² + u
  • Parameter of interest: β1
  • But we don't observe ability or taste for
    education (an omitted-variables problem)
  • We will resolve this problem using instrumental-
    variable methods (Chapter 15).

10
Simple Regression Models
  • Simple linear model:
  • y = β0 + β1x + u
  • where y: dependent variable
  • x: independent/explanatory variable, or the
    regressor
  • u: error term or disturbance
  • β0: intercept parameter
  • β1: slope parameter (or marginal effect of x
    on y)

11
  • Example (log wage is linear in years
    of education):
  • log(wage) = β0 + β1educ + u
  • where β1 measures the return to an additional
    year of schooling, holding all other factors
    fixed.

12
  • Consider a simple linear regression model:
  • y = β0 + β1x + u
  • Use the average value of x to predict the average
    value of y:
  • E[y] = E[β0 + β1x + u]
  •      = β0 + β1E[x] + E[u]
  • Zero mean assumption: E[u] = 0
  • We get
  • E[y] = β0 + β1E[x]

13
  • Use x to predict the average value of y (if we
    know x): take expectations conditional on x
  • E[y|x] = E[β0 + β1x + u | x]
  •        = β0 + β1x + E[u|x]
  • Zero conditional mean assumption: E[u|x] = 0
  • So we get E[y|x] = β0 + β1x,
  • called the population regression function
    (Fig 2.1)

14
Zero conditional mean assumption
  • E[u|x] = 0: the expected (or average) value of
    the error term is zero for any slice of the
    population described by the value of x.
  • Example: no matter how many years of schooling,
    the error term averages to zero. We say schooling
    and the error term are uncorrelated, denoted by
  • Cov(x,u) = E[xu] = 0
  • This is a strong assumption
  • In the wage regression model, what if unobserved
    ability (contained in u) is correlated with
    education (x)? The zero conditional mean doesn't
    hold in general. We study this simple case first,
    although it often doesn't hold.

15
Derive the OLS Estimators
  • The error term must satisfy two conditions:
  • Zero mean: E[u] = 0
  • Zero covariance with x: E[xu] = 0
  • But we don't observe the error term u. In terms
    of variables that we can observe (e.g. x and y):
  • E[y − β0 − β1x] = 0
  • E[x(y − β0 − β1x)] = 0
  • The sample counterparts:
  • (1/n) Σi (yi − β̂0 − β̂1xi) = 0
  • (1/n) Σi xi(yi − β̂0 − β̂1xi) = 0
  • Two equations and two unknowns

16
  • Using the last two equalities, we derive the OLS
    estimators of the parameters β0 and β1:
  • β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²
  • β̂0 = ȳ − β̂1x̄
  • Requires that x has enough variation (Fig 2.3)
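The slope and intercept formulas above can be sketched numerically; this is an illustration with made-up data, not any of the course data sets:

```python
import numpy as np

# Made-up illustrative data (not from WAGE1 or any course data set).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# OLS slope and intercept from the two sample moment conditions.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# The residuals then satisfy the sample counterparts exactly:
# mean zero, and zero sample covariance with x.
resid = y - (b0 + b1 * x)
print(b0, b1)
```

The two moment conditions hold exactly in sample: `resid.mean()` and `(x * resid).sum()` are both zero up to floating-point error.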

17
Use OLS estimates for prediction
  • Sample regression function: ŷ = β̂0 + β̂1x
  • Fitted value: ŷi = β̂0 + β̂1xi
  • Residual: ûi = yi − ŷi, the diff. between actual
    y and the fitted value
  • Sum of squared residuals: SSR = Σi ûi²

18
Goodness of Fit
  • Total sum of squares: SST = Σi (yi − ȳ)²
  • Explained sum of squares: SSE = Σi (ŷi − ȳ)²
  • Residual sum of squares: SSR = Σi ûi²
  • R-squared: R² = SSE/SST = 1 − SSR/SST
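A minimal sketch of the sums of squares and R², again with made-up data:

```python
import numpy as np

# Goodness of fit for a simple regression, using made-up data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x            # fitted values
u_hat = y - y_hat              # residuals

SST = np.sum((y - y.mean()) ** 2)      # total sum of squares
SSE = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
SSR = np.sum(u_hat ** 2)               # residual sum of squares

r2 = SSE / SST
# The decomposition SST = SSE + SSR makes the two R² formulas agree.
print(r2, 1 - SSR / SST)
```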

19
Examples and STATA recitation
  • Example 2.3 fig 2.5 (CEOSAL1.DTA)
  • Explain CEO's salary using the return on equity
  • Can you interpret the results?
  • Example 2.4 (WAGE1.DTA)
  • Estimate return to education
  • Can you interpret the results?

20
Properties of OLS Estimators (1)
  • Unbiasedness
  • Required conditions
  • Linear in parameters
  • Random sampling
  • Enough variation in the regressor x
  • Zero conditional mean

21
Properties of OLS Estimators (2)
  • Best Linear Unbiased Estimator (BLUE)
  • "Best" = most efficient, i.e. the variances
    of the OLS estimators are the smallest among all
    linear unbiased estimators
  • Also called the Gauss-Markov theorem

22
  • Required conditions for BLUE:
  • Linear in parameters
  • Random sampling
  • Enough variation in the regressor x
  • Zero conditional mean
  • Constant variance: the variance of the error
    term does not vary with the regressor x
  • The above assumptions are the Gauss-Markov
    assumptions.

23
  • Assumption 5 is a very strong assumption
  • Fig 2.8 (case of constant variance)
  • Fig 2.9 (case of heteroskedasticity)

24
Variance of OLS Estimators in Simple Linear
Regressions (1)
  • By Assumption 5, we have
  • Var(y|x) = Var(β0 + β1x + u | x)
  •          = Var(u|x) = σ²
  • We call σ² the error variance and σ the
    standard deviation of the error term.
  • An unbiased estimator of the error variance:
  • σ̂² = SSR/(n − 2)
  • We call σ̂ the standard error of the
    regression.

25
Variance of OLS Estimators (2)
  • Variance of the OLS slope estimator:
  • Var(β̂1) = σ² / Σi (xi − x̄)²
  • Intuition: when there is more variation in x,
    the OLS estimate will be more accurate.
  • But we don't know σ; replace σ with the standard
    error of the regression, σ̂
  • The standard error of the slope estimator is
  • se(β̂1) = σ̂ / [Σi (xi − x̄)²]^(1/2)
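The error-variance estimate and the standard error of the slope can be sketched with the same made-up data as before:

```python
import numpy as np

# Standard error of the slope in a simple regression (made-up data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
u_hat = y - (b0 + b1 * x)

# sigma^2 hat = SSR/(n-2): unbiased estimator of the error variance.
sigma2_hat = np.sum(u_hat ** 2) / (n - 2)
se_b1 = np.sqrt(sigma2_hat / np.sum((x - x.mean()) ** 2))
print(se_b1)
```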

26
Examples (Problem 2.7)
  • What is the standard error of the OLS estimator
    for the slope?

27
Functional forms
  • Linear models
  • Whenever we can transform a nonlinear model into
    a linear one, we should do so and apply the
    Gauss-Markov theorem.
  • Example 1: log wage regression
  • Nonlinear models

28
Multiple Regression Analysis: Estimation
  • Motivation (examples)
  • "Other things being equal": explaining the
    effect of per-student spending on average test
    scores
  • avgscore = β0 + β1expend + β2avginc + u
  • Extended functional form: suppose family
    consumption is a quadratic function of family
    income
  • cons = β0 + β1inc + β2inc² + u

29
Interpretations of OLS Results
  • Partial effect (marginal effect)
  • Example1
  • Example2

30
STATA Recitation
  • Determinants of College GPA
  • Example 3.1, 3.4 (GPA1.dta)
  • Explaining Arrest Record
  • Example 3.5 (CRIME1.dta)

31
Zero mean assumption
  • Consider a multiple linear regression model:
  • y = β0 + β1x1 + β2x2 + … + βkxk + u
  • Use the average values of the x's to predict the
    average value of y:
  • E[y] = E[β0 + β1x1 + β2x2 + … + βkxk + u]
  •      = β0 + β1E[x1] + β2E[x2] + … + βkE[xk] + E[u]
  • Zero mean assumption: E[u] = 0
  • We get
  • E[y] = β0 + β1E[x1] + β2E[x2] + … + βkE[xk]

32
Zero conditional mean condition
  • Use the x's to predict the average value of y;
    that is, take expectations conditional on the x's:
  • E[y|x1,…,xk] = E[β0 + β1x1 + … + βkxk + u | x1,…,xk]
  •             = β0 + β1x1 + … + βkxk + E[u|x1,x2,…,xk]
  • Zero conditional mean assumption:
  • E[u|x1,x2,…,xk] = 0
  • So we get E[y|x1,…,xk] = β0 + β1x1 + β2x2 + … + βkxk,
  • the population regression function

33
Zero conditional mean assumption
  • E[u|x1,x2,…,xk] = 0: the expected (or average)
    value of the error term is zero for any slice of
    the population described by the values of the
    regressors.
  • Example: no matter how many years of schooling,
    or your gender, or work experience, the error
    term averages to zero. We say schooling, gender,
    and work experience are all uncorrelated with the
    error term, denoted by
  • Cov(x1,u) = E[x1u] = 0
  • Cov(x2,u) = E[x2u] = 0
  • …
  • Cov(xk,u) = E[xku] = 0

34
Derive the OLS Estimators
  • The error term must satisfy the conditions:
  • Zero mean: E[u] = 0
  • Zero covariance: E[xju] = 0 for j = 1,…,k
  • Rewrite in terms of parameters of interest and
    observed variables (e.g. the x's and y):
  • E[y − β0 − β1x1 − β2x2 − … − βkxk] = 0
  • E[xj(y − β0 − β1x1 − … − βkxk)] = 0 for j = 1,…,k
  • Take the sample counterparts:
  • We have k+1 equations and k+1 unknowns
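The k+1 sample moment conditions are the "normal equations" X′Xb = X′y; solving them gives the OLS estimates. A sketch with made-up data and k = 2 regressors:

```python
import numpy as np

# Made-up data with k = 2 regressors; X includes an intercept column,
# so the normal equations X'X b = X'y are 3 equations in 3 unknowns.
rng = np.random.default_rng(42)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.solve(X.T @ X, X.T @ y)   # (b0, b1, b2)

# The residuals are orthogonal to every column of X, exactly as the
# k+1 sample moment conditions require.
resid = y - X @ b
print(b)
```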

35
  • Consider a case with only k = 2 explanatory
    variables
  • Using those k+1 = 3 equalities, we can derive the
    OLS estimators of the parameters β0, β1, β2

36
Comparison of Simple and Multiple Regression
Estimators
  • Wrong model: suppose we omit x2, using a simple
    regression of y on x1
  • True model: now we include x2, using a multiple
    regression of y on x1 and x2
  • We can show a simple relationship (*) between the
    two slope estimates

37
Example (STATA Practice)
  • Determination of College GPA (GPA1.dta)
  • Suppose that the true model is
  • Consequence of omitting an important variable:
  • Regress colGPA on ACT (ignoring hsGPA); verify (*)
  • Regress ACT on hsGPA (ignoring colGPA). Can you
    verify (*)?

38
  • Intuition about the OLS estimator formula:
  • Regress ACT on hsGPA
  • Get the residual r̂ (the part of ACT that
    cannot be explained by hsGPA)
  • Regress colGPA on r̂
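This "partialling out" intuition (the Frisch-Waugh result) can be sketched numerically; the data here are made up stand-ins, not GPA1.dta:

```python
import numpy as np

# Partialling out with made-up data: the slope on x1 in the multiple
# regression equals the slope from regressing y on the part of x1
# that x2 cannot explain.
rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)      # x1 correlated with x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Step 1: regress x1 on x2 (with intercept); keep the residual r.
Z = np.column_stack([np.ones(n), x2])
g = np.linalg.lstsq(Z, x1, rcond=None)[0]
r = x1 - Z @ g

# Step 2: a simple regression of y on r recovers the multiple-
# regression slope on x1.
b1_fwl = np.sum(r * y) / np.sum(r ** 2)

# Compare with the full multiple regression of y on (1, x1, x2).
X = np.column_stack([np.ones(n), x1, x2])
b_full = np.linalg.lstsq(X, y, rcond=None)[0]
print(b1_fwl, b_full[1])
```

The two slopes agree exactly (up to floating-point error), which is why the residual regression gives the "intuition" behind the multiple-regression coefficient.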

39
Goodness of Fit
  • Total sum of squares: SST = Σi (yi − ȳ)²
  • Explained sum of squares: SSE = Σi (ŷi − ȳ)²
  • Residual sum of squares: SSR = Σi ûi²
  • R-squared: R² = SSE/SST = 1 − SSR/SST

40
Properties of OLS Estimators (1)
  • Unbiasedness
  • Required conditions
  • Linear in parameters
  • Random sampling
  • Enough variation in each regressor x1,…,xk
  • Zero conditional mean

41
Omitted variable biases
  • Example (estimate the return to education)
  • True model:
  • wage = β0 + βeduc·educ + βabil·abil + u
  • Let β̂educ, β̂abil be the estimators of βeduc,
    βabil from regressing wage on educ and abil
    jointly. We know both are unbiased estimators.
  • Incomplete model:
  • wage = β0 + βeduc·educ + v
  • where v = βabil·abil + u. Let β̃educ be the
    estimator from regressing wage on educ,
    ignoring abil. It is biased.

42
  • We have shown that E[β̃educ] = βeduc + βabil·δ̃1,
    where δ̃1 is the slope from regressing abil on
    educ.
  • We say that omission of the ability variable
    leads to an overstatement of the return to
    education; that is, we have a positive bias
    (assuming βabil > 0 and that educ and abil are
    positively correlated).

43
Important Fact about Omitted Variable Biases
  • Bias in β̃1 when x2 is omitted: bias = β2·δ̃1,
    where δ̃1 is the slope from regressing x2 on x1
  • Example (suppose we don't observe povrate):
  • avgscore = β0 + β1expend + β2povrate + u
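The omitted-variable-bias relationship holds exactly as an in-sample identity: the short-regression slope equals the long-regression slope plus β̂2 times the slope from regressing x2 on x1. A sketch with made-up data:

```python
import numpy as np

# In-sample omitted-variable-bias identity (made-up data):
# short slope = b1 + b2 * delta1, where delta1 is the slope from
# regressing the omitted x2 on x1.
rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)      # x2 correlated with x1
y = 1.0 + 1.5 * x1 + 2.0 * x2 + rng.normal(size=n)

def slope(a, b):
    """Slope from a simple regression of b on a (with intercept)."""
    return np.sum((a - a.mean()) * (b - b.mean())) / np.sum((a - a.mean()) ** 2)

# Long regression: y on (1, x1, x2).
X = np.column_stack([np.ones(n), x1, x2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

b1_short = slope(x1, y)     # short regression: y on x1 only
delta1 = slope(x1, x2)      # auxiliary regression: x2 on x1
print(b1_short, b1 + b2 * delta1)
```

Because β̂2 > 0 and δ̂1 > 0 here, the short regression overstates the effect of x1: a positive bias, as on the slide.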

44
Omitted variable bias: more general cases
  • Example:
  • wage = β0 + β1educ + β2exper + β3abil + u
  • Suppose we omit abil.
  • Can you predict the direction of bias in β1 when
    we omit abil?
  • It's hard to obtain a clear direction of bias
    because educ, exper, and abil are
    pairwise correlated.

45
Properties of OLS Estimators (2)
  • Best Linear Unbiased Estimator (BLUE)
  • "Best" = most efficient, i.e. the variances
    of the OLS estimators are the smallest among all
    linear unbiased estimators
  • Also called the Gauss-Markov theorem

46
  • Required conditions for BLUE:
  • Linear in parameters
  • Random sampling
  • Enough variation in each regressor
  • Zero conditional mean
  • Constant variance: the variance of the error
    term does not vary with the regressors
  • The above assumptions are the Gauss-Markov
    assumptions.

47
Variance of OLS Estimators in Multiple Linear
Regressions (1)
  • By Assumption 5, we have
  • Var(y|x) = Var(β0 + β1x1 + β2x2 + u | x)
  •          = Var(u|x) = σ²
  • We call σ² the error variance and σ the
    standard deviation of the error term.
  • An unbiased estimator of the error variance:
  • σ̂² = SSR/(n − k − 1)
  • We call σ̂ the standard error of the
    regression.

48
Variance of OLS Estimators (2)
  • Variance of the OLS estimator:
  • Var(β̂1) = σ² / [SST1(1 − R1²)]
  • where SST1 = Σi (xi1 − x̄1)² and R1² is the
    R-squared from regressing x1 on the other
    regressors.
  • If there is more variation in x1 that cannot be
    explained by x2, the estimate is more accurate.
  • But we don't know σ; replace σ with the standard
    error of the regression, σ̂
  • The standard error of the slope estimator is
  • se(β̂1) = σ̂ / [SST1(1 − R1²)]^(1/2)

49
  • Note that
  • Thus we also write

50
Multicollinearity
  • If x1 and x2 are perfectly correlated (so
    R1² = 1), then the variance of β̂1 blows up
  • Example:
  • Estimating the effect of various school
    expenditure categories (e.g. teacher salaries,
    instructional materials, athletics, …) on student
    performance.
  • But wealthier schools tend to spend more on
    everything, i.e. the covariates are highly
    correlated.

51
  • Solutions to multicollinearity:
  • Collect more data to get more variation in the
    covariates
  • Drop covariates from the model (but this may lead
    to biased results)
  • Lump variables together (e.g. all expenditure
    categories lumped into one variable)

52
Variances in Misspecified Models
  • True model:
  • y = β0 + β1x1 + β2x2 + u
  • Incomplete model:
  • y = β0 + β1x1 + v

53
Hypothesis Testing (Ch 4)
  • In small samples, the normalized slope estimator
    follows a Student-t distribution
  • By the Central Limit Theorem, the normalized
    slope estimator is standard normal in large
    samples

54
Examples
  • Example 4.1 (WAGE1.dta)
  • Is the return to work experience positive?
  • Construct hypotheses
  • Small sample (using student-t table)
  • Large sample (using standard normal table)

55
  • Example 4.2 (MEAP93.dta)
  • Does school size matter?
  • math10 = β0 + β1teachSal + β2staff + β3enroll + u
  • Construct hypotheses
  • Small sample (using student-t table)
  • Large sample (using standard normal table)

56
  • Changing functional form (taking logs):
  • math10 = β0
  •   + β1log(teachSal) + β2log(staff) + β3log(enroll) + u
  • β̂3 = −1.3 suggests that a ten percent increase
    in enrollment will decrease the average math
    score by about 0.13 points.
  • Construct hypotheses
  • Small sample (using Student-t table)

57
Computing and Using P-Values (Appendix C, p. 794)
  • Example (Fig C.7): suppose that we have a
    t-statistic of 1.52 for the one-sided alternative
    β > 0. Then
  • p-value = Pr[T > 1.52 | β = 0] = 1 − Φ(1.52) = .065 > .05
  • So we cannot reject the null hypothesis.
  • Example (Fig C.8): suppose that we have a
    t-statistic of −2.13 for the one-sided
    alternative β < 0. Then
  • p-value = Pr[T < −2.13 | β = 0] = Φ(−2.13) = .025 < .05
  • So we reject the null hypothesis.

58
  • Example (two-sided): suppose that we have a
    t-statistic of 1.52 for the two-sided
    alternative β ≠ 0. Then
  • p-value = Pr[T > 1.52 or T < −1.52 | β = 0]
  •         = 2·Pr[T > 1.52 | β = 0]
  •         = 2(1 − Φ(1.52)) = .13 > .05
  • So we cannot reject the null hypothesis.
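The t = 1.52 p-values above can be reproduced with the standard normal CDF (the large-sample approximation), built here from `math.erf`:

```python
import math

# One- and two-sided p-values for a t-statistic of 1.52, using the
# standard normal CDF as a large-sample approximation.
def norm_cdf(z):
    # Phi(z) via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

t = 1.52
p_one_sided = 1.0 - norm_cdf(t)          # Pr[T > 1.52], about .065
p_two_sided = 2.0 * (1.0 - norm_cdf(t))  # about .13
print(p_one_sided, p_two_sided)
```

Both p-values exceed .05, so neither the one-sided nor the two-sided test rejects the null at the 5 percent level.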

59
Testing one-sided alternatives
  • H0: β1 = 0
  • H1: β1 > 0 (Fig 4.2: rejection region on the
    right)
  • The rejection rules (for the 5 percent
    significance level):
  • Use the t statistic
  • Use the confidence interval
  • One-sided p-value < .05
  • Example 4.1 (WAGE1)

60
Testing one-sided alternatives
  • H0: β1 = 0
  • H1: β1 < 0 (Fig 4.3: rejection region on the
    left)
  • The rejection rules (for the 5 percent
    significance level):
  • Use the t statistic
  • Use the confidence interval
  • One-sided p-value < .05
  • Example 4.2 (MEAP93)

61
Testing two-sided alternatives (testing for
significance)
  • H0: β1 = 0
  • H1: β1 ≠ 0 (Fig 4.4: rejection regions on both
    sides)
  • The rejection rules (for the 5 percent
    significance level):
  • Use the t statistic
  • Use the confidence interval
  • Two-sided p-value < .05
  • Example 4.3 (GPA1)

62
Testing other hypotheses about coefficients
  • H0: β1 = 1
  • H1: β1 > 1
  • Use the t statistic: t = (β̂1 − 1)/se(β̂1)
  • Use the confidence interval
  • One-sided p-value < .05
  • Example 4.4 (CAMPUS); Example 4.5 (homework)

63
Testing Hypotheses about a Single Linear
Combination of β's
  • Example (compare the returns to education at
    junior colleges and four-year colleges):
  • log(wage) = β0 + β1jc + β2univ + β3exper + u
  • H0: β1 = β2
  • H1: β1 < β2

64
  • t = (β̂1 − β̂2)/se(β̂1 − β̂2), where the standard
    error of the difference involves Cov(β̂1, β̂2)
  • STATA (TWOYEAR): after running the regression,
    type
  • test jc = univ
  • test univ = 1
  • STATA reports a p-value based on the F-test (see
    below).

65
Testing Multiple Linear Restrictions: the F Test
  • Test whether a group of variables has no effect
    on y.
  • Example (athletes' salaries):
  • log(salary) = β0 + β1years + β2gamesyr + β3bavg
    + β4hrunsyr + β5rbisyr + u
  • H0: β3 = 0, β4 = 0, β5 = 0
  • H1: H0 is not true
  • (This is called a joint hypothesis test)
  • We use the F-test.
  • In STATA (MLB1), this is very simple. After
    running the regression:
  • test bavg hrunsyr rbisyr

66
Ideas
  • Unrestricted model:
  • SSRur (sum of squared residuals)
  • R²ur
  • Restricted model (β3 = 0, β4 = 0, β5 = 0):
  • SSRr
  • R²r
  • Restrictions increase the SSR. When the SSR is
    almost unchanged by the restrictions, we can
    safely say that it's all right to set β3 = 0,
    β4 = 0, β5 = 0.
  • This suggests that we can use the difference in
    SSR to test the hypothesis.

67
Numerator degrees of freedom: q; denominator
degrees of freedom: n − k − 1
  • F-statistic (or F-ratio):
  • F = [(SSRr − SSRur)/q] / [SSRur/(n − k − 1)]
  • SSRr: sum of squared residuals from the
    restricted model (β3 = 0, β4 = 0, β5 = 0)
  • SSRur: sum of squared residuals from the
    unrestricted model
  • q: number of restrictions (q = 3 in the example)
  • k: number of covariates in the unrestricted model
    (k = 5 in the example)
  • The F-statistic is always nonnegative because
  • SSRr ≥ SSRur
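A minimal sketch of the F-statistic formula; the SSR values below are made up for illustration, not taken from MLB1:

```python
# F-statistic from restricted and unrestricted sums of squared
# residuals. The numbers are made up for illustration.
def f_stat(ssr_r, ssr_ur, q, n, k):
    """F = [(SSRr - SSRur)/q] / [SSRur/(n - k - 1)]."""
    return ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))

F = f_stat(ssr_r=200.0, ssr_ur=180.0, q=3, n=100, k=5)
print(F)
```

Because SSRr ≥ SSRur always holds (restrictions can only worsen the fit), the statistic is nonnegative; it is then compared with an F(q, n − k − 1) critical value.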

68
Rejection rules of the F-test
  • If F > critical value, then we reject the null
    hypothesis. (See Table G.3 for critical values.)
  • p-value = Pr(F-distributed r.v. > observed F);
    see Fig 4.7
  • STATA (MLB1, p. 156):
  • Derive SSRr and SSRur
  • Derive the degrees of freedom
  • Derive the value of the F-statistic
  • Compare with the critical value
  • Conclusion

69
The F-test is very general
  • It can provide the same results as a t-test
    (single parameter test)
  • Example: for a single restriction, F = t²
  • It can do a joint hypothesis test
  • It can test the significance of the entire
    regression model (all slopes equal to zero)
  • Example

70
  • It can test general linear restrictions (p. 162):
  • log(price) = β0 + β1log(assess) + β2log(lotsize)
    + β3log(sqrft) + β4bdrms + u
  • H0: β1 = 1, β2 = 0, β3 = 0, β4 = 0
  • Restricted model:
  • log(price) = β0 + log(assess) + u
  • log(price) − log(assess) = β0 + u
  • Derive SSRr and SSRur
  • Derive the degrees of freedom
  • Derive the value of the F-statistic
  • Compare with the critical value
  • Conclusion

71
R-Squared Form of the F-Statistic
  • Note that SSR = SST(1 − R²)
  • We can rewrite the F-statistic as
  • F = [(R²ur − R²r)/q] / [(1 − R²ur)/(n − k − 1)]
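Because SSR = SST(1 − R²) for both models (and SST is the same in both), the two forms of the F-statistic coincide. A quick numerical check with made-up values:

```python
# The SSR and R-squared forms of the F-statistic agree because
# SSR = SST(1 - R^2). Illustrative numbers, made up.
SST = 500.0
r2_ur, r2_r = 0.40, 0.30
q, n, k = 3, 100, 5

ssr_ur = SST * (1 - r2_ur)
ssr_r = SST * (1 - r2_r)

F_ssr = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
F_r2 = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))
print(F_ssr, F_r2)
```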

72
Reporting Regression Results
  • Interpret the estimated coefficients
  • Link to economic models
  • Report standard errors
  • R-squared: goodness-of-fit measure
  • Summarize in a table
  • Example (4.10, p. 163): homework

73
OLS Asymptotics (Ch 5)
  • Large-sample properties of OLS:
  • Consistency
  • Asymptotic normality
  • Recall the small-sample properties of OLS:
  • Unbiasedness (Conditions 1-4)
  • Gauss-Markov theorem (BLUE) (Conditions 1-4 and
    Condition 5, constant variance)

74
Consistency
  • Consistency is considered the minimum requirement
    for an estimator.
  • An estimator is consistent if
  • the distribution of the estimator becomes more
    and more tightly concentrated around the true
    value as the sample size n grows.
  • As n tends to infinity, the distribution of the
    estimator collapses to the point of the true
    value. (Fig 5.1)

75
Consistency of OLS
  • Consider a simple regression model:
  • y = β0 + β1x + u
  • The formula for the OLS estimator is
  • β̂1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²

76
Conditions for consistency
77
A case of inconsistency
  • When x and u are correlated (i.e. Cov(x,u) does
    not equal zero), we have problems of
    inconsistency.
  • Example:
  • In the wage regression, what if educ and u
    (containing unobserved ability) are correlated?
    The OLS estimators are inconsistent.
  • I.e. even in large samples, the distribution of
    the coefficients will not collapse to the true
    value. This estimator is not very useful.
  • Asymptotic bias: plim(β̂1) − β1 = Cov(x,u)/Var(x)
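A simulation sketch of this inconsistency, with made-up data in which x and u share a common component so that Cov(x,u) ≠ 0:

```python
import numpy as np

# Simulated inconsistency: x and u share the component e, so
# Cov(x,u) = 0.5 and Var(x) = 2, giving asymptotic bias 0.25.
rng = np.random.default_rng(0)
n = 200_000
e = rng.normal(size=n)
x = e + rng.normal(size=n)            # Var(x) = 2
u = 0.5 * e + rng.normal(size=n)      # Cov(x,u) = 0.5
beta1 = 1.0
y = 2.0 + beta1 * x + u

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
asym_bias = 0.5 / 2.0                 # Cov(x,u)/Var(x) = 0.25
print(b1)  # close to 1.25, not the true value 1.0
```

Even with 200,000 observations, the slope estimate does not collapse to β1 = 1; it concentrates around β1 + Cov(x,u)/Var(x) = 1.25 instead.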

78
Example 5.1
  • Housing prices and distance from an incinerator
  • If the incinerator depresses house prices, the
    coefficient on distance should be positive.
  • If higher house quality increases the house
    price, the coefficient of quality should be
    positive.
  • When the quality of the house is not fully
    measured or observed, the coefficient on distance
    from the incinerator would overstate the effect
    of the incinerator on housing prices, because
    distance and quality are positively correlated.

79
Large Sample Inference
  • Consistency of an estimator is an important
    property, but it does not provide information
    about the accuracy of the estimator.
  • In small samples, we have the Gauss-Markov
    theorem to tell us the degree of accuracy (i.e.
    variance) of an OLS estimator. In fact, the OLS
    estimator is the most accurate one among linear
    unbiased estimators (i.e. BLUE).
  • In large samples, we can do even better! The
    distribution of the OLS estimator looks almost
    like a normal distribution, so we can use the
    standard normal table to do hypothesis testing.

80
Theorem 5.2 (Asymptotic Normality of OLS)
  • Under the Gauss-Markov Assumptions
  • OLS estimator is asymptotically normally
    distributed

81
  • Replacing the parameter σ² by a consistent
    estimator of σ², the distribution of the
    normalized estimator is asymptotically normal
  • In small samples (with normal errors), the above
    normalization follows the Student-t distribution
    t(n−k−1) for any sample size.
  • The s.e. of the OLS estimator shrinks at a rate
    of the inverse of the square root of the sample
    size.

82
Example 5.2
  • Standard errors in a birth weight equation
    (BWGHT)
  • log(birthweight)
  •   = β0 + β1cigs + β2log(fincome) + u
  • Using the first half of the data (n = 694),
    se(cigs) = .0013
  • Using the full sample (n = 1388),
    se(cigs) = .00086
  • .0013/.00086 is almost equal to the square root
    of 1388/694.
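The slide's arithmetic can be checked directly: doubling the sample size should shrink the standard error by roughly a factor of √2, and the observed ratio is in that ballpark (roughly 1.51 versus √2 ≈ 1.41):

```python
import math

# Check the slide's arithmetic: doubling n should shrink the standard
# error by roughly 1/sqrt(2).
ratio = 0.0013 / 0.00086          # observed ratio of standard errors
expected = math.sqrt(1388 / 694)  # = sqrt(2), about 1.41
print(ratio, expected)
```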