Week 4 - PowerPoint PPT Presentation

1
Week 4
  • Bivariate Regression,
  • Least Squares and
  • Hypothesis Testing

2
Lecture Outline
  • Method of Least Squares
  • Assumptions
  • Normality assumption
  • Goodness of fit
  • Confidence Intervals
  • Tests of Significance
  • alpha versus p

3
Recall . . .
  • Regression curve as line connecting the mean
    values of y for a given x
  • No necessary reason for such a construction to be
    a line
  • Need more information to define a function

4
Method of Least Squares
  • Goal: describe the functional relationship
    between y and x
  • Assume linearity (in the parameters)
  • What is the best line to explain the
    relationship?
  • Intuition: the line that is closest to, or best
    fits, the data (formalized in the sketch below)
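
A compact statement of the criterion, sketched in LaTeX with Gujarati-style notation assumed (beta_1 the intercept, beta_2 the slope); this is a standard rendering, not copied from the slides' own equations:

    % Bivariate model
    y_i = \beta_1 + \beta_2 x_i + u_i, \qquad i = 1, \dots, n
    % Least squares chooses the estimates that minimize the sum of squared residuals
    \min_{\hat\beta_1,\, \hat\beta_2} \sum_{i=1}^{n} \hat u_i^{\,2}
      = \sum_{i=1}^{n} \left( y_i - \hat\beta_1 - \hat\beta_2 x_i \right)^2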

5–10
Figure slides: Best line (n = 2); Best line (n > 2); Least squares intuition; Least squares (n > 2)
11
Why sum of squares?
  • The sum of the residuals may be zero
  • Squaring emphasizes residuals that are far from the
    regression line
  • Better describes the spread of the residuals
    (see the numeric sketch below)
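
A minimal numeric sketch of the point above; the residual values are illustrative, not from the slides:

    # Two sets of residuals that both sum to zero, so the raw sum of
    # residuals cannot tell a tight fit from a loose one.
    small = [1, -1, 1, -1]
    large = [5, -5, 5, -5]

    print(sum(small), sum(large))        # 0 0  -> identical raw sums
    print(sum(e ** 2 for e in small))    # 4
    print(sum(e ** 2 for e in large))    # 100 -> squaring emphasizes large misses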

12
Least-squares estimates
(Equations shown on the slide, with labels for the intercept, the residuals, and the effect of x on y, i.e. the slope; formulas reproduced below)
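
For reference, the standard bivariate least-squares formulas (Gujarati-style notation assumed, with beta_1 the intercept and beta_2 the slope):

    \hat\beta_2 = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2}
      \quad \text{(slope: effect of x on y)}
    \hat\beta_1 = \bar y - \hat\beta_2 \bar x
      \quad \text{(intercept)}
    \hat u_i = y_i - \hat\beta_1 - \hat\beta_2 x_i
      \quad \text{(residuals)}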
13
Gauss-Markov Theorem
  • The least-squares method produces best linear
    unbiased estimators (BLUE)
  • Also most efficient (minimum variance)
  • Provided the classical assumptions hold

14
Classical Assumptions
  • Focus on assumptions 3, 4, and 5 in Gujarati
  • Implications of violations for the estimators
  • Skim over 1, 2, and 6 through 10

15
3: Zero mean value of ui
  • Residuals are randomly distributed around the
    regression line
  • The expected value is zero for any given observation
    of x (stated formally below)
  • NOTE: Equivalent to assuming the model is fully
    specified
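
Stated formally (a standard rendering of this assumption):

    E(u_i \mid x_i) = 0 \quad \text{for every observation } i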

16–23
3: Zero mean value of ui (figure slides)
24
Violation of 3
  • Estimated betas will be:
  • Unbiased, but
  • Inconsistent
  • Inefficient
  • May arise from:
  • Systematic measurement error
  • Nonlinear relationships (the Phillips curve)

25
4: Homoscedasticity
  • The variance of the residuals is the same for all
    observations, irrespective of the value of x
  • Equal variance
  • NOTE: Together, 3 and 4 imply ui ~ (0, sigma^2)
    (see Normality Assumption; formal statement below)
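
A standard rendering of assumptions 3 and 4 taken together:

    \operatorname{var}(u_i \mid x_i) = E(u_i^2 \mid x_i) = \sigma^2 \quad \text{for all } i,
    \qquad \text{so } u_i \sim (0, \sigma^2)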

26–30
4: Homoscedasticity (figure slides)
31
Violation of 4
  • Estimated betas will be:
  • Unbiased
  • Consistent, but
  • Inefficient
  • Arise from:
  • Cross-sectional data

32
5: No autocorrelation
  • The correlation between any two residuals is zero
  • The residual for xi is unrelated to the residual
    for xj (stated formally below)
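
Stated formally (a standard rendering):

    \operatorname{cov}(u_i, u_j \mid x_i, x_j) = 0 \quad \text{for all } i \neq j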

33–36
5: No autocorrelation (figure slides)
37
Violations of 5
  • Estimated betas will be:
  • Unbiased
  • Consistent
  • Inefficient
  • Arise from:
  • Time-series data
  • Spatial correlation

38
Other Assumptions (1)
  • Assumption 6: zero covariance between xi and ui
  • Violations are a cause of heteroscedasticity
  • Hence they also violate 4
  • Assumption 9: the model is correctly specified
  • Violations may violate 1 (linearity)
  • May also violate 3 (omitted variables?)

39
Other Assumptions (2)
  • Assumption 7: n must be greater than the number of
    parameters to be estimated
  • Key in multivariate regression
  • King, Keohane, and Verba's (1994) critique of
    small-n designs

40
Normality Assumption
  • The distribution of the disturbance is unknown
  • Normality is necessary for hypothesis testing on
    the I.V.s
  • The estimates are a function of ui
  • The assumption of normality is necessary for
    inference
  • Equivalent to assuming the model is completely
    specified

41
Normality Assumption
  • Central Limit Theorem MMs
  • A linear transformation of a normal variable is
    itself normal
  • Simple distribution (mu, sigma)
  • Small samples

42
Assumptions, Distilled
  • Linearity
  • DV is continuous, interval-level
  • Non-stochastic: no correlation between
    independent variables
  • Residuals are independently and identically
    distributed (iid)
  • Mean of zero
  • Constant variance

43
If so, . . .
  • The least-squares method produces BLUE estimators
    (a quick numerical check follows)
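
A quick numerical sketch of the least-squares estimates; the data are made up for illustration, and numpy.polyfit is used only as an independent check:

    import numpy as np

    # Illustrative data: y depends linearly on x plus noise (not from the lecture)
    rng = np.random.default_rng(0)
    x = np.arange(20, dtype=float)
    y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, size=x.size)

    # Textbook least-squares formulas
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # slope
    b1 = y.mean() - b2 * x.mean()                                               # intercept

    # Independent check: a degree-1 polynomial fit returns (slope, intercept)
    slope, intercept = np.polyfit(x, y, 1)
    print(b2, slope)        # should agree
    print(b1, intercept)    # should agree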

44
Goodness of Fit
  • How well the least-squares regression line fits
    the observed data
  • Alternatively how well the function describes
    the effect of x on y
  • How much of the observed variation in y have we
    explained?

45
Coefficient of determination
  • Commonly referred to as r2
  • Simply, the ratio of the explained variation in y to
    the total variation in y (defined below)
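
In symbols (with the sums of squares as defined on the next slides):

    r^2 = \frac{\text{ESS}}{\text{TSS}} = 1 - \frac{\text{RSS}}{\text{TSS}},
    \qquad 0 \le r^2 \le 1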

46
Components of variation
(Figure: total variation in y decomposed into explained
and residual components)
47
Components of variation
  • TSS: total sum of squares
  • ESS: explained sum of squares
  • RSS: residual sum of squares (formulas below)
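
The standard definitions and the identity linking them (with ŷ_i the fitted value and ȳ the mean of y):

    \text{TSS} = \sum_i (y_i - \bar y)^2, \quad
    \text{ESS} = \sum_i (\hat y_i - \bar y)^2, \quad
    \text{RSS} = \sum_i (y_i - \hat y_i)^2,
    \qquad \text{TSS} = \text{ESS} + \text{RSS}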

48
Hypothesis Testing
  • Confidence Intervals
  • Tests of significance
  • ANOVA
  • Alpha versus p-value

49
Confidence Intervals
  • Two components:
  • Estimate
  • Expression of uncertainty
  • Interpretation:
  • Gujarati, p. 121: the probability of
    constructing an interval that contains Beta is
    1 - alpha
  • NOT: the probability that Beta is in the interval
    is 1 - alpha

50
C.I.s for regression
  • Depend upon our knowledge of, or assumptions about,
    the sampling distribution
  • The width of the interval is proportional to the
    standard error of the estimators
  • Typically we assume:
  • The t distribution for the Betas
  • The chi-square distribution for variances
  • Because the true standard error is unknown
    (interval formula below)
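
The resulting interval for the slope, in the usual form (standard error estimated from the data, t with n - 2 degrees of freedom):

    \hat\beta_2 \pm t_{\alpha/2,\, n-2} \cdot \widehat{\operatorname{se}}(\hat\beta_2),
    \qquad
    \Pr\!\left[ \hat\beta_2 - t_{\alpha/2}\,\widehat{\operatorname{se}}(\hat\beta_2)
      \le \beta_2 \le
      \hat\beta_2 + t_{\alpha/2}\,\widehat{\operatorname{se}}(\hat\beta_2) \right] = 1 - \alpha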

51
Confidence Intervals in IR
  • Examples?

52
The worst weatherman in the world
  • Three-degree guarantee
  • If his forecast high is off by more than three
    degrees, someone wins an umbrella
  • Woo hoo

53
How Many Umbrellas?
  • Data: mean daily temperature in February for
    Washington, DC
  • Daily observations from 1995 to 2005 (n = 311)
  • Mean = 47.91 degrees F
  • Standard deviation = 10.58
  • The interval: +/- 3.5 degrees F
  • Due to rounding
  • Note the spread of seven (eight?) degrees

54
The t value
  • We don't know the true standard error
  • Alpha = level of confidence
  • Assume the t distribution (a sketch of the
    arithmetic follows)
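
A sketch of the arithmetic the slides appear to use, assuming the 3.5-degree band is treated as an interval around the sample mean (standard error = s / sqrt(n)); exact values depend on rounding, so this reproduces the logic rather than the slide's reported t of 5.62:

    from math import sqrt
    from scipy import stats

    n, s = 311, 10.58                   # sample size and standard deviation from the slide
    half_width = 3.5                    # the +/- 3.5 degree F guarantee
    se = s / sqrt(n)                    # standard error of the mean

    t_val = half_width / se             # roughly 5.8
    tail = stats.t.sf(t_val, df=n - 1)  # one-sided tail probability
    print(t_val, 1 / tail)              # implied days between umbrella give-aways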

55
The answer
  • From the t table:
  • Tom will give away an umbrella, on average, about
    once every 26,695,141 days.
  • Thanks, Tom.

56
Tests of Significance
  • A hypothesis about a point value rather than an
    interval
  • Does the observed sample value differ from the
    hypothesized value?
  • Null hypothesis (H0): no difference
  • Alternative hypothesis (Ha): a significant
    difference

57
Regression Interpretation
  • Is the hypothesized causal effect (beta)
    significantly different from zero?
  • H0: no effect (β = 0)
  • Ha: an effect (β ≠ 0)
  • The zero null hypothesis (test statistic below)
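
The test statistic for the zero null, in the usual form:

    t = \frac{\hat\beta_2 - 0}{\widehat{\operatorname{se}}(\hat\beta_2)},
    \qquad \text{reject } H_0 \text{ if } |t| > t_{\alpha/2,\, n-2}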

58
Two-tail v. One-tail tests
  • Two-tail:
  • Ha is not concerned with the direction of the
    difference
  • Exploratory
  • Theory in disagreement
  • Critical regions in both tails
  • One-tail:
  • Ha specifies a direction of effect
  • Theory well developed
  • Critical region in one tail only

59
The 2-t rule
  • Gujarati, p. 134: the zero null hypothesis can be
    rejected if |t| > 2
  • d.f. > 20
  • Level of significance = 0.05
  • Recall Weatherman Tom: t = 5.62!

60
Alpha versus p-values
  • Alpha:
  • Conventional
  • Findings reported at 0.10, 0.05, or 0.01
  • Accessible, intuitive
  • Arbitrary
  • Makes assumptions about Type I and Type II errors
  • P-value:
  • The lowest significance level at which the null
    hypothesis can be rejected
  • Widely accepted today
  • Know your readers!

61
ANOVA
  • Intuitively similar to r2
  • Identical output for bivariate regression
  • A good test of the zero null hypothesis
  • In multivariate regression, tests the joint null
    hypothesis that all betas are zero
  • Check the F statistic before checking the betas!
    (see the sketch below)
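
A minimal sketch using statsmodels (made-up data); it prints the regression F statistic and shows that, in the bivariate case, F equals the squared t statistic of the slope:

    import numpy as np
    import statsmodels.api as sm

    # Illustrative data, not from the lecture
    rng = np.random.default_rng(1)
    x = rng.normal(size=100)
    y = 1.0 + 0.5 * x + rng.normal(size=100)

    model = sm.OLS(y, sm.add_constant(x)).fit()

    print(model.fvalue)              # ANOVA F statistic for the regression
    print(model.tvalues[1] ** 2)     # equals F in the bivariate case
    print(model.f_pvalue)            # check this before interpreting the betas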

62
Limits of ANOVA
  • Harder to interpret
  • Does not provide information on direction or
    magnitude of effect for independent variables

63
ANOVA output from SPSS