Welcome to Econ 420 Applied Regression Analysis - PowerPoint PPT Presentation

Slides: 22
Provided by: khora
Transcript and Presenter's Notes

1
Welcome to Econ 420 Applied Regression Analysis
  • Study Guide
  • Week Two
  • Ending Sunday, September 9
  • (Note: You must go over these slides and complete
    every task outlined here by the end of the day on
    September 8.)

2
Last week
  • I asked you to report your heights and weights
    before Sunday, September 2
  • That meant by the end of the day on Saturday,
    September 1.
  • I did not hear from 4 of the students who are
    registered in this class
  • Remember that this affects your grade

3
Here is our sample data on height and weight.

4
Assignment 1 (carries 30 points and is due before
noon on Thursday, September 6)
  • 1. Use the data set on the previous slide and the
    formulas on Page 8 (1-5 and 1-6) to estimate the
    coefficients $\beta_0$ and $\beta_1$ in the equation below
  • $W = \beta_0 + \beta_1 H$
  • Make sure to show your work.
  • Do the estimated coefficients make sense to you?
  • What is the meaning of the estimated
    coefficients?

5
Assignment 1 continued
  • 2. Answer Question 5 on Page 15
  • 3. Answer Question 8 on Page 15
  • Type your answers and send them to me as an email
    attachment. Remember that I have an old version
    of Word (2003). If you are using a newer version
    of Word, you will need to save your work in the
    old format.

6
Note
  • The following notes do not take the place of the
    discussions covered in your textbook
  • First read the book
  • Then look at the notes

7
Total, Explained and Residual Sum of Squares
(pp. 11-13)
  • Remember our height/weight example
  • What is the average weight of the class?
  • Duplicate the graph on Page 12 where Y is the
    weight and X is the height
  • The Fitted Line will be upward sloping
  • The Average Line (average weight) will be
    horizontal

8
Suppose instead of using the fitted line to
predict someone's weight we use the average line
  • $Y$ is the actual weight of a person.
  • $\hat{Y}$ is the predicted weight according to the
    fitted line.
  • $\bar{Y}$ is the average weight in the sample.
  • $(Y - \bar{Y})$ is how much the weight of a given
    individual differs from the average.
  • $(\hat{Y} - \bar{Y})$ is how much closer our fitted line
    gets to the actual weight than the average line does.
  • $(Y - \hat{Y})$ is our residual
  • The portion of the weight that was not predicted
    (explained) by our fitted line

9
Remember we have 8 observations in our sample
  • Some of our weights are below average and some
    are above average.
  • Look at Equation 1-8, Page 12
  • The reason why we square $(Y - \bar{Y})$, $(\hat{Y} - \bar{Y})$,
    and $(Y - \hat{Y})$ is that we do not want the
    positive differences to cancel the negative
    differences
  • Note: the best fitted line will be the one with
    the lowest $\sum (Y - \hat{Y})^2$
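
The split into total, explained, and residual sums of squares can be checked numerically. The sketch below is only an illustration: the heights and weights are made up (they are NOT the class data), and the slope and intercept use the standard single-regressor OLS formulas, which are assumed here to match the ones in the textbook.

```python
# Hypothetical heights (inches) and weights (pounds); NOT the class sample.
heights = [60.0, 62.0, 64.0, 66.0, 68.0, 70.0, 72.0, 74.0]
weights = [110.0, 120.0, 135.0, 140.0, 155.0, 170.0, 175.0, 195.0]


def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    """Return (intercept, slope) of the OLS line with one regressor."""
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sums_of_squares(x, y):
    """Return (TSS, ESS, RSS) for the fitted line."""
    b0, b1 = fit_line(x, y)
    y_bar = mean(y)
    fitted = [b0 + b1 * xi for xi in x]
    tss = sum((yi - y_bar) ** 2 for yi in y)               # actual vs. average
    ess = sum((fi - y_bar) ** 2 for fi in fitted)          # fitted vs. average
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # actual vs. fitted
    return tss, ess, rss


b0, b1 = fit_line(heights, weights)
tss, ess, rss = sums_of_squares(heights, weights)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"TSS = {tss:.2f}, ESS = {ess:.2f}, RSS = {rss:.2f}")
```

With any data set, TSS should equal ESS + RSS up to rounding, and the fitted line is the one that makes RSS as small as possible.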

10
Multiple Regression Model (Chapter 2, pp. 20-29)
  • Is height the only factor affecting weight?
  • Of course not.
  • What are some other factors affecting an
    individual's weight?
  • Age
  • Calorie intake per day

11
So a better model will be
  • $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + e$
  • Where $Y$ is weight and $X_1$ through $X_3$ are Height,
    Age, and Calorie intake.
  • We will use EViews to estimate the coefficients
    of a multiple regression model.

12
The meaning of the estimated coefficients
  • Our estimated equation will be
  • $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \hat{\beta}_3 X_3$
  • Bonus: Can someone tell me why I didn't put an
    e at the end of the above equation?
  • $\hat{\beta}_1$ measures the effect of one more inch of
    height on weight, holding the age and the calorie
    intake constant and ignoring the effect of all
    other variables on weight.
  • Similarly, $\hat{\beta}_2$ measures the effect of one more
    year of age on weight, holding the height and
    the calorie intake constant and ignoring the
    effect of all other variables on weight.
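
The course uses EViews for estimation. Purely as an illustration of what the coefficients mean, here is a minimal sketch in Python with NumPy instead. All numbers are hypothetical: the weights are generated from made-up coefficients with no error term, so the fit should simply recover them.

```python
import numpy as np

# Hypothetical data for 8 people; NOT the class sample.
height = np.array([60, 62, 64, 66, 68, 70, 72, 74], dtype=float)      # inches
age = np.array([25, 40, 31, 52, 22, 45, 36, 29], dtype=float)         # years
calories = np.array([1800, 2100, 2000, 2500, 2200, 2600, 2400, 3000],
                    dtype=float)                                       # per day

# Made-up "true" coefficients: intercept, height, age, calorie intake.
true_beta = np.array([-200.0, 4.0, 0.5, 0.02])

# The column of ones carries the intercept.
X = np.column_stack([np.ones_like(height), height, age, calories])
weight = X @ true_beta  # no error term, so OLS should recover true_beta

beta_hat, _, rank, _ = np.linalg.lstsq(X, weight, rcond=None)

# Same hypothetical person, one inch taller, age and calories held constant.
person = np.array([1.0, 66.0, 30.0, 2200.0])
taller = person + np.array([0.0, 1.0, 0.0, 0.0])
effect_of_one_inch = taller @ beta_hat - person @ beta_hat

print("estimated coefficients:", np.round(beta_hat, 4))
print("predicted change from one more inch:", round(float(effect_of_one_inch), 4))
```

The predicted change from one extra inch equals the height coefficient, which is exactly the "holding the other variables constant" reading of that coefficient.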

13
How big should the sample be?
  • The bigger the sample, the closer $\hat{\beta}$ will be
    to $\beta$.
  • Rule of thumb: Degrees of Freedom > 30
  • Degrees of Freedom = n - k - 1
  • Where n is the sample size and k is the number
    of independent variables.
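
As a quick worked example of the formula, the sketch below applies it to a sample of 8 observations, the size of our height/weight sample. Using the same 8 observations for the three-variable model is an assumption made only for illustration.

```python
def degrees_of_freedom(n, k):
    """n = sample size, k = number of independent variables.

    The extra 1 accounts for the intercept.
    """
    return n - k - 1


simple_df = degrees_of_freedom(8, 1)    # weight on height only
multiple_df = degrees_of_freedom(8, 3)  # weight on height, age, calorie intake

print(simple_df, multiple_df)
```

Both values are far below the rule of thumb of 30, so a sample this small is useful for practice only.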

14
The Classical Assumptions
  • Assumptions that have to be met in order for OLS
    to give us the best estimators.

15
Assumption 1
  • The regression equation
  • Is linear in the coefficients (it does not have to
    be linear in the variables)
  • Is correctly specified (right functional form, no
    omitted variables, no irrelevant variables)
  • Has an additive error term

16
Assumption 2
  • No two or more independent variables are
    perfectly correlated with each other.
  • If violated → Perfect Multicollinearity
  • Example
  • Consumption = f(inflation, real interest rate,
    nominal interest rate, ...)
  • Since real interest rate = nominal interest rate
    - inflation,
  • The 3 independent variables are perfectly and
    linearly correlated with each other. When one
    independent variable changes, the others change
    too. OLS cannot capture the effect of one
    variable in isolation
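
One way to see the problem is to look at the rank of the data matrix. The sketch below uses hypothetical inflation and nominal interest rates (in percent) and builds the real rate from them. With all three variables included, the matrix has fewer independent columns than coefficients to estimate, so OLS has no unique solution.

```python
import numpy as np

# Hypothetical rates in percent; not real data.
inflation = np.array([2.0, 3.5, 1.0, 4.0, 2.5, 5.0])
nominal = np.array([5.0, 6.0, 4.5, 8.0, 4.0, 9.0])
real = nominal - inflation  # exact linear relationship

ones = np.ones_like(inflation)

# Intercept plus all three rates: 4 columns, but one is redundant.
X_all = np.column_stack([ones, inflation, real, nominal])
rank_all = np.linalg.matrix_rank(X_all)

# Dropping one of the three rates removes the redundancy.
X_dropped = np.column_stack([ones, inflation, nominal])
rank_dropped = np.linalg.matrix_rank(X_dropped)

print("columns:", X_all.shape[1], "rank:", rank_all)
print("columns:", X_dropped.shape[1], "rank:", rank_dropped)
```

The usual fix is the one shown in the second matrix: drop one of the perfectly related variables.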

17
Assumption 3
  • No correlation between the explanatory
    (independent) variables and the error term
  • What if it is violated?
  • Example: Salary = f(Education, ..., GPA)
  • What if people with low GPA lie about their GPAs?
  • When GPA is low, the error is always positive
  • Problem: OLS attributes the variation in salary
    to the variation in GPA while it is in part
    caused by the variation in error.
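
A small simulation can show what goes wrong. This is a generic sketch, not the GPA example itself: the variable names, the true slope of 2, and the strength of the correlation are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_slope = 2.0

error = rng.normal(size=n)
x_exog = rng.normal(size=n)        # unrelated to the error term
x_endog = x_exog + 0.8 * error     # correlated with the error term


def ols_slope(x, y):
    """Slope of the OLS line with one regressor."""
    x_dev = x - x.mean()
    return (x_dev @ (y - y.mean())) / (x_dev @ x_dev)


# Same true model in both cases; only the regressor differs.
slope_ok = ols_slope(x_exog, 1.0 + true_slope * x_exog + error)
slope_biased = ols_slope(x_endog, 1.0 + true_slope * x_endog + error)

print(f"regressor uncorrelated with error: slope = {slope_ok:.3f}")
print(f"regressor correlated with error:   slope = {slope_biased:.3f}")
```

When the regressor moves together with the error, OLS credits the regressor with part of the error's effect, so the estimated slope lands well above the true value.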

18
Assumption 4
  • The error terms are uncorrelated with each other
  • What if it is violated?
  • Then we have an autocorrelation (serial correlation)
    problem
  • Example: Consumption = f(..., income)
  • Suppose we use time series data on the US economy
    to estimate the above model.
  • Suppose that in 5 years of our study there was a
    war and consumption dropped significantly even
    though income didn't. So, we will get negative
    errors during those years and they all seem to be
    correlated with each other.
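
The war story can be sketched with made-up numbers. Below, income grows steadily for 20 hypothetical years and consumption follows it, except for a fixed drop during 5 consecutive "war" years. None of these figures come from real US data.

```python
# Hypothetical annual data; years 8 to 12 are the "war" years.
years = list(range(20))
war_years = set(range(8, 13))
income = [100.0 + 5.0 * t for t in years]
# Consumption follows income, minus a drop of 15 during the war.
consumption = [10.0 + 0.8 * inc - (15.0 if t in war_years else 0.0)
               for t, inc in zip(years, income)]


def fit_line(x, y):
    """Return (intercept, slope) of the OLS line with one regressor."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


b0, b1 = fit_line(income, consumption)
residuals = [c - (b0 + b1 * inc) for c, inc in zip(consumption, income)]
war_residuals = [residuals[t] for t in sorted(war_years)]

# Lag-1 autocorrelation: how strongly each residual resembles the previous one.
lag1 = (sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
        / sum(e * e for e in residuals))

print("war-year residuals:", [round(e, 2) for e in war_residuals])
print("lag-1 autocorrelation of residuals:", round(lag1, 3))
```

The war-year residuals are all negative and sit next to each other in time, which is what produces the positive correlation between neighboring errors.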

19
Assumption 5
  • The error term must have a zero mean
  • What if this assumption is violated?
  • This is not a big deal: the intercept will pick
    up the mean of the error term
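
A short simulation illustrates the point. In this sketch the true intercept is 2, the true slope is 3, and the error term has a mean of 5 instead of 0; all three values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

x = rng.uniform(0.0, 10.0, size=n)
error = rng.normal(loc=5.0, scale=1.0, size=n)  # mean is 5, not 0
y = 2.0 + 3.0 * x + error

# np.polyfit returns the highest-degree coefficient first: slope, then intercept.
slope, intercept = np.polyfit(x, y, 1)

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

The slope estimate stays close to 3, while the intercept estimate lands near 7, that is, the true intercept plus the mean of the error.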

20
Assumption 6
  • The error term has a constant variance
  • What if it is violated?
  • Problem of Heteroskedasticity
  • Example: Consumption = f(..., income)
  • Suppose we use cross section data on various
    individuals to estimate the above model.
  • People with low levels of income will probably
    spend most of their income. (The variance of the
    error is small)
  • People with high levels of income may spend
    anywhere between 10 and 99 percent of their income. (The
    variance of the error is high.) (Figure 2-1)
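
The pattern can be reproduced with simulated data. In the sketch below, the spread of the error grows in proportion to income; the income range, the coefficients, and the 10 percent scaling are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

income = rng.uniform(20.0, 200.0, size=n)  # hypothetical, in thousands
# Error standard deviation is 10 percent of income, so it grows with income.
error = rng.normal(0.0, 1.0, size=n) * 0.1 * income
consumption = 5.0 + 0.6 * income + error

# np.polyfit returns slope first, then intercept.
slope, intercept = np.polyfit(income, consumption, 1)
residuals = consumption - (intercept + slope * income)

# Compare the spread of residuals for the lower and upper half of incomes.
low = income < np.median(income)
var_low = residuals[low].var()
var_high = residuals[~low].var()

print(f"residual variance, low incomes:  {var_low:.1f}")
print(f"residual variance, high incomes: {var_high:.1f}")
```

If the error variance were constant, the two numbers would be roughly equal; here the high-income group shows a much larger spread.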

21
Assumption 7 (Not Necessary)
  • The error term is normally distributed
  • What is a normal distribution?
  • Symmetric, continuous, bell shaped
  • Can be characterized by its mean and variance
  • Must know if it is violated
  • If violated, some statistical tests are not
    applicable
  • As the size of the sample goes up → the distribution
    becomes more normal