Class 4 Ordinary Least Squares - PowerPoint PPT Presentation

1
Class 4: Ordinary Least Squares
CERAM February-March-April 2008
  • Lionel Nesta
  • Observatoire Français des Conjonctures
    Economiques
  • Lionel.nesta_at_ofce.sciences-po.fr

2
Introduction to Regression
  • Ideally, the social scientist is interested not
    only in the intensity of a relationship, but also
    in quantifying how much one variable changes when
    another variable changes by one unit.
  • Regression analysis is a technique that examines
    the relation of a dependent variable to one or
    more independent (explanatory) variables.
  • Simple regression: y = f(X)
  • Multiple regression: y = f(X, Z)
  • Let us start with simple regressions.

3-7
Scatter Plot of Fertilizer and Production
8
Objective of Regression
  • It is time to ask: what is a good fit?
  • A good fit is what makes the error small
  • The best fit is what makes the error smallest
  • Three candidates:
      • To minimize the sum of all errors
      • To minimize the sum of absolute values of errors
      • To minimize the sum of squared errors

9
To minimize the sum of all errors
[Figure: plot of Y against X]
10
To minimize the sum of absolute values of errors
[Figure: plot of Y against X, with labels 1, 2, 1]
11
To minimize the sum of squared errors
[Figure: plot of Y against X]
12
To minimize the sum of squared errors
  • Overcomes the sign problem
  • Goes through the middle point (the point of means)
  • Squaring emphasizes large errors
  • Easily manageable
  • Has a unique minimum
  • Has a unique, and best, solution
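
The three candidate criteria can be compared on a small example. The sketch below uses made-up data and hypothetical helper names (`sum_errors`, `sum_abs_errors`, `sum_sq_errors`); it contrasts the least-squares line with a deliberately poor flat line, assuming the error is measured as the vertical distance y - (a + b x).

```python
# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]


def errors(a, b):
    """Errors y_i - (a + b * x_i) for a candidate line."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def sum_errors(a, b):
    return sum(errors(a, b))


def sum_abs_errors(a, b):
    return sum(abs(e) for e in errors(a, b))


def sum_sq_errors(a, b):
    return sum(e * e for e in errors(a, b))


x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Least-squares line (closed form for a single regressor).
b_ols = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
a_ols = y_bar - b_ols * x_bar

# A deliberately poor candidate: a flat line through the mean of y.
a_flat, b_flat = y_bar, 0.0

for name, (a, b) in {"OLS": (a_ols, b_ols), "flat": (a_flat, b_flat)}.items():
    print(
        name,
        round(sum_errors(a, b), 6),
        round(sum_abs_errors(a, b), 3),
        round(sum_sq_errors(a, b), 3),
    )
```

Both lines have a sum of raw errors that is numerically zero, because positive and negative errors cancel out. Only the absolute and squared criteria tell the two lines apart, and the least-squares line passes through the point of means.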

13
Scatter Plot of Fertilizer and Production
14-17
Scatter Plot of R&D and Patents (log)
18
The Simple Regression Model
  • y_i: Dependent variable (to be explained)
  • x_i: Independent variable (explanatory)
  • α: First parameter of interest
  • β: Second parameter of interest
  • ε_i: Error term
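
Put together, these symbols give the model in equation form. This is a sketch assuming the usual linear specification, with α as the intercept and β as the slope; the hats (estimates) and the residual e_i are notation introduced here for illustration.

```latex
\begin{align*}
y_i &= \alpha + \beta x_i + \varepsilon_i, \qquad i = 1, \dots, N \\
\hat{y}_i &= \hat{\alpha} + \hat{\beta} x_i \\
e_i &= y_i - \hat{y}_i
\end{align*}
```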

19
The Simple Regression Model
20
To minimize the sum of squared errors
21
To minimize the sum of squared errors
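
A sketch of the standard derivation, assuming the simple linear model y_i = α + β x_i + ε_i with N observations: the sum of squared errors S is minimized by setting its two partial derivatives to zero and solving for the estimates.

```latex
\begin{align*}
S(\hat{\alpha}, \hat{\beta}) &= \sum_{i=1}^{N} \bigl(y_i - \hat{\alpha} - \hat{\beta} x_i\bigr)^2 \\
\frac{\partial S}{\partial \hat{\alpha}} &= -2 \sum_{i=1}^{N} \bigl(y_i - \hat{\alpha} - \hat{\beta} x_i\bigr) = 0 \\
\frac{\partial S}{\partial \hat{\beta}} &= -2 \sum_{i=1}^{N} x_i \bigl(y_i - \hat{\alpha} - \hat{\beta} x_i\bigr) = 0 \\
\hat{\beta} &= \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2},
\qquad \hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}
\end{align*}
```

The first condition says that the residuals sum to zero, which is why the fitted line goes through the point of means.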
22
Application to CERAM_BIO Data using Excel
23
Application to CERAM_BIO Data using Excel
24
Interpretation
  • When the log of R&D (per asset) increases by one
    unit, the log of patents per asset increases by
    1.748.
  • Remember! A change in the log of x is a relative
    change of x itself.
  • A 1% increase in R&D (per asset) entails a 1.748%
    increase in the number of patents (per asset).
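
The elasticity reading of a log-log slope is an approximation that works well for small changes. The short check below takes the slope of 1.748 from the slide and computes the exact implied change; the function name `pct_change_in_y` is a hypothetical helper, not part of any package.

```python
import math

beta = 1.748  # slope reported on the slide (log-log specification)


def pct_change_in_y(pct_change_in_x, elasticity):
    """Exact % change in y implied by a log-log slope for a given % change in x."""
    return (math.exp(elasticity * math.log(1 + pct_change_in_x / 100)) - 1) * 100


small = pct_change_in_y(1, beta)   # close to 1.748 for a 1% change
large = pct_change_in_y(50, beta)  # well above the linear guess 50 * 1.748 = 87.4
print(round(small, 3), round(large, 1))
```

For a 1% change the exact figure is very close to 1.748%, while for a large change such as 50% the simple multiplication understates the effect.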

25
Application to Data using SPSS
Analyse → Régression → Linéaire (Analyze → Regression → Linear)
26
Assessing the Goodness of Fit
  • It is important to ask whether a specification
    provides a good prediction of the dependent
    variable, given values of the independent
    variable.
  • Ideally, we want an indicator of the proportion
    of variance of the dependent variable that is
    accounted for, or explained, by the statistical
    model.
  • This indicator is built from the variance of the
    predictions (ŷ) and the variance of the residuals
    (e), since by construction both sum to the overall
    variance of the dependent variable (y).

27
Overall Variance
28
Decomposing the overall variance (1)
29
Decomposing the overall variance (2)
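
In sums-of-squares form, the decomposition can be sketched as follows for a least-squares fit that includes an intercept. SSfit is the name used later in the deck; SStot and SSres are labels assumed here for the total and residual sums of squares.

```latex
\begin{align*}
SS_{tot} &= \sum_{i=1}^{N} (y_i - \bar{y})^2 \\
SS_{fit} &= \sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2 \\
SS_{res} &= \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} e_i^2 \\
SS_{tot} &= SS_{fit} + SS_{res}
\end{align*}
```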
30
Coefficient of determination R²
  • R² is a statistic which provides information on
    the goodness of fit of the model: the share of
    the variance of the dependent variable that the
    model accounts for.
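
As a minimal sketch, R² can be computed by hand as the fitted sum of squares over the total sum of squares. The data below are made up, and the variable names (`ss_tot`, `ss_fit`, `ss_res`) are chosen here for illustration.

```python
# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 2.9, 4.1, 4.8, 7.1, 7.9]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for a single regressor.
beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
alpha_hat = y_bar - beta_hat * x_bar
y_hat = [alpha_hat + beta_hat * xi for xi in x]

# Sums of squares: total, fitted, residual.
ss_tot = sum((yi - y_bar) ** 2 for yi in y)
ss_fit = sum((fi - y_bar) ** 2 for fi in y_hat)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))

r2 = ss_fit / ss_tot  # equivalently 1 - ss_res / ss_tot
print(f"R2 = {r2:.4f}")
```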

31
Fisher's F Statistic
  • Fisher's statistic is relevant as a form of
    ANOVA on SSfit: it tells us whether the
    regression model brings significant (in a
    statistical sense) information.

Model      SS (1)   df (2)      MSS (3) = (1)/(2)      F
Fitted     SSfit    p           SSfit / p              MSSfit / MSSres
Residual   SSres    N - p - 1   SSres / (N - p - 1)
Total      SStot    N - 1
p: number of parameters (excluding the intercept); N: number of observations
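
The ANOVA table can be filled in by hand for a simple regression. The sketch below assumes one explanatory variable (p = 1), made-up data, and a hypothetical helper `anova_table`; it returns the F ratio but not its p-value, which would require the F distribution.

```python
def anova_table(x, y):
    """ANOVA quantities for a simple regression of y on x (one slope, p = 1)."""
    p = 1
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    alpha_hat = y_bar - beta_hat * x_bar
    y_hat = [alpha_hat + beta_hat * xi for xi in x]

    ss_fit = sum((fi - y_bar) ** 2 for fi in y_hat)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ms_fit = ss_fit / p            # fitted mean square, df = p
    ms_res = ss_res / (n - p - 1)  # residual mean square, df = N - p - 1
    return {
        "ss_fit": ss_fit,
        "ss_res": ss_res,
        "ms_fit": ms_fit,
        "ms_res": ms_res,
        "F": ms_fit / ms_res,
    }


# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 2.9, 4.1, 4.8, 7.1, 7.9]

table = anova_table(x, y)
print(f"F = {table['F']:.2f}")
```

A large F means the fitted mean square is large relative to the residual mean square, so the model explains much more variation than would be expected by chance.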
32
Application to Data using SPSS
Analyse → Régression → Linéaire (Analyze → Regression → Linear)
33
What the R² is not
  • R² does not tell us whether:
      • Independent variables are a true cause of the
        changes in the dependent variable
      • The correct regression was used
      • The most appropriate set of independent variables
        has been chosen
      • There is collinearity present in the data
      • The model could be improved by using transformed
        versions of the existing set of independent
        variables

34
Inference on β
  • We have obtained an estimate, β̂, of β.
  • Therefore we must test whether the estimated
    parameter is significantly different from 0, and,
    as a consequence, we must say something about the
    distribution (the mean and variance) of the
    estimator of the true but unobserved β.

35
The mean and variance of β̂
  • It is possible to show that β̂ is a good
    approximation, i.e. an unbiased estimator, of the
    true parameter β.
  • The variance of β̂ is defined as the ratio of the
    mean square of errors over the sum of squared
    deviations of the explanatory variable from its
    mean.
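
In symbols, and assuming a simple regression with N observations (so that the residual mean square has N - 2 degrees of freedom), the two statements can be sketched as:

```latex
\begin{align*}
E(\hat{\beta}) &= \beta \\
\widehat{\operatorname{Var}}(\hat{\beta}) &= \frac{MS_{res}}{\sum_{i=1}^{N} (x_i - \bar{x})^2},
\qquad MS_{res} = \frac{\sum_{i=1}^{N} e_i^2}{N - 2}
\end{align*}
```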

36
The confidence interval of ß
  • We must now define the confidence interval of β
    at 95%. To do so, we use the mean and variance of
    β̂ and define the t value as follows:
  • Therefore, the 95% confidence interval of β is:

If the 95% CI does not include 0, then β is
significantly different from 0.
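
A sketch of the usual expressions, assuming a simple regression with N - 2 residual degrees of freedom. The critical value is written t_{0.025, N-2} to avoid confusion with the intercept α.

```latex
\begin{align*}
t &= \frac{\hat{\beta} - \beta}{\sqrt{\widehat{\operatorname{Var}}(\hat{\beta})}} \sim t_{N-2} \\
CI_{95\%}(\beta) &= \hat{\beta} \pm t_{0.025,\,N-2}\,\sqrt{\widehat{\operatorname{Var}}(\hat{\beta})}
\end{align*}
```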
37
Student t Test for β
  • We are also in a position to make inferences
    about β
  • H0: β = 0
  • H1: β ≠ 0

Decision rule: accept H0 if |t| < t_{α/2}; reject H0
if |t| ≥ t_{α/2}.
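
The test and the confidence interval can be worked through by hand on a small sample. This is a sketch on made-up data with 10 observations; the critical value 2.306 is the two-sided 5% value of Student's t with 8 degrees of freedom, copied from a standard t table rather than computed.

```python
import math

# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Least-squares estimates.
beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
alpha_hat = y_bar - beta_hat * x_bar

# Residual mean square and standard error of the slope.
ss_res = sum((yi - (alpha_hat + beta_hat * xi)) ** 2 for xi, yi in zip(x, y))
ms_res = ss_res / (n - 2)
se_beta = math.sqrt(ms_res / s_xx)

# t statistic under H0: beta = 0.
t_stat = beta_hat / se_beta

# Critical value from a t table; valid only for n - 2 = 8 degrees of freedom.
T_CRIT = 2.306
assert n - 2 == 8

ci_low = beta_hat - T_CRIT * se_beta
ci_high = beta_hat + T_CRIT * se_beta
reject_h0 = abs(t_stat) >= T_CRIT

print(f"beta_hat = {beta_hat:.3f}, t = {t_stat:.1f}")
print(f"95% CI = [{ci_low:.3f}, {ci_high:.3f}], reject H0: {reject_h0}")
```

The two decision rules agree: H0 is rejected exactly when the 95% confidence interval excludes 0.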
38
Application to Data using SPSS
Analyse → Régression → Linéaire (Analyze → Regression → Linear)
39
Assignments on CERAM_BIO
  • Regress the number of patents on R&D expenses and
    consider:
      1. The quality of the fit
      2. The significance and direction of R&D expenses
      3. The interpretation of the result in an economic
         sense
  • Repeat steps 1 to 3 using:
      • R&D expenses divided by one million (you need to
        generate a new variable for that)
      • The log of R&D expenses
  • What do you observe? Why?