Introduction: The General Linear Model - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Introduction: The General Linear Model


1
Introduction: The General Linear Model
  • The General Linear Model is a phrase used to
    indicate a class of statistical models that
    includes simple linear regression analysis.
  • Regression is the predominant statistical tool
    used in the social sciences due to its simplicity
    and versatility.
  • Also called Linear Regression Analysis.

2
Simple Linear Regression: The Basic Mathematical
Model
  • Regression is based on the concept of the simple
    proportional relationship - also known as the
    straight line.
  • We can express this idea mathematically!
  • Theoretical aside: All theoretical statements of
    relationship imply a mathematical theoretical
    structure.
  • Just because it isn't explicitly stated doesn't
    mean that the math isn't implicit in the language
    itself!

3
Simple Linear Relationships
  • Alternate Mathematical Notation for the straight
    line - don't ask why!
  • 10th Grade Geometry
  • Statistics Literature
  • Econometrics Literature
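  • In standard notation these traditions write the same
    straight line in slightly different ways (the slides' own
    equations are images and are not reproduced in this
    transcript), for example:
    \[
      y = mx + b \;\text{(geometry)}, \qquad
      Y = a + bX + e \;\text{(statistics)}, \qquad
      Y = \beta_0 + \beta_1 X + \varepsilon \;\text{(econometrics)}
    \]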

4
Alternate Mathematical Notation for the Line
  • These are all equivalent. We simply have to live
    with this inconsistency.
  • We won't use the geometric tradition, and so you
    just need to remember that B0 and a are both the
    same thing.

5
Linear Regression: The Linguistic Interpretation
  • In general terms, the linear model states that
    the dependent variable is directly proportional
    to the value of the independent variable.
  • Thus if we state that some variable Y increases
    in direct proportion to some increase in X, we
    are stating a specific mathematical model of
    behavior - the linear model.

6
Linear Regression: A Graphic Interpretation
7
The linear model is represented by a simple
picture
8
The Mathematical Interpretation: The Meaning of
the Regression Parameters
  • a: the intercept
  • the point where the line crosses the Y-axis.
  • (the value of the dependent variable when all of
    the independent variables = 0)
  • b: the slope
  • the increase in the dependent variable per unit
    change in the independent variable (also known as
    the 'rise over the run')

9
The Error Term
  • Such models do not predict behavior perfectly.
  • So we must add a component to adjust or
    compensate for the errors in prediction.
  • Having fully described the linear model, the rest
    of the semester (as well as several more) will be
    spent on the error.
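  • In the notation above, a standard way to write the full
    model for observation i is
    \[
      Y_i = a + bX_i + e_i
    \]
    where e_i is the error term.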

10
The Nature of Least Squares Estimation
  • There is 1 essential goal and there are 4
    important concerns with any OLS Model

11
The 'Goal' of Ordinary Least Squares
  • Ordinary Least Squares (OLS) is a method of
    finding the linear model which minimizes the sum
    of the squared errors.
  • Such a model provides the best
    explanation/prediction of the data.
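  • As a minimal sketch (with invented data, not taken from
    the slides), the least-squares line and its sum of squared
    errors can be computed directly with NumPy:

    import numpy as np

    # Invented example data (not from the slides)
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # independent variable
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # dependent variable

    # np.polyfit returns [slope, intercept] for a degree-1 fit
    b, a = np.polyfit(x, y, 1)

    def sse(intercept, slope):
        """Sum of squared errors for the line Y = intercept + slope * X."""
        return float(np.sum((y - (intercept + slope * x)) ** 2))

    print(f"OLS line: Y = {a:.3f} + {b:.3f} X")
    print(f"SSE of the OLS line:      {sse(a, b):.4f}")
    print(f"SSE of an arbitrary line: {sse(0.0, 2.5):.4f}")  # never smaller than the OLS SSE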

12
Why Least Squared error?
  • Why not simply minimum error?
  • The errors about the line sum to 0.0!
  • Minimum absolute deviation (error) models now
    exist, but they are mathematically cumbersome.
  • Try algebra with Absolute Value signs!

13
Other models are possible...
  • Best parabola...?
  • (i.e. nonlinear or curvilinear relationships)
  • Best maximum likelihood model ... ?
  • Best expert system...?
  • Complex Systems?
  • Chaos models
  • Catastrophe models
  • others

14
The Simple Linear Virtue
  • I think we overemphasize the linear model.
  • It does, however, embody this rather important
    notion that Y is proportional to X.
  • We can state such relationships in simple
    English.
  • As unemployment increases, so does the crime rate.

15
The Notion of Linear Change
  • The linear aspect means that the same amount of
    increase in unemployment will have the same effect
    on crime at both low and high unemployment.
  • A nonlinear change would mean that as
    unemployment increased, its impact upon the crime
    rate might increase at higher unemployment levels.

16
Why squared error?
  • Because
  • (1) the sum of the errors expressed as deviations
    would be zero as it is with standard deviations,
    and
  • (2) some feel that big errors should be more
    influential than small errors.
  • Therefore, we wish to find the values of a and b
    that produce the smallest sum of squared errors.

17
Minimizing the Sum of Squared Errors
  • Who put the 'Least' in OLS?
  • In mathematical jargon we seek to minimize the
    Unexplained Sum of Squares (USS), where
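  • In standard notation, the quantity being minimized is
    \[
      USS \;=\; \sum_{i=1}^{n} e_i^{2}
           \;=\; \sum_{i=1}^{n} \bigl(Y_i - \hat{Y}_i\bigr)^{2}
           \;=\; \sum_{i=1}^{n} \bigl(Y_i - a - bX_i\bigr)^{2}
    \]
    (the slide's own formula appears as an image and is not
    reproduced here).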

18
The Parameter estimates
  • In order to do this, we must find parameter
    estimates which accomplish this minimization.
  • In calculus, if you wish to know when a function
    is at its minimum, you take the first
    derivative.
  • In this case we must take partial derivatives
    since we have two parameters (a and b) to worry
    about.
  • We will look closer at this - and it's not a
    pretty sight!
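  • Setting those partial derivatives to zero yields the
    familiar closed-form estimates (standard results; the
    slide's own formulas are not reproduced here):
    \[
      b \;=\; \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}
                   {\sum_i (X_i - \bar{X})^{2}},
      \qquad
      a \;=\; \bar{Y} - b\bar{X}
    \]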

19
Why squared error?
  • Because
  • (1) the sum of the errors expressed as
    deviations would be zero as it is with standard
    deviations, and
  • (2) some feel that big errors should be more
    influential than small errors.
  • Therefore, we wish to find the values of a and b
    that produce the smallest sum of squared errors.

20
Decomposition of the error in LS
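  • In standard notation, the decomposition splits each
    observation's total deviation from the mean into an
    explained and an unexplained part:
    \[
      (Y_i - \bar{Y}) \;=\; (\hat{Y}_i - \bar{Y}) \;+\; (Y_i - \hat{Y}_i)
    \]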
21
Sum of Squares Terminology
  • In mathematical jargon we seek to minimize the
    Unexplained Sum of Squares (USS), where

22
The Parameter estimates
  • In order to do this, we must find parameter
    estimates which accomplish this minimization.
  • In calculus, if you wish to know when a function
    is at its minimum, you take the first derivative.
  • In this case we must take partial derivatives
    since we have two parameters to worry about.

23
Tests of Inference
  • t-tests for coefficients
  • F-test for entire model

24
T-Tests
  • Since we wish to make probability statements
    about our model, we must do tests of inference.
  • Fortunately,
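  • A standard form of the test statistic (the slide's own
    formula is not reproduced here) is
    \[
      t \;=\; \frac{b}{\mathrm{s.e.}(b)}
    \]
    which, under the null hypothesis that the true slope is
    zero, follows a t distribution with n - 2 degrees of
    freedom in the simple (one-predictor) model.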

25
Goodness of Fit
  • Since we are interested in how well the model
    performs at reducing error, we need to develop a
    means of assessing that error reduction. Since
    the mean of the dependent variable represents a
    good benchmark for comparing predictions, we
    calculate the improvement in the prediction of Yi
    relative to the mean of Y (the best guess of Y
    with no other information).

26
Sums of Squares
  • This gives us the following 'sum-of-squares'
    measures
  • Total Variation = Explained Variation +
    Unexplained Variation
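  • Written out in standard notation:
    \[
      \underbrace{\sum_i (Y_i - \bar{Y})^{2}}_{\text{Total}}
      \;=\;
      \underbrace{\sum_i (\hat{Y}_i - \bar{Y})^{2}}_{\text{Explained}}
      \;+\;
      \underbrace{\sum_i (Y_i - \hat{Y}_i)^{2}}_{\text{Unexplained (USS)}}
    \]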

27
Sums of Squares Confusion
  • Note: Occasionally you will run across ESS and
    RSS, which generate confusion since they can be
    used interchangeably. ESS can be error
    sum-of-squares or estimated or explained SSQ.
    Likewise, RSS can be residual SSQ or regression
    SSQ. Hence the use of USS for unexplained SSQ in
    this treatment.

28
This gives us the F test
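  • A standard form of the test (the slide's own formula is
    not reproduced here) compares explained to unexplained
    variation per degree of freedom:
    \[
      F \;=\; \frac{\text{Explained SS}/k}{\text{USS}/(n - k - 1)}
    \]
    where k is the number of explanatory variables (k = 1 in
    simple regression) and n is the sample size.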
29
Measures of Goodness of fit
  • The Correlation coefficient
  • r-squared

30
The correlation coefficient
  • A measure of how close the residuals are to the
    regression line
  • It ranges between -1.0 and 1.0
  • It is closely related to the slope.
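  • In standard notation:
    \[
      r \;=\; \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}
                   {\sqrt{\sum_i (X_i - \bar{X})^{2}\,\sum_i (Y_i - \bar{Y})^{2}}}
    \]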

31
R2 (r-square)
  • The r2 (or R-square) is also called the
    coefficient of determination.
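  • It can be written as the share of total variation that
    the model explains:
    \[
      r^{2} \;=\; \frac{\text{Explained SS}}{\text{Total SS}}
            \;=\; 1 - \frac{USS}{\text{Total SS}}
    \]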

32
Tests of Inference
  • t-tests for coefficients
  • F-test for entire model
  • Since we are interested in how well the model
    performs at reducing error, we need to develop a
    means of assessing that error reduction. Since the
    mean of the dependent variable represents a good
    benchmark for comparing predictions, we calculate
    the improvement in the prediction of Yi relative
    to the mean of Y (the best guess of Y with no
    other information).
  • This gives us the following 'sums-of-squares'
    measures

33
Goodness of fit
  • The correlation coefficient
  • A measure of how close the residuals are to the
    regression line. It ranges between -1.0 and 1.0.
  • r2 (r-square)
  • The r-square (or R-square) is also called the
    coefficient of determination.

34
Extra Material on OLS: The Adjusted R2
  • Since R2 always increases with the addition of a
    new variable, the adjusted R2 compensates for
    added explanatory variables.
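  • A standard form of the adjustment (the slide's own formula
    is not reproduced here) is
    \[
      \bar{R}^{2} \;=\; 1 - (1 - R^{2})\,\frac{n - 1}{n - k - 1}
    \]
    where n is the sample size and k the number of explanatory
    variables.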

35
Extra Material on OLS: The F-test
  • In addition, the F test for the entire model must
    be adjusted to compensate for the changed degrees
    of freedom.
  • Note that F increases as n or R2 increases and
    decreases as k increases.
  • Adding a variable will always increase R2, but not
    necessarily adjusted R2 or F. In addition, values
    of adjusted R2 below 0.0 are possible.
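  • A standard form of the adjusted test (the slide's own
    formula is not reproduced here) is
    \[
      F \;=\; \frac{R^{2}/k}{(1 - R^{2})/(n - k - 1)}
    \]
    which makes the point above explicit: F grows with n and
    R2 and shrinks as k grows.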