MULTIPLE REGRESSION - PowerPoint PPT Presentation
Provided by: omarels

Transcript and Presenter's Notes

1
MULTIPLE REGRESSION
2
The Multiple Regression Model
  • The multiple regression model is an extension of
    the simple linear regression model introduced in
    bivariate analysis
  • The number of explanatory variables is now p,
    denoted by X1, X2, X3, …, Xp, and the model is
    given by
  • y = β0 + β1x1 + β2x2 + β3x3 + … + βpxp + u
  • where V(u) = σ² and E(u) = 0

3
Example
  • The data for this example was derived from an air
    pollution study in forty cities. The variables
    are defined as follows
  • TMR = total mortality rate
  • SMIN, SMEAN, SMAX = biweekly sulphate readings:
    smallest annual, average annual and largest
    annual, respectively
  • PMIN, PMEAN, PMAX = biweekly suspended
    particulate readings: smallest annual, average
    annual and largest annual, respectively

4
Example
  • GE65 = percent of population at least 65,
    multiplied by 10
  • NONPOOR = percent of families above the poverty
    level
  • PERWH = percent of whites in the population
  • LPOP = logarithm of population
  • PM2 = population density

5
Example
6
Correlation Matrix
  • Shows the simple correlations among all possible
    pairs of variables
  • Useful for understanding how the explanatory
    variables influence the dependent variable
  • Useful for showing the correlation among the
    explanatory variables
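As a concrete illustration, a correlation matrix can be computed with NumPy. This is a minimal sketch on randomly generated stand-in data; the air-pollution variables themselves are not reproduced here.

```python
import numpy as np

# Synthetic stand-in data: 40 cities, 3 variables in columns
# (illustrative only; not the study's TMR/sulphate/particulate data).
rng = np.random.default_rng(0)
data = rng.normal(size=(40, 3))

# rowvar=False tells np.corrcoef that variables are in columns.
R = np.corrcoef(data, rowvar=False)
# R is a 3x3 symmetric matrix with ones on the diagonal.
```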

7
Example
8
Least Squares Estimation
  • As in the case of the simple linear regression
    model, least squares is used to estimate the
    unknown parameters by minimizing the sum of
    squares
  • S(β0, β1, …, βp) = Σi (yi − β0 − β1xi1 − … − βpxip)²
  • with respect to the parameters β0, β1, β2, …, βp

9
Least Squares Estimation
  • The least squares estimators can be defined by
    the matrix expression for the vector of
    coefficients
  • b = (X′X)⁻¹X′y and the fitted equation
  • ŷi = b0 + b1xi1 + b2xi2 + … + bpxip,
    i = 1, 2, …, n

10
Least Squares Estimation
  • where b = (b0, b1, …, bp)′ is the vector of
    estimated coefficients and y = (y1, y2, …, yn)′
    is the vector of observed responses

11
Least Squares Estimation
  • X is the n × (p+1) design matrix whose first
    column is a column of ones (for the intercept)
    and whose remaining columns contain the observed
    values of the p explanatory variables

12
Least Squares Estimation
  • The estimator of σ² is given by
  • s² = SSE/(n − p − 1) = Σi (yi − ŷi)²/(n − p − 1)
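The matrix formulas above can be sketched in NumPy. The data are synthetic and all names are illustrative (the slides' air-pollution data are not reproduced); solving the normal equations stands in for the explicit inverse.

```python
import numpy as np

# Synthetic data: n observations, p explanatory variables.
rng = np.random.default_rng(1)
n, p = 40, 3
x = rng.normal(size=(n, p))
y = 2.0 + x @ np.array([1.5, -0.7, 0.3]) + rng.normal(scale=0.5, size=n)

# Design matrix X: a leading column of ones for the intercept,
# then the p explanatory variables.
X = np.column_stack([np.ones(n), x])

# b = (X'X)^{-1} X'y; solving the normal equations is preferred
# to forming the inverse explicitly.
b = np.linalg.solve(X.T @ X, X.T @ y)

# Fitted values and the estimator s^2 = SSE / (n - p - 1) of sigma^2.
y_hat = X @ b
SSE = np.sum((y - y_hat) ** 2)
s2 = SSE / (n - p - 1)
```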

13
Example
14
Example
15
Properties of Estimators of The Coefficients
  • Least squares estimators are unbiased and have
    minimum variance among linear unbiased estimators
  • The standard error of the coefficient estimator
    bj is estimated by s√cjj, where cjj is the
    diagonal element of (X′X)⁻¹ corresponding to bj
  • The statistic bj/(s√cjj) has a t distribution
    with (n − p − 1) degrees of freedom if βj = 0

16
Inference for Regression Coefficients
  • A 100(1 − α)% confidence interval for βj is
    given by
  • bj ± tα/2, n−p−1 · s√cjj

17
Inference for Regression Coefficients
  • To test the null hypothesis H0: βj = 0 we employ
    the test statistic
  • t = bj/(s√cjj)
  • which has a t distribution with (n − p − 1)
    degrees of freedom if the null hypothesis is true
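The coefficient test and confidence interval can be sketched with NumPy and SciPy. The data and the tested index j are illustrative; the tested coefficient is truly zero here, so H0 holds by construction.

```python
import numpy as np
from scipy import stats

# Synthetic data; the third coefficient is truly zero.
rng = np.random.default_rng(2)
n, p = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 2.0, 0.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
resid = y - X @ b
s = np.sqrt(resid @ resid / (n - p - 1))           # estimate of sigma

j = 2                                              # test H0: beta_j = 0
se_j = s * np.sqrt(XtX_inv[j, j])                  # s * sqrt(c_jj)
t_stat = b[j] / se_j
p_value = 2 * stats.t.sf(abs(t_stat), df=n - p - 1)

# 100(1 - alpha)% confidence interval: b_j +/- t_{alpha/2, n-p-1} * se_j
t_crit = stats.t.ppf(0.975, df=n - p - 1)          # alpha = 0.05
ci = (b[j] - t_crit * se_j, b[j] + t_crit * se_j)
```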

18
Example
19
Multiple Coefficient of Determination
  • The goodness-of-fit measure introduced for the
    simple linear regression model, the coefficient
    of determination, extends to the multiple
    coefficient of determination
  • The definition is identical to the simple
    coefficient of determination:
  • R² = SSR/SST

20
Multiple Coefficient of Determination
  • The sums of squares have the same definitions as
    in simple linear regression
  • SST = Σ(yi − ȳ)²
  • SSR = Σ(ŷi − ȳ)²
  • SSE = SST − SSR = Σ(yi − ŷi)²

21
F Test of Goodness of Fit
  • A test for the overall goodness of fit is given
    by an F test similar to the F test used in
    simple linear regression
  • H0: β1 = β2 = … = βp = 0
  • The test statistic is given by
  • F = (SSR/p) / (SSE/(n − p − 1)) = MSR/MSE,
    with p and (n − p − 1) degrees of freedom
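A sketch of R² and the overall F test on synthetic data (sizes, coefficients, and noise level are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p = 40, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 0.8, -0.5, 0.2]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b

SST = np.sum((y - y.mean()) ** 2)       # total sum of squares
SSR = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
SSE = np.sum((y - y_hat) ** 2)          # error sum of squares
R2 = SSR / SST

# F = (SSR/p) / (SSE/(n-p-1)), with p and (n-p-1) degrees of freedom.
F = (SSR / p) / (SSE / (n - p - 1))
p_value = stats.f.sf(F, p, n - p - 1)
```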

22
Analysis of Variance Table
  • The information is usually summarized in the
    analysis of variance table given by

23
Example
24
Reduced Models
  • In practice it is of interest to study reduced
    models, which contain only a subset of the
    possible explanatory variables
  • To compare models in which one is a subset of
    another, we require a statistical test that
    indicates whether explanatory power has been
    lost by reducing the number of explanatory
    variables
  • The partial F test, outlined below, serves this
    purpose

25
Example
26
Reduced Models
  • The two models being compared are called the full
    model and the reduced model
  • The full model contains all p explanatory
    variables and is given by
  • y = β0 + β1x1 + … + βqxq + βq+1xq+1 + … + βpxp + u
  • The reduced model eliminates the first q
    explanatory variables and is given by
  • y = β0 + βq+1xq+1 + … + βpxp + u

27
Reduced Models
  • We wish to test the null hypothesis that the
    reduced model explains the variation in y as
    well as the full model
  • Hence we have H0: reduced model as good as full
    model
  • Equivalently, we are testing that the
    coefficients of the first q explanatory
    variables are zero
  • Hence H0: β1 = β2 = … = βq = 0
  • Note that the order of the variables in the
    model is arbitrary, so assuming it is the first
    q that are dropped loses no generality

28
Comparing Full and Reduced Models
  • Denote the sums of squares and the coefficient
    of multiple determination for the full model by
    SST, SSR, SSE and R²
  • Denote the reduced model regression sum of
    squares and coefficient of multiple
    determination by SSR_R and R²_R
  • Note that the total sum of squares SST remains
    fixed across the two models

29
Test Statistic For Comparing Full and Reduced
Models
  • The test statistic is given by
  • F = [(SSR − SSR_R)/q] / [SSE/(n − p − 1)]
  • which has an F distribution with q and
    (n − p − 1) degrees of freedom if H0 is true
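The partial F test can be sketched as follows. The data are synthetic and the dropped regressors have true coefficient zero, so H0 holds here by construction; note that since SST is fixed, SSR − SSR_R equals SSE_R − SSE, which is what the code computes.

```python
import numpy as np
from scipy import stats

# Partial F test: full model with p regressors vs. a reduced model
# that drops the first q (synthetic, illustrative data).
rng = np.random.default_rng(4)
n, p, q = 40, 4, 2
Z = rng.normal(size=(n, p))
y = 1.0 + Z[:, 2] - 0.5 * Z[:, 3] + rng.normal(size=n)

def sse(X, y):
    """Residual sum of squares of an OLS fit."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    r = y - X @ b
    return r @ r

X_full = np.column_stack([np.ones(n), Z])
X_red = np.column_stack([np.ones(n), Z[:, q:]])    # drop the first q

SSE_full = sse(X_full, y)
SSE_red = sse(X_red, y)

# SST fixed => SSR - SSR_R = SSE_R - SSE, so
# F = [(SSE_R - SSE)/q] / [SSE/(n-p-1)] ~ F(q, n-p-1) under H0.
F = ((SSE_red - SSE_full) / q) / (SSE_full / (n - p - 1))
p_value = stats.f.sf(F, q, n - p - 1)
```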

30
Examples
31
Examples
32
Examples
H0: Reduced model as good as full model, or the
extra variables are superfluous; HA: Full model
superior, or at least one of the 6 variables is
important
  • Also,

33
Examples
  • The mean of an F distribution is usually near 1
  • Since F is much less than 1, we obviously cannot
    reject H0
  • Note F0.05,6,28 = 2.45, F0.01,6,28 = 3.53,
    F0.10,6,28 = 2.00

34
Confidence Interval For The Mean
  • At X = xj (a particular value of X) denote the
    estimator of y by ŷj
  • A confidence interval for the mean value of y at
    X = xj is given by
  • ŷj ± tα/2, n−p−1 · s√(xj′(X′X)⁻¹xj)

35
Confidence Interval For Individual Predictions
  • A confidence interval for a particular value of
    y at X = xj is given by
  • ŷj ± tα/2, n−p−1 · s√(1 + xj′(X′X)⁻¹xj)
  • Note the extra term of 1, which accounts for the
    variation of an individual observation around
    the mean
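Both intervals can be sketched at once; the point x0 and the data are illustrative, and the only difference between the two intervals is the extra 1 under the square root.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, p = 40, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
resid = y - X @ b
s = np.sqrt(resid @ resid / (n - p - 1))

x0 = np.array([1.0, 0.2, -0.1])        # leading 1 matches the intercept
y0_hat = x0 @ b
h = x0 @ XtX_inv @ x0                  # x0'(X'X)^{-1} x0
t_crit = stats.t.ppf(0.975, df=n - p - 1)

# Interval for the mean of y at x0.
ci_mean = (y0_hat - t_crit * s * np.sqrt(h),
           y0_hat + t_crit * s * np.sqrt(h))

# Interval for an individual new observation: note the extra 1.
pi_new = (y0_hat - t_crit * s * np.sqrt(1 + h),
          y0_hat + t_crit * s * np.sqrt(1 + h))
```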

36
Examples
37
Examples