1
Chapter 11 Multiple Linear
Regression
2
Our Group Members
3
Content
  • Multiple Regression Model ----- Yifan Wang
  • Statistical Inference ----- Shaonan Zhang & Yicheng Li
  • Variable Selection Methods & SAS ----- Guangtao Li & Ruixue Wang
  • Strategy for Building a Model and Data Transformation ----- Xiaoyu Zhang & Siyuan Luo
  • Topics in Regression Modeling ----- Yikang Chai & Tao Li
  • Summary ----- Xing Chen


4
Ch 11.1-11.3 Introduction to Multiple Linear
Regression
  • Yifan Wang
  • Dec. 6th, 2007

5
  • In Chapter 10 we studied how to fit a linear relationship between a response variable y and a single predictor variable x.
  • But simple linear regression cannot handle a problem in which there are two or more predictor variables.
  • For example:
  • The salary of a company employee may depend on
  • job category
  • years of experience
  • education
  • performance evaluations

6
What do we need to do?
  • Extend the simple linear regression model to the case of two or more predictor variables.
  • Multiple Linear Regression (or simply Multiple Regression) is the statistical methodology used to fit such models.

7
Multiple Linear Regression
  • In multiple regression we fit a model of the form (excluding the error term)
    y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k,
  • where x_1, x_2, ..., x_k are predictor variables and \beta_0, \beta_1, ..., \beta_k are k + 1 unknown parameters.
  • The model is linear in the unknown parameters (hence "linear" regression).

For example, this model includes the kth-degree polynomial model in a single variable x, namely
    y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k,
since we can put x_1 = x, x_2 = x^2, ..., x_k = x^k.
8
11.1 A Probabilistic Model For Multiple Linear
Regression
  • Regard the response variable Y as random.
  • Regard the predictor variables x_1, ..., x_k as nonrandom (fixed).
  • The data for multiple regression consist of n vectors of observations (x_{i1}, x_{i2}, ..., x_{ik}, y_i) for i = 1, 2, ..., n.
  • Example 1
  • The response variable: y_i = the salary of the i-th person in the sample.
  • The predictor variables: x_{i1} = his/her years of experience,
  • x_{i2} = his/her years of education.

9
Example 2
y_i is the observed value of the random variable Y_i, which depends on the fixed predictor values x_{i1}, x_{i2}, ..., x_{ik} according to the following model:
    Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i   (i = 1, 2, ..., n),
where \epsilon_i is a random error with E(\epsilon_i) = 0 and \beta_0, \beta_1, ..., \beta_k are unknown parameters. Assume the \epsilon_i are independent N(0, \sigma^2) random variables. Then the Y_i are independent N(\mu_i, \sigma^2) random variables with
    \mu_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik}.
10
11.2 Fitting the Multiple Regression Model
11.2.1 Least Squares (LS) Fit
  • The LS estimates of the unknown parameters \beta_0, \beta_1, ..., \beta_k minimize
    Q = \sum_{i=1}^{n} [y_i - (\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik})]^2.
  • The LS estimates can be found by setting the first partial derivatives of Q with respect to \beta_0, \beta_1, ..., \beta_k equal to zero.
  • The result is a set of (k+1) simultaneous linear equations in (k+1) unknowns. The resulting solutions, \hat\beta_0, \hat\beta_1, ..., \hat\beta_k, are the least squares (LS) estimates of \beta_0, \beta_1, ..., \beta_k, respectively.

11
11.2.2 Goodness of Fit of the Model
  • To assess the goodness of fit of the LS model, we use the residuals, defined by
    e_i = y_i - \hat y_i   (i = 1, 2, ..., n),
  • where the \hat y_i = \hat\beta_0 + \hat\beta_1 x_{i1} + \cdots + \hat\beta_k x_{ik} are the fitted values.
  • An overall measure of the goodness of fit is the error sum of squares
    SSE = \sum_{i=1}^{n} e_i^2.
  • Compare it to the total sum of squares
    SST = \sum_{i=1}^{n} (y_i - \bar y)^2.
  • As in Chapter 10, define the regression sum of squares (SSR) given by
    SSR = SST - SSE.

12
  • The coefficient of multiple determination is
    r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}.
  • 0 \le r^2 \le 1; values closer to 1 represent better fits.
  • Adding predictor variables generally increases r^2; thus r^2 can be made to approach 1 simply by increasing the number of predictors.
  • The multiple correlation coefficient is the positive square root of r^2:
    r = +\sqrt{r^2}
  • (only the positive square root is used).
  • r is a measure of the strength of the association between the predictor variables and the one response variable y.

13
11.3 Multiple Regression Model in Matrix Notation
  • The multiple regression model can be presented in a compact form by using matrix notation. Let
    Y = \begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix},   y = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix},   \epsilon = \begin{bmatrix} \epsilon_1 \\ \vdots \\ \epsilon_n \end{bmatrix}

be the n x 1 vectors of the r.v.'s Y_i, their observed values y_i, and the random errors \epsilon_i, respectively. Next let
    X = \begin{bmatrix} 1 & x_{11} & \cdots & x_{1k} \\ 1 & x_{21} & \cdots & x_{2k} \\ \vdots & \vdots & & \vdots \\ 1 & x_{n1} & \cdots & x_{nk} \end{bmatrix}

be the n x (k+1) matrix of the values of the predictor variables (the leading column of 1's corresponds to the constant term \beta_0).
14
Finally, let
    \beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}   and   \hat\beta = \begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \\ \vdots \\ \hat\beta_k \end{bmatrix}
  • be the (k+1) x 1 vectors of the unknown parameters and their LS estimates, respectively.
  • The model can be rewritten as
    Y = X\beta + \epsilon.
  • The simultaneous linear equations whose solutions yield the LS estimates (the normal equations) can be written in matrix notation as
    X'X\hat\beta = X'y.
  • If the inverse of the matrix X'X exists, then the solution is given by
    \hat\beta = (X'X)^{-1}X'y.
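  • A minimal SAS/IML sketch of this matrix computation, assuming the standby-hours data set test (variables y, x1-x4) created in the stepwise example later in this deck:

    proc iml;
      use test;
      read all var {x1 x2 x3 x4} into X0;   /* predictor columns */
      read all var {y} into y;              /* response column */
      close test;
      X = j(nrow(X0), 1, 1) || X0;          /* prepend a column of 1's for the intercept */
      beta_hat = inv(X` * X) * X` * y;      /* LS estimates: (X'X)^{-1} X'y */
      print beta_hat;
    quit;

  • In practice, solve(X`*X, X`*y) is numerically preferable to forming the inverse explicitly.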

15
11.4 Statistical Inference
  • Shaonan Zhang Yicheng Li

16
Statistical Inference on ßs
----General Hypothesis Test
  • Determining the statistical significance of individual predictor variables:
  • we test the hypotheses
    H_0: \beta_j = \beta_j^0   vs.   H_1: \beta_j \ne \beta_j^0   (usually \beta_j^0 = 0).
  • If we cannot reject H_0: \beta_j = 0, then x_j can be dropped from the model.
17
Statistical Inference on ßs
----General Hypothesis Test
  • Pivotal Quantity:
    T_j = \frac{\hat\beta_j - \beta_j}{SE(\hat\beta_j)} \sim t_{n-(k+1)}
  • Recall that \hat\beta_j \sim N(\beta_j, \sigma^2 v_{jj}), where v_{jj} is the j-th diagonal element of (X'X)^{-1}.
  • An unbiased estimate of \sigma^2 is
    s^2 = \frac{SSE}{n - (k+1)} = MSE,
  • with n - (k+1) error degrees of freedom.
18
Statistical Inference on ßs
----General Hypothesis Test
  • Confidence Interval for \beta_j:
  • Note that
    \frac{\hat\beta_j - \beta_j}{SE(\hat\beta_j)} \sim t_{n-(k+1)}.
  • So the 100(1 - \alpha)% CI is
    \hat\beta_j \pm t_{n-(k+1), \alpha/2} \, SE(\hat\beta_j),
  • where SE(\hat\beta_j) = s\sqrt{v_{jj}}.
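  • In SAS these t-statistics and confidence limits come directly from PROC REG; a brief sketch, again assuming the standby-hours data set test (the CLB option requests confidence limits for the parameter estimates):

    proc reg data=test;
      model y = x1 x2 x3 x4 / clb alpha=0.05;   /* clb: 95% confidence limits for each beta */
    run;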

19
Statistical Inference on ßs
----General Hypothesis Test
  • Hypothesis Test:
  • In particular, when \beta_j^0 = 0, we reject H_0 at level \alpha if
    |t_j| = \frac{|\hat\beta_j|}{SE(\hat\beta_j)} > t_{n-(k+1), \alpha/2}.
  • This controls the type I error: P(Reject H_0 | H_0 is true) = \alpha.
20
Statistical Inference on ßs
----Another Hypothesis Test
  • Hypothesis:
    H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0   vs.   H_1: at least one \beta_j \ne 0
  • Pivotal Quantity (test statistic):
    F = \frac{SSR/k}{SSE/[n - (k+1)]} = \frac{MSR}{MSE} \sim F_{k, n-(k+1)} under H_0
  • Also, in terms of r^2:
    F = \frac{r^2/k}{(1 - r^2)/[n - (k+1)]}
  • P-value = P(F_{k, n-(k+1)} > f), where f is the observed value of the statistic.
  • If the P-value is less than \alpha, we reject H_0, and in that case we follow up with the previous (individual t) tests.

21
Statistical Inference on ßs
----Another Hypothesis Test
  • ANOVA Table for Multiple Regression

Source of Variation (Source)   Sum of Squares (SS)   Degrees of Freedom (d.f.)   Mean Square (MS)          F
Regression                     SSR                   k                           MSR = SSR/k               F = MSR/MSE
Error                          SSE                   n - (k+1)                   MSE = SSE/[n - (k+1)]
Total                          SST                   n - 1
22
Statistical Inference on ßs
----Test Subsets of Parameters
  • Full Model:
    Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i
  • Partial Model:
    Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_m x_{im} + \epsilon_i   (m < k)
  • Hypothesis:
    H_0: \beta_{m+1} = \cdots = \beta_k = 0   vs.   H_1: at least one of \beta_{m+1}, ..., \beta_k \ne 0
  • Test statistic:
    F = \frac{(SSE_m - SSE_k)/(k - m)}{SSE_k/[n - (k+1)]}
  • Reject H_0 when F > f_{k-m, n-(k+1), \alpha} (see the PROC REG sketch below).
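  • PROC REG's TEST statement carries out exactly this extra-sum-of-squares F-test; a sketch, again using the standby-hours data set test, for testing whether x3 and x4 can be dropped from the full model:

    proc reg data=test;
      model y = x1 x2 x3 x4;
      test x3 = 0, x4 = 0;   /* partial F-test of H0: beta3 = beta4 = 0 */
    run;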

23
Prediction of Future Observations
  • Let x^* = (1, x_1^*, ..., x_k^*)' be a given setting of the predictor variables and \mu^* = E(Y^*) = x^{*'}\beta.
  • Whether we want a CI (Confidence Interval) for \mu^* or a PI (Prediction Interval) for Y^*,
  • we have the estimator/predictor \hat Y^* = x^{*'}\hat\beta,
  • and Var(\hat Y^*) = \sigma^2 x^{*'}(X'X)^{-1}x^*.
  • Pivotal Quantity:
    \frac{\hat Y^* - \mu^*}{s\sqrt{x^{*'}(X'X)^{-1}x^*}} \sim t_{n-(k+1)}
  • A (1 - \alpha)-level CI to estimate \mu^*:
    \hat Y^* \pm t_{n-(k+1), \alpha/2} \, s\sqrt{x^{*'}(X'X)^{-1}x^*}
  • A (1 - \alpha)-level PI to predict Y^*:
    \hat Y^* \pm t_{n-(k+1), \alpha/2} \, s\sqrt{1 + x^{*'}(X'X)^{-1}x^*}
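  • PROC REG can print both intervals for every observation; a sketch using the same test data (CLM gives confidence limits for the mean response, CLI gives prediction limits for an individual future response):

    proc reg data=test;
      model y = x1 x2 x3 x4 / clm cli;   /* clm: CI for E(Y); cli: PI for a new Y */
    run;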

24
11.7 Variable Selection Methods
Guangtao Li, RuiXue Wang
25
1. Why do we need variable selection
methods?
  • 2. Two methods are introduced
  • Stepwise Regression
  • Best Subsets Regression

26
11.7.1 STEPWISE REGRESSION
  • Guangtao Li

27
Recall Test for Subsets of Parameters in 11.4
  • Full model:
    Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \epsilon_i   (i = 1, 2, ..., n)
  • Partial model:
    Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_m x_{im} + \epsilon_i   (i = 1, 2, ..., n)
  • Hypotheses:
    H_0: \beta_{m+1} = \cdots = \beta_k = 0   vs.   H_1: \beta_j \ne 0 for at least one j = m+1, ..., k
  • We test using the statistic
    F = \frac{(SSE_m - SSE_k)/(k - m)}{SSE_k/[n - (k+1)]}
  • Reject H_0 when F > f_{k-m, n-(k+1), \alpha}.

28
  • (p-1)-variable model:
    Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i,p-1} + \epsilon_i
  • p-variable model:
    Y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i,p-1} + \beta_p x_{ip} + \epsilon_i

29
Partial F-test
  • Reject H_{0p}: \beta_p = 0 if
    F_p = \frac{SSE_{p-1} - SSE_p}{SSE_p/[n - (p+1)]} > f_{1, n-(p+1), \alpha}

30
Partial Correlation Coefficients
  • The partial correlation coefficient of x_p with y, controlling for x_1, ..., x_{p-1}, satisfies
    r_{yx_p | x_1, ..., x_{p-1}}^2 = \frac{SSE_{p-1} - SSE_p}{SSE_{p-1}}.
  • We should add x_p to the regression equation only if this partial correlation (equivalently, the partial F-statistic F_p) is large enough, i.e., only if x_p is statistically significant.

31
Stepwise Regression Algorithm
32
SAS Program for the Algorithm
  • Example: The Director of Broadcasting Operations for a television station wants to study the issue of standby hours, which are hours during which unionized graphic artists at the station are paid but are not actually involved in any activity. We are trying to predict the total number of Standby Hours per Week (Y). Possible explanatory variables are Total Staff Present (X1), Remote Hours (X2), Dubner Hours (X3), and Total Labor Hours (X4). The results for 26 weeks are given below.

34
data test;
  input y x1 x2 x3 x4;
  datalines;
245 338 414 323 2001
177 333 598 340 2030
271 358 656 340 2226
211 372 631 352 2154
196 339 528 380 2078
135 289 409 339 2080
195 334 382 331 2073
118 293 399 311 1758
116 325 343 328 1624
147 311 338 353 1889
154 304 353 518 1988
146 312 289 440 2049
115 283 388 276 1796
35
161 307 402 207 1720
274 322 151 287 2056
245 335 228 290 1890
201 350 271 355 2187
183 339 440 300 2032
237 327 475 284 1856
175 328 347 337 2068
152 319 449 279 1813
188 325 336 244 1808
188 322 267 253 1834
197 317 235 272 1973
261 315 164 223 1839
232 331 270 272 1935
;
run;

proc reg data=test;
  model y = x1 x2 x3 x4 / selection=stepwise;
run;

36
  • Selected SAS Output
  • Stepwise Selection: Step 1
  • Variable x1 Entered: R-Square = 0.3660 and C(p) = 13.3215

                         Analysis of Variance

                                 Sum of         Mean
  Source             DF         Squares       Square    F Value    Pr > F
  Model               1           20667        20667      13.86    0.0011
  Error              24           35797   1491.55073
  Corrected Total    25           56465

37
Stepwise Selection: Step 2

Variable x2 Entered: R-Square = 0.4899 and C(p) = 8.4193

                       Analysis of Variance

                               Sum of         Mean
Source             DF         Squares       Square    F Value    Pr > F
Model               2           27663        13831      11.05    0.0004
Error              23           28802   1252.26402
Corrected Total    25           56465

                Parameter      Standard
Variable         Estimate         Error     Type II SS    F Value    Pr > F
Intercept      -330.67483     116.48022          10092       8.06    0.0093
x1                1.76486       0.37904          27149      21.68    0.0001
x2               -0.13897       0.05880     6995.14489       5.59    0.0269
38
SAS Output (cont.)
  • All variables left in the model are significant at the 0.1500 level.
  • No other variable met the 0.1500 significance level for entry into the model.

                        Summary of Stepwise Selection

        Variable   Variable   Number     Partial      Model
  Step  Entered    Removed    Vars In   R-Square   R-Square      C(p)   F Value   Pr > F
     1  x1                          1     0.3660     0.3660   13.3215     13.86   0.0011
     2  x2                          2     0.1239     0.4899    8.4193      5.59   0.0269

39
  • Ruixue Wang

40
11.7.2 Best Subsets Regression
41
11.7.2 Best Subsets Regression
  • In practice there are often several almost
    equally good models, and the choice of the final
    model may depend on side considerations such as
    the number of variables, the ease of observing
    and/or controlling variables, etc. The best
    subsets regression algorithm permits
    determination of a specified number of best
    subsets of size p = 1, 2, ..., k from which the choice
    of the final model can be made by the
    investigator.

42
11.7.2 Best Subsets Regression
  • Optimality Criteria
  • r_p^2-Criterion:
    r_p^2 = 1 - \frac{SSE_p}{SST}   (always increases with p, so it is used by comparing models with the same p)
  • Adjusted r_p^2-Criterion:
    r_{p,adj}^2 = 1 - \frac{n-1}{n-(p+1)} \cdot \frac{SSE_p}{SST}   (penalizes adding variables that contribute little)

43
Cp-Criterion (recommended for its ease of
computation and its ability to judge the
predictive power of a model)
  • The sample estimator, Mallows' C_p-statistic, is given by
    C_p = \frac{SSE_p}{\hat\sigma^2} - [n - 2(p+1)],
  • where \hat\sigma^2 = MSE_k from the full model is an almost unbiased estimator of \sigma^2.

44
PRESS_p Criterion
The total prediction error sum of squares (PRESS) is
    PRESS_p = \sum_{i=1}^{n} (y_i - \hat y_{(i)})^2,
where \hat y_{(i)} is the predicted value of y_i computed from the model fitted to the other n - 1 observations.
This criterion evaluates the predictive ability of a postulated model by omitting one observation at a time, fitting the model to the remaining observations, and computing the predicted value for the omitted observation. The PRESS_p criterion is intuitively easier to grasp than the C_p-criterion, but it is computationally much more intensive and is not available in many packages.
45
SAS Program
data test;
  input y x1 x2 x3 x4;
  datalines;
245 338 414 323 2001
177 333 598 340 2030
271 358 656 340 2226
211 372 631 352 2154
196 339 528 380 2078
135 289 409 339 2080
195 334 382 331 2073
118 293 399 311 1758
116 325 343 328 1624
147 311 338 353 1889
154 304 353 518 1988
146 312 289 440 2049
115 283 388 276 1796

46
SAS Program (cont.)
161 307 402 207 1720
274 322 151 287 2056
245 335 228 290 1890
201 350 271 355 2187
183 339 440 300 2032
237 327 475 284 1856
175 328 347 337 2068
152 319 449 279 1813
188 325 336 244 1808
188 322 267 253 1834
197 317 235 272 1973
261 315 164 223 1839
232 331 270 272 1935
;
run;

proc reg data=test;
  model y = x1 x2 x3 x4 / selection=rsquare adjrsq cp mse;
run;

47
Results
  Number in              Adjusted
      Model   R-Square   R-Square       C(p)          MSE    Variables in Model
          1     0.3660     0.3396    13.3215   1491.55073    x1
          1     0.1710     0.1365    24.1846   1950.27491    x4
          1     0.0597     0.0205    30.3884   2212.24598    x3
          1     0.0091     -.0322    33.2078   2331.30545    x2
  ---------------------------------------------------------------------
          2     0.4899     0.4456     8.4193   1252.26402    x1 x2
          2     0.4499     0.4021    10.6486   1350.49234    x1 x3
          2     0.4288     0.3791    11.8231   1402.24672    x3 x4
          2     0.3754     0.3211    14.7982   1533.34044    x1 x4
          2     0.2238     0.1563    23.2481   1905.67595    x2 x4
          2     0.0612     -.0205    32.3067   2304.83375    x2 x3
  ---------------------------------------------------------------------
          3     0.5378     0.4748     7.7517   1186.29444    x1 x3 x4
          3     0.5362     0.4729     7.8418   1190.44739    x1 x2 x3
          3     0.5092     0.4423     9.3449   1259.69053    x1 x2 x4

48
11.7.2 Best Subsets Regression SAS
  • The source of the example is http://www.math.udel.edu/teaching/course_materials/m202_climent/Multiple%20Regression%20-%20Model%20Building.pdf

49
11.5, 11.8 Building A Multiple Regression Model
by SiYuan Luo & Xiaoyu Zhang
50
Introduction
  • Building a multiple regression model consists of
    7 steps.
  • Though it is not necessary to follow each and every step in the exact sequence shown on the next slide, the general approach and major steps should be followed.
  • Model building is an iterative process; it may take several cycles of the steps before arriving at the final model.

51
The 7 steps
1. Decide the type
2. Collect the data
3. Explore the data
4. Divide the data
5. Fit candidate models
6. Select and evaluate
7. Select the final model
52
Step 1 Decide the type
  • Decide the type of model needed; different types of models include:
  • Predictive: a model used to predict the response variable from a chosen set of predictor variables.
  • Theoretical: a model based on a theoretical relationship between a response variable and predictor variables.
  • Control: a model used to control a response variable by manipulating predictor variables.
  • Inferential: a model used to explore the strength of relationships between a response variable and individual predictor variables.
  • Data summary: a model used primarily as a device to summarize a large set of data by a single equation.
  • Often a model can be used for multiple purposes.
  • The type of model dictates the type of data needed.

53
Step 2 Collect the data
  • Decide the variables (predictor and response) on
    which to collect data. Measurement of the
    variables should be done the right way depending
    on the type of subject.
  • See chapter 3 for precautions necessary to obtain
    relevant, bias-free data.

54
Step 3 Explore the data
  • The data should be examined for outliers, gross
    errors, missing values, etc. on a univariate
    basis using the techniques discussed in chapter
    4. Outliers cannot just be omitted because much
    useful information can be lost. See chapter 10
    for how to deal with outliers.
  • Scatter plots should be made to study bivariate
    relationships between the response variable and
    each of the predictors. They are useful in
    suggesting possible transformations to linearize
    the relationships.

55
Step 4 Divide the data
  • Divide the data into training and test sets: only a subset of the data, the training set, should be used to fit the model (steps 5 and 6); the remainder, called the test set, should be used for cross-validation of the fitted model (step 7).
  • The reason for using an independent data set to
    test the model is that if the same data are used
    for both fitting and testing, then an
    overoptimistic estimate of the predictive ability
    of the fitted model is obtained.
  • The split for the two sets should be done
    randomly.

56
Step 5 Fit candidate models
  • Generally several equally good models can be identified using the training data set.
  • By conducting several runs with different FIN (F-to-enter) and FOUT (F-to-remove) values, we can identify several models that fit the training set well (see the SAS sketch below).
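  • A minimal sketch of one such run in SAS: PROC REG's stepwise selection uses significance-level thresholds SLENTRY and SLSTAY, which play the role of FIN and FOUT; the data set name train here is a hypothetical training subset.

    proc reg data=train;                 /* 'train' = hypothetical training subset */
      model y = x1 x2 x3 x4
            / selection=stepwise slentry=0.15 slstay=0.15;   /* vary these thresholds across runs */
    run;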

57
Step 6 Select and evaluate
  • From the list of candidate models we are now
    ready to select two or three good models based on
    criteria such as the Cp-statistic, the number of
    predictors (p), and the nature of predictors.
  • These selected models should be checked for violation of model assumptions using standard diagnostic techniques, in particular, residual plots. Transformations of the response variable or some of the predictor variables may be necessary to improve the model fit.

58
Step 7 Select the Final model
  • This is the step where we compare competing
    models by cross-validating them against the test
    data.
  • The model with the smaller cross-validation SSE is the better predictive model.
  • The final selection of the model is based on a number of considerations, both statistical and nonstatistical. These include residual plots, outliers, parsimony, relevance, and ease of measurement of predictors. A final test of any model is that it makes practical sense and the client is willing to buy it.

59
Regression Diagnostics (Step 6)
  • Graphical Analysis of Residuals
  • Plot Estimated Errors vs. X_i Values
  • Difference Between Actual Y_i and Predicted Y_i
  • Estimated Errors Are Called Residuals
  • Plot Histogram or Stem-and-Leaf of Residuals
  • Purposes
  • Examine Functional Form (Linearity)
  • Evaluate Violations of Assumptions

60
Linear Regression Assumptions
  • Mean of Probability Distribution of Error Is 0
  • Probability Distribution of Error Has Constant
    Variance
  • Probability Distribution of Error is Normal
  • Errors Are Independent

61
Residual Plot for Functional Form (Linearity)
Add X2 Term
Correct Specification
62
Residual Plot for Equal Variance
Unequal Variance
Correct Specification
Fan-shaped. Standardized residuals (residual divided by the standard error of prediction) are typically used.
63
Residual Plot for Independence
Not Independent
Correct Specification
64
Data transformations
  • Why do we need data transformations?
  • To make seemingly nonlinear models linear; for example, an exponential model y = \alpha e^{\beta x} becomes linear in the parameters after taking logs: \ln y = \ln\alpha + \beta x.
  • Sometimes a transformation also gives a better explanation of the variation in the data.

65
  • How do we do the data transformations?
  • Power family of transformations on the response: the Box-Cox method.
  • Requirements:
  • all the response data Y must be positive;
  • the ratio of the largest observed Y to the smallest is at least 10.

66
  • Transformation form:
    V = \frac{Y^\lambda - 1}{\lambda \dot{Y}^{\lambda - 1}}   for \lambda \ne 0,       V = \dot{Y} \ln Y   for \lambda = 0,
  • where \dot{Y} = (y_1 y_2 \cdots y_n)^{1/n} is the geometric mean of the Y's.
67
  • How to estimate \lambda:
  • 1. Choose a value of \lambda from a selected range. Usually we look for it in the range (-1, 1), and we would usually cover the selected range with about 11-21 values of \lambda.
  • 2. For each value of \lambda, evaluate V by applying the formula above to each Y. This creates a vector V(\lambda); use it as the response to fit a linear model by the least squares method, and record the residual sum of squares S(\lambda) for the regression.
  • 3. Plot S(\lambda) versus \lambda. Draw a smooth curve through the plotted points and find at what value of \lambda the lowest point of the curve lies. That value, \hat\lambda, is the maximum likelihood estimate of \lambda.
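  • In SAS, this grid search over \lambda can be done with PROC TRANSREG; a sketch, assuming a data set named rubber containing the response y and the predictors f (filler level) and p (plasticizer level) from the example that follows:

    proc transreg data=rubber;
      model boxcox(y / lambda=-1 to 1 by 0.05) = identity(f p);   /* searches the lambda grid and reports the best value */
    run;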

68
  • Example
  • The data in the table are part of a more extensive set given by Derringer (1974); this material has been adapted with permission of John Wiley & Sons, Inc. We wish to find a transformation of the form V = Y^\lambda, or V = (Y^\lambda - 1)/(\lambda\dot{Y}^{\lambda-1}), which will provide a good first-order fit to the data. Our model form is
    E(V) = \beta_0 + \beta_1 f + \beta_2 p,
  • where f is the filler level and p is the plasticizer level.

69
                                      Filler, phr, f
Naphthenic Oil, phr, p         0      12      24      36      48      60
                     0        26      38      50      76     108     157
                    10        17      26      37      53      83     124
                    20        13      20      27      37      57      87
                    30       ----     15      22      27      41      63
70
  • Note that the response data range from 157 to 13, a ratio of 157/13 = 12.1 > 10; hence a transformation on Y is likely to be effective. The geometric mean is 41.5461 for this set of data.
  • The next table shows S(\lambda) for selected values of \lambda.
  • We pick 20 different values of \lambda from (-1, 1) in this case.

71
\lambda       -1.0    -0.8    -0.6    -0.4    -0.2    -0.15   -0.10   -0.08   -0.06   -0.05
S(\lambda)    2456    1453    779.1   354.7   131.7   104.5   88.3    84.9    83.3    83.2

\lambda       -0.04   -0.02    0.00    0.05    0.10    0.2     0.4     0.6     0.8     1.0
S(\lambda)    83.5    85.5    89.3    106.7   135.9   231.1   588.0   1222    2243    3821
72
  • A smooth curve through these points is plotted in the next figure. We see that the minimum S(\lambda) occurs at about \hat\lambda = -0.05. This is close to zero, suggesting the transformation V = \dot{Y} \ln Y, or more simply V = \ln Y.
73
  • Applying the transformation V = \ln Y to the original data gives a set of data that are much more nearly linearly related. The best plane, fitted to these transformed data by least squares, is
    \hat{V} = 3.212 + 0.03088 f - 0.03152 p.
  • The ANOVA table for this model is:

  Source           Df          SS          MS       F
  b_0               1   319.44855       -----
  b_1, b_2          2    10.51667     5.27583    2045
  Residual         20     0.05171     0.00258
  Total            23   330.05193
74
  • If we had fitted a first-order model to the untransformed data, we would obtain
    \hat{Y} = 28.184 + 1.55 f - 1.717 p.
  • The ANOVA table for this model is:

  Source               Df          SS           MS       F
  b_1, b_2              2    27842.62     13921.31    72.9
  Residual             20     3820.60       191.03
  Total, corrected     22    31663.22

75
  • We find that the transformed model has a much larger F-value (2045 vs. 72.9), so the transformation gives a much stronger fit.

76
11.6.1 - 11.6.3 Topics in Regression Modeling
Yikang Chai & Tao Li
77
11.6.1 Multicollinearity
  • Def. The columns of the X matrix are exactly or
    approximately linearly dependent.
  • It means the predictor variables are related.
  • Why are we concerned about it?
  • This can cause serious numerical and
    statistical difficulties in fitting the
    regression model unless extra predictor
    variables are deleted.

78
How does the multicollinearity cause
difficulties?
  • Multicollinearity leads to the following problems:
  • X'X is nearly singular, which makes \hat\beta = (X'X)^{-1}X'y numerically unstable. This is reflected in large changes in the magnitudes of the \hat\beta_j with small changes in the data.
  • The matrix (X'X)^{-1} has very large elements. Therefore the variances of the \hat\beta_j are large, which makes the \hat\beta_j statistically nonsignificant.

79
Measures of Multicollinearity
  • Three ways:
  • 1. The correlation matrix R of the predictors. Easy, but it can't reflect linear relationships among more than two variables.
  • 2. The determinant of R, which can be used as a measure of how close X'X is to singularity (det R near 0 indicates multicollinearity).
  • 3. Variance Inflation Factors (VIF): the diagonal elements of R^{-1}. Generally, VIF > 10 is regarded as unacceptable.
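  • A sketch of how to request these diagnostics in SAS (the simple correlation matrix R and the VIFs), again using the standby-hours data set test:

    proc corr data=test;             /* correlation matrix of the predictors */
      var x1 x2 x3 x4;
    run;

    proc reg data=test;
      model y = x1 x2 x3 x4 / vif;   /* variance inflation factors; VIF > 10 signals trouble */
    run;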

80
11.6.2 Polynomial Regression
Consider the special case of a single predictor x:
    y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k.
Problems:
  1. The powers of x, i.e., x, x^2, ..., x^k, tend to be highly correlated.
  2. If k is large, the magnitudes of these powers tend to vary over a rather wide range.

These problems lead to numerical errors.
81
How to solve these problems?
  • Two ways:
  • 1. Centering the x-variable: replace x by x - \bar{x}. This removes the non-essential multicollinearity in the data.
  • 2. Standardizing the x-variable: replace x by (x - \bar{x}) / s_x. This alleviates the problem of x varying over a wide range (see the SAS sketch below).
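A sketch of both remedies in SAS: PROC STANDARD with MEAN=0 (and optionally STD=1) centers or standardizes a variable, and the quadratic term is then built from the transformed value. The data set and variable names follow the earlier test example, and the quadratic fit itself is only illustrative.

    proc standard data=test mean=0 std=1 out=test_std;   /* center and scale x1 */
      var x1;
    run;

    data test_std;
      set test_std;
      x1sq = x1 * x1;       /* quadratic term built from the standardized x1 */
    run;

    proc reg data=test_std;
      model y = x1 x1sq;    /* polynomial fit with reduced collinearity between x1 and x1sq */
    run;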
82
11.6.3 Dummy Predictor Variables
  • It's a method for dealing with categorical predictor variables.

1. For ordinal categorical variables, such as the prognosis of a patient (poor, average, good), just assign numerical scores to the categories (poor = 1, average = 2, good = 3).
2. If we have a nominal variable with c > 2 categories, use c - 1 indicator variables, x_1, ..., x_{c-1}, called dummy variables, to code it.
83
How to code?
Set x_i = 1 for the i-th category and x_i = 0 otherwise, for i = 1, ..., c-1; the c-th category has x_1 = x_2 = \cdots = x_{c-1} = 0.
Why don't we just use c indicator variables x_1, ..., x_c?
Because there would be a linear dependency among them (x_1 + x_2 + \cdots + x_c = 1), which causes multicollinearity.
84
Example
  • The season of a year can be coded with three indicator variables x1 (winter), x2 (spring), x3 (summer). With this coding: (1,0,0) for Winter, (0,1,0) for Spring, (0,0,1) for Summer, and (0,0,0) for Fall.

Consider modeling the temperature of an area as a function of the season (coded by x1, x2, x3) and its latitude (A); we can write the model
    Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 A + \epsilon,
which gives
    For winter:  E(Y) = (\beta_0 + \beta_1) + \beta_4 A
    For summer:  E(Y) = (\beta_0 + \beta_3) + \beta_4 A
    For spring:  E(Y) = (\beta_0 + \beta_2) + \beta_4 A
    For fall:    E(Y) = \beta_0 + \beta_4 A
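A sketch of this coding in a SAS data step; the input data set temps and its variables season, temp, and latitude are hypothetical:

    data temps2;
      set temps;                    /* hypothetical input with season, temp, latitude */
      x1 = (season = 'winter');     /* 1/0 indicators; fall is the baseline (0,0,0) */
      x2 = (season = 'spring');
      x3 = (season = 'summer');
    run;

    proc reg data=temps2;
      model temp = x1 x2 x3 latitude;
    run;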
85
Logistic Regression Model
  • In 1938, R. A. Fisher and Frank Yates introduced the logistic transform for analyzing binary data.

86
Logistic Regression Model
  • The Importance of Logistic Regression Model
  • Logistic regression model is the most popular
    model for binary data.
  • Logistic regression model is generally used for
    binary response variables.
  • Y = 1 (true, success, YES, etc.), while Y = 0 (false, failure, NO, etc.)

87
Logistic Regression Model
  • Details of Regression Model
  • Main Step
  • Consider a response variable Y = 0 or 1 and a single predictor variable x.
  • Model E(Y|x) = P(Y = 1|x) as a function of x. The logistic regression model expresses the logistic transform of P(Y = 1|x):
    \log \frac{P(Y=1|x)}{1 - P(Y=1|x)} = \beta_0 + \beta_1 x,
  • equivalently, P(Y = 1|x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}.
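  • A sketch of fitting this model in SAS with PROC LOGISTIC; the data set name binary is hypothetical, and EVENT='1' makes the procedure model P(Y = 1 | x):

    proc logistic data=binary;
      model y(event='1') = x;   /* fits log[p/(1-p)] = beta0 + beta1*x */
    run;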

88
Logistic Regression Model
  • Example
  • http://faculty.vassar.edu/lowry/logreg1.html

         Instances of    Instances of               Y as Observed    Y as          Y as Log
  X      Y coded as 0    Y coded as 1     Total     Probability      Odds Ratio    Odds Ratio
  28           4               2              6         .3333            .5000        -.6931
  29           3               2              5         .4000            .6667        -.4055
  30           2               7              9         .7778           3.5000        1.2528
  31           2               7              9         .7778           3.5000        1.2528
  32           4              16             20         .8000           4.0000        1.3863
  33           1              14             15         .9333          14.0000        2.6391
89
Logistic Regression Model
  • A. Ordinary Linear Regression B. Logistic
    Regression

90
Logistic Regression Model
  • Weighted Linear Regression of Observed Log Odds Ratios on X

  X     Observed Probability    Log Odds Ratio    Weight
  28          0.3333                -.6931             6
  29          0.4                   -.4055             5
  30          0.7778                1.2528             9
  31          0.7778                1.2528             9
  32          0.8                   1.3863            20
  33          0.9333                2.6391            15
91
Logistic Regression Model
  • Properties of Regression Model
  • E(Y|x) = P(Y=1|x) \cdot 1 + P(Y=0|x) \cdot 0 = P(Y=1|x) is bounded between 0 and 1 for all values of x, which is not true if we use the linear model E(Y|x) = \beta_0 + \beta_1 x.
  • Unlike an ordinary regression coefficient, the coefficient \beta_1 here has the interpretation that it is the log of the odds ratio of a success event (Y = 1) for a unit change in x.
  • Extension to multiple predictor variables:
    \log \frac{P(Y=1|x_1, ..., x_k)}{1 - P(Y=1|x_1, ..., x_k)} = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k.

92
Standardized Regression Coefficients
  • Why do we need standardized regression coefficients?
  • Recall the fitted regression equation for the linear regression model:
    \hat y = \hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_k x_k.
  • The magnitudes of the \hat\beta_j cannot be directly used to judge the relative effects of the x_j on y, because they depend on the units in which the variables are measured.
  • By using standardized regression coefficients, we may be able to judge the importance of different predictors.

93
Standardized Regression Coefficients
  • Standardized Transform:
    y_i^* = \frac{y_i - \bar y}{s_y},    x_{ij}^* = \frac{x_{ij} - \bar x_j}{s_{x_j}},
    where s_y and s_{x_j} are the sample standard deviations of y and x_j.
  • Standardized Regression Coefficients:
    \hat\beta_j^* = \hat\beta_j \frac{s_{x_j}}{s_y},   j = 1, 2, ..., k.
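  • PROC REG prints standardized coefficients directly via the STB option; a sketch, shown here with the standby-hours data set test rather than the textbook's industrial sales data:

    proc reg data=test;
      model y = x1 x2 x3 x4 / stb;   /* stb: standardized regression coefficients */
    run;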

94
Standardized Regression Coefficients
  • Example (industrial sales data from the textbook)
  • Linear Model: y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon.
  • The regression equation is fitted and the standardized coefficients \hat\beta_1^*, \hat\beta_2^* are computed.
  • Notice that the predictor with the smaller unstandardized coefficient can have the larger standardized coefficient, and thus a much larger effect on y than the raw coefficients suggest.

95
Chapter Summary
  • Multiple linear regression model
  • Fitting the multiple regression model
  • Least squares fit
  • Goodness of fit of the model: SSE, SST, SSR, r^2
  • Statistical inference for multiple regression
  • 1. t-test for a single coefficient: H_0: \beta_j = 0 vs. H_1: \beta_j \ne 0
  • 2. F-test for all coefficients: H_0: \beta_1 = \cdots = \beta_k = 0 vs. H_1: \beta_j \ne 0 for at least one j
  • 3. F-test for a subset: H_0: \beta_{m+1} = \cdots = \beta_k = 0 vs. H_1: \beta_j \ne 0 for at least one j = m+1, ..., k
  • How do we select variables (SAS)?
  • Stepwise regression - a fancy automatic algorithm
  • Best subsets regression - more realistic, flexible
  • What if the data are not linear? Data transformation.
  • Building a multiple regression model: 7 steps

96
We very much appreciate your attention :)
  • Please feel free to ask questions.

97
The End
  • Thank You!