Regression - PowerPoint PPT Presentation
Provided by: johnt1
1
Regression
  • So far, we have considered single populations or
    have compared 2 or more populations
  • Now we consider relationships between different
    variables

2
Regression
  • The simplest relationship is a linear relation
    between 2 variables
  • Y = b0 + b1*X + e
  • e is normal, mean 0, SD unknown
  • Need to estimate b0, b1 from data

3
Regression
  • Usual criterion for "best fit" is "least squares"
  • Find b0, b1 to minimize SUM(y - (b0 + b1*x))^2
  • Note that this is the vertical sum of squares
  • Same as min SUM e^2
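As a sketch of the least-squares formulas above, here is a self-contained Python version (the data are made up for illustration; the deck itself works in Matlab):

```python
# Least-squares fit of y = b0 + b1*x from the closed-form formulas.
# The data below are illustrative, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
xavg = sum(x) / n
yavg = sum(y) / n

# Sxx and Sxy, as defined later in the deck
Sxx = sum((xi - xavg) ** 2 for xi in x)
Sxy = sum((xi - xavg) * (yi - yavg) for xi, yi in zip(x, y))

b1 = Sxy / Sxx          # slope
b0 = yavg - b1 * xavg   # intercept

# SSE = SUM(y - (b0 + b1*x))^2, the quantity being minimized
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
```

Any other line through the data gives a larger vertical sum of squares than this SSE.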

4
Regression
  • Matlab has this calculation built in
  • Suppose we have two col vectors, x and y
  • X = [ones(size(x)) x]
  • Puts a column of ones on the left so that X has
    2 cols
  • b = X\y will be the (two) coefficients
  • yh = X*b will be the model values
  • plot(x,y,'o',x,yh,'-'); grid
  • Will plot the data and the line

5
Regression
  • The LS line minimizes SUM(y - yh)^2
  • Call this quantity SSE
  • For a column x, the sum of squares is x'*x
  • We would like to know if SSE is small
  • Can compare it to the horizontal line (slope = 0)
    where yh = yavg
  • SSR = SUM(yh - yavg)^2
  • For SSE, df = N-2
  • For SSR, df = 1
  • Calculate MS and F and test

6
Regression
  • Note that F measures the relative sizes of SSR
    and SSE
  • SSR + SSE = SSTotal = SUM(y - yavg)^2
  • So we might want to know how much of SSTotal is
    SSR and how much is SSE
  • R^2 = index of determination = SSR/SSTotal
  • Large is good because we want SSE to be small
  • Not simple to interpret because it is the percent
    of the squared variation
  • (We'll be back to R^2)
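The decomposition SSR + SSE = SSTotal, along with F and R^2, can be sketched in Python (same made-up illustrative data as before):

```python
# SSR + SSE = SSTotal for a least-squares fit, plus F and R^2.
# Illustrative made-up data, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xavg, yavg = sum(x) / n, sum(y) / n

Sxx = sum((xi - xavg) ** 2 for xi in x)
Sxy = sum((xi - xavg) * (yi - yavg) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = yavg - b1 * xavg
yh = [b0 + b1 * xi for xi in x]

SSE = sum((yi - yhi) ** 2 for yi, yhi in zip(y, yh))  # about the line, df = n-2
SSR = sum((yhi - yavg) ** 2 for yhi in yh)            # explained by line, df = 1
SSTotal = sum((yi - yavg) ** 2 for yi in y)           # about the horizontal line

F = (SSR / 1) / (SSE / (n - 2))  # ratio of mean squares
R2 = SSR / SSTotal               # index of determination, between 0 and 1
```

A large F (equivalently, R^2 near 1) says the line explains most of the squared variation.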

7
Regression
  • We can also test the slope directly
  • The slope is a linear combination of the data and
    so has a normal distn
  • SD(slope) = SD(data)/sqrt(Sxx)
  • Sxx = SUM(x - xavg)^2
  • Use sqrt(MSE) = RMSE to estimate SD(data)
  • Since we estimated SD, use t distn with df = N-2
  • Note that t^2 = F
  • Pattern for df is df = N - number of parameters
    estimated
  • For regression, we estimate slope and intercept
  • For t test, we only estimate the mean
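A minimal Python sketch of the slope test, showing the t^2 = F identity (same made-up data):

```python
import math

# t test for the slope, and the t^2 = F identity.
# Illustrative made-up data, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xavg, yavg = sum(x) / n, sum(y) / n
Sxx = sum((xi - xavg) ** 2 for xi in x)
Sxy = sum((xi - xavg) * (yi - yavg) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = yavg - b1 * xavg

SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
MSE = SSE / (n - 2)        # df = N - 2: slope and intercept were estimated
RMSE = math.sqrt(MSE)      # estimates SD(data)

SD_slope = RMSE / math.sqrt(Sxx)
t = b1 / SD_slope          # compare to a t distn with N - 2 df

F = (b1 ** 2 * Sxx) / MSE  # SSR/1 over MSE; algebraically equals t^2
```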

8
Regression
  • We can also find confidence bounds for the slope

9
Regression
  • The precise values of slope and intercept depend
    on the sample that we use
  • If we used different samples (from the same
    population), then we would get some variation in
    slope and intercept
  • (We have already found the SD(slope))

10
Regression
  • For a given value of X, what would Y be?
  • Two answers
  • (1) What about the mean of the Ys for this
    particular X?
  • (2) What about a particular Y for this value of
    X?

11
Regression
  • We would use the regression line to estimate the
    mean value of Y for a particular value of X
  • But the line is based on a sample. We need to
    account for the variability of our estimates.
  • When we let X = Xi, the SD of the mean Y is
  • SD*sqrt(1/N + (Xi - Xavg)^2/Sxx)
  • If our Xi is near the middle of our Xs, then our
    estimate is less variable
  • If Xi is near the extremes, then our estimate is
    more variable

12
Regression
  • This only accounts for the variability in our
    estimate of the line
  • If we had a situation where X = Xi, what might Y
    be?
  • We know that Y might be above or below the line
  • SD(predicted) = SD*sqrt(1 + 1/N + (Xi - Xavg)^2/Sxx)
  • The "1 +" is the additional variation about the
    line
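The two SDs can be compared in a short Python sketch (made-up data; the textbook forms with a 1/N term are used, and RMSE stands in for the unknown SD):

```python
import math

# SD of the estimated mean Y vs. SD of a predicted individual Y at X = Xi.
# Illustrative made-up data, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xavg, yavg = sum(x) / n, sum(y) / n
Sxx = sum((xi - xavg) ** 2 for xi in x)
Sxy = sum((xi - xavg) * (yi - yavg) for xi, yi in zip(x, y))
b1 = Sxy / Sxx
b0 = yavg - b1 * xavg
SSE = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
RMSE = math.sqrt(SSE / (n - 2))  # estimates SD(data)

def sd_mean(Xi):
    # variability of the estimated line itself at Xi
    return RMSE * math.sqrt(1 / n + (Xi - xavg) ** 2 / Sxx)

def sd_pred(Xi):
    # the extra "1 +" is the variation of individual points about the line
    return RMSE * math.sqrt(1 + 1 / n + (Xi - xavg) ** 2 / Sxx)
```

sd_mean is smallest at Xi = xavg and grows toward the extremes; sd_pred is always larger because it adds the scatter about the line.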

13
Correlation
  • Suppose we consider X to be a linear function of
    Y
  • Y = b0 + b1*X
  • X = c0 + c1*Y
  • NOT true that c1 = 1/b1
  • Because the two lines are fit by different
    criteria
  • The first line minimizes squared differences in
    the Y direction
  • The second line minimizes squared differences in
    the X direction

14
Correlation
  • What if we don't know which model to use?
  • Correlation measures the degree to which X and Y
    are related
  • But not necessarily X as a fn of Y or Y as a fn
    of X
  • Can think of correlation as a generalization of
    slope which does not have units

16
Correlation
  • Define Sxy = SUM (x - xavg)(y - yavg), and Syy as
    we did Sxx
  • Then b1 = Sxy/Sxx
  • Note the units are y/x
  • c1 = Sxy/Syy (again note units)
  • Define (Pearson) correlation r = Sxy/sqrt(Sxx*Syy)
  • Note R^2 = b1*c1
  • (Here, R^2 is either the correlation squared or
    the index of determination. They are the same
    value.)
  • So, if the slopes are reciprocals, then R^2 = 1
  • R^2 = 1 means that SSE = 0, so the points fall
    exactly on a line
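These definitions can be checked numerically in Python (made-up data): the two slopes are not reciprocals, and their product is r^2.

```python
import math

# The two regression slopes and the Pearson correlation.
# Illustrative made-up data, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xavg, yavg = sum(x) / n, sum(y) / n
Sxx = sum((xi - xavg) ** 2 for xi in x)
Syy = sum((yi - yavg) ** 2 for yi in y)
Sxy = sum((xi - xavg) * (yi - yavg) for xi, yi in zip(x, y))

b1 = Sxy / Sxx                  # slope of Y on X (units y/x)
c1 = Sxy / Syy                  # slope of X on Y (units x/y)
r = Sxy / math.sqrt(Sxx * Syy)  # unitless correlation

# b1 * c1 = r^2, so c1 = 1/b1 only when the points fall exactly on a line
```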

17
Correlation
  • We can also do tests on r
  • SD(r) = sqrt((1 - r^2)/(n - 2))
  • But this depends on our data
  • So need a t distn
  • df = N-2
  • Same as for slope
  • Same as for SSE

18
Correlation
  • To test if r = 0, we could compute
  • t = (r - 0) / sqrt((1 - r^2)/(n - 2))
  • This is the same as the test for slope0
  • (EITHER slope)
  • And if we square it, we get F
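A Python sketch of the test on r (same made-up data), confirming that squaring t recovers the regression F:

```python
import math

# t statistic for testing r = 0; t^2 equals the regression F,
# so this is the same test as slope = 0.
# Illustrative made-up data, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
xavg, yavg = sum(x) / n, sum(y) / n
Sxx = sum((xi - xavg) ** 2 for xi in x)
Syy = sum((yi - yavg) ** 2 for yi in y)
Sxy = sum((xi - xavg) * (yi - yavg) for xi, yi in zip(x, y))
r = Sxy / math.sqrt(Sxx * Syy)

t = (r - 0) / math.sqrt((1 - r ** 2) / (n - 2))

# the F from the regression ANOVA, for comparison
b1 = Sxy / Sxx
SSE = Syy - b1 * Sxy              # SSTotal minus SSR
F = (b1 ** 2 * Sxx) / (SSE / (n - 2))
```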

19
Other models
  • Suppose we leave X out of the model
  • Y = b0
  • The est is b0 = Yavg
  • The test for b0 is the same as the t test for the
    mean
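A minimal Python sketch of the intercept-only model (made-up data): the least-squares estimate is the sample mean, and testing b0 is the one-sample t test.

```python
import math

# Intercept-only model Y = b0: the LS estimate is yavg,
# and the test for b0 reduces to the t test for the mean.
# Illustrative made-up data, not from the slides.
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(y)

b0 = sum(y) / n  # least-squares estimate is the sample mean
s = math.sqrt(sum((yi - b0) ** 2 for yi in y) / (n - 1))  # sample SD, df = n-1
t = b0 / (s / math.sqrt(n))  # one-sample t statistic for mean = 0
```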