Regression and Prediction - PowerPoint PPT Presentation

Title:

Regression and Prediction

Slides: 14
Provided by: kensav
Transcript and Presenter's Notes

Title: Regression and Prediction


1
Regression and Prediction
  • Minium, Clarke & Coladarci, Chapter 8

2
Regression and Prediction
  • If there is some association (correlation)
    between two variables, this means that knowing a
    person's score on one variable (X) provides
    information about his or her score on the other
    variable (Y)
  • So, we should be able to predict (with some level
    of accuracy) the Y score from knowledge of the X
    score
  • Situations in which we'd like to be able to do
    this:
  • university performance from high school
    performance
  • job performance from questionnaire score
  • child's depression score from parent's depression
    score
  • chance of going to jail from schoolyard behaviour

3
Regression and Prediction
  • Correlation vs Prediction
  • The magnitude of r is related to how well we can
    predict the Y value from the X value
  • To understand this we need to consider a
    prediction line (what we'll later call a
    regression line)
  • Remember the definition of a line?
  • Y = a + b(X)
  • b is the slope of the line and a is the intercept
  • cost of gas
  • e.g., cost of a fill-up = b(X), where X is the
    number of liters and b is the price per liter
  • cost of a car repair
  • a + b(X), where X is time in hours, b is the cost
    per hour, and a is a flat rate that you pay just for
    bringing your car to the mechanic
  • e.g., cost of repair = 50 + 70(X)

4
Determining the Line of Best Fit
  • The Regression Equation
  • The slope
  • The intercept

              SAT      GPA
  Mean        545.8    2.57
  S           123.2    0.52
  r = .5
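
The slope and intercept formulas for this slide are not in the transcript. A minimal sketch, assuming the standard least-squares results b = r(SY/SX) and a = mean(Y) - b × mean(X), and assuming SAT is the predictor X and GPA is the predicted variable Y (the function and variable names are illustrative only):

```python
def regression_line(mean_x, mean_y, s_x, s_y, r):
    """Return (a, b) for the least-squares line Y' = a + b*X."""
    b = r * s_y / s_x        # slope: r rescaled from z-score units to raw units
    a = mean_y - b * mean_x  # intercept: the line passes through (mean_x, mean_y)
    return a, b


# Summary statistics from the SAT/GPA table
# (assumption: SAT is X, GPA is Y)
a, b = regression_line(mean_x=545.8, mean_y=2.57, s_x=123.2, s_y=0.52, r=0.5)


def predict(sat):
    """Predicted GPA for a given SAT score."""
    return a + b * sat


print(f"GPA' = {a:.3f} + {b:.5f} * SAT")
print(round(predict(545.8), 2))  # a student at the mean SAT is predicted the mean GPA
```

Under these assumptions the line always passes through the point (mean of X, mean of Y), which is why the last line prints the mean GPA.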
5
Determining the Line of Best Fit
  • The regression equation defines the line of best
    fit
  • The line of best fit minimizes the sum of squared
    prediction errors
  • This is called the least squares criterion, or
    the least squares solution

6
Determining the Line of Best Fit
  • An Example

7
Regression and Sums of Squares
  • The concept of Explained Variance
  • The total variation in the Y scores is the sum of
    squared deviations from the mean (we've seen this
    before)
  • The total variation can be divided into two parts
  • The explained variation is the sum of squared
    deviations of the predicted Y scores (Y') from
    the mean of Y
  • The unexplained variation is the sum of squared
    deviations of the actual Y scores from the
    predicted Y scores (Y')
  • The explained variation can be thought of as the
    variation in the Y scores explained by Y's
    association with X

8
Regression and Sums of Squares
  • The concept of Explained Variance
  • Therefore, the total variation can be expressed
    as the sum of the explained and unexplained
    variation
  • The proportion of explained variation is called
    the coefficient of determination and equals r²
  • we won't try to prove this relationship, we'll
    just trust the statisticians who assure us that
    it's true.
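
The relationship can at least be checked numerically. A small sketch, using made-up data (the five X and Y values below are illustrative, not from the slides) and the least-squares line fitted to them:

```python
from statistics import mean

# Illustrative data (made up for this sketch)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

mx, my = mean(xs), mean(ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # sum of cross-products
sxx = sum((x - mx) ** 2 for x in xs)                    # sum of squares for X
syy = sum((y - my) ** 2 for y in ys)                    # sum of squares for Y

b = sxy / sxx    # least-squares slope
a = my - b * mx  # least-squares intercept
preds = [a + b * x for x in xs]

ss_total = syy
ss_explained = sum((p - my) ** 2 for p in preds)               # predicted vs mean
ss_unexplained = sum((y - p) ** 2 for y, p in zip(ys, preds))  # actual vs predicted

r = sxy / (sxx * syy) ** 0.5

# Total variation splits into explained + unexplained
print(round(ss_total, 6), round(ss_explained + ss_unexplained, 6))
# Proportion explained equals r squared
print(round(ss_explained / ss_total, 6), round(r ** 2, 6))
```

Each printed pair should match: the first shows total = explained + unexplained, the second shows explained / total = r².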

10
The Regression Equation in terms of z Scores
  • We've seen before that the correlation
    coefficient can be calculated as follows
  • It can also be calculated this way
  • In words, convert the raw scores in X and raw
    scores in Y to z-scores (ZX and ZY), multiply each
    pair of z-scores, and average the cross-products.
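
The two formulas on this slide are not in the transcript. Assuming the standard definitions, with the standard deviations S_X and S_Y computed using n in the denominator, they would read roughly as follows:

```latex
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n \, S_X S_Y}
\qquad\text{and}\qquad
r = \frac{\sum z_X z_Y}{n}
```

The second form follows from the first by moving S_X and S_Y inside the sum, since z_X = (X - \bar{X}) / S_X and z_Y = (Y - \bar{Y}) / S_Y.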

11
The Regression Equation in terms of z Scores
  • The z-score regression formula is ZY' = r(ZX)
  • where ZY' is the predicted value of Y expressed
    as a z score
  • r is the correlation coefficient between X and Y
  • ZX is the z score of X
  • In other words, r is the slope of the regression
    equation when X and Y are transformed into z
    scores.
  • For each standard deviation change in X, the
    predicted Y changes by r standard deviations (p. 140)
  • The smaller r is, the closer the predicted value
    of Y is to the mean.
  • This is called regression toward the mean
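
Regression toward the mean can be seen directly from the z-score formula. A minimal sketch, assuming the formula ZY' = r(ZX) described above (the function name is illustrative):

```python
def predict_z(r, z_x):
    """Predicted z score of Y from the z score of X: ZY' = r * ZX."""
    return r * z_x


z_x = 2.0  # a score two standard deviations above the mean of X

# As r shrinks, the predicted z score of Y moves toward 0 (the mean of Y)
for r in (1.0, 0.5, 0.0):
    print(r, predict_z(r, z_x))
# prints:
# 1.0 2.0
# 0.5 1.0
# 0.0 0.0
```

Only when r is 1 (or -1) is the predicted Y as extreme as X; with r = 0 the best prediction for everyone is the mean of Y.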

13
The Regression Equation in terms of z Scores
  • Note that if we know the mean and standard
    deviation of X and Y, and we know r, we can
    calculate the predicted Y score as follows
  • where
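
The formula and its definitions are missing from the transcript. One standard way to obtain a raw-score formula, sketched here under the assumption that it starts from the z-score equation ZY' = r(ZX), is to substitute the definition of a z score on both sides and solve for Y':

```latex
\begin{aligned}
z_{Y'} &= r \, z_X \\
\frac{Y' - \bar{Y}}{S_Y} &= r \, \frac{X - \bar{X}}{S_X} \\
Y' &= \bar{Y} + r \, \frac{S_Y}{S_X} \, (X - \bar{X})
\end{aligned}
```

Here \bar{X} and \bar{Y} are the means, S_X and S_Y the standard deviations, and r the correlation. For example, with the SAT/GPA statistics given earlier (and assuming SAT is X), an SAT score of 669.0, one standard deviation above the mean, gives a predicted GPA of 2.57 + 0.5(0.52) = 2.83.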