Title: Regression
1Regression
2Outline of Today's Discussion
- Coefficient of Determination
- Regression Analysis Introduction
- Regression Analysis SPSS
- Regression Analysis Excel
- Independent Predictors
3Part 1
Coefficient of Determination
4Coefficient of Determination
In correlational research, researchers often use
the r-squared statistic, also called the
coefficient of determination, to describe the
proportion of Y variability explained by X.
5Coefficient of Determination
What range of values is possible for
the coefficient of determination (the r-squared
statistic)?
6Coefficient of Determination
Example: What is the evidence that IQ is
heritable?
7Coefficient of Determination
The r-value for the IQ of identical twins reared
apart is 0.6. What is the value of r-squared in
this case?
8Coefficient of Determination
So what proportion of the IQ is unexplained
(unaccounted for) by genetics?
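The two questions above can be worked through with a short sketch. It assumes only the r-value of 0.6 quoted for the twin example; the variable names are illustrative.

```python
# r-value quoted above for the IQ of identical twins reared apart
r = 0.6

# Coefficient of determination: proportion of Y variability explained by X
r_squared = r ** 2

# Whatever is not explained is left unaccounted for
unexplained = 1 - r_squared

print(f"r-squared   = {r_squared:.2f}")
print(f"unexplained = {unexplained:.2f}")
```

Because r ranges from -1 to +1, squaring it always gives a value between 0 and 1.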
9Coefficient of Determination
Different sciences are characterized by the
r-squared values that are deemed
impressive. (Chemists might expect r-squared to
be > 0.99.)
10Coefficient of Determination
As we have already seen, r-squared is the same as
eta-squared.
11Part 2
Regression Analysis Introduction
12Regression Analysis Introduction
- Correlation is the process of finding a
relationship between variables.
- Regression is the process of finding the
best-fitting trend (line) that describes the
relationship between variables.
- So, correlation and regression are very similar!
13Regression Analysis Introduction
- The r statistic can be tested for statistical
significance!
- Potential Pop Quiz Question: What two factors
determine the critical value (i.e., the number to
beat) when we engage in hypothesis testing?
14Regression Analysis Introduction
DF for Correlation & Regression: df = n - 2
Here, n stands for the number of pairs of
scores. Why would this be n-2, rather than the
usual n-1?
15Regression Analysis Introduction
- In general, the formula for the degrees of
freedom is the number of observations minus the
number of parameters estimated.
- For correlation, we have one estimate for the
mean of X, and another estimate for the mean of
Y.
- For regression, we have one estimate for the
slope, and another estimate for the y-intercept.
16Regression Analysis Introduction
Slope can also be thought of as rise over run.
17Regression Analysis Introduction
The rise on the ordinate = Y2 - Y1. The run
on the abscissa = X2 - X1.
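As a minimal sketch of rise over run, the snippet below computes a slope from two points on a line. The two points are hypothetical values chosen only for illustration.

```python
def slope_from_points(x1, y1, x2, y2):
    """Slope of the line through (x1, y1) and (x2, y2): rise over run."""
    rise = y2 - y1  # change along the ordinate (Y axis)
    run = x2 - x1   # change along the abscissa (X axis)
    return rise / run

# Hypothetical points: (2, 3) and (6, 11)
slope = slope_from_points(2, 3, 6, 11)
print(slope)
```

Here the rise is 8 and the run is 4, so Y goes up 2 units for every 1-unit increase in X.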
18Regression Analysis Introduction
Rise over run in pictures.
19Regression Analysis Introduction
Here, the regression is linear.
20Regression Analysis Introduction
Here, the regression is non-linear! What would
the equation look like for this trend?
21Regression Analysis Introduction
- Let's now return to linear regression, and learn
how to manually compute the slope and
y-intercept.
- To compute the slope, we need two quantities that
we have already learned. These are SPxy (sum of
products) and SSx (sum of squares for X).
22Regression Analysis Introduction
23Regression Analysis Introduction
Once we have the slope, it's easy to get the
y-intercept!
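The manual computation can be sketched as follows. This assumes the usual definitions, SPxy = sum of (X - mean of X)(Y - mean of Y) and SSx = sum of (X - mean of X) squared, with slope = SPxy / SSx and y-intercept = mean of Y - slope * mean of X. The data are hypothetical.

```python
def slope_and_intercept(xs, ys):
    """Least-squares slope and y-intercept from paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # SPxy: sum of products of the X and Y deviations from their means
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # SSx: sum of squared X deviations from the mean of X
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp_xy / ss_x
    # The best-fitting line passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired scores
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = slope_and_intercept(xs, ys)
print(f"Y = {slope:.1f}X + {intercept:.1f}")
```

For these scores SPxy = 6 and SSx = 10, so the slope is 0.6 and the y-intercept is 4 - (0.6)(3) = 2.2.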
24Part 3
Regression Analysis SPSS
25Regression Analysis SPSS
- Later we'll go to SPSS and get some practice with
regression.
- The steps in SPSS will be Analyze -> Regression
-> Linear.
- We will place the criterion (i.e., the Y-axis
variable) in the "Dependent" box, and the
predictor (i.e., the X-axis variable) in the
"Independent(s)" box.
- Click the "Statistics" box, and check
"estimates", "model fit", and "descriptives".
26Regression Analysis SPSS
The Coefficients Section In SPSS Output
The Coefficients Section in the SPSS
output contains all the info needed for the
regression equation, the r statistic, and the
evaluation of Ho (retain or reject).
27Regression Analysis SPSS
The Coefficients Section In SPSS Output
The constant is the b in Y = mX + b. Here, b =
-9923.665.
28Regression Analysis SPSS
The Coefficients Section In SPSS Output
The slope is the m in Y = mX + b. Here, m =
1807.836.
29Regression Analysis SPSS
The Coefficients Section In SPSS Output
So, our regression equation is Y = mX + b, or Y =
1807.836X - 9923.665.
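To see how this equation is used for prediction, here is a small sketch that plugs a predictor score into it. The slope and constant are the values from the Coefficients output above; the function name and the predictor value of 16 are hypothetical, chosen only for illustration.

```python
SLOPE = 1807.836      # m, from the Coefficients section
CONSTANT = -9923.665  # b, from the Coefficients section

def predict_criterion(x):
    """Predicted Y for a given predictor score X, using Y = mX + b."""
    return SLOPE * x + CONSTANT

# Hypothetical predictor score
predicted = predict_criterion(16)
print(round(predicted, 3))
```

Each additional unit of X raises the predicted Y by the slope, 1807.836.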
30Regression Analysis SPSS
The Coefficients Section In SPSS Output
The r statistic is the standardized coefficient,
Beta. r = .705
31Regression Analysis SPSS
The Coefficients Section In SPSS Output
Lastly, we look at the Sig. value for the
predictor (which is EDU in this case) to
determine whether the predictor (x-axis variable)
is significantly correlated with the criterion
(y-axis variable). Evaluate Ho: do we retain or
reject?
32Part 4
Regression Analysis Excel
33Regression Analysis Excel
- Correlation and regression are very similar.
- If we have a significant correlation, the
best-fitting regression line is said to have a
slope significantly different from zero.
- Sometimes it is stated that the slope departs
significantly from zero.
34Regression Analysis Excel
- Note: A slope can be very modestly different from
zero, and still be statistically significant if
all data points fall very close to the line.
- In correlation and regression, statistical
significance is determined by the strength of the
correlation between two variables (the r-value),
and NOT by the slope of the regression line.
- The significance of the r-value, as always,
depends on the alpha level, and the df (which is
n-2). Take a peek at the r-value table.
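The point that the r-value, not the size of the slope, carries the significance can be illustrated with a short sketch. It uses hypothetical scores and then shrinks every Y score by a factor of 1000: the slope shrinks with it, but r and df stay the same, so the significance test comes out the same. The helper name is illustrative.

```python
import math

def slope_and_r(xs, ys):
    """Return (slope, r) for paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp_xy / ss_x, sp_xy / math.sqrt(ss_x * ss_y)

# Hypothetical scores that fall close to a line
xs = [1, 2, 3, 4, 5, 6]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
ys_small = [y / 1000 for y in ys]  # same pattern, much flatter line

slope_big, r_big = slope_and_r(xs, ys)
slope_small, r_small = slope_and_r(xs, ys_small)
df = len(xs) - 2  # number of pairs minus 2

print(f"slope {slope_big:.4f}, r {r_big:.4f}, df {df}")
print(f"slope {slope_small:.7f}, r {r_small:.4f}, df {df}")
```

Both lines would be evaluated against the same critical r-value, because alpha, df, and r are identical in the two cases.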
35Regression Analysis Excel
36Regression Analysis Excel
- Remember: The regression line (equation) can help
us predict one score, given another score, but
only if there is a significant r-value.
- The terminology would be: the regression line
"explains" (or "accounts for") 42% of the
variability in the scores (if r-squared = .42).
- To "explain" or "account for" does NOT mean to
"cause". Correlation does not imply
causation!
37Regression Analysis Continued
- A synonym for regression is prediction! Recall
that prediction is one of the four goals of the
scientific method. What were the others?
- A significant correlation implies a significant
capacity for prediction, i.e., a prediction that
is reliably better than chance!
38Regression Analysis Continued
- The equation for a straight line, again, is
- y = mx + b
- or
- Criterion = (slope x Predictor) + Intercept
- How many parameters in a linear equation?
- How about a quadratic equation?
39Part 5
Independent Predictors
40Independent Predictors
- So far, we've attempted to use regression for
prediction.
- Specifically, we've tried to predict one variable
Y (called the criterion), using one other
variable (called the predictor).
- Multiple Regression: the process by which one
variable Y (called the criterion) is predicted on
the basis of more than one variable (say, X1, X2,
X3).
41Independent Predictors
Here's the simple case of one predictor
variable. The overlap (in gray) indicates the
predictive strength.
42Independent Predictors
If the overlap in the Venn diagram were to
grow, the r-value would grow, too!
43Independent Prediction
Variable X1
Criterion (Y)
Here's the same thing again, but we'll call the
predictor variable X1.
44Independent Prediction
Variable X2
Variable X1
Criterion (Y)
By adding another predictor variable X2, we could
sharpen our predictions. Why?
45Independent Prediction
Variable X2
Variable X1
Criterion (Y)
Unfortunately, X1 and X2 provide some
redundant information about Y, so the predictive
increase is small.
47Independent Prediction
Variable X3
Variable X2
Variable X1
Criterion (Y)
By contrast, variable X3 has no overlap with
either X1 or X2, so it would add the most new
information.
48Independent Prediction
Variable X3
Variable X2
Variable X1
Criterion (Y)
In short, since all three predictors provide some
unique information, predictions would be best when
using all three.
49Independent Prediction
Variable X3
Variable X2
Variable X1
Criterion (Y)
If you wanted to be more parsimonious and use
only two of the three, which two would you pick,
and why?
50Independent Predictors
- That was a conceptual introduction to Multiple
Regression (predicting Y scores from more than
one variable).
- We will not learn about the computations for
multiple regression in this course (but you will
if you take the PSYCH 370 course).
- For our purposes, simply know that predictions
improve to the extent that the various predictors
are independent of each other.
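For the curious, the last point can be illustrated numerically. The sketch below is not part of the course's required computations. It uses the standard two-predictor formula for the proportion of Y variability explained, R-squared = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2), where r_12 is the correlation between the two predictors. All correlation values are hypothetical.

```python
def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Proportion of Y variability explained by X1 and X2 together.

    r_y1, r_y2: correlations of each predictor with the criterion Y
    r_12: correlation between the two predictors (their redundancy)
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Hypothetical case: each predictor correlates .5 with Y
one_predictor = 0.5 ** 2                                # X1 alone
independent = r_squared_two_predictors(0.5, 0.5, 0.0)   # X1, X2 unrelated
redundant = r_squared_two_predictors(0.5, 0.5, 0.8)     # X1, X2 overlap heavily

print(f"X1 alone:               {one_predictor:.3f}")
print(f"X1 + independent X2:    {independent:.3f}")
print(f"X1 + redundant X2:      {redundant:.3f}")
```

With independent predictors the explained proportion doubles, whereas a heavily overlapping second predictor adds only a little beyond X1 alone, which matches the Venn diagrams above.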