Title: Partial Correlation
1 Partial Correlation
- What is it really?
- And
- More casual Causal Models
2 Onward (backward) to Partial Correlation
- Correlation
- Degree to which two variables are linearly related
- Partial correlation
- Degree to which two variables are linearly related
- While partialling out the effects of one or more control variables.
- Correlation between two variables when all cases have the same value on the control variables
- Initial Example: A correlation was found between height and salary even when controlling for gender, age, and weight.
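For a single control variable, the definition above is commonly written in terms of the three pairwise correlations. The symbols x, y, and z are notation assumed here for illustration (x and y are the variables of interest, z is the control variable):

```latex
r_{xy \cdot z} = \frac{r_{xy} - r_{xz}\, r_{yz}}{\sqrt{\left(1 - r_{xz}^{2}\right)\left(1 - r_{yz}^{2}\right)}}
```

If z is unrelated to both x and y, the formula reduces to the ordinary correlation r_xy.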
3 Why control
- Define explain. http://dictionary.reference.com/browse/explain
- Explanation means seeking ?
- Intro Stats Caution: Correlation does not mean a causal relationship exists.
- The Big Three Conditions for Cause
- 1
- 2
- 3
- Condition 3 relates to why we do experiments: manipulate one thing (and nothing else, we hope).
- Hardest condition to satisfy
4 Smoking and Lung Cancer
- Correlated
- Smoking precedes it
- Is the relationship true?
- I reckon not: Rule out other common causes of both smoking and lung cancer
- (as an aside: new research?)
- I reckon so: Demonstrate how the effect is mediated (via weakening of immune system).
- Time for the cause to work (produce the effect).
- Cause is probabilistic: smoking will increase your probability of developing lung cancer
5 Threat to Condition 3: Common Cause Models
- Sketch models
- What are possible common causes?
- Shoe size <-> Reading Ability
- School Climate <-> Student Achievement
6 Why control
- Partial correlation is one way to address Condition 3
- To remove what we think could be a common cause of our observed relationship
- Controlling for other variables ????
- Oh, !!, I forgot to control for x.
- What is the control process, really (or simply)?
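A toy illustration of what controlling for a common cause does. The numbers are invented purely for illustration (they are not from any study): z plays the role of the common cause, and x and y each depend on z plus their own unrelated noise. The function names are hypothetical helpers, not part of any package.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Invented scores: z is the common cause; e1 and e2 are noise terms that are
# uncorrelated with z and with each other.
z = [1, 1, -1, -1]
e1 = [1, -1, 1, -1]
e2 = [1, -1, -1, 1]
x = [2 * zi + e for zi, e in zip(z, e1)]  # x depends only on z and e1
y = [3 * zi + e for zi, e in zip(z, e2)]  # y depends only on z and e2

r_xy = pearson(x, y)
r_xy_given_z = partial_corr(x, y, z)
print(f"r_xy = {r_xy:.3f}, r_xy controlling for z = {r_xy_given_z:.3f}")
```

Here x and y correlate at about .85 even though neither influences the other. Once z is partialled out, the correlation is zero up to rounding error, which is the pattern a pure common cause model predicts.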
7 Doing Normal Statistics
Correlation
[Diagram: correlation r between x and y]
8 Doing Normal Statistics
Partial Correlation
[Diagram: correlation r between x and y, with control variable Z]
9 Regression Revisited
- Path diagram of the simple regression model
- Exploded version
- Parts is parts.
- What's the part that's left after we regress Y on X?
- Error? Garbage? A better term?
- Path diagram of the partial correlation model
- Exploded version
- What are the parts?
10 Definitions Revisited
- Partial Correlation
- Partial correlation is the correlation between the DV and a particular IV, controlling for all the other predictors on both the DV and the particular IV.
- Exploded path diagram
- Semi-partial Correlation (Part Correlation)
- Semi-partial (part) correlation is the correlation between the DV and a particular IV, controlling for all other predictors on just the particular IV.
- Exploded path diagram
- Computational Demonstration
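The two definitions differ only in what the other predictor is removed from. A minimal sketch, assuming a single other predictor and using the six complete cases from the deck's example data (the case with missing age is dropped). Treating weight as the DV and height as the IV is an assumption made for illustration, and `pearson` is a hypothetical helper.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

# Six complete cases from the example data (listwise deletion of the
# case whose age is missing).
height = [68, 61, 63, 70, 65, 72]
weight = [155, 99, 115, 205, 125, 220]
age = [23, 20, 21, 45, 30, 48]

# Assumed roles: DV = weight (y), IV = height (x), other predictor = age (z).
r_yx = pearson(weight, height)
r_yz = pearson(weight, age)
r_xz = pearson(height, age)

numerator = r_yx - r_yz * r_xz
# Partial: age is removed from both the DV and the IV.
partial = numerator / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))
# Semi-partial (part): age is removed from the IV only.
semipartial = numerator / math.sqrt(1 - r_xz ** 2)

print(f"partial = {partial:.3f}, semi-partial = {semipartial:.3f}")
```

The two values share a numerator, so the semi-partial can never exceed the partial in absolute size; they coincide only when the other predictor is uncorrelated with the DV.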
11 Computations
- SPSS:
- PARTIAL CORR
- /VARIABLES=X Y BY Z.
- SAS:
- PROC CORR DATA=CORR_EG NOSIMPLE;
- TITLE "Example of a Partial Correlation";
- VAR HEIGHT WEIGHT;
- PARTIAL AGE;
- RUN;
- To demonstrate residuals approach
- Get residuals for regression of Weight on Age, and Height on Age, and then correlate the residuals
- Should get the same value as the partial correlation between Weight and Height controlling for Age.
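The residuals approach can also be sketched outside SPSS and SAS. The following is a plain-Python illustration, not the course programs: it uses the six complete cases from the example data, and the helper names `pearson` and `residuals` are hypothetical. It computes the partial correlation twice, once by correlating residuals and once from the pairwise correlations, so the two can be compared.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)

def residuals(y, x):
    """Residuals from the least-squares regression of y on one predictor x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Six complete cases (the case with missing age is dropped).
height = [68, 61, 63, 70, 65, 72]
weight = [155, 99, 115, 205, 125, 220]
age = [23, 20, 21, 45, 30, 48]

# Step 1: residualize each variable of interest on the control variable.
rwt_age = residuals(weight, age)
rht_age = residuals(height, age)

# Step 2: correlate the residuals.
r_residuals = pearson(rwt_age, rht_age)

# Cross-check against the first-order partial correlation formula.
r_hw, r_ha, r_wa = pearson(height, weight), pearson(height, age), pearson(weight, age)
r_formula = (r_hw - r_ha * r_wa) / math.sqrt((1 - r_ha ** 2) * (1 - r_wa ** 2))

print(f"residuals: {r_residuals:.4f}  formula: {r_formula:.4f}")
```

The two printed values should agree to rounding error, which is the same check the SPSS and SAS programs perform with saved residuals.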
12 Extension to Multiple Regression
- MR context
- Model: Weight = Height Age
- Path model
- Explode
- Each predictor in a MR is controlling for each other predictor
- Squared semi-partial correlation for each predictor gives the unique contribution of that predictor over and above the other predictors.
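One way to see the "unique contribution" reading is to compare the squared semi-partial correlation with the gain in R-squared when a predictor is added last. The sketch below assumes the two-predictor model above and the six complete cases from the example data. All function names are hypothetical helpers, and the two-predictor least-squares fit is done by hand with the normal equations rather than with a statistics package.

```python
def center(v):
    """Subtract the mean from every element."""
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(a, b):
    return sum(u * w for u, w in zip(a, b))

def r2_one(y, x):
    """R-squared for the regression of y on a single predictor."""
    yc, xc = center(y), center(x)
    return dot(xc, yc) ** 2 / (dot(xc, xc) * dot(yc, yc))

def r2_two(y, x1, x2):
    """R-squared for the regression of y on two predictors (normal equations)."""
    yc, a, b = center(y), center(x1), center(x2)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yc), dot(b, yc)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return (b1 * s1y + b2 * s2y) / dot(yc, yc)

def sr2_by_residual(y, x, z):
    """Squared correlation between y and the part of x not predicted by z."""
    yc, xc, zc = center(y), center(x), center(z)
    slope = dot(xc, zc) / dot(zc, zc)
    x_res = [xi - slope * zi for xi, zi in zip(xc, zc)]
    return dot(x_res, yc) ** 2 / (dot(x_res, x_res) * dot(yc, yc))

# Six complete cases (the case with missing age is dropped).
height = [68, 61, 63, 70, 65, 72]
weight = [155, 99, 115, 205, 125, 220]
age = [23, 20, 21, 45, 30, 48]

r2_full = r2_two(weight, height, age)
sr2_height = r2_full - r2_one(weight, age)     # gain from adding height last
sr2_age = r2_full - r2_one(weight, height)     # gain from adding age last

print(f"R2 = {r2_full:.3f}, sr2(height) = {sr2_height:.3f}, sr2(age) = {sr2_age:.3f}")
```

Each R-squared gain matches the squared semi-partial correlation obtained by residualizing that predictor on the other one, which is what `sr2_by_residual` computes directly.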
13 Programs
- PartialCorrPlus.sps
- PartialCorrPlus.sas
14
DATA LIST FREE / gender (A) height weight age.
BEGIN DATA
M 68 155 23
F 61 99 20
F 63 115 21
M 70 205 45
M 69 170 .
F 65 125 30
M 72 220 48
END DATA.
LIST.

PARTIAL CORR
  /VARIABLES=height weight BY age
  /SIGNIFICANCE=ONETAIL
  /STATISTICS=CORR
  /MISSING=LISTWISE.

* Verification that correlation of residuals of the variables of interest is equivalent to the partial correlation of the variables of interest with the control variable included. In this case, Weight and Height controlling for Age.

* Regress weight on Age, output residual.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT weight
  /METHOD=ENTER age
  /SAVE RESID.

* Get residual for height on age.
REGRESSION
  /STATISTICS COEFF OUTS R ANOVA
  /DEPENDENT height
  /METHOD=ENTER age
  /SAVE RESID.

* Assign meaningful names to residuals.
RENAME VARIABLES (RES_1=rwt_age) (RES_2=rht_age).
EXECUTE.

CORRELATIONS
  /VARIABLES=rwt_age rht_age
  /PRINT=ONETAIL NOSIG.
15
DATA CORR_EG;
  INPUT GENDER $ HEIGHT WEIGHT AGE;
  DATALINES;
M 68 155 23
F 61 99 20
F 63 115 21
M 70 205 45
M 69 170 .
F 65 125 30
M 72 220 48
;
PROC PRINT DATA=CORR_EG (OBS=5);
RUN;

PROC CORR DATA=CORR_EG;
  TITLE "Example of a Correlation Matrix";
  VAR HEIGHT WEIGHT AGE;
RUN;

PROC CORR DATA=CORR_EG NOSIMPLE;
  TITLE "Example of a Partial Correlation";
  VAR HEIGHT WEIGHT;
  PARTIAL AGE;
RUN;

* Get the residuals from the regression of Height and of Weight on age;
PROC REG DATA=CORR_EG;
  MODEL HEIGHT = AGE / R;
  OUTPUT OUT=rstats R=rht_age;
  MODEL WEIGHT = AGE / R;
  OUTPUT OUT=rstats2 R=rwt_age;
RUN;
QUIT;

PROC PRINT DATA=rstats;
RUN;
PROC PRINT DATA=rstats2;
RUN;

* On-the-fly lesson in combining SAS data files;
DATA rstats; /* Put data files together, that is, by adding variables. */
  MERGE rstats rstats2; /* MERGE adds variables. SET adds observations (cases). */
RUN;
PROC PRINT DATA=rstats;
RUN;

* Verification that correlation of residuals, created by regressing each variable of interest on the control variable, is equivalent to the partial correlation between the variables of interest with the control variable;
PROC CORR DATA=rstats;
  TITLE "Verification Example of a Partial Correlation";
  VAR rht_age rwt_age;
RUN;
16 Next Assignment
- GS Lesson 32, Exs. 1-3
- When you answer specific questions, such as 1.a., answer in a complete sentence rather than just giving the answer.
- For example, 1.a. The bivariate correlation between num_pubs and cites was r = whatever.
- 1.b. asks for a p value. Given the number of correlations computed, it would be good to apply the Bonferroni correction to get your critical p. However, 1.b. just asks for the p for the correlation between num_pubs and instructor quality, not whether it is significant.
- For 2, include path diagrams for the relationship between instructor quality and num_pubs, with and without partialling out work ethic. Discuss whether the common cause hypothesis applies and to what extent.
- At the end, include partial correlation outputs from both SAS and SPSS plus minimal/critical procedure syntax for each.
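The Bonferroni correction mentioned for 1.b. is simple arithmetic: divide the overall alpha by the number of tests. A minimal sketch follows. The variable count of 4 and the alpha of .05 are hypothetical values chosen for illustration; substitute the number of correlations actually computed in the exercise.

```python
def n_pairwise_correlations(k):
    """Number of distinct pairwise correlations among k variables."""
    return k * (k - 1) // 2

def bonferroni_critical_p(alpha, n_tests):
    """Per-test critical p under the Bonferroni correction."""
    return alpha / n_tests

k = 4          # hypothetical number of variables in the correlation matrix
alpha = 0.05   # assumed overall (familywise) alpha

m = n_pairwise_correlations(k)
critical_p = bonferroni_critical_p(alpha, m)
print(f"{m} correlations, critical p = {critical_p:.4f}")
```

With these assumed values there are 6 correlations, so each one would need p below about .0083 to count as significant at an overall alpha of .05.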