Title: Covariance
1 Covariance
[Figure: points (x, y) plotted on x and y axes]
2 Covariance
[Figure: points (x, y) plotted on x and y axes]
3 Covariance
Below-average values of x are paired with above-average values of y.
Above-average values of x are paired with above-average values of y.
Below-average values of x are paired with below-average values of y.
Above-average values of x are paired with below-average values of y.
So what happens on balance?
4 Covariance
What happens on balance?
Calculate the average of the products of the x and y deviations from their means.
5 Covariance
What happens on balance?
Calculate the average of the products of the x and y deviations from their means.
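A minimal sketch of the covariance calculation: take each point's deviation from the mean of x and from the mean of y, multiply the two, and average the products. The data are hypothetical (not the slide's data), and the n - 1 divisor is an assumption, since the slides do not say whether they divide by n or n - 1.

```python
# Hypothetical paired observations (not taken from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sample_covariance(xs, ys):
    """Average product of deviations from the means (n - 1 divisor, assumed)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Positive products: both values on the same side of their means.
    # Negative products: the values fall on opposite sides of their means.
    products = [(xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys)]
    return sum(products) / (n - 1)


s_xy = sample_covariance(x, y)
print(s_xy)
```

A positive result means the same-side pairs dominate on balance; a negative result means the opposite-side pairs dominate.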
6 Covariance Example
s_xy = 1.999
[Figure: plot with axes labeled Wage and Aptitude]
7 Correlation
r_xy = 0.476
[Figure: plot with axes labeled Wage and Aptitude]
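A short sketch of how a correlation coefficient relates to a covariance: divide the covariance by the product of the two standard deviations. The data and function names are hypothetical, and the n - 1 divisor is an assumption (it cancels in the ratio).

```python
import math

# Hypothetical paired observations (not the wage/aptitude data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sample_cov(xs, ys):
    """Sample covariance with the n - 1 divisor (assumed)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """r_xy = s_xy / (s_x * s_y)."""
    s_x = math.sqrt(sample_cov(xs, xs))
    s_y = math.sqrt(sample_cov(ys, ys))
    return sample_cov(xs, ys) / (s_x * s_y)


r_xy = correlation(x, y)

# Rescaling y (e.g. dollars instead of thousands) changes the covariance
# but leaves the correlation unchanged.
y_scaled = [1000 * v for v in y]
cov_scaled = sample_cov(x, y_scaled)
r_scaled = correlation(x, y_scaled)
print(round(r_xy, 3), round(r_scaled, 3), cov_scaled)
```

The covariance depends on the units of x and y, while the correlation always lies between -1 and 1.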
8 Perfect Correlation
9 Fit That Line!
y = 2,500 + 1,800x
y = 10,000 + 1,000x
y = 13,000 + 750x
10 Fit That Line!
y = 8,135 + 1,233x minimizes the sum of squared errors
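A minimal sketch of what "minimizes the squared errors" means, using hypothetical data rather than the slide's data. The least-squares slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means. The candidate lines are made up for comparison.

```python
# Hypothetical data (not the slide's data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sse(intercept, slope):
    """Sum of squared vertical distances between the points and the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form least-squares estimates for a single explanatory variable.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
best_sse = sse(intercept, slope)

# Hypothetical alternative lines: each one has a larger sum of squared errors.
candidates = [(2.0, 0.7), (3.0, 0.4), (1.0, 1.0)]
for a, b in candidates:
    print(f"y = {a} + {b}x  SSE = {sse(a, b):.2f}")
print(f"least squares: y = {intercept:.2f} + {slope:.2f}x  SSE = {best_sse:.2f}")
```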
11 Word Problem
- Students in a small class were polled by a
researcher attempting to establish a relationship
between hours of study in the week preceding a test
and the result of the test.
- If you get data on hours studied and exam
results, which variable is the dependent
variable? Why?
12 Word Problem
y = 39.406 + 2.122x
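A small sketch of how the fitted line is used for prediction. It assumes the line reads y = 39.406 + 2.122x (the operators are reconstructed from the garbled slide text), with y the exam result and x the hours studied. The function name is hypothetical.

```python
INTERCEPT = 39.406  # predicted exam result at zero hours of study
SLOPE = 2.122       # predicted change in exam result per additional hour


def predict_score(hours):
    """Predicted exam result for a given number of study hours."""
    return INTERCEPT + SLOPE * hours


for hours in (0, 5, 10):
    print(hours, round(predict_score(hours), 3))
```

Under these assumptions, each extra hour of study is associated with a predicted score that is 2.122 higher, whatever the starting number of hours.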
13 Word Problem
Excel Regression Output (Data Analysis Add-In)
14 Word Problem
Excel Regression Output (StatPad Add-In)
15 The Nine Lives of Goldfish
16 Predicting Job Performance
Simple regression: Perform = 3.956 + 0.022 age
17 Predicting Job Performance
Perform = 4.865 + 0.037 age + 0.011 seniority - 0.032 cognitive
Note the importance of ceteris paribus (all else constant).
18 Predicting Job Performance
Perform = 4.865 + 0.037 age + 0.011 seniority - 0.032 cognitive
And holding seniority constant at 10 and cognitive constant at 1
19 Predicting Job Performance
Perform = 4.865 + 0.037 age + 0.011 seniority - 0.032 cognitive
And holding seniority constant at 20 and cognitive constant at -1
With linear models, the particular values of the other variables don't matter, as long as they are held constant.
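A brief sketch of the ceteris paribus point: in a linear model, the predicted effect of a change in age is the same whichever values the other variables are held at. The coefficients come from the slide, with the "=" and "+" operators assumed (only the minus sign survives in the slide text). The ages 30 and 40 are hypothetical.

```python
def predict_perform(age, seniority, cognitive):
    """Perform = 4.865 + 0.037 age + 0.011 seniority - 0.032 cognitive (signs assumed)."""
    return 4.865 + 0.037 * age + 0.011 * seniority - 0.032 * cognitive


# Effect of ten more years of age, holding seniority = 10 and cognitive = 1.
effect_a = predict_perform(40, 10, 1) - predict_perform(30, 10, 1)

# Same comparison, holding seniority = 20 and cognitive = -1 instead.
effect_b = predict_perform(40, 20, -1) - predict_perform(30, 20, -1)

print(round(effect_a, 3), round(effect_b, 3))
```

The levels of the two predictions differ between the scenarios, but the age effect (ten times the age coefficient) does not.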
20 Predicting Job Perf. With a Dummy Variable
Structured Interview Dummy Variable: 1 = yes, 0 = no
21 Predicting Job Perf. With a Dummy Variable
Perform = 4.820 + 0.037 age + 0.010 seniority - 0.025 cognitive + 2.850 structured interview
The dummy variable turns on and off with all else constant.
22 Predicting Job Perf. With a Dummy Variable
Perform = 4.865 + 0.037 age + 0.010 seniority - 0.025 cognitive + 2.850 structured interview
And holding seniority constant at 10 and cognitive constant at 1
23 Predicting Job Perf. With a Dummy Variable
Note new y-intercept
Seniority = 20, Cognitive = 0
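A minimal sketch of how a dummy variable shifts the y-intercept. It uses the slide-21 coefficients (intercept 4.820), with the "=" and "+" operators assumed, and holds seniority at 20 and cognitive at 0. The function name is hypothetical.

```python
def predict_perform(age, seniority, cognitive, structured):
    """structured is the dummy variable: 1 = structured interview, 0 = none."""
    return (4.820 + 0.037 * age + 0.010 * seniority
            - 0.025 * cognitive + 2.850 * structured)


# y-intercepts of the two performance-versus-age lines.
intercept_no = predict_perform(0, 20, 0, 0)
intercept_yes = predict_perform(0, 20, 0, 1)
print(round(intercept_no, 3), round(intercept_yes, 3))

# The gap between the two lines is the dummy coefficient at every age,
# so the lines are parallel.
gap_at_25 = predict_perform(25, 20, 0, 1) - predict_perform(25, 20, 0, 0)
gap_at_55 = predict_perform(55, 20, 0, 1) - predict_perform(55, 20, 0, 0)
print(round(gap_at_25, 3), round(gap_at_55, 3))
```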
24 Multiple Dummy Variables
      Source |          SS    df          MS        Number of obs =    3525
-------------+------------------------------        F( 14,  3510) =  125.63
       Model |  5035.58483    14  359.684631        Prob > F      =  0.0000
    Residual |  10049.2032  3510  2.86302087        R-squared     =  0.3338
-------------+------------------------------        Adj R-squared =  0.3312
       Total |  15084.7881  3524  4.28058685        Root MSE      =   1.692

-------------------------------------------------------------------------------
     perform |      Coef.  Std. Err.         t   P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
         age |  -.0301543   .0016933   -17.808   0.000    -.0334742   -.0268344
    seniorty |   .0016888    .002762     0.611   0.541    -.0037265     .007104
    cognitve |   .0119113   .0286362     0.416   0.677    -.0442339    .0680565
    strucint |   3.665569   .7995184     4.585   0.000     2.098001    5.233137
        job1 |   1.928286   .1277788    15.091   0.000     1.677758    2.178814
        job2 |    .426524   .1260009     3.385   0.001     .1794815    .6735664
        job3 |   .1407506   .1306411     1.077   0.281    -.1153896    .3968908
        job4 |   .2921016   .1347211     2.168   0.030     .0279621    .5562411
        job5 |  -1.069262   .1331017    -8.033   0.000    -1.330227   -.8082974
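A short sketch of how the t and confidence-interval columns in output like this relate to the coefficient and standard-error columns: t is the coefficient divided by its standard error, and the 95% interval is the coefficient plus or minus a critical value times the standard error. The critical value 1.96 is an assumption (the large-sample normal approximation), so the interval endpoints match the table only approximately.

```python
# (coefficient, standard error) pairs copied from the output above.
rows = {
    "age": (-0.0301543, 0.0016933),
    "strucint": (3.665569, 0.7995184),
    "job1": (1.928286, 0.1277788),
}

Z_CRIT = 1.96  # assumed critical value (normal approximation)


def t_stat(coef, se):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / se


def conf_interval(coef, se, crit=Z_CRIT):
    """Approximate 95% confidence interval for the coefficient."""
    return coef - crit * se, coef + crit * se


for name, (coef, se) in rows.items():
    low, high = conf_interval(coef, se)
    print(f"{name:>8}  t = {t_stat(coef, se):8.3f}  CI = [{low:.6f}, {high:.6f}]")
```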
25 Interaction Variables
      Source |          SS    df          MS        Number of obs =    3525
-------------+------------------------------        F(  6,  3518) =  121.08
       Model |  2581.89927     6  430.316544        Prob > F      =  0.0000
    Residual |  12502.8888  3518  3.55397635        R-squared     =  0.1712
-------------+------------------------------        Adj R-squared =  0.1697
       Total |  15084.7881  3524  4.28058685        Root MSE      =  1.8852

-------------------------------------------------------------------------------
     perform |      Coef.  Std. Err.         t   P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
         age |      -.006   .0034204    -1.705   0.088    -.0125379    .0008743
    seniorty |       .011   .0030589     3.559   0.000     .0048879    .0168827
    cognitve |      -.005   .0318774    -0.167   0.867    -.0678283    .0571719
    strucint |      2.129   .8937022     2.383   0.017     .3770909    3.881545
      manual |     -1.513   .2391962    -6.327   0.000    -1.982442   -1.044488
    manl_age |      -.042    .004011   -10.439   0.000    -.0497349   -.0340066
       _cons |      6.009   .2354444    25.526   0.000     5.548275    6.471517
-------------------------------------------------------------------------------
Note: manual is a dummy variable indicating a manual occupation; manl_age is
age interacted with manual (i.e., manl_age = manual × age).
26 Interaction Variables
Note different slopes, too.
Seniority = 20, Cognitive = 0, StrucInt = 0
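A minimal sketch of why the interaction term gives the two occupation groups different slopes as well as different intercepts. The coefficients are the rounded values from the regression output with manual and manl_age, evaluated at seniority = 20, cognitive = 0, strucint = 0. The function name is hypothetical.

```python
def predict_perform(age, manual, seniority=20, cognitive=0, strucint=0):
    """manual is a 0/1 dummy; manual * age plays the role of manl_age."""
    return (6.009 - 0.006 * age + 0.011 * seniority - 0.005 * cognitive
            + 2.129 * strucint - 1.513 * manual - 0.042 * manual * age)


# y-intercepts (age = 0) for the two groups.
intercept_nonmanual = predict_perform(0, manual=0)
intercept_manual = predict_perform(0, manual=1)

# Age slopes: the change in predicted performance per extra year of age.
slope_nonmanual = predict_perform(41, manual=0) - predict_perform(40, manual=0)
slope_manual = predict_perform(41, manual=1) - predict_perform(40, manual=1)

print(round(intercept_nonmanual, 3), round(slope_nonmanual, 3))
print(round(intercept_manual, 3), round(slope_manual, 3))
```

The manual coefficient shifts the intercept, and the manl_age coefficient is added to the age slope for manual occupations only.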
27 Another Interaction Variable Example
      Source |          SS    df          MS        Number of obs =   15321
-------------+------------------------------        F(  5, 15315) =  800.50
       Model |   804247599     5   160849520        Prob > F      =  0.0000
    Residual |  3.0773e+09 15315  200936.252        R-squared     =  0.2072
-------------+------------------------------        Adj R-squared =  0.2069
       Total |  3.8816e+09 15320  253367.252        Root MSE      =  448.26

--------------------------
    earnwkly |      Coef.
-------------+------------
     married |    136.003
      female |   -169.837
       exper |      2.946
    parttime |   -227.716
      exp_pt |     -1.896
       _cons |    700.802
--------------------------
exper is potential labor market experience (age - educ - 6)
28 Interaction Variables
Married = 1, Female = 1
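A brief sketch of the earnings lines for a married woman, using the coefficients from the earnwkly regression. It assumes exp_pt is exper multiplied by the parttime dummy; the name suggests this, but the slides do not define it. The function name is hypothetical.

```python
def predict_earnwkly(exper, parttime, married=1, female=1):
    """Predicted weekly earnings; exper * parttime stands in for exp_pt (assumed)."""
    return (700.802 + 136.003 * married - 169.837 * female
            + 2.946 * exper - 227.716 * parttime - 1.896 * exper * parttime)


# Intercepts (exper = 0) for full-time and part-time workers.
intercept_full = predict_earnwkly(0, parttime=0)
intercept_part = predict_earnwkly(0, parttime=1)

# Experience slopes: predicted change per extra year of potential experience.
slope_full = predict_earnwkly(11, parttime=0) - predict_earnwkly(10, parttime=0)
slope_part = predict_earnwkly(11, parttime=1) - predict_earnwkly(10, parttime=1)

print(round(intercept_full, 3), round(slope_full, 3))
print(round(intercept_part, 3), round(slope_part, 3))
```

Under that assumption, part-time workers start from a lower intercept and also gain less per year of experience.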
29 Adjusted R²
      Source |          SS    df          MS        Number of obs =    3525
-------------+------------------------------        F( 14,  3510) =  125.63
       Model |  5035.58483    14  359.684631        Prob > F      =  0.0000
    Residual |  10049.2032  3510  2.86302087        R-squared     =  0.3338
-------------+------------------------------        Adj R-squared =  0.3312
       Total |  15084.7881  3524  4.28058685        Root MSE      =   1.692

-------------------------------------------------------------------------------
     perform |      Coef.  Std. Err.         t   P>|t|     [95% Conf. Interval]
-------------+-----------------------------------------------------------------
         age |  -.0301543   .0016933   -17.808   0.000    -.0334742   -.0268344
    seniorty |   .0016888    .002762     0.611   0.541    -.0037265     .007104
    cognitve |   .0119113   .0286362     0.416   0.677    -.0442339    .0680565
    strucint |   3.665569   .7995184     4.585   0.000     2.098001    5.233137
        job1 |   1.928286   .1277788    15.091   0.000     1.677758    2.178814
        job2 |    .426524   .1260009     3.385   0.001     .1794815    .6735664
        job3 |   .1407506   .1306411     1.077   0.281    -.1153896    .3968908
        job4 |   .2921016   .1347211     2.168   0.030     .0279621    .5562411
        job5 |  -1.069262   .1331017    -8.033   0.000    -1.330227   -.8082974
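A minimal sketch of how the summary statistics in this output follow from the sums of squares and degrees of freedom. R-squared uses the raw sums of squares, while adjusted R-squared uses the mean squares and so penalizes additional explanatory variables. The variable names are hypothetical; the numbers are copied from the output.

```python
import math

# Sums of squares and degrees of freedom from the output above.
ss_model, df_model = 5035.58483, 14
ss_resid, df_resid = 10049.2032, 3510
ss_total, df_total = 15084.7881, 3524

ms_model = ss_model / df_model  # mean square for the model
ms_resid = ss_resid / df_resid  # mean square of the residuals
ms_total = ss_total / df_total  # sample variance of perform

r_squared = ss_model / ss_total
adj_r_squared = 1 - ms_resid / ms_total
f_stat = ms_model / ms_resid
root_mse = math.sqrt(ms_resid)

print(round(r_squared, 4), round(adj_r_squared, 4),
      round(f_stat, 2), round(root_mse, 3))
```

Adding a variable always raises R-squared (or leaves it unchanged), but it raises adjusted R-squared only if the residual mean square falls.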
30 Causality?
For those who still doubt that Internet-related
investments will pay off, consider this: A
PricewaterhouseCoopers study released earlier
this year found that productivity gains in 2000
were 2.7 times greater for Internet-enabled
companies than for businesses that have not
leveraged the Web.
http://business.cisco.com/prod/tree.taf3Fpublic_viewtruekbns1asset_id66966.html
31 Causality
- Reasons for an estimated statistical relationship
  - The explanatory variable is the direct cause of
    the response (dependent) variable
  - The response variable is causing a change in the
    explanatory variable (reverse causality)
  - The explanatory variable is a contributing, but
    not sole, cause of the response variable
  - Confounding variables may exist
  - Both variables may stem from a common cause
  - Both variables are changing over time
  - Coincidence
Source: Jessica M. Utts (1999), Seeing Through
Statistics, 2nd ed., Pacific Grove, CA: Duxbury,
p. 186.
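A small simulation sketch of the "common cause" case from the list above. A hypothetical variable z drives both x and y, and neither x nor y affects the other, yet the two are strongly correlated. All names, the seed, and the noise levels are illustrative assumptions.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# z is the common cause; x and y each equal z plus independent noise.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


r_raw = correlation(x, y)

# Remove the common cause: what is left of x and y is unrelated noise.
x_net = [xi - zi for xi, zi in zip(x, z)]
y_net = [yi - zi for yi, zi in zip(y, z)]
r_net = correlation(x_net, y_net)

print(round(r_raw, 2), round(r_net, 2))
```

A regression of y on x in this setup would show a strong estimated relationship even though changing x would do nothing to y.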