Title: Lecture 4, part 1: Linear Regression Analysis: Two Advanced Topics
1. Lecture 4, part 1: Linear Regression Analysis: Two Advanced Topics
Karen Bandeen-Roche, PhD
Department of Biostatistics, Johns Hopkins University
Introduction to Statistical Measurement and Modeling
2. Data examples
- Boxing and neurological injury
- Scientific question: Does amateur boxing lead to decline in neurological performance?
- Some related statistical questions:
  - Is there a dose-response increase in the rate of cognitive decline with increased boxing exposure?
  - Is boxing-associated decline independent of initial cognition and age?
  - Is there a threshold of boxing that initiates harm?
3. Boxing data
4. Outline
- Topic 1: Confounding
  - Handling this is crucial if we are to draw correct conclusions about risk factors
- Topic 2: Signal/noise decomposition
  - Signal: regression model predictions
  - Noise: residual variation
  - Another way of approaching inference and the precision of prediction
5. Topic 1: Confounding
- "Confound" means to confuse
- Confounding arises when the comparison is between groups that are otherwise not similar in ways that affect the outcome
6. Confounding Example: Drowning and Eating Ice Cream
[Figure: drowning rate plotted against ice cream eaten]
7. Confounding
Epidemiology definition: A characteristic C is a confounder if it is associated (related) with both the outcome (Y = drowning) and the risk factor (X = ice cream) and is not causally in between.
8. Confounding
Statistical definition: A characteristic C is a confounder if the strength of the relationship between the outcome (Y = drowning) and the risk factor (X = ice cream) differs with, versus without, adjustment for C.
Here, C = outdoor temperature.
9. Confounding Example: Drowning and Eating Ice Cream
[Figure: drowning rate vs. ice cream eaten, with separate trends at warm and cool temperatures]
10. Effect modification
A characteristic E is an effect modifier if the strength of the relationship between the outcome (Y = drowning) and the risk factor (X = ice cream) differs within levels of E.
Here, E = outdoor temperature.
11. Effect Modification: Drowning and Eating Ice Cream
[Figure: drowning rate vs. ice cream eaten, with different slopes at warm and cool temperatures]
12. Topic 2: Signal/Noise Decomposition
- Lovely due to the geometry of least squares
- Facilitates testing involving multiple parameters at once
- Provides insight into R-squared
13. Signal/Noise Decomposition
- First step: decomposition of variance
  - Regression part: variance of the fitted values Ŷ
  - Error or residual part: variance of the residuals e
  - Together, these determine the total variance of the Ys
- In practice we work with sums of squares (SS) rather than variances per se:
  - Regression SS (SSR)
  - Error SS (SSE)
  - Total SS (SST)
14. Signal/Noise Decomposition
- Properties
  - SST = SSR + SSE
  - SSR/SST = proportion of variance explained by the regression = R-squared
    - Follows from the geometry
  - SSR and SSE are independent (assuming A1-A5) and have easily characterized probability distributions
    - Provides convenient testing methods
    - Follows from the geometry plus the assumptions
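These properties are easy to verify numerically. The following is a minimal numpy sketch with simulated (hypothetical) data; the variable names and the simulated model are illustrative, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): n observations, p = 2 predictors
n, p = 100, 2
X = rng.normal(size=(n, p))
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.8, size=n)

# Least-squares fit with an intercept column
Xd = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(Xd, y, rcond=None)
y_hat = Xd @ beta_hat

# Sums of squares around the mean of y
SST = np.sum((y - y.mean()) ** 2)
SSR = np.sum((y_hat - y.mean()) ** 2)
SSE = np.sum((y - y_hat) ** 2)

print(np.isclose(SST, SSR + SSE))   # True: the decomposition holds
print(SSR / SST)                    # R-squared
```

The identity SST = SSR + SSE holds exactly (up to floating point) because the residual vector is orthogonal to the fitted values.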
15. Signal/Noise Decomposition
- SSR and SSE are independent
  - Define M = span(X) and take Y as centered at its mean Ȳ
  - It is possible to orthogonally rotate the coordinate axes so that the first p axes ∈ M and the remaining n-p-1 axes ∈ M⊥ (Gram-Schmidt orthogonalization)
  - Doing this transforms Y into Z = T'Y, for some orthonormal matrix T with columns e1, ..., e_{n-1}
  - Distribution of Z: N(T'E[Y|X], σ²I)
16. Signal/Noise Decomposition
- SSR and SSE are independent, continued
  - Z = T'Y ⇔ Y = TZ
  - SSE = squared length of (Z_{p+1}, ..., Z_{n-1})
  - SSR = squared length of (Z_1, ..., Z_p)
  - The claim now follows: SSR and SSE are independent because (Z_1, ..., Z_p) and (Z_{p+1}, ..., Z_{n-1}) are independent
17. Signal/Noise Decomposition
- Under A1-A5, SSE, SSR, and their scaled ratio have convenient distributions
  - Under A1-A2, E[Y|X] ∈ M, so E[Zj|X] = 0 for all j > p
  - Recall Z_1, ..., Z_{n-1} are mutually independent normal with variance σ²
  - Thus SSE/σ² ~ χ²_{n-p-1} under A1-A5
    (a sum of k independent squared N(0,1) variables is χ²_k)
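A quick Monte Carlo check of this χ² result: if SSE/σ² ~ χ²_{n-p-1}, its mean should be n-p-1. This is a sketch with a made-up design matrix and coefficients (all names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, sigma = 30, 3, 2.0
Xd = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta = np.array([1.0, 0.5, -0.5, 0.2])

# Hat matrix projects onto M = span(X); residuals live in M-perp
H = Xd @ np.linalg.inv(Xd.T @ Xd) @ Xd.T

draws = []
for _ in range(5000):
    y = Xd @ beta + rng.normal(scale=sigma, size=n)
    e = y - H @ y                       # residual vector
    draws.append(np.sum(e ** 2) / sigma ** 2)

# A chi-square with n-p-1 df has mean n-p-1 = 26
print(np.mean(draws))
```

The simulated mean lands close to n-p-1 = 26, as the theory predicts.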
18. Signal/Noise Decomposition
- Under A1-A5, SSE, SSR, and their scaled ratio have convenient distributions
  - For j ≤ p, E[Zj|X] ≠ 0 in general
    - Exception: H0: β1 = ... = βp = 0
    - Then SSR/σ² ~ χ²_p under A1-A5
    - and F = (SSR/p) / [SSE/(n-p-1)] ~ F_{p,n-p-1}, with numerator and denominator independent
19. Signal/Noise Decomposition
- An organizational tool: the analysis of variance (ANOVA) table

SOURCE       Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression   SSR                   p                         MSR = SSR/p
Error        SSE                   n-p-1                     MSE = SSE/(n-p-1)
Total        SST = SSR + SSE       n-1

F = MSR/MSE
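Given the SS and df entries of such a table, the F statistic and its p-value follow directly. A sketch with made-up sums of squares (the numbers are illustrative only; scipy is assumed available):

```python
import numpy as np
from scipy import stats

# Hypothetical ANOVA-table entries: n observations, p predictors
n, p = 100, 2
SSR, SSE = 40.0, 60.0

MSR = SSR / p                  # regression mean square
MSE = SSE / (n - p - 1)        # error mean square
F = MSR / MSE
p_value = stats.f.sf(F, p, n - p - 1)   # upper-tail F probability

print(F, p_value)
```

`stats.f.sf` gives the upper-tail probability, i.e., the p-value for the global test that all slope coefficients are zero.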
20. Global hypothesis tests
- These involve sets of parameters
- Hypotheses of the form
  H0: βj = 0 for all j in a defined subset of {1, ..., p}
  vs. H1: βj ≠ 0 for at least one such j
- Example 1: H0: βLATITUDE = 0 and βLONGITUDE = 0
- Example 2: H0: all polynomial or spline coefficients involving a given variable = 0
- Example 3: H0: all coefficients involving a variable = 0
21. Global hypothesis tests
- Testing method: sequential decomposition of sums of squares
  - The hypothesis to be tested is H0: β_{j1} = ... = β_{jk} = 0 in the full model
  - Fit the model excluding x_{j1}, ..., x_{jk}; save its error sum of squares as SSE_S
  - Fit the full (or larger) model, adding x_{j1}, ..., x_{jk} back; save its error sum of squares as SSE_L (often the overall SSE)
  - Test statistic: S = [(SSE_S - SSE_L)/k] / [SSE_L/(n-p-1)]
  - Distribution under the null: F(k, n-p-1)
  - Define the rejection region based on this distribution
  - Compute S
  - Reject or not according to whether S falls in the rejection region
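The steps above can be sketched as a nested-model F test. This is a minimal illustration with simulated (hypothetical) data, testing whether two added predictors improve on a one-predictor model; scipy is assumed available:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical data: do x2 and x3 add signal beyond x1?
n = 80
x1, x2, x3 = rng.normal(size=(3, n))
y = 2.0 + 1.0 * x1 + 0.8 * x2 + rng.normal(size=n)  # x3 truly has no effect

def sse(X, y):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

ones = np.ones(n)
SSE_S = sse(np.column_stack([ones, x1]), y)           # smaller model
SSE_L = sse(np.column_stack([ones, x1, x2, x3]), y)   # larger (full) model

k, p = 2, 3                  # k coefficients tested, p in the full model
S = ((SSE_S - SSE_L) / k) / (SSE_L / (n - p - 1))
p_value = stats.f.sf(S, k, n - p - 1)
print(S, p_value)
```

Because x2 carries real signal here, the test rejects; excluding predictors can only increase the SSE, so S is always nonnegative.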
22. Signal/Noise Decomposition
- An augmented version for global testing

SOURCE       Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression   SSR                   p                         SSR/p
  X1         SST - SSE_S           p1
  X2 | X1    SSE_S - SSE_L         p2                        (SSE_S - SSE_L)/p2
Error        SSE_L                 n-p-1                     SSE_L/(n-p-1)
Total        SST = SSR + SSE       n-1

F = MS(2|1)/MSE
23. R-squared: Another view
- From last lecture: R² = [Corr(Y, Ŷ)]²
- More conventional: R² = SSR/SST
- The geometry justifies why these are the same
  - Cov(Y, Ŷ) = Cov(Y - Ŷ, Ŷ) + Cov(Ŷ, Ŷ) = Cov(e, Ŷ) + Var(Ŷ) = Var(Ŷ)
  - Covariance is an inner product; the first term = 0
- A measure of the precision with which the regression model describes individual responses
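The equivalence of the two R² definitions is easy to confirm numerically. A small sketch with simulated (hypothetical) data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Least-squares fit with intercept
Xd = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
y_hat = Xd @ beta

# R-squared two ways: SSR/SST and squared correlation of Y with Y-hat
R2_ss = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
R2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2

print(np.isclose(R2_ss, R2_corr))   # True: the two definitions agree
```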
24. Outline: A few more topics
- Collinearity
- Overfitting
- Influence
- Mediation
- Multiple comparisons
25. Main points
- Confounding occurs when an apparent association between a predictor and the outcome reflects the association of each with a third variable
- A primary goal of regression is to adjust for confounding
- The least squares decomposition of Y into fit and residual provides an appealing statistical testing framework
- An association of an outcome with predictors is evidenced if the SS due to regression is large relative to the SSE
- Geometry: the orthogonal decomposition provides convenient sampling distributions and a view of R², organized in the ANOVA table