From last time - PowerPoint PPT Presentation

About This Presentation
Title:

From last time

Description:

Title: No Slide Title Author: elizabeth garrett Last modified by: elizabeth garrett Created Date: 3/2/2001 9:58:38 PM Document presentation format – PowerPoint PPT presentation

Slides: 40
Provided by: elizabeth514
Learn more at: http://people.musc.edu

Transcript and Presenter's Notes

Title: From last time


1
From last time.

2
Basic Biostats Topics
  • Summary Statistics
  • mean, median, mode
  • standard deviation, standard error
  • Confidence Intervals
  • Hypothesis Tests
  • t-test (paired and unpaired)
  • Chi-Square test
  • Fisher's exact test

3
More Advanced
  • Linear Regression
  • Logistic Regression
  • Repeated Measures Analysis
  • Survival Analysis
  • Analyzing fMRI data

4
General Biostatistics References
  • Practical Statistics for Medical Research.
    Altman. Chapman and Hall, 1991.
  • Medical Statistics A Common Sense Approach.
    Campbell and Machin. Wiley, 1993
  • Principles of Biostatistics. Pagano and
    Gauvreau. Duxbury Press, 1993.
  • Fundamentals of Biostatistics. Rosner. Duxbury
    Press, 1993.

5
Lecture 3: Linear Regression
Child Psychiatry Research Methods Lecture Series
  • Elizabeth Garrett
  • esg_at_jhu.edu

6
Introduction
  • Simple linear regression is most useful for
    looking at associations between continuous
    variables.
  • We can evaluate if two variables are associated
    linearly.
  • We can evaluate how well we can predict one of
    the variables if we know the other.

7
Motivating Example (Tierney et al. 2001)
  • Is there an association between total sterol
    level and ADI scores in autistic children?
  • Hypothesis: Children with lower sterol levels
    will tend to have poorer performance (i.e., higher
    scores) on the following components of the ADI:
  • social
  • nonverbal
  • repetitive

8
Preliminary Data
  • 9 individuals with autism
  • Some have been on cholesterol supplementation (7
    out of 9)
  • Mean age: 14 years
  • Age range: 8 - 32 years
  • Sterol is a continuous variable
  • ADI scores are continuous variables

9
Statistical Language
  • Need to choose which variable is the predicted (Y)
    and which is the predictor (X).
  • Y: outcome, dependent variable, endogenous
    variable
  • X: covariate, predictor, regressor, explanatory
    variable, exogenous variable, independent
    variable
  • Our example?

10
How can we conclude whether or not there is
an association between sterol and the ADI scores?
11
One approach: Correlation
  • Correlation is a measure of LINEAR association
    between two variables.
  • It takes values from -1 to 1.
  • Often notated r or ρ
  • r = 1: perfect positive correlation
  • r = -1: perfect negative correlation
  • r = 0: no correlation
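
As a minimal sketch of how r behaves, the Python below computes the Pearson correlation by hand for three made-up data sets (none of these numbers come from the study, and `pearson_r` is a helper name introduced here). A rising line gives r = 1, a falling line gives r = -1, and a symmetric curve gives r = 0 even though y is completely determined by x, which is why r is described as a measure of linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up illustrative data, not taken from the study.
x = [-2, -1, 0, 1, 2]
r_rising = pearson_r(x, [2 * v + 1 for v in x])    # points on a rising line
r_falling = pearson_r(x, [-3 * v + 5 for v in x])  # points on a falling line
r_curve = pearson_r(x, [v ** 2 for v in x])        # symmetric curve y = x^2

print(r_rising, r_falling, r_curve)
```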

12
[Figure: four example scatterplots with r = 0.95, r = 0.77, r = -0.95, and r = 0.09]
13
Correlation between ADI measures and Sterol
[Figure: scatterplots of ADI scores against sterol; r = -0.85 (nonverbal), r = -0.70 (social), r = 0.06 (repetitive)]
14
Related to r: R²
  • R² = % of variation in Y explained by X.
  • Example:
  • Correlation between nonverbal score and sterol is
    -0.85.
  • R² is (-0.85)² ≈ 0.73 (using the unrounded r = -0.8541)
  • 73% of the variation in nonverbal score is
    explained by sterol
  • Gives a sense of the value of sterol in
    predicting nonverbal score
  • Other examples:
  • R² between sterol and social is 0.49
  • R² between sterol and repetitive is 0.004
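
The arithmetic linking r and R² can be checked in a few lines. This sketch squares the correlations quoted in the slides; pairing -0.70 with the social score and 0.06 with the repetitive score is an assumption based on their squares matching the R² values listed above.

```python
# Correlations quoted in the slides. The nonverbal value is the unrounded
# figure from the correlation matrix; the pairing of the other two values
# with "social" and "repetitive" is assumed from their matching R-squared.
r_nonverbal = -0.8541
r_social = -0.70
r_repetitive = 0.06

r2_nonverbal = r_nonverbal ** 2
r2_social = r_social ** 2
r2_repetitive = r_repetitive ** 2

print(f"nonverbal:  R2 = {r2_nonverbal:.2f}")
print(f"social:     R2 = {r2_social:.2f}")
print(f"repetitive: R2 = {r2_repetitive:.3f}")
```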

15
Simple Linear Regression (SLR) Approach
  • (1) Fits best line to describe the association
    between Y and X (note: straight line)
  • (2) Line can be described by two numbers:
  • - intercept
  • - slope
  • (3) By-product of regression: correlation
    measures how close the points fall to the line.
  • (4) Why simple? Only one X variable.

16
[Figure: scatterplot with fitted line; intercept = 24.8, slope = -0.01]
17
SLR answers two questions.
  • Association?
  • Does nonverbal score tend to decrease on average
    when sterol increases?
  • Is the slope different from zero?
  • Prediction?
  • Can we predict nonverbal score if we know sterol
    level?
  • Is the correlation (or R2) high?
  • You CAN have association with low correlation!

18
Equation of a line
  • y = β0 + β1 x
  • β0 = Intercept
  • β0 is the estimated nonverbal score if it were
    possible to have a sterol level of 0 (nonsensical
    in this case).
  • β0 calibrates height of line
  • β1 = Slope
  • β1 is the estimated change in nonverbal score for
    a one unit change in sterol
  • β1 = the estimated difference in nonverbal score
    comparing two kids whose sterol levels differ by
    one.
  • We usually use β1 as our measure of association
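
To make the intercept and slope concrete, here is a minimal least-squares sketch. The data are hypothetical points placed exactly on the line y = 25 - 0.01x (chosen to resemble the fitted line in the slides, not taken from the study), and `fit_line` is a helper name introduced for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical values lying exactly on y = 25 - 0.01 * x.
sterol = [500, 1000, 1500, 2000]
score = [20.0, 15.0, 10.0, 5.0]

b0, b1 = fit_line(sterol, score)
print(f"intercept = {b0:.2f}, slope = {b1:.4f}")
```

Because the points sit exactly on a line, the fit recovers that line; with real data the points scatter around it.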

19
The slope, β1
  • Is β1 different from zero?

Is each of these reasonable given the data that
we have observed?
20
Evaluating Association
  • β1 is a statistic, similar to a sample mean, and
    as such has a precision estimate.
  • The precision estimate is called the standard
    error of β1, denoted se(β1).
  • We look at how large β1 is compared to its
    standard error
  • β1 is often called a regression coefficient or
    a slope.

21
General Rule
  • If |β1 / se(β1)| > 2, then we say that
    β1 is
  • statistically significantly different from
    zero.
  • T-test interpretation
  • H0: β1 = 0
  • Ha: β1 ≠ 0
  • If |β1 / se(β1)| > 2 is true, then the p-value is less than
    0.05.
  • Intuition
  • β1 is large compared to its precision, so it is not
    likely that β1 is 0.

22
For large samples.
23
ADI Nonverbal and Sterol

------------------------------------------------------------------------------
  nonvrb       Coef.   Std. Err.        t     P>|t|      [95% Conf. Interval]
------------------------------------------------------------------------------
 totster   -.0099066   .0022804    -4.344     0.003     -.0152988   -.0045144
   _cons    24.84349   2.578369     9.635     0.000      18.74661    30.94036
------------------------------------------------------------------------------

Outcome: nonvrb
Predictor: totster
β1: Coef. for totster
se(β1): Std. Err. for totster
β0: Coef. for _cons
p-value: P>|t| column
R-squared = 0.73
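
The t column in this output is simply the coefficient divided by its standard error. A small sketch using the totster row of the table (the cutoff of 2 is the rough rule of thumb that matches the "β1 ± 2se(β1)" interval used in these slides):

```python
# Values copied from the totster row of the regression output.
coef = -0.0099066
std_err = 0.0022804

t_stat = coef / std_err

# Rule of thumb: the slope is "statistically significantly different
# from zero" when it is more than about 2 standard errors from zero.
significant = abs(t_stat) > 2

print(f"t = {t_stat:.3f}, significant = {significant}")
```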
24
Interpretation
  • Comparing two autistic kids whose sterol levels
    differ by 1, we estimate that the one with lower
    sterol will have an ADI nonverbal score that is
    higher by 0.01 points.
  • Put it in real units
  • Comparing two autistic kids whose sterol levels
    differ by 200, we estimate that the child with
    the lower sterol level will have an ADI nonverbal
    score that is higher by 2 points.

(Note: 200 × 0.01 = 2.0)
25
A few other details...
  • 95% confidence interval interpretation:
  • β1 ± 2se(β1) does not include zero.
  • β1/se(β1) is called the
  • t-statistic
  • Z-statistic
  • If you have a small sample (i.e., fewer than 50
    individuals), you need to use a t-correction.
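
A sketch of what the t-correction does for this nine-child example. With one predictor there are 9 - 2 = 7 degrees of freedom, and the multiplier 2.3646 is the 97.5th percentile of the t distribution with 7 degrees of freedom, hard-coded here from a standard t-table rather than computed. The coefficient and standard error are taken from the sterol regression output in the slides.

```python
# Slope and standard error from the nonverbal-on-sterol regression output.
coef = -0.0099066
std_err = 0.0022804

n = 9        # children in the preliminary data
df = n - 2   # one intercept and one slope are estimated

# 97.5th percentile of the t distribution with 7 df (from a t-table).
T_CRIT_DF7 = 2.3646

# Rough large-sample interval: estimate +/- 2 standard errors.
rough = (coef - 2 * std_err, coef + 2 * std_err)

# Small-sample interval with the t-correction; it is wider.
corrected = (coef - T_CRIT_DF7 * std_err, coef + T_CRIT_DF7 * std_err)

print(f"df = {df}")
print(f"rough:     ({rough[0]:.7f}, {rough[1]:.7f})")
print(f"corrected: ({corrected[0]:.7f}, {corrected[1]:.7f})")
```

The corrected interval reproduces the 95% Conf. Interval column of the output to the precision shown, and neither interval includes zero.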

26
Relationship between correlation and SLR
  • Testing that correlation is equal to zero is
    equivalent to testing that the slope is equal to
    zero.
  • Can have strong association and low correlation

[Figure: two scatterplots]
r = 0.93, β1 = 1.86, p-value < 0.001
r = 0.55, β1 = 1.88, p-value < 0.001
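
The same point can be reproduced with made-up numbers: two data sets share exactly the same fitted slope, but the one with more scatter around the line has a much lower correlation. The data and the helper `slope_and_r` are illustrative assumptions, and the sketch does not compute p-values.

```python
import math

def slope_and_r(xs, ys):
    """Least-squares slope and Pearson correlation for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6]
# Scatter pattern with zero mean that is uncorrelated with x,
# so scaling it changes the spread but not the fitted slope.
pattern = [1, -1, 0, 0, -1, 1]

y_tight = [2 * xi + 0.5 * e for xi, e in zip(x, pattern)]  # little scatter
y_loose = [2 * xi + 4.0 * e for xi, e in zip(x, pattern)]  # lots of scatter

slope_tight, r_tight = slope_and_r(x, y_tight)
slope_loose, r_loose = slope_and_r(x, y_loose)

print(f"tight: slope = {slope_tight:.2f}, r = {r_tight:.2f}")
print(f"loose: slope = {slope_loose:.2f}, r = {r_loose:.2f}")
```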
27
Additional Points
  • (1) Association measured is LINEAR

[Figure: scatterplot with r = 0.02]
28
Additional Points
  • (2) Difference (i.e., distance) between observed
    data and fitted line is called a residual, ε.

Residuals:
1. 0.74   2. -0.95   3. -2.53   4. 3.01   5. 2.52
6. 0.45   7. -3.15   8. -0.07   9. 0.59
[Figure: residuals ε3 and ε5 marked on the plot]
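
A residual is the observed value minus the value the fitted line predicts. The sketch below uses the intercept and slope from the sterol regression output in the slides; the child with sterol 1000 and nonverbal score 17 is hypothetical, since the individual data points are not given.

```python
# Intercept and slope from the nonverbal-on-sterol regression output.
b0 = 24.84349
b1 = -0.0099066

def residual(sterol, observed_score):
    """Observed nonverbal score minus the score predicted by the line."""
    fitted = b0 + b1 * sterol
    return observed_score - fitted

# Hypothetical child (not from the study): sterol 1000, nonverbal score 17.
example = residual(1000, 17.0)
print(f"residual = {example:.2f}")  # positive: the point lies above the line
```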
29
Additional Points
  • (3) Often see model equation as

Refers to regression line
Refer to observed data
Generically,
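
The equations on this slide are not in the transcript. One common way to write them, assuming the β0, β1, and ε notation from the earlier slides (the subscript i for the i-th child is introduced here), is the following; the first line refers to the regression line and the second to the observed data.

```latex
\begin{align*}
E[Y_i \mid X_i] &= \beta_0 + \beta_1 X_i \\
Y_i &= \beta_0 + \beta_1 X_i + \varepsilon_i
\end{align*}
```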
30
Additional Points
  • (4) Spread of points around line is assumed to be
    constant (i.e. variance of residuals is constant)

BAD!
31
Multiple Linear Regression
  • More than one X variable
  • Generally the same, except
  • Can't make plots in multiple dimensions
  • Interpretation of βs is somewhat different

32
Other ADI and Sterol SLRs
  • How is age when supplementation began related to
    sterol?
  • How is age when supplementation began related to
    nonverbal score?

34
How might this change our previous result?
  • What if age when cholesterol supplementation
    began is associated with both sterol level and
    nonverbal score?
  • Is it correct to conclude that total sterol level
    is associated with nonverbal score?

[Diagram: Sterol, Nonverbal Score, Supplementation Age]
35
We can adjust!
------------------------------------------------------------------------------
  nonvrb       Coef.   Std. Err.        t     P>|t|      [95% Conf. Interval]
------------------------------------------------------------------------------
  sterol   -.0105816   .0022118    -4.784     0.003     -.0159937   -.0051696
 agester    .1570626   .1158509     1.356     0.224     -.1264143    .4405394
   _cons    23.81569   2.551853     9.333     0.000      17.57153    30.05985
------------------------------------------------------------------------------
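
As a rough sketch of what "adjusting" means computationally, the code below fits a regression with two predictors by solving the least-squares normal equations directly. The data are hypothetical and constructed to lie exactly on y = 24 - 0.01·sterol + 0.15·agester, so the fit should recover those three numbers; `fit_two_predictors` is a name introduced here, and this is not the software used to produce the output above.

```python
def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 ** 2  # zero if the predictors are perfectly collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Hypothetical data (not from the study).
sterol = [500, 1000, 1500, 2000, 1200]
agester = [2, 5, 3, 8, 4]
nonvrb = [24 - 0.01 * s + 0.15 * a for s, a in zip(sterol, agester)]

b0, b1, b2 = fit_two_predictors(sterol, agester, nonvrb)
print(f"_cons = {b0:.3f}, sterol = {b1:.4f}, agester = {b2:.3f}")
```

Each slope is then read as the estimated difference in the outcome for a one-unit difference in that predictor, holding the other predictor fixed.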

36
Interpretation of Betas
  • Now that we have adjusted for age at
    supplementation, we need to include that in our
    result
  • Comparing two kids who began cholesterol
    supplementation at the same age and whose sterol
    levels differ by 250 units, we estimate that the
    child with the lower sterol level will have an
    ADI nonverbal score higher by about 2.6 points
    (250 × 0.0106 ≈ 2.6).
  • Adjusting for age at supplementation, comparing
    two kids whose sterol levels differ by 250 units,
    we estimate ...
  • Controlling for age at supplementation ...
  • Holding age at supplementation constant ...

37
Collinearity
  • If two variables are
  • correlated with each other
  • correlated with the outcome
  • Then, when combined in a MLR model, it could
    happen that
  • neither is significant
  • only one is significant
  • both remain significant

38
ADI and Sterol
  • Correlation Matrix

             nonvrb    sterol   agester
    -----------------------------------
     nonvrb  1.0000
     sterol -0.8541    1.0000
    agester  0.0531    0.2251    1.0000

We say that cholesterol time and sterol are
collinear.
39
Summing up example.
  • After adjusting for age at supplementation, it
    appears that sterol is still a significant
    predictor of ADI nonverbal score.
  • BUT!
  • Only NINE observations! With more, our estimates
    would be more precise, but the association could
    turn out stronger or weaker.
  • We haven't controlled for other potential
    confounders:
  • length of time on supplementation
  • nonverbal score prior to supplementation