Biostatistics in Practice - PowerPoint PPT Presentation

1
Biostatistics in Practice
Session 5 Associations and Confounding
Peter D. Christenson, Biostatistician
http://gcrc.LABioMed.org/Biostat
2
Session 5 Preparation 1
1. We often hear news reports of "seasonally
adjusted unemployment rates". Can you think of a
logical way that this adjustment could be made?
3
Session 5 Preparation 2
From Table 3
Unadjusted
What does adjusted mean? How is it done?
Adjusted
4
Goal One of Session 5
Earlier: Compare means for a single measure among groups. Use t-test, ANOVA.
Session 5: Relate two or more measures. Use correlation or regression.
ΔY/ΔX
Qu et al. (2005), JCEM 90:1563-1569.
5
Goal Two of Session 5
Try to isolate the effects of different characteristics on an outcome.
Previous slide: Gender, GH Peak, BMI.
6
Correlation
  • Visualize Y (vertical) by X (horizontal) scatter plot.
  • Pearson correlation, r, is used to measure association between two
    measures X and Y.
  • Ranges from -1 (perfect inverse association) to 1 (perfect direct
    association).
  • Value of r does not depend on:
    the scales (units) of X and Y;
    which role X and Y assume, as in an X-Y plot.
  • Value of r does depend on:
    the ranges of X and Y;
    the values chosen for X, if X is fixed and Y is measured.
7
Graphs and Values of Correlations
8
Logic for Value of Correlation

$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}} $$
Statistical software gives r.
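As a minimal sketch of what the software does, the formula can be evaluated directly from its sums. The data values and the function name `pearson_r` are made up for illustration; they are not from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from the sums in the formula."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
# Changing the units of X, or swapping the roles of X and Y, leaves r unchanged.
r_rescaled = pearson_r([10.0 * v for v in x], y)
r_swapped = pearson_r(y, x)
print(round(r, 4), round(r_rescaled, 4), round(r_swapped, 4))
```

All three printed values agree, which matches the earlier point that r does not depend on units or on which variable is called X.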
9
Correlation Depends on Ranges of X and Y
[Figure: scatter plots A and B.]
Graph B contains only the graph A points in the
ellipse. Correlation is reduced in graph B. Thus
correlations for the same quantities X and Y may
be quite different in different study
populations.
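The effect of a restricted range can be illustrated with simulated data. This is only a sketch: the distributions, sample size, and seed are arbitrary assumptions, and the subset is cut by an interval on X rather than by an ellipse as in the slide.

```python
import numpy as np

# Simulated (hypothetical) data: Y rises with X, plus random scatter.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0.0, 10.0, size=n)
y = x + rng.normal(0.0, 2.0, size=n)

# Correlation over the full range of X.
r_full = np.corrcoef(x, y)[0, 1]

# Correlation using only the points with X in a narrow band.
mask = (x > 4.0) & (x < 6.0)
r_restricted = np.corrcoef(x[mask], y[mask])[0, 1]

print(f"full range: r = {r_full:.2f}")
print(f"4 < x < 6:  r = {r_restricted:.2f}")
```

The underlying X-Y relation is the same in both calculations; only the range of X differs, and the narrow band gives a much smaller r.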
10
Correlation and Measurement Precision
[Figure: panels A and B. The overall data show an association, but r = 0 for the subpopulation with 5 < x < 6.]
A lack of correlation for the subpopulation with 5 < x < 6 may be due to
inability to measure x and y well. Lack of evidence of association is not
evidence of lack of association.
11
Regression
  • Again: Y (vertical) by X (horizontal) scatterplot, as with
    correlation. See next slide.
  • X and Y now assume unique roles:
    Y is an outcome, response, output, dependent variable.
    X is an input, predictor, independent variable.
  • Regression analysis is used to:
    Measure X-Y association, as with correlation.
    Fit a straight line through the scatter plot, for
    prediction of Y from X, and for estimation of Δ in Y for a unit
    change in X (slope = effect of X on Y).
12
Regression Example
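The fitted line minimizes $\sum e_i^2$, where $e_i$ is the vertical distance from point i to the line.
[Figure: scatter plot with fitted line, showing the range for individuals and the range for the mean.]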
Statistical software gives all this info.
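A short sketch of the least-squares fit itself, using the standard closed-form slope and intercept. The five data points and the helper name `sum_sq_errors` are hypothetical.

```python
# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

# Closed-form least-squares estimates.
slope = sxy / sxx
intercept = y_mean - slope * x_mean

def sum_sq_errors(b0, b1):
    """Sum of squared vertical distances e_i from the line b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

sse = sum_sq_errors(intercept, slope)
# Any other line gives a larger sum of squared errors.
sse_steeper = sum_sq_errors(intercept, slope + 0.1)
sse_shifted = sum_sq_errors(intercept + 0.5, slope)
print(slope, intercept, sse, sse_steeper, sse_shifted)
```

Nudging the slope or the intercept away from the fitted values increases the sum of squared errors, which is what "minimizes" means here.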
13
X-Y Association
If slope = 0 then X and Y are not associated. But the slope measured from
a sample will never be exactly 0. How different from 0 does a measured
slope need to be in order to claim X and Y are associated?
Side note: It turns out that slope = 0 is equivalent to correlation r = 0.
14
X-Y Association
Test slope = 0 vs. slope ≠ 0, with the rule:
Claim association (slope ≠ 0) if t_c = |slope/SE(slope)| > t ≈ 2.
There is a 5% chance of claiming an X-Y association that really does not
exist.
Note similarity to t-test for means: t_c = mean/SE(mean).
Formula for SE(slope) is in statistics books.
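A sketch of the rule in code, using the usual textbook formula SE(slope) = s / sqrt(Σ(x - x̄)²), where s² is the sum of squared errors divided by n - 2. The data are hypothetical (30 points scattered deterministically around a line), and the cutoff of 2 is the approximate value from the slide, which suits moderate to large samples.

```python
import math

# Hypothetical data: n = 30 points around the line y = 2 + 0.5 x.
n = 30
x = [float(i) for i in range(n)]
y = [2.0 + 0.5 * xi + 3.0 * math.sin(i) for i, xi in enumerate(x)]

x_mean = sum(x) / n
y_mean = sum(y) / n
sxx = sum((xi - x_mean) ** 2 for xi in x)
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

# Least-squares slope and intercept.
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residual standard deviation s, then the standard error of the slope.
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
se_slope = s / math.sqrt(sxx)

# Test statistic and the approximate decision rule.
t_c = slope / se_slope
claim_association = abs(t_c) > 2
print(f"slope={slope:.3f} SE={se_slope:.3f} t={t_c:.2f} claim={claim_association}")
```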
15
Example Software Output
The regression equation is Y = 81.6 + 2.16 X

Predictor    Coeff     StdErr    T        P
Constant     81.64     11.47     7.12     <0.0001
X            2.1557    0.1122    19.21    <0.0001

S = 21.72    R-Sq = 79.0%

Predicted Values
X = 100    Fit = 297.21    SE(Fit) = 2.17
95% CI: 292.89 - 301.52
95% PI: 253.89 - 340.52

Notes on the output:
T for X: 19.21 = 2.16/0.112 should be between -2 and 2 if true slope = 0.
Constant: refers to the intercept.
Fit: predicted y = 81.6 + 2.16(100).
95% CI: range of Ys with 95% assurance for the mean of all subjects with x = 100.
95% PI: range of Ys with 95% assurance for an individual with x = 100.
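The numbers in this output can be cross-checked with a few lines of arithmetic. The sketch below uses only values printed in the output. The one added assumption is the standard result that the prediction interval for an individual uses sqrt(S² + SE(Fit)²) as its standard error.

```python
import math

# Values copied from the software output.
intercept, slope, se_slope = 81.64, 2.1557, 0.1122
s, se_fit = 21.72, 2.17
x_new = 100

# Predicted Y at x = 100 and the t statistic for the slope.
fit = intercept + slope * x_new
t_slope = slope / se_slope

# Standard error for predicting one individual (standard formula, assumed here).
se_pred = math.sqrt(s ** 2 + se_fit ** 2)

# Multipliers implied by the printed intervals: half-width / standard error.
ci_multiplier = (301.52 - 292.89) / 2 / se_fit
pi_multiplier = (340.52 - 253.89) / 2 / se_pred

print(f"fit = {fit:.2f}, t = {t_slope:.2f}")
print(f"CI multiplier = {ci_multiplier:.3f}, PI multiplier = {pi_multiplier:.3f}")
```

Both intervals imply a multiplier just under 2, consistent with the "about 2" rule for the t-test. The PI is much wider than the CI because it includes the person-to-person scatter S as well as the uncertainty in the fitted mean.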
16
Goal Two of Session 5
Try to isolate the effects of different
characteristics on an outcome.
Ethnicity
Outcome
Age
17
Another Study
Potential doping test for athletes.
J Clin Endocrin Metab 2006 Nov; 91(11):4424-32.
18
Study Goals: Outcomes are IGF-1 and Collagen Markers
Determine the relative and combined explanatory
power of age, gender, BMI, ethnicity, and sport
type on the markers.
Figure 2.
One conclusion is lack of differences between
ethnic IGF-1 means, after adjustment for age,
gender, and BMI (Fig 2). How are these
adjustments made?

19
Adjustment For a Single Continuous
Characteristic
We simulate data for Caucasians and Africans only, for simplicity, to
demonstrate attenuation of a 155 - 140 = 15 µg/L ethnic difference to a
160 - 157 = 3 µg/L ethnic difference.
[Figure: simulated IGF-1 by age for the two groups, with marked values 158, 160, 140, 155.]
20
Adjustment For a Single Continuous
Characteristic
Problem: Want to compare groups on IGF-1. Groups to be compared
(ethnicities) have different mean ages, and IGF-1 tends to decrease with
age.
Solution: Make groups appear to have the same mean age.
21
Adjustment For a Single Continuous
Characteristic
Solution: Make groups appear to have the same mean age. To do this:
  • Find the regression line predicting IGF-1 from age.
  • Move each subject parallel to the regression line to the mean age.
    This is the expected IGF-1 if this subject had been at the mean age.
  • Adjusted means are means of these adjusted individual values.
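The steps above can be sketched as follows. Everything here is a made-up illustration: the ages and IGF-1 values are hypothetical and noise-free, the groups are simply called A and B, and the slope is taken as a single pooled within-group slope, which is one common way to define "the regression line" for this purpose.

```python
import numpy as np

# Hypothetical data: group B is older on average, and IGF-1 falls with age.
age_a = np.array([20.0, 25.0, 30.0, 35.0, 40.0])
age_b = np.array([30.0, 35.0, 40.0, 45.0, 50.0])
igf_a = 220.0 - 2.0 * age_a
igf_b = 217.0 - 2.0 * age_b

def centered(v):
    """Deviations of each value from its own group mean."""
    return v - v.mean()

# Step 1: slope of IGF-1 on age, pooled within the two groups.
num = (centered(age_a) * centered(igf_a)).sum() + (centered(age_b) * centered(igf_b)).sum()
den = (centered(age_a) ** 2).sum() + (centered(age_b) ** 2).sum()
slope = num / den

# Step 2: move each subject along that slope to the overall mean age.
mean_age = np.concatenate([age_a, age_b]).mean()
adj_a = igf_a + slope * (mean_age - age_a)
adj_b = igf_b + slope * (mean_age - age_b)

# Step 3: compare unadjusted and adjusted group means.
unadjusted_diff = igf_a.mean() - igf_b.mean()
adjusted_diff = adj_a.mean() - adj_b.mean()
print(f"unadjusted difference: {unadjusted_diff:.1f}")
print(f"adjusted difference:   {adjusted_diff:.1f}")
```

In this toy example most of the raw group difference comes from the 10-year age gap, so the difference shrinks sharply once both groups are moved to the same mean age.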
22
(No Transcript)
23
Adjustment For a Single Continuous
Characteristic
We have just described a special case of multiple regression, in which an
outcome is estimated by multiple predictors.
Simple Regression: Estimated IGF-1 = intercept + slope(age)
Multiple Regression: Estimated IGF-1 = intercept + slope(age) + diff(indicator)
Indicator = 0 if African, 1 if Caucasian.
24
Adjustment For a Single Continuous
Characteristic
Software: Select Regression or Analysis of Covariance, usually from a menu
(menu screenshot not in transcript).
Output: Values of b0, b1, b2 for IGF1 = b0 + b1(age) + b2(indicator)
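For readers without a menu-driven package, the same three values can be obtained from an ordinary least-squares fit. This is a sketch with hypothetical data, built so that the answer is known in advance (b0 = 250, b1 = -2, b2 = 3). The indicator coding follows the slide: 0 if African, 1 if Caucasian.

```python
import numpy as np

# Hypothetical subjects, for illustration only.
age = np.array([20.0, 30.0, 40.0, 50.0, 30.0, 40.0, 50.0, 60.0])
indicator = np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0])
# Scatter chosen so that it does not disturb the fitted coefficients.
scatter = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
igf1 = 250.0 - 2.0 * age + 3.0 * indicator + scatter

# Design matrix with columns for b0 (constant), b1 (age), b2 (indicator).
X = np.column_stack([np.ones_like(age), age, indicator])
b0, b1, b2 = np.linalg.lstsq(X, igf1, rcond=None)[0]

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}, b2 = {b2:.2f}")
```

Here b2 is the age-adjusted difference between the two groups: the expected gap in IGF-1 between two subjects of the same age who differ only in the indicator.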
25
Multiple Regression
We have seen the logic of adjusting for a single
characteristic. The next few slides try to give
a geometric view of generalizing adjustment to
account for several factors simultaneously.
26
Multiple Regression Geometric View
Multiple predictors may be continuous. Geometrically, this is fitting a
slanted plane to a cloud of points.
www.StatisticalPractice.com
LHCY is the Y (homocysteine) to be predicted from the two Xs: LCLC
(folate) and LB12 (B12).
LHCY = b0 + b1 LCLC + b2 LB12 is the equation of the plane.
27
How Are Coefficients Interpreted?
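LHCY = b0 + b1 LCLC + b2 LB12
(LHCY is the outcome; LCLC and LB12 are the predictors.)
LB12 may have both an independent and an indirect (via LCLC) association
with LHCY.
[Diagram: LCLC linked to LHCY, labeled "b1 ?"; LB12 linked to LHCY, labeled "b2 ?"; LCLC and LB12 linked by their correlation.]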
28
Coefficients Meaning of their Values
LHCY = b0 + b1 LCLC + b2 LB12
(LHCY is the outcome; LCLC and LB12 are the predictors.)
LHCY increases by b2 for a 1-unit increase in LB12, if other factors
(LCLC) remain constant. This is also described as adjusting for other
factors in the model (LCLC).
It may be physiologically impossible to maintain one predictor constant
while changing the other by 1 unit.
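A small numerical sketch of this distinction follows. The values are invented and noise-free, with LCLC and LB12 deliberately correlated. Only the variable names come from the slides; the coefficients -1.0 and -0.2 are arbitrary assumptions.

```python
import numpy as np

# Hypothetical, correlated predictors.
lb12 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
lclc = lb12 + np.array([0.5, -0.5, 0.5, -0.5, 0.5, -0.5])
# Outcome built from an assumed plane: b0 = 3, b1 = -1.0, b2 = -0.2.
lhcy = 3.0 - 1.0 * lclc - 0.2 * lb12

# Multiple regression: coefficient of LB12 with LCLC held constant.
X = np.column_stack([np.ones_like(lb12), lclc, lb12])
b0, b1, b2 = np.linalg.lstsq(X, lhcy, rcond=None)[0]

# Simple regression of LHCY on LB12 alone: LCLC is free to vary with LB12.
simple_slope, simple_intercept = np.polyfit(lb12, lhcy, 1)

print(f"b2 (adjusted for LCLC): {b2:.3f}")
print(f"slope ignoring LCLC:    {simple_slope:.3f}")
```

The unadjusted slope is several times larger in magnitude than b2, because it also picks up the indirect association that runs through LCLC.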
29
Another Example HDL Cholesterol
Output:

             Coefficient   Std Error   t       Pr > |t|
Intercept     1.16448      0.28804      4.04   <.0001
AGE          -0.00092      0.00125     -0.74   0.4602
BMI          -0.01205      0.00295     -4.08   <.0001
BLC           0.05055      0.02215      2.28   0.0239
PRSSY        -0.00041      0.00044     -0.95   0.3436
DIAST         0.00255      0.00103      2.47   0.0147
GLUM         -0.00046      0.00018     -2.50   0.0135
SKINF         0.00147      0.00183      0.81   0.4221
LCHOL         0.31109      0.10936      2.84   0.0051

The predictors of log(HDL) are age, body mass index, blood vitamin C,
systolic and diastolic blood pressures, skinfold thickness, and the log
of total cholesterol. The equation is:
Log(HDL) = 1.16 - 0.00092(Age) + ... + 0.311(LCHOL)
www.StatisticalPractice.com
30
HDL Example Coefficients
  • Interpretation of coefficients on previous slide
  • Need to use entire equation for making
    predictions.
  • Each coefficient measures the difference in
    expected LHDL between 2 subjects if the factor
    differs by 1 unit between the two subjects, and
    if all other factors are the same. E.g., expected
    LHDL is 0.012 lower in a subject whose BMI is 1
    unit greater, but is the same as the other
    subject on other factors.
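The BMI statement can be checked by evaluating the whole equation for two subjects. The coefficients below are copied from the output on the previous slide. The subject's measurements are hypothetical, and their units are assumed to match whatever was used to fit the model.

```python
# Coefficients from the regression output for Log(HDL).
coef = {
    "Intercept": 1.16448, "AGE": -0.00092, "BMI": -0.01205,
    "BLC": 0.05055, "PRSSY": -0.00041, "DIAST": 0.00255,
    "GLUM": -0.00046, "SKINF": 0.00147, "LCHOL": 0.31109,
}

def predicted_lhdl(subject):
    """Expected Log(HDL): the intercept plus every coefficient times its factor."""
    return coef["Intercept"] + sum(coef[name] * value for name, value in subject.items())

# Hypothetical subject; values are for illustration only.
subject_1 = {"AGE": 50, "BMI": 25, "BLC": 1.0, "PRSSY": 120,
             "DIAST": 80, "GLUM": 90, "SKINF": 20, "LCHOL": 2.3}
# Second subject: identical except that BMI is 1 unit greater.
subject_2 = dict(subject_1, BMI=subject_1["BMI"] + 1)

difference = predicted_lhdl(subject_2) - predicted_lhdl(subject_1)
print(f"difference in expected LHDL: {difference:.5f}")
```

The difference equals the BMI coefficient, about -0.012, whatever values are chosen for the other factors, as long as they are the same for both subjects.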

Continued
31
HDL Example Coefficients
  • Interpretation of coefficients two slides back
  • P-values measure the association of a factor with
    Log(HDL), if other factors do not change.
  • This is sometimes expressed as "after accounting
    for other factors" or "adjusting for other
    factors", and called its independent association.
  • SKINF probably is associated. Its p = 0.42 says
    that it has no additional info to predict Log(HDL),
    after accounting for other factors such as BMI.