When You Get Hit By a Car

1
When You Get Hit By a Car
  • Constance M. Elson
  • MGH Biostatistics Center

2
  • Data were collected for an NIGMS-funded multi-site
    study, "Inflammation and the Host Response to
    Injury" (U54GM62119).
  • 160 trauma patients at 7 hospitals. Study design
    ensures that most of the injuries result from
    motor vehicle accidents or falls.
  • This talk will demonstrate the use of several SAS
    statistical procedures to explore the clinical
    data for these patients.

3
Clinical Variables For This Talk
  • (Categorical/Ordinal Variables underlined)
  • Demographic: age, agecat (under 30, under 45,
    old), male, smoker
  • Injury and Initial Response: injury type (e.g.
    fall, MVC-occupant, motorcycle, MVC-pedestrian),
    delay, admit on vent, APACHE score, blood
    pressure in ER, pre-hospital heart rate, worst
    base deficit 0-12 hours, transfused blood 0-12
    hours
  • Outcome: hospital length of stay, complications,
    multiple organ failure (MOF), death

4
PROC CORR gives a quick overview of relationships
among data fields
  • proc corr data=base outp=corr1;
  • var age male smoker agecat delay
  • injtype adonvent apache bp heartrate
  • wbd12 blood12 hosplos complics MMOF
  • diedb;
  • run;

5
PROC CORR Output
  • Pearson Correlation Coefficients, N = 158
  • Prob > |r| under H0: Rho=0

               age        male       smoker     etc. . . .
    age        1.00       0.00023    -0.05896   <- correl coeff
                          0.9977     0.4618     <- significance
    male       0.00023    1.00       -0.03424
               0.9977                0.6693
    etc. ...
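
The two rows printed for each variable pair (coefficient, then significance) can be reproduced by hand. The sketch below is a minimal illustration, not the PROC CORR implementation: the toy data are hypothetical, and the p-value uses a normal approximation, whereas SAS uses the t distribution with N - 2 degrees of freedom.

```python
import math
from statistics import NormalDist


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: Rho=0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical toy data (not from the study): a perfect linear relationship.
r_toy = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# Values from the slide: r(age, smoker) = -0.05896 with N = 158.
t_age_smoker = t_statistic(-0.05896, 158)

# Normal approximation to the two-sided p-value (an assumption made here
# to stay within the standard library; SAS reports the exact t-based value).
p_approx = 2 * (1 - NormalDist().cdf(abs(t_age_smoker)))
```

With 156 degrees of freedom the approximation lands close to the 0.4618 printed on the slide.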

6
Visual explanation of correlation
7
Correlations for Demographic Variables

8
Correlations for Injury Variables

9
Correlations for Outcome Variables
10
Data Type Suggests Statistical Method
11
Regression Methods
  • Hypothesis: There is a significant linear
    relationship between hospital length of stay and
    APACHE injury score.
  • proc glm data=base;
  • model hosplos = apache;
  • ods output modelANOVA=tmp2;
  • ods output parameterEstimates=tmp3;
  • run; quit;

12
PROC GLM Regression Output
  • Dependent Variable: hosplos

    Source            DF    Sum of Squares   Mean Square    F Value   Pr > F
    Model               1   2294.91662       2294.91662     7.11      0.0085
    Error             156   50383.46945      322.97096
    Corrected Total   157   52678.38608

    Source   DF   Type III SS    Mean Square    F Value   Pr > F
    apache    1   2294.916622    2294.916622    7.11      0.0085

    Parameter   Estimate      Standard Error   t Value   Pr > |t|
    Intercept   7.493583706   6.65862948       1.13      0.2622
    apache      0.636092377   0.23862639       2.67      0.0085
  • Conclusion: Length of hospital stay is linearly
    related to the APACHE score (p = 0.0085). Each
    1-unit increase in score is associated with about
    0.6 extra days in hospital, on average.

13
Linear Regression for Hospital Length of Stay
14
Type I vs Type III Sums of Squares
  • proc glm data=base;
  • model hosplos = blood12 apache etc. ... ;
  • (note: more than 1 covariate)
  • OUTPUT:

    Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model     2   2879.01          1439.50680    4.48      0.0128

    Source    DF   Type I SS   Mean Square   F Value   Pr > F
    blood12    1   1443.33     1443.336      4.49      0.0356
    apache     1   1435.677    1435.677      4.47      0.0361

    Source    DF   Type III SS    Mean Square    F Value   Pr > F
    blood12    1   584.096987     584.096987     1.82      0.1795
    apache     1   1435.677181    1435.677181    4.47      0.0361

    Parameter   Estimate   Standard Error   t Value   Pr > |t|
    Intercept   7.94911    6.6498           1.20      0.2338
    blood12     0.00099    0.0007           1.35      0.1795
    apache      0.5299     0.2506           2.11      0.0361
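
Type I sums of squares are sequential: each variable is credited with what it adds given the variables listed before it, so they add up to the model sum of squares. Type III sums of squares are partial: each variable is adjusted for all the others, so they generally do not add up when covariates are correlated. The sketch below checks this against the slide's figures; the variable names in the code are illustrative only.

```python
# Values copied from the slide (model: hosplos = blood12 apache).
ss_model = 2879.01
type1 = {"blood12": 1443.336, "apache": 1435.677}
type3 = {"blood12": 584.096987, "apache": 1435.677181}

# Sequential (Type I) SS partition the model SS.
type1_total = sum(type1.values())

# Partial (Type III) SS fall short of the model SS here.
type3_total = sum(type3.values())

# Drop in the SS credited to blood12 once it is adjusted for apache.
blood12_drop = type1["blood12"] - type3["blood12"]
```

For apache, the last variable in the model statement, the two types agree because it is already adjusted for blood12 in both. For blood12 the drop is large enough to move its p-value from 0.0356 to 0.1795.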

15
Selection Methods for Linear Regression
  • proc reg data=base;
  • model hosplos = age male delay adonvent wbd12
    blood12
  • / selection=forward slentry=.1;
  • run;
  • OUTPUT:
  • Forward Selection: Step 2. Variable blood12 Entered.

    Variable    Parameter Estimate   Standard Error   Type II SS   F Value   Pr > F
    Intercept   23.91                2.58             27651        85.51     <.0001
    smoker      -5.99                2.93             1349.87      4.17      0.0427
    blood12     0.00137              0.000702         1236.99      3.83      0.0523

  • No other variable met the 0.1000
    significance level for entry into the model.
  • Summary of Forward Selection

    Step   Variable Entered   Partial R-Square   Model R-Square   C(p)   F Value   Pr > F

16
ANOVA Methods
  • Question: Does pre-hospital heart rate vary
    significantly by age category?
  • proc glm data=base;
  • /* class statement makes this an ANOVA model */
  • class agecat;
  • model heartrate = agecat;
  • run;
  • proc anova data=base;
  • class agecat;
  • model heartrate = agecat;
  • run;

17
ANOVA Output
  • Class    Levels   Values
    agecat   3        1 2 3
  • Dependent Variable: heartrate

    Source   DF    SS         Mean Square   F Value   Pr > F
    Model      2   5106.91    2553.45       5.10      0.0073
    Error    143   71649.33   501.04
    Total    145   76756.24

    Source   DF   Type III SS   Mean Square   F Value   Pr > F
    agecat    2   5106.91       2553.45       5.10      0.0073

  • Conclusion: At least one of the age categories
    had a significantly different pre-hospital heart
    rate.
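
The F value in this table compares variation between the age categories with variation within them. The sketch below computes a one-way ANOVA from scratch on hypothetical toy data (not the study data), then recomputes the slide's F value from its sums of squares. The function name is illustrative and this is not how PROC GLM or PROC ANOVA is implemented.

```python
def one_way_anova(groups):
    """Return (ss_between, ss_within, f_value) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Between-group SS: squared distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

    # Within-group SS: squared distance of each value from its own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_value


# Hypothetical toy data: three groups of three observations.
ss_b, ss_w, f_toy = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

# Slide values: Model SS 5106.91 on 2 df, Error SS 71649.33 on 143 df.
f_slide = (5106.91 / 2) / (71649.33 / 143)
```

A large F says only that the group means are not all equal. It does not say which age category differs.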

18
ANOVA PLOT
19
ANOVA vs LINEAR REGRESSION
  • Omitting the class statement in proc glm gives a
    regression model with the same form of output as
    the ANOVA model, but with estimates for the model
    coefficients. Without the class statement, agecat
    is treated as a numeric score (1, 2, 3).

    Parameter   Estimate   Standard Error   t Value   Pr > |t|
    Intercept   133.53     4.64             28.77     <.0001
    agecat      -7.66      2.391            -3.20     0.0017

  • However, if a regression model is your intention,
    using the continuous variable age instead of
    the ordinal variable agecat may give slightly
    better results:

    Parameter   Estimate   Standard Error   t Value   Pr > |t|
    Intercept   138.018    5.871            23.51     <.0001
    age         -0.534     0.164            -3.25     0.0014
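
To see what the two sets of coefficients mean in practice, the sketch below turns each fitted line into predicted pre-hospital heart rates. The function names are hypothetical, the coefficients are copied from the slide, and agecat is assumed to be coded 1, 2, 3 as in the ANOVA class-level listing.

```python
def heart_rate_from_agecat(agecat):
    """Fitted line using the ordinal variable agecat (coded 1, 2, 3)."""
    return 133.53 - 7.66 * agecat


def heart_rate_from_age(age):
    """Fitted line using the continuous variable age (in years)."""
    return 138.018 - 0.534 * age


# Predicted heart rate for each age category.
by_category = {k: round(heart_rate_from_agecat(k), 2) for k in (1, 2, 3)}

# Predicted heart rate for a 30-year-old.
at_age_30 = round(heart_rate_from_age(30), 1)
```

Each step up in age category lowers the predicted heart rate by 7.66, while each additional year of age lowers it by 0.534.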

20
Linear Regression Ordinal vs Continuous
21
Contingency Table Methods
  • Question: Are the types of injury the same for
    different age groups?
  • proc freq data=base;
  • tables agecat*injtype / nopercent norow;
  • where injtype in (1,3,4,6);
  • exact fisher;
  • output out=temp exact;
  • run;
  • Fisher exact statistics are useful for sparse
    contingency tables. They are computed by
    enumerating all possible tables with the same
    row and column totals (urn models of the data,
    using balls of appropriate types). This takes a
    LOOOOOONG time to do for 3 age categories and 7
    injury types, so we restricted the table to 4
    injury types. Alternatively, we could just
    replace exact fisher with exact chisq.
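
The urn-model idea is easiest to see on a 2x2 table, where only one cell is free once the margins are fixed. The sketch below is a simplified illustration on hypothetical counts, not the algorithm PROC FREQ uses for the 3x4 table in this talk. It enumerates every table with the observed margins and adds up the probabilities of those no more likely than the observed one.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)

    # Sum over all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in map(table_prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (not from the study).
p_toy = fisher_exact_2x2(3, 1, 1, 3)
```

With larger tables the number of possible tables grows very quickly, which is why the exact test becomes slow.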

22
PROC FREQ Output
  • Injtype: 1 = fall, 3 = MVC Occupant, 4 = Motorcycle,
    6 = MVC Pedestrian

23
PROC FREQ Output Statistics
  • Statistics for Table of AGECAT by INJTYPE

    Statistic                     DF   Value     Prob
    Chi-Square                     6   10.5110   0.1047
    Likelihood Ratio Chi-Square    6   12.2874   0.0559
    Mantel-Haenszel Chi-Square     1   5.4840    0.0192
    Phi Coefficient                    0.2683

  • WARNING: 33% of the cells have expected
    counts less than 5. Chi-Square may not be a
    valid test.
  • Fisher's Exact Test

    Table Probability (P)   1.366E-06
    Pr <= P                 0.0950

  • Conclusion: There is only weak evidence that the
    distribution of injury type differs across age
    groups (Fisher exact p = 0.0950, not significant
    at the 0.05 level).

24
Logistic Regression Methods
  • Question: Can we predict the probability of
    death based on 0-12 hour worst base deficit?
  • Logistic regression models are more powerful than
    linear models when the dependent variable is
    binary or takes only a few values.
  • proc logistic data=epi descending;
  • model diedb = wbd12;
  • run;
  • OUTPUT: Response Profile

    Ordered Value   diedb   Total Frequency
    1               1       170
    2               0       693

  • Probability modeled is diedb=1.
  • Model Convergence Status: Convergence criterion
    (GCONV=1E-8) satisfied.
  • Model Fit Statistics . . .

25
PROC LOGISTIC Output, contd
  • Testing Global Null Hypothesis: BETA=0

    Test               Chi-Square   DF   Pr > ChiSq
    Likelihood Ratio   141.591       1   <.0001
    Score              148.60        1   <.0001
    Wald               110.334       1   <.0001

  • Analysis of Maximum Likelihood Estimates

    Parameter   DF   Estimate   Standard Error   Wald Chi-Square   Pr > ChiSq
    Intercept    1   -3.6282    0.2463           217.08            <.0001
    wbd12        1   0.2281     0.0217           110.33            <.0001

  • Odds Ratio Estimates

    Effect   Point Estimate   95% Wald Confidence Limits
    wbd12    1.256            1.204   1.311

  • Conclusion: There is a significant relationship
    between probability of death and 0-12 hour worst
    base deficit. For every unit increase in the
    worst base deficit, the odds of death increase by
    25.6%.
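
The odds ratio and its confidence limits follow directly from the wbd12 estimate and its standard error. The sketch below recomputes them from the slide's rounded figures, assuming the usual 1.96 multiplier for a 95% Wald interval.

```python
import math

# Values copied from the Analysis of Maximum Likelihood Estimates table.
estimate, std_error = 0.2281, 0.0217

# Multiplier for a 95% Wald interval (assumed; the usual normal quantile).
z = 1.96

# The odds ratio for a 1-unit increase is exp(coefficient).
odds_ratio = math.exp(estimate)

# Wald limits: exponentiate the limits of the coefficient's interval.
ci_lower = math.exp(estimate - z * std_error)
ci_upper = math.exp(estimate + z * std_error)

# Percent increase in the odds of death per unit of worst base deficit.
percent_increase = (odds_ratio - 1) * 100
```

Note that the 25.6% applies to the odds of death, not to the probability of death.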

26
Mortality by WBD12, with Lin Reg Line
27
Probability of Death by WBD Group
28
Linear Regression for Probability of Death by WBD
Group
29
Logistic Regression for Probability of Death by
WBD Group
30
How Logistic Regression Works
  • Assume covariate (worst base deficit) = x.
  • Let p(x) = probability of event (death) when
    covariate = x.
  • Odds of event are p(x) / (1 - p(x)).
  • Log-odds of event: L(x) = ln[ p(x) / (1 - p(x)) ].
    Also called logit( p(x) ).
  • Logistic regression model says that the log-odds
    of event are linear: L(x) = a + bx.

31
How Logistic Regression Works, contd
  • PROC LOGISTIC estimates the coefficients a and b
    for the linear model for L(x).
  • Example: event = death, x = wbd12.
  • Proc Logistic gave the estimate L(x) = -3.6 +
    .228 x.
  • High school algebra (challenge!) shows that this
    is equivalent to
    p(x) = exp(-3.6 + .228 x) / (1 + exp(-3.6 + .228 x)).
  • Graph of p(x)
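
The algebra step is: exponentiate both sides of L(x) = ln[ p / (1 - p) ] to get p / (1 - p) = exp(L(x)), then solve for p to get p = exp(L(x)) / (1 + exp(L(x))), which is the same as 1 / (1 + exp(-L(x))). The sketch below evaluates p(x) with the unrounded estimates from the PROC LOGISTIC output; the function names are illustrative.

```python
import math

# Estimates from the PROC LOGISTIC output (intercept a, slope b for wbd12).
A, B = -3.6282, 0.2281


def log_odds(x):
    """Linear model for the log-odds of death: L(x) = a + b x."""
    return A + B * x


def prob_death(x):
    """Inverse logit: p(x) = 1 / (1 + exp(-L(x)))."""
    return 1 / (1 + math.exp(-log_odds(x)))


def logit(p):
    """Log-odds of a probability p."""
    return math.log(p / (1 - p))


# Worst base deficit at which the modeled probability of death is 50%,
# i.e. where L(x) = 0.
x_half = -A / B

# Probabilities over a range of worst base deficit values, for plotting.
curve = [(x, prob_death(x)) for x in range(0, 31, 5)]
```

Plotting `curve` gives the S-shaped graph of p(x), which stays between 0 and 1, unlike a straight regression line.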