Statistics for the Terrified - PowerPoint PPT Presentation

Provided by: ucdenverE9

Transcript and Presenter's Notes

1
Statistics for the Terrified

Paul F. Cook, PhD
Center for Nursing Research

2
What Good are Statistics?

How big? (how much?, how many?)
Descriptive statistics, including effect sizes
Describe a population based on a sample
Help you make predictions
How likely?
Inferential statistics
Tell you whether a finding is reliable, or
probably just due to chance (sampling error)

3
Answering the 2 Questions

Inferential statistics tell you how likely
Can't tell you how big
Can't tell you how important
Success is based on a double negative (rejecting the hypothesis of no effect)
Descriptive statistics tell you how big
Cohen's d
Pearson r (or other correlation coefficient)
Odds ratio

$d = \dfrac{\bar{x}_1 - \bar{x}_2}{SD_{pooled}}$
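
As a minimal sketch of the Cohen's d formula above, the function below divides the difference between two group means by a pooled standard deviation. The pooling rule (weighting each sample variance by its degrees of freedom) and the scores are assumptions for illustration; the slides do not say how SDpooled is computed.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group1, group2):
    """Standardized mean difference using a pooled sample SD."""
    n1, n2 = len(group1), len(group2)
    # Assumption: pool the sample variances, weighted by degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_var)

treatment = [6, 7, 8, 9, 10]  # hypothetical scores
control = [4, 5, 6, 7, 8]     # hypothetical scores
print(round(cohens_d(treatment, control), 3))
```

Here the means are 2 points apart and the pooled SD is about 1.58, so the groups differ by roughly 1.26 SDs.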
4
How Big is Big?

Correlations
0 = no relationship, 1 = upper limit
+ = positive effect, - = negative effect
.3 for small, .5 for medium, .7 for large
r² = percent of variability accounted for
Cohen's d
Means are how many SDs apart? 0 = no effect
.5 for small, .75 for medium, 1.0 for large
Odds Ratio
1 = no relationship, <1 = negative effect, >1 = positive effect
All effect size statistics are interchangeable!
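
To illustrate what "interchangeable" can mean in practice, here is a small sketch of commonly used conversion formulas between effect sizes. These formulas do not appear in the slides: the d-to-r conversion assumes two groups of equal size, and the odds-ratio conversion is the usual logistic approximation, so treat the results as approximate.

```python
from math import log, pi, sqrt

def d_to_r(d):
    """Convert Cohen's d to a correlation (assumes equal group sizes)."""
    return d / sqrt(d ** 2 + 4)

def r_to_d(r):
    """Convert a correlation back to Cohen's d (assumes equal group sizes)."""
    return 2 * r / sqrt(1 - r ** 2)

def odds_ratio_to_d(odds_ratio):
    """Approximate Cohen's d from an odds ratio (logistic approximation)."""
    return log(odds_ratio) * sqrt(3) / pi

print(round(d_to_r(1.0), 3))          # d of 1.0 expressed as r
print(round(r_to_d(d_to_r(1.0)), 3))  # round trip returns the original d
print(odds_ratio_to_d(1.0))           # an odds ratio of 1 means no effect
```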

5
How Likely is Likely? - Test Statistics

A ratio of signal vs. noise

$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
Signal: AKA between-groups variability or model
Noise: AKA within-groups variability or error
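
The signal-to-noise ratio above can be sketched directly: the numerator is the difference between group means, and the denominator combines each group's variance divided by its sample size. The scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def z_statistic(group1, group2):
    """Difference between means (signal) over its standard error (noise)."""
    signal = mean(group1) - mean(group2)
    noise = sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    return signal / noise

treatment = [6, 7, 8, 9, 10]  # hypothetical scores
control = [4, 5, 6, 7, 8]     # hypothetical scores
print(z_statistic(treatment, control))  # 2.0
```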
6
How do We Get the p-value?

Chebyshev's Theorem

[Figure: distribution with cutoffs marked at -1.96 SD and +1.96 SD]
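
One way to see where the 1.96 cutoffs come from is to compute a two-tailed p-value from z. The sketch below assumes the test statistic follows a standard normal distribution, which the slide does not state explicitly; under that assumption, a z of 1.96 in either direction corresponds to p of about .05.

```python
from math import erfc, sqrt

def two_tailed_p(z):
    """Two-tailed p-value for z, assuming a standard normal distribution."""
    # erfc(|z| / sqrt(2)) is the area in both tails beyond +/- |z|.
    return erfc(abs(z) / sqrt(2))

for z in (1.0, 1.96, 2.58):
    print(z, round(two_tailed_p(z), 3))
```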
7
Hypothesis Testing: 5 Steps

State null and alternative hypotheses
Calculate a test statistic
Find the corresponding p-value
Reject or fail to reject the null hypothesis
(your only 2 choices)
Draw substantive conclusions
Red = statistics, blue = logic, black = theory

8
How Are the Questions Related?

Large z = a large effect (d) and a low p
But z depends on sample size; d does not
Every test statistic is the product of an effect
size and the sample size
Example: $\phi^2 = \chi^2 / N$, so $\chi^2 = \phi^2 \times N$
A significant result (power) depends on
What alpha level (α) you choose
How large an effect (d) there is to find
What sample size (n) is available

9
What Type of Test?

N-level predictor (2 groups): t-test or z-test
N-level predictor (3+ groups): ANOVA (F-test)
I/R-level predictor: correlation/regression
N-level dependent variable: χ² or logistic regression
Correlation answers the "how big" question, but
can convert to a t-test value to also answer the
"how likely" question

10
The F test

ANOVA = analysis of variance
Compares variability between groups to
variability within groups
Signal vs. noise

$F = \dfrac{MS_b}{MS_w} = \dfrac{\text{avg. difference among means}}{\text{avg. variability within each group}}$
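
A minimal sketch of the F ratio for a one-way ANOVA, computed from scratch so the between-groups and within-groups pieces are visible. The three groups of scores are hypothetical.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F: mean square between groups over mean square within."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Signal: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Noise: how far each score sits from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # hypothetical scores
print(round(f_statistic(groups), 3))  # 13.0
```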
11
Omnibus and Post Hoc Tests

The F-test compares 3+ groups at once
Benefit: avoids capitalizing on chance
Drawback: can't see individual differences
Solution: post hoc tests
Bonferroni correction for 1-3 comparisons
(uses an adjusted alpha of .025 or .01)
Tukey test for 4+ comparisons
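
The adjusted alpha values can be sketched with the standard Bonferroni rule, which divides the overall alpha by the number of comparisons. The rule itself is standard; the starting alpha of .05 is an assumption, and note that three comparisons give about .017, which the slide appears to round down to .01.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha under the Bonferroni correction."""
    return alpha / n_comparisons

for m in (1, 2, 3):
    print(m, round(bonferroni_alpha(0.05, m), 4))
```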

12
F and Correlation (eta-squared)
The F-Table

Source    SS             df             MS           F            p
Between   SSb            dfb            SSb / dfb    MSb / MSw    .05
Within    SSw            dfw            SSw / dfw
Total     (SSb + SSw)    (dfb + dfw)

$\eta^2 = \dfrac{SS_b}{SS_{total}}$ = % of total variability that is due to the IV (i.e., R-squared)
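
As a rough sketch of the link between F and eta-squared, the functions below compute eta-squared two ways: from the sums of squares in the table, and from F with its degrees of freedom. The second form is an algebraic rearrangement that is not shown on the slide, and the numbers are hypothetical.

```python
def eta_squared(ss_between, ss_within):
    """Proportion of total variability that is due to the IV."""
    return ss_between / (ss_between + ss_within)

def eta_squared_from_f(f, df_between, df_within):
    """Same quantity recovered from F and its degrees of freedom."""
    return (f * df_between) / (f * df_between + df_within)

# Hypothetical table: SSb = 26, SSw = 6, dfb = 2, dfw = 6, so F = 13.
print(eta_squared(26, 6))            # 0.8125
print(eta_squared_from_f(13, 2, 6))  # 0.8125
```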
13
Correlation Seen on a Graph
Same Direction, Weak Correlation
Moderate Correlation
Same Direction, Strong Correlation
14
Regression and the F-test
15
Parametric Test Assumptions

Tests have restrictive assumptions
Normality
Independence
Homogeneity of variance
Linear relationship between IV and DV
If assumptions are violated, use a nonparametric
alternative test
Mann-Whitney U instead of t
Kruskal-Wallis H instead of F
Chi-square for categorical data

16
Chi-Square

The basic nonparametric test
Also used in logistic regression, SEM
Compares observed values (model) to observed
minus predicted values (error)
Signal vs. noise again
Easily converts to phi coefficient
$\phi = \sqrt{\chi^2 / N}$

$\chi^2 = \sum \dfrac{(F_o - F_e)^2}{F_e}$
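
The chi-square formula and its conversion to phi can be sketched for a contingency table as follows. Expected frequencies are computed the usual way, as row total times column total divided by N, which the slide does not spell out; the counts are hypothetical.

```python
from math import sqrt

def chi_square(observed):
    """Chi-square for a table of observed counts; returns (chi2, N)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_obs in enumerate(row):
            # Expected count if rows and columns were unrelated.
            f_exp = row_totals[i] * col_totals[j] / n
            chi2 += (f_obs - f_exp) ** 2 / f_exp
    return chi2, n

table = [[30, 10], [10, 30]]  # hypothetical 2-by-2 counts
chi2, n = chi_square(table)
phi = sqrt(chi2 / n)
print(chi2, phi)  # 20.0 0.5
```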
17
2-by-2 Contingency Tables
18
Dependent Observations

Independence is a major assumption of parametric
tests (but not nonparametrics)
Address non-independence by collapsing scores to
a single observation per participant
Change score = posttest score - pretest score
Can calculate SD (variability) of change scores
Determine if the average change is significantly
different from zero (i.e., no change)
$t = \dfrac{\text{average change} - 0}{SD_{change} / \sqrt{n}}$
Nonparametric version: Wilcoxon signed-rank test
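
The change-score approach can be sketched in a few lines: collapse each participant's two scores into one change score, then test whether the average change differs from zero. The pretest and posttest scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def change_score_t(pretest, posttest):
    """t statistic testing whether the average change differs from zero."""
    changes = [post - pre for pre, post in zip(pretest, posttest)]
    standard_error = stdev(changes) / sqrt(len(changes))
    return (mean(changes) - 0) / standard_error

pretest = [10, 12, 9, 14, 11]    # hypothetical scores, one per participant
posttest = [12, 15, 10, 17, 13]  # same participants, same order
print(round(change_score_t(pretest, posttest), 2))
```

The resulting t is evaluated with n - 1 degrees of freedom, where n is the number of participants rather than the number of scores.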

19
ANCOVA / Multiple Regression

Statistical control for confounding variables
= no competing explanations
Method: adds a covariate to the model
That variable's effects are partialed out
Remaining effect is independent of confound
One important application: ANCOVA
Test for post-test differences between groups
Control for pre-test differences
Multiple regression: same idea, I/R-level DV
Stepwise regression finds best predictors

20
Independent Variable 1
This circle represents all of the variability
that exists in the dependent variable
Unexplained variability remaining for the
dependent variable
Independent Variable 2
The unique variability is the part of the
variability in the DV that can be accounted for
only by this IV (and not by any other IV)
The shared variability is the part of the
variability in the DV that can be accounted for
by more than one IV.
When two IVs account for the same variability in
the DV (i.e., when there is shared variability),
they are multicollinear with each other.
What's left over (variability in the DV not
accounted for by any predictor) is considered
error: random (i.e., unexplained) variability
21
This graph can also be used to show the
percentage of the variability in the DV that can
be accounted for by each IV.
[Diagram: overlapping circles labeled IV1, IV2, and DV]
If the highlighted area is 30% of the DV,
then the Total R² is .30
Total R² for IV1 + IV2 together
The percentage of variability in the DV that can
be accounted for by an IV is the definition of
R², the coefficient of determination.
22
A related concept is the idea of a semipartial
R², which tells you what % of the variability in
the DV can be accounted for by each IV on its
own, not counting any shared variability.
Semipartial R² for IV1
[Diagram: overlapping circles labeled IV1, IV2, and DV]
If the highlighted area is 20% of the DV,
then the semipartial R² for IV1 is .20
The semipartial R² is the percentage of
variability in the DV that can be accounted for
by one individual predictor, independent of the
effects of all of the other predictors.
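
For the two-predictor case, the total, unique, and shared pieces can be sketched from three correlations using the standard formulas for regression with two IVs. The formulas and the correlation values are assumptions for illustration and do not come from the slides.

```python
def total_r_squared(r_y1, r_y2, r_12):
    """Total R-squared for a DV regressed on two IVs, from correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def semipartial_r_squared(r_y1, r_y2, r_12):
    """Unique share for IV1: total R-squared minus what IV2 explains alone."""
    return total_r_squared(r_y1, r_y2, r_12) - r_y2 ** 2

# Hypothetical correlations: DV with IV1, DV with IV2, and IV1 with IV2.
r_y1, r_y2, r_12 = 0.5, 0.4, 0.3
total = total_r_squared(r_y1, r_y2, r_12)
unique_1 = semipartial_r_squared(r_y1, r_y2, r_12)
unique_2 = semipartial_r_squared(r_y2, r_y1, r_12)  # swap roles for IV2
shared = total - unique_1 - unique_2
print(round(total, 3), round(unique_1, 3), round(unique_2, 3), round(shared, 3))
```

When the two IVs are uncorrelated (r_12 = 0), the shared piece disappears and each semipartial R² equals that IV's squared correlation with the DV.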
23
Lying with Statistics