1
Effect Size and Statistical Power Analysis in
Behavioral and Educational Research
  • Effect size 1 (P. Onghena): 09.00-10.30 a.m.
  • Effect size 2 (W. Van den Noortgate): 10.45-12.15 a.m.
  • Power 1 (I. Van Mechelen): 02.00-03.30 p.m.
  • Power 2 (P. Onghena): 03.45-04.30 (A-N) / 04.30-05.15 (O-Z)

2
SIGNIFICANCE TESTING CRISIS
  • Carver, R. P. (1993). The case against statistical
    significance testing.
  • Cohen, J. (1994). The earth is round (p < .05).
  • Falk, R., & Greenbaum, C. W. (1995). Significance tests die
    hard: The amazing persistence of a probabilistic
    misconception.
  • Hunter, J. E. (1997). Needed: A ban on the significance test.

3
CHILDHOOD TRAUMATA
  • Furious parental conflicts
  • Karl Pearson versus Ronald Fisher
  • Ronald Fisher versus Jerzy Neyman (and Egon Pearson); see Box
    (1978), Gigerenzer et al. (1990), Oakes (1986)
  • Morrison, D. R., & Henkel, R. E. (Eds.). (1970). The
    significance test controversy: A reader.

4
POSSIBILITY FOR GROWTH
  • APA Task Force on Statistical Inference
  • 1999 American Psychologist article (Wilkinson & the Task
    Force)
  • 2001 Publication Manual (5th ed.)
  • Editorial boards of flagship journals: Journal of Consulting &
    Clinical Psychology, Journal of Counseling and Development,
    Exceptional Children, Journal of Learning Disabilities, ...

5
GUIDELINES
  • Power and sample size. Provide information on
    sample size and the process that led to sample
    size decisions. Document the effect sizes,
    sampling and measurement assumptions, as well as
    analytic procedures used in power calculations.

6
Because power computations are most meaningful
when done before data are collected and examined,
it is important to show how effect-size estimates
have been derived from previous research and
theory in order to dispel suspicions that they
might have been taken from data used in the study
or, even worse, constructed to justify a
particular sample size. Once the study is
analyzed, confidence intervals replace calculated
power in describing results.
7
GUIDELINES
  • Hypothesis tests. It is hard to imagine a
    situation in which a dichotomous accept-reject
    decision is better than reporting an actual p
    value or, better still, a confidence interval.
    Never use the unfortunate expression "accept the
    null hypothesis." Always provide some effect-size
    estimate when reporting a p value.

8
GUIDELINES
  • Effect sizes. Always present effect sizes for
    primary outcomes. If the units of measurement are
    meaningful on a practical level (e.g., number of
    cigarettes smoked per day), then we usually
    prefer an unstandardized measure (regression
    coefficient or mean difference) to a standardized
    measure (r or d). It helps to add brief comments
    that place these effect sizes in a practical and
    theoretical context.

9
For a simple, general purpose display of the
practical meaning of an effect size, see
Rosenthal and Rubin (1982). Consult Rosenthal and
Rubin (1994) for information on the use of
counternull intervals for effect sizes, as
alternatives to confidence intervals.
10
GUIDELINES
  • Interval estimates. Interval estimates should be
    given for any effect sizes involving principal
    outcomes. Provide intervals for correlations and
    other coefficients of association or variation
    whenever possible.

11
EFFECT SIZE IMPORTANCE
  • For power analysis (Cohen, 1969)
  • For meta-analysis (Glass, 1976)
  • For descriptive statistics

12
EFFECT SIZE: WHAT THE HELL?
  • Cohen (1969): "By the above route, it can now readily be made
    clear that when the null hypothesis is false, it is false to
    some degree, i.e., the effect size (ES) is some specific
    nonzero value in the population." (p. 10)

13
EFFECT SIZE: WHAT THE HELL?
  • Use of the tables for significance testing
  • Cohen (1969): "Accordingly, we refine our ES index, d, so that
    its elements are sample results, rather than population
    parameters, and call it ds." (p. 64)

14
EFFECT SIZE: WHAT THE HELL?
15
EFFECT SIZE: WHAT THE HELL?
  • Glass (1976) uses ds in meta-analysis, but uses only the S of
    the control group in the denominator.
  • Hedges (1981), Hedges and Olkin (1985): ds is called g (with
    reference to Gene Glass)? Hedges's g!
  • Hedges (1981), Hedges and Olkin (1985): confusion! An
    approximately unbiased estimator is called... d!?

16
EFFECT SIZE SUMMARY: COMPARISON OF TWO MEANS
  • Cohen's d: population value (if you use the sample as your
    population, then use the sample size in the denominator)
  • Hedges's g: sample estimator (use the degrees of freedom in
    the denominator)
  • Hedges's unbiased estimator is rarely used outside
    meta-analytic contexts
  • point-biserial correlation coefficient (Rosenthal, 1991)

17
EFFECT SIZE EXAMPLE
        Experimental   Control
        7              4
        7              4
        6              3
        5              2
        5              2
Sum     30             15
Mean    6              3
S       1 (0.894)      1 (0.894)
S with the degrees of freedom in the denominator = 1; with the
sample size in the denominator = 0.894 (in parentheses).
18
EFFECT SIZE EXAMPLE
  • Cohen's d = (6 - 3) / 0.894 = 3.35
  • Hedges's g = (6 - 3) / 1 = 3
  • Point-biserial correlation coefficient: scores
    7 7 6 5 5 4 4 3 2 2 paired with group codes
    1 1 1 1 1 0 0 0 0 0 give r = .86
  • All kinds of transformations possible:
    t ↔ d ↔ g ↔ r
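The numbers on this slide can be reproduced with a short script. A minimal sketch in plain Python (the function and variable names are mine, not from the slides):

```python
import math

# Data from the slide
experimental = [7, 7, 6, 5, 5]   # mean 6
control = [4, 4, 3, 2, 2]        # mean 3

def mean(xs):
    return sum(xs) / len(xs)

def pooled_sd(x, y, use_df=True):
    """Pooled SD of two groups: divide the pooled sum of squares by
    n1 + n2 - 2 (Hedges's g denominator) or by n1 + n2 (Cohen's
    original d denominator)."""
    ss = sum((v - mean(x)) ** 2 for v in x) + sum((v - mean(y)) ** 2 for v in y)
    denom = len(x) + len(y) - (2 if use_df else 0)
    return math.sqrt(ss / denom)

diff = mean(experimental) - mean(control)           # 6 - 3 = 3
g = diff / pooled_sd(experimental, control, True)   # 3 / 1 = 3
d = diff / pooled_sd(experimental, control, False)  # 3 / 0.894 ≈ 3.35

# Point-biserial r: the Pearson correlation between the scores and
# 0/1 group-membership codes
scores = experimental + control
codes = [1] * len(experimental) + [0] * len(control)
n = len(scores)
ms, mc = mean(scores), mean(codes)
cov = sum((s - ms) * (c - mc) for s, c in zip(scores, codes)) / n
r = cov / (math.sqrt(sum((s - ms) ** 2 for s in scores) / n)
           * math.sqrt(sum((c - mc) ** 2 for c in codes) / n))
print(round(g, 2), round(d, 2), round(r, 2))  # 3.0 3.35 0.86
```

The only difference between d and g here is the denominator of the pooled SD, exactly as summarized on the previous slide.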

19
COUNTERNULL VALUE OF AN ES
  • Tackle the misconceptions
  • that failure to reject the null hypothesis ⇒ ES = 0
  • that finding a statistically significant p value implies an ES
    of important magnitude
  • The counternull value is the nonnull magnitude of ES that is
    supported by exactly the same amount of evidence as is the
    null value of the ES.
  • If the counternull value were taken as H0, then the resulting
    p value would be the same as the obtained p for the actual H0.

20
COUNTERNULL VALUE OF AN ES
  • For symmetric reference distributions:
    ES_counternull = 2 × ES_obtained - ES_null
  • For asymmetric reference distributions:
  • transform the ES so as to have a symmetric reference
    distribution
  • calculate the counternull on the symmetric scale
  • transform back to obtain the counternull on the original scale
  • Example of its use: RRR (2000)
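Both rules can be written down directly. A minimal sketch (function names are mine), using Fisher's z as the symmetrizing transform for r:

```python
import math

def counternull(es_obtained, es_null=0.0):
    """Counternull for a symmetric reference distribution:
    ES_counternull = 2 * ES_obtained - ES_null."""
    return 2 * es_obtained - es_null

def counternull_r(r_obtained, r_null=0.0):
    """Counternull for r: transform to Fisher's z (whose reference
    distribution is approximately symmetric), take the counternull
    there, and transform back to the r scale."""
    z = 2 * math.atanh(r_obtained) - math.atanh(r_null)
    return math.tanh(z)

d_star = counternull(0.25)   # 0.5: a nonsignificant d = .25 is
                             # supported as well as d = .50
r_star = counternull_r(0.5)  # ≈ .80, not 1.0: the r scale is asymmetric
```

The r example shows why the transform matters: doubling on the raw r scale would give an impossible value of 1.0.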

21
INTERPRETING EFFECT SIZES
  • Cohen's heuristic values
  • small: d = 0.20, the size of the difference between 15- and
    16-year-old girls
  • medium: d = 0.50, visible to the naked eye (14- and
    18-year-old girls)
  • large: d = 0.80, grossly perceptible (13- and 18-year-old
    girls)

22
INTERPRETING EFFECT SIZES
  • Comparison with other measures
  • small:  d = 0.20, r = .10,  r² = .01
  • medium: d = 0.50, r = .243, r² = .059
  • large:  d = 0.80, r = .371, r² = .138
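These d-to-r values can be checked with the standard conversion r = d / √(d² + a), where a = 4 for equal group sizes. A minimal sketch (function name is mine):

```python
import math

def d_to_r(d, n1=None, n2=None):
    """Convert Cohen's d to a point-biserial r. With equal group
    sizes the correction term a = 4; for unequal sizes use
    a = (n1 + n2)**2 / (n1 * n2)."""
    a = 4.0 if n1 is None else (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d ** 2 + a)

for d in (0.20, 0.50, 0.80):
    r = d_to_r(d)
    # reproduces the slide: r = .100/.243/.371, r² = .010/.059/.138
    print(f"d = {d:.2f}  r = {r:.3f}  r² = {r * r:.3f}")
```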

23
BINOMIAL EFFECT SIZE DISPLAY
r = .32                   Treatment outcome
Condition        Improved   Not improved   Totals
Psychotherapy    66         34             100
Control          34         66             100
Totals           100        100            200
24
BINOMIAL EFFECT SIZE DISPLAY
  • What is the effect on the success rate of the implementation
    of a certain treatment?
  • Psychotherapy success rate: .50 + r/2 = .66
  • Control success rate: .50 - r/2 = .34
  • Notice: .66 - .34 = .32 = r
  • standardized percentages in order for all margins to be equal

25
ASPIRIN'S EFFECT ON HEART ATTACK
Condition   Heart attack   No heart attack   Total
Aspirin     104            10933             11037
Placebo     189            10845             11034
Totals      293            21778             22071
26
ASPIRIN'S EFFECT ON HEART ATTACK: BESD
Condition   Heart attack   No heart attack   Total
Aspirin     48.3           51.7              100
Placebo     51.7           48.3              100
Totals      100            100               200
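This BESD can be reproduced from the raw counts on the previous slide by first computing phi, the Pearson r of a 2x2 table. A minimal sketch in plain Python (function name is mine):

```python
import math

def phi_from_counts(a, b, c, d):
    """Phi coefficient (Pearson r of a 2x2 table) from the cell
    counts: a, b = first row; c, d = second row."""
    num = a * d - b * c
    den = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return num / den

# Counts from the previous slide: rows = aspirin, placebo;
# columns = heart attack, no heart attack
r = phi_from_counts(104, 10933, 189, 10845)   # ≈ -0.034
aspirin_rate = 100 * (0.50 + r / 2)           # ≈ 48.3 (heart attacks)
placebo_rate = 100 * (0.50 - r / 2)           # ≈ 51.7
```

A phi of about .03 looks negligible, yet the BESD translates it into roughly a 3.4-percentage-point difference in heart-attack rates, which is the point of the next slide.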
27
SMALL EFFECTS MAY BE IMPRESSIVE
  • ...and vice versa (Prentice & Miller, 1992)
  • consider the amount of variation in the independent variable
  • consider the importance / the assumed stability of the
    dependent variable

28
WHAT EFFECT SIZE HAS PRACTICAL SIGNIFICANCE?
  • assess practical significance closely related to the
    particular problems, populations, and measures relevant to the
    treatment under investigation
  • Example: community mental health study, inpatient versus
    outpatient therapy
  • Example: effects of school characteristics on reading
    achievement, fifth-grade pupils versus sixth-grade pupils