Transcript and Presenter's Notes

Title: The Scholarship of Teaching and Learning (SOTL)


1
p Values and Effect Sizes: Making Sense of Research Findings
Scott Cottrell, Ed.D., West Virginia University School of Medicine
Carol Thrush, Ed.D., University of Arkansas for Medical Sciences
Britta Thompson, Ph.D., Baylor College of Medicine
2
p Values and Effect Sizes
  • Workshop Goals
  • Examine the calls for reporting effect sizes
  • Define p values and effect sizes
  • Identify limitations of p values
  • Identify types of effect sizes
  • Discuss how p values and effect sizes influence
    interpretations of findings
  • Offer practical tips for reporting and
    interpreting findings

3

4
Reporting Effect Size
  • APA Task Force on Statistical Inference
    recommends reporting the direction, size, and
    confidence interval of the effect
  • 20 journals require effect size reporting

5
AERA Standards (2006)
  • For each critical statistical result there should be:
  • An effect size, such as a treatment effect, a regression coefficient, or an odds ratio
  • An indication of the uncertainty of the effect (standard error or CI)
  • An interpretation of meaningfulness, e.g., "the estimated effect is large enough to be educationally important, but these data do not rule out the possibility that the true effect is actually quite small."

6
Three Types of Significance
  • Statistical significance (p value)
  - A measure of probability (how likely the result is)
  - Evaluates whether group means or SDs differ
  • Practical significance (effect size)
  - A measure of the magnitude of a difference (how large)
  • Clinical significance
  - A measure of the value of the intervention to individuals

Thompson, 2002
7
A Matter of Estimation
We calculate statistics (e.g., t, r) from samples, and we ask: how rare is
the statistic (the p value)? If we had data for the entire population, there
would be no need to calculate an inferential statistic at all.
8
Limitation of Statistical Significance Tests
  • So, statistical testing answers the question:
  • "How rare (0 to 1.00) is a calculated statistical result for a given sample size?"
  • A p value is NOT the probability that a result is important or practically significant.

9
Effect Size: it's not about how likely, but about how big
Among lions, gender has a LARGE effect size;
among tigers, gender has a small effect size.
10
Different Effect Sizes
  • There are over 40 different effect size indices (Kirk, 1996)
  - A chi-square analysis may use an odds ratio
  - A correlation may use r² (both are sketched below)
  • There are many frameworks for distinguishing effect sizes (Thompson, 2007)
  • Authors must explicitly say which effect size they are reporting!
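
A rough sketch of two of these indices (all numbers hypothetical): an odds ratio computed from a 2×2 table, and r² from a correlation coefficient.

```python
# Hypothetical 2x2 table: rows = intervention/control, columns = pass/fail.
a, b = 18, 2   # intervention: 18 passed, 2 failed
c, d = 12, 8   # control: 12 passed, 8 failed

# Odds ratio: odds of passing in the intervention group vs. the control group.
odds_ratio = (a * d) / (b * c)
print(f"odds ratio = {odds_ratio:.1f}")   # 6.0

# r-squared: the proportion of variance shared by two correlated variables.
r = 0.30    # hypothetical correlation
print(f"r^2 = {r ** 2:.2f}")              # 0.09
```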

11
Common Effect Size Measures
  • Standardized differences
  - Cohen's d
  - Glass's delta (Δ)
  • Measures of association
  - Eta squared (η²)
  - Coefficient of determination (r²)

12
Calculation of Effect Sizes
  • Glass's delta (Δ)
  - Independent samples: (Mexp − Mcontrol) / SDcontrol
  - Dependent samples: (Mpost − Mpre) / SDpre
  • Cohen's d
  - Independent samples: (Mexp − Mcontrol) / SDpooled

Hojat & Xu, 2004
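
A minimal Python sketch of these formulas; the equal-group-size form of the pooled SD is assumed for Cohen's d.

```python
import math

def glass_delta_independent(m_exp, m_control, sd_control):
    """Glass's delta, independent samples: (Mexp - Mcontrol) / SDcontrol."""
    return (m_exp - m_control) / sd_control

def glass_delta_dependent(m_post, m_pre, sd_pre):
    """Glass's delta, dependent (pre-post) samples: (Mpost - Mpre) / SDpre."""
    return (m_post - m_pre) / sd_pre

def cohens_d(m_exp, m_control, sd_exp, sd_control):
    """Cohen's d, independent samples: (Mexp - Mcontrol) / SDpooled."""
    sd_pooled = math.sqrt((sd_exp ** 2 + sd_control ** 2) / 2)  # equal-n form
    return (m_exp - m_control) / sd_pooled
```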
13
Interpretation of Effect Sizes
  • Small: d = .20, r = .10
  • Medium: d = .50, r = .30
  • Large: d = .80, r = .50

Hojat & Xu, 2004; Kline, 2004
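
These benchmarks can be encoded directly; a small sketch (treating values below .20 as negligible is our assumption, not part of the deck):

```python
def label_d(d):
    """Label |d| using the conventional .20 / .50 / .80 benchmarks."""
    d = abs(d)
    if d >= 0.80:
        return "large"
    if d >= 0.50:
        return "medium"
    if d >= 0.20:
        return "small"
    return "negligible"

print(label_d(0.083))   # 'negligible'
print(label_d(1.47))    # 'large'
```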
14
Examples
15
Two Independent Samples
  • Hypothesis
  - A new learning method will help students improve their understanding of statistics.

16
Research Design
  • Two independent groups
  - Intervention group (innovative method)
  - Control group (lecture only)
  • Measure
  - Quiz students before and after the 2-week course
  - Calculate the difference in quiz scores
  • Sample
  - Intervention group: n = 5
  - Control group: n = 5

17
Results
  • Intervention group: M = 1.14, SD = .10, 95% CI 1.09–1.19
  • Control group: M = 1.13, SD = .13, 95% CI 1.09–1.18
18
Statistical Analysis
  • Set alpha (by convention) at .05
  - Alpha is the cut-off p value that is set before the study is conducted
  • Run an independent samples t-test
  • Result: t = 1.40, p = .09
  • Conclusion
  - Not statistically significant (.09 > .05)
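
For reference, the same test can be run straight from the summary statistics with SciPy. A minimal sketch: the one-tailed halving is our assumption (the deck's 1.86 cutoff matches a one-tailed test with df = 8), and the output will not necessarily reproduce the slide's t = 1.40 exactly.

```python
from scipy import stats

# Independent samples t-test from the slide's summary statistics.
t, p_two_tailed = stats.ttest_ind_from_stats(
    mean1=1.14, std1=0.10, nobs1=5,   # intervention group
    mean2=1.13, std2=0.13, nobs2=5,   # control group
)
p_one_tailed = p_two_tailed / 2       # assumed one-tailed test
print(f"t = {t:.2f}, one-tailed p = {p_one_tailed:.3f}")
```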

19
t Distribution
[figure: t distribution with the one-tailed critical region beyond t = 1.86 shaded]
(I need a t statistic > 1.86, which is rare)
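
The 1.86 cutoff is the one-tailed critical value of t for df = 8 at alpha = .05; a quick check:

```python
from scipy import stats

df = 5 + 5 - 2                       # two groups of 5
t_crit = stats.t.ppf(0.95, df)       # one-tailed, alpha = .05
print(f"critical t (df = {df}) = {t_crit:.2f}")   # ~1.86
```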
20
Implications of My Findings?
  • A p value of .09 is "not good."
  • Does that mean my findings are not important?

21
I want to use my Innovative Method!!
  • Through a power calculation, I determine that I will need a total sample size of 42
  • So, I increase my groups from 5 to 21
  • What is my new result with the larger sample size?
  - t = 2.30 (greater than 1.86)
  - p = .001
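
A minimal sketch of such a sample-size (power) calculation with statsmodels. The target effect size below is hypothetical, since the deck does not state the one it used, so the result only lands near the deck's 42.

```python
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.8,       # hypothetical target (a "large" Cohen's d)
    alpha=0.05,
    power=0.80,
    alternative="larger",  # one-tailed, matching the deck's test
)
print(f"n per group = {n_per_group:.1f}")   # ~20; the deck reports 21 per group
```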

22
What do my results mean?
  • Are my results important or significant?
  • A misconception: a p value determines whether a result is important and replicable!
  • The truth: only effect sizes can help us determine whether a result has practical significance.

23
Effect Sizes: My Example
  • An observed effect can be statistically significant yet have little practical value.
  • For example, look at my mean scores:
  - Mean quiz difference for intervention = 1.14
  - Mean quiz difference for control = 1.13

24
Effect Sizes: Independent Samples Example
  • Cohen's d = (Mexp − Mcontrol) / SDpooled
  • = (1.14 − 1.13) / .12 = .083
  • Result: a small effect
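
A quick check of the arithmetic, including where the .12 comes from (the equal-n pooled SD, which the deck rounds):

```python
import math

sd_pooled = math.sqrt((0.10 ** 2 + 0.13 ** 2) / 2)   # ~0.116, rounded to .12
d = (1.14 - 1.13) / sd_pooled                        # ~0.09 -- a small effect
print(f"SD_pooled = {sd_pooled:.3f}, d = {d:.3f}")
```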

25
Dependent Sample
  • Hypothesis
  - A new palliative care curriculum will improve students' palliative care attitudes.

26
Research Design
  • Dependent group (pre-post)
  • Measure
  - Attitudinal questionnaire administered before and 6 months after the intervention
  • Sample
  - Matched pre-post data for 300 students

27
Results (Scale: 1–7)
  • Pre: M = 4.00, SD = 1.02, 95% CI 3.41–4.06
  • Post: M = 5.50, SD = .56, 95% CI 5.44–5.55
28
Statistical Analysis
  • Set alpha (by convention) at .05
  • Run a dependent (paired) samples t-test
  • Result: t = 4.30, p < .001
  • Conclusion
  - Statistically significant (p < .001 < .05)
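
A minimal sketch of the paired test in SciPy. The individual responses are not in the deck, so the arrays below are simulated to roughly match the reported means and SDs; the printed t will not reproduce the slide's 4.30 exactly.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated pre/post scores for 300 matched students (hypothetical data).
pre = rng.normal(4.00, 1.02, size=300)
post = rng.normal(5.50, 0.56, size=300)

t, p = stats.ttest_rel(post, pre)   # dependent (paired) samples t-test
print(f"t = {t:.2f}, p = {p:.3g}")
```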

29
Implications of My Findings?
  • A p value of < .001 is "good."
  • Does that mean my findings (a difference of 1.5 points) are important?

30
Effect Sizes: Dependent Sample Example
  • Glass's delta = (Mpost − Mpre) / SDpre
  • = (5.50 − 4.00) / 1.02 = 1.47
  • Is this a large effect?
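
The arithmetic, checked against the benchmarks from slide 13:

```python
delta = (5.50 - 4.00) / 1.02   # (Mpost - Mpre) / SDpre
print(f"Glass's delta = {delta:.2f}")   # 1.47, well above the .80 "large" benchmark
```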

31
Take Home Pearls
32
Effect Size is Key in Meta-Analysis
p values can't help answer an important question: is the magnitude of the
effect stable across studies? Are the results replicable? Effect sizes can
tell us whether an innovative teaching method has a stable, practical effect
across studies.
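
This pooling is what a fixed-effect meta-analysis does: weight each study's effect size by the inverse of its sampling variance. A minimal sketch with hypothetical study values:

```python
# Effect sizes (d) and sampling variances from three hypothetical studies.
ds = [0.45, 0.30, 0.55]
variances = [0.04, 0.02, 0.05]

weights = [1 / v for v in variances]
d_pooled = sum(w * d for w, d in zip(weights, ds)) / sum(weights)
print(f"pooled d = {d_pooled:.2f}")   # ~0.39
```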
33
Differences Between Students in Problem-Based and Lecture-Based Curricula
Measured by Clerkship Performance Ratings at the Beginning of the Third Year
Whitfield, Mauger, Zwicker, & Lehman, 14(4), 2002, 211–217
  • RESULTS: Mean scores differed significantly in some clerkships, but the
    effect size was small. The effect sizes for fund of knowledge ranged from
    0.20 to 0.41; for clinical problem-solving skills, they ranged from 0.26
    to 0.39. These differences between the problem-based and lecture-based
    students were of the same magnitude as the difference at the start of
    medical school on the MCAT, namely d = 0.31.

34
  • A result that is rare may not be practically significant
  • Rare (p < .05) ≠ practical significance
  • "Yes, by itself, statistical significance means very little. It merely means that the results are rare." (Carver, 1978)

35
Pearls (cont.)
  • Avoid saying "almost significant" at p = .06, or "extremely significant" at p = .0001 or .000001
  • Avoid reporting p as "p = .000"
  • "Surely, God loves the .06 nearly as much as the .05." (Rosnow & Rosenthal)
36
Pearls (cont.)
  • Common misconception
  - "If your results are not statistically significant, then your results are not important."
  • As Thompson has noted:
  - Another experiment with a larger sample may find a statistically significant difference.
  - It's a matter of probability for a given sample at a given time.

37
Thank You!
  • Remember: statistics should never replace good judgment.

38
References
  • American Educational Research Association (AERA). (2006). Standards for
    reporting on empirical social science research in AERA publications.
    Educational Researcher, 35(6), 33–40.
  • Carver, R. (1978). The case against statistical significance testing.
    Harvard Educational Review, 48(3), 378–399.
  • Colliver, J.A. (2002). Call for greater emphasis on effect-size measures
    in published articles in Teaching and Learning in Medicine. Teaching and
    Learning in Medicine, 14(4), 206–210.
  • Hojat, M., & Xu, G. (2004). A visitor's guide to effect sizes: Statistical
    significance versus practical (clinical) importance of research findings.
    Advances in Health Sciences Education, 9, 241–249.
  • Kline, R.B. (2004). Beyond significance testing: Reforming data analysis
    methods in behavioral research. Washington, DC: American Psychological
    Association.
  • Thompson, B. (2006). Foundations of behavioral statistics: An insight-based
    approach. New York: Guilford.
  • Thompson, B. (2002). "Statistical," "practical," and "clinical": How many
    kinds of significance do counselors need to consider? Journal of Counseling
    & Development, 80, 64–71.
  • Wilkinson, L., & the Task Force on Statistical Inference. (1999).
    Statistical methods in psychology journals: Guidelines and explanations.
    American Psychologist, 54, 594–604.