Development of a Confidence Interval for Small Sample Expert Review of Item Content Validation


1
Development of a Confidence Interval for Small
Sample Expert Review of Item Content Validation
  • Jeffrey M. Miller & Randall D. Penfield
  • FERA November 19, 2003
  • University of Florida
  • millerjm_at_ufl.edu penfield_at_coe.ufl.edu

2
INTRODUCING CONTENT VALIDITY
  • Validity refers to the degree to which evidence
    and theory support the interpretations of test
    scores entailed by proposed uses of tests
    (AERA/APA/NCME, 1999)
  • Content validity refers to the degree to which
    the content of the items reflects the content
    domain of interest (APA, 1954)

3
THE NEED FOR IMPROVED REPORTING
Content is a precursor to drawing a score-based
inference. It is evidence-in-waiting (Shepard,
1993; Yalow & Popham, 1983).
Unfortunately, in many technical manuals,
content representation is dealt with in a
paragraph, indicating that selected panels of
subject matter experts (SMEs) reviewed the test
content, or mapped the items to the content
standards and all is well (Crocker, 2003).

4
QUANTIFYING CONTENT VALIDITY
  • Several indices for quantifying expert agreement
    have been proposed
  • For many, experts quantify the match of the item
    to an objective using a rating scale
  • The mean rating across raters is often used in
    calculations
  • Klein & Kosecoff's Correlation (1975)
  • Aiken's V (1985)
  • The mean, by itself, does not account for error
    and does not tell us how far it lies from the
    population mean. WE NEED A CONFIDENCE INTERVAL!

5
THE CONFIDENCE INTERVAL
  • The traditional confidence interval assumes a
    normal distribution for the sample mean of a
    rating scale.
  • However, the assumption of population normality
    cannot be justified when analyzing the mean of
    an individual scale item because:
  • 1.) the outcomes of the items are discrete;
  • 2.) the items are bounded by the limits of the
    Likert scale;
  • 3.) the sample of raters is too small, even if
    the above were not problematic.
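The boundedness problem can be illustrated with a minimal Python sketch. The ratings below are hypothetical (five experts on a 0 to 4 scale), and `normal_theory_ci` is an illustrative helper name, not something taken from the presentation. It applies the usual mean plus or minus z standard errors and shows the upper limit exceeding the highest possible rating.

```python
import math
import statistics


def normal_theory_ci(ratings, z=1.96):
    """Traditional interval: mean +/- z * s / sqrt(n)."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    standard_error = statistics.stdev(ratings) / math.sqrt(n)
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical ratings from 5 experts on a 0-4 scale (assumption for illustration).
ratings = [4, 4, 4, 4, 2]
lower, upper = normal_theory_ci(ratings)

# The upper limit is about 4.384, which is above the scale maximum of 4.
print(f"{lower:.3f} to {upper:.3f}")
```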

6
SCORE CONFIDENCE INTERVAL FOR RATING SCALES
  • The Score confidence interval (Penfield, 2003)
    treats rating scale variables as outcomes of a
    binomial distribution.
  • This interval is asymmetric
  • Hence, it is based on the actual distribution for
    the item of concern.
  • Further, the limits cannot extend below or above
    the actual limits of the categories.

7
  • 1. Obtain values for n, k, and z
  • n = the number of raters
  • k = the number of possible ratings:
  • The highest rating if the scale starts with 0
  • The highest rating minus 1 if the scale starts
    at 1
  • z = the standard normal variate associated with
    the confidence level (e.g., ±1.96 at 95%
    confidence)
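As a rough sketch of step 1, the values of z and k could be obtained in Python as follows. The function names are hypothetical, and treating k as the highest rating minus the lowest rating is an assumption that covers both cases listed above (scale starting at 0 or at 1).

```python
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided standard normal critical value for a confidence level in (0, 1)."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def scale_range(lowest, highest):
    """k, taken here as the highest rating minus the lowest rating (assumption)."""
    return highest - lowest


z = z_for_confidence(0.95)     # approximately 1.96
k_zero_based = scale_range(0, 4)  # scale 0..4
k_one_based = scale_range(1, 5)   # scale 1..5
print(round(z, 2), k_zero_based, k_one_based)
```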

8
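  • 2. Calculate the mean rating: the sum of the ratings for an
    item divided by the number of raters,
    $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$, where $X_i$ is the
    rating given by rater $i$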

9
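  • 3. Calculate p, where $X_i$ is the rating given by rater $i$:
    $p = \frac{\sum_{i=1}^{n} X_i}{nk}$
  • Or, if the scale begins with 1, then
    $p = \frac{\sum_{i=1}^{n} X_i - n}{nk}$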
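Steps 2 and 3 can be sketched together in Python. This is a minimal illustration, not the authors' code: `mean_and_p` and its `lowest` parameter are hypothetical names, the ratings are made up so that they sum to 31 for 10 raters, and the adjustment for a scale beginning with 1 (subtracting n from the sum) is an assumption based on rescaling the ratings to start at 0.

```python
def mean_and_p(ratings, k, lowest=0):
    """Return the mean rating and the proportion p of the maximum possible sum."""
    n = len(ratings)
    total = sum(ratings)
    mean = total / n
    # Shift the sum so the scale effectively starts at 0, then divide by n * k.
    p = (total - n * lowest) / (n * k)
    return mean, p


# Hypothetical ratings from 10 experts on a 0-4 scale; they sum to 31.
ratings = [4, 4, 4, 3, 3, 3, 3, 3, 2, 2]
mean, p = mean_and_p(ratings, k=4)

# The same judgments recorded on a 1-5 scale give the same p.
shifted = [r + 1 for r in ratings]
mean_shifted, p_shifted = mean_and_p(shifted, k=4, lowest=1)
print(mean, p, p_shifted)
```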

10
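  • 4. Use p to calculate the upper and lower limits
    for the population proportion (Wilson, 1927):
    $p_L = \frac{2nkp + z^2 - z\sqrt{4nkp(1-p) + z^2}}{2(nk + z^2)}$
    $p_U = \frac{2nkp + z^2 + z\sqrt{4nkp(1-p) + z^2}}{2(nk + z^2)}$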
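A minimal Python sketch of step 4 follows. It uses the Wilson score limits with nk playing the role of the binomial sample size, which is consistent with the numbers in the worked example (65.842, 11.042, and 87.683 for n = 10, k = 4, p = 0.775). The function name `wilson_limits` is an assumption for illustration.

```python
import math


def wilson_limits(p, n, k, z=1.96):
    """Wilson score limits for a proportion p with n * k binomial trials."""
    nk = n * k
    center = 2 * nk * p + z ** 2
    half_width = z * math.sqrt(4 * nk * p * (1 - p) + z ** 2)
    denominator = 2 * (nk + z ** 2)
    return (center - half_width) / denominator, (center + half_width) / denominator


p_lower, p_upper = wilson_limits(0.775, n=10, k=4)
print(f"{p_lower:.3f} {p_upper:.3f}")
```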

11
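  • 5. Calculate the upper and lower limits of the
    Score confidence interval, where $p_L$ and $p_U$ are the
    limits from step 4:
    $\bar{X}_L = \bar{X} - z\sqrt{\frac{k\,p_L(1-p_L)}{n}}$
    $\bar{X}_U = \bar{X} + z\sqrt{\frac{k\,p_U(1-p_U)}{n}}$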
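Step 5 might be sketched in Python as below. The form used (the mean minus or plus z times the square root of k * p * (1 - p) / n, evaluated at the lower and upper limits for p) is inferred from the worked example's figures rather than quoted from the slides, and `score_limits` is a hypothetical name. The inputs are the rounded values from the worked example.

```python
import math


def score_limits(mean, p_lower, p_upper, n, k, z=1.96):
    """Limits for the mean rating, built from the limits for the proportion p."""
    lower = mean - z * math.sqrt(k * p_lower * (1 - p_lower) / n)
    upper = mean + z * math.sqrt(k * p_upper * (1 - p_upper) / n)
    return lower, upper


# Rounded values from the worked example: mean 3.100, p limits 0.625 and 0.877.
lower, upper = score_limits(3.1, 0.625, 0.877, n=10, k=4)
print(f"{lower:.3f} to {upper:.3f}")
```

Because the lower and upper half-widths use different proportions, the resulting interval is asymmetric around the mean.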

12
  • Shorthand Example (cont.)
  • Let n = 10, k = 4, z = 1.96, and let the sum of
    the ratings = 31

  • so, the mean equals 31/10 = 3.100

  • so, p = 31 / (10 × 4) = 0.775

13
  • Shorthand Example (cont.)
  • Lower limit: 3.100 − 1.96 × sqrt(0.938/10) = 2.500
  • Upper limit: 3.100 + 1.96 × sqrt(0.432/10) = 3.507

14
  • Lower limit for p: (65.842 − 11.042) / 87.683 = 0.625
  • Upper limit for p: (65.842 + 11.042) / 87.683 = 0.877
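The whole shorthand example can be reproduced with one short Python function. This is a sketch under stated assumptions: the function name and the `lowest` parameter are hypothetical, the individual ratings are invented so that they sum to 31 across 10 raters, and the formulas are the ones implied by the example's intermediate figures.

```python
import math


def score_confidence_interval(ratings, k, lowest=0, z=1.96):
    """Return (mean, lower, upper) for the mean rating of one item."""
    n = len(ratings)
    total = sum(ratings)
    mean = total / n
    p = (total - n * lowest) / (n * k)

    # Wilson limits for p, with n * k as the number of binomial trials.
    nk = n * k
    root = z * math.sqrt(4 * nk * p * (1 - p) + z ** 2)
    denominator = 2 * (nk + z ** 2)
    p_lower = (2 * nk * p + z ** 2 - root) / denominator
    p_upper = (2 * nk * p + z ** 2 + root) / denominator

    # Limits for the mean rating.
    lower = mean - z * math.sqrt(k * p_lower * (1 - p_lower) / n)
    upper = mean + z * math.sqrt(k * p_upper * (1 - p_upper) / n)
    return mean, lower, upper


# Hypothetical ratings from 10 experts on a 0-4 scale; they sum to 31.
ratings = [4, 4, 4, 3, 3, 3, 3, 3, 2, 2]
mean, lower, upper = score_confidence_interval(ratings, k=4)
print(f"mean {mean:.3f}, interval {lower:.3f} to {upper:.3f}")
```

With these inputs the limits round to 2.500 and 3.507, and both stay inside the 0 to 4 range of the scale.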

15
  • Conclusion

We are 95% confident that the population mean
rating falls somewhere between 2.500 and 3.507.

16
     
  • EXAMPLE WITH 4 ITEMS

     