1
Reliability
  • Schroeder
  • PSY/SPED 572

2
Definition of Reliability
  • Consistency of scores across measurements
  • Not consistently good or bad, just consistent
  • Scores on a test reflect true score plus error

3
Reliability
  • Reliability = consistency
  • Affected by true score and amount of error
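A minimal Python sketch (not from the slides; the score distributions are invented for illustration) of the idea that an observed score is true score plus error, and that reliability is the share of observed-score variance due to true scores:

```python
import numpy as np

rng = np.random.default_rng(0)

# X = T + E: observed score = true score + random error (illustrative values)
true_scores = rng.normal(50, 10, size=1000)  # T: SD = 10
error = rng.normal(0, 5, size=1000)          # E: SD = 5
observed = true_scores + error

# Reliability = true-score variance / observed-score variance
reliability = true_scores.var() / observed.var()
print(f"reliability = {reliability:.2f}")    # about 100 / (100 + 25) = .80
```

Shrinking the error SD pushes the ratio toward 1.0 and growing it pushes the ratio toward 0.0, which is the sense in which reliability is affected by true score and amount of error.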

4
Sources of Error
  • Test Construction
  • Test Administration
  • Test Scoring and Interpretation

5
Correlation Coefficients
  • Describes the degree of relationship between two
    variables: strength and direction
  • Symbolized r
  • The extent to which changes in one variable are
    associated with changes in another variable
  • Ranges from -1.0 to +1.0
  • Strength of the relationship depends on the number
  • Direction of the relationship depends on the sign
  • Perfect correlations
  • pp. 120-121

6
Correlation Coefficients
  • Multiple ways of computing correlation
    coefficients
  • Choice depends on type of data (nominal, ordinal,
    etc.)
  • Pearson r
  • Most commonly used
  • Looks for line of best fit (regression line)
  • Interval or ratio data
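A hedged sketch of computing Pearson r with SciPy; the hours-studied and exam-score numbers are made up for illustration:

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative interval-level data: hours studied vs. exam score
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
score = np.array([52, 55, 61, 60, 68, 70, 75, 79])

r, p = pearsonr(hours, score)  # r: strength and direction; p: significance
print(f"r = {r:.2f}, p = {p:.4f}")
```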

7
Correlation Coefficients
  • Other correlation coefficients
  • Spearman rho (ρ): ordinal data
  • Phi (φ): nominal data (dichotomy)
  • Interpretation
  • Significance (e.g., .05)
  • Strength of relationship
  • Strong: > .70
  • Moderate: .30 to .70
  • Weak: < .30
  • Causality
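A small sketch of Spearman rho on ordinal data, with the slide's strength labels applied to the result (the ranks are invented):

```python
from scipy.stats import spearmanr

# Two raters ranking the same six essays (illustrative ordinal data)
ranks_a = [1, 2, 3, 4, 5, 6]
ranks_b = [2, 1, 3, 5, 4, 6]

rho, p = spearmanr(ranks_a, ranks_b)

# Strength labels from the slide: strong > .70, moderate .30 to .70, weak < .30
label = "strong" if abs(rho) > .70 else ("moderate" if abs(rho) >= .30 else "weak")
print(f"rho = {rho:.2f}, p = {p:.3f} ({label})")
```

Even a strong rho says nothing about causality, which is the slide's closing point.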

8
Reliability Coefficients
  • Why do we care?
  • Need to be consistent
  • What would happen if a test produced wildly
    different scores on different administrations?
  • What could we say about what that score means?

9
Reliability Coefficients
  • Indicates the proportion of variability in two sets
    of scores that reflects true score differences
  • Ranges from 0.0 to 1.0
  • Desired level of reliability (Salvia & Ysseldyke,
    1995)
  • Typically .70 or above
  • .90 or above for decision-making about
    individuals
  • .70 or above for decision-making about groups

10
Types of Reliability
  • Test-retest
  • Same test given at two different time periods
  • How long between testing?
  • Stability over time
  • Stability coefficient
  • Tend to be inflated
  • Concerns

11
Types of Reliability
  • Alternate forms
  • Two different forms of the same test given at
    same or different times
  • Stability of content (forms) and time (if
    different administrations)
  • Concerns
  • Split-half
  • One form, one administration
  • Divide the items into two halves and compare them
    against each other
  • Stability of content: can you generalize across
    items?
  • Concerns
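One way to compute split-half reliability, sketched in Python with an invented right/wrong item matrix; the odd-even split and the Spearman-Brown step-up are standard, but the data are illustrative:

```python
import numpy as np
from scipy.stats import pearsonr

# Rows = examinees, columns = items (1 = correct, 0 = wrong); invented data
items = np.array([
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 0],
])

odd = items[:, 0::2].sum(axis=1)   # score on odd-numbered items
even = items[:, 1::2].sum(axis=1)  # score on even-numbered items

r_half, _ = pearsonr(odd, even)

# Spearman-Brown: correlating half-tests understates full-test reliability,
# so project the coefficient back up to full length.
r_full = (2 * r_half) / (1 + r_half)
print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```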

12
Types of Reliability
  • Kuder-Richardson, Coefficient Alpha
  • One form, one administration
  • Stability of content, content heterogeneity
  • Looks at inter-item consistency
  • How similar are items to each other?
  • How well does a test hang together?
  • Kuder-Richardson for dichotomous items (e.g.,
    right or wrong)
  • Coefficient alpha for non-dichotomous, multipoint
    items (e.g., personality forced-choice scales)
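A compact sketch of coefficient alpha from a people-by-items matrix (the ratings are invented); with 0/1 items the same formula reduces to KR-20:

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Illustrative 5 people x 4 items; higher inter-item consistency -> higher alpha
ratings = np.array([
    [3, 4, 3, 5],
    [2, 2, 3, 3],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
])
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```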

13
Types of Reliability
  • Inter-rater
  • Used when scoring is subjective
  • Two judges rate the same items
  • Consistency across raters
  • Simple agreement
  • Kappa
  • Increasing inter-rater reliability
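A brief sketch contrasting simple agreement with Cohen's kappa, using scikit-learn and made-up ratings from two judges:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Two judges scoring the same ten responses (illustrative labels)
judge_a = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
judge_b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass", "pass", "pass"]

agreement = np.mean([a == b for a, b in zip(judge_a, judge_b)])  # raw proportion agreement
kappa = cohen_kappa_score(judge_a, judge_b)  # agreement corrected for chance
print(f"simple agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa runs lower than simple agreement because it discounts the matches two judges would produce by chance alone.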

14
Sources of Error Variance
  • Test-retest: time sampling
  • Alternate form: content (time) sampling
  • Split-half: content sampling
  • Kuder-Richardson / coefficient alpha: content
    sampling and content heterogeneity
  • Inter-rater: differences between judges

15
Sources of Error Variance
  (figure not captured in this transcript)
16
Factors affecting reliability
  • Homogeneity vs. heterogeneity
  • Dynamic vs. static characteristics
  • Restriction of range
  • Speed vs. power
  • Test length
  • Spearman-Brown formula (see the sketch after this
    list)
  • Ability levels
  • Test-retest interval
  • Guessing
  • Cheating
  • Situational variation
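A sketch of the Spearman-Brown formula mentioned above, showing how lengthening a test with comparable items is projected to raise reliability; the .60 starting value is illustrative:

```python
def spearman_brown(r_current, length_factor):
    """Projected reliability when test length is multiplied by length_factor."""
    n = length_factor
    return (n * r_current) / (1 + (n - 1) * r_current)

print(f"{spearman_brown(0.60, 2.0):.2f}")  # doubling a .60 test -> .75
print(f"{spearman_brown(0.60, 0.5):.2f}")  # halving it -> about .43
```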

17
Standard Error of Measurement
  • SEM
  • Another way to describe a test's reliability or
    test error
  • Provides information about confidence in score
    interpretation
  • Used when interpreting scores for individuals
  • As reliability increases, SEM decreases
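A minimal sketch of the usual SEM formula, SEM = SD x sqrt(1 - reliability); the SD of 15 is illustrative (a common score scale):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement."""
    return sd * math.sqrt(1 - reliability)

print(f"{sem(15, 0.75):.1f}")  # reliability .75 -> SEM 7.5
print(f"{sem(15, 0.91):.1f}")  # reliability .91 -> SEM 4.5 (higher reliability, smaller SEM)
```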

18
Confidence Intervals
  • Estimates the likelihood that a person's true
    score falls within a given range of scores,
    using the SEM
  • Need to determine how confident you want to be
    (95%, 99%)
  • More confidence in being correct means a larger
    interval
  • Score of 75, SEM = 5
  • 68% confidence: range 70-80
  • 95% confidence: range 65-85
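A sketch reproducing the slide's example: the interval is the score plus or minus z SEMs, with z of about 1 for 68% and about 2 for 95% confidence:

```python
def confidence_interval(score, sem, z):
    """Score +/- z standard errors of measurement."""
    return score - z * sem, score + z * sem

# Slide example: score = 75, SEM = 5
print(confidence_interval(75, 5, 1))  # (70, 80): ~68% confidence
print(confidence_interval(75, 5, 2))  # (65, 85): ~95% confidence
```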