1
Reliability
  • Schroeder
  • PSY/SPED 572

2
Definition of Reliability
  • Consistency of scores across measurements
  • Not consistently good or bad, just consistent
  • Scores on a test reflect true score plus error

3
Reliability
  • Reliability = consistency
  • Affected by true score and amount of error
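A minimal Python sketch (not from the slides; the score distributions are invented for illustration) of the idea that an observed score is true score plus error, and that reliability is the share of observed-score variance due to true scores:

```python
import numpy as np

rng = np.random.default_rng(0)

# X = T + E: observed score = true score + random error (illustrative values)
true_scores = rng.normal(50, 10, size=1000)  # T: SD = 10
error = rng.normal(0, 5, size=1000)          # E: SD = 5
observed = true_scores + error

# Reliability = true-score variance / observed-score variance
reliability = true_scores.var() / observed.var()
print(f"reliability = {reliability:.2f}")    # about 100 / (100 + 25) = .80
```

Shrinking the error SD pushes the ratio toward 1.0 and growing it pushes the ratio toward 0.0, which is the sense in which reliability is affected by true score and amount of error.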

4
Sources of Error
  • Test Construction
  • Test Administration
  • Test Scoring and Interpretation

5
Correlation Coefficients
  • Describes the degree of relationship between two
    variables: strength and direction
  • Symbolized r
  • The extent to which changes in one variable are
    associated with changes in another variable
  • Ranges from -1.0 to +1.0
  • Strength of the relationship depends on the number
  • Direction of the relationship depends on the sign
  • Perfect correlations
  • pp. 120-121

6
Correlation Coefficients
  • Multiple ways of computing correlation
    coefficients
  • Choice depends on type of data (nominal, ordinal,
    etc.)
  • Pearson r
  • Most commonly used
  • Looks for line of best fit (regression line)
  • Interval or ratio data
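A hedged sketch of computing Pearson r with SciPy; the hours-studied and exam-score numbers are made up for illustration:

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative interval-level data: hours studied vs. exam score
hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
score = np.array([52, 55, 61, 60, 68, 70, 75, 79])

r, p = pearsonr(hours, score)  # r: strength and direction; p: significance
print(f"r = {r:.2f}, p = {p:.4f}")
```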

7
Correlation Coefficients
  • Other correlation coefficients
  • Spearman rho (ρ): ordinal data
  • Phi (φ): nominal data (dichotomy)
  • Interpretation
  • Significance (e.g., .05)
  • Strength of relationship
  • Strong: > .70
  • Moderate: .30 to .70
  • Weak: < .30
  • Causality
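A small sketch of Spearman rho on ordinal data, with the slide's strength labels applied to the result (the ranks are invented):

```python
from scipy.stats import spearmanr

# Two raters ranking the same six essays (illustrative ordinal data)
ranks_a = [1, 2, 3, 4, 5, 6]
ranks_b = [2, 1, 3, 5, 4, 6]

rho, p = spearmanr(ranks_a, ranks_b)

# Strength labels from the slide: strong > .70, moderate .30 to .70, weak < .30
label = "strong" if abs(rho) > .70 else ("moderate" if abs(rho) >= .30 else "weak")
print(f"rho = {rho:.2f}, p = {p:.3f} ({label})")
```

Even a strong rho says nothing about causality, which is the slide's closing point.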

8
Reliability Coefficients
  • Why do we care?
  • Need to be consistent
  • What would happen if a test produced wildly
    different scores on different administrations?
  • What could we say about what that score means?

9
Reliability Coefficients
  • Indicates the proportion of variability in two sets
    of scores that reflects true score differences
  • Ranges from 0.0 to 1.0
  • Desired level of reliability (Salvia & Ysseldyke,
    1995)
  • Typically .70 or above
  • .90 or above for decision-making about
    individuals
  • .70 or above for decision-making about groups

10
Types of Reliability
  • Test-retest
  • Same test given at two different time periods
  • How long between testing?
  • Stability over time
  • Stability coefficient
  • Tend to be inflated
  • Concerns

11
Types of Reliability
  • Alternate forms
  • Two different forms of the same test given at
    same or different times
  • Stability of content (forms) and time (if
    different administrations)
  • Concerns
  • Split-half
  • One form, one administration
  • Divide the items into two halves and compare them
    against each other
  • Stability of content: can you generalize across
    items?
  • Concerns
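One way to compute split-half reliability, sketched in Python with an invented right/wrong item matrix; the odd-even split and the Spearman-Brown step-up are standard, but the data are illustrative:

```python
import numpy as np
from scipy.stats import pearsonr

# Rows = examinees, columns = items (1 = correct, 0 = wrong); invented data
items = np.array([
    [1, 1, 1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 0],
])

odd = items[:, 0::2].sum(axis=1)   # score on odd-numbered items
even = items[:, 1::2].sum(axis=1)  # score on even-numbered items

r_half, _ = pearsonr(odd, even)

# Spearman-Brown: correlating half-tests understates full-test reliability,
# so project the coefficient back up to full length.
r_full = (2 * r_half) / (1 + r_half)
print(f"half-test r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```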

12
Types of Reliability
  • Kuder-Richardson, Coefficient Alpha
  • One form, one administration
  • Stability of content, content heterogeneity
  • Looks at inter-item consistency
  • How similar are items to each other?
  • How well does a test hang together?
  • Kuder-Richardson for dichotomous items (e.g.,
    right or wrong)
  • Coefficient alpha for non-dichotomous, multipoint
    items (e.g., personality forced-choice scales)
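A compact sketch of coefficient alpha from a people-by-items matrix (the ratings are invented); with 0/1 items the same formula reduces to KR-20:

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Illustrative 5 people x 4 items; higher inter-item consistency -> higher alpha
ratings = np.array([
    [3, 4, 3, 5],
    [2, 2, 3, 3],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 4],
])
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```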

13
Types of Reliability
  • Inter-rater
  • Used when scoring is subjective
  • Two judges rate the same items
  • Consistency across raters
  • Simple agreement
  • Kappa
  • Increasing inter-rater reliability
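A brief sketch contrasting simple agreement with Cohen's kappa, using scikit-learn and made-up ratings from two judges:

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Two judges scoring the same ten responses (illustrative labels)
judge_a = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "pass"]
judge_b = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass", "pass", "pass"]

agreement = np.mean([a == b for a, b in zip(judge_a, judge_b)])  # raw proportion agreement
kappa = cohen_kappa_score(judge_a, judge_b)  # agreement corrected for chance
print(f"simple agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Kappa runs lower than simple agreement because it discounts the matches two judges would produce by chance alone.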

14
Sources of Error Variance
  • Test-retest: time sampling
  • Alternate form: content (time) sampling
  • Split-half: content sampling
  • Kuder-Richardson / coefficient alpha: content
    sampling and content heterogeneity
  • Inter-rater: differences between judges

15
Sources of Error Variance
  (figure not captured in this transcript)
16
Factors affecting reliability
  • Homogeneity vs. heterogeneity
  • Dynamic vs. static characteristics
  • Restriction of range
  • Speed vs. power
  • Test length
  • Spearman-Brown formula (see the sketch after this
    list)
  • Ability levels
  • Test-retest interval
  • Guessing
  • Cheating
  • Situational variation
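A sketch of the Spearman-Brown formula mentioned above, showing how lengthening a test with comparable items is projected to raise reliability; the .60 starting value is illustrative:

```python
def spearman_brown(r_current, length_factor):
    """Projected reliability when test length is multiplied by length_factor."""
    n = length_factor
    return (n * r_current) / (1 + (n - 1) * r_current)

print(f"{spearman_brown(0.60, 2.0):.2f}")  # doubling a .60 test -> .75
print(f"{spearman_brown(0.60, 0.5):.2f}")  # halving it -> about .43
```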

17
Standard Error of Measurement
  • SEM
  • Another way to describe a test's reliability or
    test error
  • Provides information about confidence in score
    interpretation
  • Used when interpreting scores for individuals
  • As reliability increases, SEM decreases
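A minimal sketch of the usual SEM formula, SEM = SD x sqrt(1 - reliability); the SD of 15 is illustrative (a common score scale):

```python
import math

def sem(sd, reliability):
    """Standard error of measurement."""
    return sd * math.sqrt(1 - reliability)

print(f"{sem(15, 0.75):.1f}")  # reliability .75 -> SEM 7.5
print(f"{sem(15, 0.91):.1f}")  # reliability .91 -> SEM 4.5 (higher reliability, smaller SEM)
```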

18
Confidence Intervals
  • Estimates the likelihood that a person's true
    score falls within a given range of scores,
    using the SEM
  • Need to determine how confident you want to be
    (95%, 99%)
  • More confidence in being correct means a larger
    interval
  • Score of 75, SEM = 5
  • 68% confidence: range 70-80
  • 95% confidence: range 65-85
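A sketch reproducing the slide's example: the interval is the score plus or minus z SEMs, with z of about 1 for 68% and about 2 for 95% confidence:

```python
def confidence_interval(score, sem, z):
    """Score +/- z standard errors of measurement."""
    return score - z * sem, score + z * sem

# Slide example: score = 75, SEM = 5
print(confidence_interval(75, 5, 1))  # (70, 80): ~68% confidence
print(confidence_interval(75, 5, 2))  # (65, 85): ~95% confidence
```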