1
The Science and Art of Exam Development
Paul E. Jones, PhD, Thomson Prometric
2
What is validity and how do I know if my test has
it?
3
Validity
  • Validity refers to the degree to which evidence
    and theory support the interpretations of test
    scores entailed by the proposed uses of tests.
    Validity is, therefore, the most fundamental
    consideration in developing and evaluating
    tests. (APA Standards, 1999, p. 9)

4
A test may yield valid judgments about people
  • If it measures the domain it was defined to
    measure.
  • If the test items have good measurement
    properties.
  • If the test scores and the pass/fail decisions
    are reliable.
  • If alternate forms of the test are on the same
    scale.
  • If you apply defensible judgment criteria.
  • If you allow enough time for competent (but not
    necessarily speedy) candidates to take the test.
  • If it is presented to the candidate in a
    standardized fashion, without environmental
    distractions.
  • If the test taker is not cheating and the test
    has not deteriorated.

5
Is this a Valid Test?
  1. 4 - 3 _____        6. 3 - 2 _____
  2. 9 - 2 _____        7. 8 - 7 _____
  3. 4 - 4 _____        8. 9 - 5 _____
  4. 7 - 6 _____        9. 6 - 2 _____
  5. 5 - 1 _____       10. 8 - 3 _____

6
The Validity = Technical Quality of the Testing System
(Diagram: the Testing System, with Design and Item Bank components)
7
The Validity Argument is Part of the Testing System
(Diagram: Testing System, Design and Item Bank)
8
How should I start a new testing initiative?
9
A Testing System Begins with Design
(Diagram: Testing System, Design and Item Bank)
10
Test Design Begins with Test Definition
  • Test Title
  • Credential Name
  • Test Purpose (This test will certify that the
    successful candidate has important knowledge and
    skills necessary to ...)
  • Intended Audience
  • Candidate Preparation
  • High-Level Knowledge and Skills Covered
  • Products or Technologies Addressed
  • Knowledge and Skills Assumed but Not Tested
  • Knowledge and Skills Related to the Test but Not
    Tested
  • Borderline Candidate Description
  • Testing Methods
  • Test Organization
  • Test Stakeholders
  • Other Information
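
Captured in tooling, the definition might look like the
following sketch; the class and field names are illustrative
assumptions, not a standard schema:

  from dataclasses import dataclass, field

  @dataclass
  class TestDefinition:
      test_title: str
      credential_name: str
      test_purpose: str            # "This test will certify that ..."
      intended_audience: str
      candidate_preparation: str
      skills_covered: list[str] = field(default_factory=list)
      skills_assumed_not_tested: list[str] = field(default_factory=list)
      skills_related_not_tested: list[str] = field(default_factory=list)
      borderline_candidate: str = ""
      testing_methods: list[str] = field(default_factory=list)
      test_organization: str = ""
      stakeholders: list[str] = field(default_factory=list)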

11
Test Definition Begins with Program Design
12
Test Definition Leads to Practice Analysis
13
Practice Analysis Leads to Test Objectives
14
Test Objectives are Embedded in a Blueprint
15
Once I have a blueprint, how do I develop
appropriate exam items?
16
The Testing System
(Diagram: Testing System, Design and Item Bank)
17
Creating Items
(Diagram: each item combines content characteristics with a
response mode and a scoring rule)
  • Content options (choose many): Text, Graphics,
    Audio, Video, Simulations, Applications
  • Response modes (choose one): Single M/C,
    Multiple M/C, Single PC, Multiple PC,
    Drag & Drop, Brief FR, Essay FR, Simulation/App
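
A minimal sketch, in Python, of how an item bank record might
encode these choices; the class and field names are
illustrative assumptions, not an actual banking tool's schema:

  from dataclasses import dataclass
  from enum import Enum

  class ResponseMode(Enum):
      # choose exactly one response mode per item (labels from the slide)
      SINGLE_MC = "Single M/C"
      MULTIPLE_MC = "Multiple M/C"
      SINGLE_PC = "Single PC"
      MULTIPLE_PC = "Multiple PC"
      DRAG_DROP = "Drag & Drop"
      BRIEF_FR = "Brief FR"
      ESSAY_FR = "Essay FR"
      SIMULATION_APP = "Simulation/App"

  @dataclass
  class Item:
      item_id: str
      objective_id: str         # item-objective linkage to the blueprint
      response_mode: ResponseMode
      content: set[str]         # choose many: text, graphics, audio, ...
      scoring_key: object       # answer key or rubric, depending on mode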
18
Desirable Measurement Properties of Items
  • Item-objective linkage
  • Appropriate difficulty
  • Discrimination
  • Interpretability

19
Item-Objective Linkage
20
Good Item Development Practices
  • SME writers in a social environment
  • Industry-accepted item writing principles
  • Item banking tool
  • Mentoring
  • Rapid editing
  • Group technical reviews

21
How can I gather and use data to develop an item
bank?
22
The Testing System
(Diagram: Testing System, Design and Item Bank)
23
Classical Item Analysis: Difficulty and Discrimination
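A minimal Python sketch of these two classical statistics,
assuming a candidates-by-items matrix of 0/1 scores:
difficulty as the proportion correct (p) and discrimination
as the corrected point-biserial correlation with the rest of
the test:

  import numpy as np

  def classical_item_stats(responses: np.ndarray) -> list[dict]:
      # responses: candidates x items matrix of 0/1 scores
      total = responses.sum(axis=1)
      stats = []
      for j in range(responses.shape[1]):
          item = responses[:, j]
          rest = total - item                # criterion excludes the item
          p = item.mean()                    # difficulty: proportion correct
          r = np.corrcoef(item, rest)[0, 1]  # discrimination: point-biserial
          stats.append({"item": j, "p": float(p), "r_pb": float(r)})
      return stats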
24
Classical Option Analysis: Good Item
(Table: per-option n, proportion choosing, discrimination,
and choice rates by ability quintile Q1 to Q5)
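A sketch of how such an option table might be computed from
raw option choices and an answer key (assumed inputs);
ability quintiles are formed on total score, Q1 lowest to Q5
highest:

  import numpy as np

  def option_analysis(choices: np.ndarray, keys: np.ndarray, j: int) -> dict:
      # choices: candidates x items matrix of selected options (0, 1, 2, ...)
      # keys: the correct option for each item
      scored = (choices == keys).astype(int)
      ability = scored.sum(axis=1)
      edges = np.percentile(ability, [20, 40, 60, 80])
      quintile = np.searchsorted(edges, ability, side="right")  # 0..4
      table = {}
      for opt in np.unique(choices[:, j]):
          picked = choices[:, j] == opt
          table[int(opt)] = [float(picked[quintile == q].mean())
                             for q in range(5)]
      return table
  # On a good item, the keyed option's rate climbs from Q1 to Q5
  # while each distractor's rate falls.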
25
Classical Option Analysis: Problem Item
26
IRT Item Analysis: Difficulty and Discrimination
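For reference, the three-parameter logistic (3PL) model
summarizes an item by its discrimination (a), difficulty (b),
and pseudo-guessing floor (c); a minimal sketch:

  import numpy as np

  def p_3pl(theta, a, b, c):
      # P(correct | theta) = c + (1 - c) / (1 + exp(-a (theta - b)))
      return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

  # b shifts the curve along the ability scale; a sets its slope at b
  print(p_3pl(theta=0.0, a=1.2, b=-0.5, c=0.20))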
27
Good IRT Model Fit
28
How can I assemble test forms from my item bank?
29
The Testing System
(Diagram: Testing System, Design and Item Bank)
30
Reliability
  • Reliability refers to the degree to which test
    scores are free from errors of measurement. (APA
    Standards, 1985, p. 19)
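
A minimal sketch of coefficient alpha, one common estimate of
this kind of reliability (it reduces to KR-20 when items are
scored 0/1):

  import numpy as np

  def cronbach_alpha(scores: np.ndarray) -> float:
      # scores: candidates x items matrix
      k = scores.shape[1]
      item_var = scores.var(axis=0, ddof=1).sum()
      total_var = scores.sum(axis=1).var(ddof=1)
      return (k / (k - 1)) * (1.0 - item_var / total_var)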

31
More Reliable Test
32
Less Reliable Test
33
How to Enhance Reliability When Assembling Test Forms
  • Score reliability/generalizability
    - Select items with good measurement properties.
    - Present enough items (see the Spearman-Brown
      sketch after this list).
    - Target items at candidate ability level.
    - Sample items consistently from across the
      content domain (use a clearly defined test
      blueprint).
  • Score dependability
    - Same as above.
    - Minimize differences in test difficulty.
  • Pass-fail consistency
    - Select enough items.
    - Target items at the cut score.
    - Maintain the same score distribution shape
      between forms.
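
The Spearman-Brown formula quantifies the "present enough
items" advice: it projects score reliability when test length
changes, assuming the added items behave like the existing
ones:

  def spearman_brown(reliability: float, length_factor: float) -> float:
      # projected reliability when length is multiplied by length_factor
      return (length_factor * reliability) / (
          1.0 + (length_factor - 1.0) * reliability)

  print(spearman_brown(0.80, 2.0))  # doubling a 0.80 test -> about 0.89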

34
Building Simultaneous Parallel Forms Using
Classical Theory
35
Building Simultaneous Parallel Forms Using IRT
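Under IRT, parallel forms are assembled so that their test
information functions match across the ability range,
especially near the cut score; test information is simply the
sum of item informations. A simplified 2PL sketch with
invented item parameters:

  import numpy as np

  def item_info_2pl(theta, a, b):
      # Fisher information of a 2PL item at ability theta
      p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
      return a ** 2 * p * (1.0 - p)

  def form_info(form, theta):
      return sum(item_info_2pl(theta, a, b) for a, b in form)

  theta_cut = 0.5
  form_a = [(1.2, 0.3), (0.9, 0.6), (1.1, 0.4)]
  form_b = [(1.0, 0.5), (1.1, 0.2), (1.0, 0.7)]
  print(form_info(form_a, theta_cut), form_info(form_b, theta_cut))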
36
What options do I have for setting the passing
score for my exam?
37
The Testing System
(Diagram: Testing System, Design and Item Bank)
38
Setting Cut Scores
Why not just set the cut score at 75% correct?
39
Setting Cut Scores
Why not just set the cut score so that 80% of the
candidates pass?
40
The logic of criterion-based cut score setting
  • Certain knowledge and skills are necessary for
    practice.
  • The test measures an important subset of these
    knowledge and skills, and thus readiness for
    practice.
  • The passing cut score is such that those who
    pass have a high enough level of mastery of the
    knowledge, skills, and judgments (KSJs) to be
    ready for practice at the level defined in the
    test definition, while those who fail do not.
    (Kane, Crooks, and Cohen, 1997)

41
The Main Goal in Setting Cut Scores
Meeting the Goldilocks Criteria
"We want the passing score to be neither too high
nor too low, but, at least approximately, just
right." (Kane, Crooks, and Cohen, 1997, p. 8)
42
Two General Approaches to Setting Cut Scores
  • Test-Centered Approaches
    - Modified Angoff (see the sketch after this
      list)
    - Bookmark
  • Examinee-Centered Approaches
    - Borderline Group
    - Contrasting Groups
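
A minimal sketch of the Modified Angoff computation: each
judge estimates the probability that a borderline candidate
answers each item correctly, and the cut score is the sum of
the per-item mean ratings (the ratings below are invented for
illustration):

  import numpy as np

  # ratings[j][r]: judge r's probability that a *borderline*
  # candidate answers item j correctly
  ratings = np.array([
      [0.60, 0.55, 0.65],
      [0.80, 0.75, 0.85],
      [0.40, 0.50, 0.45],
  ])

  item_means = ratings.mean(axis=1)  # consensus expected score per item
  cut_score = item_means.sum()       # expected raw score of a borderline candidate
  print(f"Angoff cut score: {cut_score:.2f} of {len(ratings)} points")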

43
The Testing System
(Diagram: Testing System, Design and Item Bank)
44
What should I consider as I manage my testing
system?
45
Security of a Testing System
(Diagram: Testing System, Design and Item Bank)
  • Write more items!!!
  • Create authentic items.
  • Use isomorphs.
  • Use Automated Item Generation.
  • Use secure banking software and connectivity.
  • Use in-person development.
46
Security of a Testing System
(Diagram: Testing System, Design and Item Bank)
  • Establish prerequisite qualifications.
  • Use narrow testing windows.
  • Establish test/retest restrictions.
  • Use identity verification and biometrics.
  • Require test takers to sign NDAs.
  • Monitor test takers on site.
  • Intervene if cheating is detected.
  • Monitor individual test center performance.
  • Track suspicious test takers over time.

47
Security of a Testing System
  • Perform frequent detailed psychometric review.
  • Restrict the use of items and test forms.
  • Analyze response times (see the sketch below).
  • Perform DRIFT analyses.
  • Calibrate items efficiently.

(Diagram: Testing System, Design and Item Bank)
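One way to analyze response times, sketched below: flag
candidates who answer an implausibly large share of items
faster than a per-item floor, a pattern consistent with item
preknowledge. The thresholds are illustrative assumptions,
not industry standards:

  import numpy as np

  def flag_fast_responders(times: np.ndarray,
                           floor_seconds: float = 10.0,
                           max_share: float = 0.25) -> np.ndarray:
      # times: candidates x items matrix of response times in seconds
      too_fast_share = (times < floor_seconds).mean(axis=1)
      return np.where(too_fast_share > max_share)[0]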
48
Item Parameter Drift
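A minimal drift check, assuming difficulty estimates from two
administrations have already been linked to a common scale;
the half-logit threshold is an illustrative choice, not a
standard:

  import numpy as np

  def flag_drift(b_old: np.ndarray, b_new: np.ndarray,
                 threshold: float = 0.5):
      # items whose difficulty shifted by more than `threshold` logits
      # may be exposed, outdated, or affected by curriculum change
      delta = b_new - b_old
      return np.where(np.abs(delta) > threshold)[0], delta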
49
Security of a Testing System
(Diagram: Testing System, Design and Item Bank)
  • Many unique fixed forms
  • Linear on-the-Fly testing (LOFT)
  • Computerized adaptive testing (CAT; see below)
  • Computerized mastery testing (CMT)
  • Multi-staged testing (MST)
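As a taste of the adaptive option, the core CAT step picks
the not-yet-administered item with maximum information at the
current ability estimate; a simplified 2PL sketch that
ignores exposure control and content balancing:

  import numpy as np

  def next_cat_item(theta_hat: float, bank: dict, administered: set):
      # bank: {item_id: (a, b)} 2PL parameters
      best_id, best_info = None, -1.0
      for item_id, (a, b) in bank.items():
          if item_id in administered:
              continue
          p = 1.0 / (1.0 + np.exp(-a * (theta_hat - b)))
          info = a ** 2 * p * (1.0 - p)
          if info > best_info:
              best_id, best_info = item_id, info
      return best_id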

50
Item Analysis Activity