Title: Constructing a Common Scale
1. Constructing a Common Scale
Robert Schulz, Deputy Director, Chief Technical and Operating Officer, r.schulz@eaa.unsw.edu.au
Nathaniel Lewis, Data Analysis Manager, n.lewis@eaa.unsw.edu.au
Educational Assessment Australia
2. Overview
- Problem: raw scores are specific to each test paper
- Common items across year levels and calendar years provide link information
- A psychometric model (Rasch) is used to calculate item difficulties that are independent of the test population
- Equating places all items onto a common scale
- Performance on a subset of items places each student's performance on the scale
- Basis for a growth measure
3. Raw scores are specific to the test paper
- Year 8 better than Year 9?
- Unknown
- Different test papers!
- Maybe the Year 9 test was much harder
- Raw scores cannot be used to compare year levels (illustrated in the sketch below)
- The same issue applies to last year's raw scores: they cannot be used to track growth
- Raw scores ARE useful for relative comparison (e.g. school vs state)
- Raw scores can serve as a common scale (e.g. ICAS Writing each year)
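A minimal numeric sketch of the point above, with hypothetical item difficulties in logits: the same student is expected to get quite different raw scores on an easy and a hard paper, so equal raw scores on different papers cannot be read as equal ability. The Rasch probability used here is formalised on slides 10 and 12.

```python
# Hypothetical illustration of why raw scores are paper-specific:
# the same student's expected raw score depends on the paper's difficulty.
import math

def p_correct(theta, delta):
    # Rasch probability that a student of ability theta answers an item
    # of difficulty delta correctly (both in logits).
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

easy_paper = [-1.5, -1.0, -0.5, 0.0, 0.5]  # hypothetical item difficulties
hard_paper = [0.0, 0.5, 1.0, 1.5, 2.0]

theta = 0.5  # one student's ability
expected_easy = sum(p_correct(theta, d) for d in easy_paper)
expected_hard = sum(p_correct(theta, d) for d in hard_paper)
print(f"Expected raw score on the easy paper: {expected_easy:.2f} / 5")
print(f"Expected raw score on the hard paper: {expected_hard:.2f} / 5")
```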
4. Linking of items
- Theoretical (but not practical) solution: one common paper for each calendar year and year level, so raw scores could be compared (e.g. Writing marking)
- Better (and practical) solution:
- Use a psychometric model that calculates item difficulty independently of population ability, AND
- Use common items: questions linked across papers (a sketch of this step follows the diagram below)
- Other approach: common-person equating
- Same student sits different papers on the same day or in the same week
[Diagram (simplified version): test papers for Y4, Y5 and Y6 in 2003 and 2006. Vertical link items connect adjacent year levels within a calendar year; horizontal link items connect the same year level across calendar years.]
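A minimal sketch of the common-item step, assuming both papers have already been Rasch-calibrated separately. The item IDs and difficulty values are hypothetical, and the mean-mean method shown is one simple linking approach, not necessarily EAA's production procedure.

```python
# Two separate Rasch calibrations that share three link items (L1-L3).
y5_items = {"L1": -0.40, "L2": 0.10, "L3": 0.65, "A1": -1.00, "A2": 0.30}
y6_items = {"L1": -1.10, "L2": -0.55, "L3": -0.05, "B1": 0.80, "B2": 1.40}

link_ids = ["L1", "L2", "L3"]  # items that appear in both papers

# Link items should differ only by the arbitrary origin of each
# paper-specific calibration, so their mean difference estimates the shift.
shift = sum(y5_items[i] - y6_items[i] for i in link_ids) / len(link_ids)

# Re-express every Y6 item difficulty on the Y5 (common) scale.
y6_on_common = {item: d + shift for item, d in y6_items.items()}

print(f"estimated link shift = {shift:.2f} logits")
print(y6_on_common)
```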
5. Mapping onto a common scale
Raw score results come from different test papers.
Information from the common items is used to equate papers and place all items and all students onto a common scale.
Common scale: e.g. the Maths scale covers >1,500 items and >2 million student test results.
Use the common scale for comparative reporting across test papers (e.g. cohort growth or cohort ability comparison).
6. Assessment
- Why?
- Outcomes
- Internal
  - Student Progress
  - Teacher Performance
  - Comparison of Streams
- External
  - Annual Reporting
  - Parents
  - Publicity
7. EAA Instruments
- Tools for Decision Making
- Accuracy & Certainty
- Assessment Frameworks
8. Assessment Frameworks
- Derived from Australian Curriculum
- Objective Measures
- Mathematical Psychology
- Substantive Theory, Expertise, Experience
9. Modern Test Theory
- Latent Traits
- Item Response Theory (IRT)
- Rasch Model
- Student Ability
- Item Difficulty
10. Rasch Model
- Example 1
- Student Ability < Item Difficulty
- Less than 50% probability of a correct response
- Example 2
- Student Ability > Item Difficulty
- More than 50% probability of a correct response
- Example 3
- Student Ability = Item Difficulty
- Exactly 50% probability of a correct response
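A quick numeric check of the three examples, using the Rasch probability formalised on the next slide; the ability and difficulty values are illustrative.

```python
import math

def p_correct(theta, delta):
    # Rasch probability of a correct response (logits).
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

print(p_correct(-1.0, 0.0))  # ability < difficulty: ~0.27, below 50%
print(p_correct(1.0, 0.0))   # ability > difficulty: ~0.73, above 50%
print(p_correct(0.0, 0.0))   # ability = difficulty: exactly 0.5
```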
11. Measurement Scales
[Diagram: student abilities (θ1–θ4) and item difficulties (δ1–δ5) located together on the one measurement scale.]
12. Rasch Model
- P: probability of a correct response
- θ: student ability
- δ: item difficulty
- $P = \dfrac{e^{\theta - \delta}}{1 + e^{\theta - \delta}}$
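Given the formula above, and item difficulties already fixed on the common scale, a student's ability can be estimated from their responses to any subset of items, which is what places every student on the scale. The sketch below uses Newton-Raphson maximum likelihood; it is illustrative only (not EAA's estimation software), and the response data are hypothetical.

```python
import math

def p_correct(theta, delta):
    return 1.0 / (1.0 + math.exp(-(theta - delta)))

def estimate_ability(responses, difficulties, iterations=20):
    # Newton-Raphson maximum-likelihood estimate of theta, assuming a
    # mixed response pattern (all-correct or all-wrong has no finite MLE).
    theta = 0.0
    for _ in range(iterations):
        probs = [p_correct(theta, d) for d in difficulties]
        score = sum(x - p for x, p in zip(responses, probs))  # dlogL/dtheta
        info = sum(p * (1 - p) for p in probs)                # Fisher information
        theta += score / info                                 # Newton step
    return theta

difficulties = [-1.2, -0.6, 0.0, 0.4, 0.9, 1.5]  # equated, on the common scale
responses = [1, 1, 1, 0, 1, 0]                   # one student's answers
print(f"estimated ability = {estimate_ability(responses, difficulties):.2f} logits")
```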
13. Equating
- International Standards
  - TIMSS / PISA
  - NAEP
  - UK National Testing
  - Australian State Testing
  - Australian NAP
- Link Item Equating (Vertical & Horizontal)
14. Vertical Equating
15. Horizontal Equating
- Across Calendar Years
  - Y5 2006 to Y6 2007
  - Y5 2005 to Y5 2006 to Y5 2007
- Trends
- Growth Focus (see the sketch below)
- Curriculum Change
- Evaluate Decisions
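A minimal sketch of the growth-focused reporting this enables, with hypothetical logit values rather than EAA results: once abilities from different papers and years sit on the one scale, cohort growth is a simple difference of scale locations.

```python
from statistics import mean

y5_2006 = [-0.4, 0.1, 0.3, 0.7, -0.2]  # cohort abilities, Y5 paper, 2006
y6_2007 = [0.2, 0.6, 0.9, 1.3, 0.4]    # same cohort, Y6 paper, 2007

# Both sets of abilities are on the common scale, so the means are
# directly comparable across papers and calendar years.
growth = mean(y6_2007) - mean(y5_2006)
print(f"cohort growth = {growth:+.2f} logits")
```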
16. EAA Scale Building
- A Common Measure
- Substantive Theory
- Psychology of Measuring Human Ability
- Advanced Statistical Methods
- International Standards
- Potential Uses of TAP Scales
17. Questions and more?