Title: Overview of NAEP Scoring
Slide 1: Overview of NAEP Scoring
Teresa A. Neidorf, NAEP Education Statistics Services Institute, American Institutes for Research
CCSSO Large-Scale Assessment Conference, San Francisco, CA, June 28, 2006
Slide 2: The National Assessment of Educational Progress
The only nationally representative and continuing survey of what America's students know and can do in various subjects.
Subjects assessed:
- Arts
- Civics
- Economics
- Geography
- Foreign Language
- Mathematics
- Reading
- Science
- U.S. History
- World History
- Writing
NAEP State Assessments
Slide 3: NAEP Organizational Model
- U.S. Department of Education
- Institute of Education Sciences
- National Assessment Governing Board (NAGB)
- National Center for Education Statistics (NCES)
- Assessment Division, National Assessment of Educational Progress
- NAEP Contractors: ETS, AIR, Pearson, Westat, GMRI, Hager Sharp, HumRRO, and NESSI
Slide 4: Key Contractor Roles in NAEP Scoring
- Educational Testing Service (ETS)
  - Item scoring guide development
  - Assembling training materials
  - Scorer training
  - Statistical specifications and monitoring
- Pearson Educational Measurement
  - Scoring center staff and facilities for all NAEP scoring
  - Hiring/placing, training, supervising, and monitoring scorers
  - Developing/maintaining the electronic scoring system (ePEN)
  - Preparing scoring reports
  - Creating/submitting NAEP data files
- NAEP Education Statistics Services Institute (NESSI)
  - Technical assistance to NCES in oversight of NAEP scoring
- Human Resources Research Organization (HumRRO)
  - Quality assurance contractor
Slide 5: NAEP Goals
- Measure student achievement in grades 4, 8, and 12 in selected subjects on a regular basis
- Relate achievement to educational context
- Track trends in achievement over time
- Ensuring the quality of student achievement data is a primary goal for NAEP.
Slide 6: Reporting NAEP Trend
[Figure: Average scale scores in mathematics, grade 8, various years, 1990-2005]
Slide 7: Quality of NAEP Trend
- Ensuring the integrity of the trend measure is a hallmark of NAEP.
- Any changes in reported achievement should reflect true differences in student performance, not other sources of variation.
- Changes in scoring are a potential source of error that can corrupt the true trend measure and must be controlled.
- Trend scoring monitoring is a critical component of quality assurance for NAEP.
- Trend monitoring is based on the statistical requirements of the NAEP psychometric model.
Slide 8: NAEP Assessment Design
- Representative student samples (national, state, and district level)
- Item samples
  - Cover content in NAEP frameworks
  - Multiple-choice and constructed-response items
  - Trend items (repeated across assessment years)
  - Released items (replaced after each assessment)
- Matrix sampling
  - Each student receives a portion of the total item pool
  - No individual student scores are reported!
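The matrix-sampling idea above can be sketched in a few lines: rotate booklets (each covering only part of the item pool) across students so that the full pool is covered in aggregate. This is a minimal illustration, not NAEP's actual booklet design; the block names and pairing scheme are hypothetical.

```python
from itertools import cycle

# Hypothetical item blocks; real NAEP booklet designs are more elaborate.
blocks = ["M1", "M2", "M3", "M4"]

# Pair adjacent blocks into booklets so each booklet covers only part of the pool.
booklets = [(blocks[i], blocks[(i + 1) % len(blocks)]) for i in range(len(blocks))]

def spiral_assign(num_students, booklets):
    """Assign booklets to students in rotation ("spiraling")."""
    rotation = cycle(booklets)
    return [next(rotation) for _ in range(num_students)]

assignments = spiral_assign(10, booklets)
# Each student sees 2 of 4 blocks; across students, all blocks are administered.
```

Because no student takes every item, results are meaningful only for groups of students, which is why no individual scores are reported.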
Slide 9: NAEP Analysis and Reporting
- Item Response Theory (IRT) scaling used to produce scale score estimates
- Group estimates for populations of students
  - National, state, or district level
  - Reporting groups (e.g., race/ethnicity, gender)
- Reporting
  - Average scale scores
  - Percentages by achievement level
  - Trends (changes in achievement over time)
- Trend items permit linking across adjacent years to report trend
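To give a flavor of the IRT scaling mentioned above: for a multiple-choice item, a three-parameter logistic (3PL) model gives the probability that a student of ability theta answers correctly. The sketch below uses illustrative parameter values; it is not NAEP's estimation code, which fits such models to the full response data.

```python
import math

def p_correct_3pl(theta, a, b, c, d=1.7):
    """3PL item response function: probability of a correct response for
    ability theta, given discrimination a, difficulty b, pseudo-guessing c,
    and scaling constant d (all values here are illustrative)."""
    return c + (1.0 - c) / (1.0 + math.exp(-d * a * (theta - b)))

# At theta == b, the probability is halfway between c and 1:
p_mid = p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)  # 0.6
```

Fitting curves like this across all items and students yields the scale scores that NAEP averages and reports for population groups.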
Slide 10: NAEP Constructed-Response Scoring
- Explicit scoring guides developed for each item to meet criteria in the NAEP framework
- Scored by teams of qualified and trained scorers
- Using an electronic image processing, scoring, and monitoring system (ePEN)
- Scoring monitoring reflects the NAEP model
  - No individual student scores reported
  - Some random fluctuation in item-level scores is acceptable
  - Trend measure is paramount
Slide 11: NAEP Scoring Process
- Phase I: Development and Pilot
  - Scoring guides developed along with items
  - Pilot test administered and scored
  - Items selected for the operational assessment
- Phase II: First Operational Assessment (or field test)
  - Scoring guides refined and training packets assembled
  - Baseline for trend established
- Phase III: Subsequent Operational Assessments
  - Scoring guides and training packets finalized
  - Trend scoring procedures implemented
Slide 12: NAEP Scoring Quality Assurance Steps
- Scorer screening and placement
- Scorer training (scoring guide, anchor and practice sets)
- Scorer qualification
- Trend training
- Ongoing team calibration
- Monitoring scorer accuracy (backreading)
- Monitoring scoring reliability (within-year agreement)
  - Second scoring a portion of responses
- Monitoring trend agreement
  - Rescoring responses from the previous year
  - Interspersed with current-year responses
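The reliability checks above rest on comparing first and second scores for the same responses. Two standard agreement statistics, percent exact agreement and Cohen's kappa (which corrects for chance agreement), can be sketched as follows; this is an illustrative implementation, not NAEP's actual monitoring code.

```python
from collections import Counter

def exact_agreement(scores1, scores2):
    """Fraction of responses where two scorers assigned the same score."""
    matches = sum(s1 == s2 for s1, s2 in zip(scores1, scores2))
    return matches / len(scores1)

def cohens_kappa(scores1, scores2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(scores1)
    p_obs = exact_agreement(scores1, scores2)
    c1, c2 = Counter(scores1), Counter(scores2)
    p_chance = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    return (p_obs - p_chance) / (1 - p_chance)

# Hypothetical first and second scores on six constructed responses:
first = [2, 1, 0, 2, 1, 2]
second = [2, 1, 1, 2, 0, 2]
```

Trend agreement is monitored the same way, except the second score comes from rescoring the previous year's responses with the current year's scorers.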
Slide 13: For Further Information
- National Center for Education Statistics (NCES)
  - http://nces.ed.gov/nationsreportcard
  - NAEP Reports
  - NAEP Questions Tool
  - NAEP Data Explorer
- National Assessment Governing Board (NAGB)
  - http://www.nagb.org
  - NAEP Frameworks