Title: Benchmarking with National and International Assessments
1 Benchmarking with National and International
Assessments
- Larry V. Hedges
- Northwestern University
This paper is intended to promote the exchange of
ideas among researchers and policy makers. The
views expressed in it are part of ongoing
research and analysis and do not necessarily
reflect the position of the National Center for
Education Statistics, the Institute of Education
Sciences, or the U.S. Department of Education.
2 Advantages of Benchmarking with International
Assessments
- Assess your population versus other nations
- Permits comparison with an external standard
- However,
- Comparisons are informative only if they compare
like with like
- Countries may not be the most relevant external
standards (because they differ in too many ways)
3 International Assessments are Limited
- International assessments involve compromises
across nations with very different curricula and
education systems
- Compromises are essential to ensure broader
relevance
- However,
- Compromises limit the assessments' relevance for
any local purposes.
4 International Assessments are Limited
- Compromises involve
- Content specifications
- Assessment designs
- Sampling of ages/grades
- Background questions
- International assessments are cross-sectional
- Assessments (cross-sectional or longitudinal) are
not suited for hypothesis testing
5 PISA
- PISA was constructed as a measure of literacy in
reading, math, and science
- It measures life skills rather than academic
skills specifically taught in school
- This has some advantages in constructing a
cross-national assessment
- However, it also limits its usefulness for
monitoring school policy or the outputs of schools.
6 PISA
- Because it is not explicitly tied to school
curriculum, the relation of PISA to school
policies is not obvious
- An assessment of 15-year-olds is not particularly
policy relevant for measuring the output of US
schools
- The background questionnaire material is well
suited to international comparisons, but
imperfect for US purposes
- The temptation to use cross-sectional comparisons
to draw conclusions is likely to be irresistible
7 IEA Studies (TIMSS and PIRLS)
- Another benchmarking possibility is the regular
cycle of IEA international comparative studies
(TIMSS and PIRLS)
- These assessment instruments are tied more
closely to academic skills that are explicitly
taught in schools
- They still offer the possibility of international
standards, but are likely to be more relevant to
curriculum, instruction, and school policies.
8 IEA Studies (TIMSS and PIRLS)
- It should be simpler to draw conclusions about
their relation to school policies involving
instruction
- Their sampling designs permit international
comparison at more relevant ages
- Background material may be more policy relevant,
including extensive material on instruction
- The temptation to draw conclusions from
cross-sectional data will still be there
9 NAEP
- There already is a program of immediately
relevant cross-sectional data collection in the
US: NAEP
- This assessment involves no international
compromises on content specification
- It is the de facto standard for measuring
academic achievement in the US
- It has a long time series of achievement scores
- However, it has only very limited
background information.
10 NAEP
- It is already in place and is a trusted source
of information about achievement
- It is based entirely on US standards
- The sampling is relevant to US policy
- Background material is limited.
- Limited background information means there is
less temptation to draw conclusions
11 NCES Longitudinal Studies
- The NCES longitudinal studies are another
potential source of benchmarking opportunities
- Arguably they are better suited for hypothesis
generation than are the cross-sectional studies
- They have far superior background measurement
(for research purposes)
- They could provide comparisons over a wide age
range