Title: Standards-Based Assessment
Standards-Based Assessment
- Barbara S. Plake
- Buros Center for Testing
- University of Nebraska
Role of Testing in Education
- Tests for high school graduation eligibility
- Tests for grade promotion/retention decisions
- Tests for Kindergarten readiness
- Bush Education Plan (No Child Left Behind)
Is all of this testing necessary?
- Used in educational reform movement
- Used in accountability models
- Used to modify/plan instruction
- Knowing that students are learning is important
- Test results can provide useful information for
these purposes
Usefulness of test information depends on its quality
- In the rush to meet reform/accountability goals, some policy makers have neglected issues of quality
- Concern about the validity of decisions based on ineffective, unfair, and/or inappropriate tests
Focus of Presentation
- The need for a set of technical quality criteria for tests used in standards-based testing
Six Quality Criteria
- Alignment to Standards
- Opportunity to Learn
- Freedom from Bias/Sensitive Situations
- Developmentally Appropriate
- Score Consistency/Reliability
- Appropriate Cutscores/Mastery Levels
Important Components Not Addressed Here
- Retake/due process/appeal policy
- Accommodations
- Score reporting and confidentiality
- Need for multiple indicators
- Administration
- Focus is on psychometric, not policy, issues
Presentation
- Discuss each quality criterion separately
- Give illustrations of the kinds of evidence needed
Alignment to Standards
- Does the assessment measure the standards?
- Requires a look at the match between standards and the tasks/questions on the test
- Content
- Cognitive complexity
- Rigor
Evidence Needed for Alignment
- Panels of experts (sometimes practicing grade/content teachers)
- Evaluate content match (Does this item measure what the standards address?)
- Cognitive match (Is the task demanded by the item cognitively consistent with the cognitive requirements of the standard?)
- Rigor (tasks are non-trivial and appropriate in difficulty)
- Is there sufficient evidence in the assessment to allow for inferences about mastery of the standards? (A tallying sketch follows below.)
Opportunity to Learn
- Are students instructed on the content measured on the test?
- Is there content on the test that is consistent with the standards but not taught prior to the administration of the test?
Evidence Needed for Opportunity to Learn
- Requires an evaluation of the match of instruction to the content of the test
- Usually achieved by having teachers report when/if the content measured on the test is presented to the students
- Some districts require indications in teachers' lesson plans of which standard(s) are being addressed with each lesson
- Some districts communicate this match explicitly to students when introducing a lesson
Developmentally Appropriate
- Are the cognitive and reading demands of the assessment appropriate for the grade level of the students?
- Requires an evaluation of the cognitive and reading-level demands of the test
Evidence for Developmental Appropriateness
- Can be achieved by having panels of grade/content-level teachers and educational psychologists evaluate the match of the test demands to the developmental level of the students
- Readability indices are sometimes used (see the sketch after this list)
- If the test is measuring reading comprehension, the readability level is expected to be close to the grade level of the students
- If the test is measuring non-reading skills, the readability level should probably be lower than the grade level of the test
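A minimal sketch of one widely used readability index, the Flesch-Kincaid grade level. The syllable counter here is a rough heuristic of my own; an operational review would use a validated readability tool:

```python
import re

def count_syllables(word: str) -> int:
    """Rough syllable count: runs of vowels, minus a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

passage = ("The water cycle moves water between the air, the land, and the sea. "
           "Heat from the sun turns water into vapor.")
print(round(flesch_kincaid_grade(passage), 1))  # compare against the target grade
```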
Freedom from Bias/Sensitive Situations
- Are all students treated fairly in the test questions? Are there some questions that might disadvantage certain groups of students?
- Fairness/validity issues: would some groups of students who have the knowledge to answer the questions correctly be disadvantaged by the language in the question?
Evidence for Freedom from Bias
- Test development teams should have workshops on bias/sensitivity issues
- Items can be reviewed by bias review committees
- Evidence of statistical differential item functioning should be gathered (a sketch follows below)
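A common statistical check is the Mantel-Haenszel procedure: examinees are matched on total score, and within each score stratum the odds of answering the item correctly are compared between a reference and a focal group. A minimal sketch with made-up data; operational analyses add continuity corrections, minimum cell counts, and significance tests:

```python
import math
from collections import defaultdict

def mantel_haenszel_ddif(item_scores, groups, total_scores):
    """Mantel-Haenszel DIF index on the ETS delta scale.

    Examinees are stratified by total test score; within each stratum a
    2x2 table of group (0 = reference, 1 = focal) by item response
    (1 = correct, 0 = incorrect) contributes to a pooled odds ratio.
    MH D-DIF = -2.35 * ln(alpha_MH); values near 0 suggest little DIF.
    """
    strata = defaultdict(list)
    for y, g, t in zip(item_scores, groups, total_scores):
        strata[t].append((y, g))
    num = den = 0.0
    for cases in strata.values():
        n = len(cases)
        a = sum(1 for y, g in cases if g == 0 and y == 1)  # reference, correct
        b = sum(1 for y, g in cases if g == 0 and y == 0)  # reference, incorrect
        c = sum(1 for y, g in cases if g == 1 and y == 1)  # focal, correct
        d = sum(1 for y, g in cases if g == 1 and y == 0)  # focal, incorrect
        num += a * d / n
        den += b * c / n
    return -2.35 * math.log(num / den)

# Tiny illustrative call (real analyses need far more examinees):
item = [1, 0, 1, 1, 0, 1, 1, 0]
grp  = [0, 0, 1, 1, 0, 0, 1, 1]
tot  = [20, 20, 20, 20, 25, 25, 25, 25]
print(round(mantel_haenszel_ddif(item, grp, tot), 2))
```

By convention, absolute MH D-DIF values near 0 suggest negligible DIF, while values of roughly 1.5 or more are flagged for committee review.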
Consistency in Scoring
- Are the results trustworthy?
- Requires evidence of score and decision consistency/reliability
- Evidence needs to be provided about score precision at the cutpoint
- Test/retest, internal consistency, agreement in scoring
Evidence Needed for Consistency in Scoring
- Classical/IRT models yield slightly different information
- Overall evidence of score consistency/reliability/generalizability
- SEM at cutscore(s)
- Decision consistency/decision accuracy (a sketch follows below)
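A minimal classical-test-theory sketch of the quantities named above: the overall SEM, Lord's binomial conditional SEM at the cutscore, and a simple two-administration decision-consistency rate. Operational programs would more often use IRT information functions or single-administration estimates such as Livingston-Lewis; all numbers here are illustrative:

```python
import math

def sem_overall(sd: float, reliability: float) -> float:
    """Classical overall standard error of measurement: SD * sqrt(1 - rxx)."""
    return sd * math.sqrt(1.0 - reliability)

def csem_binomial(raw_score: int, n_items: int) -> float:
    """Lord's binomial conditional SEM at a given raw score
    (useful for reporting precision at the cutscore itself)."""
    return math.sqrt(raw_score * (n_items - raw_score) / (n_items - 1))

def decision_consistency(scores_a, scores_b, cut: float) -> float:
    """Proportion of examinees classified the same way (pass/fail)
    on two administrations or parallel forms."""
    same = sum((a >= cut) == (b >= cut) for a, b in zip(scores_a, scores_b))
    return same / len(scores_a)

# e.g., a 50-item test with SD 8, reliability .90, and a cutscore of 30
print(sem_overall(8, 0.90))        # ~2.53 raw-score points
print(csem_binomial(30, 50))       # ~3.5 points for an examinee at the cut
print(decision_consistency([28, 35, 31], [30, 34, 29], 30))
```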
Appropriate Mastery Levels
- Are the standards for passing appropriate to the tasks required in the test?
- Requires that passing/mastery levels be set by a process that takes test difficulty into account
Evidence Needed for Mastery Levels
- Standard setting methods
- Judgmental
- Empirical
- Definitional (through rubric)
- Reasoned, systematic, repeatable procedure
- Errors of measurement considered in the policy decision (measurement error around the cutscore; see the sketch below)
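The modified Angoff procedure is one widely used judgmental method: each panelist estimates, for every item, the probability that a minimally competent examinee answers it correctly, and the averaged sums become the recommended raw cutscore. A minimal sketch with made-up ratings; the one-SEM adjustment at the end illustrates the measurement-error policy option from the last bullet:

```python
# Hypothetical modified-Angoff ratings: ratings[p][i] is panelist p's judged
# probability that a minimally competent examinee answers item i correctly.
ratings = [
    [0.70, 0.55, 0.80, 0.60],  # panelist 1
    [0.65, 0.60, 0.75, 0.55],  # panelist 2
    [0.75, 0.50, 0.85, 0.65],  # panelist 3
]

# Each panelist's implied cutscore is the sum of their item probabilities;
# the recommended raw cutscore is the mean across panelists.
panelist_cuts = [sum(r) for r in ratings]
cutscore = sum(panelist_cuts) / len(panelist_cuts)

# Illustrative policy adjustment: lower the cut by one SEM so that
# measurement error works in the examinee's favor.
sem = 0.35  # assumed SEM for this 4-item example
print(round(cutscore, 2), round(cutscore - sem, 2))
```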
Putting it all together
- These six Technical Quality Criteria address fundamental precepts of measurement
- Validity (content alignment, curriculum integration, fairness, cutscore)
- Reliability (consistency in scoring, SEM at cutscore)
- All of these issues are covered in the AERA, APA, NCME Standards for Educational and Psychological Testing (1999)
Outcome of Meeting the Six Quality Criteria
- Test scores should accurately reflect what students were intended to learn
- Test scores should accurately reflect, in part, what students have been taught
- Test scores should be fair and accurate indicators of what students know and are able to do
- Test scores should be trustworthy, and classification decisions consistent and accurate
If this were true
- We would have a more honest and accurate assessment of how students are performing
- Policy makers' goals of reform would be more likely to be met
- Students would be treated more fairly in decisions about their educational success
These Criteria Are Achievable
- None of these criteria is much different from what is currently being done in several states' testing programs
- If these criteria were clearly articulated to policy makers and test users, they would know what evidence is important and should be required
Goal of Presentation
- Provide a document that could help policy makers and test users evaluate their standards-based testing programs
- Add technical credibility to the educational reform/accountability movement
Thank YOU
- I am pleased to be able to share my expertise in this area
- I hope this information will be useful
- The best outcome for me would be if the information in this presentation resulted in better standards-based tests and testing practices