Standard Setting - PowerPoint PPT Presentation


Title: Standard Setting


1
Standard Setting
Edward Haertel
The Future of Test-Based Accountability
Festschrift in Honor of Robert L. Linn
UCLA, Los Angeles, CA, January 22-23, 2007
2
Overview
  • Standards-Based Score Interpretations
  • Problems with Present Practice
  • Promising Future Directions

3
Standards-Based Score Interpretations
4
Score Interpretations
Criterion-Referenced
Norm-Referenced
Standards-Based
5
Standards-Based Interpretations
  • Most twelfth graders lack even basic knowledge of
    U.S. history (NAEP, 2001)
  • 22% of California 8th graders are at or above
    proficient in Mathematics (NAEP, 2005)
  • 32% of California 8th graders are at or above
    proficient in Mathematics (CDE, 2005)

6
What does "Proficient" mean?
7

"Saying that all students must be ... proficient
... by 2014, but leaving the definition of
proficient ... to the states has resulted in so
much state-to-state variability ... that
proficient has become a meaningless
designation."
Linn (2005), From CRESST Tech Report No. 651
8
Content Stds, Perf Stds, Cutpts
  • (Academic) Content Standards: curriculum
    framework; what's supposed to be taught
  • Performance (Achievement) Standard: description
    of what "meeting the standard" is supposed to
    mean
  • Cut Score: minimum test score defined as "pass"
    or "at-or-above"
  • "Operational definition" of performance standard

9
Performance Standards
"Proficient represents solid academic performance
for each grade assessed. Students reaching this
level have demonstrated competency over
challenging subject matter, including
subject-matter knowledge, application of such
knowledge to real-world situations, and
analytical skills appropriate to the subject
matter."
Definition used for NAEP, also used by California
10
"Proficient" for 8th grade math
  • Be able to conjecture, defend ideas, give
    examples
  • Understand connections among fractions, percents,
    decimals, ... algebra, functions
  • Be able to convey underlying reasoning skills
    beyond ... arithmetic
  • Accurately use the tools of technology

Excerpts from 150-word definition for NAEP
11
Problems with Present Practice
12
Problems with Perf Stds
  • Vague, ill-defined performance standard
  • No performance standard at all
  • Performance standard not aligned with test
  • Performance standard with excess meaning

13
Common Origin Is Not Enough
Content Standards
Test Specification
Performance Standard
?
Test
14
Interpretive (Validity) Argument
  • Content Standards
      • Is it clear what test is supposed to measure?
  • Alignment
      • Does test measure what it is supposed to?
  • Accuracy and Precision
      • Adequate reliability, freedom from bias, etc.?
  • Performance Standard
      • Clear, appropriate, aligned with test?
  • Cut Score
      • Accurately matched to performance standard?

15
Problems with Cut Scores
  • Test-Centered Methods
      • Angoff, Modified Angoff, Bookmark
  • Person-Centered Methods
      • Contrasting Groups, Borderline Group
  • Performance-Centered Methods
      • Body of Work

16
Test-Centered Methods
"... As the Panel's studies demonstrate, the
Angoff ... and other item-judgment methods are
fundamentally flawed. Minor improvements ...
cannot overcome the nearly impossible cognitive
task of estimating the probability that a
hypothetical student at the boundary of a given
achievement level will get a particular item
correct."
Shepard, Glaser, Linn, and Bohrnstedt (1993)
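
To make the judgment task in the quotation concrete, here is a minimal sketch of how an Angoff-style cut score is commonly computed from such probability estimates. The ratings are made-up illustrative numbers, and the function name is hypothetical, not taken from the talk.

```python
from statistics import mean

# Hypothetical ratings: ratings[p][i] is panelist p's estimate of the
# probability that a borderline examinee answers item i correctly.
ratings = [
    [0.6, 0.4, 0.8, 0.5],
    [0.7, 0.5, 0.7, 0.4],
    [0.5, 0.3, 0.9, 0.6],
]


def angoff_cut_score(ratings):
    """Mean, over panelists, of each panelist's summed item probabilities."""
    # Each panelist's sum is that panelist's expected raw score for the
    # hypothetical borderline examinee.
    panelist_totals = [sum(item_probs) for item_probs in ratings]
    return mean(panelist_totals)


cut = angoff_cut_score(ratings)
print(f"Angoff cut score (raw-score scale): {cut:.2f}")
```

Every number entering the cut score is a judged probability for a hypothetical examinee, which is the step the quotation calls nearly impossible.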
17
A word about the "Bookmark"
  • Pros
      • Seems superior to Angoff
  • Cons
      • Judgment locus is still performance of
        hypothetical borderline examinee
      • Relies on arbitrary mastery probability convention
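
The dependence on a mastery probability convention can be sketched as follows, assuming a logistic IRT item model (an assumption; the slides do not name a model) and holding the bookmarked item fixed. In practice the chosen response probability (RP) can also change the item ordering shown to panelists, which this sketch ignores. The difficulty value and the function name are hypothetical.

```python
import math


def bookmark_theta(b, rp, a=1.0):
    """Ability at which an item with difficulty b and discrimination a is
    answered correctly with probability rp under a logistic IRT model."""
    if not 0.0 < rp < 1.0:
        raise ValueError("rp must be strictly between 0 and 1")
    # Solve rp = 1 / (1 + exp(-a * (theta - b))) for theta.
    return b + math.log(rp / (1.0 - rp)) / a


bookmarked_item_difficulty = 0.2  # hypothetical value
for rp in (0.50, 0.67, 0.80):
    theta = bookmark_theta(bookmarked_item_difficulty, rp)
    print(f"RP = {rp:.2f}: cut score on the ability scale = {theta:.2f}")
```

The same bookmark placement yields a higher cut score as the RP convention rises, even though nothing about the panelists' judgment has changed.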

18
Person-Centered Methods
  • Pros
      • Real-world judgment locus
      • "Reality check" as to accuracy of classification
  • Cons
      • Suspect basis of person classifications
      • Weak theoretical foundation
      • Dubious basis for generalization to other
        examinees/times/places
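
One simple variant of the Contrasting Groups idea can be sketched as choosing the cut score that misclassifies the fewest examinees, given outside judgments of who is and is not proficient. The scores and the tie-breaking rule below are illustrative assumptions, and other variants (for example, intersecting smoothed score distributions) exist.

```python
def contrasting_groups_cut(not_proficient, proficient):
    """Return the cut score (pass if score >= cut) that misclassifies the
    fewest judged examinees; ties go to the lowest candidate cut."""
    candidates = sorted(set(not_proficient) | set(proficient))

    def errors(cut):
        # Judged not proficient but passing, plus judged proficient but failing.
        false_pass = sum(s >= cut for s in not_proficient)
        false_fail = sum(s < cut for s in proficient)
        return false_pass + false_fail

    return min(candidates, key=errors)


# Hypothetical test scores for two groups classified by outside judgment.
not_proficient = [12, 15, 17, 18, 20, 22]
proficient = [19, 21, 23, 25, 27, 30]
cut = contrasting_groups_cut(not_proficient, proficient)
print(f"Cut score minimizing misclassification: {cut}")
```

The result is only as sound as the group classifications fed in, and with small samples several cut scores may tie, which connects to the cons listed above.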

19
Performance-Centered Methods
  • Pros
      • Locus of judgment is direct sample of student
        performance
  • Cons
      • Limits Performance Standard to performance on
        test-like tasks
      • Largely limited to assessments of writing or
        assessments using constructed-response

20
Promising Future Directions
21
Clear communication, modest claims
  • Unelaborated labels like "proficient" invite
    surplus meaning
  • There are no incentives to discourage
    misinterpretation
  • Vivid, real-world examples (e.g., representative
    student work samples) can help

22
Challenging but realistic goals
"Ambitious expectations are desirable to
encourage concentrated effort on the part of
educators and students. However, in order for the
expectations to be met, educators and students
must have the capacity to meet the targets that
are set."
Linn (2005), From CRESST Tech Report No. 650
23
Benchmarks, not "standards"
  • TIMSS Example
      • Benchmarks at 25th, 50th, 75th, 90th percentiles
        of international achievement distribution
      • E.g., "In 1999, 61% of U.S. 8th graders scored
        above the international median performance level"
  • Other options
      • U.S. distribution at fixed point in time
      • "Grade Level" standards (cf. Linn, 2005)
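
A minimal sketch of benchmark-style reporting, assuming benchmarks are simply percentiles of a reference score distribution. The reference scores, the group scores, and the function names are made up for illustration and are not TIMSS data or TIMSS procedures.

```python
from statistics import quantiles


def percentile_benchmarks(reference_scores, percentiles=(25, 50, 75, 90)):
    """Map each requested percentile to its score in the reference group."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles,
    # so the p-th percentile sits at index p - 1.
    cuts = quantiles(reference_scores, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in percentiles}


def percent_above(scores, benchmark):
    """Percentage of scores strictly above the benchmark."""
    return 100.0 * sum(s > benchmark for s in scores) / len(scores)


reference = list(range(300, 700, 4))  # hypothetical reference distribution
group = [480, 510, 530, 455, 560, 600, 495, 520]  # hypothetical group

benchmarks = percentile_benchmarks(reference)
share = percent_above(group, benchmarks[50])
print(f"Benchmarks: {benchmarks}")
print(f"Percent of group above the reference median: {share:.1f}")
```

A statement of this kind reports where a group stands relative to a known distribution, without attaching a label such as "proficient" to the benchmark.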

24
Better methods
  • Briefing Book Method?
      • Choose maybe 10 possible cut scores, and tell
        what each means and implies
      • Empirically derived performance standard
      • Projected passing rate
      • Subgroup impacts
      • School-level distribution of passing
      • "Real-world" links if required by language of
        performance standard
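
The tabulation behind a briefing book of this kind might look like the following sketch, which reports the projected passing rate overall and by subgroup for each candidate cut score. The records, the subgroup labels, and the function name are hypothetical, and school-level distributions are omitted for brevity.

```python
from collections import defaultdict


def briefing_table(records, candidate_cuts):
    """records: list of (score, subgroup) pairs. Returns, for each candidate
    cut, the projected passing rate overall and by subgroup."""
    by_group = defaultdict(list)
    for score, group in records:
        by_group[group].append(score)
    all_scores = [score for score, _ in records]

    def rate(scores, cut):
        # Fraction of examinees scoring at or above the cut.
        return sum(s >= cut for s in scores) / len(scores)

    return {
        cut: {
            "overall": rate(all_scores, cut),
            "by_subgroup": {g: rate(s, cut) for g, s in by_group.items()},
        }
        for cut in candidate_cuts
    }


# Hypothetical examinee records: (test score, subgroup label).
records = [
    (35, "A"), (42, "A"), (50, "A"), (58, "A"),
    (30, "B"), (38, "B"), (44, "B"), (52, "B"),
]
table = briefing_table(records, candidate_cuts=(40, 45, 50))
for cut, row in table.items():
    print(cut, row)
```

Laying the consequences of each candidate side by side lets the choice of cut score be made with its implications in view.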

25
Conclusions
  • Standards-based score interpretations will be
    around for a while
  • Vague definitions invite surplus meanings
  • Flawed standard-setting methods and overuse of
    "Proficient" label add to confusion
  • Better reporting would help right now
  • Better alternatives (e.g., benchmarks) exist
  • Better methods (e.g., "briefing book") may be
    developed
