Assessment Adjustments Updates - PowerPoint PPT Presentation

About This Presentation
Title:

Assessment Adjustments Updates

Description:

CBT/PBT Comparability issues related to Equating ... Because the CBT and PBT testers are not randomly assigned, we need to adjust for ... – PowerPoint PPT presentation

Slides: 23
Provided by: scottm64

Transcript and Presenter's Notes

Title: Assessment Adjustments Updates


1
Assessment Adjustments Updates
  • Debbie Swensen
  • USOE
  • January 10, 2008

2
The Primary Issues
  • Math CRT Implications resulting from curriculum
    changes
  • CBT/PBT Comparability issues related to Equating

3
Math CRT Implications resulting from curriculum
changes
4
Math
  • Test is based on 2007 content standards (Core),
    while instruction is supposed to be based on 2008
    Core
  • Changes evident in the new Core generally require
    the teaching of key concepts earlier than was the
    case with the old Core
  • NCTM Focal Points guide new core
  • Implementation fidelity of the new Core is
    uncertain; however, there is strong anecdotal
    evidence of high implementation

5
Why Not Break the Chain?
  • Why not just say the two tests are different and
    reset standards now?
  • Still using old test, so we couldn't even set
    standards on the new Core if we wanted
  • Will have more of a chance to teach the new Core
    with a year of preparation

6
A Bridge
  • Therefore, we must find a way to bridge the 2007
    and 2008 test scores that:
  • Is defensible for AYP and UPASS calculations
  • Supports good instruction
  • Validly reflects the general interpretation of
    the results (that the test results reflect the
    new curriculum)

7
Plans for Scoring the Math CRTs
  • Provide scores based only on those items that
    match the new Core

8
Plans for Scoring the Math CRTs: Support for
Decision
  • The PAC and the district assessment directors
    expressed concern with having the scores for the
    2008 test include content that was not supposed
    to be taught and learned
  • Members of the TAC originally questioned this
    assumed level of implementation fidelity, but the
    district representatives as well as the USOE
    curriculum section feel quite strongly that
    teachers have shifted to the new Core for this
    school year
  • Evidence:
  • Core Academy
  • District plans and Professional Development
  • USOE-generated old/new curriculum comparison
    tools

9
Plans for Scoring the Math CRTs: Support for
Decision
  • The TAC suggested several studies to evaluate
    both the instructional sensitivity of the test
    items and the ability of schools to shift
    instruction to fully implement the new Core

10
Scoring Plan Details
  • The TAC recommends including only items that
    match the 2008 Core in the test scores
  • In other words, items that do not match the new
    Core at all will be deleted from scoring
  • Why not delete these items from the test overall?
  • Production realities
  • Only a few of these items per test
  • Remaining items will not distract from overall
    test response

11
Raw score reports
  • The raw score reports generated from either the
    CBT or the USOE-scanned paper tests will be
    based ONLY on the items used to generate the
    student scores
  • For example, 4th grade student scores and raw
    score reports will be based on the 60 items
    remaining from the original 65-item set
  • Subscore (standard/objective) reports will be
    based on the 2008 Core objectives
  • Some objective reports will be based on fewer
    items than we would feel comfortable reporting,
    given validity concerns, but this is the best
    we can do this year
  • This will provide excellent reasons to provide
    education on appropriately interpreting test
    results
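
The scoring rule on this slide (count only the items that match the new Core) can be sketched as follows. This is a minimal illustration and not USOE's scoring software: the item numbers, answer key, and function name are hypothetical, and only the 65-item and 60-item counts come from the slide.

```python
def raw_score(responses, key, excluded_items):
    """Count correct answers over the scored items only.

    responses, key: dicts mapping item id -> chosen / correct option.
    excluded_items: ids dropped from scoring (hypothetical ids here).
    """
    scored = [item for item in key if item not in excluded_items]
    correct = sum(1 for item in scored if responses.get(item) == key[item])
    return correct, len(scored)


# Hypothetical 65-item form; 5 items do not match the new Core.
key = {i: "A" for i in range(1, 66)}
excluded = {3, 17, 29, 41, 58}            # illustrative item numbers only
responses = {i: "A" for i in range(1, 66)}
responses[10] = "B"                       # wrong answer on a scored item
responses[17] = "C"                       # wrong answer on an excluded item: ignored

score, max_score = raw_score(responses, key, excluded)
print(score, max_score)                   # 59 60
```

The excluded items stay on the form the student sees; they simply never enter the numerator or the denominator of the reported score.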

12
CBT/PBT Comparability issues related to Equating
13
Scaling and Equating Plans for 2008: Wrestling
with Issues of Comparability
  • Lord's maxim: the only valid equating design is
    when the same examinees are taking the same test
    items. In other words, we don't need to equate!
  • Given that the best equating plans violate
    Frederic Lord's maxim about equating, we know that
    we are facing an uphill challenge here
  • The USOE national technical advisory committee
    (TAC) recommended the following general approach
    for equating the 2008 and 2007 scales

14
Premises for Decisions
  • We do not expect to find a significant degree of
    variance due to modality
  • We will not disadvantage CBT testers

15
2008 CRT Equating Framework
  • We will conduct our normal equating to place the
    2008 scores on the 2007 scale, except that we
    will base the equating on the 2008 CBT results to
    serve as a basis for going forward to 2009 and
    beyond
  • Based on what we find for mode differences (we do
    not expect to find much variance; see following
    slides), we will adjust the scores such that the
    CBT testers will not be disadvantaged (for this
    year only)
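
To make "placing 2008 scores on the 2007 scale" concrete, here is a small sketch using linear (mean and standard deviation) equating. The slides do not say which equating method USOE normally uses, so this is an assumed stand-in chosen for simplicity; the function name and all score values are made up.

```python
from statistics import mean, pstdev


def linear_equate(new_scores, ref_scores):
    """Return a function mapping a new-form score onto the reference scale.

    Matches the mean and standard deviation of the two score
    distributions (linear equating). Assumed method, for illustration.
    """
    mu_x, sd_x = mean(new_scores), pstdev(new_scores)
    mu_y, sd_y = mean(ref_scores), pstdev(ref_scores)
    slope = sd_y / sd_x
    return lambda x: mu_y + slope * (x - mu_x)


# Made-up raw scores from 2008 CBT testers and made-up 2007 scale scores.
scores_2008_cbt = [20, 30, 40]
scores_2007 = [140, 160, 180]

to_2007_scale = linear_equate(scores_2008_cbt, scores_2007)
print(round(to_2007_scale(30), 2), round(to_2007_scale(40), 2))
```

Basing the conversion on the CBT group, as the slide proposes, means the same function could be reused in later years when more students test on computer.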

16
2008 CRT Equating Framework
  • USOE technical advisors and the TAC will continue
    to work on the specifics of this approach in
    coming weeks
  • Because the CBT and PBT testers are not randomly
    assigned, we need to adjust for pre-existing
    performance differences
  • Once we adjust for these pre-existing differences
    in student performance, we will then evaluate the
    differences due to test modality
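
One common way to separate a mode effect from pre-existing differences is a covariance adjustment on a prior-year score. The sketch below assumes that approach; the slides leave the specifics to the technical advisors, so this is not a description of the method they adopted. The function name and data are hypothetical.

```python
from statistics import mean


def adjusted_mode_effect(cbt, pbt):
    """cbt, pbt: lists of (prior_score, current_score) pairs.

    Returns (raw_diff, adjusted_diff), both CBT minus PBT. The
    adjustment uses the pooled within-group slope of current score on
    prior score, as in a classical analysis of covariance.
    """
    def summarize(group):
        xs = [x for x, _ in group]
        ys = [y for _, y in group]
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in group)
        return mx, my, sxx, sxy

    mx_c, my_c, sxx_c, sxy_c = summarize(cbt)
    mx_p, my_p, sxx_p, sxy_p = summarize(pbt)
    slope = (sxy_c + sxy_p) / (sxx_c + sxx_p)
    raw_diff = my_c - my_p
    adjusted_diff = raw_diff - slope * (mx_c - mx_p)
    return raw_diff, adjusted_diff


# Made-up data: CBT testers started 10 points higher and stayed 10 higher.
cbt = [(160, 160), (170, 170), (180, 180)]
pbt = [(150, 150), (160, 160), (170, 170)]

raw_diff, adj_diff = adjusted_mode_effect(cbt, pbt)
print(raw_diff, adj_diff)   # raw gap of 10 disappears after adjustment
```

In this toy case the whole raw gap is explained by prior performance, so nothing is left to attribute to test modality.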

17
Meta-Analysis of Multiple-Choice Comparability
Studies (from Neal Kingston)
  • Looked for K-12 studies of multiple-choice tests
    published or presented 1997-2006
  • Journal tables of contents
  • Conference programs
  • Internet search (Google, Google Scholar)
  • Contacting researchers
  • Found 14 usable articles reporting on 81 studies
  • Sample sizes ranged between 42 and 4,333

18
From Neal Kingston
19
Calculate Effect Sizes (N. Kingston)
Positive number means did better on computer;
negative means did better on paper
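
The sign convention can be illustrated with a standardized mean difference (computer minus paper, divided by the pooled standard deviation). The slide does not state which effect size index Kingston used, so this index is an assumption, and the scores are made up.

```python
from math import sqrt
from statistics import mean, variance


def effect_size(computer_scores, paper_scores):
    """Standardized mean difference: positive favors computer."""
    n_c, n_p = len(computer_scores), len(paper_scores)
    pooled_var = (
        (n_c - 1) * variance(computer_scores)
        + (n_p - 1) * variance(paper_scores)
    ) / (n_c + n_p - 2)
    return (mean(computer_scores) - mean(paper_scores)) / sqrt(pooled_var)


# Made-up scores for one hypothetical study.
d = effect_size([52, 55, 58], [50, 53, 56])
print(round(d, 3))   # positive: the computer group scored higher
```

Swapping the two arguments flips the sign, which is why the convention needs to be stated alongside any reported value.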
20
Note: This plot indicates that, over all of the
studies, there was NO average effect of CBT vs.
PBT (from N. Kingston)
[Plot labels: Paper, Computer, median]
21
Weighted Mean Effect Size by Grade (Note that all
effects hover close to zero, Kingston, 2007)
22
Weighted Mean Effect Size by Subject (Note all
effects hover around zero, Kingston, 2007)
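
A weighted mean effect size by grade or subject can be sketched as below. Weighting each study by its sample size is an assumption, since the slides do not give the weighting scheme, and every number in the example is invented for illustration and is not taken from Kingston's data.

```python
from collections import defaultdict


def weighted_mean_effects(studies, by):
    """Sample-size-weighted mean effect size per group.

    studies: list of dicts with keys "n" (sample size), "d" (effect
    size), and the grouping key named by `by`.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for study in studies:
        totals[study[by]][0] += study["n"] * study["d"]
        totals[study[by]][1] += study["n"]
    return {group: wsum / n for group, (wsum, n) in totals.items()}


# Invented studies, for illustration only.
studies = [
    {"grade": 4, "subject": "math", "n": 100, "d": 0.10},
    {"grade": 4, "subject": "reading", "n": 300, "d": -0.02},
    {"grade": 8, "subject": "math", "n": 200, "d": -0.05},
]

by_grade = weighted_mean_effects(studies, "grade")
by_subject = weighted_mean_effects(studies, "subject")
print(by_grade)
print(by_subject)
```

Larger studies pull the group mean toward their own estimate, so a single small study with a large effect has little influence on the reported average.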