On-demand learning-embedded benchmark assessment using classroom-accessible technology

About This Presentation
Title:

On-demand learning-embedded benchmark assessment using classroom-accessible technology

Description:

Study 1: Use of the Rasch model assumes that the selection of items is not informative, but here it probably is. Use of online metrics may be making up for this.

Slides: 25
Provided by: MarkW189
Learn more at: https://www.stat.cmu.edu

Transcript and Presenter's Notes


1
On-demand learning-embedded benchmark assessment
using classroom-accessible technology
  • Discussant Remarks
  • Mark Wilson
  • UC Berkeley

2
Outline
  • What does Validity look like for these papers?
  • What is it that these papers are distinguishing
    themselves from?
  • Where might one go from here?

3
Need for strong concern about validity
  • Effect of NCLB requirements
  • Schools are instituting frequent benchmark
    tests
  • Intended to guide teachers as to students'
    strengths and weaknesses
  • Often just little copies of the State test
  • Teachers are complaining that it puts a vice-like
    grip on the curriculum

4
The Triangle of Learning: standard interpretation
5
The vicious triangle
6
Validity
  • 1999 AERA/APA/NCME Standards for Educational and
    Psychological Testing
  • Five types of validity evidence
  • Evidence based on test content
  • Evidence based on response processes
  • Evidence based on internal structure
  • Evidence based on external structure
  • Evidence based on consequences

7
Paper 1: Falmagne et al. - ALEKS
  • Reliability > Validity
  • "the collection of all the problems potentially
    used in any assessment represents a fully
    comprehensive coverage of a particular
    curriculum, ... hence ... arguing that such an
    assessment, if it is reliable, is also
    automatically endowed with a corresponding amount
    of validity is plausible."

8
Paper 1: Falmagne et al. - ALEKS
  • Test content
  • Theory of the Learning Space
  • inner fringe and outer fringe
  • the summary is meaningful for an instructor
  • Database of Problems
  • "a consensus among educators that the database of
    problems is a comprehensive compendium for
    testing the mastery of a scholarly subject. This
    phase is relatively straightforward."
  • Evidence: Who were the experts? What did they
    do? How much did they agree?

9
Paper 1: Falmagne et al. - ALEKS
  • Evidence based on response processes
  • E.g., for a selected K, do students in K say things
    that are consistent/inconsistent with that?
  • Evidence based on internal structure
  • E.g., for a selected K, do students in K have
    high/low success rates at instances in K?
  • Evidence based on external structure
  • E.g., comparison with teacher judgments of
    student ability
  • Evidence based on consequences
  • E.g., use of fringes: does this help/hinder
    teacher interpretations?

10
Paper 2: Shute et al. - ACED
  • Two validity studies
  • Study 1: Evidence based on external structure
  • Prediction of residuals from external post-test
    after controlling for pre-test
  • Informative design on conditions: elaborated
    feedback is better
  • Study 2: Evidence based on response processes
  • Usability study for students with disabilities
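
The "prediction of residuals" idea in Study 1 can be illustrated with a minimal sketch. It assumes a simple linear regression of post-test on pre-test and then asks how strongly an assessment score correlates with what the pre-test leaves unexplained. The function names and all scores are hypothetical, and this is not necessarily the analysis used in the ACED paper.

```python
from statistics import mean

def residualize(post, pre):
    """Residuals of post-test scores after a least-squares line on pre-test."""
    m_pre, m_post = mean(pre), mean(post)
    sxx = sum((x - m_pre) ** 2 for x in pre)
    sxy = sum((x - m_pre) * (y - m_post) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = m_post - slope * m_pre
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    m_a, m_b = mean(a), mean(b)
    num = sum((x - m_a) * (y - m_b) for x, y in zip(a, b))
    den = (sum((x - m_a) ** 2 for x in a) * sum((y - m_b) ** 2 for y in b)) ** 0.5
    return num / den

# Hypothetical data: six students, illustrative numbers only.
pre = [10, 12, 15, 18, 20, 22]
post = [12, 15, 15, 22, 21, 27]
assessment = [0.2, 0.5, 0.1, 0.8, 0.3, 0.9]  # hypothetical embedded-assessment estimates

resid = residualize(post, pre)
# Correlation between the assessment score and post-test variation
# that the pre-test does not account for.
print(pearson(assessment, resid))
```

By construction the residuals are uncorrelated with the pre-test, so any remaining correlation with the assessment score reflects information beyond the pre-test.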

11
Paper 2: Shute et al. - ACED
  • Evidence based on test content
  • reference to earlier paper
  • Evidence based on internal structure
  • Could easily be investigated, as there is
    interesting internal structure (Fig. 1)
  • Evidence based on consequences
  • Probably not any real consequences yet

12
Paper 3: Heffernan et al. - ASSISTment System
  • Evidence based on test content
  • Items coded by 2 experts, 7 hrs.
  • skill of Venn Diagram
  • Evidence based on internal structure
  • Which skill model fits best: 1, 5, 39, or 106
    skills?
  • Which number is different?
  • 4.10, 4.11, 4.12, 4.10, 4.10
  • 1, 5, 39, 106 (twice)

13
Paper 3: Heffernan et al. - ASSISTment System
  • Evidence based on external structure
  • Prediction of MCAS

23/38 = 61% don't fit well for the best model
(WPI-39 (B)).
14
Paper 3: Heffernan et al. - ASSISTment System
  • Evidence based on response processes
  • ?
  • Evidence based on consequences
  • Probably are real consequences

15
Paper 4: Junker - ASSISTment System
  • Two validity studies
  • Study 1: Evidence based on external structure
  • Prediction of MCAS scores
  • Study 2: Evidence based on internal structure
  • 4 internal structure patterns
  • 2 questions
  • Q1: Regarding how scaffolds get easier, what
    happens when you get a scaffold wrong?
  • Q2: What about the gap?

16
(No Transcript)
17
Paper 4: Junker - ASSISTment System
  • Rest of types of validity--see Paper 3

18
Looking Beyond
  • What does this group of papers have to offer?
  • What should it be looking out for?

19
Paper 1: Falmagne et al. - ALEKS
  • Inner and Outer Fringe
  • What do teachers think of them, what do they do
    with them?
  • Standardized tests, psychometrics as straw
    men
  • Alternative: compare one's work to the latest
    developments in item response modeling (e.g.,
    EIRM)

20
Paper 2: Shute et al. - ACED
  • Weight of Evidence
  • Good alternative to Fisher information
  • Transparent, easily interpretable
  • Models for people with disabilities
  • Most likely going to have different internal
    structure
  • Need to develop broader view of internal
    structure criteria
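
As a rough illustration of why weight of evidence is easy to interpret, the sketch below uses Good's classical definition, the log ratio of how likely an observed response is under a hypothesis H versus its negation, and its expectation over the two outcomes of a binary item. The function names and probabilities are hypothetical, natural logs are an assumption, and the exact formulation used in ACED may differ.

```python
import math

def weight_of_evidence(p_e_given_h, p_e_given_not_h):
    """Log-likelihood ratio (in nats) of evidence E for H versus not-H."""
    return math.log(p_e_given_h / p_e_given_not_h)

def expected_woe(p_correct_given_h, p_correct_given_not_h):
    """Expected weight of evidence of a binary item, averaged over outcomes under H.

    Both probabilities must lie strictly between 0 and 1.
    """
    outcomes = (
        (p_correct_given_h, p_correct_given_not_h),          # correct response
        (1 - p_correct_given_h, 1 - p_correct_given_not_h),  # incorrect response
    )
    return sum(p_h * math.log(p_h / p_n) for p_h, p_n in outcomes)

# Hypothetical items: H = "student has mastered the skill".
sharp_item = expected_woe(0.9, 0.2)  # masters and non-masters respond very differently
weak_item = expected_woe(0.6, 0.5)   # responses barely depend on mastery
print(sharp_item > weak_item)        # the sharper item is expected to be more informative
```

An item selector built on this idea would pick the candidate item with the largest expected weight of evidence for the hypothesis of interest.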

21
Paper 3: Heffernan et al. - ASSISTment System
  • MCAS as starting point for diagnostic testing?
  • Using released items?!?
  • What is unidimensionality?

22
Paper 3: Heffernan et al. - ASSISTment System
  • In a latent class model, the latent class looks
    like this
  • In an item response model (e.g., Rasch model),
    unidimensionality looks like this


See Karelitz, T.M., Wilson, M.R., Draney, K.L.
(2005). Diagnostic Assessment using Continuous
vs. Discrete Ability Models. Paper presented at
the NCME Annual Meeting in San Francisco, CA.
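
The figures for this slide are not in the transcript. As a reminder of the contrast being drawn, the standard textbook forms of the two models are sketched below; the notation (person p, item i, ability \theta_p, difficulty \delta_i, class k, class proportion \nu_k, class-specific success probability \pi_{ik}) is an assumption and is not taken from the slides.

```latex
% Rasch model: a single continuous ability \theta_p orders all persons on one dimension.
P(X_{pi} = 1 \mid \theta_p) = \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)}

% Latent class model: person p belongs to one of K discrete, not necessarily ordered, classes.
P(X_{pi} = 1 \mid c_p = k) = \pi_{ik}, \qquad
P(\mathbf{x}_p) = \sum_{k=1}^{K} \nu_k \prod_{i} \pi_{ik}^{x_{pi}} (1 - \pi_{ik})^{1 - x_{pi}}
```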
23
Paper 4: Junker - ASSISTment System
  • What is the effect of making MCAR/MAR
    assumptions when neither is true?
  • Relevant to all CAT
  • Or of assuming you know the response under NMAR?
  • Is there a discrimination paradox in DINA models?
  • Why do scaffold questions get easier?

24
Future Directions
  • What is a Knowledge State (KS)?
  • How do we test if it's a unitary thing?
  • What if it isn't?
  • Mixture models--structured KSs
  • Do teachers (and other practitioners) find the
    KSs useful?
  • How to adjust if they don't?
  • finer/coarser grained
  • structured