Title: On-demand learning-embedded benchmark assessment using classroom-accessible technology
1. On-demand learning-embedded benchmark assessment using classroom-accessible technology
- Discussant Remarks
- Mark Wilson
- UC Berkeley
2. Outline
- What does validity look like for these papers?
- What is it that these papers are distinguishing themselves from?
- Where might one go from here?
3. Need for strong concern about validity
- Effect of NCLB requirements
- Schools are instituting frequent benchmark tests
  - Intended to guide teachers as to students' strengths and weaknesses
  - Often just little copies of the state test
- Teachers are complaining that this puts a vice-like grip on the curriculum
4. The Triangle of Learning: standard interpretation
5. The vicious triangle
6. Validity
- 1999 AERA/APA/NCME Standards for Educational and Psychological Testing
- Five types of validity evidence:
  - Evidence based on test content
  - Evidence based on response processes
  - Evidence based on internal structure
  - Evidence based on external structure
  - Evidence based on consequences
7. Paper 1: Falmagne et al. (ALEKS)
- Reliability > Validity
- "the collection of all the problems potentially used in any assessment represents a fully comprehensive coverage of a particular curriculum, ... hence ... arguing that such an assessment, if it is reliable, is also automatically endowed with a corresponding amount of validity is plausible."
8. Paper 1: Falmagne et al. (ALEKS)
- Test content
  - Theory of the learning space
    - inner fringe and outer fringe
    - the summary is meaningful for an instructor
  - Database of problems
    - "a consensus among educators that the database of problems is a comprehensive compendium for testing the mastery of a scholarly subject. This phase is relatively straightforward."
  - Evidence: Who were the experts? What did they do? How much did they agree?
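The fringe notions above can be computed directly from a knowledge structure: the inner fringe of a state K holds the items just mastered, the outer fringe the items the student is ready to learn. The following is a toy sketch with a hypothetical three-item structure, not the ALEKS implementation:

```python
# Toy sketch of knowledge-space fringes; not the ALEKS implementation.
# A knowledge structure is a family of feasible knowledge states (item sets).

def inner_fringe(K, structure):
    """Items just mastered: q in K such that K minus {q} is also a state."""
    return {q for q in K if frozenset(K - {q}) in structure}

def outer_fringe(K, structure, items):
    """Items ready to learn: q outside K such that K plus {q} is a state."""
    return {q for q in items - K if frozenset(K | {q}) in structure}

# Hypothetical three-item structure (illustrative only).
items = {"a", "b", "c"}
structure = {frozenset(), frozenset({"a"}), frozenset({"a", "b"}),
             frozenset({"a", "c"}), frozenset(items)}
K = frozenset({"a", "b"})
print(sorted(inner_fringe(K, structure)))          # ['b']
print(sorted(outer_fringe(K, structure, items)))   # ['c']
```

This is the sense in which "the summary is meaningful for an instructor": the two fringes compress a whole state into what was just learned and what to teach next.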
9. Paper 1: Falmagne et al. (ALEKS)
- Evidence based on response processes
  - E.g., for a selected knowledge state K, do students in K say things that are consistent/inconsistent with it?
- Evidence based on internal structure
  - E.g., for a selected K, do students in K have high/low success rates on instances in K?
- Evidence based on external structure
  - E.g., comparison with teacher judgments of student ability
- Evidence based on consequences
  - E.g., use of fringes: does this help or hinder teacher interpretations?
10. Paper 2: Shute et al. (ACED)
- Two validity studies
- Study 1: Evidence based on external structure
  - Prediction of residuals from an external post-test after controlling for pre-test
  - Informative design on conditions: elaborated feedback better
- Study 2: Evidence based on response processes
  - Usability study for students with disabilities
11. Paper 2: Shute et al. (ACED)
- Evidence based on test content
  - Reference to earlier paper
- Evidence based on internal structure
  - Could easily be investigated, as there is interesting internal structure (Fig. 1)
- Evidence based on consequences
  - Probably not any real consequences yet
12. Paper 3: Heffernan et al. (ASSISTment System)
- Evidence based on test content
  - Items coded by 2 experts, 7 hrs.
  - skill of "Venn Diagram"
- Evidence based on internal structure
  - Which skill model fits best: 1, 5, 39, or 106 skills?
  - Which number is different?
    - 4.10, 4.11, 4.12, 4.10, 4.10
    - 1, 5, 39, 106 (twice)
13. Paper 3: Heffernan et al. (ASSISTment System)
- Evidence based on external structure
  - Prediction of MCAS
  - 23/38 (61%) don't fit well under the best model (WPI-39 (B))
14. Paper 3: Heffernan et al. (ASSISTment System)
- Evidence based on response processes
  - ?
- Evidence based on consequences
  - There probably are real consequences
15. Paper 4: Junker (ASSISTment System)
- Two validity studies
- Study 1: Evidence based on external structure
  - Prediction of MCAS scores
- Study 2: Evidence based on internal structure
  - 4 internal structure patterns
  - 2 questions:
    - Q1: Regarding how scaffolds get easier, what happens when you get a scaffold wrong?
    - Q2: What about the gap?
16. (No transcript)
17. Paper 4: Junker (ASSISTment System)
- Remaining types of validity evidence: see Paper 3
18. Looking beyond
- What does this group of papers have to offer?
- What should it be looking out for?
19. Paper 1: Falmagne et al. (ALEKS)
- Inner and outer fringe
  - What do teachers think of them? What do they do with them?
- Standardized tests and psychometrics as straw men
  - Alternative: compare one's work to the latest developments in item response modeling (e.g., EIRM)
20. Paper 2: Shute et al. (ACED)
- Weight of evidence
  - Good alternative to Fisher information
  - Transparent, easily interpretable
- Models for people with disabilities
  - Most likely going to have different internal structure
  - Need to develop a broader view of internal structure criteria
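The weight-of-evidence statistic praised above is, in I. J. Good's sense, just a log likelihood ratio, which is why it is transparent and easy to interpret. A minimal sketch (the probabilities are made-up illustrations, not ACED's values):

```python
import math

def weight_of_evidence(p_e_given_h, p_e_given_not_h):
    """I. J. Good's weight of evidence W(H : e) = log10 P(e|H) / P(e|~H),
    scaled by 100 to read in centibans."""
    return 100 * math.log10(p_e_given_h / p_e_given_not_h)

# Hypothetical item: a proficient student answers correctly with
# probability 0.8, a non-proficient student with probability 0.2.
print(round(weight_of_evidence(0.8, 0.2), 1))   # 60.2 centibans
# Evidence that is equally likely under both hypotheses carries no weight:
print(weight_of_evidence(0.5, 0.5))             # 0.0
```

Because weights on independent observations simply add, a teacher can see exactly how much each response shifted the balance for or against proficiency, which Fisher information does not offer.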
21. Paper 3: Heffernan et al. (ASSISTment System)
- MCAS as starting point for diagnostic testing?
  - Using released items?!?
- What is unidimensionality?
22. Paper 3: Heffernan et al. (ASSISTment System)
- In a latent class model, the latent class looks like this
- In an item response model (e.g., the Rasch model), unidimensionality looks like this
- See Karelitz, T.M., Wilson, M.R., & Draney, K.L. (2005). Diagnostic assessment using continuous vs. discrete ability models. Paper presented at the NCME Annual Meeting, San Francisco, CA.
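The contrast the slide draws can be sketched numerically: under the Rasch model a single continuous ability drives success probability on every item, while a latent class model assigns each examinee to one of a few discrete classes. A minimal illustration, with the usual Rasch symbols theta (ability) and b (difficulty) supplied here for exposition:

```python
import math

def rasch_p(theta, b):
    """Rasch model: P(X = 1 | theta, b) = logistic(theta - b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Unidimensional continuous ability: success probability rises smoothly
# with theta for every item (here one item of difficulty b = 0).
for theta in (-2.0, 0.0, 2.0):
    print(round(rasch_p(theta, 0.0), 3))   # 0.119, then 0.5, then 0.881
# A latent class model instead assigns each examinee to one of a few
# discrete classes, each class carrying its own fixed success
# probability per item, with no ordering along a continuum assumed.
```

The Karelitz, Wilson, and Draney paper cited above compares exactly these continuous and discrete ability representations for diagnostic purposes.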
23. Paper 4: Junker (ASSISTment System)
- What is the effect of assuming MCAR/MAR when neither holds?
  - Relevant to all CAT
  - Or of assuming you know the response under NMAR
- Is there a discrimination paradox in DINA models?
- Why do scaffold questions get easier?
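The discrimination question can be made concrete with the DINA item response function itself; the following is a minimal sketch with made-up slip and guess values, not Junker's implementation:

```python
def dina_p_correct(alpha, q, slip, guess):
    """DINA item response function: eta = 1 iff the examinee's skill
    vector alpha covers every skill the item's q-vector requires;
    P(correct) = 1 - slip if eta == 1, else guess."""
    eta = all(a >= qk for a, qk in zip(alpha, q))
    return (1.0 - slip) if eta else guess

# Hypothetical item requiring skills 1 and 2; slip/guess values made up.
print(dina_p_correct(alpha=[1, 1], q=[1, 1], slip=0.1, guess=0.2))  # 0.9
# Every examinee missing at least one required skill gets the same guess
# probability, so the item cannot discriminate among them:
print(dina_p_correct(alpha=[1, 0], q=[1, 1], slip=0.1, guess=0.2))  # 0.2
print(dina_p_correct(alpha=[0, 0], q=[1, 1], slip=0.1, guess=0.2))  # 0.2
```

The conjunctive collapse shown in the last two lines, where an examinee with one of two required skills looks identical to one with none, is the kind of behavior the "discrimination paradox" question is probing.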
24. Future directions
- What is a knowledge state (KS)?
  - How do we test whether it's a unitary thing?
  - What if it isn't?
    - Mixture models: structured KSs
- Do teachers (and other practitioners) find the KSs useful?
  - How to adjust if they don't?
    - finer/coarser grained
    - structured