Growing Pains: The State of the Art in Value-Added Modeling


1
Growing Pains: The State of the Art in
Value-Added Modeling

Presentation on March 2, 2005 to
Michigan School Testing Conference
By Joseph A. Martineau
Psychometrician
Office of Educational Assessment and Accountability
Michigan Department of Education

2
Why Value Added?

Value Added measures of achievement are being
discussed as a possible addition to the
regulations of No Child Left Behind (NCLB).
Various ways of implementing Value Added in NCLB
are possible
One likely implementation of Value Added is as
another way to make safe harbor if the percent
proficiency targets are not met

3
What is Value Added?

In accountability, Value Added is a term that
describes the part of achievement (or change in
achievement) that is attributable to the
effectiveness of a unit (teacher or school)
Positive estimates indicate units that are above
average, negative estimates indicate that units
are below average
Defining what is attributable to the
effectiveness of a unit is a matter of
philosophical debate

4
The Logic of Value Added

Holding educators accountable for student
performance has many pitfalls
Educators cannot control their students' incoming
achievement
Educators cannot control the effectiveness of
their students' previous teachers/schools
Educators cannot control the effects of
non-instructional student characteristics such
as
Poverty
Parental education
Mobility
Home environment
Etcetera

5
The Logic of Value Added, Continued

Value Added Models (VAM) attempt to obtain pure
estimates of the contribution of educators to
student achievement and/or growth in achievement
The promise of VAM is that educators are held
accountable only for their impact on student
learning
The idea is not rocket science (Sanders), but the
implementation is (Reckase)

6
The Idea Is Not Rocket Science

For each school
Estimate the expected average achievement or gain
score
Calculate the observed average achievement or
gain score
Subtract the expected from the observed average
score
Define the resulting difference between expected
and observed scores as the value added by the
school
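
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not any state's actual model: the school names and gain scores are hypothetical, and using the average gain of all students as every school's expected gain is an assumption chosen for simplicity.

```python
from statistics import mean

# Hypothetical gain scores for students in three hypothetical schools.
gains_by_school = {
    "School A": [12.0, 15.0, 9.0],
    "School B": [8.0, 10.0, 6.0],
    "School C": [11.0, 13.0, 6.0],
}


def value_added(gains_by_school):
    """Return observed-minus-expected average gain for each school.

    Assumption: the expected gain for every school is the average gain
    over all students, the simplest possible choice of expectation.
    """
    all_gains = [g for gains in gains_by_school.values() for g in gains]
    expected = mean(all_gains)
    return {school: mean(gains) - expected
            for school, gains in gains_by_school.items()}


estimates = value_added(gains_by_school)
```

With these made-up numbers the expected gain is 10 points, so School A comes out 2 points above average, School B 2 points below, and School C exactly average.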

7
The Idea Is Not Rocket Science: Adjusting
Achievement Targets to be More Fair to Educators
8
The Idea Is Not Rocket Science: Adjusting Gain
Targets to be More Fair to Educators (Tennessee
Model)
9
The Idea Is Not Rocket Science: Adjusting Gain
Targets to be More Fair to Educators (Dallas
Model)
10
The Idea Is Getting Closer to Rocket
Science: Adjusting Yearly Gain Targets to Meet a
Final Achievement Goal (Thum Model)
11
The Implementation IS Rocket Science: In a
Growth-Based VAM, For Each School You Must

Specify a Mixed Model (a sophisticated
statistical procedure that accounts for the
structure of data coming from multiple occasions
for each student, and multiple students per unit)
Estimate an overall average gain for each school
year, and for the entire set of students and
schools
Estimate a unique expected average gain for each
school year and school
Estimate the difference between the school's
actual average trajectory and the expected
average trajectory for each school year and
school
Keep track of previous schools' effects so that
they don't get counted toward later schools
Estimate a unique expected gain for each school
year, student, and school
Estimate the difference between the expected gain
and the actual gain for each school year,
student, and school
Keep track of all differences across years so
that a student's high growth in one year is not
counted toward all subsequent years
Estimate all of these expected and actual gains
together so that they are unbiased and reliable
Do this all using a sparse data matrix, which
causes ordinary software to choke
So, you write your own software, and develop new
applications of statistical theory to make your
idea work
Communicate the results in an understandable
fashion to stakeholders
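
As a rough orientation only, the simplest model with this flavor can be written as below. The notation is an assumption introduced here for illustration; it is far simpler than the models the steps above describe, and it ignores the carry-over of previous schools' effects and of earlier years' differences.

```latex
g_{tij} = \mu_t + u_{tj} + e_{tij}, \qquad
u_{tj} \sim N(0, \tau^2), \qquad
e_{tij} \sim N(0, \sigma^2)
```

Here $g_{tij}$ is the gain of student $i$ in school $j$ in school year $t$, $\mu_t$ is the overall average gain in year $t$, $u_{tj}$ is the school's departure from that average (the quantity read as value added), and $e_{tij}$ is the student's departure from the school's expected gain.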

12
The Problem with Rocket Science

And with rocket science, many things can cause
large distortions in the results of VAM,
including
Small problems with the scales of measurement
Small programming errors
Small errors in assumptions needed for the
statistical models to work appropriately

13
Statistical Issues in VAM

50 years ago, researchers despaired of ever
being able to measure growth validly, because the
statistical issues seemed insurmountable
Most of the statistical issues have been solved
by the introduction of Statistical Mixed Models

14
Statistical Issues in VAM, Continued

For VAM, one very significant statistical issue
remains
The parts of the statistical models that produce
estimates of Value Added were originally included
in statistical models with the purpose of
accounting for sources of error so that other
effects were easier to identify.
Therefore, estimates of value added can also be
classified as error terms
Estimates of Value Added are technically the
portion of achievement or gains that cannot be
explained by anything else included in the model
In effect, the implementation of a Value-Added
Model says "whatever portion of achievement
and/or growth we do not know how to explain is to
be attributed to schools"
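
A toy calculation makes the point concrete. The sketch below uses hypothetical students and a deliberately crude adjustment (an assumed, fixed poverty effect subtracted from the gain); it is not a mixed model, and it only shows that whatever the model does not explain ends up in the school estimate.

```python
from statistics import mean

# Hypothetical students: (school, in_poverty, gain score).
students = [
    ("School A", False, 12.0), ("School A", False, 12.0),
    ("School B", True, 8.0), ("School B", True, 8.0),
]

# Assumed for illustration: poverty lowers the expected gain by 4 points.
POVERTY_EFFECT = -4.0


def school_residuals(students, adjust_for_poverty):
    """Average unexplained gain per school, i.e. the 'value added'."""
    def explained(in_poverty):
        return POVERTY_EFFECT if (adjust_for_poverty and in_poverty) else 0.0

    # Remove whatever the model explains, then center on the grand mean.
    leftovers = [(s, gain - explained(p)) for s, p, gain in students]
    grand_mean = mean(v for _, v in leftovers)
    schools = sorted({s for s, _ in leftovers})
    return {s: mean(v for t, v in leftovers if t == s) - grand_mean
            for s in schools}


unadjusted = school_residuals(students, adjust_for_poverty=False)
adjusted = school_residuals(students, adjust_for_poverty=True)
```

Without the poverty term, School A looks 2 points above average and School B 2 points below. With it, both estimates are zero. The data are identical; only the decision about what to explain has changed.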

15
Statistical Issues in VAM, Continued

Philosophical, ethical, and political
considerations of attributing to schools all
achievement/gains that cannot be explained any
other way
Do we have to remove differences explained by
ethnicity before we can attribute the rest to
schools?
Do we have to remove differences explained by
poverty before we can attribute the rest to
schools?
Etcetera
Is it possible to ever satisfy the majority of
stakeholders that what's left over is pure enough
to hold schools accountable for?
No matter how we answer these questions, it
raises additional philosophical, ethical, and
political concerns.

16
Ethical Issues in VAM, Continued

VAMs as Currently Implemented
Focus lies squarely on being fair to educators
In TN and OH
All educators are expected to produce the same
average gains in their students
The achievement gap is expected to remain as it
was because educators of lower-achieving groups
of students are not expected to help their
students catch up
In Dallas
All educators are expected to produce gains in
their students that are equivalent to the average
gains achieved by similar groups of students
The achievement gap may be expected to widen
because lower performing groups of students may
achieve lower average gains than other groups of
students
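
The consequence for the achievement gap can be sketched with a small projection. All numbers below are hypothetical, and the sketch assumes every educator exactly meets the expected gain every year; it illustrates the two expectation rules as described on this slide, not the actual Tennessee, Ohio, or Dallas computations.

```python
# Hypothetical starting scores and historical average yearly gains
# for two student groups; none of these numbers come from real data.
start = {"higher": 60.0, "lower": 50.0}
group_avg_gain = {"higher": 10.0, "lower": 8.0}
overall_avg_gain = 9.0


def gap_after(years, expected_gain):
    """Gap between groups if every educator exactly meets expectations."""
    end = {g: start[g] + years * expected_gain[g] for g in start}
    return end["higher"] - end["lower"]


# Same expected gain for everyone (the slide's description of TN and OH).
same_for_all = {g: overall_avg_gain for g in start}
# Expected gain equals the average for similar students (Dallas).
similar_groups = dict(group_avg_gain)

gap_same = gap_after(5, same_for_all)
gap_similar = gap_after(5, similar_groups)
```

Starting from a 10-point gap, five years of meeting the same-for-all expectation leaves the gap at 10 points, while five years of meeting the similar-groups expectation widens it to 20.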

17
Ethical Issues in VAM, Continued

Where does VAM take into account fairness for
low-performing students?
Currently implemented VAMs say basically, "I need
to see one year's growth for one year of
instruction," where (as in the Dallas model) one
year's worth of growth can be less for some
groups of students than for others
Because of concerns about being fair to
educators, groups of students that start out
behind are left behind by the same amount (or
even more)
The Thum model is a compromise that expects
modestly more of educators serving low-achieving
students, and expects the gap to be closed over
many grades
Not really a VAM
A mixture of status and growth

18
Political Issues in VAM

Complexity
Rocket Science is a political liability
As more of the statistical and ethical issues of
VAM are addressed, VAMs are likely to become even
more inaccessible to the lay audience
VAM requires an extraordinary amount of trust in
those who implement the system
Ethical issues will be decided by a political
process that does not necessarily account for the
best interest of students and educators, e.g.
Dallas: Focus on best interests of educators at
the possible price of increasing achievement gaps
TN, OH: Focus on best interests of educators at
the possible price of leaving achievement gaps as
they are
Thum: Focus on best interests of low-performing
groups at the possible expense of (1)
high-performing groups of students, and (2)
making low-achieving schools less attractive to
qualified teachers
The state of the art in VAM is incapable of
providing for both high achievement for all
students and fairness in evaluating educators of
lower-performing students

19
Measurement Issues in VAM

With most of the statistical issues in VAM
solved, the measurement issues have been
forgotten in the excitement

20
Measurement Issues in VAM, Continued

VAM assumes that the same thing is being measured at
every grade level of the test
Presents a dilemma
In order to measure validly, we have to measure
what is being taught, which changes over grade
levels
In order to calculate growth, gains, and
value-added, we have to measure the same thing
every time we measure
Value added models are being applied to
construct-shifting scales as if the scales were
interval-level measures of student achievement on
unchanging content

21
Cautions in using Vertical Scales

Scholars have been warning against the use of
construct-shifting scales to measure growth for
50 years
However, the use of vertical scales in growth
models has become increasingly prevalent in
scholarly literature with the advent of recent
statistical developments (HLM and SEM)
So am I just straining at gnats?
Can't I just use vertical scales to measure
growth?
What harm can it do?
How big is the effect of changing content on
growth- and growth-based value-added models?

22
Hypothetical example

A vertically scaled mathematics test
Grades 3-8
Composed of only two constructs
Basic Computation (BC)
Problem Solving (PS)
BC is heavily represented in early grades
PS is heavily represented in later grades
Only the single, combined math score is available
(BC and PS are just in the background)
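
The setup can be mimicked with a toy composite score. The content weights and student scores below are hypothetical, and a weighted average is a much cruder device than a real vertical scale; the sketch only shows how a shifting content mix can move the combined score when the underlying skills have not moved at all.

```python
# Hypothetical share of Basic Computation (BC) items at each grade;
# the rest of each test is Problem Solving (PS).
bc_weight = {3: 0.8, 4: 0.7, 5: 0.6, 6: 0.4, 7: 0.3, 8: 0.2}


def combined_score(grade, bc, ps):
    """Single reported math score: a content-weighted mix of BC and PS."""
    w = bc_weight[grade]
    return w * bc + (1 - w) * ps


# A hypothetical student who is strong in BC, weak in PS, and who
# learns nothing at all between grade 3 and grade 8.
bc, ps = 70.0, 40.0
score_g3 = combined_score(3, bc, ps)
score_g8 = combined_score(8, bc, ps)
apparent_gain = score_g8 - score_g3
```

With these numbers the combined score drops by about 18 points from grade 3 to grade 8 even though neither BC nor PS changed. A growth model that sees only the combined score would read that drop as negative growth.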

23
Hypothetical example
24
Hypothetical Example
25
Hypothetical Example
26
The Effects of Construct Shift

Construct shift affects
The estimation of educational effectiveness (the
results of Value-Added Models)
Does not accurately identify effectiveness if
student achievement is outside the range measured
well by the grade-level test
Attributes effectiveness of prior
teachers/schools to current teachers/schools
(violates the promise of Value-Added Models)

28
Reliability

Ratio of construct-related variance to total
variance (construct-related plus
non-construct-related variance)
Extend to Value-Added Models
Ratio of variance in true value added to total
variance (true value-added variance plus variance
of distortions)
How important is this distortion, especially when
the constructs are correlated?
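
The ratio itself is simple to compute once the two variance components are known. The sketch below implements only the definition given on this slide, with hypothetical variances; it does not implement the upper bound discussed on the next slide, and it says nothing about how the components would be estimated.

```python
def reliability(true_variance, distortion_variance):
    """Share of total variance that is true value added."""
    total = true_variance + distortion_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total


# Hypothetical variance components.
r = reliability(true_variance=4.0, distortion_variance=1.0)
```

Here the distortion variance is a quarter of the true value-added variance, which gives a reliability of 0.8.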

29
Reliability
Martineau (in press) derived an upper bound
on reliability of VAM

Affected by content balance (more balanced means
lower reliability)
Affected by correlation in value added (higher
correlation means higher reliability)
Affected by grade level (later grades have lower
reliability)
Affected by magnitude of changes in content
across grades (larger changes mean lower
reliability)

30
Reliability of VAM Results
31
Reliability

Only in extraordinary circumstances are the
results reliable enough for high-stakes use
For research use, the results may be reliable
enough in some limited circumstances

32
Alleviating low reliability of value-added
analyses

Twice-a-year testing
Not politically viable
Completely eliminates low reliability
Once-yearly testing, new equating design
Embed the entire set of below-grade items on the
current grade test by including a small portion
of the set on each of multiple test forms
Calibrate a separate vertical scale for each
adjacent pair of grades (e.g. 3/4, 4/5, 5/6)
Concurrent calibration of grade 3 and 4 items
together, 4 and 5 items together, 5 and 6 items
together
Should markedly reduce the amount of construct
shift, and increase the reliability to an
acceptable degree
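
The embedding step of this design can be sketched as follows. The item labels, the number of forms, and the round-robin assignment are all hypothetical choices made for illustration; an operational design would likely also balance the forms by content and difficulty.

```python
def spread_items(below_grade_items, n_forms):
    """Assign each below-grade item to one form, round-robin."""
    if n_forms <= 0:
        raise ValueError("n_forms must be positive")
    forms = [[] for _ in range(n_forms)]
    for index, item in enumerate(below_grade_items):
        forms[index % n_forms].append(item)
    return forms


# Ten hypothetical grade 3 items embedded across four grade 4 forms.
grade3_items = [f"G3-{n:02d}" for n in range(1, 11)]
forms = spread_items(grade3_items, n_forms=4)
```

Each form carries only two or three of the below-grade items, yet across all forms the entire grade 3 set is administered to grade 4 students, which is what the calibration of the grade 3/4 scale needs.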

33
Contact Information

Joseph Martineau
Office of Educational Assessment and Accountability
Michigan Department of Education
P.O. Box 30008
Lansing, MI 48909
(517) 241-4710
martineauj@michigan.gov
