NCLB and Growth Models: In Conflict or in Concert?

1
NCLB and Growth Models: In Conflict or in Concert?

Susan L. Rigney, United States Department of
Education
Joseph A. Martineau, Michigan Department of
Education
Presented at the MARCES conference on
Longitudinal Modeling of Student Achievement
College Park, MD
November 7, 2005

2
Introduction

"In response to your concerns about giving
schools credit for improving student achievement,
we are also considering the idea of a growth
model"
Margaret Spellings
9/13/05

3
Author Perspectives

Sue Rigney
Education Specialist in the office of Student
Assessment and School Accountability (Title I) at
the U. S. Department of Education.
Primary responsibility: monitoring state
compliance with the standards, assessment, and
accountability requirements of NCLB
Secondary responsibility: contributing to
ongoing discussion, clarification, and
implementation of policies related to assessment
and accountability.

4
Author Perspectives

Joseph Martineau
Psychometrician for the Michigan Office of
Educational Assessment and Accountability.
Primary concerns: congruence of accountability
systems with values of educational research;
adequacy of statistical and psychometric
methodology
Secondary concerns: philosophy and policy of
accountability in terms of both practicality and
feasibility
Authorship should not be construed as an
endorsement of NCLB as a whole.

5
In conflict?

CRS says:
Substantial interest in the possible use of
individual/cohort growth models. Such AYP models
are not consistent with certain statutory
provisions of NCLB as currently interpreted by
USED
But NCLB (Sec. 4) says:
The Secretary shall take such steps as are
necessary to provide for the orderly transition
to, and implementation of, programs authorized by
this Act

6
In concert?

USED Growth Model Study Group
IES grant for longitudinal data systems
State Accountability Workbook Amendments

7
Types of Models

Definitions developed by a State collaborative
through CCSSO (Goldschmidt et al., 2005)
Definitions
Cross-sectional models
Status Models
Improvement Models
Longitudinal Models
Growth Models
Residual Growth (RG) Models
Commonly labeled Value Added Models
Why we use the term RG
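
The four definitions above can be sketched with a small calculation. This is a minimal illustration, not the definitions from Goldschmidt et al. (2005): the scores, the cut score of 300, the statewide reference pairs, and all function names are assumptions made up for the example, and the residual growth step uses a one-predictor least-squares line as a stand-in for the richer models used in practice.

```python
from statistics import mean

CUT = 300  # hypothetical proficiency cut score

# Hypothetical grade 4 scores for last year's cohort (different students).
last_year_grade4 = [280, 310, 295, 320, 305]

# This year's grade 4 students, matched to their own grade 3 scores:
# (grade 3 score last year, grade 4 score this year).
matched = [(270, 290), (300, 315), (285, 310), (310, 318), (295, 305)]
this_year_grade4 = [post for _, post in matched]

# Hypothetical statewide matched pairs used to set growth expectations.
state = [(260, 275), (280, 295), (300, 315), (320, 335), (340, 355)]


def pct_proficient(scores, cut=CUT):
    """Percent of scores at or above the cut."""
    return 100.0 * sum(s >= cut for s in scores) / len(scores)


def fit_line(pairs):
    """Least-squares intercept and slope for post = a + b * pre."""
    pre_mean = mean(p for p, _ in pairs)
    post_mean = mean(q for _, q in pairs)
    sxx = sum((p - pre_mean) ** 2 for p, _ in pairs)
    sxy = sum((p - pre_mean) * (q - post_mean) for p, q in pairs)
    slope = sxy / sxx
    return post_mean - slope * pre_mean, slope


# Status model: where this year's students stand against the standard.
status = pct_proficient(this_year_grade4)

# Improvement model: change between successive cohorts in the same grade.
improvement = status - pct_proficient(last_year_grade4)

# Growth model: average change for the same students over time.
# (Differencing scores across grades presumes a vertical scale.)
growth = mean(post - pre for pre, post in matched)

# Residual growth model: average of observed minus expected scores, where
# the expectation comes from the statewide regression on prior scores.
intercept, slope = fit_line(state)
residual_growth = mean(post - (intercept + slope * pre) for pre, post in matched)

print(status, improvement, growth, residual_growth)
```

The first two quantities need only cross-sectional data; the last two require that each student's records be linked across years.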

8
The Intersection of Policy and Growth Models

3-8 Assessments Provide Longitudinal Data
Safe Harbor
Use of Improvement Index in AYP
CCSSO SCASS Activities
USED Assistant Secretary Luce

9
Systemic Coherence: A Standard for Evaluating
Models

Three broad principles of systemic coherence
Models are consistent with policy goals
Models are integrated as a part of a consistent
system of content standards, assessments,
performance standards, and accountability
criteria
Models are implemented in a manner consistent
with the values of educational research

10
1. Standards-based

Assessments must cover depth and breadth
Results expressed in terms of performance levels
Proficient is most influential component of AYP

11
2. All Students

Participate (95% rule)
Results reported for all
AYP Not all Visible
Full Academic Year
Minimum n
LEP exemption for ELA test
Held to same standards
Alternate based on alternate achievement standards
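
The participation and minimum-n checks listed above amount to two simple screens. The sketch below is illustrative only: the function names are invented, and the minimum group size of 30 is an assumed value, since each state sets its own.

```python
MIN_N = 30  # hypothetical; each state sets its own minimum group size


def participation_met(enrolled, tested):
    """True when at least 95% of enrolled students were tested."""
    # Integer arithmetic avoids floating-point trouble at exactly 95%.
    return tested * 100 >= 95 * enrolled


def subgroup_counts_for_ayp(full_year_students, min_n=MIN_N):
    """True when a subgroup is large enough to be evaluated on its own."""
    return full_year_students >= min_n


# A school testing 190 of 200 students meets the rule; 189 of 200 does not.
print(participation_met(200, 190), participation_met(200, 189))

# A subgroup of 24 full-academic-year students falls below the assumed
# minimum n, so it would not be separately visible in the AYP decision.
print(subgroup_counts_for_ayp(24))
```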

12
3. School Improvement

Annual Measurable Objectives
Increased in 2004-05
Adjustment for transition in 2005-06
School accountable for subgroups
More visible in 2005-06
Consequences
Can/should growth moderate consequences?

13
Consistency of Content Standards, Assessments,
Performance Standards, and Accountability Criteria

Accountability based on academic indicators
Peer Review of State Assessment Systems
Alignment
Performance descriptors
Alternate assessments

14
Coherent Assessment System

State assessments
Rational, coherent design
Relative contribution of different tests
Matrix forms equivalent
Comparability
English vs Spanish
Computer vs paper and pencil
Local assessments
Aligned, equivalent, comparable results for
subgroups, aggregable

15
Results understandable

Educators know what to do
Articulation across grades
Articulation across performance levels
A progression matrix that shows
Proficient is different from basic because ...
Proficient in third grade is different from
proficient in fourth grade because ...
Administrators know how to allocate resources

16
Consistency with Values of Educational Research

As defined by Gregory N. Derry [1]:
Free flow of information
Curiosity
Replicability
Thorough peer review
Improvement
Honesty and Open-mindedness
Willingness to consider multiple alternatives
Scrupulous investigations of weaknesses
Flexibility to adopt feasible improvements

[1] Professor of Physics at Loyola University and
author of What Science Is and How It Works
(Princeton University Press, 1999)

17
Attributes of Systemic Coherence Applicable in
this Context

Alignment of standards and assessments
The same performance standards for all
Inclusion of all student groups
Explicit tracking of achievement gaps
Appropriate statistical and psychometric models
A program of ongoing research
Consistency of reports with all other attributes

18
1. Alignment of Standards and Assessments

Foundation of validity of school accountability
decisions
USED expects independent verification of
Full range of content standards?
Address content and process skills?
Same degree and pattern of emphasis?
Scores reflect full range of achievement?
Procedures to maintain/improve?

19
Alignment methods

Alignment Methodology
Webb (SCASS TILSA)
Porter (SCASS SEC)
Achieve
Buros
Methods do not address articulation across grades
JM: Current instantiations of independent
review may underestimate alignment

20
2. The Same Standards for All Students

Grade-level achievement standards
Except for students with the most significant
cognitive disabilities (1%)
All students proficient by 2013-14
What about growth toward proficient?
What about length of time in system?
Proposals to balance fairness toward both
educators and student groups should also be a
part of any plan to implement growth models for
accountability purposes. Fairness toward one
should not be sacrificed for fairness toward the
other.

21
2. The Same Standards for All Students

JM: The NCLB expectation that all students will
be proficient by a given date seems unreasonable.
The recognition that there will always be
individual differences among students (and
aggregate differences across schools in their
intake populations) should also be incorporated
in setting policy targets.
SR: Safe harbor recognizes that adequate yearly
progress may be met with less than 100% meeting
annual and long-range goals.
JM: The safe harbor provision of NCLB is a good
beginning, but does not fully account for these
realities.
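
For readers unfamiliar with safe harbor, the check can be sketched as follows. This reflects the provision as it is usually summarized (a group that misses the annual objective can still make AYP if its percentage of students below proficient falls by at least 10 percent of the prior year's figure and it makes progress on the other academic indicator). The function name, arguments, and example figures are assumptions for illustration, and the participation requirement is left out for brevity.

```python
def meets_safe_harbor(pct_below_prev, pct_below_now, other_indicator_progress=True):
    """Illustrative safe harbor check for one student group.

    pct_below_prev / pct_below_now: percent of students below proficient
    in the prior and current year (hypothetical inputs).
    """
    required_drop = 0.10 * pct_below_prev
    actual_drop = pct_below_prev - pct_below_now
    return other_indicator_progress and actual_drop >= required_drop


# 60% below proficient falling to 53% is a 7-point drop; 6 points were needed.
print(meets_safe_harbor(60.0, 53.0))

# Falling only to 57% is a 3-point drop, short of the 6 points needed.
print(meets_safe_harbor(60.0, 57.0))
```

In the first case the group makes AYP even though 53% of its students remain below proficient, which is the point SR makes above.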

22
2. The Same Standards for All Students

JM: The punitive nature of NCLB consequences can
actually undermine policy objectives by adding
turbulence to schools serving low-achieving
students.
SR: The pressures of accountability have resulted
in remarkable successes (Ed Trust), and there are
multiple safeguards to prevent Type I error.
JM: The multiple safeguards are an important
start, but policies encouraging more assistance
in and attraction of highly effective educators
to low-achieving schools are more likely to
support the policy objectives.
SR: NCLB funds are available for recruitment and
retention bonuses, and data indicate that states
are beginning to use these funds in this way.

23
Implications for growth model

Expectation of same growth for all maintains
achievement gap
Expectation of 12 months' growth in 1 year
maintains achievement gap
Expectation of normative growth maintains
achievement gap
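
A short arithmetic sketch shows why a uniform growth expectation leaves the gap untouched. The starting scores and annual growth rates are invented for the example.

```python
def project(start, annual_growth, years):
    """Scores at year 0, 1, ..., years under constant annual growth."""
    return [start + annual_growth * y for y in range(years + 1)]


# Hypothetical starting points for a higher- and a lower-achieving group.
high = project(320, 15, 4)

# Same growth expected of both groups: the gap never moves.
low_same = project(280, 15, 4)
gap_same = [h - l for h, l in zip(high, low_same)]

# Closing the gap requires faster growth for the lower-achieving group.
low_faster = project(280, 25, 4)
gap_faster = [h - l for h, l in zip(high, low_faster)]

print(gap_same)    # [40, 40, 40, 40, 40]
print(gap_faster)  # [40, 30, 20, 10, 0]
```

Any growth target meant to close gaps therefore has to ask more than average growth of students who start behind.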

24
3. Inclusion of All Student Groups

Missing data means missing students
How many missing students does it take to
compromise validity?
Robustness to missing data does not imply that it
is OK to leave out data where it can reasonably
be obtained

25
4. Explicitly Tracking Achievement Gaps

Closing the achievement gap is a
Policy objective
Matter of ethics
Attainable
Tracking the achievement gap makes inequities
publicly visible

26
4. Explicitly Tracking Achievement Gaps,
continued

Separate models from those used to track
attainment of growth targets
Include in the model variables defining
policy-defined subgroups
Interaction of grade with subgroup variables
Simple graphical representation of the results
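
One way to read the grade-by-subgroup interaction is sketched below, under simplifying assumptions: a single binary subgroup indicator, invented group means by grade, and ordinary least squares. With one binary indicator, fitting score = b0 + b1*grade + b2*subgroup + b3*grade*subgroup is equivalent to fitting a separate line for each group, so the sketch does that directly. All names and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical mean scale scores by grade for two groups.
grades = [3, 4, 5, 6]
reference_means = [300, 312, 324, 336]
subgroup_means = [280, 295, 310, 325]


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


# Center grade at 3 so the intercepts describe grade 3.
x = [g - grades[0] for g in grades]
ref_intercept, ref_slope = fit_line(x, reference_means)
sub_intercept, sub_slope = fit_line(x, subgroup_means)

b0 = ref_intercept                  # reference group at grade 3
b1 = ref_slope                      # reference group change per grade
b2 = sub_intercept - ref_intercept  # gap at grade 3
b3 = sub_slope - ref_slope          # change in the gap per grade

# The gap at each grade is b2 + b3 * (grade - 3).
gap_by_grade = {g: b2 + b3 * xi for g, xi in zip(grades, x)}
print(gap_by_grade)
```

A positive b3 here means the gap is narrowing with each grade; plotting gap_by_grade gives the kind of simple graphical summary the slide calls for.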

27
5. Appropriate Statistical and Psychometric Models

Statistical concerns
Match of model to data structure
Violations of assumptions
Do random effects models cheat?
How do we integrate results from alternate
assessments?
What is the sample, and what is the population?
Different models needed for different purposes
Meeting growth targets
Tracking achievement gaps
Primary research

28
5. Appropriate Statistical and Psychometric Models

Statistical concerns
Are the models correlational or causal? The
mandated data collection is correlational.
JM: The mandated policy uses are more causal.
The descriptive statistics are used to label
schools as in need of improvement, and if
students are not achieving reasonable goals, it
is hard to argue with this label. However, the
distinction between schools in need of
improvement and ineffective educators is unlikely
to be either fathomed or appreciated by many
people. The nature of NCLB consequences invites
this unfounded interpretation.
SR: The statute provides substantial resources
for professional development and instructional
materials in order to help educators meet the
extraordinary needs of the children they serve.

29
5. Appropriate Statistical and Psychometric
Models, continued

Unwarranted assumptions
No equating error
Vertical: Doran (2005)
Horizontal: not studied, but most assessments
only have a few anchor items in common across
years
Interval level scale
If using scale scores, most models assume equal
interval measurement
Psychometrically suspect
Effects not well studied
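
Why the interval-scale assumption matters can be seen in a small example. If a scale is only ordinal, any order-preserving transformation of it is equally defensible, yet such a transformation can reverse which group appears to have grown more. The scores and the squaring transformation below are assumptions chosen purely to illustrate the point.

```python
def rescale(score):
    """A hypothetical order-preserving transformation (for positive scores)."""
    return score * score


def gain(pair, transform=lambda s: s):
    """Post minus pre, optionally after transforming both scores."""
    pre, post = pair
    return transform(post) - transform(pre)


# Hypothetical (pre, post) mean scale scores for two groups.
group_a = (200, 220)
group_b = (300, 315)

original = (gain(group_a), gain(group_b))
rescaled = (gain(group_a, rescale), gain(group_b, rescale))

print(original)  # (20, 15): group A shows more growth
print(rescaled)  # (8400, 9225): group B shows more growth
```

Every student keeps the same rank under both scales, but the comparison of gains flips, so conclusions about relative growth depend on the equal-interval claim being true.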

30
5. Appropriate Statistical and Psychometric
Models, continued

Unwarranted assumptions, continued
A single continuous scale on the same construct
across grades (vertical or developmental scales)
Mathematical demonstrations (Martineau, 2004, in
press)
We purposely build content shift into our
assessments across grades
High correlations among sub-constructs do not
take care of the problem
Students whose growth is occurring outside the
curriculum-defined range for the grade are not
measured well
Effects of prior schools/grades become attributed
to later schools/grades
Practically significant effects of the
misattributions occur in all reasonably
conceivable assessment scenarios
Empirical validation (Lockwood et al., under peer
review)
Subscales of math assessment: greater variability
within teacher across subscales than across
teachers within subscale.
Low correlations in value added across
subscales
The sub-content matters tremendously

31
5. Appropriate Statistical and Psychometric
Models, continued

Unwarranted assumptions, continued
We need to account for equating error
We need to study the effects of the
interval-level measurement assumption and either
Validate the assumption, or
Not make the assumption
We need to either
Develop psychometric models that can account for
change in content across grades, or
Not assume the same content across grades
Analytical models that avoid scale assumptions
Hill's Value Table approach (this conference)
Betebenner transition matrix approach (2005)
Standards-based interpretations, can use baseline
data
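
The general idea behind a value table can be sketched as follows: each student's move between performance levels from one year to the next earns a number of points, and the school's score is the average. Because only ordered performance levels are used, no equal-interval scale is assumed. The level names, point values, and student data below are invented for illustration and are not the values proposed in either approach cited above.

```python
from collections import Counter

LEVELS = ["Below Basic", "Basic", "Proficient", "Advanced"]

# Hypothetical points, indexed as VALUE_TABLE[last_year_level][this_year_level].
VALUE_TABLE = [
    [0, 100, 150, 200],  # was Below Basic
    [0, 50, 150, 200],   # was Basic
    [0, 0, 100, 150],    # was Proficient
    [0, 0, 50, 100],     # was Advanced
]

# Hypothetical students: (level index last year, level index this year).
students = [(0, 1), (1, 2), (2, 2), (2, 1), (3, 3)]

# Transition counts: how many students made each move between levels.
transitions = Counter(students)

# School score: average points earned across matched students.
points = [VALUE_TABLE[prev][now] for prev, now in students]
school_score = sum(points) / len(points)

for (prev, now), count in sorted(transitions.items()):
    print(f"{LEVELS[prev]} -> {LEVELS[now]}: {count}")
print(school_score)
```

The policy judgment lives in the point values: how much credit to give for moving a student toward proficiency versus keeping a proficient student proficient.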

32
6. An Ongoing Program of Research

A turbulent field (in its adolescence, to quote
Lissitz)
Large-scale implementation in a turbulent field
requires extraordinary flexibility to keep up
with the state of the art
And yet, too much flexibility can thwart useful
interpretation of trend data

33
7. Consistency of Reports with Other Attributes

Responsive to instruction?
Understandable to stakeholders?
Grounded in policy aims?
Valid and reliable?

34
Setting standards for growth

What's reasonable?
vs
What do we hope to accomplish?
What's fair?

35
Growth and school consequences

36
Conclusions

Can we add growth?
Yes!
Should we add growth?
Yes, where there is an evaluative framework tied
to policy objectives, a systemic approach, and
alignment with the values of educational research
Must we add growth?
An option, not a requirement because of the
extraordinary necessary infrastructure

37
Recommendations for Policymakers

Understand the basic differences between models
Run simulations with real data
Understand the limitations
Listen to practitioners
Listen to methodologists
Anticipate cost/benefits
Lack of stability corrupts meaning
Do not over-specify the details in statute
This field moves ahead quickly
Flexibility to implement advances is key

38
Recommendations for Accountability Implementation
Staff

State Directors: give your staff time to write it
up!!
Require greater detail in the Technical Manuals
that allows for comprehensive review of the
procedures
Explain it (as much as you can) to your
legislators and Congresspersons
Challenge assumptions
Status quo is good
Change is good
Resource assumptions
Claims of proponents

39
Recommendations for Technical Researchers

Validity need not conflict with transparency
Validity
Maintain sufficient complexity to produce valid
results
Transparency for non-technical stakeholders
Simple, but accurate reports
Grounded interpretations
Transparency for technical stakeholders
Comprehensive documentation of the entire system,
including psychometric and statistical models
Facilitation of replication
Facilitation of primary research on strengths and
weaknesses

40
Recommendations for Technical Researchers

Pay systemic attention to
Assumptions of psychometric models
Assumptions of content standard models
Assumptions of statistical models
Think carefully about what the models can tell us
and cannot tell us about instruction, curriculum,
and student development
Develop simple graphical representations of the
model and its important concepts for policymaker
consumption
Become involved in public policy forums as a
community lobby in order to promote appropriate
interpretation of data.
We cannot give our cautions, wash our hands of
how the data is used, and stand on the outside of
the political process

41
Recommendations for All Stakeholders

Realize that with all of the high stakes
surrounding accountability uses of student
achievement data, there are forces that can work
against community interests
Economic benefits, reputations, and other
personal investments can cause proponents of
specific systems to avoid scrupulous
investigations of the shortcomings of those
systems and/or the benefits of competing
approaches
Willingness to be, and accountability for being,
rigorously honest and open-minded about multiple
approaches is an essential part of improving and
evaluating growth-based accountability systems