National Council on Measurement in Education - PowerPoint PPT Presentation

About This Presentation

Title:

National Council on Measurement in Education

Description:

Standard setting to set or reset assessment cut scores ... As part of this process, the ADE created two technical advisory committees (TACs) ...

Slides: 49

Provided by: kmil72

Transcript and Presenter's Notes

Title: National Council on Measurement in Education

1
National Council on Measurement in Education
Symposium - Setting Performance Standards for
Schools in Accountability Programs: Policy,
Technical, and Operational Issues
Thursday, April 12, 2007, Chicago, Illinois
2
Setting Performance Standards for Schools in
Accountability Programs: Policy, Technical, and
Operational Issues
Moderator: Anita Rawls, University of South
Carolina
3
Presenters
Eugene Kennedy, Louisiana State University: Standard Setting Challenges for School Performance Rating Systems
Charity Smith, Arkansas Department of Education: School Performance Index: The Arkansas Experience from Act 35 to Field Review and State Board of Education
Robert Kennedy, University of Arkansas for Medical Sciences: Use of Policy-induced and School Descriptor Methodology
Huynh Huynh, University of South Carolina: Validity, Reliability and Other Technical Considerations
Charity Smith, Arkansas Department of Education: Final Deliberations by State Board of Education
4
Discussants
Peter Behuniak, University of Connecticut
William Schafer, University of Maryland
5
Standard Setting Challenges for School
Performance Rating Systems
Eugene Kennedy
6
Standard Setting Challenges For School
Performance Rating Systems

Why Rate Schools?
On What Characteristics Should Schools Be Rated?
What Steps Are Involved In Rating Schools?

7
How Do We Define Performance?

For Students: Achievement Scores
For Schools: Aggregated Achievement Data
Adjusted/Not-Adjusted for Input?
Challenges at the Student Level: Special
Populations, Retention, etc.
Challenges at the School Level: Grade
Organization, Differential Input, Stakeholders,
etc.

8
Creation of a Performance Index

Students: Summary Scores, Item Response Theory
(IRT) Scale, etc.
Schools: Weighted Index, IRT, etc.

9
Procedures for Setting Standards

Students: Defining Performance Levels, Judges,
etc.
Schools: Definitions of Performance Levels,
Judges and Stakeholders, etc.

10
Validity and Reliability of Results

Students: Internal Consistency, Classification
Accuracy, Predictive Validity, etc.
Schools: Stability, Face Validity, etc.

11
Performance Labels and Their Implications

Students: Advanced, Proficient, etc.
Schools: High Performing, Low Performing, etc.

12
School Performance Index: The Arkansas Experience
from Act 35 to Field Review and State Board of
Education

Charity Smith

13
Act 35: The Arkansas Student Assessment and
Accountability Act of 2004

Like many other states, Arkansas has experienced
many initiatives designed to improve its public
education system.
Act 35, which was passed in the Second
Extraordinary Session of the 84th General
Assembly in 2003, mandates that the Arkansas State
Board of Education (SBE):
Adopt content standards which reflect what
students should know and be able to do
Develop a criterion-referenced test (CRT)
Establish rewards and sanctions
Identify underperforming schools
Assess the annual learning gains of students

14
The Arkansas Comprehensive Assessment Requirements

Act 35: Big Changes in Testing and
Accountability
More grades added
Standard setting to set or reset assessment cut
scores
Vertical scaling of the CRT for public school
students in grades 3-8
Specific analyses of student achievement data

15
The Arkansas Comprehensive Accountability
Requirements

Develop a two-tiered annual accountability rating
system: performance and growth
Rate schools in five category levels (ranging
from excellent, category 5, to schools in need of
immediate improvement, category 1)
Develop value-added longitudinal calculations for
growth
Ensure that School Ratings are valid, replicable,
transparent, and easily understood
Use a team of relevant technical experts
Ensure that the accountability ratings approach
is approved by the SBE.

16
Timeline

2005-06 School Year
Report spring 2005 test results against newly
adopted standards for grades 3 through 8
Administer the new tests in grades 3 to 8 in
spring 2006
Summer 2006
Report results for grades 3 to 8 against newly
adopted standards
Prepare 2006 School Performance Rating System
Implement School Improvement Rating System
showing growth from 2005-06.

17
Technical Advisory Committees

Implementation of Act 35 and adherence to the
demanding timeline noted above required extensive
work by officials at the Arkansas Department of
Education (ADE).
As part of this process, the ADE created two
technical advisory committees (TACs), one for
assessment and one for accountability. These
TACs act in an advisory capacity for major
aspects of the implementation of Act 35.
They meet as needed and offer advice and
recommendations to the ADE. Given the reliance
of the accountability program on the statewide
assessments, there is considerable overlap
in the composition of the two committees.

18
School Accountability Ratings

The ADE is required to produce an annual report
which will identify schools as being in one of
five categories based on performance outcomes on
the criterion-referenced benchmark examinations.
These categories (levels) and their qualitative
interpretations are:
Level 1: Schools in Need of Immediate
Improvement
Level 2: Schools on Alert
Level 3: Schools Meeting Standards
Level 4: Schools Exceeding Standards
Level 5: Schools of Excellence

19
Assignment of School Accountability Ratings

Schools in Arkansas will not be assigned
performance ratings during the period 2004-05
through 2008-09, unless they specifically request
that this be done.
The baseline year for improvement gains will be
the 2006-07 school year. Actual improvement
ratings (growth) will be assigned starting with
the 2007-08 school year.
Once improvement and performance ratings are
assigned, they will carry significant
consequences for schools.

20
Creation of School Weighted Average Index and
General Considerations in Setting Standards for
School Performance

Initially, the TAC/Accountability and the ADE
considered three options for developing the annual
school performance ratings required by Act 35:
quintiles, stanines, and setting cut scores using
a standard setting conference.
Deliberations were also made on how to compute a
school index that would be used for categorizing
schools.
Following are the chronological steps in
TAC/Accountability deliberations and field
presentation to major groups of Arkansas
stakeholders.

21
School Weighted Average Index

The development of a school performance rating
system in
Arkansas involved three distinct steps.
First, the TAC/Accountability and the ADE
examined ways to compute a school index to be
used to assign a performance category to each
school.
Second, the TAC/Accountability then deliberated
on how to set the cut scores for this index in
order to define each of the five performance
categories legislated by Act 35.
Third, the TAC/Accountability made
recommendations to the ADE as to how it could
interact with various stakeholders in order to
get their endorsement of the proposed rating
system for consideration and adoption by the SBE.

Note: The ADE conducted awareness training with
more than 1,100 stakeholders.
22
General Considerations in Setting Standards for
School Performance and Adoption of
Criterion-Referenced Approach

The TAC/Accountability and the ADE considered
three options for developing annual school
performance ratings:
Two norm-referenced options (quintiles and stanines)
One criterion-referenced option (expert judgment)
After statewide focus groups and recommendations,
the SBE adopted the third option, the
criterion-referenced approach.

23
Computation of Weighted Average

The weighted average index began with numerical
values, or weights, tentatively assigned to each
student's performance category from ACTAAP
proficiency levels (Advanced = 4, Proficient = 3,
Basic = 2, Below Basic = 1).
A different set of weights could be assigned if
policy makers decided to value the performance
for each performance level differently.
With these weights assigned to the performance
levels, the performance index for the school
could be computed by multiplying the weight of
each performance level by the number of
students scoring at that level.
This would be done for each grade and subject.
The weighted sum would then be divided by the
total number of students tested in the various
subjects and grades.
The resulting average for the school would range
between 1.0 and 4.0.
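
The computation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the ADE's actual implementation: the function name, the data layout (one dictionary of student counts per grade/subject combination), and the example counts are all assumptions made for the example. Only the 4/3/2/1 weights come from the slide.

```python
# Weights from the slide: Advanced 4, Proficient 3, Basic 2, Below Basic 1.
WEIGHTS = {"Advanced": 4, "Proficient": 3, "Basic": 2, "Below Basic": 1}


def weighted_average_index(counts_by_cell, weights=WEIGHTS):
    """Compute a school's weighted average index.

    counts_by_cell: iterable of dicts, one per grade/subject combination,
    each mapping a performance level to the number of students at that level.
    (This layout is a hypothetical choice for the sketch.)
    """
    weighted_sum = 0
    total_tested = 0
    for counts in counts_by_cell:
        for level, n_students in counts.items():
            weighted_sum += weights[level] * n_students
            total_tested += n_students
    if total_tested == 0:
        raise ValueError("no students tested")
    # Weighted sum divided by the total number of students tested.
    return weighted_sum / total_tested


# Hypothetical school with two grade/subject cells of 50 students each.
example_school = [
    {"Advanced": 10, "Proficient": 20, "Basic": 15, "Below Basic": 5},
    {"Advanced": 5, "Proficient": 25, "Basic": 10, "Below Basic": 10},
]
index = weighted_average_index(example_school)
```

Passing a different `weights` dictionary corresponds to the policy choice mentioned above of valuing the performance levels differently.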

24
Preliminary Considerations and Use of School
Descriptor Methodology

Robert Kennedy

25
Preliminary Steps in the Standard Setting Process

Tentative categories
Information provided
Statewide data profile

26
Initial Considerations for Preliminary Cut Scores
27
General Considerations for Preliminary Cut Scores

March 8th, bad weather
March 15th benefited

28
Data for School Profile

Information provided
weighted average index
economically disadvantaged, LEP, and special
education
Adequate Yearly Progress
accreditation
number tested
percentages at each level

29
School Profile in Each Preliminary Level

Level 1: Schools in need of immediate
improvement (42 schools)
Level 2: Schools on alert (117)
Level 3: Schools meeting standards (795)
Level 4: Schools exceeding standards (112)
Level 5: Schools of excellence (24)

Note: This preliminary level analysis includes
high schools.
30
School Profiles in Each Pairwise Overlap

Pairwise overlap of school ratings (weighted average index)
Levels 1 and 2: 1.68 to 1.73
Levels 2 and 3: 1.75 to 2.17
Levels 3 and 4: 2.68 to 2.92
Levels 4 and 5: 2.86 to 3.07
Panelists set cut points where they felt
comfortable.

31
Composition of the Panel

Facilitators: black female, black male, Hispanic
male, white female, and white male
Panelists: also racially and geographically
diverse; drawn from PTA, business, AAEA, AEA, ASBA
Each group named 12 representatives, for a total
of 60 panelists (52 actually participated)
Monitored by TAC/Accountability

32
Beginning Plenary Session

Plenary meetings and group sessions.
Purpose of the meeting
Advisory role of the TAC/Accountability
Background, objectives, procedures
The criterion-referenced approach explained

33
Group Session (Round 1)

Role-alike groups
Panelists' discussions
Initial break points
Medians and ranges

34
Round 1 Group Median Cut Scores and All-Group
Results
35
Second Plenary Meeting

Key points
Lunch
Reconsideration
New cut scores
New group means and medians

36
Round 2 Group Median Cut Scores and All-Group
Results
37
Final Plenary Meeting

Maintained individual confidentiality
State Board consideration
Panelists' evaluation
Thanks to the panelists

38
Validity, Reliability and Other Technical
Considerations

Huynh Huynh

39
Technical Characteristics

Act 35 levels of school performance status are:
Level 1: Schools in Need of Immediate
Improvement
Level 2: Schools on Alert
Level 3: Schools Meeting Standards
Level 4: Schools Exceeding Standards
Level 5: Schools of Excellence
Cut scores on the performance index scale have
been established.
I will now present some major psychometric
characteristics of the Weighted Average Index and
the status classifications.

40
Internal Consistency of Performance Index

The Weighted Average Index (PI) of a school is a
(linear) average of the performance of all
students in that school.
An internal consistency (reliability) of the
index was computed using an analog of the
split-half (Spearman-Brown) reliability in
classical test theory. There are three steps:
Step 1: Students in each school were randomly
split into equal (or nearly equal) half groups
and the index was computed for each half.
Step 2: The Pearson correlation (r12) was
computed for the two half-group indices using all
available schools (with at least 40 students).
Step 3: The Spearman-Brown formula was used to
compute the reliability (r) of the performance
index for the entire school: r = 2 r12 /
(1 + r12).
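
The three steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the study: the function names, the seed, and the representation of each school as a list of per-student weights (1 to 4) are hypothetical. The 40-student minimum and the Spearman-Brown step-up, r = 2 r12 / (1 + r12), follow the slide.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r12):
    """Step 3: step the half-group correlation up to full-school reliability."""
    return 2 * r12 / (1 + r12)


def split_half_reliability(schools, min_students=40, seed=0):
    """schools: list of lists of per-student weights (hypothetical layout).

    Returns (r12, r): the half-group correlation and the stepped-up reliability.
    """
    rng = random.Random(seed)
    half_a, half_b = [], []
    for scores in schools:
        if len(scores) < min_students:
            continue  # only schools with at least 40 students are used
        shuffled = list(scores)
        rng.shuffle(shuffled)  # Step 1: random split into two halves
        mid = len(shuffled) // 2
        half_a.append(mean(shuffled[:mid]))
        half_b.append(mean(shuffled[mid:]))
    r12 = pearson(half_a, half_b)  # Step 2: correlation across schools
    return r12, spearman_brown(r12)
```

For example, a half-group correlation of 0.5 steps up to a full-school reliability of 2(0.5)/(1 + 0.5), or about 0.67.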

41
Summary Data for Split-Half Reliability (2005)
42
Summary Data for Split-Half Reliability (2006)
43
Yearly Stability of Weighted Average Index
44
Stability of Act 35 Performance Level

The yearly stability of the performance index was
studied also through the performance level
classification.
We looked at the cross-tabulation data for the
2005 and 2006 performance levels for the large
schools, that is those with at least 40 students
with complete data in both years. There are 854
large schools.
Out of these, a total of 556 (65%) retained the
same level from 2005 to 2006.
282 (33%) moved up by one level.
One school moved up two categories, and 9
schools moved down one category.
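
A cross-tabulation of this kind reduces to counting level changes per school. The sketch below is illustrative only: the function name and the five-school toy data are hypothetical. The last two lines simply check that the reported counts (556 and 282 out of 854 schools) correspond to about 65 and 33 percent.

```python
from collections import Counter


def level_changes(levels_2005, levels_2006):
    """Count schools by change in level (2006 level minus 2005 level)."""
    if len(levels_2005) != len(levels_2006):
        raise ValueError("both years must cover the same schools")
    return Counter(after - before
                   for before, after in zip(levels_2005, levels_2006))


# Hypothetical toy data for five schools (not from the study).
toy_2005 = [3, 3, 2, 4, 1]
toy_2006 = [3, 4, 3, 4, 3]
changes = level_changes(toy_2005, toy_2006)
share_same = changes[0] / len(toy_2005)

# Arithmetic check of the percentages reported on the slide.
pct_same = round(100 * 556 / 854)
pct_up_one = round(100 * 282 / 854)
```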

45
Tabulation of Act 35 Performance Category with
AYP Category for 2005
Note: AYP categories are coded as No = 0 and Yes =
1 for correlation calculation.
46
Tabulation of Act 35 Performance Category with
AYP Category for 2006
Note: AYP categories are coded as No = 0 and Yes =
1 for correlation calculation.
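
As a rough illustration of the coding described in the note, the sketch below correlates performance level (1 to 5) with AYP status coded 0/1. The tabulated data are not reproduced in this transcript, so the five-school sample is entirely hypothetical, and the use of a plain Pearson correlation is an assumption about how the calculation was done.

```python
from statistics import mean

AYP_CODE = {"No": 0, "Yes": 1}  # coding stated in the note


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical sample: one school at each performance level.
levels = [1, 2, 3, 4, 5]
ayp = [AYP_CODE[s] for s in ["No", "No", "Yes", "Yes", "Yes"]]
r_level_ayp = pearson(levels, ayp)
```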
47
Final Deliberation by the State Board of
Education

Charity Smith

48
State Board of Education Action

The SBE and other stakeholders were kept
informed.
The SBE did the following:
Adopted the Weighted Average Index for
calculating performance ratings for schools
Recommended detailed communication with
stakeholders to ensure transparency
Approved official cut scores recommended by the
standards setting team
Adopted appropriate ratings criteria through
approved rules and regulations
Reviewed the Standard Setting Technical Report