Title: Interim Assessments
1. Interim Assessments
- Marianne Perie
- Scott Marion
- Brian Gong
Presentation for the FAST SCASS, Austin, TX, October 12, 2006
Center for Assessment
2. Our Goal
- Broaden our perspective to consider interim assessments
- Can these assessments serve formative uses? If so, how?
- What other uses can these assessments serve well?
- What makes sense?
- Develop a framework for evaluating an interim assessment system.
3. Interim Assessment System
- Consider the system as a whole rather than an individual test
- May include several assessments with several purposes
- Instructional (could be formative)
- Predictive
- Evaluative
- Any shorter assessment given during the school year (often multiple times) can be considered an interim assessment
4. Assessment with Formative Uses
- Some assessments may help inform instruction without meeting all the criteria for formative assessment
- They can meet some requirements by
- Providing qualitative insights about understandings and misconceptions, not just a numeric score
- Giving timely feedback on what to do besides re-teaching every missed item
5. Consider these other uses
- Predict student achievement on the summative test (e.g., early warning)
- Provide information on how best to target curriculum to meet student needs
- Provide aggregate information on how students in a school/district/state are doing and where the areas of weakness are
- Determine students' ability levels to group them for instruction
- Enrich curriculum
- Encourage students to evaluate their own knowledge and discover the areas in which they need to learn more
- Evaluate the effectiveness of various curricular and/or instructional practices
- Reinforce curricular pacing
- Practice for the summative test
- Increase teacher knowledge of assessment, the content domain, and student learning
6. Varied Uses and Purposes
- All of these purposes may be worthwhile even if they are not formative
- However, we need to establish clear links between the questions policymakers and educational leaders want to answer and the tools they are using to answer them
7. Consider these questions
- What do I want to learn from this assessment?
- Who will use the information gathered from this assessment?
- What action steps will be taken as a result of this assessment?
- What professional development or support structures should be in place to ensure the action steps are taken?
8. Another Consideration: Timing
- An interim assessment's relation to the instruction cycle:
- Primarily before instruction: to diagnose the strength of understanding of a topic and inform upcoming instruction
- Primarily during instruction: to set pacing, strengthen motivation and relationships ("Good job!"), and thereby assist with the management of learning (not alter the content)
- Primarily during instruction: to diagnose and evaluate how well students are developing appropriate understandings of content/skills, and perhaps adjust the focus of instruction and curriculum
- Primarily after instruction: to diagnose what students did not yet get and thereby inform remedial instruction
9. What to do next? Learning Goals
- Mastery
- Next in core sequence
- Extension
- More independent, less structured
- Transfer, application
- Motivation and other values
10. Decision Tree
11. Characteristics of a Good Interim Assessment System
- Provides valid and reliable results that are easy to interpret and provide information on next steps
- Includes a rich representation of content, with items linked directly to the content standards and specific teaching units
- Fits within the curriculum so that the test is an extension of the learning rather than a time-out from learning
- Three main elements:
- Reporting Elements
- Assessment Design
- Administration Guidelines
12. Reporting
- Policymakers should consider carefully the reporting component
- Thinking about the end result helps to conceptualize the design
- Reporting translates data into action
- Consider all pieces of reporting
- Qualitative as well as quantitative information
13. Reporting Elements
- Type of data summary
- Include a normative reference
- Compare against a criterion reference
- Aggregate across classroom/school/district (see the sketch after this list)
- Type of qualitative feedback
- Information on correct/incorrect responses by content area
- Feedback on what an incorrect answer implies
- Suggestions for next steps
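To make the aggregation idea concrete, here is a minimal sketch of rolling item-level results up by classroom and content area. The data and the column names (student_id, classroom, content_area, correct) are illustrative assumptions, not any particular product's schema.

```python
import pandas as pd

# Hypothetical item-level results: one row per student-item response.
results = pd.DataFrame({
    "student_id":   [1, 1, 2, 2, 3, 3],
    "classroom":    ["A", "A", "A", "A", "B", "B"],
    "content_area": ["fractions", "geometry"] * 3,
    "correct":      [1, 0, 1, 1, 0, 0],
})

# Percent correct by content area within each classroom -- the same
# roll-up could run at the school, district, or state level.
summary = (results
           .groupby(["classroom", "content_area"])["correct"]
           .mean()
           .mul(100)
           .round(1))
print(summary)
```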
14. Assessment Design
- Match item type to purposes:
- Predictive: match the item type to what you are predicting, perhaps with additional probes
- Instructional: more open-ended items, probes, performance tasks
- Evaluative: can use a combination of multiple-choice and short-answer items with "why" probes
- Item type needs to take the learning progression into consideration
- The number and length of items will also influence fit into instruction
15. Administration Guidelines
- Flexibility in creating forms
- Administered within instruction or separate from instruction
- Adaptive or not
- Flexibility in when/where the assessment is given
- Computer-based
- Web-based
- Paper-and-pencil
- Turnaround time for results
16. Administration Needs x Purpose
17. States and Districts Can
- Provide policy support for local assessments
- Help create item banks and foster consortia-type relationships across districts and even with other states
- Support and structure professional learning opportunities to foster successful implementation
18. States' and Districts' Current Role
- States and districts often purchase commercially available products (e.g., formative/diagnostic/predictive/benchmark assessments)
- How does what we want match what already exists?
- What other options are available?
- Customized assessment
- Locally-designed assessment
19. Features of Many Current Systems
- What these systems can do:
- Provide an item bank linked directly to state content standards
- Assess students on a flexible time schedule wherever a computer and internet connection are available
- Provide immediate results
- Highlight content standards in which more items were answered incorrectly
- Link scores on these assessments to scores on the end-of-year assessments to predict end-of-year results (a sketch of this kind of linking follows this list)
- Questions these systems can answer:
- Is this student on track to score Proficient on the end-of-year NCLB tests?
- Which students are on track to score Proficient on the end-of-year NCLB tests?
- Which content standards are the students Proficient in, and which content standards show the weakest student performance (for a student, classroom, school, district, state)?
- How does this student's performance compare to the performance of other students in the class?
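As a rough illustration of the predictive linking described above, the sketch below fits a linear regression on a prior cohort's interim and end-of-year scores and projects the current cohort against a Proficient cut. All scores, the 350 cut, and the linear model are illustrative assumptions; vendors use their own (often proprietary) methods.

```python
import numpy as np

# Prior cohort: interim scores matched to end-of-year (EOY) scores.
interim_prior = np.array([210, 225, 240, 255, 270, 285], dtype=float)
eoy_prior     = np.array([300, 320, 335, 355, 370, 390], dtype=float)

# Least-squares fit: EOY ~ a * interim + b
a, b = np.polyfit(interim_prior, eoy_prior, deg=1)

# Current cohort: project EOY scores and flag students not on track
# to reach a hypothetical Proficient cut score of 350.
interim_now = np.array([218, 246, 262], dtype=float)
for score in interim_now:
    pred = a * score + b
    status = "on track" if pred >= 350 else "early warning"
    print(f"interim={score:.0f} -> predicted EOY={pred:.0f} ({status})")
```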
20. What These Systems Lack
- What these systems cannot do:
- Provide rich detail about the curriculum assessed
- Provide a qualitative understanding of a student's misconception(s)
- Provide full information on a student's depth of knowledge on a particular topic
- Further a student's understanding through the type of assessment task
- Give teachers information on how to implement an instructional remedy
- Questions these systems cannot answer:
- Why did a student answer an item incorrectly?
- What are possible strategies for improving performance in this content area?
- What did the student learn from this assessment?
- What depth of knowledge does a student display in this content area?
- What type of thinking process is this student using to complete this task?
21. An Example
- This example is provided to stretch our thinking about models of instructionally-supportive interim assessment systems
- This approach would provide:
- Data to evaluate programs at certain benchmark points
- Information that teachers could use quickly to improve learning
- Models for helping teachers learn about creating learning and assessment tasks designed to foster deep thinking in students
22. Example: A district with the goals of
- Implementing an assessment system to provide more in-depth information about student strengths and weaknesses in specific content domains
- Providing additional feedback and instruction for students with identified weaknesses
- Using a set of rich tasks as part of model instructional units and/or engaging assessment activities (intended to be embedded in curriculum) to provide opportunities for deeper learning and assessment
- Gathering benchmark information to help the school and district evaluate school effectiveness and instructional programs
23. Example: Reporting criteria
- Report on a limited number of important fine-grain benchmarks/indicators on any one occasion
- Allow for examination of student work as teachers score and produce summaries
- Identify areas of weakness
- Provide professional development and information so teachers can determine the next instructional steps
- Aggregate across classrooms, the school, and the district
- Disaggregate results by the same reporting categories used in the end-of-year reports (racial/ethnic group, disability status, LEP)
24. Example: Assessment design
- Rich tasks, ranging in length from one class period to a couple of weeks, developed in partnership with the state's teachers
- Tasks mapped directly to the finer-grain units of the content standards (e.g., indicators)
- Most tasks would have multiple scoreable units
- Tasks are designed to be embedded in the curriculum so that activities can be seen as seamless with instruction
- Substantial professional development must be provided for teachers to learn how to administer, score, and analyze tasks
25. Example: Administration requirements
- Teachers should be taught to conduct systematic observations during task administration
- Opportunities (time) should be provided to allow for relatively quick turnaround of results (e.g., within a week or two, to allow time for intervention)
- It might be possible, as technology improves, to have these tasks administered via computer, but they would likely be administered via pencil-and-paper in the near future
- Districts and schools, depending on local decisions, can determine the most appropriate timing for administration, but should make sure that students have a fair opportunity to learn the knowledge and skills embodied in the tasks
26. Examples of Specific Technical Quality Requirements
- Tasks should be carefully validated regarding the standards and cognitive processes assessed
- The quality and scope of professional development need to be carefully evaluated (formatively!) to ensure that teachers can develop the knowledge and skills to use and learn from these tasks
- When the tasks are administered, they must cover only content that has been taught; users should be able to evaluate the alignment by seeing the items (or alignment documents)
- The collection of tasks administered through the year should represent a technically sound range of difficulty and appropriate breadth
27. Technical Quality (Continued)
- Scores/subscores should be acceptably reliable for the use (e.g., moderately reliable if the test is used for grading, but considerably more reliable if it is used for student accountability); see the reliability sketch after this list
- The system should be evaluated for effects on:
- Student learning, especially in terms of generalizability and transfer
- Student motivation as a result of engaging with these tasks
- Curricular quality as a result of incorporating tasks
- Increases in teacher knowledge of content, pedagogy, and student learning
- Manageability, including the quality of implementation
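One common way to check score reliability is an internal-consistency index such as Cronbach's alpha. The sketch below computes it from a made-up matrix of dichotomous item scores; real evaluations would use operational data and typically additional indices.

```python
import numpy as np

# Rows = students, columns = items (1 = correct, 0 = incorrect).
# Made-up data for illustration only.
items = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
], dtype=float)

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores

# Cronbach's alpha = (k / (k-1)) * (1 - sum of item variances / total variance)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```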
28. How do I know I'm getting my money's worth?
- Validating the evidence will be important to do over the next couple of years
- If the test is used for predictive purposes, do a follow-up study to determine that the predictive link is reasonably accurate and that use of the test contributes to improving criterion (e.g., end-of-year) scores; a follow-up sketch appears after this list
- If the test is used for instructional purposes, follow up with teachers to determine how the data were used and whether there was evidence of improved learning for current students
- If the test is used for evaluative purposes, gather data from other sources to triangulate the results of the interim assessment, and follow up to monitor whether evaluation decisions are supported
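A minimal sketch of such a follow-up study, under the assumption that the interim system issues on-track flags: compare the flags with actual end-of-year proficiency and examine the disagreements. All data here are invented for illustration.

```python
# On-track flags issued at the interim point vs. actual end-of-year
# proficiency (invented data).
predicted_on_track  = [True, True, False, False, True, False]
actually_proficient = [True, False, False, True, True, False]

pairs = list(zip(predicted_on_track, actually_proficient))
accuracy = sum(p == a for p, a in pairs) / len(pairs)

# The two error types matter differently when a flag triggers
# intervention: a false positive misses a student who needed help;
# a false negative directs resources to a student who reached
# Proficient anyway.
false_pos = sum(p and not a for p, a in pairs)
false_neg = sum(a and not p for p, a in pairs)
print(f"classification accuracy = {accuracy:.0%}, "
      f"false positives = {false_pos}, false negatives = {false_neg}")
```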
29. Our Recommendations
- Avoid mini-summative assessments
- Focus on the reporting elements to clarify assessment design
- Ensure the instructional supports are in place to allow teachers to use the results effectively
- This includes substantial professional development on assessment literacy, score interpretation, and necessary instructional actions
- Validate the use of the assessment
30. Further Research
- Create a validity argument for how interim assessments lead to improved student learning
- Examine differential effects of interim assessments on students' intrinsic motivation to learn
- Determine requirements for building a system that provides teachers the information they need but can still be scaled to compare results across students, teachers, and schools
- Analyze the types of professional development linked to effective use of interim assessments, and the important elements of the delivery system
31. Conclusion
- There are valid purposes for giving interim assessments beyond informing instruction at that point
- Examine the purpose of the assessment and what it can and cannot do
- Match the features of the assessment to the purpose of using it
- Further research is needed linking the use of interim assessments with improved student performance
32. For more information
- Center for Assessment
- www.nciea.org
- Marianne Perie
- mperie@nciea.org
- Scott Marion
- smarion@nciea.org
- Brian Gong
- bgong@nciea.org