Title: New England Common Assessment Program
1 New England Common Assessment Program
Item Review Committee Meeting
March 30, 2005 - Portsmouth, NH
2 Welcome and Introductions
- Tim Kurtz, Director of Assessment, New Hampshire Department of Education
- Michael Hock, Director of Educational Assessment, Vermont Department of Education
- Mary Ann Snider, Director of Assessment and Accountability, Rhode Island Department of Education
- Tim Crockett, Assistant Vice President, Measured Progress
3 Logistics
- Meeting Agenda
- Committee Member Expense Reimbursement Form
- Substitute Reimbursement Form
- NECAP Nondisclosure Form
- Handouts of presentations
4 Morning Agenda
- Test Development: Past, Present, and Future
- How we got here? - Tim Kurtz, NH DoE
- Statistical Analyses - Tim Crockett, MP
- Bias/Sensitivity - Michael Hock, VT DoE
- Depth of Knowledge - Ellen Hedlund, RI DoE, and Betsy Hyman, RI DoE
- 2005-2006 Schedule - Tim Kurtz, NH DoE
- So, what am I doing here?
5 Item Review Committee
- How did we get to where we are today?
- Tim Kurtz
- Director of Assessment
- New Hampshire Department of Education
6 NECAP Pilot Review 2004-05
- 1st Bias Committee meeting: March
- 1st Item Review Committee meeting: April
- 2nd Item Review Committee meeting: July
- 2nd Bias Committee meeting: July
- Face-to-Face meetings: August
- Test Form Production and DOE Reviews: August
7 NECAP Pilot Review 2004-05
- Reading and Mathematics
- Printing and Distribution: September
- Test Administration Workshops: October
- Test Administration: October 25-29
- Scoring: December
- Data Analysis / Item Statistics: January
- Teacher Feedback Review: February
- Has affected item review, accommodations, style guide, and administration policies
- Item Selection meetings: February-March
8 NECAP Pilot Review 2004-05
- Writing
- Printing and Distribution: December-January
- Test Administration: January 24-28
- Scoring: March
- Data Analysis / Item Statistics: April
- Item Selection meetings: April-May
9 NECAP Pilot Review 2004-05
- What data was generated from the pilot, and what do we do with it?
- Tim Crockett
- Assistant Vice President
- Measured Progress
10 Item Statistics
- The review of data and items is a judgmental process
- Data provides clues about the item
- Difficulty
- Discrimination
- Differential Item Functioning
11 At the top of each page . . .
12 The Item and any Stimulus Material
13 Item Statistics Information
14 Item Difficulty (multiple-choice items)
- Percent of students with a correct response. Range is from 0.00 (difficult) to 1.00 (easy).
- NECAP needs a range of difficulty, but
- below .30 may be too difficult
- above .80 may be too easy
15 Item Difficulty (constructed-response items)
- Average score on the item.
- Range is from 0.00 to 2.00 or 0.00 to 4.00
- On 2-point items
- below 0.4 may be too difficult
- above 1.6 may be too easy
- On 4-point items
- below 0.8 may be too difficult
- above 3.0 may be too easy
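The two slides above describe the difficulty indices reported for each item: the proportion correct for multiple-choice items and the mean score for constructed-response items, along with the review ranges. The short Python sketch below is not NECAP's analysis code; it only illustrates how those indices and flags could be computed, and the function names and sample data are invented for the example.

    def mc_difficulty(responses):
        # p-value: proportion of students answering the multiple-choice item
        # correctly; `responses` is a list of 0/1 scores (1 = correct).
        return sum(responses) / len(responses)

    def cr_difficulty(scores):
        # Average score on a constructed-response item (0-2 or 0-4 scale).
        return sum(scores) / len(scores)

    def difficulty_flag(value, item_type, max_points=None):
        # Apply the review ranges listed on the slides.
        if item_type == "MC":
            if value < 0.30:
                return "may be too difficult"
            if value > 0.80:
                return "may be too easy"
        elif item_type == "CR" and max_points == 2:
            if value < 0.4:
                return "may be too difficult"
            if value > 1.6:
                return "may be too easy"
        elif item_type == "CR" and max_points == 4:
            if value < 0.8:
                return "may be too difficult"
            if value > 3.0:
                return "may be too easy"
        return "within the expected range"

    # Example: ten students' scores on one multiple-choice item.
    p = mc_difficulty([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])
    print(round(p, 2), difficulty_flag(p, "MC"))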
16 Item Discrimination
- How well an item separates higher-performing students from lower-performing students
- Range is from -1.00 to 1.00
- The higher the discrimination, the better
- Items with discriminations below .20 may not be effective and should be reviewed
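The slide above defines discrimination only as how well an item separates higher-performing from lower-performing students. One common index of this kind is the corrected item-total (point-biserial) correlation; the slides do not say which statistic NECAP actually uses, so the Python sketch below, with made-up data, is only an illustration of the general idea.

    from statistics import mean, pstdev

    def discrimination(item_scores, total_scores):
        # Corrected item-total correlation: the item's own points are removed
        # from each student's total before correlating item score with the rest.
        rest = [total - item for item, total in zip(item_scores, total_scores)]
        m_item, m_rest = mean(item_scores), mean(rest)
        cov = mean((i - m_item) * (r - m_rest) for i, r in zip(item_scores, rest))
        return cov / (pstdev(item_scores) * pstdev(rest))

    # Illustrative data: 0/1 item scores and total test scores for eight students.
    item = [1, 0, 1, 1, 0, 0, 1, 1]
    total = [42, 20, 38, 45, 25, 18, 40, 36]
    r = discrimination(item, total)
    print(f"discrimination = {r:.2f}")
    if r < 0.20:
        print("below .20 -- flag for review")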
17 Other Discrimination Information (multiple-choice items)
18 Differential Item Functioning
- DIF (F-M): females and males who performed the same on the test are compared on their performance on the item
- A positive number reflects females scoring higher
- A negative number reflects males scoring higher
- NS means no significant difference
19 Item Statistics Information
20 Dorans and Holland, 1993
- For CR items, values between -.20 and +.20 represent negligible DIF
- Values greater than +.30 or less than -.30 represent low DIF
- Values greater than +.40 or less than -.40 represent high DIF
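The DIF slides say that females and males with the same total test score are compared on each item, and the Dorans and Holland (1993) ranges above are then applied to the resulting number. One standard way to produce such a number for constructed-response items is a standardized mean difference: average the female-minus-male gap in item score across total-score levels, weighting by the number of females at each level. NECAP's exact procedure is not given on the slides, so the Python sketch below (with invented data) is only an illustration of that general approach.

    from collections import defaultdict

    def dif_f_minus_m(records):
        # `records` holds (group, total_score, item_score) tuples, group in {"F", "M"}.
        # Students are matched on total score; the F-M gap in mean item score is
        # averaged across levels, weighted by the number of females at each level.
        levels = defaultdict(lambda: {"F": [], "M": []})
        for group, total, item in records:
            levels[total][group].append(item)
        diff_sum = weight_sum = 0.0
        for scores in levels.values():
            if scores["F"] and scores["M"]:  # need both groups at this score level
                f_mean = sum(scores["F"]) / len(scores["F"])
                m_mean = sum(scores["M"]) / len(scores["M"])
                weight = len(scores["F"])
                diff_sum += weight * (f_mean - m_mean)
                weight_sum += weight
        return diff_sum / weight_sum if weight_sum else 0.0

    def dif_label(value):
        # Categories for CR items from the slide (Dorans and Holland, 1993);
        # the slide does not label values between .20 and .30.
        magnitude = abs(value)
        if magnitude <= 0.20:
            return "negligible DIF"
        if magnitude >= 0.40:
            return "high DIF"
        if magnitude >= 0.30:
            return "low DIF"
        return "between negligible and low"

    # A positive value means females with the same total score did better on the item.
    data = [("F", 30, 2), ("M", 30, 1), ("F", 30, 2), ("M", 30, 2),
            ("F", 25, 1), ("M", 25, 1)]
    print(dif_label(dif_f_minus_m(data)))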
21 Bias/Sensitivity Review
- How do we ensure that this test works well for students from diverse backgrounds?
- Michael Hock
- Director of Educational Assessment
- Vermont Department of Education
22 What Is Item Bias?
- Bias is the presence of some characteristic of an assessment item that results in the differential performance of two individuals of the same ability but from different student subgroups
- Bias is not the same thing as stereotyping, although we don't want either in NECAP
- We need to ensure that ALL students have an equal opportunity to demonstrate their knowledge and skills
23 How Do We Prevent Item Bias?
- Item Development
- Bias-Sensitivity Review
- Item Review
- Field-Testing Feedback
- Pilot-Testing Data Analysis (DIF)
24 Role of the Bias-Sensitivity Review Committee
The Bias-Sensitivity Review Committee DOES need to make recommendations concerning:
- Sensitivity to different cultures, religions, ethnic and socio-economic groups, and disabilities
- Balance of gender roles
- Use of positive language, situations, and images
- In general, items and text that may elicit strong emotions in specific groups of students and, as a result, may prevent those groups of students from accurately demonstrating their skills and knowledge
25 Role of the Bias-Sensitivity Review Committee
The Bias-Sensitivity Review Committee DOES NOT need to make recommendations concerning:
- Reading Level
- Grade Level Appropriateness
- GLE Alignment
- Instructional Relevance
- Language Structure and Complexity
- Accessibility
- Overall Item Design
26 Passage Review Rating Form
This passage does not raise bias and/or sensitivity concerns that would interfere with the performance of a group of students.
27 Universal Design
Improved Accessibility through Universal Design
28 Universal Design
Improved Accessibility through Universal Design
- Inclusive assessment population
- Precisely defined constructs
- Accessible, non-biased items
- Amenable to accommodations
- Simple, clear, and intuitive instructions and procedures
- Maximum readability and comprehensibility
- Maximum legibility
29 Item Complexity
- How do we control item complexity?
- Ellen Hedlund and Betsy Hyman
- Office of Assessment and Accountability
- Rhode Island Department of Elementary and Secondary Education
30 Depth of Knowledge
- A presentation adapted from Norman Webb for the NECAP Item Review Committee
- March 30, 2005
31 Bloom's Taxonomy
- Knowledge: Recall of specifics and generalizations, of methods and processes, and of pattern, structure, or setting.
- Comprehension: Knows what is being communicated and can use the material or idea without necessarily relating it.
- Application: Use of abstractions in particular and concrete situations.
- Analysis: Make clear the relative hierarchy of ideas in a body of material or make explicit the relations among the ideas, or both.
- Synthesis: Assemble parts into a whole.
- Evaluation: Judgments about the value of material and methods used for particular purposes.
32 U.S. Department of Education Guidelines: Dimensions important for judging the alignment between standards and assessments
- Comprehensiveness: Does the assessment reflect the full range of standards?
- Content and Performance Match: Does the assessment measure what the standards state students should both know and be able to do?
- Emphasis: Does the assessment reflect the same degree of emphasis on the different content standards as is reflected in the standards?
- Depth: Does the assessment reflect the cognitive demand and depth of the standards? Is the assessment as cognitively demanding as the standards?
- Consistency with achievement standards: Does the assessment provide results that reflect the meaning of the different levels of the achievement standards?
- Clarity for users: Is the alignment between the standards and assessments clear to all members of the school community?
33 Mathematical Complexity of Items (NAEP 2005 Framework)
- The demand on thinking that the item requires
- Low Complexity
- Relies heavily on the recall and recognition of previously learned concepts and principles.
- Moderate Complexity
- Involves more flexibility of thinking and choice among alternatives than do items in the low-complexity category.
- High Complexity
- Places heavy demands on students, who must engage in more abstract reasoning, planning, analysis, judgment, and creative thought.
34 Depth of Knowledge (1997)
- Level 1: Recall
- Recall of a fact, information, or procedure.
- Level 2: Skill/Concept
- Use information or conceptual knowledge, two or more steps, etc.
- Level 3: Strategic Thinking
- Requires reasoning, developing a plan or a sequence of steps, some complexity, more than one possible answer.
- Level 4: Extended Thinking
- Requires an investigation, time to think, and processing of multiple conditions of the problem.
35-39 (No transcript available for these slides)
40 Practice Exercise
- Read the passage, "The End of the Storm"
- Read and assign a DOK level to each of the 5 test questions
- Form groups of 4-5 to discuss your work and reach consensus on a DOK level for each test question
41 Issues in Assigning Depth-of-Knowledge Levels
- Variation by grade level
- Complexity vs. difficulty
- Item type (MC, CR, ER)
- Central performance in objective
- Consensus process in training
- Aggregation of DOK coding
- Reliabilities
42 Web Sites
- http://facstaff.wcer.wisc.edu/normw/
- Alignment Tool: http://www.wcer.wisc.edu/WAT/index.aspx
- Survey of the Enacted Curriculum: http://www.SECsurvey.org
43 NECAP Operational Test 2005-06
- What is the development cycle for this year?
- What is your role in all this?
- Tim Kurtz
- Director of Assessment
- New Hampshire Department of Education
44 NECAP Operational Test 2005-06
- 1st Bias Committee meeting: March 8-9
- 18 teachers (6 from each state)
- 1st Item Review Committee meeting: March 30
- 72 teachers (12 from each state in each content area)
- 2nd Item Review Committee meeting: April 27-28
- Practice Test on DoE website: early May
- 2nd Bias Committee meeting: May 3-4
- Face-to-Face meetings: May 25-27 and June 1-3
- Test Form Production and DOE Reviews: July
45 NECAP Operational Test 2005-06
- Printing: August
- Test Administration Workshops: August-September
- Shipments to schools: September 12-16
- Test Administration Window: October 3-21
- 204,000 students and 25,000 teachers from the 3 states
- Scoring: November
- Standard Setting: December
- Teachers and educators from the three states
- Reports shipped to schools: late January
46 TIRC: So, why are we here?
- This assessment has been designed to support a quality program in mathematics and English language arts. It has been grounded in the input of hundreds of NH, RI, and VT educators. Because we intend to release assessment items each year, the development process continues to depend on the experience, professional judgment, and wisdom of classroom teachers from our three states.
47 TIRC: Our role
- We have worked hard to get to this point. Today, you will be looking at passages in reading and some items in mathematics.
- The role of Measured Progress staff is to keep the work moving along productively.
- The role of DoE content specialists is to listen and ask clarifying questions as necessary.
48 TIRC: Your role?
- You are here today to represent your diverse contexts. We hope that you
- share your thoughts vigorously, and listen just as intensely; we have different expertise and we can learn from each other,
- use the pronouns "we" and "us" rather than "they" and "them"; we are all working together to make this the best assessment possible, and
- grow from this experience; I know we will.
- And we hope that today will be the beginning of some new interstate friendships.
49 Information, Questions and Comments
- Tim Kurtz, Director of Assessment
- NH Department of Education
- TKurtz@ed.state.nh.us
- (603) 271-3846
- Mary Ann Snider, Director of Assessment and Accountability
- Rhode Island Department of Elementary and Secondary Education
- MaryAnn.Snider@ride.ri.gov
- (401) 222-4600 ext. 2100
- Michael Hock, Director of Educational Assessment
- Vermont Department of Education
- MichaelHock@education.state.vt.us
- (802) 828-3115