Title: Aligning Science Assessment to Content Standards
 1Aligning Science Assessment to Content Standards 
- George DeBoer, Arhonda Gogos, Cari Herrmann 
Abell, Kristen Lennon, An Michiels, Tom Regan, Jo 
Ellen Roseman, Paula Wilson 
 - Center for Curriculum Materials in Science 
 - Knowledge Sharing Institute 
 - Ann Arbor, Michigan 
 - July 10-12, 2006 
 - This work is funded by the National Science 
Foundation (ESI 0352473) 
 
  2Thanks to
- Abigail Burrows for organizing the pilot testing 
with schools. 
 - Ed Krafsur for developing the assessment 
database. 
 - Brian Sweeney for developing illustrations for 
test items. 
  3Strand 6 Part I
- Examining the Project 2061 Criteria for Aligning 
Middle School Assessment Items to Learning Goals 
  4Aligning Student Assessment to Content Standards
- What We Are Doing: Project Background 
 - Creating a bank of middle and early high school 
science assessment items that are precisely 
aligned with national content standards 
 - Providing resources to support the creation and 
use of assessment items aligned to content 
standards 
 - Developing a database for these resources and a 
user interface to access the resources 
 - In this session, we will focus on the criteria we 
use for judging alignment of assessment items to 
content standards. 
  5Resources We Will Provide
- Clarifications of the content standards 
(elaboration and boundary setting, i.e., what's in 
and what's out). These add precision to the 
alignment of assessment items. 
 - Summaries of research on student learning 
(misconceptions and other ideas students hold) 
related to the ideas in the content standards. These 
serve as distractors in assessment items. 
 - Assessment maps (which include prerequisite 
ideas, related ideas, and ideas that come later in 
the learning trajectory). These are useful for 
developing test instruments on a specific topic and, 
in item development, for deciding what knowledge 
is reasonable to expect students to have (e.g., 
bedrock). 
  6List of Topics
- Atoms, Molecules and States of Matter 
 - Substances, Chemical Reactions and Conservation 
 - Processes that shape the Earth / Plate Tectonics 
 - Weather and Climate 
 - Solar System 
 - Energy Transformations 
 - Force and Motion 
 - Forces of Nature 
 - Sight and Vision 
 - Mathematics: Summarizing Data 
 - Mathematics: Relationships among Variables 
 
  7List of Topics, Continued
- Basic Functions in Humans 
 - Cells and Proteins 
 - Evolution and Natural Selection 
 - Interdependence, Diversity and Survival 
 - Matter and Energy Transformations in Living 
Systems 
 - Sexual Reproduction, Genes and Heredity 
 - Cross-cutting Themes: Models 
 - Nature of Science: Claims of Causal Relationships 
 - Nature of Science: Inductive Reasoning 
 - Nature of Science: Empirical Validation of Ideas 
about the World 
 - Nature of Science: Uncertainty and Durability
 
  8Examples of
- Clarification statements 
 - Summaries of research on student learning 
 - Assessment maps 
 - How each is used in the item development work.
 
  9Idea B: All atoms are extremely small (from BSL 
4D/M1a).
- Students are expected to know that atoms are much 
smaller than very small items with which they are 
familiar, such as dust, blood cells, plant cells, 
and microorganisms, all of which are made up of 
atoms. Students should know that the atoms are 
so small that many millions of them make up these 
small items with which they are familiar. They 
should know that this is true for all atoms. The 
comparison with very small objects can be used to 
test students' qualitative understanding of the 
size of atoms in relation to these objects. 
Students will not, however, be expected to know 
the actual size of atoms. 
  10Student Misconceptions Related to the Size of 
Atoms
- Atoms and/or molecules are similar in size to 
cells, dust, or bacteria (Lee et al., 1993; 
Nakhleh et al., 1999; Nakhleh et al., 2005). 
 - Atoms and/or molecules can be seen with 
magnifying lenses or optical microscopes 
(Griffiths et al., 1992; Lee et al., 1993). 
 12Steps in the Item Development Procedure
- Select a set of benchmarks and standards to 
define the boundaries of a topic 
 - Tease apart the benchmarks and standards into a 
set of key ideas 
 - Create an assessment map showing how the key 
ideas build on each other conceptually 
 - Review the research on student learning to 
identify ideas students may have about the content 
 - Design items 
   - using student misconceptions as distractors 
   - using the assessment analysis criteria 
   - following a list of design specifications 
 
  13Steps in the Item Development Procedure, cont.
- Use open-ended interviewing to supplement 
published research on student learning 
 - Use mini item camps to get feedback on items 
from staff 
 - Revise items 
 - Pilot test items and conduct think-aloud 
interviews 
 - Analyze pilot test data 
 - Revise items 
 - Conduct formal reviews of approximately 25 items 
using the assessment analysis criteria 
 - Revise items 
 - Conduct national field test of items
 
  14Demonstration of the Database and User Interface
- Items 
 - Misconception List 
 - Topics, key ideas, clarifications 
 - Assessment Maps 
 - Item Specifications 
 
  15- The Project 2061 Assessment Analysis Procedure
 
  16There are six parts to the analysis procedure
- Exploring the Learning Goal 
 - Determining Content Alignment 
 - Determining Whether the Task Accurately Reveals 
What Students Do or Do Not Know 
 - Considering the Task's Cost Effectiveness 
 - Suggesting Revisions 
 - Assessment Item Rating Form (not included in this 
version) 
  17Reviewers use the following materials
- Assessment Items 
 - The content standard that is being targeted 
 - Clarification statements 
 - Lists of common student misconceptions and other 
ideas students may have 
 - Results of student interviews or field tests, 
if available 
  18I. Exploration Phase
- Determining the alignment of an assessment task 
to a learning goal requires a precise 
understanding of the meaning of the learning goal 
and what knowledge and skills are needed to 
successfully complete the task.  
  19A. The Learning Goal
- Reviewers carefully read the clarification 
statement written for the targeted learning goal 
(content standard or benchmark). 
 - Reviewers examine the list of misconceptions 
related to the targeted learning goal. 
  20B. The Assessment Task
- Reviewers 
   - attempt to complete the task themselves. 
   - list the knowledge and skills needed to 
successfully complete the task. 
   - consider whether different strategies 
can be used to successfully complete the task. 
   - consider which misconceptions might affect 
student answers. 
  21- II. Determining the Content Alignment between the 
Learning Goal and the Assessment Task  
  22A. Necessity
- To be content aligned, knowledge of the ideas 
described in the learning goal or the 
clarification statement, or knowledge that 
certain commonly held misconceptions are not 
true, must be needed to evaluate each of the 
answer choices. 
  23Reviewers are told
- If the knowledge in the learning goal is not 
needed to decide if the answer choices are 
correct or incorrect, explain how the answer 
choices can be evaluated using other knowledge. 
  24Applying the Necessity Criterion
- Which of the following is the smallest? 
 - A.  An atom 
 - B.  A bacterium 
 - C.  The width of a hair 
 - D.  A cell in your body
 
  25Idea B: All atoms are extremely small (from BSL 
4D/M1a).
- Students are expected to know that atoms are much 
smaller than very small items with which they are 
familiar, such as dust, blood cells, plant cells, 
and microorganisms, all of which are made up of 
atoms. Students should know that the atoms are 
so small that many millions of them make up these 
small items with which they are familiar. They 
should know that this is true for all atoms. The 
comparison with very small objects can be used to 
test students' qualitative understanding of the 
size of atoms in relation to these objects. 
Students will not, however, be expected to know 
the actual size of atoms. 
  26Applying the Necessity Criterion
- The knowledge in the learning goal is needed to 
evaluate each answer choice. 
  27An example of an item for which the targeted 
knowledge is not needed
- Targeted Idea: Substances may react chemically 
in characteristic ways with other substances to 
form new substances with different characteristic 
properties (based on NSES 5-8BA2a). 
  28- Which of the following is an example of a 
chemical reaction? 
 - A.  A piece of metal hammered into a tree. 
 - B.  A pot of water being heated and the water 
evaporates. 
 - C.  A spoonful of salt dissolving in a glass of 
water. 
 - D.  An iron railing developing an orange, powdery 
surface after standing in air. 
  29Applying the Necessity Criterion
- The knowledge in the learning goal is not needed. 
 - Answer choice D, the correct answer, is a 
specific instance of a general principle (SIGP). 
The student can get the item correct by knowing 
that rusting is a chemical reaction without 
knowing the general principle that new substances 
are formed that have different characteristic 
properties.  
  30B. Sufficiency
- To be content aligned, knowledge of the ideas 
described in the learning goal or the 
clarification statement, or knowledge that 
certain commonly held misconceptions are not 
true, must be all that is needed to evaluate 
each of the answer choices. Students should not 
need any additional science knowledge.  
  31Reviewers are told
- If the knowledge in the learning goal is not 
enough to evaluate each of the answer choices, 
indicate what additional knowledge is needed. 
(Do not include as additional knowledge those 
things that can be assumed as general knowledge 
and ability of students this age.) 
 - An example of additional knowledge might include 
science or mathematics terminology that students 
are not expected to know. 
  32Applying the Sufficiency Criterion
- Which of the following is the smallest? 
 - A.  An atom 
 - B.  A bacterium (clarification statement says 
microorganism) 
 - C.  The width of a hair 
 - D.  A cell in your body 
 
  33Applying the Sufficiency Criterion
- The sufficiency criterion is not met. Students 
need to know the term "bacterium," which is 
additional knowledge. Although a listed 
misconception includes the word "bacteria," in 
pilot testing, 25 of 193 students indicated that 
they did not know what a bacterium was (even 
though most knew what bacteria were). The item 
should say "microorganism" or "bacteria" to match 
the clarification statement and/or misconception 
list. 
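The two content-alignment judgments for this item can be recorded in a small structure. The following is a minimal sketch, not part of the Project 2061 procedure itself: the class name `AlignmentReview` and its fields are hypothetical, and the recorded values simply restate the reviewers' conclusions above (necessity met for every answer choice, sufficiency not met because of the term "bacterium").

```python
from dataclasses import dataclass, field


@dataclass
class AlignmentReview:
    """Hypothetical record of one reviewer's content-alignment judgments."""

    # True where evaluating that answer choice requires the targeted knowledge.
    targeted_knowledge_needed: dict
    # Science knowledge or terminology beyond the learning goal that students need.
    additional_knowledge: list = field(default_factory=list)

    @property
    def necessity_met(self):
        # Necessity: the targeted knowledge is needed for every answer choice.
        return all(self.targeted_knowledge_needed.values())

    @property
    def sufficiency_met(self):
        # Sufficiency: nothing beyond the targeted knowledge is required.
        return not self.additional_knowledge

    @property
    def content_aligned(self):
        return self.necessity_met and self.sufficiency_met


# Judgments for the "Which of the following is the smallest?" item.
smallest_item = AlignmentReview(
    targeted_knowledge_needed={"A": True, "B": True, "C": True, "D": True},
    additional_knowledge=['the term "bacterium"'],
)
print(smallest_item.necessity_met, smallest_item.sufficiency_met,
      smallest_item.content_aligned)
```

Listing the additional knowledge explicitly, rather than storing a bare yes/no, mirrors the instruction to reviewers to indicate what additional knowledge is needed.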
  34Applying the Sufficiency Criterion
- Approximately how many carbon atoms placed next 
to each other would it take to make a line that 
would cross this dot? • 
 - A.  6 
 - B.  600 
 - C.  6,000 
 - D.  6,000,000 
 - Note: This item assumes a 1 mm dot and a diameter 
of 1.5 Å for a carbon atom. 
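The assumptions in the note make the keyed answer easy to check. This is a minimal sketch that uses only the 1 mm dot and the 1.5 Å diameter stated in the note; the function and variable names are illustrative and do not come from the item.

```python
import math

ANGSTROM_IN_METERS = 1e-10
MILLIMETER_IN_METERS = 1e-3


def atoms_across(width_m, atom_diameter_m):
    """Number of atoms, laid side by side, needed to span the given width."""
    return width_m / atom_diameter_m


dot_width = 1 * MILLIMETER_IN_METERS        # 1 mm dot, from the item note
carbon_diameter = 1.5 * ANGSTROM_IN_METERS  # 1.5 angstroms, from the item note

count = atoms_across(dot_width, carbon_diameter)

# The answer choices differ by orders of magnitude, so compare on a log scale.
choices = {"A": 6, "B": 600, "C": 6_000, "D": 6_000_000}
closest = min(choices, key=lambda k: abs(math.log10(choices[k]) - math.log10(count)))

print(f"about {count:.2e} atoms; closest answer choice: {closest}")
```

The result is roughly 6.7 million atoms, so D is the nearest choice and the other three are off by at least three orders of magnitude.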
  35Applying the Sufficiency Criterion
- The sufficiency criterion is met. Students need 
to know that, like the other small things 
mentioned in the clarification statement (e.g., 
dust, plant cells, blood cells, and 
microorganisms), this small visible dot is also 
made of millions of atoms. 
 - Note: This item assumes a 1 mm dot and a diameter 
of 1.5 Å for a carbon atom. 
  36Idea B: All atoms are extremely small (from BSL 
4D/M1a). (Not included in the workshop packet.)
- Students are expected to know that atoms are much 
smaller than very small items with which they are 
familiar, such as dust, blood cells, plant cells, 
and microorganisms, all of which are made up of 
atoms. Students should know that the atoms are 
so small that many millions of them make up these 
small items with which they are familiar. They 
should know that this is true for all atoms. The 
comparison with very small objects can be used to 
test students' qualitative understanding of the 
size of atoms in relation to these objects. 
Students will not, however, be expected to know 
the actual size of atoms nor the 
order-of-magnitude relationships to other 
objects. 
  37III. Determining Whether the Task Accurately 
Reveals What Students Do and Do Not Know
- It's a validity issue. Students should choose 
the correct answer when they know the idea, and 
they should choose an incorrect answer when they 
do not know the idea. 
 - Getting rid of factors not related to the 
knowledge being measured (construct-irrelevant 
factors) 
 - Reducing false negatives and false positives 
 
  38A. Comprehensibility
- 1. It is not clear what question is being 
asked. Explain. 
 - 2. The task uses unfamiliar general vocabulary 
that is not clearly defined. List potentially 
unfamiliar vocabulary and explain. (Note: This 
refers to general language usage, not 
technical scientific or mathematical terminology, 
which is addressed under Sufficiency.) 
 - 3. The task uses unnecessarily complex sentence 
structure or ambiguous punctuation that makes the 
task difficult to comprehend when plain language 
could have been used. Explain. 
 - (Note: Rebecca Kopriva, C-SAVE, Maryland.)
 
  39Comprehensibility Continued
- 4. The task uses words and phrases that have 
unclear, confusing, or ambiguous meanings. This 
may include commonly used words that have special 
meaning in the context of science. For example, 
the word "finding" could be unfamiliar to 
students when referring to a scientific 
"finding." Note all places where words (both 
general and scientific) do not have clear and 
straightforward meanings. 
 - 5. There is inaccurate information (including what 
is in the diagrams and data tables) that may be 
confusing to students who have a correct 
understanding of the science. Explain. 
 - 6. The diagrams, graphs, and data tables may not be 
clear or comprehensible. (For example, they may 
include extraneous information, inaccurate or 
incomplete labeling, inappropriate size or 
relative size of objects, etc.) Explain. 
 - 7. Other. Provide a brief explanation. 
 
  40Comprehensibility
- An item with comprehensibility issues.
 
  41Most sidewalks made out of concrete have cracks 
every few yards as shown in the diagram below.  
These are called expansion joints as labeled in 
the diagram below.  What happens to the width of 
the cracks during a hot day in the summer and 
why?
- A.  The cracks get wider because the concrete 
shrinks. 
 - B.  The cracks get wider because the concrete 
gets softer. 
 - C.  The cracks get narrower because the concrete 
expands. 
 - D.  The cracks get narrower because the ground 
underneath the sidewalk shrinks. 
  42Most sidewalks made out of solid concrete have 
spaces between the sections as shown in the 
diagram below.  What happens to the width of the 
spaces during a hot day in the summer and why?  
- A.  The spaces get wider because the concrete 
shrinks. 
 - B.  The spaces get narrower because the concrete 
expands. 
 - C.  The spaces stay the same because the 
concrete does not shrink or expand. 
 - D.  Some spaces get narrower and some get wider 
because some concrete expands and some concrete 
shrinks. 
 43B. Appropriateness of Task Context 
- a. The context may be unfamiliar to most 
students. Explain. 
 - b. The context may advantage or disadvantage one 
group of students because of their interest or 
familiarity with the context. Explain. 
 - c. The context is complicated and not easy to 
understand, so students might have to spend a 
lot of time trying to figure out what the context 
means. Explain. 
  44Appropriateness of Task Context, Continued
- d. The information and quantities that are used are 
not reasonable or believable. Explain. 
 - e. The context does not accurately represent 
scientific or mathematical realities or, if 
idealizations are involved, it is not made clear 
to students that it is an idealized situation. 
Explain. 
 - f. Other. Explain. 
 
  45C. Resistance to Test-Wiseness 
- 1. Some of the distractors are not plausible. 
Explain. 
 - 2. One of the answer choices differs in length 
or contains a different amount of detail from the 
other answer choices. Explain. 
 - 3. One of the answer choices is qualified 
differently from the other answer choices, using 
words such as "usually" or "sometimes," or an 
answer choice uses different units of 
measurement. Explain. 
 - 4. The use of logical opposites may lead 
students to eliminate answer choices. Explain. 
  46Resistance to Test-Wiseness, Continued
- 5. One of the answer choices contains vocabulary at 
a different level of difficulty from the other 
answer choices that may make it sound more 
scientific. Explain. 
 - 6. The language in one of the answer choices 
mirrors the language in the stem. Explain. 
 - 7. There are other test-taking strategies that 
may be used in responding to this task. Explain. 
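Two of the test-wiseness checks, answer-choice length and qualifier words, are mechanical enough to pre-screen automatically before a human review. The sketch below is only an illustration of that idea, not the reviewers' method: the function name, the `length_ratio` threshold of 1.5, and the example answer choices are all hypothetical, and the qualifier list contains only the two words the criteria name.

```python
from statistics import median

# The two qualifier words given as examples in the criteria.
QUALIFIERS = {"usually", "sometimes"}


def _words(text):
    """Lowercased words of an answer choice with surrounding punctuation removed."""
    return [w.strip(".,;:!?\"'()").lower() for w in text.split()]


def flag_test_wiseness(choices, length_ratio=1.5):
    """Return {label: [reasons]} for answer choices that stand out from the rest.

    length_ratio is an arbitrary illustrative threshold, not a Project 2061 value.
    """
    counts = {label: len(_words(text)) for label, text in choices.items()}
    typical = median(counts.values())
    flags = {}
    for label, count in counts.items():
        # Flag a choice that is much longer or much shorter than the median.
        if count > typical * length_ratio or count * length_ratio < typical:
            flags.setdefault(label, []).append("length differs")
    # Flag a choice only when it is the single one carrying a qualifier.
    qualified = [label for label, text in choices.items()
                 if QUALIFIERS & set(_words(text))]
    if len(qualified) == 1:
        flags.setdefault(qualified[0], []).append("qualified differently")
    return flags


# Hypothetical answer choices, written only to exercise the two checks.
example = {
    "A": "It melts.",
    "B": "It freezes.",
    "C": "It usually evaporates slowly because the surrounding air is warm.",
    "D": "It burns.",
}
flags = flag_test_wiseness(example)
print(flags)
```

A screen like this can only point reviewers at candidates. Plausibility, logical opposites, and vocabulary level still require the reviewer's judgment and the "Explain" step.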
  47An item with test-wiseness issues
- This item is targeted to Idea A from Matter and 
Energy Transformations in Living Systems: Food is a 
source of molecules that serve as 
fuel and building material for all organisms. 
 - Is the oxygen that animals breathe a kind of 
food? 
 - A.  Yes, because oxygen enters the body. (M-A2) 
 - B.  Yes, because all animals need oxygen to survive. 
(M-A3) 
 - C.  No, because animals do not get energy from 
oxygen. (From clarification of Idea A.) 
 - D.  No, because oxygen can enter an animal's body 
through its nose. (M-A1, M-A2) 
  48Misconceptions and Other Ideas Students May Have: 
Matter and Energy Transformations, Idea A
- Many children associate the word "food" with what 
they identify as being edible (Driver, 1984; 
Driver, Squires, Rushworth, & Wood-Robinson, 
1994; Lee & Diong, 1999). 
 - Students see food as substances (water, air, 
minerals, etc.) that organisms take directly in 
from their environment (Anderson, Sheldon, & 
Dubay, 1990; Simpson & Arnold, 1982). 
 - Some students think that food is what is needed 
to keep animals and plants alive (Driver et al., 
1994). 
  49Analyzing test-wiseness issues
- Conclusion: Answer choice D ("No, because oxygen 
can enter an animal's body through its nose") is 
not a plausible explanation for why oxygen is not 
food. The answer choice is likely to be 
eliminated because of its implausibility, which 
is one of the factors (C1) used in assessing 
test-wiseness. (In pilot testing, 5 of 29 
students selected this, thinking that the point 
of entry is what determines if something is food. 
Many others questioned how the nose is relevant 
in a question about food.) 
 - The answer choice could be improved by changing 
it to say that oxygen is not food because it is 
not edible (M-A1) or because it does not enter 
through an animal's mouth. 
  50IV. Considering the Task's Cost Effectiveness
- Does the task require an inordinate amount of 
time to complete? Ask whether the time needed 
for students to read the question, make 
calculations, interpret a data table, or read a 
graph is warranted. Provide a brief explanation 
of why the task is not cost effective and how the 
same information might be elicited more 
efficiently. 
 
  51V. Suggesting Revisions
- Based on your analysis of the task, make your 
suggested revisions or indicate if you think the 
task should be eliminated from consideration. 
  52Begin Content-Focused Activities 
  55Strand 6 Part II
- Using Student Data to Inform the Design of 
Assessment Items in Middle School Science  
  56Steps in the Item Development Process
- Select a set of benchmarks and standards to 
define the boundaries of a topic 
 - Tease apart the benchmarks and standards into a 
set of key ideas 
 - Create an assessment map showing how the key 
ideas build on each other conceptually 
 - Review the research on student learning to 
identify ideas students may have about the 
content 
 - Design items 
   - using student misconceptions as distractors 
   - following the assessment analysis criteria 
   - following a list of design specifications 
 
  57Steps in the Item Development Process, cont.
- Use open-ended interviewing to supplement 
published research on student learning 
 - Use mini item camps to get feedback on items 
from staff 
 - Revise items 
 - Pilot test items and conduct think-aloud 
interviews 
 - Analyze pilot test data 
 - Revise items 
 - Conduct formal reviews of approximately 25 items 
using the assessment analysis criteria 
 - Revise items 
 - Conduct national field test of items
 
  58Using Pilot Testing and Think Aloud Interviews
- We use pilot testing and interviewing to probe 
student thinking about the targeted ideas and the 
test items. 
 - We compare student answer choices to their 
explanations. 
 - When answer selections and explanations don't 
match, we look for problems with the item that 
could produce these mismatches. 
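The comparison of answer choices with explanations can be sketched as a simple tally. This is an illustrative sketch under stated assumptions, not the project's analysis code: the `Response` record, its `explains_target_idea` field (a reviewer's coding of whether the explanation expresses the targeted idea), and the category labels are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Response:
    """Hypothetical record of one student's response to one item."""
    selected: str               # answer choice the student circled, e.g. "D"
    explains_target_idea: bool  # reviewer's coding of the student's explanation


def classify(response, correct_choice):
    """Label a response by whether selection and explanation agree."""
    chose_correct = response.selected == correct_choice
    if chose_correct and not response.explains_target_idea:
        return "possible false positive"
    if not chose_correct and response.explains_target_idea:
        return "possible false negative"
    return "consistent"


def summarize(responses, correct_choice):
    """Count responses in each category for one item."""
    return Counter(classify(r, correct_choice) for r in responses)


# Made-up responses to an item whose correct answer is D.
sample = [
    Response("D", True),
    Response("D", False),  # right answer, explanation misses the idea
    Response("A", True),   # wrong answer, explanation shows the idea
    Response("B", False),
]
summary = summarize(sample, correct_choice="D")
print(summary)
```

A high count in either mismatch category signals that the item, rather than the student, may be the source of the disagreement and should be examined for problems.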
  59Interviewing Snapshot for 2005 and 2006
- 7 schools (urban, suburban); 200 interviews 
 - Free and reduced lunch ranged from 2% to 78% 
 - Some think-aloud, some open-ended 
 - Open-ended interviews were used to inform item 
development. Student comments helped in the 
writing of distractors. 
 - All interviews were done by the item writers.
 
  60Think-Aloud Interview Procedure
- Please read the question aloud, think about the 
answer choices, and circle the best one. Feel 
free to write down anything on the test paper 
that helps you to answer the question. 
 - Could you tell me in your own words what the 
question is asking? 
 - Why did you choose the answer you chose? 
 - Were there other answer choices that you almost 
chose? (Why?) 
 - Continued 
 
  61- Were there any answer choices that you did not 
even consider? (Why?) 
 - Was there an answer choice you were expecting to 
see but did not? What was it? 
 - Were there any words or diagrams you did not 
really understand or situations that made the 
question confusing? Do you think anything would 
be confusing to your classmates? 
 - Are you familiar with the situation that is 
presented in the question? 
 - Where did you learn about the topic in this 
question? Have you seen a question like this 
before? 
  62Getting permission to conduct interviews
- We inform the school administrators that 
   - The students' responses will be used only to 
judge the quality of the test questions and will 
NOT be used as a measure of students' knowledge 
or ability, instructional quality, or the quality 
of the school. 
   - The students are coded to protect their identity. 
   - The parents are asked to sign a permission 
letter. 
 - Some school districts require Institutional 
Review Board (IRB) approval. 
  63We provide incentives
- The revised versions of the items are made 
available to the teachers and administrators. 
 - We provide a report on what we learned regarding 
student knowledge of the targeted ideas and 
misconceptions students may have. 
 - We offer a workshop on developing assessment 
items aligned to content standards to 
volunteering teachers and/or participating 
schools. 
 - As a token of our appreciation, students receive 
a gift certificate to Borders bookstore for each 
interview. 
  64Limitations
- Considerable time requirement 
 - Small student sample 
 - Hard to get access to students
 
  65Piloting snapshot
- Total of 112 classrooms across 5 content areas. 
 - Atoms and Molecules: 726 students 
 - Force and Motion: 610 students 
 - Flow of Matter and Energy: 312 students 
 - Plate Tectonics: 568 students 
 - Control of Variables: 462 students
 
  66Pilot Test Schools District-level Demographics
- 1. Northeast Suburban/Small Town. Middle School and 
High School. 40% White, 48% African American, 8% Hispanic; 
25% Free and Reduced Lunch. 
 - 2. Northeast Suburban. Middle School. 95% 
White; 10% Free and Reduced Lunch. 
 - 3. Northeast Rural (K-8). 98% White; 49% Free 
and Reduced Lunch. 
 - 4. Southern Small Town. Middle School (6-8). 70% 
White, 24% African American; 33% Economically 
Disadvantaged. 
 - 5. Southwest Small Town. Middle School (7-8). 
95% Hispanic; 95% Free and Reduced Lunch. 
 
  67Teacher Feedback Questionnaire
- Does the class have a special designation (e.g., 
honors, AP, ELL, special needs, etc.)? Please 
describe. 
 - Please note the approximate number of students in 
this class with Individualized Education Plans 
(IEPs). 
 - Approximately how much exposure have your 
students had to the topics that these assessment 
items test? 
 - How long did it take to administer the test? 
 - Was it difficult for the students to understand 
the instructions? Please comment on any 
difficulties they had. 
 - Please add any comments or suggestions you may 
have. 
  68Pilot-test questions
- 1. Is there anything about this test question that 
was confusing? Explain. 
 - 2. Circle any words on the test question you don't 
understand or aren't familiar with. 
 - 3. Is answer choice A correct? Yes / No / Not Sure 
 - 4. Is answer choice B correct? Yes / No / Not Sure 
 - 5. Is answer choice C correct? Yes / No / Not Sure 
 - 6. Is answer choice D correct? Yes / No / Not Sure 
 - For items 3-6, students are asked to explain why 
an answer choice is correct or not. 
  69Pilot-Test Questions, Continued
- Did you guess when you answered the test 
question? Yes / No 
 - Please suggest additional answer choices that 
could be used. 
 - Was the picture or graph helpful? If there was no 
picture or graph, would you like to see one? 
 - Have you studied this topic in school? 
Yes / No / Not Sure 
 - Have you learned about it somewhere else 
(TV, museum visit, etc.)? Yes / No / Not Sure. Where? 
 
  70Results of Teacher Feedback
- The test took 45 minutes to an hour to complete on 
average. 
 - Students sometimes had difficulty providing an 
explanation for each answer choice, both cognitively 
and motivationally. They were not used to doing that. 
 - Only a very small number of students did not take 
the task seriously, for a variety of reasons (end 
of the year, not graded, etc.). Most were very 
cooperative. 
 - Students with learning disabilities expressed 
more difficulty. 
 - The unfamiliar format was a challenge to some. 
 - Teachers appreciated the depth of understanding 
that was expected. 
  71Examples
- What we learn from pilot testing
 
  72Targeted Idea: Substances may react chemically 
in characteristic ways with other substances to 
form new substances with different characteristic 
properties (based on NSES 5-8BA2a).
- Which of the following is an example of a 
chemical reaction? 
 - A.  A piece of metal hammered into a tree. 
 - B.  A pot of water being heated and the water 
evaporates. 
 - C.  A spoonful of salt dissolving in a glass of 
water. 
 - D.  An iron railing developing an orange, powdery 
surface after standing in air. 
  73Students who Selected Each Answer Choice 
 74Results of piloting
- Only 5 of the 43 students who chose the correct 
answer D said that a new substance formed. 
Approximately half of the 43 students who chose D 
said they recognized it as an example of rusting 
or oxidation. These students may know that 
rusting is a chemical reaction that produces new 
substances with different properties, but they 
may instead know rusting only as a specific instance 
of a chemical reaction without knowing that 
chemical reactions involve the formation of a new 
substance. 
 - None of the students chose answer choice A, 
suggesting that hammering a piece of metal into a 
tree is not a plausible answer choice. Similar 
results were found during interviews. 
 - A significant number of students (42.1%) chose 
either B or C. This supports other research 
showing that students hold the idea that phase 
change and/or dissolving are chemical reactions. 
  75Suggested revisions
- Replace A with a more plausible distractor, such 
as "Sand being removed from sea water by 
filtration." 
 - Replace D with a reaction that students are not 
so familiar with, for example, "a white solid 
forming when two clear liquids are mixed 
together." 
  76Targeted Idea
- Organisms use molecules from food to make complex 
molecules that become part of their body 
structures.  
  77When a baby chick develops inside an egg, the 
yolk in the egg is its only source of food. As 
the chick grows, the yolk becomes smaller.  Why 
does the yolk become smaller?
- A. The yolk enters the chick, but none of the 
yolk becomes part of the chick. 
 - B. The yolk is broken down into simpler 
substances, some of which become part of the 
chick. 
 - C. The yolk is completely turned into energy for 
the chick. 
 - D. The yolk gets smaller to make room for the 
growing chick. 
  78Students who Selected Each Answer Choice 
 79Results of piloting
- 6 students commented that they did not understand 
the phrase "simpler substance" in answer choice 
B. 
 - Only 8 of the 16 students who chose the correct 
answer B explained that yolk is broken down to 
provide building material that becomes 
incorporated into the body of the chick. The rest 
of the students indicated that the yolk is needed 
for the chick to "grow" or to "become bigger." It 
is not clear that these students understand the 
idea that is being assessed, i.e., that food is 
broken down into smaller molecules that provide 
building material for the chick, which become 
part of the body structures of the chick. 
 - One of the students who selected answer choice A 
commented, "Just like humans, pieces of food 
do not become part of us." This student might 
have a correct molecular understanding of how 
food is made part of body structures but got the 
question wrong because of the student's focus on 
the yolk as being broken down into "pieces of 
food." 
  80Suggested revisions
- Change answer choice A to read "The yolk is 
broken down into simpler molecules but none of 
the atoms of these simpler molecules become part 
of the chick." 
 - Change answer choice B to read "The yolk is 
broken down into simpler molecules that are used 
to make the body structures of the chick." 
  81The expansion of alcohol in a thermometer
AM42-4: The level of colored alcohol in a 
thermometer rises when the thermometer is placed 
in hot water. Why does the level of alcohol 
rise? 
- A.  The heat molecules push the alcohol molecules 
upward. 
 - B.  The alcohol molecules break down into atoms 
which take up more space. 
 - C.  The alcohol molecules get farther apart so the 
alcohol takes up more space. 
 - D.  The water molecules are pushed into the 
thermometer and are added to the alcohol 
molecules. 
 82Student data from pilot testing 
 83Student Responses
- 87 students from grades 7-9 at 3 different 
schools 
 - 6 students not familiar with alcohol / colored 
alcohol (7%) 
 - 44 chose answer choice A (plausible distractor) 
 - 6 students wrote "heat rises" as their 
explanation for A. 
 - 12 students may have the "heat molecules" 
misconception. 
 - Answer choice A is the only one that has the word 
"heat" in it. (Perhaps add "as it is heated" to 
the end of one or more answer choices.) 
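The figure 7 given for students unfamiliar with the term is consistent with 6 of the 87 students expressed as a rounded percentage. A minimal check, with a helper name (`percent`) that is only illustrative:

```python
def percent(part, whole, digits=0):
    """Share of `whole` represented by `part`, as a rounded percentage."""
    return round(100 * part / whole, digits)


# 6 of the 87 students were not familiar with alcohol / colored alcohol.
unfamiliar_with_alcohol = percent(6, 87)
print(f"{unfamiliar_with_alcohol:.0f}% of students")
```

Reporting the count alongside the percentage, as the slide does, keeps small-sample results from looking more precise than they are.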
  84Sample student responses
- Answer choice A 
   - "No, because heat molecules can't push alcohol 
molecules because alcohol molecules are denser." 
   - "Yes, I remember learning about heat molecules 
and knew they bump other molecules upward." 
   - "Yes, makes sense heat rises." 
   - "Yes, because heat rises and it is being heated." 
 - Answer choice B 
   - No: "The molecules don't break down they stay the 
same" 
 - Answer choice C 
   - Yes: "The space between molecules expands with 
the increase in temperature." 
 - Answer choice D 
   - No: "Because there is no way that the water can 
get pushed into the thermometer." 
   - No: "Because how could water get through a glass, 
a solid glass." 
  85- Examples from plate tectonics of 
   - Determining appropriateness of terms used in 
assessment items 
   - Identifying misconceptions 
   - Identifying implausible ideas for distractors
 
  86Key Idea a: The solid crust of the earth,
including both the continents and the ocean
basins, consists of separate plates.
- Students are expected to know that the rigid, 
outer layer of the earth is made of separate 
sections that are called plates and that the 
plates fit together so that the edge of one plate 
directly touches an adjacent plate with no gaps 
between them. They should know that plates are 
made of solid rock. Students should know that 
each of the major plates encompasses very large 
areas of the earth's surface (e.g., an entire
continent plus adjoining ocean floor or a large 
part of an entire ocean basin) and that the 
boundaries of continents and oceans are not the 
same as the boundaries of plates.  
  87 1. Determining appropriateness of terminology in
items
- Two items were piloted to test student
knowledge of the term "bedrock" (after typical
instruction, i.e., instruction not necessarily
targeted to the meaning of the word) to determine
whether the word should be used in assessment and
thus be part of a clarification statement.
 - The two items are identical except that one uses
the term "bedrock" and the other uses the
descriptive phrase "solid rock."
 - These items were piloted at two different middle
schools in two eastern states at grades 7 and 8.
Interviews of ten 9th graders in a
third school in a western state where bedrock is
readily visible are consistent with these
findings, but are not presented here.
  88Which of the following are part of earth's
plates?
 A. Solid rock of continents but not solid rock of
ocean floors.
 B. Solid rock of ocean floors but not solid rock
of continents.
 C. Solid rock of both the ocean floors and the
continents.
 D. Solid rock of neither the ocean floors nor the
continents.
- Number of students: 33 (3 classes, two 7th
grade and one 8th grade)
  89Student data from pilot testing (solid rock) 
 90Which of the following are part of earth's
plates?
 A. Bedrock of continents but not bedrock of ocean
floors.
 B. Bedrock of ocean floors but not bedrock of
continents.
 C. Bedrock of the ocean floors and the continents.
 D. Bedrock of neither ocean floors nor
continents.
- Number of students: 34 (3 classes, one 7th
grade and two 8th grade)
  91Student data from pilot testing (bedrock) 
 92Student answers to Bonus Question: What is
bedrock?
- Twenty-one of 34 students responded that they did
not know.
 - Students who attempted to define the term said:
   - The bed of rocks on the ocean floor
   - The bottom layer of a rock
   - Like the ocean floor
   - The bare rock under dirt and sand
   - The deep rock of the crust
   - Bedrock is rock that is in the ground
   - A type of layering of loose pebbles that have
been fused together
   - Rocks and sediments that are on the bottom of
the continent or ocean
   - Rocks on the bottom of the ocean
   - Rock Maybe
   - It is the rock that is on the bottom of an ocean
plate
  93- Analysis
 - There is a greater number of "unsure" responses
when "bedrock" is used. The item using "bedrock"
has 12 to 15 responses of "unsure" to each answer
choice, while the item using "solid rock" has 4
"unsure" responses to each of the answer choices.
Uncertainty about the meaning of the term could
interfere with student thinking about the idea
being tested.
 - Thirty-two of the thirty-four students wrote
responses indicating that they do not know what
bedrock is. Despite this lack of understanding of
the term, 50% of the students were able to
correctly answer this item, compared to 57.6% of
students answering the item using "solid rock."
Students are apparently translating "bedrock" to
mean "rock" without knowing for sure what it is.
 - For now, we have decided not to include the term
"bedrock" in the clarification of this idea (even
though the word is used in a grade 3-5 benchmark)
and not to use it for assessment purposes.
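As a rough arithmetic check on the two percent-correct figures in the analysis (50 for the bedrock item, 57.6 for the solid rock item), the sketch below converts each percentage back to a whole number of students. The resulting counts are not reported on the slide; they are inferred by rounding, and the function name `implied_count` is illustrative only.

```python
def implied_count(percent, n_students):
    """Whole number of students closest to the given percentage of n_students."""
    return round(percent / 100 * n_students)


# Class sizes come from the two item slides (34 and 33 students).
bedrock_correct = implied_count(50, 34)
solid_rock_correct = implied_count(57.6, 33)

# The counts below are inferred from the percentages, not taken from the slides.
print("bedrock item, implied correct answers:", bedrock_correct)
print("solid rock item, implied correct answers:", solid_rock_correct)
print("difference in students:", solid_rock_correct - bedrock_correct)
```

With classes this small, the gap between the two percentages corresponds to a difference of only a couple of students, which is worth keeping in mind when comparing the two versions of the item.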
  94 2. Identifying misconceptions
- In written comments, a number of students
expressed misconceptions.
 - Which of the following are part of earth's
plates?
 - A. Solid rock of continents but not solid rock of
ocean floors.
   - Plates can be seen and aren't under water.
   - The plates do not go down that far.
   - Ocean water and solid rock from the bottom is not
part of a plate.
 - B. Solid rock of ocean floors but not solid rock
of continents.
   - Yes, it's only made of rock from the ocean
surface.
  95 3. Identifying implausible distractors
- Which of the following are part of earth's
plates?
 - D. Solid rock of neither the ocean floors nor
the continents.
   - None of the 33 students selected this answer
choice.
 - D. Bedrock of neither ocean floors nor
continents.
   - Three of the 34 students selected this answer
choice.
 - Although students have misconceptions about
either ocean floors or continents being part of
plates, the idea that neither ocean floors nor
continents are part of plates is not plausible.
This distractor is not informative and should be
replaced.
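The reasoning on this slide can be sketched as a simple selection-rate check: a distractor that almost no student picks tells us little about student thinking. The counts for answer choice D are taken from the slide; the 10 percent cutoff and all function and variable names are assumptions made for illustration, not a criterion stated by the project.

```python
# Counts for answer choice D on the two versions of the item, from the slide.
pilot_results = {
    "solid rock": {"chose_d": 0, "students": 33},
    "bedrock": {"chose_d": 3, "students": 34},
}

# Assumed cutoff for illustration only; the source does not state one.
RARELY_CHOSEN_THRESHOLD = 0.10


def selection_rate(chosen, students):
    """Fraction of students who selected a given answer choice."""
    return chosen / students


def is_rarely_chosen(chosen, students, threshold=RARELY_CHOSEN_THRESHOLD):
    """True when the selection rate falls below the assumed cutoff."""
    return selection_rate(chosen, students) < threshold


for version, result in pilot_results.items():
    rate = selection_rate(result["chose_d"], result["students"])
    flagged = is_rarely_chosen(result["chose_d"], result["students"])
    print(f"{version}: {rate:.1%} chose D, rarely chosen: {flagged}")
```

Under this assumed cutoff, choice D is flagged on both versions of the item. A low selection rate is only a prompt for review; the judgment that the idea is implausible still rests on what students wrote.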
  96An example from physics
- Idea d: Friction is a force that makes it
difficult for one object to slide on another
object (from SFAA 4F-3h).
 - From the clarification statement:
   - Students should know that friction is a force
that acts in the opposite direction to the
sliding of one surface on another surface.
  97Alignment/SIGP
FM62-1 (Sixth Grade, n = 25; Eighth Grade at a
different school, n = 18): A box slides across the
floor. The arrow labeled "Motion" represents the
box's direction of motion. Which force could be
the force of friction acting on the box?
 A. Force A (40% Sixth / 17% Eighth)
 B. Force B (16% / 0%)
 C. Force C (40% / 44%)
 D. Force D (0% / 17%)
 98Possible Misconceptions
- Forces always act in the direction of motion
(Kuiper, 1994). (Answer choice A)
 - Friction is a force in the vertical direction,
holding an object down (Horizon Research, Inc.).
(Answer choice B)
 - Friction is an upward force; gravity is a
downward force. (Answer choice D)
  99Two routes to the correct answer
- 1. Use the targeted learning goal
   - Friction opposes the sliding of two surfaces.
 - 2. Combine two other ideas
   - A backward force slows things down.
   - Friction slows things down. (This is a specific
instance of a general principle, or SIGP.)
   - Therefore, friction is a backward force.
 - If students use route 2, they have not
demonstrated knowledge of the learning goal.
  100Student Responses
- Sixth Grade: Of the 10 students choosing the
correct answer
   - 2 indicated that they used the targeted learning
goal
   - 2 indicated that they used the other route (false
positive)
 - Eighth Grade: Of the 8 students choosing the
correct answer
   - 4 indicated that they used the targeted learning
goal
   - Zero indicated that they used the other route
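The tallies on this slide can be summarized with a short sketch. The counts come from the slide; the dictionary layout, the key names, and the "no route indicated" remainder (correct answers not placed in either route on the slide) are assumptions introduced here for illustration.

```python
# Counts reported on the slide for students who chose the correct answer.
responses = {
    "Sixth Grade": {"correct": 10, "targeted_goal": 2, "other_route": 2},
    "Eighth Grade": {"correct": 8, "targeted_goal": 4, "other_route": 0},
}


def summarize(group):
    """Summarize how the students who answered correctly reached the answer."""
    explained = group["targeted_goal"] + group["other_route"]
    return {
        # Correct answers whose route is not reported on the slide.
        "no_route_indicated": group["correct"] - explained,
        # Share of correct answers reached without the targeted idea.
        "false_positive_share": group["other_route"] / group["correct"],
    }


summaries = {grade: summarize(group) for grade, group in responses.items()}

for grade, summary in summaries.items():
    print(
        f"{grade}: {summary['false_positive_share']:.0%} false positives, "
        f"{summary['no_route_indicated']} with no route indicated"
    )
```

In both grades, a sizable share of the correct answers cannot be assigned to either route from the written explanations alone, so the false positive share should be read as a lower bound.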
 
  101Conclusions 
- Pilot testing can be used successfully to reveal
what students are thinking about the ideas we are
testing.
 - Pilot testing provides access to a large number
of students around the country, but what we learn
is limited by the questions we ask and what
students choose to write. Follow-up isn't
possible.
 - Student interviews allow for flexibility to
follow up students' comments with more probing
questions, but one-on-one interviews are limited
to smaller numbers of students.
 - A combination of the two methods is being used to
provide insights into student thinking and the
effectiveness of the assessment items that we are
developing.