Title: TerraNova: Evaluation of a Standardized Test (Mini-Project 1)
- Teresa Frields and Mitzi Hoback
A. General Information
- Title: TerraNova
- Publisher: CTB/McGraw-Hill
- Date of Publication: 1997
A. General Information: Cost
- Varies according to what is purchased
- $122 per 30 Complete Battery Plus consumable test booklets
- $92.50 per 30 Complete Battery Plus reusable test booklets
A. General Information: Administration Time
- Varies by test and level
- Typically given over several test sessions or days
- Fall, winter, and spring testing periods available
B. Brief Description of Purpose and Nature of Test: General Purpose of Test
- Constructed as a comprehensive modular assessment series of student achievement
- Promoted as a device to help diverse audiences understand student academic achievement and progress
- Reports provide useful and informative data that allow national comparison of group and individual achievement
B. Brief Description of Purpose and Nature of Test: Population for Which Test Is Applicable
- K-12
- Reading/language arts and mathematics available for K-12
- Science and social studies tests available for grades 1-2
B. Brief Description of Purpose and Nature of Test: Description of Content
- Multiple-choice format
- Generates precise norm-referenced achievement scores and a full complement of objective mastery scores
- Designed to measure concepts, processes, and skills taught throughout the nation
- Content areas measured are Reading/Language Arts, Mathematics, Science, and Social Studies
B. Brief Description of Purpose and Nature of Test: Appropriateness of Assessment Method
- Selected-response items can provide information on basic knowledge and some patterns of reasoning
- Does not provide evidence for performance standards/targets
- Other TerraNova formats provide a combination of selected-response and constructed-response items
C. Technical Evaluation: Norms/Standards
- Type: The battery generates precise norm-referenced achievement scores and a full complement of objective mastery scores.
- Types of scores provided:
- Scaled Scores
- Grade Equivalents
- National Percentiles
- National Stanines
- Normal Curve Equivalents
- Reports are provided both for individual students and for groups of students.
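Several of the score types listed above are tied to one another through the normal curve, so a short sketch may help show how they relate. The Python below converts a national percentile rank to a Normal Curve Equivalent and to a stanine using the standard textbook definitions (NCE = 50 + 21.06z; stanine bands of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent). The function names are hypothetical, and this is an illustration only, not the publisher's scoring procedure; actual TerraNova scores come from the publisher's norms tables.

```python
from bisect import bisect_right
from statistics import NormalDist

# Lowest percentile rank belonging to stanines 2 through 9
# (textbook bands of 4, 7, 12, 17, 20, 17, 12, 7, 4 percent).
STANINE_CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (1-99) to a Normal Curve Equivalent."""
    if not 1 <= percentile_rank <= 99:
        raise ValueError("percentile rank must be between 1 and 99")
    # z-score that cuts off the given proportion of a standard normal curve
    z = NormalDist().inv_cdf(percentile_rank / 100)
    # NCEs are scaled so that 1, 50, and 99 line up with those percentiles
    return 50 + 21.06 * z


def percentile_to_stanine(percentile_rank):
    """Convert a percentile rank (1-99) to a stanine (1-9)."""
    if not 1 <= percentile_rank <= 99:
        raise ValueError("percentile rank must be between 1 and 99")
    # Count how many cutoffs the rank has reached, then shift to a 1-9 scale
    return bisect_right(STANINE_CUTOFFS, percentile_rank) + 1
```

Percentile ranks bunch up near the middle of the distribution, which is why NCEs (an equal-interval scale) are usually preferred when scores are averaged across groups.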
C. Technical Evaluation: Norms/Standards
- Standardization Sample Size: The norming sample was a stratified national sample.
- 295 schools
- Fall and spring norming studies involved between 860,000 and 1,720,000 students
C. Technical Evaluation: Norms/Standards
- 2. Standardization Sample: Representativeness
- Separate sampling designs were used for institutions of different types
- Public schools were stratified by region, community type, size, and Orshansky Percentile (an indicator of socioeconomic status)
C. Technical Evaluation: Norms/Standards
- Standardization Sample: Procedure followed in obtaining the sample
- Spring standardization: April 1996
- Fall standardization: October 1996
- Recommended test administration period is a five-week window centered on the norming periods
C. Technical Evaluation: Norms/Standards
- 3. Standardization Sample: Availability of subgroup norms
- Questionnaire sent to participating schools
- 95% responded in the fall
- 100% responded in the spring
C. Technical Evaluation: Norms/Standards
- 3. Standard-setting procedures employed: qualifications and selection of judges
- Nominations were made of experienced teachers and curriculum specialists with national reputations
- Judges had to possess a deep understanding of one of the five content areas
C. Technical Evaluation: Norms/Standards
- 3. Standard-setting procedures employed: number of judges
- 2 committees for each of 5 content areas
- Primary/Elementary and Middle/High School
- 4-5 teachers per committee, plus one external curriculum expert and one CTB content expert (approximately 70 people total)
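As a quick check on the "approximately 70" figure, the committee arithmetic can be worked out directly. This sketch assumes that no person sat on more than one committee, which the source does not state; the constant and function names are hypothetical.

```python
CONTENT_AREAS = 5
COMMITTEES_PER_AREA = 2      # Primary/Elementary and Middle/High School
EXPERTS_PER_COMMITTEE = 2    # one external curriculum expert, one CTB content expert


def total_judges(teachers_per_committee):
    """Total panel size, assuming nobody serves on two committees."""
    committees = CONTENT_AREAS * COMMITTEES_PER_AREA
    return committees * (teachers_per_committee + EXPERTS_PER_COMMITTEE)


# The slide gives 4-5 teachers per committee, so compute both ends of the range.
low, high = total_judges(4), total_judges(5)
print(f"{low} to {high} judges in total")
```

Under that assumption the panel size ranges from 60 to 70, so the stated total matches the upper end of the range.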
C. Technical Evaluation: Reliability
- Types: Measures of internal consistency
- Kuder-Richardson Formula 20 (KR20)
- Item-pattern KR20 (a unique measure that takes into account the additional accuracy associated with IRT item-pattern scoring)
- Coefficient alpha
- On individual student score reports, a student's score is reported along with a confidence band.
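To make the reliability measures above more concrete, here is a minimal sketch of the classical KR-20 formula and of a confidence band built from the standard error of measurement. It covers only the textbook KR-20 for right/wrong items, not the publisher's item-pattern KR20 or its IRT scoring, and the band of plus or minus one standard error is an assumption rather than the documented TerraNova procedure. Function names are hypothetical.

```python
from math import sqrt
from statistics import pvariance


def kr20(responses):
    """Classical KR-20 for rows of 0/1 item scores (one row per student)."""
    n_students = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)  # population variance of total scores
    if n_items < 2 or total_variance == 0:
        raise ValueError("need at least two items and some score variance")
    # Sum of p * q over items, where p is the proportion answering correctly
    pq_sum = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_students
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


def confidence_band(score, score_sd, reliability):
    """Band of +/- one standard error of measurement around a score."""
    sem = score_sd * sqrt(1 - reliability)
    return score - sem, score + sem
```

A higher reliability coefficient shrinks the standard error and therefore narrows the band around a student's reported score.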
C. Technical Evaluation: Reliability
- 2. Results
- Reliability coefficients were consistently in the .80s and .90s
- Spelling coefficients were consistently lower
- Grades 1 and 2 also had slightly lower coefficients
C. Technical Evaluation: Validity
- 1. Types: Content-related
- Numerous studies (e.g., classroom pilots, usability, sensitivity) were conducted
- Advisory panel of teachers, administrators, and content specialists from all parts of the country
- Based on recommendations of the SCANS (Secretary's Commission on Achieving Necessary Skills) report
C. Technical Evaluation: Validity
- Types: Content-related
- Developers and scorers worked together as constructed-response items were scored, to check the consistency and accuracy of the scoring guides and process
- Reviewed various informational sources for children to determine topics of interest
C. Technical Evaluation: Validity
- Types: Criterion-related
- Conducted a variety of research studies, such as correlations with the SAT, ACT, NAEP, and TIMSS
C. Technical Evaluation: Validity
- 1. Types: Construct-related
- Careful test development process to support content validity and comprehensiveness of the test
- Construct validity for skills, concepts, and processes measured in each subject
C. Technical Evaluation: Validity
- 2. Results
- Provides achievement scores that are valid for several types of educational decision making
- A thorough validity evaluation encompassed content-, criterion-, and construct-related evidence
Bias
- Used the following procedures to reduce the amount of bias:
- Ensured valid test plan
- Followed stringent editorial guidelines
- Conducted expert reviews
- Analyzed student data for differential item functioning
- Selected best items
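The slides do not say which differential item functioning (DIF) statistic was used. As an illustration of what such an analysis can look like, the sketch below computes the Mantel-Haenszel common odds ratio, one widely used DIF technique, for a single item; choosing it here is an assumption, not a description of the publisher's method. Students are first matched on total score, and each tuple holds the counts for one matched score level. Function names are hypothetical.

```python
from math import log


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio across matched score levels.

    Each stratum is a tuple of counts:
    (reference correct, reference incorrect, focal correct, focal incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty score levels
        numerator += ref_right * focal_wrong / n
        denominator += ref_wrong * focal_right / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


def mh_delta(odds_ratio):
    """Re-express the odds ratio on the ETS delta scale (0 means no DIF)."""
    return -2.35 * log(odds_ratio)
```

An odds ratio near 1 (delta near 0) suggests the item behaves similarly for matched students in both groups; values far from 1 flag the item for review.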
D. Summary of MMY Reviews
- Reviewed by Judith A. Monsaas, Assoc. Prof. of Education, North Georgia College and State University, Dahlonega, GA
- Tests are very engaging and user-friendly. Materials are well constructed and attractive.
- Addition of performance standards is helpful for schools moving toward a standards-based curriculum framework
D. Review, continued
- Claims to assist in decision making in many areas, including evaluation of student progress, instructional program planning, curriculum analysis, class grouping, etc. This reviewer believes the tests can support this claim.
- Has a particularly useful section for parents on "Using Test Results"
D. Review, continued
- "Although these tests are attractive and more engaging than most achievement tests I have inspected, I doubt that students will forget that they are taking a test."
- The section on "Avoiding Misinterpretations" when using grade equivalents is helpful
D. Review, continued
- Process used to develop the test and ensure content validity was very thorough and clearly explained
- Norming and score reporting methods are well developed
- The reviewer's only problem is with the mastery classifications for the criterion-referenced interpretations. She feels they are arbitrarily defined.
D. Review, continued
- Reviewed by Anthony J. Nitko, Professor, Department of Educational Psychology, University of Arizona, Tucson, AZ
- One change in the new edition is that items within each subtest are organized according to contextual themes, countering the criticism that standardized tests assess strictly decontextualized knowledge and skills
D. Review, continued
- Developers carefully analyzed curriculum guides from around the country, as well as national and state standards and textbook series
- Several usability studies were run. The results were used to improve test items, teachers' directions, and page designs.
D. Review, continued
- Earlier editions were criticized for problems related to speed. This version corrects those. Typically fewer than 4% of students fail to respond to the last item on each subtest.
- One of the better batteries of its type
- Teachers' materials are exceptionally well done and informative
E. Critique of the Instrument
- Our research on the TerraNova leads us to the following conclusions:
- A complete and comprehensive test
- Numerous measures and studies were undertaken to meet technical requirements
- TerraNova takes pride in its overall test design, construction, norming, national standardization process, reliability, validity, and reduction of bias
E. Critique of the Instrument
- Does a good job supporting its purpose as a measure to aid in student achievement
- Provides three main types of information: norm-referenced information, some criterion-referenced information, and standards-based performance information
- Serves as a good measure for comparing student achievement with national performance
E. Critique of the Instrument
- This is not a test that should be used by itself. It is simply one type of measure and cannot be the only measure used in making critical decisions.
- When used in conjunction with other test methods and teacher judgment, it is an effective measure for what it purports to do
- Use caution when using this assessment to track state standards: although it purports to be accurately correlated with them, there is no substantial proof.
E. Critique of the Instrument
- Interesting Tidbits
- Del Harnish has done research on bias issues and has published work on the TerraNova
- Testnote Clarity is a computer program, available with the disaggregation of data, that allows the user to customize the data and apply it to district curriculum