Evaluating and Restructuring Science Assessments
An Example Measuring Students' Conceptual Understanding of Heat
Kelly D. Bradley, Jessica D. Cunningham, and Shannon O. Sampson
All authors contributed equally to this manuscript. Please address all inquiries to Kelly D. Bradley, 131 Taylor Education Building, Lexington, KY 40506.
Newton's Universe is supported by the National Science Foundation under Grant No. 0437768. For more information, see http://www.as.uky.edu/NewtonsUniverse/.
UNIVERSITY OF KENTUCKY, Department of Educational Policy Studies and Evaluation
  • Conclusion
  • Following the reconstruction process, the committee was asked to develop a new theoretical hierarchy of item difficulty based on the pilot results and the revisions made. Using the baseline assessment given in September 2006, the theoretical and empirical item hierarchies will be compared again.
  • A strength of this study is the partnership of
    science educators with researchers in educational
    measurement to construct a quality assessment.
  • This study provides a model for assessing
    knowledge transferred to students through teacher
    training.
  • Findings will support other researchers' attempts to link student performance outcomes to teacher training, classroom teachers' construction of their own assessments, and the continued growth of collaborative efforts between the measurement and science education communities.
  • Method
  • Response Frame
  • The target population was middle school science
    students in the rural Appalachian regions of
    Kentucky and Virginia.
  • Instrumentation
  • A student assessment was constructed by the Newton's Universe research team to measure students' conceptual understanding of heat.
  • The pilot assessment contained forty-one multiple-choice items.
  • Data Collection
  • The student assessment was piloted with a group of middle school students participating in a science camp during the summer of 2006.
  • Data Analysis
  • The dichotomous Rasch model was applied to the data (a computational sketch follows this section).
  • ZSTD fit statistics were considered acceptable between -2 and 2, indicating values within two standard deviations of the expected mean of zero (Wright & Masters, 1982).
  • Items with negative point-measure correlations were flagged for review.
  • The spread of items and students along the continuum was examined for gaps.
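
The study's estimates were produced in Winsteps (Linacre, 2005). As a rough illustration of the checks listed above, the sketch below simulates dichotomous responses and computes item outfit mean squares, their ZSTD form (via the Wilson-Hilferty cube-root standardization commonly used in Winsteps-style output, following Wright & Masters, 1982), and point-measure correlations in plain Python/numpy. The person measures theta, item difficulties b, sample size, and random seed are illustrative assumptions, not the study's data.

    import numpy as np

    def rasch_prob(theta, b):
        # P(x = 1) for each person-item pair under the dichotomous Rasch model
        return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

    def item_outfit_zstd(X, theta, b):
        # Residual-based outfit formulas follow Wright & Masters (1982);
        # the cube-root transform standardizes the mean square to ZSTD.
        P = rasch_prob(theta, b)
        W = P * (1 - P)                          # model variance of each response
        C = P * (1 - P) ** 4 + (1 - P) * P ** 4  # fourth central moment
        Z2 = (X - P) ** 2 / W                    # squared standardized residuals
        N = X.shape[0]
        msq = Z2.mean(axis=0)                    # outfit mean square per item
        q = np.sqrt((C / W ** 2 - 1).sum(axis=0)) / N
        return msq, (msq ** (1 / 3) - 1) * (3 / q) + q / 3

    def point_measure_corr(X, theta):
        # Correlation of each item's responses with the person measures;
        # a negative value suggests the item works against the construct.
        Xc = X - X.mean(axis=0)
        tc = theta - theta.mean()
        return (Xc * tc[:, None]).sum(axis=0) / np.sqrt(
            (Xc ** 2).sum(axis=0) * (tc ** 2).sum())

    rng = np.random.default_rng(0)
    theta = rng.normal(0.0, 1.0, 150)  # hypothetical person measures (logits)
    b = rng.normal(0.0, 1.0, 41)       # 41 pilot items, hypothetical difficulties
    X = (rng.random((150, 41)) < rasch_prob(theta, b)).astype(float)

    msq, zstd = item_outfit_zstd(X, theta, b)
    pmc = point_measure_corr(X, theta)
    # Flag items as in the study: ZSTD outside (-2, 2) or negative correlation.
    print("Flagged items:", np.where((np.abs(zstd) > 2) | (pmc < 0))[0] + 1)

In practice the measures theta and b would themselves be estimated from the response matrix (as Winsteps does); here they are supplied directly to keep the fit and correlation computations in view.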
  • Background
  • Although many measurement and testing textbooks present classical test theory as the only way to determine the quality of an assessment (Embretson & Hershberger, 1999), Item Response Theory offers a sound alternative to the classical test theory approach.
  • Reliability and various aspects of validity can
    be examined when applying the Rasch model (Smith,
    2004).
  • To examine reliability, Rasch measurement places person ability and item difficulty along a single linear scale (the model is written out after this list). Rasch measurement produces a standard error (SE) for each person and item, specifying the range within which each person's true ability and each item's true difficulty fall.
  • Rasch fit statistics, which are derived from "a comparison of expected response patterns and the observed patterns" (Smith, 2004, p. 103), can be examined to assess the content validity of the assessment.
  • Bradley and Sampson (2006) applied a
    one-parameter Item Response Theory model,
    commonly known as the Rasch model, to investigate
    the quality of a middle school science teacher
    assessment and advised appropriate improvements
    to the instrument in an effort to ensure
    consistent and meaningful measures.
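
For reference, the dichotomous Rasch model named above can be written out explicitly. The probability that person n answers item i correctly depends only on the difference between the person's ability theta_n and the item's difficulty b_i, both expressed in logits on the same linear scale:

    P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}

Because persons and items share one scale, the empirical item hierarchy can be read directly from the estimated b_i and set against the committee's theoretical ordering.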
  • Discussion
  • The first item on the pilot student assessment was moved to the fourth position so that an easier item appears first on the student assessment.
  • The item flagged for a high outfit ZSTD statistic was reworded because the test developers felt students were overanalyzing the question.
  • The item with the negative point-measure correlation (item 13) was deleted because the committee judged the item confusing overall.
  • Item 19 was revised to replace item 18, since the two tested the same concept.
  • Item 23 was removed from the student assessment because the course does not adequately cover the concept tested.
  • A more difficult foundations item was added to increase the span of foundations items along the ability continuum.
  • To fill one potential gap in the item spread, item 24 was changed to make the question clearer and, in turn, less difficult.
  • The answer choices of temperature points were changed to increase the difficulty of items 12 and 36.
  • For items 3 and 5, the answer options were revised because empirically they were not functioning as expected as distracters.
  • Items 4 and 40 were determined to be confusing for many higher-ability students, so adjustments were made.

References

Bond, T., & Fox, C. (2001). Applying the Rasch model: Fundamental measurement in the human sciences. Mahwah, NJ: Lawrence Erlbaum Associates.
Bradley, K. D., & Sampson, S. O. (2006). Utilizing the Rasch model in the construction of science assessments: The process of measurement, feedback, reflection and change. In X. Liu & W. Boone (Eds.), Applications of Rasch measurement in science education (pp. 23-44). Maple Grove, MN: JAM Press.
Embretson, S., & Hershberger, S. (1999). The new rules of measurement. Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Hopkins, K. D. (1998). Educational and psychological measurement and evaluation (8th ed.). Needham Heights, MA: Allyn & Bacon.
Linacre, J. (1999). A user's guide to Facets: Rasch measurement computer program. Chicago, IL: MESA Press.
Linacre, J. M. (2005). WINSTEPS: Rasch measurement computer program. Chicago: Winsteps.com.
Smith, E. (2004). Evidence for the reliability of measures and validity of measure interpretation: A Rasch measurement perspective. In E. Smith & R. Smith (Eds.), Introduction to Rasch measurement (pp. 93-122). Maple Grove, MN: JAM Press.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis: Rasch measurement. Chicago, IL: MESA Press.
Wright, B. D., & Stone, M. H. (2004). Making measures. Chicago, IL: The Phaneron.
  • Objectives
  • Apply the dichotomous Rasch model to evaluate the quality of an assessment measuring students' conceptual understanding of heat
  • Determine the fit of the data to the Rasch model
  • Restructure the assessment based on the results, coupled with theory
  • Results
  • Person separation and reliability were 2.31 and 0.84, respectively; item separation and reliability were 1.56 and 0.71 (see the identity following this list).
  • Item 13 resulted in a negative point-measure correlation.
  • The first item was empirically estimated as more difficult than the theoretical item hierarchy predicted.
  • Potential gaps existed between item 28 and items 9 and 12, as well as between item 11 and items 17, 18, 21, 23, 24, 39, and 8.
  • Four energy-transfer items (18, 21, 23, 24) were located at the same difficulty level.
  • Unexpected functioning of distracters occurred for items 4, 13, 14, 30, 32, 38, and 40.
  • Items containing distracters that were not being used included items 2, 3, 6, 12, 29, 31, 35, 36, 37, and 39.
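
As an internal consistency check: in Rasch analysis, separation G and reliability R are related by the standard identity R = G^2 / (1 + G^2), and the reported pairs satisfy it:

    \frac{2.31^2}{1 + 2.31^2} \approx 0.84 \;(\text{persons}), \qquad \frac{1.56^2}{1 + 1.56^2} \approx 0.71 \;(\text{items})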

A special thanks to the Newton's Universe committee members integral to assessment development: Kimberly Lott, Rebecca McNall, Jeffrey L. Osborn, Sally Shafer, and Joseph Straley.
Submit requests to kdbrad2@uky.edu