Growth Scales and Pathways - Presentation Transcript
1
Growth Scales and Pathways
  • William D. Schafer
  • University of Maryland
  • and
  • Jon S. Twing
  • Pearson Educational Measurement

2
NCLB leaves some unmet policy needs
3
  • Assessment of student-level growth

4
  • Sensitivity to change within achievement levels

5
  • Assessment and accountability at all grades

6
  • Broad representation of schooling outcomes

7
  • Descriptions of what students are able to do in
    terms of next steps

8
  • Cost-benefit accountability

9
How can we meet these needs?
  • Our approach starts with measurement of growth
    through cross-grade scaling of achievement

10
  • Current work is being done on vertical scales,
    in which common items for adjacent grades are
    used to generate a common scale across grades.

11
  • Another approach is grade-equivalents.
  • Both are continuous cross-grade scales.

12
  • We only have three problems with continuous
    cross-grade scales
  • The Past
  • The Present
  • The Future

13
Why the Past?
  • Ignores instructional history
  • The same student score should be interpreted
    differently depending on the grade level of the
    student

14
Why the Present?
  • Relationships among items may (probably do)
    differ depending on grade level of the student.
    (e.g., easy fifth grade items may be difficult
    for fourth graders)
  • Lack of true equating. It is better for fourth
    graders to take fourth grade tests and for fifth
    graders to take fifth grade tests.

15
Why the Future?
  • Instructional expectations differ. A score of GE
    5.0 (or VS 500) carries different growth
    expectations from a fifth-grade experience next
    year for a current fifth grader than for a
    current fourth grader.

16
  • We do need to take seriously the interests of
    policymakers in continuous scaling.
  • But the problems with grade-equivalents and
    vertical scaling may be too severe to recommend
    them.
  • Here are seven criteria that an alternative
    system should satisfy.

17
1. Implement the Fundamental Accountability
Mission
  • Test all students on what they are supposed to be
    learning.

18
2. Assess all contents at all grades.
  • Educators should be accountable for all public
    expenditures.
  • Apply this principle at least to all
    non-affective outcomes of schooling.

19
3. Define tested domains explicitly.
  • Teachers need to understand their learning
    targets in terms of
  • Knowledge (what students know)
  • Factual
  • Conceptual
  • Procedural
  • Cognition (what they do with it)

20
4. Base test interpretations on the future.
  • We can't change the past, but we can design the
    future.
  • It can be more meaningful to think about what
    students are prepared for than about what they
    have learned.

21
5. Inform decision making about students,
teachers, and programs.
  • Within the limits of privacy, gathering data for
    accountability judgments about everyone and
    everything (within reason) will help decision
    makers reach the most informed decisions.
  • This also means that we will associate
    assessments with those who are responsible for
    improving them.

22
6. Emphasize predictive evidence of validity.
  • Basing assessment interpretations on the future
    (see point 4) suggests that our best evidence to
    validate our interpretations is how well they
    predicted in the past.

23
7. Capitalize on both criterion and norm
referencing.
  • Score reports need to satisfy the needs of the
    recipients. Both criterion-referencing (what
    students are prepared to do) and norm-referencing
    (how many are as, more, and less prepared) convey
    information that is useful.
  • Other things equal, more information is better
    than less.

24
Our Approach to the Criteria
  • Many of the criteria are self-satisfying.
  • Some recent and new concepts are needed.
  • Four recent or new concepts
  • Socially moderated standard setting
  • Operationally defined exit competencies
  • Growth scaling
  • Growth pathways

25
Socially Moderated Standard Setting
  • Ferrara, Johnson, and Chen (2005)
  • Judges set achievement level cut points where
    students have prerequisites for the same
    achievement level next year.
  • Note the future orientation of the achievement
    levels. This concept also underlies Lissitz and
    Huynh's (2003) concept of vertically moderated
    standards.

26
Operationally Defined Exit Competencies
  • If we implement socially moderated standards,
    where do the cut points for the 12th grade come
    from?
  • Our suggestion is to base them on what the
    students are prepared for, such as (1) college
    credit, (2) ready for college, (3) needs college
    remediation, (4) satisfies federal
    ability-to-benefit rules, (5) capable of
    independent living, (6) below.
  • Modify as needed for lower grades (e.g., fewer
    levels) and certain contents (e.g., athletics,
    music)

27
Growth Scaling
  • Some elements of this have been used in Texas and
    Washington State.
  • Test at each grade level separately for any
    content (i.e., only grade-level items).
  • Report using a three-digit scale.
  • First digit is the grade level.
  • Second two digits are a linear transform of the
    lower proficient (e.g., 40) and advanced
    (e.g., 60) cut points; a sketch of the idea
    follows below. Could transform non-linearly to
    hit all cut points with more than three levels.
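A minimal Python sketch of the three-digit growth scale idea, assuming the proficient cut maps to 40 and the advanced cut to 60 as in the example values above; the function and argument names are hypothetical, not from the paper.

```python
def growth_scale_score(grade, score, proficient_cut, advanced_cut):
    """Map a within-grade score onto a three-digit growth scale.

    Assumes the proficient cut maps to 40 and the advanced cut to 60
    (the slide's example values); scores in between are interpolated
    linearly. Names and values are illustrative only.
    """
    # Linear transform: proficient_cut -> 40, advanced_cut -> 60.
    slope = (60 - 40) / (advanced_cut - proficient_cut)
    last_two = 40 + slope * (score - proficient_cut)
    # Keep the last two digits in 00-99 so the leading grade digit is preserved.
    last_two = max(0, min(99, round(last_two)))
    return grade * 100 + last_two

# Example: a grade 5 student scoring halfway between the cuts reports as 550.
print(growth_scale_score(5, score=2150, proficient_cut=2100, advanced_cut=2200))
```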

28
Growth Pathways
  • Given that content is backmapped (Wiggins and
    McTighe, 1998) and achievement levels are
    socially moderated, we can express achievement
    results in terms of readiness for growth (next
    year, at 12th grade, or both).
  • We can generate transition matrices to express
    the likelihoods of various futures for students,
    as sketched below.
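A minimal sketch of how such a transition matrix could be estimated from matched year-to-year achievement levels; the function name, the integer level coding, and the toy data are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def transition_matrix(levels_year1, levels_year2, n_levels):
    """Estimate P(level next year | level this year) from matched student records.

    levels_year1 and levels_year2 are integer achievement levels
    (0..n_levels-1) for the same students in consecutive years;
    the coding is illustrative only.
    """
    counts = np.zeros((n_levels, n_levels))
    for a, b in zip(levels_year1, levels_year2):
        counts[a, b] += 1
    # Row-normalize so each row is a conditional probability distribution.
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Toy example with three levels (below, proficient, advanced).
y1 = [0, 0, 1, 1, 1, 2, 2]
y2 = [0, 1, 1, 1, 2, 2, 2]
print(transition_matrix(y1, y2, n_levels=3))
```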

29
Adequate Yearly Progress
  • Capitalizing on Hill et al. (2005), we can use
    growth pathways as the basis for expectations
    and give point awards to students meeting,
    falling below, or exceeding their expectations
    based on year-ago achievement levels (see the
    sketch below).
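A hedged sketch of the point-award idea, assuming an illustrative point schedule (1.25 for exceeding, 1.0 for meeting, 0.5 for falling one level short) and a simple growth-pathway mapping from year-ago level to expected level; none of these values come from Hill et al. (2005) or the paper.

```python
def growth_points(prior_level, current_level, expected_next):
    """Assign AYP-style growth points for one student.

    expected_next maps the year-ago achievement level to the level the
    growth pathway expects this year. Point values are illustrative only.
    """
    expected = expected_next[prior_level]
    if current_level > expected:
        return 1.25   # exceeded the expectation
    if current_level == expected:
        return 1.0    # met the expectation
    return 0.5 if current_level == expected - 1 else 0.0  # fell short

# Illustrative growth pathway: each student is expected to hold last year's level.
expected_next = {0: 0, 1: 1, 2: 2}
print(growth_points(prior_level=1, current_level=2, expected_next=expected_next))
```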

30
Existing Empirical State Data
  • Using existing data, we explored some of these
    concepts.
  • Two data sets were used from Texas.
  • All data is in the public domain and can be
    obtained from the Texas website.
  • Current Texas data used: TAKS
  • Previous Texas data used: TAAS

31
TAAS Data (2000-2002)
32
Immediate Observations - TAAS Data
  • Passing standards appear to be relatively
    lenient.
  • Actual standards were set in fall of 1989.
  • Curriculum change occurred in 2000.
  • Texas Learning Index (TLI)
  • Is a variation of the Growth Scaling model
    previously discussed.
  • Will be discussed in more detail shortly.
  • Despite the leniency of the standard, average
    cross-sectional gain is shown with the TLI.
  • About a 2.5 TLI value gain on average (across
    grades).

33
TAKS Data (2003-2005)
34
Immediate Observations -TAKS Data
  • Passing standards appear to be more severe than
    TAAS, but the majority of students still pass.
  • Standards were set using Item Mapping and
    field-test data in 2003.
  • Standards were phased in by the SBOE.
  • Passing is labeled as "Met the Standard."
  • Scale Scores are transformed within grade and
    subject calibrations using Rasch.
  • Scales were set such that 2100 is always
    passing.
  • Socially moderated expectation that a 2100 this
    year is equal to a 2100 next year.
  • We will look at this in another slide shortly.

35
Immediate Observations-TAKS Data
  • Some Issues/Problems seem obvious
  • Use of field-test data and lack of student
    motivation the first year.
  • Phase in of the standards makes the meaning of
    passing difficult to understand.
  • Construct changes between grades 8 and 9.
  • Math increases in difficulty across the grades.
  • Cross-sectional gain scores show some progress,
    with between 20 and 35 point gains in average
    scaled score across grades and subjects.
  • Finally, the percentage of classifications
    (impact) resulting from the Item Mapping standard
    setting is quite varied.

36
A Pre-Organizer
  • Socially Moderated Standard Setting
  • Really sets the expectation of student
    performance in the next grade.
  • Growth Scaling
  • A different definition of growth.
  • Growth by fiat.
  • Operationally Defined Exit Competencies
  • How does a student exit the program?
  • How to migrate this definition down to other
    grades.
  • Growth Pathways
  • Cumulative probability of success.
  • Not addressed in this paper with Texas data.

37
Socially Moderated Standard Setting
  • Consider the TAKS data in light of Socially
    Moderated Standard Setting.
  • The cut scores were determined separately by
    grade and subject using an Item Mapping
    procedure.
  • 2100 was selected as the transformation of the
    Rasch theta scale associated with passing (a
    sketch of such a transform follows below).
  • 2100 became the passing standard for all grades
    and subjects.
  • Similar to the quasi-vertical scale scores
    procedure described by Ferrara et al. (2005).
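A minimal sketch of the kind of linear transformation that anchors a Rasch theta scale at 2100 for passing within each grade and subject; the slope (scale units per logit) is an assumed value for illustration, not the operational TAKS constant.

```python
def taks_style_scale_score(theta, theta_passing_cut, slope=100.0):
    """Transform a Rasch theta onto a reporting scale anchored at 2100.

    The within-grade passing cut maps to 2100 by construction; the slope
    is an assumed scaling constant for illustration only.
    """
    return round(2100 + slope * (theta - theta_passing_cut))

# A student 0.25 logits above the grade's passing cut reports as 2125 here.
print(taks_style_scale_score(theta=0.55, theta_passing_cut=0.30))
```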

38
Socially Moderated Standard Setting
  • Despite implementation procedures, the standard
    setting yielded a somewhat inconsistent set of
    cut scores.
  • Panels consisted of on-grade and adjacent-grade
    educators.
  • Performance level descriptors were discussed both
    for the current grade and the next.
  • A review panel was convened to ensure continuity
    between grades within subjects.
  • This review panel comprised educators from all
    grades participating in the standard setting and
    used impact data for all grades as well as
    traditionally estimated vertical scaling
    information.

39
Socially Moderated Standard Setting
  • Yet, some inconsistencies are hard to explain.
  • For example, the standards yielded the following
    passing rates for Reading:
  • Grade 3: 81%
  • Grade 4: 76%
  • Grade 5: 67%
  • Grade 6: 71%
  • Clearly, social moderation did not occur.
    Possible reasons include
  • Differences in content standards from grade to
    grade.
  • Lack of a clearly defined procedure setting up
    the expectation at the next grade.
  • Mitigating factors (i.e., "kids cry," raw-score
    percent correct, etc.).

40
Socially Moderated Standard Setting
  • What about unanticipated consequences?
  • Are teachers, parents and the public calculating
    gain score differences between the grades based
    on these horizontal scale scores?
  • Will the expectation not be "2100 this year =
    2100 next year"? This is similar to one of the
    concerns in Ferrara et al. (2005) that
    prohibited the research from being conducted.
  • In fact, based on a simple regression using
    matched cohorts, the expectation is that a
    student with a scaled score of 2100 in grade 3
    reading will earn a 2072 in grade 4 reading on
    average (see the regression sketch below).
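A minimal sketch, with made-up matched-cohort data, of the simple regression that produces an expected next-year score for a student at 2100; the actual analysis used real Texas matched cohorts, and only with those data would the result approximate the 2072 value cited above.

```python
import numpy as np

def expected_next_year(grade3_scores, grade4_scores, this_year_score):
    """Regress grade 4 scores on grade 3 scores for matched cohorts and
    return the average grade 4 score expected at this_year_score.
    Variable names are illustrative."""
    slope, intercept = np.polyfit(grade3_scores, grade4_scores, deg=1)
    return intercept + slope * this_year_score

# Toy matched-cohort data for illustration only.
g3 = np.array([2040, 2080, 2100, 2150, 2200])
g4 = np.array([2010, 2060, 2075, 2120, 2180])
print(expected_next_year(g3, g4, this_year_score=2100))
```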

41
Growth Scaling
  • The TAAS TLI is an example of this type of
    growth scale.
  • A standard setting was performed for the Exit
    Level TAAS test.
  • This cut score was expressed in standard
    deviation units above or below the mean (i.e., a
    standard score).
  • This same distance was then articulated down to
    other grades.
  • The logic was to define growth in terms of
    maintaining relative status as students move
    across the grades.
  • For example, if the passing standard was 1.0
    standard deviation above the mean at Exit Level,
    then students who are 1.0 standard deviation
    above the mean in the lower grade distributions
    are on track to pass the Exit Level test
    provided they maintain their current standing /
    progress.

42
Growth Scaling
  • For convenience, the scales were transformed such
    that the passing standards were at 70.
  • Grade level designations were then added to
    further enhance the meaning of the score.
  • This score had some appealing reporting
    properties
  • Passing was 70 at each grade.
  • Since the TLI is a standard score, gain measures
    could be calculated for value-added statements
    (a sketch of the transform follows below).
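A sketch of a TLI-style standard score in which 70 marks the passing standard at every grade; the points-per-standard-deviation constant and the example numbers are assumptions for illustration, not the operational TLI formula.

```python
def tli_style_score(score, grade_mean, grade_sd, cut_sd_units, points_per_sd=10.0):
    """TLI-style standard score: 70 marks the passing standard at every grade.

    cut_sd_units is the Exit Level passing cut expressed in standard
    deviation units and articulated down to this grade; points_per_sd is
    an assumed scaling constant, not the operational value.
    """
    z = (score - grade_mean) / grade_sd          # student's standing in SD units
    return round(70 + points_per_sd * (z - cut_sd_units))

# A student exactly at the articulated cut reports 70; half an SD above, 75.
print(tli_style_score(score=2150, grade_mean=2100, grade_sd=100, cut_sd_units=0.5))
```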

43
Growth Scaling
  • Some concerns were also noted
  • Outside of the first cut score, the TLI was
    essentially free of content standards.
  • Because it was based on distribution statistics,
    the distributions (like norms) would become
    dated.
  • Differences in the shapes of the distributions
    (e.g., test difficulty) would have an unknown
    impact on students actually being able to hold
    their own.
  • Differences in the content being measured across
    the grades are essentially irrelevant.

44
Operationally Defined Exit Competencies
  • The TAKS actually has such a component at the
    Exit Level.
  • This is called the Higher Education Readiness
    Component (HERC) Standard.
  • Students must reach this standard to earn dual
    college credit and to be allowed credit for
    college level work.
  • Two types of research were conducted to provide
    information for traditional standard setting
  • Correlations with existing measures (ACT and SAT).
  • Empirical study examining how well second
    semester freshmen performed on the Exit Level
    TAKS test.

45
Operationally Defined Exit Competencies
  • This research yielded the following

46
Operationally Defined Exit Competencies
  • Some interesting observations
  • HERC standard was taken to be 2200, different
    from that needed to graduate.
  • Second semester college freshmen did marginally
    better than the required passing standard for
    TAKS to graduate.
  • Predicted ACT and SAT scores support the notion
    that the TAKS passing standards are moderately
    difficult.
  • Given the content of the TAKS assessments, how
    could this standard be articulated down to lower
    grades?

47
Concluding Remarks
  • Three possible enhancements that may or may not
    be intriguing for policymakers
  • Grades as Achievement Levels
  • Information Rich Classrooms
  • Monetary Metric

48
Grades as Achievement Levels
  • Associating letter grades with achievement levels
    would
  • Provide meaningful interpretations for grades
  • Provide consistent meanings for grades
  • Force use as experts recommend
  • Enable concurrent evaluations of grades
  • Enable predictive evaluations of grades
  • Require help for teachers to implement

49
Information Rich Classrooms
  • Concept is from Schafer and Moody (2004).
  • Achievement goals would be clarified through test
    maps.
  • Progress would be tracked at the content strand
    level throughout the year using combinations of
    formative and summative assessments (heavy role
    for computers).
  • Achievement level assignments would occur
    incrementally throughout the year.

50
Monetary Metric for Value Added
  • Economists would establish the value of each
    exit achievement level by estimating lifetime
    earned income.
  • The earnings would be amortized across grade
    levels and contents.
  • The value added for each student each year is
    the sum, across contents, of the product of the
    student's vector of probabilities of exit
    achievement levels (given current achievement
    level) and the vector of amortized monetary
    values (see the sketch below).
  • Enables cost-benefit analysis of education in a
    consistent metric for inputs and outputs.
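A minimal sketch of the monetary value-added computation described above, under the assumption that each content contributes the dot product of the student's exit-level probability vector (from the growth pathways) and the vector of amortized monetary values; all numbers are made up for illustration.

```python
import numpy as np

def monetary_value_added(exit_probs_by_content, amortized_values_by_content):
    """Value added for one student in one year, in a monetary metric.

    For each content area, dot the student's vector of probabilities of
    reaching each exit achievement level with the vector of amortized
    lifetime-earnings values for those levels, then sum across contents.
    """
    return sum(
        float(np.dot(probs, values))
        for probs, values in zip(exit_probs_by_content, amortized_values_by_content)
    )

# Two contents, three exit levels each; probabilities and dollar values are illustrative.
exit_probs = [np.array([0.2, 0.6, 0.2]), np.array([0.1, 0.7, 0.2])]
amortized = [np.array([1000.0, 3000.0, 6000.0]), np.array([800.0, 2500.0, 5000.0])]
print(monetary_value_added(exit_probs, amortized))
```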