Dr. Mohammad H. Omar Department of Mathematical Sciences - PowerPoint PPT Presentation

Description:

Presented at Statistic Research (STAR) colloquium, King Fahd University of ... on test equating in Linn (1993) Educational Measurement, Ace-Oryx publishing ... – PowerPoint PPT presentation

1
Some Statistics for Equating Multiple Forms of a Test
  • by
  • Dr. Mohammad H. Omar, Department of Mathematical
    Sciences
  • May 16, 2006
  • Presented at Statistic Research (STAR)
    colloquium, King Fahd University of Petroleum &
    Minerals, Dhahran, Saudi Arabia.

2
Equating
3
Brief overview of Talk
  • Test administration using
  • Only one form
  • More than one form
  • Test Equity
  • Steps to ensuring equity
  • Conditions for Equated Score
  • Data Collection Designs
  • Equating procedures
  • Illustration of the Equipercentile Equating
    process
  • Use of smoothing techniques
  • Application of equipercentile equating to data
    collection design
  • Standard errors of equipercentile equating
  • Linear equating
  • Illustration of the Linear Equating process
  • Application of linear equating to data collection
    design
  • Standard errors of linear equating
  • Comparison of equating methods

4
Test Administration using only one Form
Advantage
  • (1) A score means the same thing for every
    student, if cheating doesn't occur.
Disadvantage
  • (1) Dishonest students can copy answers from
    neighbouring students.
  • (2) Scores of dishonest students can be unreliably
    high.
  • (3) Honest students are disadvantaged by acts of
    dishonest students.
5
Test Administration using more than one form
Advantage
  • (1) Substantially reduces the chance of cheating.
  • (2) Honest students are not disadvantaged by acts
    of dishonest students.
  • (3) Scores of dishonest students are reliably low
    if cheating occurs.
Disadvantage
  • (1) Some equity issues arise if test equating is
    not carried out.
6
Test Equity
  • Definition (layman's definition)
  • Equity
  • "It is a matter of indifference which test
    form a student took"

7
Steps to ensuring Equity
  • Building test forms to the same test content
    specifications
  • Test forms should be interchangeable.
  • No one form should have different content
    specifications than others.
  • Test length should be the same.
  • No one form should be longer than another
  • Students should not be disadvantaged by taking
    a longer test form than their peers.

Interchangeable Content?
                               Form X        Form Y
Differentiation                80            20
Integration                    20            80

Same length?
Time                           Form X        Form Y
Required to finish             2 hr          1 hr
Allotted for administration    1 hr 30 min   1 hr 30 min
8
Steps to ensuring Equity (continued)
  • Building test forms to the same test parameter
    specifications
  • Test forms should be equally difficult
  • Students should not be disadvantaged by taking
    test forms that are very difficult compared to
    what their peers take in the same
    administration.
  • Test forms should be equally reliable.

Same Difficulty?
                                        Form X   Form Y
Percent of students below median of X   50       70

Same consistency?
                                        Form X   Form Y
Coefficient alpha                       0.70     0.90
9
Conditions For Equated Scores
  • The purpose of equating is to establish, as
    nearly as possible, an effective equivalence
    between raw scores on two test forms.
  • Because equating is an empirical procedure, it
    requires a design for data collection and a rule
    for transforming scores on one test form to
    scores on another.
  • Many practitioners would agree with Lord (1980)
    that scores on test X and test Y are equated if
    the following four conditions are met:
  • Same Ability: the two tests must both be
    measures of the same characteristic (latent
    trait, ability or skill).
  • Equity: for every group of examinees of
    identical ability, the conditional frequency
    distribution of scores on test Y, after
    transformation, is the same as the conditional
    frequency distribution of scores on test X.
  • Population Invariance: the transformation is the
    same regardless of the group from which it is
    derived.
  • Symmetry: the transformation is invertible, that
    is, the mapping of scores from form X to form Y is
    the inverse of the mapping of scores from form Y
    to form X.

10
Conditions For Equated Scores (continued)
  • The equity condition is unlikely to be precisely
    satisfied in practice.
  • Although it might be possible to build two forms
    of a test that measure the same characteristic
    and are equally reliable overall, it is highly
    unlikely that one could ever build two forms that
    are equally reliable at every ability level, let
    alone two forms that produce the same conditional
    frequency distributions.

11
Data Collection Designs
12

13
Equating Data Collection Designs
  • No statistical procedure can provide completely
    appropriate adjustments when non-equivalent or
    naturally occurring groups are used,
  • but
  • adjustments based on another test that is as
    close as possible to the tests to be equated are
    much more satisfactory than those based on
    nonparallel tests.

14
Equating Procedures
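  • Can regression be used to equate scores?
  • No, because the regression of Y on X, Y = a + bX,
    does not give the same conversion function as the
    regression of X on Y, X = c + mY (unless X and Y
    are perfectly correlated).
  • To ensure equity, the conversion functions need
    to be the same.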
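
The asymmetry of the two regression lines can be checked numerically. The following is a minimal sketch using only the Python standard library; the paired scores are hypothetical and the helper name `ols_line` is introduced here for illustration.

```python
import statistics

def ols_line(x, y):
    """Return (intercept, slope) of the least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical paired scores of the same examinees on forms X and Y.
x = [10, 14, 18, 22, 26, 30, 34, 38]
y = [12, 13, 20, 19, 27, 26, 33, 36]

a, b = ols_line(x, y)   # Y = a + bX
c, m = ols_line(y, x)   # X = c + mY

# Inverting Y = a + bX gives X = -a/b + (1/b)Y, so symmetry would need m == 1/b.
inverse_slope = 1 / b
print(f"slope of X on Y:            {m:.4f}")
print(f"slope of inverted Y on X:   {inverse_slope:.4f}")
print(f"product b*m (equals r^2):   {b * m:.4f}")
```

The product of the two slopes equals the squared correlation, so the two lines coincide only when the correlation is perfect.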



15
Equating Procedures
  • Pre-Equating
  • Equating done on sections of a test, not the
    final test booklets
  • Scores are not counted for students
  • Post-Equating
  • Equating done on final test booklets, not
    sections of a test
  • Equipercentile Equating
  • Equates percentiles of two score distributions
    for two test forms
  • Linear Equating
  • Equates means and standard deviations of two
    score distributions for two test forms

16
Illustration of the Equipercentile Equating
Process
  • Equipercentile equating can be thought of as a
    two-stage process (Kolen, 1984).
  • First,
  • the relative cumulative frequency (i.e.,
    percentage of cases below a score interval)
    distributions are tabulated or plotted for the
    two forms to be equated.
  • Second,
  • equated scores (i.e., scores with identical
    relative cumulative frequencies) on the two forms
    are obtained from these cumulative frequency
    distributions.
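
The two stages can be sketched in a few lines of Python. This is an illustrative sketch, not the procedure's reference implementation: the function names and the tiny score lists are hypothetical, and percent-below values are taken at the lower real limit i - 0.5 of each integer score and joined by straight line segments.

```python
def percent_below(scores, n_items):
    """Stage 1: percent of cases below each lower real limit i - 0.5,
    for i = 0..n_items, plus the upper limit n_items + 0.5 (always 100)."""
    total = len(scores)
    points, below = [], 0
    for i in range(n_items + 1):
        points.append((i - 0.5, 100.0 * below / total))
        below += scores.count(i)
    points.append((n_items + 0.5, 100.0))
    return points

def interpolate(points, x):
    """Piecewise-linear interpolation through points sorted by first
    coordinate; segments with repeated first coordinates are skipped."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 > x0 and x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def equipercentile_equivalent(y, scores_x, scores_y, n_items):
    """Stage 2: form-X score with the same percent below as score y on form Y."""
    p = interpolate(percent_below(scores_y, n_items), y)
    inverse_x = [(pct, score) for score, pct in percent_below(scores_x, n_items)]
    return interpolate(inverse_x, p)

# Hypothetical data: form X scores are the form Y scores shifted up by 2.
scores_y = [1, 2, 2, 3, 3, 3, 4, 4, 5]
scores_x = [s + 2 for s in scores_y]
result = equipercentile_equivalent(3, scores_x, scores_y, n_items=10)
print(result)
```

With these shifted distributions, a form Y score of 3 maps to a form X score of 5, as expected when one form is uniformly two points easier.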

17
Illustration of the Equipercentile
Equating Process (continued)
  • A graphical method for equipercentile equating is
    illustrated in Figure 6.4.
  • First,
  • the relative cumulative frequency distributions,
    each based on 471 examinees, for two forms
    (designated X and Y) of a 60-item
    number-right-scored test were plotted.
  • The crosses (and stars) represent the relative
    cumulative frequency (i.e., percent below) at the
    lower real limit of each integer score interval
    (e.g., at i - 0.5, for i = 1, 2, ..., n, where n is
    the number of items).
  • Next,
  • the crosses (stars) were connected with straight
    line segments.
  • Graphs constructed in this manner are referred to
    as linearly interpolated relative cumulative
    frequency distributions.
  • The interpolation between the crosses (stars)
    need not be linear.
  • Methods of curvilinear interpolation, such as the
    use of cubic splines, could also be employed.

18
Illustration of the Equipercentile
Equating Process (continued)
  • Let the form-X equipercentile equivalent of y_i
    be denoted e_x(y_i).
  • The calculation of the form-X equipercentile
    equivalent e_x(18) of a number-right score of 18
    on form Y is illustrated in Figure 6.4.
  • The left-hand vertical arrow indicates that the
    relative cumulative frequency for a score of 18
    on form Y is 50 percent.
  • The short horizontal arrow shows the point on the
    curve for form X with the same relative
    cumulative frequency (50 percent).
  • The right-hand vertical arrow indicates that a
    score of 30 on form X is associated with this
    relative cumulative frequency.
  • Thus, a score of 30 on form X is considered to
    be equivalent to a score of 18 on form Y.
  • A plot of the score conversion (equivalent) is
    given in Figure 6.5.
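
In symbols, the graphical lookup amounts to composing one cumulative distribution with the inverse of the other. The notation F_X and F_Y for the interpolated percent-below curves is an assumption introduced here, not taken from the slides:

```latex
e_x(y_i) = F_X^{-1}\bigl(F_Y(y_i)\bigr),
\qquad
F_Y(18) = 50 = F_X(30) \;\Longrightarrow\; e_x(18) = 30 .
```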

19
  • The equipercentile transformation between two
    forms, X and Y, of a test will usually be
    curvilinear.
  • If form X is more difficult than form Y, the
    conversion line will tend to be concave downward.
  • If the distribution of scores on form X is
    flatter, more platykurtic, than that on form Y,
    the conversion will tend to be S-shaped.
  • If the shapes of the score distributions on the
    two forms are the same (i.e., have the same
    moments except for the first two), the conversion
    line will be linear.
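
The last point can be checked with a short derivation. As a sketch, assume both score distributions arise from one common, strictly increasing distribution function G applied to standardized scores; G and the means and standard deviations written as mu and sigma are notation introduced here for illustration:

```latex
\begin{align*}
F_X(x) &= G\!\left(\frac{x-\mu_X}{\sigma_X}\right), &
F_Y(y) &= G\!\left(\frac{y-\mu_Y}{\sigma_Y}\right), \\
F_X\bigl(e_x(y)\bigr) = F_Y(y)
 &\;\Longrightarrow\;
 \frac{e_x(y)-\mu_X}{\sigma_X} = \frac{y-\mu_Y}{\sigma_Y} &
 &\;\Longrightarrow\;
 e_x(y) = \mu_X + \frac{\sigma_X}{\sigma_Y}\,(y-\mu_Y).
\end{align*}
```

Under this assumption the equipercentile conversion is a straight line determined by the first two moments only.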

20
Use of Smoothing Techniques
  • Unsmoothed equipercentile equating uses straight
    linear interpolation for the ogives
  • Smoothing techniques can be used with curvilinear
    interpolation such as cubic splines with
    different parameters
  • Smoothing the ogives is known as the pre-smoothing
    method
  • Smoothing the conversion function is known as the
    post-smoothing method

21
Application of Equipercentile-Equating to Data
Collection Designs
  • Equipercentile equating can also be carried out
    for the anchor-test-random-groups design in the
    following manner:
  • Step 1: Using the data for the group taking tests
    X and V (the anchor test), for each raw score on
    test V, determine the score on test X with the
    same percentile rank.
  • Step 2: Using the data for the group taking tests
    Y and V, for each raw score on test V, determine
    the score on test Y with the same percentile rank.

22
Application of Equipercentile-Equating to Data
Collection Designs (continued)
  • Step 3: Tabulate pairs of scores on tests X and Y
    that correspond to the same raw score on test V.
  • Step 4: Using data from step 3, for each raw score
    on test Y, interpolate to determine the equivalent
    score on test X.
  • This procedure uses the data on test V to
    adjust for differences in ability between the two
    groups. It really involves two equatings, instead
    of just one, and therefore doubles the variance of
    equating error.
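
The anchor-test procedure can be sketched as follows. This is a hedged illustration using only the Python standard library: the function names, the linear interpolation of percent-below curves, and the small data set are all assumptions made for the example.

```python
def ogive(scores, max_score):
    """Percent below at each lower real limit i - 0.5, i = 0..max_score + 1."""
    total = len(scores)
    points, below = [], 0
    for i in range(max_score + 2):
        points.append((i - 0.5, 100.0 * below / total))
        below += scores.count(i)
    return points

def interp(points, x):
    """Piecewise-linear interpolation; segments with equal first
    coordinates are skipped."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 > x0 and x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def same_rank_score(v, anchor, form, max_v, max_form):
    """Score on the form with the same percentile rank as anchor score v."""
    p = interp(ogive(anchor, max_v), v)
    inverse = [(pct, s) for s, pct in ogive(form, max_form)]
    return interp(inverse, p)

def anchor_equate(y, group1, group2, max_v, max_form):
    """group1 holds (x, v) pairs, group2 holds (y, v) pairs."""
    x1, v1 = zip(*group1)
    y2, v2 = zip(*group2)
    # Tabulate (y, x) pairs that correspond to the same raw anchor score v.
    pairs = sorted(
        (same_rank_score(v, v2, y2, max_v, max_form),
         same_rank_score(v, v1, x1, max_v, max_form))
        for v in range(max_v + 1)
    )
    # Interpolate in the table to find the form-X equivalent of y.
    return interp(pairs, y)

# Hypothetical data: identical anchor distributions, y = 2v and x = 2v - 1.
v = [1, 2, 2, 3, 3, 3, 4, 4, 5]
group2 = [(2 * s, s) for s in v]
group1 = [(2 * s - 1, s) for s in v]
x_for_6 = anchor_equate(6, group1, group2, max_v=5, max_form=10)
x_for_5 = anchor_equate(5, group1, group2, max_v=5, max_form=10)
print(x_for_6, x_for_5)
```

Because form X is constructed to be exactly one point harder than form Y in this toy data, the chained conversion returns y - 1.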

23
Standard Errors of Equipercentile Equating
24
Standard Errors of Equipercentile Equating (continued)
  • Another procedure that may be used to estimate
    the standard error of an equipercentile equating
    is the bootstrap method (Efron 1982).
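
A bootstrap estimate might look like the sketch below: resample examinees with replacement from each group, redo the equating, and take the standard deviation of the resulting equivalents. The equating helper, the simulated scores, and the choice of 200 replications are all assumptions for illustration.

```python
import random
import statistics

def ogive(scores, max_score):
    """Percent below at each lower real limit i - 0.5, i = 0..max_score + 1."""
    total = len(scores)
    points, below = [], 0
    for i in range(max_score + 2):
        points.append((i - 0.5, 100.0 * below / total))
        below += scores.count(i)
    return points

def interp(points, x):
    """Piecewise-linear interpolation; flat runs in the first coordinate
    are skipped."""
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 > x0 and x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]

def equate(y, scores_x, scores_y, max_score):
    """Form-X equipercentile equivalent of form-Y score y."""
    p = interp(ogive(scores_y, max_score), y)
    inverse = [(pct, s) for s, pct in ogive(scores_x, max_score)]
    return interp(inverse, p)

def bootstrap_se(y, scores_x, scores_y, max_score, n_boot=200, seed=0):
    """Standard deviation of the equated score over bootstrap resamples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        bx = rng.choices(scores_x, k=len(scores_x))
        by = rng.choices(scores_y, k=len(scores_y))
        estimates.append(equate(y, bx, by, max_score))
    return statistics.stdev(estimates)

# Simulated (hypothetical) number-right scores on two 20-item forms.
sim = random.Random(1)
scores_x = [sum(sim.random() < 0.5 for _ in range(20)) for _ in range(100)]
scores_y = [sum(sim.random() < 0.6 for _ in range(20)) for _ in range(100)]
se = bootstrap_se(12, scores_x, scores_y, max_score=20)
print(f"bootstrap SE of e_x(12): {se:.3f}")
```

Fixing the seed makes the estimate reproducible; more replications reduce the Monte Carlo noise in the standard error itself.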

25
Linear Equating
When tests X and Y are not equally reliable, the true
scores on X and Y are used instead
26
Illustration of the Linear Equating Process
  • Linear equating, like equipercentile equating,
    can be thought of as a two-stage process.
  • First, compute the sample means (m) and standard
    deviations (s) of scores on the two forms to be
    equated.
  • Second, obtain equated scores on the two forms by
    substituting these values into the linear equating
    equation.
  • For example, suppose the raw-score means and the
    standard deviations for two forms, X and Y, of a
    60-item number-right-scored test administered to
    a single group of 471 examinees are

27
Illustration of the Linear Equating
Process (continued)
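
The two stages of linear equating can be sketched as below, using the standard linear equating function l_x(y) = m_x + (s_x / s_y)(y - m_y), which sets standardized scores on the two forms equal. The score lists are hypothetical and are not the values from the talk.

```python
import statistics

def linear_equate(y, scores_x, scores_y):
    """Form-X equivalent of form-Y score y: same standardized score on both forms."""
    # Stage 1: sample means and standard deviations on each form.
    m_x, m_y = statistics.fmean(scores_x), statistics.fmean(scores_y)
    s_x, s_y = statistics.stdev(scores_x), statistics.stdev(scores_y)
    # Stage 2: substitute into the linear equating equation.
    return m_x + (s_x / s_y) * (y - m_y)

# Hypothetical scores of a single group on both forms.
scores_x = [20, 24, 30, 31, 33, 38, 41]
scores_y = [12, 15, 18, 18, 20, 24, 27]

equated = [linear_equate(y, scores_x, scores_y) for y in scores_y]
print([round(e, 2) for e in equated])
```

After conversion, the form Y scores have the same mean and standard deviation as the form X scores, which is exactly what linear equating guarantees and all that it guarantees.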
28
Application of Linear Equating to Data Collection
Designs
  • Linear equating can be carried out for the
    anchor-test-random-groups design in the same
    manner as for the equivalent-group design, in
    which case, the data on anchor-test V are
    ignored.
  • However, even when the groups are chosen at
    random, it is inevitable that there will be some
    differences between them, which, if ignored, will
    lead to bias in the conversion line.
  • The data on test V can be used to adjust for
    differences between groups by means of the
    maximum-likelihood approach (Lord, 1955a).
  • Maximum-likelihood estimates of the population
    means and standard deviations on forms X and Y
    are as follows:

29
Application of Linear Equating to Data Collection
Designs (continued)

30
Standard Errors of Linear Equating
31
Comparison of equating methods
  • Equipercentile Equating
  • Adjusts for differences in difficulty of test
    forms
  • Can equate up to the fourth moments of the score
    distribution
  • Percent of students below a particular score is
    equated
  • Linear Equating
  • Adjusts for differences in difficulty of test
    forms
  • Only equates up to the first two moments of the
    score distribution
  • Percent of students scoring below an equated
    score is not equated

32
References
  • Kolen and Brennan (1995) Test Equating,
    Springer-Verlag
  • Petersen, Kolen, and Hoover's chapter on test
    equating in Linn (1993) Educational Measurement,
    ACE-Oryx publishing

33
Thank You
34
Application of Linear Equating to Data Collection
Designs