Dr. Mohammad H. Omar Department of Mathematical Sciences - PowerPoint PPT Presentation

Description:

Presented at Statistic Research (STAR) colloquium, King Fahd University of ... on test equating in Linn (1993) Educational Measurement, Ace-Oryx publishing ... – PowerPoint PPT presentation

1
Some Statistics for Equating Multiple Forms of a Test

by
Dr. Mohammad H. Omar
Department of Mathematical Sciences
May 16, 2006
Presented at the Statistic Research (STAR) colloquium,
King Fahd University of Petroleum & Minerals,
Dhahran, Saudi Arabia.

2
Equating
3
Brief overview of Talk

Test administration using
  Only one form
  More than one form
Test Equity
Steps to ensuring equity
Conditions for Equated Score
Data Collection Designs
Equating procedures
Illustration of the Equipercentile Equating process
Use of smoothing techniques
Application of equipercentile equating to data collection design
Standard errors of equipercentile equating
Linear equating
Illustration of the Linear Equating process
Application of linear equating to data collection design
Standard errors of linear equating
Comparison of equating methods

4
Test Administration using only one Form
Advantage
(1) A score means the same thing for every student, if cheating doesn't occur.
Disadvantage
(1) Dishonest students can copy answers from neighbouring students.
(2) Scores of dishonest students can be unreliably high.
(3) Honest students are disadvantaged by acts of dishonest students.
5
Test Administration using more than one form
Advantage
(1) Substantially reduces the chance of dishonesty (cheating).
(2) Honest students are not disadvantaged by acts of dishonest students.
(3) Scores of dishonest students are reliably low if cheating occurs.
Disadvantage
(1) Some equity issues arise if test equating is not carried out.
6
Test Equity

Definition (layman's definition)
Equity
"It is a matter of indifference which test
form a student took"

7
Steps to ensuring Equity

Building test forms to the same test content specifications
  Test forms should be interchangeable.
  No one form should have different content specifications than the others.
Test length should be the same.
  No one form should be longer than another.
  Students should not be disadvantaged by taking a longer test form than their peers.

Interchangeable content?
                   Form X    Form Y
Differentiation    80        20
Integration        20        80

Same length?
Time                           Form X         Form Y
Required to finish             2 hr           1 hr
Allotted for administration    1 hr 30 min    1 hr 30 min
8
Steps to ensuring Equity (continued)

Building test forms to the same test parameter specifications
  Test forms should be equally difficult.
  Students should not be disadvantaged by taking test forms that are very difficult compared to what their peers take in the same administration.
  Test forms should be equally reliable.

Same difficulty?
                                         Form X    Form Y
Percent of students below median of X    50        70

Same consistency?
                     Form X    Form Y
Coefficient alpha    0.70      0.90
9
Conditions For Equated Scores

The purpose of equating is to establish, as nearly as possible, an effective equivalence between raw scores on two test forms.
Because equating is an empirical procedure, it requires a design for data collection and a rule for transforming scores on one test form to scores on another.
Many practitioners would agree with Lord (1980) that scores on test X and test Y are equated if the following four conditions are met:
Same Ability: the two tests must both be measures of the same characteristic (latent trait, ability or skill).
Equity: for every group of examinees of identical ability, the conditional frequency distribution of scores on test Y, after transformation, is the same as the conditional frequency distribution of scores on test X.
Population Invariance: the transformation is the same regardless of the group from which it is derived.
Symmetry: the transformation is invertible, that is, the mapping of scores from form X to form Y is the inverse of the mapping of scores from form Y to form X.

10
Conditions For Equated Scores (continued)

The equity condition is unlikely to be precisely satisfied in practice.
Although it might be possible to build two forms of a test that measure the same characteristic and are equally reliable overall, it is highly unlikely that one could ever build two forms that are equally reliable at every ability level, let alone two forms that produce the same conditional frequency distributions.

11
Data Collection Designs
12

13
Equating Data Collection Designs

No statistical procedure can provide completely appropriate adjustments when non-equivalent or naturally occurring groups are used, but adjustments based on another test that is as close as possible to the tests to be equated are much more satisfactory than those based on nonparallel tests.

14
Equating Procedures

Can regression be used to equate scores?
No, because the regression Y = a + bX does not give the same conversion function as the regression X = c + mY.
To ensure equity, the conversion functions need to be the same.
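
The asymmetry of regression can be checked numerically. The sketch below is only an illustration on simulated scores (the data, the seed, and the helper name regression_line are assumptions, not part of the talk). It fits both regressions and shows that the slope of X on Y is not the slope obtained by inverting the regression of Y on X; the product of the two regression slopes equals the squared correlation, which is below 1 unless the scores are perfectly correlated.

```python
import numpy as np

def regression_line(predictor, outcome):
    """Least-squares line: outcome = intercept + slope * predictor."""
    slope = np.cov(predictor, outcome, ddof=1)[0, 1] / np.var(predictor, ddof=1)
    intercept = np.mean(outcome) - slope * np.mean(predictor)
    return intercept, slope

# Simulated (hypothetical) scores for one group taking both forms.
rng = np.random.default_rng(0)
ability = rng.normal(0.0, 1.0, size=2000)
x = 30 + 8 * ability + rng.normal(0.0, 4.0, size=2000)
y = 25 + 6 * ability + rng.normal(0.0, 4.0, size=2000)

a, b = regression_line(x, y)  # Y = a + bX
c, m = regression_line(y, x)  # X = c + mY

# Inverting Y = a + bX gives X = -a/b + (1/b) Y, whose slope is 1/b.
slope_from_inverting = 1.0 / b
r = np.corrcoef(x, y)[0, 1]

print(f"slope of X on Y: {m:.3f}")
print(f"slope from inverting Y on X: {slope_from_inverting:.3f}")
print(f"b * m = {b * m:.3f}, r^2 = {r ** 2:.3f}")
```

Because b * m = r^2, the two lines coincide only when |r| = 1, so a regression line cannot serve as a symmetric conversion function.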

15
Equating Procedures

Pre-Equating
  Equating is done on sections of a test, not the final test booklets.
  Scores are not counted for the student.
Post-Equating
  Equating is done on final test booklets, not sections of a test.
Equipercentile Equating
  Equates percentiles of two score distributions for two test forms.
Linear Equating
  Equates means and standard deviations of two score distributions for two test forms.

16
Illustration of the Equipercentile Equating Process

Equipercentile equating can be thought of as a two-stage process (Kolen, 1984).
First, the relative cumulative frequency distributions (i.e., the percentage of cases below a score interval) are tabulated or plotted for the two forms to be equated.
Second, equated scores (i.e., scores with identical relative cumulative frequencies) on the two forms are obtained from these cumulative frequency distributions.

17
Illustration of the Equipercentile Equating Process (continued)

A graphical method for equipercentile equating is illustrated in Figure 6.4.
First, the relative cumulative frequency distributions, each based on 471 examinees, for two forms (designated X and Y) of a 60-item number-right-scored test were plotted.
The crosses (and stars) represent the relative cumulative frequency (i.e., percent below) at the lower real limit of each integer score interval (i.e., at i - 0.5, for i = 1, 2, ..., n, where n is the number of items).
Next, the crosses (stars) were connected with straight line segments.
Graphs constructed in this manner are referred to as linearly interpolated relative cumulative frequency distributions.
The line segments connecting the crosses (stars) need not be linear.
Methods of curvilinear interpolation, such as the use of cubic splines, could also be employed.

18
Illustration of the Equipercentile Equating Process (continued)

Let the form-X equipercentile equivalent of y_i be denoted e_x(y_i).
The calculation of the form-X equipercentile equivalent e_x(18) of a number-right score of 18 on form Y is illustrated in Figure 6.4.
The left-hand vertical arrow indicates that the relative cumulative frequency for a score of 18 on form Y is 50 percent.
The short horizontal arrow shows the point on the curve for form X with the same relative cumulative frequency (50 percent).
The right-hand vertical arrow indicates that a score of 30 on form X is associated with this relative cumulative frequency.
Thus, a score of 30 on form X is considered to be equivalent to a score of 18 on form Y.
A plot of the score conversion (equivalent) is given in Figure 6.5.
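
The graphical procedure above can be sketched in code. This is a minimal sketch under stated assumptions: the scores are simulated (they are not the data behind Figure 6.4), the function names percent_below and equipercentile_equivalent are hypothetical, and flat stretches of the form-X curve (scores nobody obtained) are resolved by taking the lowest real limit with that cumulative frequency.

```python
import numpy as np

def percent_below(scores, n_items):
    """Relative cumulative frequency at the real limits -0.5, 0.5, ..., n_items + 0.5."""
    counts = np.bincount(scores, minlength=n_items + 1)
    limits = np.arange(n_items + 2) - 0.5
    cum = np.concatenate(([0.0], np.cumsum(counts))) / counts.sum()
    return limits, cum

def equipercentile_equivalent(y, scores_x, scores_y, n_items):
    """Form-X equivalent e_x(y) using linearly interpolated cumulative distributions."""
    limits, cdf_x = percent_below(scores_x, n_items)
    _, cdf_y = percent_below(scores_y, n_items)
    # Stage 1: relative cumulative frequency of score y on form Y.
    p = np.interp(y, limits, cdf_y)
    # Stage 2: form-X score with the same relative cumulative frequency.
    # Duplicate cumulative values are dropped so the inverse is well defined.
    p_unique, first = np.unique(cdf_x, return_index=True)
    return np.interp(p, p_unique, limits[first])

# Simulated (hypothetical) number-right scores on two 60-item forms, 471 examinees each.
rng = np.random.default_rng(1)
scores_x = rng.binomial(60, 0.50, size=471)  # easier form
scores_y = rng.binomial(60, 0.30, size=471)  # harder form

e_x_18 = float(equipercentile_equivalent(18, scores_x, scores_y, n_items=60))
print(f"e_x(18) on the simulated data: {e_x_18:.2f}")
```

Swapping np.interp for a cubic-spline interpolator would give the curvilinear variant mentioned above; linear interpolation is used here because it matches the straight line segments in the description.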

The equipercentile transformation between two
forms, X and Y, of a test will usually be
curvilinear.
If form X is more difficult than form Y, the
conversion line will tend to be concave downward.
If the distribution of scores on form X is
flatter, more platykurtic, than that on form Y,
the conversion will tend to be S-shaped.
If the shapes of the score distributions on the
two forms are the same (i.e., have the same
moments except for the first two), the conversion
line will be linear.

20
Use of Smoothing Techniques

Unsmoothed equipercentile equating uses straight-line (linear) interpolation for the ogives.
Smoothing techniques can be used with curvilinear interpolation, such as cubic splines with different parameters.
Smoothing the ogives is known as the pre-smoothing method.
Smoothing the conversion function is known as the post-smoothing method.

21
Application of Equipercentile Equating to Data Collection Designs

Equipercentile equating can also be carried out for the anchor-test-random-groups design in the following manner:
1. Using the data for the group taking tests X and V (the anchor test), for each raw score on test V, determine the score on test X with the same percentile rank.
2. Using the data for the group taking tests Y and V, for each raw score on test V, determine the score on test Y with the same percentile rank.

22
Application of Equipercentile Equating to Data Collection Designs (continued)

3. Tabulate pairs of scores on tests X and Y that correspond to the same raw score on test V.
4. Using the data from step 3, for each raw score on test Y, interpolate to determine the equivalent score on test X.
The last procedure uses the data on test V to adjust for differences in ability between the two groups. This procedure really involves two equatings, instead of just one, and therefore doubles the variance of equating error.
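
The four steps of the anchor-test procedure might be sketched as follows. Everything specific here is an assumption made for illustration: the simulated groups, the 20-item anchor length, the logistic response model in simulate, and the helper names. Raw Y scores outside the range covered by the tabulated pairs are clamped to the nearest tabulated value by np.interp.

```python
import numpy as np

def percent_below(scores, max_score):
    """Relative cumulative frequency at the real limits -0.5, ..., max_score + 0.5."""
    counts = np.bincount(scores, minlength=max_score + 1)
    limits = np.arange(max_score + 2) - 0.5
    cum = np.concatenate(([0.0], np.cumsum(counts))) / counts.sum()
    return limits, cum

def equate(values, from_scores, from_max, to_scores, to_max):
    """Scores on the 'to' test with the same percentile rank as `values` on the 'from' test."""
    from_limits, from_cdf = percent_below(from_scores, from_max)
    to_limits, to_cdf = percent_below(to_scores, to_max)
    p = np.interp(values, from_limits, from_cdf)
    p_unique, first = np.unique(to_cdf, return_index=True)
    return np.interp(p, p_unique, to_limits[first])

def simulate(rng, ability, max_score, difficulty):
    """Hypothetical number-right scores from a simple logistic response model."""
    prob = 1.0 / (1.0 + np.exp(-(ability - difficulty)))
    return rng.binomial(max_score, prob)

N_X, N_Y, N_V = 60, 60, 20
rng = np.random.default_rng(2)
theta1 = rng.normal(0.0, 1.0, size=1500)   # group taking X and V
theta2 = rng.normal(0.2, 1.0, size=1500)   # group taking Y and V (slightly more able)
x1, v1 = simulate(rng, theta1, N_X, 0.0), simulate(rng, theta1, N_V, 0.0)
y2, v2 = simulate(rng, theta2, N_Y, 0.5), simulate(rng, theta2, N_V, 0.0)  # Y is harder

v_grid = np.arange(N_V + 1)
x_at_v = equate(v_grid, v1, N_V, x1, N_X)  # step 1: X score for each V score
y_at_v = equate(v_grid, v2, N_V, y2, N_Y)  # step 2: Y score for each V score
# Step 3: the pairs (y_at_v[k], x_at_v[k]) correspond to the same raw score k on V.
# Step 4: interpolate in those pairs to get an X equivalent for each raw Y score.
y_unique, first = np.unique(y_at_v, return_index=True)
y_grid = np.arange(N_Y + 1)
x_for_y = np.interp(y_grid, y_unique, x_at_v[first])

print(f"X equivalent of Y = 25 on the simulated data: {x_for_y[25]:.2f}")
```

Since both step 1 and step 2 are estimated from samples, the final conversion inherits sampling error from two equatings, which is the point made in the text about the variance of equating error.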

23
Standard Errors of Equipercentile Equating
24
Standard Errors of Equipercentile Equating (continued)

Another procedure that may be used to estimate the standard error of an equipercentile equating is the bootstrap method (Efron, 1982).
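
A bootstrap estimate of this kind can be sketched as below. The sketch assumes a random-groups design in which two independent samples take forms X and Y, so each sample is resampled separately; the data are simulated, and the function names, the number of replications, and the seeds are illustrative choices rather than anything specified in the talk.

```python
import numpy as np

def percent_below(scores, n_items):
    """Relative cumulative frequency at the real limits -0.5, ..., n_items + 0.5."""
    counts = np.bincount(scores, minlength=n_items + 1)
    limits = np.arange(n_items + 2) - 0.5
    cum = np.concatenate(([0.0], np.cumsum(counts))) / counts.sum()
    return limits, cum

def equipercentile_equivalent(y, scores_x, scores_y, n_items):
    """Form-X equivalent of a form-Y score, using linear interpolation."""
    limits, cdf_x = percent_below(scores_x, n_items)
    _, cdf_y = percent_below(scores_y, n_items)
    p = np.interp(y, limits, cdf_y)
    p_unique, first = np.unique(cdf_x, return_index=True)
    return np.interp(p, p_unique, limits[first])

def bootstrap_se(y, scores_x, scores_y, n_items, n_boot=500, seed=0):
    """Standard deviation of e_x(y) over bootstrap resamples of examinees."""
    rng = np.random.default_rng(seed)
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        # Resample examinees with replacement, separately within each group.
        boot_x = rng.choice(scores_x, size=scores_x.size, replace=True)
        boot_y = rng.choice(scores_y, size=scores_y.size, replace=True)
        replicates[b] = equipercentile_equivalent(y, boot_x, boot_y, n_items)
    return replicates.std(ddof=1)

# Simulated (hypothetical) scores on two 60-item forms, 471 examinees each.
rng = np.random.default_rng(3)
scores_x = rng.binomial(60, 0.50, size=471)
scores_y = rng.binomial(60, 0.30, size=471)

se_18 = bootstrap_se(18, scores_x, scores_y, n_items=60)
print(f"bootstrap standard error of e_x(18): {se_18:.3f}")
```

For a single-group design, where the same examinees take both forms, the resampling would instead draw examinees once and keep their X and Y scores paired.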

25
Linear Equating
When tests X and Y are not equally reliable, the true scores on X and Y are used instead.
26
Illustration of the Linear Equating Process

Linear equating, like equipercentile equating, can be thought of as a two-stage process.
First, compute the sample means (m) and standard deviations (s) of scores on the two forms to be equated.
Second, obtain equated scores on the two forms by substituting these values into the linear equating equation.
For example, suppose the raw-score means and the standard deviations for two forms, X and Y, of a 60-item number-right-scored test administered to a single group of 471 examinees are

27
Illustration of the Linear Equating Process (continued)
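
The two-stage linear process can be sketched as follows. The sketch assumes the usual observed-score form of linear equating, which sets standardized scores equal, (x - m_x) / s_x = (y - m_y) / s_y, so that the form-X equivalent of y is m_x + (s_x / s_y)(y - m_y). The scores are simulated for a single group taking both forms; they are not the values from the slide, and the function name linear_equate is hypothetical.

```python
import numpy as np

def linear_equate(score, from_scores, to_scores):
    """Convert a score on the 'from' form to the scale of the 'to' form."""
    # Stage 1: sample means and standard deviations of the two forms.
    m_from, s_from = np.mean(from_scores), np.std(from_scores, ddof=1)
    m_to, s_to = np.mean(to_scores), np.std(to_scores, ddof=1)
    # Stage 2: substitute into the linear equating equation.
    return m_to + (s_to / s_from) * (np.asarray(score, dtype=float) - m_from)

# Simulated (hypothetical) single group of 471 examinees taking both 60-item forms.
rng = np.random.default_rng(4)
ability = rng.normal(0.0, 1.0, size=471)
scores_x = rng.binomial(60, 1.0 / (1.0 + np.exp(-ability)))
scores_y = rng.binomial(60, 1.0 / (1.0 + np.exp(-(ability - 0.8))))  # harder form

l_x_18 = float(linear_equate(18, scores_y, scores_x))     # form-X equivalent of Y = 18
back_to_y = float(linear_equate(l_x_18, scores_x, scores_y))
converted = linear_equate(scores_y, scores_y, scores_x)   # all Y scores on the X scale

print(f"l_x(18) on the simulated data: {l_x_18:.2f}")
print(f"converted back to form Y: {back_to_y:.2f}")
```

Converting a score to form X and back returns the original score, so this conversion satisfies the symmetry condition, and the converted Y scores have the same mean and standard deviation as the X scores.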
28
Application of Linear Equating to Data Collection
Designs

Linear equating can be carried out for the
anchor-test-random-groups design in the same
manner as for the equivalent-group design, in
which case, the data on anchor-test V are
ignored.
However, even when the groups are chosen at
random, it is inevitable that there will be some
differences between them, which, if ignored, will
lead to bias in the conversion line.
The data on test V can be used to adjust for
differences between groups by means of the
maximum-likelihood approach (Lord, 1955a).
Maximum-likelihood estimates of the population means and standard deviations on forms X and Y are as follows:

29
Application of Linear Equating to Data Collection Designs (continued)

30
Standard Errors of Linear Equating
31
Comparison of equating methods

Equipercentile Equating
  Adjusts for differences in difficulty of test forms.
  Can equate up to the fourth moments of the score distribution.
  The percent of students below a particular score is equated.

Linear Equating
  Adjusts for differences in difficulty of test forms.
  Only equates the first two moments of the score distribution.
  The percent of students scoring below an equated score is not equated.

32
References

Kolen and Brennan (1995). Test Equating. Springer-Verlag.
Kolen, Petersen, and Hoover's chapter on test equating in Linn (1993), Educational Measurement, ACE-Oryx Publishing.

33
Thank You
34
Application of Linear Equating to Data Collection
Designs
