Title: EPSY 546: LECTURE 1 INTRODUCTION TO MEASUREMENT THEORY
1EPSY 546 LECTURE 1: INTRODUCTION TO MEASUREMENT THEORY
2What is test theory?
4WHAT IS A TEST?
- Test: A procedure for obtaining a sample of
person behavior from a specified domain of items.
- General: Exam, questionnaire, survey,
judge-observed task, etc.
5ITEM RESPONSE SCORING
- Test item responses are scored.
- Some Examples
- Dichotomous
- 1 = Correct, 0 = Incorrect
- (Possibly scored from a multiple-choice test item)
6ITEM RESPONSE SCORING
- Test item responses are scored.
- Some Examples
- Rating Scale
- 1 = Strongly Disagree
- 2 = Disagree
- 3 = Agree
- 4 = Strongly Agree
7ITEM RESPONSE SCORING
- Test item responses are scored.
- Some Examples
- Partial Credit
- 1 = Completely incorrect
- 2 = Partially correct
- 3 = Completely correct
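The three scoring schemes above (dichotomous, rating scale, partial credit) can be sketched in a few lines of Python. This is only an illustration: the answer key, the responses, and the helper name `score_dichotomous` are hypothetical and do not come from the course materials.

```python
# Hypothetical scoring helpers; the key, labels, and responses are made up.

def score_dichotomous(response, key):
    """Return 1 if the response matches the keyed answer, else 0."""
    return 1 if response == key else 0

# Rating scale: map each response label to its category score.
RATING_SCALE = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Agree": 3,
    "Strongly Agree": 4,
}

# Partial credit: map each level of correctness to its score.
PARTIAL_CREDIT = {
    "Completely incorrect": 1,
    "Partially correct": 2,
    "Completely correct": 3,
}

answers = ["b", "d", "a"]  # one examinee's multiple-choice answers
key = ["b", "c", "a"]      # keyed (correct) options

dichotomous_scores = [score_dichotomous(r, k) for r, k in zip(answers, key)]
rating_scores = [RATING_SCALE[r] for r in ["Agree", "Strongly Disagree"]]
partial_scores = [PARTIAL_CREDIT[r] for r in ["Partially correct"]]

print(dichotomous_scores, rating_scores, partial_scores)
```

In each case the scoring rule is just a mapping from an observed response to a number; what differs is how many ordered categories the mapping allows.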
8WHAT TESTS DO
- Tests are designed to measure latent traits that
manifest in the responses to the test items.
9LATENT VARIABLES
- Some substantive examples of latent traits
- Exam: Ability on long division.
- Attitude Questionnaire: Agreement towards capital
punishment.
- Survey: Frequency of drug use.
- Survey: Quality of life.
10LATENT VARIABLES
- Latent trait (also called):
- latent variable
- psychological trait/variable/attribute
- unidimensional variable
- construct
11LATENT VARIABLES
- For measurement, latent variables are often
numerically represented either
- by total test score (person or item),
- or by parameters of person ability or item
difficulty.
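The first of these representations, the total score, can be sketched as follows. The response matrix is hypothetical (rows are persons, columns are items, entries are 0/1 item scores). The second representation, person and item parameters, requires a measurement model and is not shown here.

```python
# Hypothetical 0/1 response matrix: rows are persons, columns are items.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

# Person total score: sum across items (one value per row).
person_totals = [sum(row) for row in responses]

# Item total score: sum across persons (one value per column).
item_totals = [sum(col) for col in zip(*responses)]

print(person_totals)  # [3, 1, 4]
print(item_totals)    # [3, 2, 1, 2]
```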
13Some Challenges of latent trait measurement (5)
- 1. No single approach to the measurement of a
latent trait is universally accepted.
- Two theorists may select
different items to measure a particular
latent trait (e.g., math ability).
14Some Challenges of latent trait measurement (5)
- 2. Psychological measurements are usually based
on limited samples of behavior.
15Some Challenges of latent trait measurement (5)
- 2. Psychological measurements are usually based
on limited samples of behavior.
- It is practically impossible to confront respondents
with all possible items that represent the latent
trait (e.g., all long division items).
16Some Challenges of latent trait measurement (5)
- 2. Psychological measurements are usually based
on limited samples of behavior.
- N = 1 for each person on an item (a single response).
18Some Challenges of latent trait measurement (5)
- 3. Latent trait measurement obtained is
always subject to error.
- Random:
- sampling error of respondents and of items,
- inherent unreliability of respondents (e.g.,
boredom, lucky guess, carelessness).
19Some Challenges of latent trait measurement (5)
- 3. Latent trait measurement obtained is
always subject to error.
- Systematic:
- cheating on exam,
- response bias,
- item does not measure latent trait,
- misscoring (test form out of order).
21Some Challenges of latent trait measurement (5)
- 4. Establishing measurement scales for the
latent trait.
- Stevens (1946): Measurement is
"the assignment of numerals to objects or events according
to rules." (NOT!)
22Some Challenges of latent trait measurement (5)
- 4. Establishing measurement scales for the
latent trait.
- Michell: Measurement requires tests of the
hypothesis that the variable is quantitative.
(Echoing Luce, Krantz, Suppes, and Tversky in the three
Foundations of Measurement (FM) volumes)
24Some Challenges of latent trait measurement (5)
- 5. Latent traits must also demonstrate
relationships to other important traits or
observable phenomena.
- Measurements of latent traits have value when
they can be related to other traits or events in
the real world.
25WHAT IS TEST THEORY?
- The study of the 5 pervasive measurement problems
just described, and developing/applying methods
for their resolution.
26TEST THEORY COURSE
- Become aware of the logic and mathematical models
that underlie practices in test use and
construction.
27TEST THEORY COURSE
- Awareness of these models, including their
assumptions and limitations, should lead to an
improved practice in test construction and more
intelligent use of test information in decision
making.
28TEST THEORY COURSE
- Test theory provides a general framework for
viewing the process of instrument development.
- Test theory is distinct from the more applied
subject of educational and psychological
assessment (which focuses on the administration and
interpretation of specific tests).
29Process of Test Construction
30TEST CONSTRUCTION
- 10 steps can be followed to construct a test for
the measurement of persons (and items).
- (CA, Chapter 4)
32TEST CONSTRUCTION
- 1. Identify the primary purpose(s) for
which the test measurements will be
used.
- 2. Hypothesize items that define the
latent trait of interest.
34TEST CONSTRUCTION
- 3. Prepare a set of test specifications,
delineating the proportion of items that should
focus on each type of behavior identified in Step
2.
- 4. Construct an initial pool of items.
37TEST CONSTRUCTION
- 5. Have items reviewed and revised.
- 6. Hold preliminary item tryouts (and revise).
- 7. Field test the items on a large sample
representative of the examinee population for
whom the test is intended. (PILOT STUDY)
39TEST CONSTRUCTION
- 8. Determine statistical properties of the items,
and when appropriate, eliminate items that do not
meet pre-established criteria.
- 9. Design and conduct reliability and validity
studies for the final form of the test.
40TEST CONSTRUCTION
- 10. Develop guidelines for administration,
scoring, and interpretation of the test
scores
(e.g., prepare norm tables, suggest
recommended cutting scores or standards for
performance, etc.).
41Statistical Concepts for Test Theory
42BASIC STATISTICS (CA2)
- Frequency tables and graphs
- Distribution
- Normal distribution (p.d.f., c.d.f.)
- Central tendency: Mode, median, mean.
- Variability: Variance, standard deviation.
- Z-scores
- For infinite populations.
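As a small worked example of the concepts listed above, the sketch below computes the mean, the standard deviation, and z-scores for a hypothetical set of test scores. It assumes the population formulas (dividing by N rather than N - 1), which is why `pstdev` is used instead of `stdev`.

```python
from statistics import mean, pstdev

# Hypothetical total test scores for eight examinees.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(scores)       # central tendency
sigma = pstdev(scores)  # population standard deviation (divides by N)

# A z-score expresses each score as a distance from the mean in SD units.
z_scores = [(x - mu) / sigma for x in scores]

print(mu, sigma)
print(z_scores)
```

By construction, the z-scores have mean 0 and standard deviation 1, whatever the original score metric.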
43BASIC STATISTICS (CA2)
- Relationship between two variables
- Scatterplot.
- Pearson's correlation coefficient.
- Ordinary linear regression.
- Standard error of Y predictions, for a given
regression equation.
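The quantities on this slide can be illustrated with a short sketch. The paired scores are hypothetical, and population formulas (dividing by N) are assumed throughout; the standard error of estimate is computed as the standard deviation of Y multiplied by sqrt(1 - r^2).

```python
from math import sqrt

# Hypothetical paired scores on two variables.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Population covariance and standard deviations (divide by N).
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)

# Pearson's correlation coefficient.
r = cov_xy / (sd_x * sd_y)

# Ordinary linear regression of Y on X: Y' = slope * X + intercept.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

# Standard error of the Y predictions for this regression equation.
se_est = sd_y * sqrt(1 - r ** 2)

print(r, slope, intercept, se_est)
```

The stronger the correlation, the smaller the standard error of estimate; with r = 0 it equals the standard deviation of Y.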
44BASIC STATISTICS (CA5)
- Statistics for test items:
- Mean and total score for an item,
over respondents (item difficulty).
- Variance of responses on a test item.
- Inter-item correlation (Pearson's product-moment
correlation or phi correlation).
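For dichotomous items these statistics take a simple form, sketched below on a hypothetical response matrix. The item mean is the proportion correct p, the item variance is p(1 - p), and the phi correlation is Pearson's correlation applied to two 0/1 score vectors. Population formulas (dividing by N) are assumed.

```python
from math import sqrt

# Hypothetical 0/1 responses: rows are respondents, columns are items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
n = len(responses)
items = list(zip(*responses))  # one tuple of scores per item

# Item mean over respondents (item difficulty as proportion correct).
difficulty = [sum(col) / n for col in items]

# Variance of a 0/1 item is p * (1 - p).
variance = [p * (1 - p) for p in difficulty]

def phi(u, v):
    """Pearson correlation of two 0/1 score vectors (phi coefficient)."""
    m = len(u)
    mean_u, mean_v = sum(u) / m, sum(v) / m
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / m
    var_u = sum((a - mean_u) ** 2 for a in u) / m
    var_v = sum((b - mean_v) ** 2 for b in v) / m
    return cov / sqrt(var_u * var_v)

print(difficulty, variance)
print(phi(items[0], items[1]))
```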
45VARIANCE OF TEST SCORES AND TEST ITEMS
- Since tests are usually scored by the sum of the
item scores,
it follows that there should be some
relationship between the individual item variances
and the
variance of the total test scores.
46VARIANCE OF TEST SCORES AND TEST ITEMS
- In fact,
since the measurement of individual
differences is a central goal of testing,
one goal of test construction should be
to maximize the variance of the total test
scores.
- The reliability and validity of a test depend on
this variance.
47VARIANCE OF TEST SCORES AND TEST ITEMS
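- Covariance between items i and j:
$\sigma_{ij} = \frac{1}{N} \sum_{n=1}^{N} (X_{ni} - \mu_i)(X_{nj} - \mu_j)$
- $X_{ni}$ = score of respondent n on item i
- N = number of respondents
- J = number of items
- $\mu_i$ = population mean of item i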
48VARIANCE OF TEST SCORES AND TEST ITEMS
- Variance-Covariance Matrix
49VARIANCE OF TEST SCORES AND TEST ITEMS
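- Total Test Score Variance:
$\sigma_X^2 = \sum_{i=1}^{J} \sigma_i^2 + \sum_{i \neq j} \sigma_{ij}$
- First term: sum of the item variances
- Second term: sum of the item covariances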
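The decomposition of total test score variance into item variances plus item covariances can be checked numerically. The sketch below uses a hypothetical response matrix and the population covariance (dividing by N); the function name `covariance` is an illustrative choice.

```python
# Hypothetical 0/1 responses: rows are respondents, columns are items.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 0, 0],
]

def covariance(u, v):
    """Population covariance of two equally long score vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n

items = list(zip(*responses))
J = len(items)

# Variance-covariance matrix: variances on the diagonal,
# covariances between items i and j off the diagonal.
cov_matrix = [[covariance(items[i], items[j]) for j in range(J)]
              for i in range(J)]

sum_item_variances = sum(cov_matrix[i][i] for i in range(J))
sum_item_covariances = sum(cov_matrix[i][j]
                           for i in range(J) for j in range(J) if i != j)

# Variance of the total scores, computed directly.
totals = [sum(row) for row in responses]
total_variance = covariance(totals, totals)

print(total_variance, sum_item_variances + sum_item_covariances)
```

The two printed values agree up to floating-point rounding: the total score variance is the sum of every entry in the variance-covariance matrix.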
50VARIANCE OF TEST SCORES AND TEST ITEMS
- Implications of Equation (first term)
- Total test score variance increases as the number
of items (J) is increased
(except when the added items have a non-positive
correlation with the other items).
51VARIANCE OF TEST SCORES AND TEST ITEMS
- Implications of Equation (second term)
- Test score variance increases when items are
added that have positive covariances with the
other test items.
52VARIANCE OF TEST SCORES AND TEST ITEMS
- Implications of Equation
- Test score variance is maximized when
- items are equal in difficulty (this
increases item covariances),
- and of medium difficulty (this
increases item variances).
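Both claims can be illustrated for dichotomous items, as a sketch under the assumption that difficulty is expressed as the proportion correct p. The item variance p(1 - p) peaks at p = 0.5, and because the proportion answering two items correctly cannot exceed the smaller of the two difficulties, the largest attainable covariance is min(p_i, p_j) - p_i p_j, which shrinks as the difficulties move apart. The function names are illustrative.

```python
def item_variance(p):
    """Variance of a 0/1 item with proportion correct p."""
    return p * (1 - p)

def max_covariance(p_i, p_j):
    """Largest possible covariance between two 0/1 items.

    The joint proportion correct cannot exceed min(p_i, p_j).
    """
    return min(p_i, p_j) - p_i * p_j

# Medium difficulty maximizes the item variance.
difficulties = [i / 10 for i in range(11)]
best_p = max(difficulties, key=item_variance)
print(best_p, item_variance(best_p))  # 0.5 0.25

# Equal difficulties allow a larger covariance than unequal ones.
print(max_covariance(0.5, 0.5))
print(max_covariance(0.5, 0.8))
```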
53Introduction To Scaling
544 SCALES OF MEASUREMENT
- 1. Nominal Scale
- Used for classification.
- Assigns the same numbers to objects that are
equivalent, and a different number to objects
that are not.
554 SCALES OF MEASUREMENT
- 1. Nominal Scale
- Class of admissible transformations:
the class of one-to-one transformations,
- i.e., n_i(x) = n_i(y) iff n_j(x) = n_j(y)
- for all scales i, j, and objects x, y.
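A minimal sketch of this condition: relabeling the class codes with any one-to-one mapping leaves the classification unchanged. The objects, codes, and the helper name `same_classes` are hypothetical.

```python
# Hypothetical nominal codes assigned to three objects on scale i.
scale_i = {"ann": 1, "bob": 2, "cy": 1}

# A one-to-one relabeling of the codes gives scale j.
relabel = {1: 7, 2: 3}
scale_j = {obj: relabel[code] for obj, code in scale_i.items()}

def same_classes(a, b):
    """True if a and b agree on which pairs of objects are equivalent."""
    objs = list(a)
    return all((a[p] == a[q]) == (b[p] == b[q]) for p in objs for q in objs)

print(same_classes(scale_i, scale_j))

# A many-to-one relabeling merges classes, so it is not admissible.
merged = {obj: 0 for obj in scale_i}
print(same_classes(scale_i, merged))
```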
564 SCALES OF MEASUREMENT
- 2. Ordinal Scale
- With respect to some attribute,
this scale orders objects in magnitude, but
does not measure distances between the objects.
- Example: Ranking
574 SCALES OF MEASUREMENT
- 2. Ordinal Scale
- Class of admissible transformations:
the class of increasing monotonic transformations,
- i.e., n_i(x) > n_i(y) iff n_j(x) > n_j(y)
- for all scales i, j, and objects x, y.
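The ordinal condition can be sketched the same way: any strictly increasing transformation preserves the order of the objects, although it generally changes the distances between them. The values and the helper name `same_order` are hypothetical.

```python
# Hypothetical ordinal values (e.g., ranks) for three objects on scale i.
scale_i = {"x": 1, "y": 2, "z": 3}

# Any strictly increasing function gives an admissible scale j.
scale_j = {obj: value ** 3 + 10 for obj, value in scale_i.items()}

def same_order(a, b):
    """True if a and b order every pair of objects the same way."""
    objs = list(a)
    return all((a[p] > a[q]) == (b[p] > b[q]) for p in objs for q in objs)

print(same_order(scale_i, scale_j))

# A decreasing transformation reverses the order, so it is not admissible.
reversed_scale = {obj: -value for obj, value in scale_i.items()}
print(same_order(scale_i, reversed_scale))

# Order is preserved, but distances between objects are not.
print(scale_j["y"] - scale_j["x"], scale_j["z"] - scale_j["y"])
```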
584 SCALES OF MEASUREMENT
- 3. Interval Scale
- Numerically represents relations among
the differences between entities with
respect to some attribute (no absolute zero
point).
- Example: temperature measurement
(Fahrenheit, Celsius)
594 SCALES OF MEASUREMENT
- 3. Interval Scale
- Class of admissible transformations:
the class of positive linear transformations,
- n_j(x) = a n_i(x) + b
- for a > 0 and any real b
- e.g., C = (5/9)F - (160/9)
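The Fahrenheit-to-Celsius conversion makes the interval-scale property concrete. In the sketch below (temperatures chosen arbitrarily), ratios of differences are the same on both scales, while ratios of the raw values are not, which reflects the absence of an absolute zero point.

```python
def f_to_c(f):
    """Positive linear transformation with a = 5/9 and b = -160/9."""
    return (5 / 9) * f - 160 / 9

# Hypothetical temperatures in degrees Fahrenheit.
fahrenheit = [32.0, 50.0, 68.0, 104.0]
celsius = [f_to_c(f) for f in fahrenheit]

def diff_ratio(v):
    """Ratio of two differences: (v[3] - v[2]) / (v[1] - v[0])."""
    return (v[3] - v[2]) / (v[1] - v[0])

# Ratios of differences are invariant under the transformation.
print(diff_ratio(fahrenheit), diff_ratio(celsius))

# Ratios of raw values are not, so "twice as hot" has no meaning here.
print(fahrenheit[3] / fahrenheit[1], celsius[3] / celsius[1])
```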
604 SCALES OF MEASUREMENT
- 4. Ratio Scale
- Has properties of order, equal distance between
units, and an absolute zero point.
- Non-zero measurements on this scale may be
expressed as ratios of one another.
- Examples: Length, weight, etc.
614 SCALES OF MEASUREMENT
- 4. Ratio Scale
- Class of admissible transformations:
the class of multiplicative transformations,
- n_i(x) = c n_j(x), for c > 0
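A change of units is the standard example of a multiplicative transformation. The sketch below, using arbitrary lengths, shows that ratios of measurements survive multiplication by a positive constant but not the addition of a constant.

```python
CM_PER_INCH = 2.54  # multiplicative constant c > 0

# Hypothetical lengths in inches, converted to centimeters.
inches = [10.0, 20.0, 50.0]
centimeters = [CM_PER_INCH * v for v in inches]

# Ratios of measurements are preserved by the change of unit.
ratio_inches = inches[1] / inches[0]
ratio_cm = centimeters[1] / centimeters[0]
print(ratio_inches, ratio_cm)

# Adding a constant moves the zero point and changes the ratios,
# so it is not admissible for a ratio scale.
shifted = [v + 5.0 for v in inches]
ratio_shifted = shifted[1] / shifted[0]
print(ratio_shifted)
```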
62MEASUREMENT
- As mentioned earlier, establishing a measurement
scale for a given variable requires hypothesis
tests.
- The measurement of directly observable, physical
phenomena is easily obtainable and verifiable.
63MEASUREMENT
- However, this is not the case for the measurement
of latent psychological phenomena (e.g.,
ability, intelligence, attitudes, beliefs, etc.),
which are not directly
observable.
64CONJOINT MEASUREMENT
- The axioms of conjoint measurement can be tested
to determine whether latent traits are measurable
on an ordinal or interval scale.
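As a rough illustration of what testing such an axiom can look like, the sketch below checks one common formulation of the independence (single cancellation) axiom on a table of response proportions: any two rows must be ordered the same way in every column, and any two columns the same way in every row. The tables are hypothetical (rows might be ability groups, columns items), ties are allowed, and sampling error is ignored, so this is not a substitute for a proper statistical test.

```python
from itertools import combinations

def rows_independent(table):
    """True if any two rows are ordered the same way in every column."""
    for r1, r2 in combinations(table, 2):
        signs = {(a > b) - (a < b) for a, b in zip(r1, r2)}
        # A consistent ordering never mixes "greater" and "less".
        if 1 in signs and -1 in signs:
            return False
    return True

def columns_independent(table):
    """Apply the same check to the columns by transposing the table."""
    return rows_independent(list(zip(*table)))

# Hypothetical proportions correct: rows and columns are consistently ordered.
consistent = [
    [0.2, 0.4, 0.6],
    [0.3, 0.5, 0.8],
    [0.5, 0.7, 0.9],
]

# Hypothetical table in which the ordering reverses between columns.
crossing = [
    [0.2, 0.6],
    [0.5, 0.4],
]

print(rows_independent(consistent), columns_independent(consistent))
print(rows_independent(crossing), columns_independent(crossing))
```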
65INDEPENDENCE AXIOM (row)
66Monotone Homogeneity (MH)
672PL
683PL
694PL
70INDEPENDENCE AXIOM (column)
71ISOP (Scheiblechner 1995)
72RASCH-1PL
73Thomsen condition(e.g.,double cancellation)
75MH analysis
ICC Crossings
76DM analysis
77Model Selection Evaluation
78Model Assessment Detailed
79Model Assessment Detailed
Person Fit
Examinee | Item Responses | Posterior Predictive P-value
2154 | 110100 | .67
279 | 101001 | .12
987 | 000011 | .00