EPSY 546: LECTURE 1 INTRODUCTION TO MEASUREMENT THEORY

1
EPSY 546: LECTURE 1
INTRODUCTION TO MEASUREMENT THEORY

George Karabatsos

2
What is test theory?
3
WHAT IS A TEST?

Test: A procedure for obtaining a sample of
person behavior from a specified domain of items.

4
WHAT IS A TEST?

Test: A procedure for obtaining a sample of
person behavior from a specified domain of items.
General: Exam, questionnaire, survey,
judge-observed task, etc.

5
ITEM RESPONSE SCORING

Test item responses are scored.
Some Examples:
Dichotomous
1 = Correct, 0 = Incorrect
(possibly scored from a multiple-choice test item)

6
ITEM RESPONSE SCORING

Test item responses are scored.
Some Examples:
Rating Scale
1 = Strongly Disagree
2 = Disagree
3 = Agree
4 = Strongly Agree

7
ITEM RESPONSE SCORING

Test item responses are scored.
Some Examples:
Partial Credit
1 = Completely incorrect
2 = Partially correct
3 = Completely correct
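
A minimal sketch of the three scoring schemes above, in Python. The answer key, the example responses, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical scoring helpers; names and data are illustrative only.

RATING_SCALE = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Agree": 3,
    "Strongly Agree": 4,
}

PARTIAL_CREDIT = {
    "Completely incorrect": 1,
    "Partially correct": 2,
    "Completely correct": 3,
}


def score_dichotomous(responses, key):
    """Score multiple-choice responses: 1 = correct, 0 = incorrect."""
    return [1 if r == k else 0 for r, k in zip(responses, key)]


def score_categories(responses, scheme):
    """Map category labels to their numeric scores."""
    return [scheme[r] for r in responses]


mc_scores = score_dichotomous(["A", "C", "B", "D"], key=["A", "B", "B", "C"])
rating_scores = score_categories(["Agree", "Strongly Disagree"], RATING_SCALE)
pc_scores = score_categories(
    ["Partially correct", "Completely correct"], PARTIAL_CREDIT
)
print(mc_scores, rating_scores, pc_scores)
```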

8
WHAT TESTS DO

Tests are designed to measure latent traits that
manifest in the responses to the test items.

9
LATENT VARIABLES

Some substantive examples of latent traits:
Exam: Ability on long division.
Attitude Questionnaire: Agreement with capital
punishment.
Survey: Frequency of drug use.
Survey: Quality of life.

10
LATENT VARIABLES

Latent trait
= latent variable
= psychological trait/variable/attribute
= unidimensional variable
= construct

11
LATENT VARIABLES

For measurement, latent variables are often
numerically represented either
by total test score (person or item),
or by parameters of person ability or item
difficulty.
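
As a small illustration of the total-score representation, the sketch below computes person totals (row sums) and item totals (column sums) from a scored response matrix. The data are made up, and the variable names are assumptions.

```python
# Rows are persons, columns are items; 1 = correct, 0 = incorrect.
# The data are made up for illustration.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

# Person total score: sum across items (row sums).
person_totals = [sum(row) for row in responses]

# Item total score: sum across persons (column sums).
item_totals = [sum(col) for col in zip(*responses)]

print(person_totals)  # [3, 1, 4]
print(item_totals)    # [3, 2, 1, 2]
```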

12
Some Challenges of latent trait measurement (5)

1. No single approach to the measurement of a
latent trait is universally accepted.

13
Some Challenges of latent trait measurement (5)

1. No single approach to the measurement of a
latent trait is universally accepted.
Two theorists may select
different items to measure a particular
latent trait (e.g., math ability).

14
Some Challenges of latent trait measurement (5)

2. Psychological measurements are usually based
on limited samples of behavior.

15
Some Challenges of latent trait measurement (5)

2. Psychological measurements are usually based
on limited samples of behavior.
It is practically impossible to confront respondents
with all possible items that represent the latent
trait (e.g., all long division items).

16
Some Challenges of latent trait measurement (5)

2. Psychological measurements are usually based
on limited samples of behavior.
N = 1, for each person on an item.

17
Some Challenges of latent trait measurement (5)

3. Latent trait measurement obtained is
always subject to error.

18
Some Challenges of latent trait measurement (5)

3. Latent trait measurement obtained is
always subject to error.
Random:
Sampling error of respondents and of items.
Inherent unreliability of respondents (e.g.,
boredom, lucky guess, carelessness).

19
Some Challenges of latent trait measurement (5)

3. Latent trait measurement obtained is
always subject to error.
Systematic:
Cheating on exam.
Response bias.
Item does not measure the latent trait.
Misscoring; test form out of order.

20
Some Challenges of latent trait measurement (5)

4. Establishing measurement scales for the
latent trait.

21
Some Challenges of latent trait measurement (5)

4. Establishing measurement scales for the
latent trait.
Stevens (1946): Measurement is
"the assignment of numerals to objects or events
according to rules." (NOT!)

22
Some Challenges of latent trait measurement (5)

4. Establishing measurement scales for the
latent trait.
Michell: Measurement requires tests of the
hypothesis that the variable is quantitative.
(Echoing Krantz, Luce, Suppes, and Tversky, in the
three Foundations of Measurement (FM) volumes)

23
Some Challenges of latent trait measurement (5)

5. Latent traits must also demonstrate
relationships to other important traits or
observable phenomena.

24
Some Challenges of latent trait measurement (5)

5. Latent traits must also demonstrate
relationships to other important traits or
observable phenomena.
Measurements of latent traits have value when
they can be related to other traits or events in
the real world.

25
WHAT IS TEST THEORY?

The study of the 5 pervasive measurement problems
just described, and developing/applying methods
for their resolution.

26
TEST THEORY COURSE

Become aware of the logic and mathematical models
that underlie practices in test use and
construction.

27
TEST THEORY COURSE

Awareness of these models, including their
assumptions and limitations, should lead to an
improved practice in test construction and more
intelligent use of test information in decision
making.

28
TEST THEORY COURSE

Test theory provides a general framework for
viewing the process of instrument development.
Test theory is distinguished from the more applied
subject of educational and psychological
assessment, which focuses on the administration and
interpretation of specific tests.

29
Process of Test Construction
30
TEST CONSTRUCTION

10 steps can be followed to construct a test for
the measurement of persons
(and items).
(CA, Chapter 4)

31
TEST CONSTRUCTION

1. Identify the primary purpose(s) for
which the test measurements will be
used.

32
TEST CONSTRUCTION

1. Identify the primary purpose(s) for
which the test measurements will be
used.
2. Hypothesize items that define the
latent trait of interest.

33
TEST CONSTRUCTION

3. Prepare a set of test specifications,
delineating the proportion of items that should
focus on each type of behavior identified in Step
2.

34
TEST CONSTRUCTION

3. Prepare a set of test specifications,
delineating the proportion of items that should
focus on each type of behavior identified in Step
2.
4. Construct an initial pool of items.

35
TEST CONSTRUCTION

5. Have items reviewed and revised.

36
TEST CONSTRUCTION

5. Have items reviewed and revised.
6. Hold preliminary item tryouts (and revise).

37
TEST CONSTRUCTION

5. Have items reviewed and revised.
6. Hold preliminary item tryouts (and revise).
7. Field test the items on a large sample
representative of the examinee population for
whom the test is intended. (PILOT STUDY)

38
TEST CONSTRUCTION

8. Determine statistical properties of the items,
and when appropriate, eliminate items that do not
meet pre-established criteria.

39
TEST CONSTRUCTION

8. Determine statistical properties of the items,
and when appropriate, eliminate items that do not
meet pre-established criteria.
9. Design and conduct reliability and validity
studies for the final form of the test.

40
TEST CONSTRUCTION

10. Develop guidelines for administration,
scoring, and interpretation of the test
scores.
(e.g., prepare norm tables, suggest
recommended cutting scores or standards for
performance, etc.)

41
Statistical Concepts for Test Theory
42
BASIC STATISTICS (CA2)

Frequency tables and graphs
Distribution
Normal distribution (p.d.f., c.d.f.)
Central tendency: Mode, median, mean.
Variability: Variance, standard deviation.
Z-scores
For infinite populations.

43
BASIC STATISTICS (CA2)

Relationship between two variables
Scatterplot.
Pearson's correlation coefficient.
Ordinary linear regression.
Standard error of Y predictions, for a given
regression equation.

44
BASIC STATISTICS (CA5)

Statistics for Test Items:
Mean and total score for an item,
over respondents (item difficulty).
Variance of responses on a test item.
Inter-item correlation (Pearson's product-moment
correlation or phi-correlation).
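
The item statistics listed above can be sketched in a few lines of Python. The response matrix is made up, the helper names are assumptions, and population formulas (dividing by N) are used throughout.

```python
from math import sqrt

# Rows are persons, columns are items (1 = correct, 0 = incorrect); made-up data.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
items = list(zip(*responses))  # one tuple of responses per item


def mean(x):
    return sum(x) / len(x)


def variance(x):
    """Population variance (divides by N)."""
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / len(x)


def covariance(x, y):
    """Population covariance (divides by N)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def correlation(x, y):
    """Pearson correlation; equals the phi coefficient for 0/1 items."""
    return covariance(x, y) / sqrt(variance(x) * variance(y))


item_means = [mean(it) for it in items]      # proportion correct per item
item_vars = [variance(it) for it in items]   # p * (1 - p) for 0/1 items
r_12 = correlation(items[0], items[1])       # items 1 and 2
print(item_means, item_vars, round(r_12, 4))
```

For dichotomous items the item mean is the proportion of respondents answering correctly, which is why it serves as an index of item difficulty.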

45
VARIANCE OF TEST SCORES AND TEST ITEMS

Since tests are usually scored by the sum of the
item scores,
it follows that there should be some
relationship between
individual item variances
and the
variance of the total test scores.

46
VARIANCE OF TEST SCORES AND TEST ITEMS

In fact,
since the measurement of individual
differences is a central goal of testing,
one goal of test construction should be
to maximize the variance of the total test
scores.
The reliability and validity of a test depend on
this variance.

47
VARIANCE OF TEST SCORES AND TEST ITEMS

Covariance between items i and j
N = number of respondents
J = number of items
μ = population mean
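
For reference, the standard population covariance between items i and j can be written as below, with N the number of respondents. The symbol X_{ni} (the score of respondent n on item i) and the subscripted item means are notation assumed here, not taken from the slide.

```latex
\sigma_{ij} = \frac{1}{N} \sum_{n=1}^{N} (X_{ni} - \mu_i)(X_{nj} - \mu_j)
```

When i = j, this reduces to the variance of item i.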

48
VARIANCE OF TEST SCORES AND TEST ITEMS

Variance-Covariance Matrix

49
VARIANCE OF TEST SCORES AND TEST ITEMS

Total Test Score Variance
= Sum of item variances
+ sum of item covariances
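
The decomposition above can be checked numerically. The sketch below builds the J x J variance-covariance matrix for a made-up response matrix and compares the variance of the total scores with the sum of the diagonal (item variances) plus the sum of all off-diagonal entries (item covariances, each pair counted twice). Names and data are assumptions for illustration.

```python
# Rows are persons, columns are items; made-up data for illustration.
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
items = list(zip(*responses))
J = len(items)


def cov(x, y):
    """Population covariance (divides by N); cov(x, x) is the variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n


# J x J variance-covariance matrix: variances on the diagonal.
cov_matrix = [[cov(items[i], items[j]) for j in range(J)] for i in range(J)]

sum_item_variances = sum(cov_matrix[i][i] for i in range(J))
sum_item_covariances = sum(
    cov_matrix[i][j] for i in range(J) for j in range(J) if i != j
)

totals = [sum(row) for row in responses]
total_variance = cov(totals, totals)

print(total_variance, sum_item_variances + sum_item_covariances)
```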

50
VARIANCE OF TEST SCORES AND TEST ITEMS

Implications of Equation (first term)
Total test score variance increases as the number
of items (J) is increased
(unless the added items covary negatively enough
with the other items to offset their own variances).

51
VARIANCE OF TEST SCORES AND TEST ITEMS

Implications of Equation (second term)
Test score variance increases when items are
added that have positive covariances with the
other test items.

52
VARIANCE OF TEST SCORES AND TEST ITEMS

Implications of Equation
Test score variance is maximized when
items are equal in difficulty (this
increases item covariances),
and of medium difficulty (this
increases item variances).
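
The point about medium difficulty can be illustrated for dichotomous items, where the variance of a 0/1 item with proportion correct p is p(1 - p). The sketch below uses a hypothetical grid of difficulty values and shows that the variance peaks at p = 0.5.

```python
def item_variance(p):
    """Variance of a dichotomous (0/1) item with proportion correct p."""
    return p * (1 - p)


# Hypothetical grid of item difficulties (proportions correct).
difficulties = [0.1, 0.3, 0.5, 0.7, 0.9]
variances = {p: item_variance(p) for p in difficulties}

# The difficulty with the largest item variance.
best_p = max(variances, key=variances.get)
print(best_p, variances[best_p])
```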

53
Introduction To Scaling
54
4 SCALES OF MEASUREMENT

1. Nominal Scale
Used for classification.
Assigns the same numbers to objects that are
equivalent, and a different number to objects
that are not.

55
4 SCALES OF MEASUREMENT

1. Nominal Scale
Class of admissible transformations
= class of one-to-one transformations.
i.e., n_i(x) = n_i(y) iff n_j(x) = n_j(y)
for all scales i, j, and objects x, y.

56
4 SCALES OF MEASUREMENT

2. Ordinal Scale
With respect to some attribute,
this scale orders objects in magnitude, but
does not measure distances between the objects.
Example Ranking

57
4 SCALES OF MEASUREMENT

2. Ordinal Scale
Class of admissible transformations
= class of increasing monotonic transformations.
i.e., n_i(x) > n_i(y) iff n_j(x) > n_j(y)
for all scales i, j, and objects x, y.
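
A minimal sketch of the ordinal condition, with made-up scale values: two strictly increasing transformations preserve the ordering of the objects, while a non-monotonic one does not. The function `same_order` and the example scales are assumptions for illustration.

```python
import math

# Made-up scale values for four objects on some ordinal scale n_i.
n_i = {"w": 1, "x": 2, "y": 5, "z": 9}

# Two strictly increasing transformations give equally valid ordinal scales.
n_j = {obj: math.sqrt(v) for obj, v in n_i.items()}
n_k = {obj: v ** 3 + 10 for obj, v in n_i.items()}

# A non-monotonic transformation is not admissible.
n_bad = {obj: (v - 5) ** 2 for obj, v in n_i.items()}


def same_order(a, b):
    """True if a(x) > a(y) iff b(x) > b(y) for every pair of objects."""
    return all((a[x] > a[y]) == (b[x] > b[y]) for x in a for y in a)


print(same_order(n_i, n_j), same_order(n_i, n_k), same_order(n_i, n_bad))
```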

58
4 SCALES OF MEASUREMENT

3. Interval Scale
Involves the numerical representation of a relation
on the differences between entities with
respect to some attribute (no absolute zero
point).
Example: temperature measurement
(Fahrenheit, Celsius).

59
4 SCALES OF MEASUREMENT

3. Interval Scale
Class of admissible transformations
= class of positive linear transformations.
n_j(x) = a n_i(x) + b
for a > 0 and any real b
e.g., C = (5/9)F − (160/9)
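
The Fahrenheit-to-Celsius example can be used to show what a positive linear transformation preserves. In the sketch below (temperatures chosen arbitrarily), ratios of differences are the same on both scales, while ratios of the scale values themselves are not.

```python
def f_to_c(f):
    """Positive linear transformation: C = (5/9) F - 160/9."""
    return (5 / 9) * f - 160 / 9


f = [32.0, 50.0, 68.0, 104.0]
c = [f_to_c(v) for v in f]  # approximately 0, 10, 20, 40

# Ratios of differences are preserved under a positive linear transformation...
ratio_diff_f = (f[3] - f[2]) / (f[1] - f[0])
ratio_diff_c = (c[3] - c[2]) / (c[1] - c[0])

# ...but ratios of the scale values themselves are not.
ratio_val_f = f[2] / f[1]
ratio_val_c = c[2] / c[1]

print(ratio_diff_f, ratio_diff_c, ratio_val_f, ratio_val_c)
```

This is why a statement such as "68 degrees is twice as hot as 34 degrees" has no scale-independent meaning on an interval scale.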

60
4 SCALES OF MEASUREMENT

4. Ratio Scale
Has properties of order, equal distance between
units, and an absolute zero point.
Non-zero measurements on this scale may be
expressed as ratios of one another.
Examples Length, weight, etc.

61
4 SCALES OF MEASUREMENT

4. Ratio Scale
Class of admissible transformations
= class of multiplicative transformations.
n_i(x) = c n_j(x), for c > 0
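
A short sketch of the ratio-scale case, using a change of length units as the multiplicative transformation. The lengths are made up; the conversion factor 2.54 cm per inch plays the role of c.

```python
CM_PER_INCH = 2.54  # the multiplicative constant c > 0

lengths_in = [2.0, 5.0, 10.0]
lengths_cm = [CM_PER_INCH * v for v in lengths_in]

# Ratios of scale values are preserved under multiplication by c.
ratio_in = lengths_in[2] / lengths_in[0]
ratio_cm = lengths_cm[2] / lengths_cm[0]

# The absolute zero point is preserved as well.
zero_cm = CM_PER_INCH * 0.0

print(ratio_in, ratio_cm, zero_cm)
```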

62
MEASUREMENT

As mentioned earlier, establishing a measurement
scale for a given variable requires hypothesis
tests.
The measurement of directly observable, physical
phenomena is easily obtainable and verifiable.

63
MEASUREMENT

However, this is not the case for the measurement
of latent psychological phenomena (e.g.,
ability, intelligence, attitudes, beliefs, etc.),
which are not directly
observable.

64
CONJOINT MEASUREMENT

The axioms of conjoint measurement can be tested
to determine whether latent traits are measurable
on an ordinal or interval scale.
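
As a rough illustration of what testing such an axiom can involve, the sketch below checks the independence (single cancellation) condition on a small matrix: any two rows must be ordered the same way in every column, and likewise for columns. The matrices are hypothetical proportions correct (rows as ability groups, columns as items), the function names are assumptions, and the check is purely deterministic: it ignores sampling error, which a real analysis would have to address.

```python
def rows_independent(matrix):
    """True if any two rows are ordered the same way in every column
    (ties allowed), i.e. no two rows cross."""
    for a in matrix:
        for b in matrix:
            a_above = any(x > y for x, y in zip(a, b))
            b_above = any(y > x for x, y in zip(a, b))
            if a_above and b_above:  # the two rows cross
                return False
    return True


def columns_independent(matrix):
    """The same check applied to the columns."""
    return rows_independent([list(col) for col in zip(*matrix)])


# Hypothetical proportions correct: rows = ability groups, columns = items.
consistent = [
    [0.2, 0.4, 0.6],
    [0.3, 0.5, 0.8],
    [0.5, 0.7, 0.9],
]
crossing = [
    [0.2, 0.6, 0.4],
    [0.3, 0.5, 0.8],
    [0.5, 0.7, 0.9],
]

print(rows_independent(consistent), columns_independent(consistent))
print(rows_independent(crossing), columns_independent(crossing))
```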

65
INDEPENDENCE AXIOM (row)
66
Monotone Homogeneity (MH)
67
2PL
68
3PL
69
4PL
70
INDEPENDENCE AXIOM (column)
71
ISOP (Scheiblechner 1995)
72
RASCH-1PL
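
For readers unfamiliar with the model names on these slides, the sketch below gives the commonly used logistic form of the item response function, written so that the 3PL, 2PL, and Rasch (1PL) models fall out as special cases of the 4PL. The parameter names (theta, b, a, c, d) follow common usage and are assumptions here, as are the example values; the slides' own formulas are not reproduced in this transcript.

```python
import math


def irf(theta, b, a=1.0, c=0.0, d=1.0):
    """Four-parameter logistic item response function.

    theta: person ability, b: item difficulty, a: discrimination,
    c: lower asymptote (guessing), d: upper asymptote.
    Returns the probability of a correct response.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


p_rasch = irf(theta=0.0, b=0.0)                     # 1PL: a = 1, c = 0, d = 1
p_2pl = irf(theta=1.0, b=0.0, a=2.0)                # adds discrimination
p_3pl = irf(theta=0.0, b=0.0, a=2.0, c=0.2)         # adds lower asymptote
p_4pl = irf(theta=0.0, b=0.0, a=2.0, c=0.2, d=0.9)  # adds upper asymptote

print(p_rasch, p_2pl, p_3pl, p_4pl)
```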
73
Thomsen condition (e.g., double cancellation)
74
(No Transcript)
75
MH analysis
ICC Crossings
76
DM analysis
77
Model Selection Evaluation
78
Model Assessment Detailed
79
Model Assessment Detailed
Person Fit
Examinee   Item Responses   Posterior Predictive P-value
2154       110100           .67
279        101001           .12
987        000011           .00
