Introduction to IRT/Rasch Measurement with Winsteps, Ken Conrad, University of Illinois at Chicago

1
Introduction to IRT/Rasch Measurement with Winsteps
Ken Conrad, University of Illinois at Chicago
Barth Riley and Michael Dennis, Chestnut Health Systems

2
Agenda
  • 12:30. Ken Conrad: PowerPoint presentation on
    classical test theory compared to Rasch;
    includes history and introduction to the Rasch
    model.
  • 2:15. Break.
  • 2:30. Discussion of an application of Rasch
    analysis in the measurement of posttraumatic
    stress disorder, with interpretation of
    Rasch/Winsteps output.
  • 3:15. Barth Riley: Implications and Extensions of
    Rasch Measurement.
  • 4:15. Break.
  • 4:30. Mike Dennis: Practical applications of
    IRT/Rasch in SUD screening and outcome assessment.
  • 5:15. Open discussion and Q&A.
  • 5:30. End of workshop.

3
The Dream of Rulers of Human Functioning
  • Beyond organ function to human function (WHO, 1947)
  • E.g., quality of life: need to ask the person
  • 1970s: physical, social, and mental health
    issues
  • Measuring many constructs requires many
    items: time, cost, burden
  • Today: need for psychometric efficiency without
    loss of reliability and construct validity

4
Prevailing Paradigm, Classical Test Theory
  • CTT: more items for more reliability
  • Since we seek efficiency (fewer items), items
    tend to be where most of the people are: around
    the mean.
  • Result: redundancy at mid-range, few items at
    extremes, ceiling and floor effects
  • Impossible to measure improvement of those in
    ceiling and decline of those in floor.

5
How children measure wooden rods (from Piaget)
  • Classification: separate the rods from the cups,
    the balls, etc. (nominal)
  • Seriation: line them up by size (ordinal)
  • Iteration: develop a unit to know how much bigger
    (interval)
  • Standardization: make a rule(r) and a process for
    determining how many units each rod has
  • Children know that classification and seriation
    are not measurement; Stevens did not (nominal,
    ordinal, interval, ratio)

6
Improvement: IRT/Rasch measurement and computers
  • Rasch measurement model enables construction of a
    ruler with as many items as we want at any level
    of the construct
  • The computer enables choice of items based on
    each person's pattern of responses.
  • Each test is tailored to the individual, and not
    all of the items are needed.
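
The two ideas on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming the dichotomous Rasch model and a simple "most informative item next" rule; the slides do not specify a selection rule, and the item names and difficulties below are hypothetical.

```python
import math

def rasch_probability(ability, difficulty):
    """P(correct or endorsed) under the dichotomous Rasch model, in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def item_information(ability, difficulty):
    """Information an item provides at a given ability: p * (1 - p)."""
    p = rasch_probability(ability, difficulty)
    return p * (1.0 - p)

def pick_next_item(ability_estimate, item_bank, administered):
    """Return the not-yet-administered item that is most informative
    at the current ability estimate (an assumed selection rule)."""
    candidates = {name: b for name, b in item_bank.items()
                  if name not in administered}
    return max(candidates,
               key=lambda name: item_information(ability_estimate,
                                                 candidates[name]))

# Hypothetical item bank: item name -> difficulty in logits.
item_bank = {"easy": -2.0, "medium": 0.0, "hard": 2.0}

# A person currently estimated at 1.8 logits who has already seen "medium".
next_item = pick_next_item(1.8, item_bank, administered={"medium"})
```

Because information peaks where difficulty is closest to ability, a high-ability person is routed to the hard item and never needs the easy one, which is why not all items are needed.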

7
Classical Test Theory
  • A measure is a sample of items from an infinite
    domain of items that represent the attribute of
    interest.
  • Items are treated as replicates of one another in
    the sense that differences among the items are
    ignored in scaling.
  • More items = more reliability
  • Everyone gets the same items
  • Answers needed to all items
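
The "more items = more reliability" point is commonly quantified in CTT with the Spearman-Brown prophecy formula, which the slides do not name. A minimal sketch, assuming the added items are parallel to the existing ones; the function name and the starting reliability of 0.70 are illustrative only.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`
    (e.g. 2.0 = twice as many parallel items)."""
    return (length_factor * reliability) / (
        1.0 + (length_factor - 1.0) * reliability)

# Illustrative starting reliability, not a value from the slides.
base = 0.70
doubled = spearman_brown(base, 2.0)
halved = spearman_brown(base, 0.5)
```

Doubling the test raises the predicted reliability and halving it lowers it, which is the trade-off against efficiency that the earlier slides describe.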

8
  • Ranking is sample dependent
  • E.g., NBA players, jockeys. Height could be in
    the same 1-5 ordinal metric where both a jockey
    and NBA player could be rated 5, but this could
    only be interpreted with reference to a
    particular sample. The sample defines height.
  • With interval scaling, height defines the
    sample. Over 6' = NBA, under 6' = jockey.
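
The contrast can be made concrete with a small sketch. The heights below are made up for illustration, and the rating rule (rank within the sample, cut into five bands) is an assumption about how such a 1-5 ordinal rating might be produced.

```python
def rate_within_sample(heights_in):
    """Assign each height a 1-5 rating from its rank within its own sample."""
    n = len(heights_in)
    order = sorted(range(n), key=lambda i: heights_in[i])
    ratings = [0] * n
    for rank, i in enumerate(order):
        ratings[i] = 1 + (rank * 5) // n
    return ratings

def classify_by_height(height_in):
    """Interval scale: the measure itself defines the group (6 ft = 72 in)."""
    return "NBA" if height_in > 72 else "jockey"

# Hypothetical heights in inches, sorted shortest to tallest.
jockeys_in = [60, 62, 64, 66, 68]
nba_in = [75, 77, 79, 81, 84]

jockey_ratings = rate_within_sample(jockeys_in)
nba_ratings = rate_within_sample(nba_in)
```

The tallest jockey and the tallest NBA player both receive a rating of 5, so the rating means nothing without the sample, whereas the height in inches classifies each person on its own.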

9
Classical Test Theory
  • Uses ordinal data as interval.
  • Using presumably impermissible transformations,
    i.e. using ordinal as interval, usually makes
    little, if any, difference to results of most
    analyses.
  • Thus, if it behaves like an interval scale, it
    can be treated as one.
  • Just use the raw scores. Add 'em up.
  • Clean and easy

10
Assumption: all items are created equal
  • But we know that is not true. Is that how we
    measure potatoes? How about spelling?
  • Items actually range from easy -> hard, like
    addition -> division.
  • E.g., Guttman: 1111100000
  • Lack of recent practice on item 5: 1111011000
  • Educated guess on item 8: 1111100100
  • Slow, nervous start: 0111111000

11
No Difficulty Parameter in CTT
  • What if two students both got 5 out of 10
    correct, but one got the 5 easiest right and the
    other the 5 hardest?
  • Easy -> hard
  • Peter: 1111100000
  • Paul: 0000011111
  • Do they have the same ability?
  • Wouldn't you like to get a better idea of what
    happened on Paul's test? Did he arrive late?
    Were test pages missing? Maybe they were word
    problems, and Paul is a foreign student.

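
One simple way to see what the raw score hides is to count Guttman errors: pairs of items where the easier one was missed but the harder one was passed. This is a rough sketch of the idea only, not the fit statistics that Winsteps reports; the function name is illustrative.

```python
def guttman_errors(responses):
    """Count (easier item wrong, harder item right) pairs.
    `responses` is a string of '0'/'1' ordered from easiest to hardest."""
    errors = 0
    zeros_seen = 0
    for r in responses:
        if r == "0":
            zeros_seen += 1
        else:
            # This correct answer is inconsistent with every
            # easier item that was missed before it.
            errors += zeros_seen
    return errors

def raw_score(responses):
    """Number of correct responses."""
    return responses.count("1")

peter = "1111100000"
paul = "0000011111"

same_raw_score = raw_score(peter) == raw_score(paul)
peter_errors = guttman_errors(peter)
paul_errors = guttman_errors(paul)
```

Peter and Paul have the same raw score of 5, but Peter's pattern has no Guttman errors while Paul's has the maximum possible for that score, which flags his test as worth a closer look.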
12
  • With CTT, extremely difficult to compare a
    person's scores on two or more different
    tests; usually compare z-scores.
  • Assumes that samples of both tests center on the
    same mean.
  • Assumes that all of the tests are normally
    distributed, which is rarely the case.

13
Assumptions of CTT
  • CTT: take the test, e.g., SD, D, A, or SA on 50
    items. What if there is missing data?
  • CTT uses ordinal scaling, but assumes equal
    intervals in the rating scale. However, we know
    that distances between scale points usually are
    not equal, e.g., "The President is doing a good
    job."
  • SD D A SA
  • To WWII veterans: "Do you wear fashionable shoes?"
  • N SD D A SA
  • CTT gives us very limited ability to examine the
    performance of our rating scales. Do they really
    work the way we want them to?

14
Cronbach's Alpha
  • Adding items improves alpha, but are they good
    items?
  • Ceiling and floor effects improve alpha.
  • CTT assumes homoscedasticity: that the error of
    measurement is the same at the high end of the
    scale as in the middle or at the low end.
  • However, ordinal measures are biased, especially
    at the extremes where there is much more error.
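
The first bullet can be checked with a short calculation. This sketch uses the standard formula for Cronbach's alpha with made-up ratings; the "longer" test simply repeats each item, a deliberately redundant way of adding items.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of persons, each a list of item scores."""
    k = len(scores[0])
    item_columns = list(zip(*scores))
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(person) for person in scores])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)

# Hypothetical ratings: 4 persons x 2 items.
two_items = [[1, 2], [2, 2], [3, 4], [4, 3]]

# "Add items" by repeating each item once: 4 persons x 4 items.
four_items = [person + person for person in two_items]

alpha_two = cronbach_alpha(two_items)
alpha_four = cronbach_alpha(four_items)
```

Alpha goes up for the four-item version even though the added items carry no new information, which is the concern behind "but are they good items?"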

15
To Count -> To Measure
  • E.g., from counting potatoes to measuring their
    quality.
  • From counting number of drinks to measuring
    substance use disorders.
  • From summing Likert ratings to linear, interval
    measurement.
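
A rough feel for the move from counts to linear measures comes from the log-odds (logit) transformation of a raw score. This is only a first approximation under the assumption of equally difficult items, not the estimation that Winsteps performs; the function name is illustrative.

```python
import math

def raw_to_logit(raw, n_items):
    """Log-odds of a raw score: ln(raw / (n_items - raw)).
    Zero and perfect scores have no finite logit."""
    if raw <= 0 or raw >= n_items:
        raise ValueError("extreme scores have no finite logit")
    return math.log(raw / (n_items - raw))

n_items = 10
logits = {raw: raw_to_logit(raw, n_items) for raw in range(1, n_items)}

# One raw-score point in the middle versus near the top of the scale.
middle_step = logits[6] - logits[5]
top_step = logits[9] - logits[8]
```

One more item correct is worth more logits near the top of the scale than in the middle, so equal steps in the count are not equal steps in the measure.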