1
Why do something simple, when it's just as easy
to do something complicated?
  • Mark Wilson
  • UC Berkeley
  • April 2006

2
  • Or, why would someone use a simple model like
    the Rasch model, when a more complicated model is
    just as easy to run?
  • After all, a more complicated model will almost
    certainly fit better (because it has more
    parameters).
  • Hence it will allow you to delete fewer items due
    to misfit--maybe delete none.
  • Observation: It is very hard to convince
  • subject matter experts,
  • policy-types,
  • business people, etc.
  • that deleting items for misfit is important.
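
For reference, the "simple" and "more complicated" models being contrasted can be written out as follows. The notation is an assumption, since the slides do not fix symbols: θ_p is the person location, δ_i the item difficulty, a_i the item slope, and c_i the lower asymptote.

```latex
% Rasch model: one parameter per item (difficulty \delta_i)
P(X_{pi}=1 \mid \theta_p) = \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)}

% 2PL: adds an item-specific slope a_i
P(X_{pi}=1 \mid \theta_p) = \frac{\exp\bigl(a_i(\theta_p - \delta_i)\bigr)}{1 + \exp\bigl(a_i(\theta_p - \delta_i)\bigr)}

% 3PL: adds a lower asymptote c_i
P(X_{pi}=1 \mid \theta_p) = c_i + (1 - c_i)\,\frac{\exp\bigl(a_i(\theta_p - \delta_i)\bigr)}{1 + \exp\bigl(a_i(\theta_p - \delta_i)\bigr)}
```

Setting a_i = 1 and c_i = 0 recovers the Rasch model, so the larger models contain it as a special case and fit at least as well in terms of likelihood.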

3
Outline
  • Why deleting misfitting items is important
  • Some reasons that have been offered
  • Importance of Wright maps etc. for interpretation
  • 4 building blocks, etc.
  • But they are still uncertain, so will give up
    interpretation advantages of Rasch models
  • examples of worries
  • Strategy: limited items approach
  • Expanding the scenarios
  • tactics
  • Conclusion

4
Some reasons that have been offered for deleting
misfitting items
  • Because it's philosophically more sound.
  • i.e., (i) to get specific objectivity
  • (ii) to get separation of variables
  • Contra: (i) most practitioners don't care
  • (ii) other people's philosophies
    may lead to other conclusions

5
  • Because tests are part of the designed world,
  • not the natural world, and
  • linear models are instances of better design
    than non-linear ones.
  • i.e., (i) why are bricks almost always
    rectangular prisms? (rather than, say,
    rhomboid prisms?)
  • (ii) why do people use linear models in
    regression so much?
  • Contra: (i) don't care about good design if it
    costs more in item development

6
  • Because it's easier to explain to people
  • Contra: (i) no one understands these complicated
    formulae anyway, so you can have as
    many parameters as you want
  • (ii) no need for explanation, as the results of
    psychometric modeling don't need
    to be understood; all that matters are
    their technical characteristics
  • i.e., if the items are modeled in a more
    complicated way, that must be more
    true.

7
  • Because you want to interpret results using
    something equivalent to a Wright Map

8
(No Transcript)
9
What does distance between item responses mean?
  • The idea of "location" of an item response with
    respect to the location of another item response
    only makes sense if that relative meaning is
    independent of the location of the respondent
    involved
  • i.e., the interpretation of relative locations
    needs to be uniform no matter where the
    respondent is.

10
(No Transcript)
11
Another way to put this
  • meaning is the same no matter where you are on
    the map
  • e.g., an "inch represents a mile" wherever you
    are on the map
  • One consequence of this is that the order (on the
    map) of the item responses must remain the same
    for all respondents
  • and that the order of the respondents (on the
    map) must remain the same for all item responses.
  • Note: this is equivalent to double stochastic
    ordering (a concept used in non-parametric
    models)
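
The order-preservation point can be checked numerically. The following is a minimal sketch with hypothetical item parameters (not from the slides): it ranks three items by probability of success at three respondent locations, once with equal slopes (Rasch-like) and once with unequal slopes (2PL-like).

```python
import math

def p_correct(theta, delta, a=1.0):
    """Probability of a correct response under a logistic model with slope a."""
    return 1.0 / (1.0 + math.exp(-a * (theta - delta)))

def item_order(thetas, items):
    """For each theta, return item indices sorted from most to least likely correct."""
    orders = []
    for theta in thetas:
        probs = [p_correct(theta, d, a) for d, a in items]
        orders.append(tuple(sorted(range(len(items)), key=lambda i: -probs[i])))
    return orders

thetas = [-2.0, 0.0, 2.0]

# Hypothetical (difficulty, slope) pairs, chosen only for illustration.
rasch_items = [(-1.0, 1.0), (0.0, 1.0), (1.0, 1.0)]
two_pl_items = [(-1.0, 0.3), (0.0, 2.0), (1.0, 1.0)]

rasch_orders = item_order(thetas, rasch_items)
two_pl_orders = item_order(thetas, two_pl_items)

print(len(set(rasch_orders)))   # number of distinct item orderings, equal slopes
print(len(set(two_pl_orders)))  # number of distinct item orderings, unequal slopes
```

With equal slopes every respondent sees the items in the same order; with unequal slopes the item characteristic curves cross, so the order depends on where the respondent is.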

12
But the requirement is stronger: not just order is
preserved, but metric properties too. In a
picture...
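
One way to see the metric claim algebraically, sketched under the usual Rasch parameterization (θ for the respondent, δ_i and δ_j for two items; these symbols are assumed, not taken from the slides):

```latex
\log\frac{P_i(\theta)}{1-P_i(\theta)} - \log\frac{P_j(\theta)}{1-P_j(\theta)}
  = (\theta - \delta_i) - (\theta - \delta_j)
  = \delta_j - \delta_i
```

The difference in log-odds between the two items does not depend on θ, so a given distance on the map means the same thing for every respondent. With item-specific slopes the corresponding difference is a_i(θ − δ_i) − a_j(θ − δ_j), which changes with θ whenever a_i ≠ a_j.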
13
(No Transcript)
14
reprise
  • If people just want a number, with certain
    technical characteristics,
  • it's hard to convince them that they should
    delete items for misfit.
  • If people just want to do what ETS/CTB/etc. does,
  • it's hard to convince them that they should
    delete items for misfit.
  • If people want to save money by including all
    items,
  • it's hard to convince them that they should
    delete items for misfit.

15
BUT
  • If people want to be able to interpret their
    results,
  • then you have an in.

16
SO
  • Explain about
  • Wright Maps
  • 4 Building Blocks
  • (See Wilson, M. (2005). Constructing Measures: An
    Item Response Modeling Approach. Mahwah, NJ:
    Erlbaum.)
  • Etc.

17
Now
  • suppose you have convinced them that
    interpretation matters
  • and that the Rasch approach with Wright maps etc.
    is the best way to go
  • BUT, they still have to deal with issues such as
    those above

18
Examples of their worries
  • need to include items for historical reasons
  • e.g., they are all we have
  • need to include items for technical reasons
  • e.g., they are the only items left to represent
    certain categories in a linking
  • need to include items because they love 'em
  • e.g., they have certain content urges
  • need to not exclude items due to misfit, because
    they just can't understand or accept that they
    should
  • e.g., a belief that the 3PL is "true"

19
Thus, for any and all of above reasons, they
want to include 2-p or 3-p (or other) items
20
One Strategy: limited items approach
  • Identify which items do fit Rasch-family models,
    call them R items
  • Identify which items do not fit Rasch-family
    models, call them L (limited) items
  • limited b/c they are used for limited purposes
  • And assumes that they are limited in number too
  • developed with Claus Carstensen of IPN, Kiel.

21
Then
  • Calibrate with R items using Rasch-family models
  • Anchor R items, calibrate R and L items together
    using Rasch OPLM-like models
  • (e.g., OPLM, SAS NLMixed)
  • (Research Question: tests for unidimensionality
    of R and L)
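
For readers unfamiliar with OPLM, the joint calibration step can be sketched as follows. In OPLM-type models each item's slope is fixed in advance as a known constant rather than estimated freely. Treating the R items as having slope 1 with anchored difficulties is one reading of the slide, not something it states, and the symbols are assumed.

```latex
% R items: slope fixed at 1, difficulty anchored from the first calibration
\operatorname{logit} P(X_{pi}=1 \mid \theta_p) = \theta_p - \delta_i^{\text{anchored}}, \quad i \in R

% L items: slope a_i imputed as a known constant, difficulty \delta_i estimated
\operatorname{logit} P(X_{pi}=1 \mid \theta_p) = a_i\,(\theta_p - \delta_i), \quad i \in L
```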

22
Thus,
  • For interpretation, use R items only
  • For accuracy (e.g., smaller sem) use R and L
    items.
  • E.g., estimate a person's θ using all items,
  • use R items to develop construct validity,
    criterion-referencing, etc.
  • In a picture:

23
1 and 3 for interpretation, 1, 2, 3 for
estimation
Research Questions: need to establish acceptance
rules for which items to calibrate, which to
map, etc.
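
The split between interpretation and estimation on slides 22 and 23 can be illustrated with a small sketch. It assumes item parameters are already calibrated and held fixed, uses hypothetical values for them, and estimates θ by maximum likelihood. The function and variable names are illustrative only.

```python
import math

def prob(theta, a, delta):
    """Logistic item response probability with slope a and difficulty delta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - delta)))

def estimate_theta(responses, items, n_iter=50):
    """Maximum-likelihood theta and its standard error, item parameters fixed.

    responses: list of 0/1 scores; items: list of (slope, difficulty) pairs.
    Uses Newton-Raphson and assumes a mixed response pattern (not all 0s or
    all 1s), for which the ML estimate is finite.
    """
    theta = 0.0
    for _ in range(n_iter):
        grad = 0.0
        info = 0.0
        for x, (a, d) in zip(responses, items):
            p = prob(theta, a, d)
            grad += a * (x - p)
            info += a * a * p * (1.0 - p)
        step = grad / info
        theta += step
        if abs(step) < 1e-8:
            break
    # Test information at the final estimate; SEM is its inverse square root.
    info = sum(a * a * prob(theta, a, d) * (1.0 - prob(theta, a, d))
               for a, d in items)
    return theta, 1.0 / math.sqrt(info)

# Hypothetical calibrated parameters: (slope, difficulty).
r_items = [(1.0, -1.5), (1.0, -0.5), (1.0, 0.0), (1.0, 0.5), (1.0, 1.5)]
l_items = [(2.0, -0.2), (0.5, 0.8)]

# One hypothetical respondent.
r_resp = [1, 1, 1, 0, 0]
l_resp = [1, 0]

theta_r, sem_r = estimate_theta(r_resp, r_items)
theta_all, sem_all = estimate_theta(r_resp + l_resp, r_items + l_items)

print(f"R items only: theta={theta_r:.2f}, sem={sem_r:.2f}")
print(f"R and L items: theta={theta_all:.2f}, sem={sem_all:.2f}")
```

In this example the estimate based on R and L items has the smaller standard error, while only the R item difficulties would be placed on the Wright map for interpretation.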
24
Expanding the scenarios
  • (A) They have used a Rasch-family technique in
    the scaling/equating/etc,
  • Example: In PISA, facets (LLTM)-like parameters
    are used to control for booklet effects.
  • Example: Facets (LLTM)-like parameters are used
    to control for harshness/leniency effects in (a
    fixed set of) raters.
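
A facets (LLTM)-like extension can be sketched as an extra additive term on the logit scale. The symbols are assumed rather than taken from the slides: ρ_r is the severity of rater r and β_b the effect of booklet b.

```latex
% Rater severity \rho_r added to the Rasch logit
\operatorname{logit} P(X_{pir}=1 \mid \theta_p) = \theta_p - \delta_i - \rho_r

% Booklet effect \beta_b added in the same way
\operatorname{logit} P(X_{pib}=1 \mid \theta_p) = \theta_p - \delta_i - \beta_b
```

Because the extra terms enter additively and every slope remains 1, the model stays within the Rasch family.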

25
Tactics
  • In case where rater and/or booklet effects apply
    only to Rasch-family items, then use approach
    described above
  • it's a bit more complicated, but it's the same
    general idea.
  • In case where these effects apply to non-Rasch
    items, it's more difficult.
  • Maybe re-design non-Rasch items so they don't
    involve booklets and/or raters
  • but need to know ahead to do that.
  • Otherwise, it's a research question
  • either adapt the models, or delete the non-Rasch
    items.

26
(B) Suppose they have used a non-Rasch technique
in the scaling/equating/etc.
  • Example: Historically, non-Rasch items have been
    used in the equating, so need to be maintained.
  • Example: A longitudinal scaling has used
    non-Rasch items, so scale needs to be maintained.

27
Tactics
  • (i) In any set of non-Rasch items, there will be
    a Rasch-like core, identify that
  • e.g., select largest clump of items with slope
    params. within 1 std error of one another,
  • then select subset with low (fit) impact of
    lower asymptote.
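
The first selection step above might be sketched as follows. The slides do not spell out an exact rule, so this assumes one: a group of items qualifies when the spread of its slopes (max minus min) is no larger than the smallest standard error in the group. Only contiguous runs in slope order are searched, the second step (screening on the lower asymptote) is not shown, and all names and values are hypothetical.

```python
def largest_slope_clump(slopes, std_errors):
    """Return indices of the largest group of items with mutually similar slopes.

    Assumed rule: the group's slope spread (max - min) must not exceed the
    smallest standard error in the group, so every pair of items is within
    one standard error of each other. Only contiguous runs in slope order
    are considered, which makes this a heuristic rather than an exact search.
    """
    order = sorted(range(len(slopes)), key=lambda i: slopes[i])
    best = []
    for start in range(len(order)):
        for end in range(start, len(order)):
            window = order[start:end + 1]
            spread = slopes[window[-1]] - slopes[window[0]]
            tolerance = min(std_errors[i] for i in window)
            if spread <= tolerance and len(window) > len(best):
                best = window
    return sorted(best)

# Hypothetical estimated slopes and their standard errors.
slopes = [0.40, 0.95, 1.00, 1.05, 1.10, 1.80]
std_errors = [0.15, 0.20, 0.20, 0.20, 0.20, 0.30]

core = largest_slope_clump(slopes, std_errors)
print(core)  # indices of the candidate Rasch-like core
```

Other definitions of "within 1 std error" (for example, a pooled standard error) would change the tolerance line but not the overall shape of the search.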

28
Tactics
  • (ii) If that set is large and comprehensive
    enough, problem solved. If it is not large or
    comprehensive enough, then either
  • (a) develop Rasch items parallel to non-Rasch
    items, checking empirically and/or judgmentally
    for match, or
  • (b) use non-Rasch items as limited items, as
    before

29
Conclusion
  • Going further afield, the non-Rasch
    characteristics may be non-IRT elements,
  • e.g., standards set by non-scale method such as
    Angoff.
  • Needs creativity
  • map the Angoff standards onto Wright Map
  • critique or accept standards based on that
    perspective
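
One possible way to place an Angoff standard on a Wright map is sketched below. It assumes the Angoff cut is a raw score (the sum of judged probabilities for a minimally competent examinee) and that Rasch item difficulties are available. It then finds the θ whose expected raw score equals that cut. All values and names are hypothetical.

```python
import math

def expected_score(theta, difficulties):
    """Rasch test characteristic curve: expected raw score at theta."""
    return sum(1.0 / (1.0 + math.exp(-(theta - d))) for d in difficulties)

def angoff_cut_on_logit_scale(angoff_ratings, difficulties, lo=-6.0, hi=6.0):
    """Locate an Angoff raw-score cut on the logit scale by bisection.

    angoff_ratings: judged probabilities that a minimally competent examinee
    answers each item correctly; their sum is the raw cut score.
    """
    cut = sum(angoff_ratings)
    if not expected_score(lo, difficulties) < cut < expected_score(hi, difficulties):
        raise ValueError("cut score lies outside the searchable range")
    for _ in range(60):
        mid = (lo + hi) / 2.0
        # The expected score increases with theta, so halve the bracket.
        if expected_score(mid, difficulties) < cut:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical Rasch difficulties and Angoff ratings for five items.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
ratings = [0.8, 0.7, 0.6, 0.5, 0.4]

theta_cut = angoff_cut_on_logit_scale(ratings, difficulties)
print(f"Angoff cut on the logit scale: {theta_cut:.2f}")
```

Once the cut has a location on the map, it can be compared with the item locations around it, which is the perspective from which the standard can be critiqued or accepted.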

30
Conclusion
  • Possible to achieve practical aims of Rasch
    approach
  • e.g., interpretation via Wright maps,
  • While adapting to non-Rasch environment.
  • e.g., using limited items in scaling.
  • Complications can be dealt with
  • e.g., facets (LLTM)-like situations can be
    included in tactics
  • e.g., non-Rasch scaling could be adapted.