Kendon Conrad and Barth Riley - PowerPoint PPT Presentation

Kendon Conrad and Barth Riley, University of Illinois at Chicago; Michael L. Dennis, Chestnut Health Systems

1
  • Kendon Conrad and Barth Riley, University of
    Illinois at Chicago
  • Michael L. Dennis, Chestnut Health Systems

2
Overview
  • Global Appraisal of Individual Needs (GAIN)
  • Benefits of Computerized Adaptive Testing
  • CAT: How Does It Work?
  • Examples of CAT in Clinical Assessment
  • Triage of persons around treatment decisions,
    using start and stop rules
  • Content balancing over multiple clinical
    dimensions
  • Identification of persons with atypical symptom
    presentations

3
The GAIN
  • Comprehensive biopsychosocial instrument designed
    for intake into substance abuse treatment.
  • Provides diagnoses on the five DSM-IV axes
  • Also supports treatment planning, outcome
    monitoring, and program evaluation
  • Versions range from a 2-5 minute screener to a
    20-30 minute quick version and a 1-2 hour full
    version
  • More than 103 scales, 1,000 created variables, and
    a text-based narrative report

4
The Benefits of Computerized Adaptive Testing
5
General and Targeted Measures
  • Generalized measures
    • Heavy response burden
    • Lack specificity
  • Targeted measures
    • Floor and ceiling effects
    • Limited content validity
    • Don't talk with each other

6
Tailoring Outcome Measurement
[Diagram: the CAT selects items from an item bank that pools Instrument A, Instrument B, and Instrument C, then administers each selected item.]
7
Benefits of CAT and Item Banking
[Diagram: CAT and the item bank are linked to respondent burden, tailoring/specificity, coverage of content domains, and floor and ceiling effects.]
8
CAT vs. Short Forms
  • CAT has been found to be superior to short
    forms of tests, yielding more precise measures.

9
CAT: What Is It and How Does It Work?
10
Computerized Adaptive Testing
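  • A score is calculated after each response, and
    the next best item is selected based on item
    difficulty.

[Figure: Typical pattern of responses. Items move between decreased, middle, and increased difficulty following correct and incorrect responses, with a band of +/- 1 standard error around the estimate.]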
11
Item Selection
  • There are several methods for selecting items
    during a CAT.
  • The most common method is to find the item that
    provides the most information given the current
    estimate of the measure.
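
The maximum-information rule described above can be sketched as follows. This is a minimal illustration that assumes a dichotomous Rasch model, where an item's information at the current estimate is p(1 - p); the item bank, item ids, and difficulty values are hypothetical and are not taken from the presentation.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def item_information(theta, difficulty):
    """Fisher information of a Rasch item at theta: p * (1 - p)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)


def select_next_item(theta, bank, administered):
    """Return the unadministered item that is most informative at theta."""
    candidates = [item for item in bank if item not in administered]
    return max(candidates, key=lambda item: item_information(theta, bank[item]))


# Hypothetical item bank: item id -> difficulty in logits.
bank = {"a": -1.5, "b": -0.4, "c": 0.3, "d": 1.2}
next_item = select_next_item(theta=0.5, bank=bank, administered={"c"})
```

Under the Rasch model, information peaks where the item difficulty equals the current estimate, so this rule amounts to picking the remaining item closest in difficulty to the person's measure.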

12
Item Selection (cont.)
13
Item Selection (cont.)
  • Item selection can also take into account the
    types or domains of items to be represented in
    the CAT session.
  • Example: items necessary for a DSM-IV diagnosis

14
Stop Rules
  • The stop rule, which determines when the item
    administration process of the CAT ends, can be
    based on:
    • Measurement precision
    • Number of items administered
    • Test-taking time
    • Some combination of the above
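
A combined stop rule of this kind might look like the sketch below. The function name, parameter names, and threshold values are illustrative assumptions; the presentation does not specify an implementation.

```python
def stop_rule_met(se, items_given, elapsed_seconds,
                  max_se=None, max_items=None, max_seconds=None):
    """Return True as soon as any configured criterion is satisfied.

    A criterion left as None is ignored, so the same function covers
    precision-only, length-only, time-only, and combined stop rules.
    """
    if max_se is not None and se <= max_se:
        return True  # measurement precision reached
    if max_items is not None and items_given >= max_items:
        return True  # item limit reached
    if max_seconds is not None and elapsed_seconds >= max_seconds:
        return True  # time limit reached
    return False


# Hypothetical combined rule: stop at SE <= 0.35 or after 16 items.
done = stop_rule_met(se=0.30, items_given=5, elapsed_seconds=40.0,
                     max_se=0.35, max_items=16)
```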

15
Item Bank Size
  • The more items there are in an item bank, the
    more likely it is that items tailored to an
    individual's level on the measured variable
    will be available.
  • Typically, item banks consist of hundreds of
    items.
  • The number of items will likely depend on:
    • The number of constructs or domains being
      assessed.
    • Whether one wishes to estimate a measure or
      classify persons into groups.

16
CAT for Clinical Assessment
  • The application of CAT to clinical research and
    assessment raises several new measurement issues:
    • Triage of persons around treatment decisions,
      using start and stop rules
    • Content balancing over multiple clinical
      dimensions
    • Identification of persons with atypical
      presentation of symptoms

17
Example 1: Triage of Individuals to Support
Clinical Decision-Making
18
Classifying Persons Using CAT
  • CAT is typically used to estimate a measure
  • Few studies have examined the use of CAT to place
    persons into diagnostic groups.
  • For placing persons into diagnostic groups, it is
    desirable to vary the level of measurement
    precision depending on the category in which the
    person is placed.
  • Current CAT procedures do not allow one to vary
    measurement precision during the CAT session.

19
Triage of individuals to support
clinical decision making
  • Strategy: Use screener measures to set the
    value of the initial measure, and use variable
    stop rules designed to maximize precision and
    efficiency when identifying persons with low,
    medium, or high symptom severity.
  • Implications: Taking into account the initial
    location and/or the precision around decision
    points can further improve the efficiency of
    assessment without hurting precision for decision
    making.

20
Clinical Decision Making
  • To facilitate clinical diagnoses, it would be
    desirable for a CAT to:
    • Classify patients by symptom severity.
    • Maximize measurement precision within the area
      of the measure that is most critical for
      decision making.
    • Use previously collected information to
      increase the efficiency of the CAT.

21
Study
  • We examined the ability of CAT to place persons
    into low, moderate and high levels of substance
    abuse and substance dependency.
  • The Substance Problem Scale (SPS) is a 16-item
    instrument that measures recency of substance
    use.
  • Example item: "When was the last time you used
    alcohol or other drugs weekly?"

22
Defining Cut Points
  • Cut points can be established by examining where
    persons with different levels of severity fall
    onto the measurement continuum.

23
The Start Rules
  • Random: randomly select an item with a
    difficulty calibration between -0.5 and 0.5
    logits (average level of difficulty).
  • Screener: select the item whose difficulty
    level most closely approximates the respondent's
    measure on a previously administered screener
    (SDScr).
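
The two start rules can be sketched as below. The item bank and the screener measure are hypothetical values chosen for illustration; only the -0.5 to 0.5 logit window comes from the slide.

```python
import random


def random_start(bank, low=-0.5, high=0.5, rng=random):
    """Randomly pick an item whose difficulty lies in [low, high] logits."""
    candidates = sorted(item for item, b in bank.items() if low <= b <= high)
    return rng.choice(candidates)


def screener_start(bank, screener_measure):
    """Pick the item whose difficulty is closest to the screener measure."""
    return min(bank, key=lambda item: abs(bank[item] - screener_measure))


# Hypothetical item bank: item id -> difficulty in logits.
bank = {"a": -1.5, "b": -0.4, "c": 0.3, "d": 1.2}

first_random = random_start(bank, rng=random.Random(0))
first_screener = screener_start(bank, screener_measure=1.0)
```

The screener rule starts the CAT near the person's likely location, which is why it can save items relative to starting in the middle of the scale.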

24
The Variable Stop Rule
  • Stop rules for the CAT were defined in terms of
    the maximum standard error (SE) of measurement
    for the low, mid, and high ranges of substance
    abuse severity.
  • The mid-range stop rule was set to SE = 0.35 for
    all simulations.
  • The low- and high-range stop rules ranged from
    SE = 0.5 to SE = 0.75 logits.
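
A variable stop rule of this form could be expressed as in the sketch below. The SE ceilings come from the slide; the cut points of -1.0 and 1.0 logits are hypothetical placeholders, since the presentation does not list the actual cut points.

```python
def max_se_for(measure, low_cut, high_cut,
               low_se=0.75, mid_se=0.35, high_se=0.75):
    """Return the SE ceiling for the severity range containing the measure."""
    if measure < low_cut:
        return low_se
    if measure > high_cut:
        return high_se
    return mid_se


def variable_stop_rule_met(measure, se, low_cut, high_cut, **ceilings):
    """True when the current SE is at or below the ceiling for its range."""
    return se <= max_se_for(measure, low_cut, high_cut, **ceilings)


# Hypothetical cut points (logits) separating low, mid, and high severity.
LOW_CUT, HIGH_CUT = -1.0, 1.0

clearly_high = variable_stop_rule_met(2.0, 0.60, LOW_CUT, HIGH_CUT)
near_cut = variable_stop_rule_met(0.0, 0.60, LOW_CUT, HIGH_CUT)
```

A person who is clearly in the high range can stop with a looser SE, while a person in the mid range, close to the decision points, needs a tighter SE before the test ends.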

25
CAT Standard Error
26
The Item Selection Algorithm
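[Flowchart: start rule using screener -> select item -> administer item -> re-estimate measure and SE -> if the measure is in the high range, apply the high-range stop rule; otherwise, if it is in the mid range, apply the mid-range stop rule; otherwise apply the low-range stop rule -> if the stop rule is met, end the test; if not, select the next item.]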
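
The loop in this flowchart can be sketched end to end as follows. Several parts are assumptions made only to keep the example runnable: the measure is re-estimated with an expected a posteriori (EAP) estimate under a standard normal prior, which the presentation does not specify; the item bank, cut points, and simulated respondent are hypothetical.

```python
import math

GRID = [g / 10.0 for g in range(-60, 61)]  # theta grid, -6 to 6 logits


def p_endorse(theta, b):
    """Rasch probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate(responses, bank):
    """EAP measure and posterior SD (assumed estimator, N(0, 1) prior)."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)
        for item, x in responses.items():
            p = p_endorse(theta, bank[item])
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank, answer, start_measure, low_cut, high_cut,
            mid_se=0.35, tail_se=0.75):
    """Administer items until the stop rule for the current range is met."""
    responses = {}
    measure, se = start_measure, float("inf")
    while len(responses) < len(bank):
        remaining = [i for i in bank if i not in responses]
        # For Rasch items, the most informative item is the one whose
        # difficulty is closest to the current measure.
        item = min(remaining, key=lambda i: abs(bank[i] - measure))
        responses[item] = answer(item)
        measure, se = estimate(responses, bank)
        limit = mid_se if low_cut <= measure <= high_cut else tail_se
        if se <= limit:
            break
    return measure, se, list(responses)


# Hypothetical 16-item bank and a simulated respondent who endorses
# every item with difficulty below 2.0 logits.
bank = {f"q{k:02d}": -3.0 + 0.4 * k for k in range(16)}
measure, se, order = run_cat(bank, lambda i: 1 if bank[i] < 2.0 else 0,
                             start_measure=0.0, low_cut=-1.0, high_cut=1.0)
```

Here `start_measure` stands in for the screener-based start rule: passing the screener measure makes the first selected item the one closest to it.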
27
Results
  • Screener starting rule improved efficiency of the
    CAT by approximately 7 percent compared to
    standard CAT procedures.
  • Variable stop rules improved efficiency by 15 to
    38 percent, depending on definition of the mid
    range of severity, compared to standard stopping
    rules.

28
Results
  • Pre-calibration and variable stop rules resulted
    in accurate and efficient estimation of substance
    abuse severity.
  • The screener start rule had only a small effect
    on classification precision.

29
Next Step: Refining the Algorithm
30
Example 2: Content Balancing over Multiple
Dimensions
31
Measuring Multiple Dimensions
  • Strategy: Use content balancing methods in
    combination with conventional item selection
    procedures to ensure selection of items from each
    substantive domain.
  • Implications: Assessment of an individual's
    clinical profile can be conducted both
    efficiently and comprehensively at both the total
    and subscale level.
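
One simple way to combine content balancing with maximum-information selection is sketched below: first choose the domain with the fewest items administered so far, then take the most informative remaining item within that domain. This is only one possible balancing scheme and is not necessarily the one used in the study; the item ids and difficulties are hypothetical.

```python
import math


def item_information(theta, b):
    """Rasch item information at theta: p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)


def select_balanced_item(theta, bank, administered):
    """bank maps item id -> (domain, difficulty in logits)."""
    counts = {domain: 0 for domain, _ in bank.values()}
    for item in administered:
        counts[bank[item][0]] += 1
    remaining = [i for i in bank if i not in administered]
    # Only domains that still have unused items can be targeted.
    open_domains = sorted({bank[i][0] for i in remaining})
    target = min(open_domains, key=lambda d: counts[d])
    candidates = [i for i in remaining if bank[i][0] == target]
    return max(candidates, key=lambda i: item_information(theta, bank[i][1]))


# Hypothetical items spread over the four IMDS subscale domains.
bank = {
    "dep1": ("Depression", -0.5),
    "dep2": ("Depression", 0.2),
    "anx1": ("Anxiety", 0.4),
    "tra1": ("Trauma", 1.0),
    "hs1": ("Homicidal/Suicidal", 2.0),
}
next_item = select_balanced_item(0.3, bank, administered={"dep1", "anx1"})
```

Without the balancing step, a person with a measure near 0.3 would keep receiving depression and anxiety items, because those are closest in difficulty; the balancing step forces the harder trauma and homicidal/suicidal items into the session.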

32
Internal Mental Distress Scale
  • The IMDS consists of the following subscales:
    • Depression Symptom Scale
    • Anxiety/Fear Symptom Scale
    • Traumatic Distress Scale
    • Homicidal/Suicidal Scale
  • The IMDS also has 4 general somatic items as
    part of the total scale score.
  • Clinicians want estimates of overall severity
    and of severity in each of the subscale areas.

33
Internal Mental Distress Scale by Content Area
[Figure: IMDS subscale item calibrations, in logits (axis from -3 to 3), for the H/S, Trauma, Anxiety, Somatic, and Depression content areas.]
34
Example: No Content Balancing
All screener items administered
35
Example: No Content Balancing
Item: "Think other people don't understand you". Response: Yes
Items by subscale: Depression 2, H/S 1, Anxiety 1, Trauma 1
36
Example: No Content Balancing
Item: "Lost interest in things". Response: Yes
Items by subscale: Depression 3, H/S 1, Anxiety 1, Trauma 1
37
Example: No Content Balancing
Item: "Thoughts people taking advantage of me". Response: No
Items by subscale: Depression 3, H/S 1, Anxiety 2, Trauma 1
38
Example: No Content Balancing
Item: "Shyness". Response: No
Items by subscale: Depression 4, H/S 1, Anxiety 2, Trauma 1
39
Example: No Content Balancing
Item: "Have to repeat action over and over". Response: Yes
Items by subscale: Depression 4, H/S 1, Anxiety 3, Trauma 1
40
Results
  • If the CAT is continued to 13 items:
    • Except for screener items, no homicidal/suicidal
      or trauma items are administered during the CAT
      session.
    • Precision is mixed across the subscales.

41
No Content Balancing
42
IMDS by Content Area
43
IMDS Screener Items
[Figure: IMDS screener items by content area (Suicidal, Trauma, Anxiety, Somatic, Depression), in logits on an axis from -3 to 3.]
44
IMDS Subscale Calibrations
[Figure: IMDS subscale calibrations for Depression, Anxiety, Trauma, and Suicidal, in logits on an axis from -3 to 3.]
45
IMDS Subscale Item Calibrations
[Figure: IMDS subscale item calibrations for Depression, Anxiety, Trauma, and Suicidal, in logits on an axis from -3 to 3, with the five screener items marked.]
46
Re-estimating IMDS
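[Figure: Re-estimated IMDS measure ("Revised Estimate") shown against the Suicidal, Trauma, Anxiety, Somatic, and Depression items, in logits on an axis from -3 to 3, with the five screener items marked.]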
47
Content Balancing CAT to Full IMDS
48
Example 3: Identifying Persons with Atypical
Symptom Presentations
49
Overview
  • Strategy: Rasch person fit statistics can
    identify persons with atypical clinical
    presentations in a computerized adaptive testing
    context.
  • Implications: Clients sometimes endorse severe
    clinical symptoms that are not reflected by
    overall scores on standard assessments. Using
    statistics that can identify persons with such an
    atypical presentation has important clinical
    implications.

50
Rasch Fit Statistics
  • Both infit and outfit follow a chi-square
    distribution in which high values are of primary
    concern.
  • Infit, or "Randomness": more changes between
    yes and no than would be expected based on
    overall severity.
    • Low: almost too-perfect fit
    • High: more transitions than expected
  • Outfit, or "Atypicalness": focuses more on the
    tail ends of the group of answers. Used to detect
    unexpected outlying, off-target responses.
    Outlier sensitive.
    • Low: almost too-perfect fit
    • High: endorsed high-severity items, but not the
      precursor items (e.g., easier items)
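
For readers who want to see how these statistics are computed, the sketch below uses the standard mean-square definitions for dichotomous Rasch data: outfit is the unweighted mean of the squared standardized residuals, and infit is the information-weighted version. The person measure, item difficulties, and response patterns are hypothetical.

```python
import math


def fit_statistics(theta, difficulties, responses):
    """Return (infit, outfit) mean squares for one person's 0/1 responses."""
    squared_residuals = []
    variances = []
    for b, x in zip(difficulties, responses):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: plain average of squared standardized residuals, so a single
    # very unexpected response on an off-target item can dominate it.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(responses)
    # Infit: residuals weighted by item information, so responses to
    # well-targeted items count the most.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit


# Hypothetical items ordered from mild to severe, person measure at 0 logits.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
typical = fit_statistics(0.0, difficulties, [1, 1, 1, 0, 0])
atypical = fit_statistics(0.0, difficulties, [0, 0, 0, 1, 1])
```

The atypical pattern, which endorses only the most severe items, produces much larger values on both statistics than the typical pattern does.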

51
Problems with Fit
Responses by Severity (Low to High)   Randomness   Atypicalness
111 11111100000 0000                  0.3          0.5
111 10101100010 0000                  0.6          1.0
111 11101010000 0000                  1.0          1.0
111 00001110000 0000                  0.9          1.3
011 11111110000 0000                  3.8          1.0
111 11111100000 0001                  3.8          1.0
101 01010101010 1010                  4.0          2.3
000 00000000011 1111                  12.6         4.3
52
Clinical Implications of Misfit
  • Misfit in the context of clinical assessment can
    reflect:
    • Difficulty understanding the assessment
    • Cross-cultural effects
    • Differential effects of treatment on some
      symptoms but not others
  • Our analyses indicate that there are subgroups
    who endorse severe symptoms without endorsing
    milder symptoms.
  • Example: atypical suicide profile

53
Example: Atypical Suicide
  • Depression is regarded as the major risk factor
    for suicide.
  • However, there is a less common profile
    characterized by suicide-related symptoms but in
    the absence of depressive symptoms.
  • This profile can be identified through the use of
    fit statistics (atypicalness).

[Response pattern: 000000000000 (depression items) 11111 (suicide items)]
54
Atypical Suicide
55
Fit Statistics in CAT
  • Fit statistics such as infit and outfit become
    less sensitive to atypical response patterns as
    the number of items is reduced.
  • Since CAT usually administers items that the
    respondent has a 50% probability of endorsing,
    either a yes or a no response to a dichotomous
    question is equally likely and is therefore
    consistent with the Rasch model.
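
A short worked equation makes the last point concrete. The notation is an assumption introduced here: x_{ni} is person n's 0/1 response to item i, and P_{ni} is the Rasch probability of endorsement. The squared standardized residual, which is the building block of both infit and outfit, takes the same value for either response when P_{ni} = 0.5.

```latex
z_{ni}^{2} = \frac{(x_{ni} - P_{ni})^{2}}{P_{ni}\,(1 - P_{ni})}
\qquad\Longrightarrow\qquad
z_{ni}^{2}\Big|_{P_{ni} = 0.5}
  = \frac{(1 - 0.5)^{2}}{0.25}
  = \frac{(0 - 0.5)^{2}}{0.25}
  = 1
```

Every perfectly targeted item therefore contributes exactly 1, the expected value, regardless of the answer, so a short CAT made up of well-targeted items gives the fit statistics little room to flag an unusual pattern.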

56
Randomness by Number of Items
Number of Items   Randomness Categories (%)
                  < 0.75   0.75-1.33   > 1.33
16                23.6     58.2        18.2
12                28.2     55.6        16.2
8                 35.2     52.8        12.0
4                 51.1     44.0        4.9
57
Atypicalness by Number of Items
Number of Items   Atypicalness Categories (%)
                  < 0.75   0.75-1.33   > 1.33
16                30.2     48.1        21.7
12                34.3     51.1        14.6
8                 38.4     53.2        8.4
4                 58.2     40.0        1.8
58
Next Steps: Alternatives to Infit and Outfit
  • Several measures and procedures for detecting
    misfit have been developed specifically for use
    with short tests and/or CAT. These include:
    • Adjustment of critical values for fit statistics
    • Statistical process control procedures
    • Modified t, modified H, and modified Z
      statistics (Dimitrov and Smith, 2006)

59
Potential of CAT in Clinical Practice
  • Reduce respondent burden
  • Reduce staff resources
  • Reduce data fragmentation
  • Streamline complex assessment procedures
  • Assist in clinical decision making
  • Identify persons with atypical profiles

60
Future Research
  • How do we put it all together?
  • Much of the research in the area of CAT has used
    computer simulation. There is a need to test
    working CAT systems in clinical practice.

61
Contact Information
  • A copy of this presentation will be at
    www.chestnut.org/li/posters
  • For information on this method and a paper on it,
    please contact Barth Riley at barthr_at_uic.edu
  • For information on the GAIN, please contact
    Michael Dennis at mdennis_at_chestnut.org or see
    www.chestnut.org/li/gain