Title: Using Diagnostic Assessment To Guide Timely Interventions
1. Using Diagnostic Assessment To Guide Timely Interventions
2. What We'll Cover
- A research-based framework for selecting and using diagnostic reading assessments
- Steps in the diagnostic assessment process
- Issues related to assessing the five reading components
- Diagnostic assessment options for each component
- Case examples
3. Reading First Assessments
- Screening: Brief measures to identify which students are at risk for reading problems
- Progress monitoring: Brief measures to determine if students are making adequate progress in acquiring reading skills
- Diagnostic: A comprehensive assessment to locate the source(s) of reading difficulty for individual students and to guide instruction
- Outcome: An assessment to determine the extent to which all students have achieved grade-level expectations in reading
4. Questions to Be Answered by Diagnostic Assessments
- In which reading skill areas is this student achieving at expected levels?
- In which reading skill areas is the student making less than expected progress?
- What types, intensity, and duration of interventions are likely to be effective in addressing this student's skill needs?
5. So Many Tests, So Few Guidelines . . .
- Growing number of print and online tests purporting to assess reading
- Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1999)
- Gives general guidelines, not specific criteria, for evaluating psychometric quality
6. Myths about Reading Assessment
- All claims that a measure is scientifically based are equally valid.
- A valid and reliable measure is equally valid and reliable for all examinees.
- All measures of the same reading component yield similar results for the same examinee.
7. The Case of Tim (Grade 1): Poor or Proficient Reader?
8. Accelerating Student Outcomes
[Diagram: a cycle linking Assessment, Data-Based Instructional Planning, and Instruction]
9. Reading Assessment Models
- Traditional
  - Standard battery (one size fits all)
  - Assumes reading problems arise from internal child deficits
  - Designed to provide a categorical label for educational programming
- Component-based
  - Targets domains related to the identified deficits
  - Assumes most reading problems arise from experiential and/or instructional deficits
  - Designed to provide information for guiding instruction
10. Two Sets of Considerations in Selecting Assessments
- Technical adequacy: Psychometric soundness
- Usability: Degree to which practitioners can actually use a measure in applied settings
11. Assessment Checklists
- Checklist 1: Evaluating the technical adequacy of diagnostic reading measures
- Checklist 2: Evaluating the usability of diagnostic reading measures
12. Five Key Technical Adequacy Characteristics
- Norms
- Test floors
- Item gradients
- Reliability
- Validity
- Checklist 1: Evaluating Technical Adequacy
13. Norms: How Do We Interpret Performance?
- Norm-referenced measures: Comparisons with age/grade peers
- Criterion-referenced measures: Comparisons with predetermined performance standards
- Nonstandardized measures: Research norms or examiner judgment
14. Evaluating the Adequacy of Norms
- Are they representative?
  - Criteria: Should match a national or appropriate reference population
- Are they recent?
  - Criteria: No more than 7 to 12 years old
- Are subgroup and sample sizes large enough?
  - Criteria: At least 100 and 1,000, respectively
15. Evaluating Norms, II
- Are norm table intervals small enough to reflect changes in skill development? (See the sketch below.)
- Criteria:
  - No more than 6 months for students aged 7-11 (years-months) and younger
  - No more than 1 year for students aged 8-0 to 18
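The norms criteria on these two slides can be rolled into one quick screen. A minimal sketch in Python; the function name, parameters, and the use of the lenient 12-year recency bound are illustrative assumptions, not part of any published checklist:

```python
def norms_adequate(norm_age_years: float, smallest_subgroup_n: int,
                   total_n: int, interval_months: int,
                   student_age_years: float) -> bool:
    """Screen a test's norms against the criteria above: recency
    (here the lenient bound of 12 years), subgroup n >= 100 and
    total n >= 1,000, and norm-table intervals of <= 6 months
    below age 8-0 (<= 12 months from 8-0 through 18)."""
    recent = norm_age_years <= 12
    large_enough = smallest_subgroup_n >= 100 and total_n >= 1000
    max_interval_months = 6 if student_age_years < 8 else 12
    fine_grained = interval_months <= max_interval_months
    return recent and large_enough and fine_grained

# Example: 9-year-old norms, subgroups of 120, N = 2,500, 1-year
# norm-table intervals, for an 8-year-old examinee -> adequate.
print(norms_adequate(9, 120, 2500, 12, 8.0))  # True
```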
16. Reliability: Are Scores Consistent and Accurate?
- Alternate-form: Form A vs. Form B
- Internal consistency: Item A vs. Item B
- Test-retest: Time A vs. Time B
- Interscorer: Scorer A vs. Scorer B
- Criteria: ≥ .80 for screening measures and ≥ .90 for diagnostic measures (see the sketch below)
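The reliability criteria translate directly into a threshold check. A minimal sketch; the "screening" and "diagnostic" labels are just illustrative keys:

```python
# Thresholds from the slide above.
RELIABILITY_CRITERIA = {"screening": 0.80, "diagnostic": 0.90}

def reliability_ok(coefficient: float, purpose: str) -> bool:
    """True if a reliability coefficient (alternate-form, internal
    consistency, test-retest, or interscorer) meets the criterion
    for the measure's purpose."""
    return coefficient >= RELIABILITY_CRITERIA[purpose]

print(reliability_ok(0.85, "screening"))   # True
print(reliability_ok(0.85, "diagnostic"))  # False
```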
17. Hidden Threat to Reliability
- Examiner variance: Differences among assessors in administering tasks and recording responses
- Especially likely on:
  - Live-voice tasks (phoneme blending)
  - Fluency-based tasks (CBM, TOWRE)
  - Tasks with complex administration or scoring systems (DIBELS ISF, LAC-3)
18. Test Floors: Can the Test Detect Poor Readers?
- Test floor: Lowest possible standard score when a student answers 1 item correctly
- Adequate floors: Permit identification of students with very weak skills
- Inadequate floors: Overestimate students' level of skills
19. Test Floor Criteria
- A subtest raw score of 1 should yield a standard score more than 2 SDs below the subtest mean (see the sketch below).
- SS of 3 or less for a subtest mean of 10
- SS of 69 or less for a subtest mean of 100
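As a worked check, a minimal Python sketch, assuming the conventional SDs of 3 for a subtest mean of 10 and 15 for a mean of 100:

```python
def floor_is_adequate(ss_at_raw_score_1: float, mean: float, sd: float) -> bool:
    """A floor is adequate when the standard score earned with a raw
    score of 1 falls more than 2 SDs below the subtest mean."""
    return ss_at_raw_score_1 < mean - 2 * sd

print(floor_is_adequate(3, mean=10, sd=3))     # True: 3 < 10 - 6
print(floor_is_adequate(69, mean=100, sd=15))  # True: 69 < 100 - 30
print(floor_is_adequate(97, mean=100, sd=15))  # False: the TOWRE example cited below
```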
20. Which Tests and Tasks Are Likely to Display Floor Effects?
- Cradle-to-grave tests (WJ III)
- Phonemic manipulation tasks (deletion, substitution, reversal)
- Oral reading fluency tests
- Pseudoword reading tests
- Spelling tests
- Reading comprehension tests
21. Why Floor Effects Matter
- TOWRE Phoneme Decoding Efficiency
  - A student in the 2nd month of Grade 1 with 1 item correct earns an SS of 97 (average).
- WJ III Reading Vocabulary
  - A student in the 3rd month of Grade 1 with 1 item correct earns an SS of 94 (average).
22. Item Gradients: Can the Test Detect Small Differences?
- Item gradient: Steepness with which standard scores change from 1 raw score unit to another
- Adequate gradient: Sensitive to small differences in performance
- Steep gradient: Obscures differences among performance levels
23. Item Gradient Criteria
- 6 or more items between subtest floor and mean (M = 10), or
- 10 or more items between subtest floor and mean (M = 100; see the sketch below)
- Example of a steep gradient: GRADE Listening Comprehension (K)
  - 17 items correct = 5th stanine
  - 18 items correct (100%) = 8th stanine
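A minimal sketch of the gradient criterion in Python; raw_at_floor and raw_at_mean (the raw scores corresponding to the subtest floor and the subtest mean) are assumed inputs read from the test's norm tables:

```python
def gradient_is_adequate(raw_at_floor: int, raw_at_mean: int,
                         subtest_mean: int) -> bool:
    """Require enough raw-score items between the subtest floor and
    the subtest mean: 6+ when M = 10, 10+ when M = 100."""
    required_items = 6 if subtest_mean == 10 else 10
    return (raw_at_mean - raw_at_floor) >= required_items

# Illustrative numbers: only 3 items separating the floor from the
# mean on an M = 100 subtest is far too steep a gradient.
print(gradient_is_adequate(raw_at_floor=1, raw_at_mean=4, subtest_mean=100))  # False
```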
24. Test Floors and Item Gradients: Special Cases
- Screening tests
  - Critical issue is cutoff score accuracy, not floor/gradient violations
- Tests not yielding standard scores
  - Deciles, percentiles, quartiles, stanines
- Rasch-model tests
  - Preclude direct inspection of raw score-standard score relationships
  - WJ family: WJ III, WRMT-R/NU, WDRB
25. Validity: Are the Results Meaningful?
- Content validity: Effectiveness in assessing the relevant domain
- Criterion-related validity: Effectiveness in predicting performance now (concurrent validity) or later (predictive validity)
- Construct validity: Effectiveness in measuring what the test is supposed to measure
- Criteria: Evidence of all three types of validity for the target population
26. Content Validity: Are Tests Assessing the Same Domain?
27. Predictive and Diagnostic Validity
- Does the test predict reading outcomes for the target age/grade group?
  - Concurrent vs. predictive validity evidence
- Does the test differentiate between students with and without reading problems?
  - Group differentiation studies
28. The Rest of the Story: Usability Considerations
- Usability often has more influence on test selection and use than technical adequacy:
  - "I know how to give it."
  - "It doesn't take long to give."
  - "It's easy to carry around."
  - "I think I saw one in the storage closet."
29. Practical Characteristics
- Test construction
- Administration
- Accommodations and adaptations
- Scores and scoring
- Interpretation
- Links to intervention
- Checklist 2: Evaluating Usability
30. The Critical Usability Issue in Diagnostic Assessment
- Is there evidence that test results can be used to design instruction to address the reading deficits that have been identified?
31. The Diagnostic Assessment Process
- What can we learn from the results of screening and/or progress monitoring measures?
  - Are there weaknesses in fluency, phonics, or phonemic awareness?
- What can we learn from the results of outcome measures (if available)?
  - Are there weaknesses in vocabulary and/or comprehension? (See the routing sketch below.)
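One way to make this routing concrete: a minimal Python sketch in which the data-source and component names are illustrative labels, not part of any published protocol:

```python
# Weak screening/progress monitoring results point to the foundational
# skills; weak outcome results point to language comprehension skills.
SOURCE_TO_COMPONENTS = {
    "screening/progress monitoring": ["fluency", "phonics", "phonemic awareness"],
    "outcome measures": ["vocabulary", "comprehension"],
}

def components_to_probe(weak_sources):
    """List the reading components to target in diagnostic assessment,
    given which data sources showed weaknesses."""
    targets = []
    for source in weak_sources:
        targets.extend(SOURCE_TO_COMPONENTS[source])
    return targets

print(components_to_probe(["screening/progress monitoring"]))
# ['fluency', 'phonics', 'phonemic awareness']
```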
32. Types of Students with Reading Problems
[Diagram: students with specific phonological processing problems, students with global language deficits, attentional problems, and disruptive behavior problems, all contributing to a reading performance problem]
33. Future Language Deficits?
34. Identified Deficit
[Diagram: mapping an identified deficit to the areas to assess: comprehension, fluency, phonics, vocabulary, phonemic awareness, and reading-related cognitive abilities]
35. The Critical Role of Fluency
36. Issues in Assessing Fluency
- Floor effects common
- Task variations: foundational skills vs. word reading vs. contextual reading
- Variations in level of text difficulty
- Oral vs. silent reading formats
- Interexaminer variance
- Differences in fluency definitions
37. Fluency Options
- BEAR: WPM, Fluency Scale
- CBM (student's own text): WCPM
- CBM (DIBELS): WCPM
- GORT-4 Rate/Fluency: SS, PR, GE, AE
- FOX Fluency: WCPM, Fluency Scale
- Virginia PALS: WPM, Fluency Scale
- Center City Consortium PALS: WCPM
- TPRI: WCPM
38. Best Practices in Assessing Fluency
- Administer graded passages with documented readability levels.
- Use WCPM (words correct per minute) as the fluency metric (see the sketch below).
- Assess at the passage level (i.e., more than 1 minute of reading).
- Take running records to obtain diagnostic and intervention planning information.
- Beware of floor effects in norm-referenced tests.
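WCPM itself is simple arithmetic: words read correctly divided by minutes of reading. A minimal sketch with invented figures for illustration:

```python
def wcpm(words_attempted: int, errors: int, seconds: float) -> float:
    """Words correct per minute: (words attempted - errors) per minute of reading."""
    return (words_attempted - errors) / (seconds / 60.0)

# Hypothetical one-minute reading of a Grade 1 passage:
print(wcpm(35, errors=7, seconds=60))  # 28.0 WCPM
```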
39. Phonics Subskills
40. Issues in Assessing Phonics
- Wide differences in content coverage for alphabet knowledge:
  - WJ III Letter-Word ID: 13 letters
  - TERA-3 Alphabet: 13 letters
  - ERDA-2 Letter Recognition: 26 letters
  - WRMT-R/NU Letter ID: 51 letters
- Floor effects common for pseudoword reading and spelling tests
41. Phonics Issues, II
- Differences in task types:
  - Pseudoword reading: recognition
  - Spelling: recall (more sensitive)
- Differences in pseudoword construction:
  - vake: many neighbors (easier to read)
  - vaik: few neighbors (harder to read)
- Pseudoword reading tests vulnerable to examiner variance and interscorer inconsistency
42. Alphabet Knowledge Options
(NR = norm-referenced; CR = criterion-referenced; NS = nonstandardized)
- Book Buddies: NS
- CORE Phonics Survey: NS
- ERDA-2: NR
- FOX: CR
- PALS: CR
- TPRI: CR
- Random letter arrays: NS
43. Spelling Options: Looking in through the Phonics Window
- Book Buddies (NS - developmental scoring)
- CORE Phonics Survey (CR)
- FOX (CR)
- PALS (CR - developmental scoring)
- TPRI (CR)
- WIAT-II Spelling (NR)
- WJ III Spelling, Spelling of Sounds (NR)
44. Pseudoword Reading Options
- CORE Phonics Survey: NS
- ERDA-2/WIAT-II: NR
- FOX Decoding Sight Words: CR
- PAT Decoding: NR/CR
- Phonics-Based Reading Test: NR/CR
- WRMT-R/NU Word Attack: NR
- WJ III Word Attack: NR
- Informal pseudoword measures
45. Best Practices in Assessing Phonics
- Assess all relevant phonics components.
- Select measures with adequate content coverage.
- Include both recognition (pseudoword reading) and recall (spelling) measures.
- Include developmental spelling measures with differentiated scoring systems.
46. Phonological vs. Phonemic Awareness
- Phonological awareness: General awareness of the sound structure of language (as distinct from its meaning)
- Phonemic awareness: Understanding that speech is composed of individual sounds that can be analyzed and manipulated
47. Issues in Assessing Phonemic Awareness
- Variations in linguistic unit, presentation and response formats, coverage, item types, and scoring (all-or-nothing vs. partial credit)
- Variations in predictive power, depending on children's stage of literacy development
- Vulnerable to examiner and interscorer variance, especially for live-voice measures
48. Which Skills Are Being Measured and How?
49. Phonemic Awareness Options
- CTOPP (7 tasks): NR
- FOX (7 tasks): CR
- LAC-3 (2 tasks): NR/CR
- PALS (4 tasks): CR
- PAT (6 tasks): NR/CR
- TPRI (5 tasks): CR
50. Best Practices in Assessing Phonemic Awareness
- Select multiple measures with adequate content coverage for the domain.
- Maximize diagnostic power by matching measures to children's stage of literacy development.
- Use individually administered measures with oral response formats.
- Provide training and reliability checks for complex and live-voice measures.
51. Issues in Assessing Comprehension
- Floor effects common
- Variations in:
  - Level of measurement (word, sentence, passage)
  - Text type (narrative or expository)
  - Reading format (oral vs. silent)
  - Response format (oral or written, etc.)
  - Skills assessed (main idea, sequence, etc.)
  - Types of questions (literal, inferential, lexical)
52. Comprehension Task Types
53. One More Time: Different Tests Yield Different Results
- Comprehension measures show more variation across more test features than virtually any other type of reading instrument.
54. Does Jacqueline (Grade 2) Have a Comprehension Problem?
55. Comprehension Measures
- CBM in Oral Reading: CR
- ERDA-2: NR
- GORT-4: NR
- PALS: CR
- TPRI: CR
- WIAT-II: NR
- WJ III: NR
56. Best Practices in Assessing Comprehension
- Use individually administered measures with oral reading formats.
- Supplement formal with informal measures to obtain information for instructional planning.
- Compare results with listening comprehension results to differentiate children with general language deficits from children with decoding problems.
57. Issues in Assessing Vocabulary
- Variation in content, task/response formats, skills assessed, and scoring systems
- May lead to the overidentification of culturally and linguistically diverse children
- May lead to the underidentification of children from literacy-rich home environments
- Often of poor technical quality
- Poor performance can be difficult to interpret
58. Types of Vocabulary Measures
- Oral expression/expressive vocabulary
  - ITPA-3 Spoken Vocabulary: Listening to an examiner-provided attribute and providing a noun (e.g., "something with a roof" → "house")
- Listening comprehension/receptive vocabulary
  - GRADE Listening Comprehension: Listening to stories and marking one of four pictures
59. Receptive Vocabulary Options
- ERDA-2 (2 tasks): NR
- FOX (1 task): CR
- OWLS Listening Comprehension Scale: NR
- PPVT-III: NR
- TOLD-Primary:3 (2 tasks): NR
- TPRI (1 task): NR
- WIAT-II (1 task): NR
- WJ III (2 tasks): NR
60. Expressive Vocabulary Options
- ERDA-2 (2 tasks): NR
- EVT: NR
- FOX (1 task): CR
- ITPA-3 (4 tasks): NR
- OWLS Oral Expression Scale: NR
- TOLD-Primary:3 (5 tasks): NR
- TPRI (1 task): NR
- WIAT-II (1 task): NR
- WJ III (3 tasks): NR
61. Best Practices in Assessing Vocabulary
- Assess both receptive and expressive language processes.
- Use measures with more than one format (e.g., not only one-word responses).
- Include both formal and informal measures for intervention planning.
- Interpret results cautiously for culturally and linguistically diverse learners.
62. Increasing the Validity and Utility of Diagnostic Assessments
- Analyze screening, progress monitoring, and outcome results for diagnostic clues.
- Select as core assessments research-based tests that meet Reading First standards for reliability and validity.
- Supplement norm-referenced measures with criterion-referenced and informal measures to ensure adequate coverage and increase instructionally relevant information.
63. Increasing Validity, II
- Evaluate the presence of attentional and behavior problems.
  - These are key variables differentiating children who respond to treatment from difficult-to-remediate poor readers.
- Assess environmental factors to understand the context of poor reading skills.
64. Instructional Deficits?
65. Increasing Validity, III
- Know the psychometric strengths and limitations of each measure, including changes in revised tests that may affect performance levels and interpretation.
- For less adequate measures, build in strategies to obtain the highest possible reliability and validity.
66. The Case of Darla
(Please note: Neither of these students is Darla.)
67. PALS: Fall of Grade 1
68. PALS: Spring of Grade 1
69. Diagnostic Assessment: June of Grade 1
70. The Rest of the Picture
- Limited fluency
  - CBM in oral reading: 28 WCPM in Grade 1 text
- Very poor decoding skills
  - WIAT-II Pseudoword Decoding: 1 correct
- Attentional and persistence problems
  - "Can I take the test home?"
- Diagnosis? A severe decoding problem obscured by a small memorized sight vocabulary and good language skills
71. The Golden Rule of Assessment
- The best-designed assessment with the most reliable and valid measures, administered by the best-trained assessor, won't change a child's reading trajectory . . .
- unless someone in the child's life does something different.
72"Look, Dr. Rathvon! I'm READING!"
73. Best Practices in Developing Timely Interventions
- Identify specific areas of deficiency in fluency, phonics, phonemic awareness, comprehension, and/or vocabulary.
- Specify the desired levels of performance in each area.
- Describe the instructional, programmatic, and support services to be provided.
- Specify the methods and schedule for progress monitoring.
74. Best Practices, II
- Account for attentional, behavioral, and environmental variables in diagnosis and intervention planning.
- Include parents as partners in planning, implementing, and evaluating interventions.
- Provide explicit instruction and performance feedback to help intervention agents deliver interventions as planned (treatment fidelity).
75. The Ultimate Goal