1
Using Diagnostic Assessment To Guide Timely
Interventions
  • Natalie Rathvon, Ph.D.

2
What We'll Cover
  • A research-based framework for selecting and
    using diagnostic reading assessments
  • Steps in the diagnostic assessment process
  • Issues related to assessing the five reading
    components
  • Diagnostic assessment options for each component
  • Case examples

3
Reading First Assessments
  • Screening: Brief measures to identify which
    students are at risk for reading problems
  • Progress monitoring: Brief measures to determine
    if students are making adequate progress in
    acquiring reading skills
  • Diagnostic: A comprehensive assessment to locate
    the source(s) of reading difficulty for
    individual students and to guide instruction
  • Outcome: An assessment to determine the extent to
    which all students have achieved grade-level
    expectations in reading

4
Questions to be Answered by Diagnostic Assessments
  • In which reading skill areas is this student
    achieving at expected levels?
  • In which reading skill areas is the student
    making less than expected progress?
  • What types, intensity, and duration of
    interventions are likely to be effective in
    addressing this student's skill needs?

5
So many tests, so few guidelines . . .
  • Growing number of print and online tests
    purporting to assess reading
  • Standards for Educational and Psychological
    Testing (AERA, APA, NCME, 1999)
  • Gives general guidelines--not specific
    criteria--for evaluating psychometric quality

6
Myths about Reading Assessment
  • All claims that a measure is scientifically
    based are equally valid.
  • A valid and reliable measure is equally valid and
    reliable for all examinees.
  • All measures of the same reading component yield
    similar results for the same examinee.

7
The Case of Tim (Grade 1): Poor or Proficient
Reader?
8
Accelerating Student Outcomes
(Diagram: Assessment, Instruction, and Data-Based
Instructional Planning)
9
Reading Assessment Models
  • Traditional
    • Standard battery (one size fits all)
    • Assumes reading problems arise from internal
      child deficits
    • Designed to provide a categorical label for
      educational programming
  • Component-based
    • Targets domains related to the identified
      deficits
    • Assumes most reading problems arise from
      experiential and/or instructional deficits
    • Designed to provide information for guiding
      instruction

10
Two Sets of Considerations in Selecting
Assessments
  • Technical adequacy: Psychometric soundness
  • Usability: Degree to which practitioners can
    actually use a measure in applied settings

11
Assessment Checklists
  • Checklist 1: Evaluating the technical adequacy of
    diagnostic reading measures
  • Checklist 2: Evaluating the usability of
    diagnostic reading measures

12
Five Key Technical Adequacy Characteristics
  • Norms
  • Test floors
  • Item gradients
  • Reliability
  • Validity
  • Checklist 1: Evaluating Technical Adequacy

13
Norms: How Do We Interpret Performance?
  • Norm-referenced measures: Comparisons with
    age/grade peers
  • Criterion-referenced measures: Comparisons with
    pre-determined performance standards
  • Nonstandardized measures: Research norms or
    examiner judgment

14
Evaluating the Adequacy of Norms
  • Are they representative?
  • Criteria: Should match a national or appropriate
    reference population
  • Are they recent?
  • Criteria: No more than 7-12 years old
  • Are subgroup and sample sizes large enough?
  • Criteria: At least 100 and 1,000, respectively

15
Evaluating Norms, II
  • Are norm table intervals small enough to reflect
    changes in skill development?
  • Criteria (scripted in the sketch below):
  • No more than 6 months for students aged 7-11
    (7 years, 11 months) and younger
  • No more than 1 year for students aged 8-0 to 18
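
These norm criteria are concrete enough to script. Below is a minimal Python sketch, assuming hypothetical function and argument names and treating 12 years as the hard recency limit; it illustrates the checklist logic rather than reproducing any published checklist:

```python
from datetime import date

def check_norms(norm_year: int, subgroup_n: int, total_n: int,
                interval_months: int, age_years: int) -> list:
    """Flag departures from the norm-adequacy criteria on slides 14-15."""
    problems = []
    norm_age = date.today().year - norm_year
    if norm_age > 12:                    # recency: no more than 7-12 years old
        problems.append(f"norms are {norm_age} years old (> 12)")
    if subgroup_n < 100:                 # subgroups of at least 100
        problems.append(f"subgroup n = {subgroup_n} (< 100)")
    if total_n < 1000:                   # total sample of at least 1,000
        problems.append(f"total n = {total_n} (< 1,000)")
    # Norm-table intervals: 6 months up to age 7-11, 1 year for ages 8-0 to 18
    max_interval = 6 if age_years < 8 else 12
    if interval_months > max_interval:
        problems.append(f"{interval_months}-month norm intervals "
                        f"(max {max_interval} for this age)")
    return problems

# Example: 1998 norms, subgroups of 80, 1-year intervals, 7-year-old examinee
print(check_norms(1998, subgroup_n=80, total_n=1500,
                  interval_months=12, age_years=7))
```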

16
Reliability: Are Scores Consistent and Accurate?
  • Alternate-form: Form A vs. Form B
  • Internal consistency: Item A vs. Item B
  • Test-retest: Time A vs. Time B
  • Interscorer: Scorer A vs. Scorer B
  • Criteria: ≥ .80 for screening measures and ≥ .90
    for diagnostic measures (see the sketch below)
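
The reliability criteria reduce to a one-line rule. A minimal sketch, assuming a hypothetical helper name; the thresholds are the ones stated on the slide:

```python
def reliability_adequate(r: float, purpose: str) -> bool:
    """Apply the criteria above: r >= .80 for screening,
    r >= .90 for diagnostic decision-making."""
    thresholds = {"screening": 0.80, "diagnostic": 0.90}
    return r >= thresholds[purpose]

print(reliability_adequate(0.85, "screening"))   # True
print(reliability_adequate(0.85, "diagnostic"))  # False: too low for diagnosis
```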

17
Hidden Threat to Reliability
  • Examiner variance: Differences among assessors in
    administering tasks and recording responses
  • Especially likely on
    • Live-voice tasks (phoneme blending)
    • Fluency-based tasks (CBM, TOWRE)
    • Tasks with complex administration or scoring
      systems (DIBELS ISF, LAC-3)

18
Test Floors: Can the Test Detect Poor Readers?
  • Test floor: Lowest possible standard score when a
    student answers 1 item correctly
  • Adequate floors: Permit identification of
    students with very weak skills
  • Inadequate floors: Overestimate students' skill
    levels

19
Test Floor Criteria
  • A subtest raw score of 1 should yield a standard
    score more than 2 SDs below the subtest mean
    (see the check below).
  • SS of 3 or less for a subtest mean of 10
  • SS of 69 or less for a subtest mean of 100
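
A minimal check of this criterion, assuming the conventional SDs of 3 (for M = 10 scaled scores) and 15 (for M = 100 standard scores); the function name is illustrative:

```python
def floor_adequate(floor_ss: float, mean: float, sd: float) -> bool:
    """A raw score of 1 should earn a standard score more than
    2 SDs below the subtest mean."""
    return floor_ss < mean - 2 * sd

print(floor_adequate(3, mean=10, sd=3))     # True:  3 is below 10 - 2*3 = 4
print(floor_adequate(69, mean=100, sd=15))  # True:  69 is below 100 - 2*15 = 70
print(floor_adequate(97, mean=100, sd=15))  # False: cf. the TOWRE example later
```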

20
Which Tests and Tasks Are Likely to Display Floor
Effects?
  • Cradle-to-grave tests (WJ III)
  • Phonemic manipulation tasks (deletion,
    substitution, reversal)
  • Oral reading fluency tests
  • Pseudoword reading tests
  • Spelling tests
  • Reading comprehension tests

21
Why Floor Effects Matter
  • TOWRE Phoneme Decoding Efficiency
    • A student in the 2nd month of Grade 1 with 1
      item correct earns an SS of 97 (average).
  • WJ III Reading Vocabulary
    • A student in the 3rd month of Grade 1 with 1
      item correct earns an SS of 94 (average).

22
Item Gradients: Can the Test Detect Small
Differences?
  • Item gradient: Steepness with which standard
    scores change from 1 raw score unit to another
  • Adequate gradient: Sensitive to small differences
    in performance
  • Steep gradient: Obscures differences among
    performance levels

23
Item Gradient Criteria
  • 6 or more items between subtest floor and mean
    (M = 10), or
  • 10 or more items between subtest floor and mean
    (M = 100); see the check below
  • GRADE Listening Comprehension (K):
    • 17 items correct = 5th stanine
    • 18 items correct (100%) = 8th stanine
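
Like the floor rule, the gradient rule is easy to encode. A minimal sketch with a hypothetical function name, using the item counts stated above:

```python
def gradient_adequate(items_floor_to_mean: int, subtest_mean: int) -> bool:
    """At least 6 raw-score items between floor and mean for M = 10
    scales, at least 10 for M = 100 scales."""
    required = 6 if subtest_mean == 10 else 10
    return items_floor_to_mean >= required

print(gradient_adequate(4, subtest_mean=10))    # False: gradient too steep
print(gradient_adequate(12, subtest_mean=100))  # True
```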

24
Test Floors and Item Gradients: Special Cases
  • Screening tests
    • Critical issue is cutoff score accuracy, not
      floor/gradient violations
  • Tests not yielding standard scores
    • Deciles, percentiles, quartiles, stanines
  • Rasch-model tests
    • Preclude direct inspection of raw score-standard
      score relationships
    • WJ family: WJ III, WRMT-R/NU, WDRB

25
Validity: Are the Results Meaningful?
  • Content validity: Effectiveness in assessing the
    relevant domain
  • Criterion-related validity: Effectiveness in
    predicting performance now (concurrent validity)
    or later (predictive validity)
  • Construct validity: Effectiveness in measuring
    what the test is supposed to measure
  • Criteria: Evidence of all three types of validity
    for the target population

26
Content Validity: Are Tests Assessing the Same
Domain?
27
Predictive and Diagnostic Validity
  • Does the test predict reading outcomes for the
    target age/grade group?
  • Concurrent vs. predictive validity evidence
  • Does the test differentiate between students with
    and without reading problems?
  • Group differentiation studies

28
The Rest of the Story: Usability Considerations
  • Usability often has more influence on test
    selection and use than technical adequacy.
  • "I know how to give it."
  • "It doesn't take long to give."
  • "It's easy to carry around."
  • "I think I saw one in the storage closet."

29
Practical Characteristics
  • Test construction
  • Administration
  • Accommodations and adaptations
  • Scores and scoring
  • Interpretation
  • Links to intervention
  • Checklist 2 Evaluating Usability

30
The Critical Usability Issue in Diagnostic
Assessment
  • Is there evidence that test results can be used
    to design instruction to address the reading
    deficits that have been identified?

31
The Diagnostic Assessment Process
  • What can we learn from the results of screening
    and/or progress monitoring measures?
    • Are there weaknesses in fluency, phonics, or
      phonemic awareness?
  • What can we learn from the results of outcome
    measures (if available)?
    • Are there weaknesses in vocabulary and/or
      comprehension?

32
Types of Students with Reading Problems
  • Students with specific phonological processing
    problems
  • Students with global language deficits
  • Reading performance problems
  • Attentional problems
  • Disruptive behavior problems
33
Future Language Deficits?
34
Identified Deficit
  • Comprehension
  • Fluency
  • Phonics
  • Vocabulary
  • Reading-Related Cognitive Abilities
  • Phonemic Awareness
35
The Critical Role of Fluency
36
Issues in Assessing Fluency
  • Floor effects common
  • Task variations: foundational skills vs. word
    reading vs. contextual reading
  • Variations in level of text difficulty
  • Oral vs. silent reading formats
  • Interexaminer variance
  • Differences in fluency definitions

37
Fluency Options
  • BEAR: WPM, Fluency Scale
  • CBM (student's own text): WCPM
  • CBM (DIBELS): WCPM
  • GORT-4 Rate, Fluency: SS, PR, GE, AE
  • FOX Fluency: WCPM, Fluency Scale
  • Virginia PALS: WPM, Fluency Scale
  • Center City Consortium PALS: WCPM
  • TPRI: WCPM

38
Best Practices in Assessing Fluency
  • Administer graded passages with documented
    readability levels.
  • Use WCPM (words correct per minute) as the
    fluency metric; a worked computation follows
    this list.
  • Assess at the passage level (i.e., more than 1
    minute of reading).
  • Take running records to obtain diagnostic and
    intervention planning information.
  • Beware of floor effects in norm-referenced tests.
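
WCPM is words read correctly divided by minutes elapsed. A minimal sketch with a hypothetical function name and made-up example numbers:

```python
def wcpm(words_read: int, errors: int, seconds: float) -> float:
    """Words correct per minute: (words read - errors) / minutes elapsed."""
    return (words_read - errors) / (seconds / 60)

# A first grader reads 60 words of a graded passage in 90 seconds,
# making 5 errors: 55 correct words / 1.5 minutes = ~36.7 WCPM.
print(round(wcpm(60, errors=5, seconds=90), 1))
```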

39
Phonics Subskills
40
Issues in Assessing Phonics
  • Wide differences in content coverage for alphabet
    knowledge
    • WJ III Letter-Word ID: 13 letters
    • TERA-3 Alphabet: 13 letters
    • ERDA-2 Letter Recognition: 26 letters
    • WRMT-R/NU Letter ID: 51 letters
  • Floor effects common for pseudoword reading and
    spelling tests

41
Phonics Issues, II
  • Differences in task types
    • Pseudoword reading: recognition
    • Spelling: recall (more sensitive)
  • Differences in pseudoword construction
    • vake: many neighbors (easier to read)
    • vaik: few neighbors (harder to read)
  • Pseudoword reading tests are vulnerable to
    examiner variance and interscorer inconsistency

42
Alphabet Knowledge Options
(NR = norm-referenced, CR = criterion-referenced,
NS = nonstandardized)
  • Book Buddies: NS
  • CORE Phonics Survey: NS
  • ERDA-2: NR
  • FOX: CR
  • PALS: CR
  • TPRI: CR
  • Random letter arrays: NS

43
Spelling Options: Looking in through the Phonics
Window
  • Book Buddies (NS - developmental scoring)
  • CORE Phonics Survey (CR)
  • FOX (CR)
  • PALS (CR - developmental scoring)
  • TPRI (CR)
  • WIAT-II Spelling (NR)
  • WJ III Spelling, Spelling of Sounds (NR)

44
Pseudoword Reading Options
  • CORE Phonics Survey: NS
  • ERDA-2/WIAT-II: NR
  • FOX Decoding, Sight Words: CR
  • PAT Decoding: NR, CR
  • Phonics-Based Reading Test: NR, CR
  • WRMT-R/NU Word Attack: NR
  • WJ III Word Attack: NR
  • Informal pseudoword measures

45
Best Practices in Assessing Phonics
  • Assess all relevant phonics components.
  • Select measures with adequate content coverage.
  • Include both recognition (pseudoword reading) and
    recall measures (spelling).
  • Include developmental spelling measures with
    differentiated scoring systems.

46
Phonological vs. Phonemic Awareness
  • Phonological awareness: General awareness of the
    sound structure of language (vs. meaning)
  • Phonemic awareness: Understanding that speech is
    composed of individual sounds that can be
    analyzed and manipulated

47
Issues in Assessing Phonemic Awareness
  • Variations in linguistic unit, presentation and
    response formats, coverage, item types, and
    scoring (all-or-nothing vs. partial credit)
  • Variations in predictive power, depending on
    children's stage of literacy development
  • Vulnerable to examiner and interscorer variance,
    especially for live-voice measures

48
Which skills are being measured and how?
49
Phonemic Awareness Options
  • CTOPP (7 tasks): NR
  • FOX (7 tasks): CR
  • LAC-3 (2 tasks): NR, CR
  • PALS (4 tasks): CR
  • PAT (6 tasks): NR, CR
  • TPRI (5 tasks): CR

50
Best Practices in Assessing Phonemic Awareness
  • Select multiple measures with adequate content
    coverage for the domain.
  • Maximize diagnostic power by matching measures to
    children's stage of literacy development.
  • Use individually administered measures with oral
    response formats.
  • Provide training and reliability checks for
    complex and live-voice measures (e.g., the
    agreement check sketched below).
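
One simple way to run such a reliability check is item-by-item percent agreement between two scorers. A minimal sketch, assuming a hypothetical function name and invented scores:

```python
def percent_agreement(scorer_a: list, scorer_b: list) -> float:
    """Item-by-item percent agreement between two independent scorers."""
    if len(scorer_a) != len(scorer_b):
        raise ValueError("scorers must rate the same items")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return 100 * matches / len(scorer_a)

# Two assessors independently score the same 10 live-voice
# phoneme-blending responses (1 = correct, 0 = incorrect).
a = [1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 0, 1, 0, 1, 1, 1]
print(percent_agreement(a, b))  # 80.0
```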

51
Issues in Assessing Comprehension
  • Floor effects common
  • Variations in
    • level of measurement (word, sentence, passage)
    • text type (narrative or expository)
    • reading format (oral vs. silent)
    • response format (oral or written, etc.)
    • skills assessed (main idea, sequence, etc.)
    • types of question (literal, inferential, lexical)

52
Comprehension Task Types
53
One More Time: Different Tests Yield Different
Results
  • Comprehension measures show more variation across
    more test features than virtually any other type
    of reading instrument.

54
Does Jacqueline (Grade 2) Have a Comprehension
Problem?
55
Comprehension Measures
  • CBM in Oral Reading: CR
  • ERDA-2: NR
  • GORT-4: NR
  • PALS: CR
  • TPRI: CR
  • WIAT-II: NR
  • WJ III: NR

56
Best Practices in Assessing Comprehension
  • Use individually administered measures with oral
    reading formats.
  • Supplement formal with informal measures to
    obtain information for instructional planning.
  • Compare results with listening comprehension
    results to differentiate children with general
    language deficits from children with decoding
    problems.

57
Issues in Assessing Vocabulary
  • Variation in content, task/response formats,
    skills assessed, and scoring systems
  • May lead to the overidentification of culturally
    and linguistically diverse children
  • May lead to the underidentification of children
    from literacy-rich home environments
  • Often of poor technical quality
  • Poor performance can be difficult to interpret

58
Types of Vocabulary Measures
  • Oral expression/expressive vocabulary
    • ITPA-3 Spoken Vocabulary: Listening to an
      examiner-provided attribute and providing a
      noun ("something with a roof" → house)
  • Listening comprehension/receptive vocabulary
    • GRADE Listening Comprehension: Listening to
      stories and marking one of four pictures

59
Receptive Vocabulary Options
  • ERDA-2 (2 tasks): NR
  • FOX (1 task): CR
  • OWLS Listening Comprehension Scale: NR
  • PPVT-III: NR
  • TOLD-Primary 3 (2 tasks): NR
  • TPRI (1 task): NR
  • WIAT-II (1 task): NR
  • WJ III (2 tasks): NR

60
Expressive Vocabulary Options
  • ERDA-2 (2 tasks): NR
  • EVT: NR
  • FOX (1 task): CR
  • ITPA-3 (4 tasks): NR
  • OWLS Oral Expression Scale: NR
  • TOLD-Primary 3 (5 tasks): NR
  • TPRI (1 task): NR
  • WIAT-II (1 task): NR
  • WJ III (3 tasks): NR

61
Best Practices in Assessing Vocabulary
  • Assess both receptive and expressive language
    processes.
  • Use measures with more than one response format
    (i.e., not only one-word responses).
  • Include both formal and informal measures for
    intervention planning.
  • Interpret results cautiously for culturally and
    linguistically diverse learners.

62
Increasing the Validity and Utility of Diagnostic
Assessments
  • Analyze screening, progress monitoring, and
    outcome results for diagnostic clues.
  • Select as core assessments research-based tests
    that meet Reading First standards for reliability
    and validity.
  • Supplement norm-referenced measures with
    criterion-referenced and informal measures to
    ensure adequate coverage and increase
    instructionally relevant information.

63
Increasing Validity, II
  • Evaluate the presence of attentional and behavior
    problems.
    • Key variables differentiating between children
      who respond to treatment and
      difficult-to-remediate poor readers
  • Assess environmental factors to understand the
    context of poor reading skills.

64
Instructional Deficits?
65
Increasing Validity, III
  • Know the psychometric strengths and limitations
    of each measure, including changes in revised
    tests that may affect performance levels and
    interpretation.
  • For less adequate measures, build in strategies
    to obtain the highest possible reliability and
    validity.

66
The Case of Darla
(Please note: Neither of these students is Darla.)
67
PALS: Fall of Grade 1
68
PALS: Spring of Grade 1
69
Diagnostic Assessment: June of Grade 1
70
The Rest of the Picture
  • Limited fluency
    • CBM in oral reading: 28 WCPM in Grade 1 text
  • Very poor decoding skills
    • WIAT-II Pseudoword Decoding: 1 correct
  • Attentional and persistence problems
    • "Can I take the test home?"
  • Diagnosis? Severe decoding problem obscured by a
    small memorized sight vocabulary and good
    language skills

71
The Golden Rule of Assessment
  • The best-designed assessment with the most
    reliable and valid measures administered by the
    best-trained assessor won't change a child's
    reading trajectory . . .
  • unless someone in the child's life does
    something different.

72
"Look, Dr. Rathvon! I'm READING!"
73
Best Practices in Developing Timely Interventions
  • Identify specific areas of deficiency in fluency,
    phonics, phonemic awareness, comprehension,
    and/or vocabulary.
  • Specify the desired levels of performance in each
    area.
  • Describe the instructional, programmatic, and
    support services to be provided.
  • Specify the methods and schedule for progress
    monitoring.

74
Best Practices, II
  • Account for attentional, behavioral, and
    environmental variables in diagnosis and
    intervention planning.
  • Include parents as partners in planning,
    implementing, and evaluating interventions.
  • Provide explicit instruction and performance
    feedback to help intervention agents deliver
    interventions as planned (treatment fidelity).

75
The Ultimate Goal