Jamal Abedi - PowerPoint PPT Presentation

Title: Jamal Abedi


1
Psychometric Issues in the ELL Assessment and
Special Education Eligibility
English Language Learners Struggling to Learn:
Emergent Research on Linguistic Differences and
Learning Disabilities
Jamal Abedi
National Center for Research on Evaluation,
Standards, and Student Testing
UCLA Graduate School of Education & Information Studies
November 18, 2004
2
Why Should English Language Learners be Assessed?
  • Goals 2000
  • Titles I and VII of the Improving America's Schools
    Act of 1994 (IASA)
  • No Child Left Behind Act

3
Should Schools Test English Language Learners?
Yes
General Problems: English language learners
(ELLs) can be placed at a disadvantage because
  • Assessment outcomes may not be valid because
    their low level of English proficiency
    interferes with their performance on
    content knowledge
  • Test results affect decisions regarding promotion
    or graduation
  • They may be inappropriately placed into special
    educational programs where they receive
    inappropriate instruction
  • ELL students may not have received the same
    curriculum which is assumed for the test

4
Should Schools Test English Language Learners?
Yes
Problems in Large-Scale Assessment
  • Standardized assessment
    • Assessment tools in large-scale assessments are
      usually constructed based on norms that exclude
      ELL populations
    • Research shows major differences between the
      performance of ELL and non-ELL students on
      standardized large-scale assessments
    • The tests may be biased in favor of non-ELL
      populations
  • Performance/alternative assessment
    • Such assessments require more language
      production, so students with lower language
      capabilities are at a greater disadvantage
    • Scorers may not be familiar with rating ELL
      performance

5
Should Schools Test English Language Learners?
No
Problems
  • Due to the powerful impact of assessment on
    instruction, the quality of instruction for ELL
    and SWD students may be affected
  • If excluded, they will be dropped from the
    accountability picture
  • Institutions will not be held responsible for
    their performance in school
  • They will not be included in state or federal
    policy decisions
  • Their academic progress, skills, and needs may
    not be appropriately assessed

6
States with the Highest Proportion of ELL Students
  • Percentage of Total Student Population
  • California 27.0
  • New Mexico 19.0
  • Arizona 15.4
  • Alaska 15.0
  • Texas 14.0
  • Nevada 11.8
  • Florida 10.7

7
Problems in AYP Reporting: Focus on LEP Students
  1. Problems in classification/reclassification of
    LEP students (moving target subgroup)
  2. Measurement quality
  3. Low baseline
  4. Instability of the LEP subgroup
  5. Sparse LEP population
  6. LEP cutoff points (Conjunctive vs. Compensatory
    model)
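The difference between the two cutoff models named in item 6 can be sketched in a few lines. This is only an illustration: the subtest names, the cut scores, and the student's scores are hypothetical and do not come from the presentation. Under a conjunctive model a student must reach the cut score on every subtest; under a compensatory model a high score on one subtest can offset a low score on another.

```python
from typing import Dict

# Hypothetical subtest cut scores (not taken from the presentation).
CUTOFFS: Dict[str, float] = {"reading": 40.0, "writing": 40.0, "listening": 40.0}


def conjunctive_pass(scores: Dict[str, float], cutoffs: Dict[str, float]) -> bool:
    """Conjunctive model: the cut score must be reached on every subtest."""
    return all(scores[name] >= cut for name, cut in cutoffs.items())


def compensatory_pass(scores: Dict[str, float], cutoffs: Dict[str, float]) -> bool:
    """Compensatory model: the mean score must reach the mean cut score,
    so strength on one subtest can make up for weakness on another."""
    mean_score = sum(scores[name] for name in cutoffs) / len(cutoffs)
    mean_cut = sum(cutoffs.values()) / len(cutoffs)
    return mean_score >= mean_cut


# Hypothetical student: below the cut in reading, above it elsewhere.
student = {"reading": 35.0, "writing": 48.0, "listening": 45.0}
print("conjunctive:", conjunctive_pass(student, CUTOFFS))
print("compensatory:", compensatory_pass(student, CUTOFFS))
```

The same student is classified differently under the two models, which is one way the size of the LEP subgroup can shift with the choice of cutoff rule.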

8
Site 2 Stanford 9 Sub-scale Reliabilities (1998)
Grade 9 Alphas
Non-LEP Students Non-LEP Students
Sub-scale (Items)        Hi SES    Low SES   English Only   FEP      RFEP     LEP
Reading, N               205,092   35,855    181,202        37,876   21,869   52,720
  Vocabulary (30)        .828      .781      .835           .814     .759     .666
  Reading Comp (54)      .912      .893      .916           .903     .877     .833
  Average Reliability    .870      .837      .876           .859     .818     .750
Math, N                  207,155   36,588    183,262        38,329   22,152   54,815
  Total (48)             .899      .853      .898           .898     .876     .802
Language, N              204,571   35,866    180,743        37,862   21,852   52,863
  Mechanics (24)         .801      .759      .803           .802     .755     .686
  Expression (24)        .818      .779      .812           .804     .757     .680
  Average Reliability    .810      .769      .813           .803     .756     .683
Science, N               163,960   28,377    144,821        29,946   17,570   40,255
  Total (40)             .800      .723      .805           .778     .716     .597
Social Science, N        204,965   36,132    181,078        38,052   21,967   53,925
  Total (40)             .803      .702      .805           .784     .722     .530
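The alphas in this table are internal-consistency coefficients. As a minimal sketch of how such a coefficient is computed, the function below implements the standard Cronbach's alpha formula on a toy item-response matrix. The data are invented for illustration and are unrelated to the Stanford 9 results; the function name is also an assumption.

```python
from statistics import pvariance
from typing import Sequence


def cronbach_alpha(item_scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha, where item_scores[i][j] is examinee i's score on item j.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(item_scores[0])
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Toy data: 5 examinees, 4 dichotomous items (invented for illustration).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

A lower alpha for one subgroup, as in the LEP column, means that a larger share of that subgroup's observed-score variance is attributed to error.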
9
Classical Test Theory Reliability
  • $\sigma^2_X = \sigma^2_T + \sigma^2_E$

X = Observed Score, T = True Score, E = Error Score
$r_{XX} = \sigma^2_T / \sigma^2_X$
$r_{XX} = 1 - \sigma^2_E / \sigma^2_X$
Textbook examples of possible sources that
contribute to the measurement error:
Rater, Occasion, Item, Test Form
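The two expressions for reliability on this slide can be checked with a small simulation. This is a sketch under stated assumptions: true scores and errors are drawn independently from normal distributions with hypothetical standard deviations of 10 and 5, so the theoretical reliability is 100 / (100 + 25) = 0.8.

```python
import random
from statistics import pvariance

random.seed(0)  # fixed seed so the run is repeatable
n = 20_000

# Hypothetical population: true-score SD 10, error SD 5, independent draws.
true_scores = [random.gauss(50.0, 10.0) for _ in range(n)]
errors = [random.gauss(0.0, 5.0) for _ in range(n)]
observed = [t + e for t, e in zip(true_scores, errors)]

var_t = pvariance(true_scores)
var_e = pvariance(errors)
var_x = pvariance(observed)

# Reliability as true-score variance over observed-score variance ...
rel_from_true = var_t / var_x
# ... and as one minus the error share of observed-score variance.
rel_from_error = 1.0 - var_e / var_x

print(round(rel_from_true, 3), round(rel_from_error, 3))
```

Both estimates should land close to 0.8. They differ slightly from each other because the sample covariance between true scores and errors is not exactly zero.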
10
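Classical Test Theory Reliability
  • $\sigma^2_X = \sigma^2_T + \sigma^2_E$
  • $\sigma^2_X = \sigma^2_T + \sigma^2_E + \sigma^2_S + \sigma_{ES}$

$r_{XX} = 1 - (\sigma^2_E + \sigma^2_S + \sigma_{ES}) / \sigma^2_X$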
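A short numeric sketch shows what the extended decomposition implies: when an additional source S and its covariance with E are added to the error side, reliability drops even though true-score variance is unchanged. The function follows the slide's expression term by term; the variance values are hypothetical and chosen only for easy arithmetic.

```python
def reliability(var_true: float, var_error: float,
                var_s: float = 0.0, cov_es: float = 0.0) -> float:
    """Reliability following the slide's expression:
    r_XX = 1 - (var_E + var_S + cov_ES) / var_X,
    with var_X = var_T + var_E + var_S + cov_ES.
    """
    var_observed = var_true + var_error + var_s + cov_es
    return 1.0 - (var_error + var_s + cov_es) / var_observed


# Hypothetical variance components (not from the presentation).
baseline = reliability(var_true=80.0, var_error=20.0)
with_extra_source = reliability(var_true=80.0, var_error=20.0,
                                var_s=15.0, cov_es=5.0)

print(round(baseline, 3), round(with_extra_source, 3))
```

With these illustrative numbers the error share rises from 20 of 100 to 40 of 120, so reliability falls from 0.8 to about 0.667.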
11
Generalizability Theory: Partitioning Error
Variance into Its Components
  • $\sigma^2(X_{pro}) = \sigma^2_p + \sigma^2_r + \sigma^2_o + \sigma^2_{pr} + \sigma^2_{po} + \sigma^2_{ro} + \sigma^2_{pro,e}$

p = Person, r = Rater, o = Occasion
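Once the variance components are separated, they can be recombined to see how dependable scores would be under different measurement designs. The sketch below uses the standard generalizability coefficient for relative decisions in a crossed person x rater x occasion design. The coefficient formula is not on the slide, and the component values are hypothetical.

```python
def relative_g_coefficient(var_p: float, var_pr: float, var_po: float,
                           var_pro_e: float, n_raters: int,
                           n_occasions: int) -> float:
    """Generalizability coefficient for relative decisions (p x r x o design).

    Only components that interact with persons enter the relative error:
    error = var_pr / n_r + var_po / n_o + var_pro_e / (n_r * n_o)
    """
    rel_error = (var_pr / n_raters
                 + var_po / n_occasions
                 + var_pro_e / (n_raters * n_occasions))
    return var_p / (var_p + rel_error)


# Hypothetical variance components (not from the presentation).
components = dict(var_p=0.50, var_pr=0.10, var_po=0.05, var_pro_e=0.20)

g_1x1 = relative_g_coefficient(**components, n_raters=1, n_occasions=1)
g_2x2 = relative_g_coefficient(**components, n_raters=2, n_occasions=2)
print(round(g_1x1, 3), round(g_2x2, 3))
```

Averaging over more raters and occasions shrinks each error component, so the coefficient rises from about 0.588 to 0.8 in this illustration.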
Are there any sources of measurement error that
may specifically influence ELL performance?
12
Grade 11 Stanford 9 Reading and Science
Structural Modeling Results (df = 24), Site 3
                      All Cases     Even Cases    Odd Cases     Non-LEP       LEP
                      (N = 7,176)   (N = 3,588)   (N = 3,588)   (N = 6,932)   (N = 244)
Goodness of Fit
  Chi Square          1786          943           870           1675          81
  NFI                 .931          .926          .934          .932          .877
  NNFI                .898          .891          .904          .900          .862
  CFI                 .932          .928          .936          .933          .908
Factor Loadings
Reading Variables
  Composite 1         .733          .720          .745          .723          .761
  Composite 2         .735          .730          .741          .727          .713
  Composite 3         .784          .779          .789          .778          .782
  Composite 4         .817          .722          .712          .716          .730
  Composite 5         .633          .622          .644          .636          .435
Math Variables
  Composite 1         .712          .719          .705          .709          .660
  Composite 2         .695          .696          .695          .701          .581
  Composite 3         .641          .628          .654          .644          .492
  Composite 4         .450          .428          .470          .455          .257
Factor Correlation
  Reading vs. Math    .796          .796          .795          .797          .791
Note. NFI = Normed Fit Index. NNFI = Non-Normed
Fit Index. CFI = Comparative Fit Index.
13
Normal Curve Equivalent Means and Standard
Deviations for Students in Grades 10 and 11, Site
3 School District
                 Reading         Science         Math
                 M      SD       M      SD       M      SD
Grade 10
  SWD only       16.4   12.7     25.5   13.3     22.5   11.7
  LEP only       24.0   16.4     32.9   15.3     36.8   16.0
  LEP and SWD    16.3   11.2     24.8    9.3     23.6    9.8
  Non-LEP/SWD    38.0   16.0     42.6   17.2     39.6   16.9
  All students   36.0   16.9     41.3   17.5     38.5   17.0
Grade 11
  SWD only       14.9   13.2     21.5   12.3     24.3   13.2
  LEP only       22.5   16.1     28.4   14.4     45.5   18.2
  LEP and SWD    15.5   12.7     26.1   20.1     25.1   13.0
  Non-LEP/SWD    38.4   18.3     39.6   18.8     45.2   21.1
  All students   36.2   19.0     38.2   18.9     44.0   21.2
14
Site 2 Grade 7 SAT 9 Subsection Scores
Subgroup Reading Math Language Spelling
LEP Status
LEP
Mean 26.3 34.6 32.3 28.5
SD 15.2 15.2 16.6 16.7
N 62,273 64,153 62,559 64,359
Non-LEP
Mean 51.7 52.0 55.2 51.6
SD 19.5 20.7 20.9 20.0
N 244,847 245,838 243,199 246,818

SES
Low SES
Mean 34.3 38.1 38.9 36.3
SD 18.9 17.1 19.8 20.0
N 92,302 94,054 92,221 94,505
Higher SES
Mean 48.2 49.4 51.7 47.6
SD 21.8 21.6 22.6 22.0
N 307,931 310,684 306,176 312,321
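One way to compare the LEP and non-LEP gaps across subjects is to express each mean difference in pooled standard deviation units (Cohen's d). The sketch below applies that standard formula to the Reading and Math summary statistics in the table above. The effect-size calculation itself is an added illustration and is not part of the original slides.

```python
from math import sqrt


def cohens_d(m1: float, s1: float, n1: int,
             m2: float, s2: float, n2: int) -> float:
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var)


# Non-LEP vs. LEP summary statistics from the Site 2 Grade 7 table.
reading_d = cohens_d(51.7, 19.5, 244_847, 26.3, 15.2, 62_273)
math_d = cohens_d(52.0, 20.7, 245_838, 34.6, 15.2, 64_153)

print(f"Reading gap: {reading_d:.2f} SD units")
print(f"Math gap:    {math_d:.2f} SD units")
```

On these figures the standardized gap is larger in Reading than in Math, which is consistent with the presentation's point that language demand affects ELL performance.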


15
Site 4 Grade 8 Descriptive Statistics for the
SAT 9 Test Scores by Strands
Reading Math Math Calculation Math Analytical
Non-LEP/Non-SWD
Mean 45.63 49.30 49.09 48.75
SD 21.10 20.47 20.78 19.61
N 9217 9118 9846 9250

LEP only
Mean 20.26 36.00 39.20 33.86
SD 16.39 18.48 21.25 16.88
N 692 687 696 699

SWD only
Mean 18.86 27.82 28.42 29.10
SD 19.70 14.10 15.76 15.14
N 872 843 883 873

LEP/SWD
Mean 9.78 21.37 22.75 22.87
SD 11.50 10.75 12.94 12.06
N 93 92 97 94


16
Accommodations for SWD/LEP
  • Accommodations that are appropriate for the
    particular subgroup should be used

17
Why Should English Language Learners be
Accommodated?
  • Their possible English language deficiency may
    interfere with their content knowledge
    performance.
  • Assessment tools may be culturally and
    linguistically biased for these students.
  • Linguistic complexity of the assessment tools may
    be a source of measurement error.
  • Language factors may be a source of construct
    irrelevant variance.

18
SY 2000-2001 Accommodations Designated for ELLs
Cited in States' Policies
There are 73 accommodations listed.
N = Not Related, R = Remotely Related,
M = Moderately Related, H = Highly Related
From Rivera (2003), State assessment policies for
English language learners. Presented at the 2003
Large-Scale Assessment Conference.
19
SY 2000-2001 Accommodations Designated for ELLs
Cited in States' Policies
I. Timing/Scheduling (N = 5)
  N  1. Test time increased
  N  2. Breaks provided
  N  3. Test schedule extended
  N  4. Subtests flexibly scheduled
  N  5. Test administered at time of day most
        beneficial to test-taker
N = not related, R = remotely related,
M = moderately related, H = highly related
20
There are 73 Accommodations Listed
  • 47 (64%) are not related
  • 7 (10%) are remotely related
  • 8 (11%) are moderately related
  • 11 (15%) are highly related
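The percentages on this slide follow directly from the counts. A quick arithmetic check, using only the numbers shown above:

```python
# Counts of accommodations by relevance rating, as listed on the slide.
counts = {
    "not related": 47,
    "remotely related": 7,
    "moderately related": 8,
    "highly related": 11,
}

total = sum(counts.values())
shares = {label: round(100 * n / total) for label, n in counts.items()}

print("total:", total)
for label, pct in shares.items():
    print(f"{label}: {counts[label]} of {total} = {pct}%")
```

The counts add up to the 73 accommodations listed, and the rounded shares match the 64, 10, 11, and 15 percent figures.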

21
A Clear Language of Instruction and Assessment
Works for ELLs, SWDs, and Everyone
  • What is language modification of test items?

22
Examining Complex Linguistic Features in
Content-Based Test Items
23
Linguistic Modification Concerns
  • Familiarity/frequency of non-math vocabulary:
    unfamiliar or infrequent words changed
    • census > video game
    • A certain reference file > Mack's company
  • Length of nominals: long nominals shortened
    • last year's class vice president > vice
      president
    • the pattern of puppy's weight gain > the pattern
      above
  • Question phrases: complex question phrases
    changed to simple question words
    • At which of the following times > When
    • which is the best approximation of the number >
      approximately how many
24
Linguistic Modification cont.
  • Voice of verb phrase: passive verb forms
    changed to active
    • The weights of 3 objects were compared >
      Sandra compared the weights of 3 rabbits
    • If a marble is taken from the bag > if you take
      a marble from the bag
  • Conditional clauses: conditionals either
    replaced with separate sentences or order of
    conditional and main clause changed
    • If Lee delivers x newspapers > Lee delivers x
      newspapers
    • If two batteries in the sample were found to be
      dead > he found three broken pencils in the
      sample
  • Relative clauses: relative clauses either
    removed or re-cast
    • A report that contains 64 sheets of paper > He
      needs 64 sheets of paper for each report

25
Original:
2. The census showed that three hundred fifty-six thousand,
ninety-seven people lived in Middletown. Written as a
number, that is
A. 350,697
B. 356,097
C. 356,907
D. 356,970
Modified:
2. Janet played a video game. Her score was three hundred
fifty-six thousand, ninety-seven. Written as a number, that is
A. 350,697
B. 356,097
C. 356,907
D. 356,970
26
Interview Study
Table 1. Student Perceptions Study: First Set (N = 19)
Item   Original item chosen   Revised item chosen
1      3                      16
2      4                      15
3      10                     9
4      11                     8
Table 2. Student Perceptions Study: Second Set (N = 17)
Item   Original item chosen   Revised item chosen
5      3                      14
6      4.5a                   12.5
7      2                      15
8      2                      15
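The preference counts are easier to compare across the two sets when expressed as shares of each set's sample. The sketch below does only that arithmetic on the counts in Tables 1 and 2; the variable names are illustrative.

```python
# (item number, students choosing the revised item, set sample size),
# taken from Tables 1 and 2.
revised_choices = [
    (1, 16, 19), (2, 15, 19), (3, 9, 19), (4, 8, 19),
    (5, 14, 17), (6, 12.5, 17), (7, 15, 17), (8, 15, 17),
]

# Percentage of students in each set who chose the revised item.
revised_share = {item: round(100 * chosen / n)
                 for item, chosen, n in revised_choices}

for item, pct in revised_share.items():
    print(f"Item {item}: revised version chosen by {pct}% of students")

# Items where the revised version was chosen by more than half.
majority_items = [item for item, pct in revised_share.items() if pct > 50]
print("Revised item preferred by a majority:", majority_items)
```

The revised version was chosen by a majority on six of the eight items; items 3 and 4 are the exceptions.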

27
Many students indicated that the language in the
revised item was easier
  • Well, it makes more sense.
  • It explains better.
  • Because that one's more confusing.
  • It seems simpler. You get a clear idea of
    what they want you to do.

28
Issues in the ELL Special Education Eligibility
  • Issues concerning authenticity of English
    language proficiency tests
  • Issues and problems in identifying students with
    learning disability in general
  • Distribution of English language proficiency
    across ELL/non-ELL student categories

29
Issues concerning authenticity of English
language proficiency tests
  • Issues in theoretical bases (discrete-point
    approach, holistic approach, pragmatic approach)
  • Issues in content coverage (language proficiency
    standards)
  • Issues concerning psychometrics of the assessment
  • Low relationship between ELL classification
    categories and English proficiency scores

30
Issues and problems in identifying students with
learning disability in general
  • A large majority of students with disabilities
    fall in the learning disability category
  • Validity of identifying students with learning
    disability is questionable

31
Distribution of English language proficiency
across ELL/non-ELL student categories
  • Most of the existing tests of English proficiency
    lack sufficient discrimination power
  • A large number of ELL students perform
    higher than non-ELL students
  • The line between ELL and non-ELL students on
    English proficiency is not clear

32
Reducing the Language Load of Test Items
  • Reducing unnecessary language complexity of test
    items helps ELL students (and to some extent
    SWDs) present a more valid picture of their
    content knowledge.
  • The language clarification of test items may be
    used as a form of accommodation for English
    language learners.
  • The results of our research suggest that
    linguistic complexity of test items may be a
    significant source of measurement error for ELL
    students.

33
Conclusions and Recommendation
1. Classification Issues
  • Classifications of ELLs and SWDs
  • Must be based on multiple criteria that have
    predictive power for such classifications
  • These criteria must be objectively defined
  • Must have sound theoretical and practical bases
  • Must be easily and objectively measurable

34
Conclusions and Recommendation
2. Assessment Issues
  • Assessment for ELLs and SWDs
  • Must be based on sound psychometric principles
  • Must be controlled for all sources of nuisance or
    confounding variables
  • Must be free of unnecessary linguistic
    complexities
  • Must include a sufficient number of ELLs and SWDs
    in its development process (field testing,
    standard setting, etc.)
  • Must be free of biases, such as cultural biases
  • Must be sensitive to students' linguistic and
    cultural needs

35
3. Issues concerning special education
eligibility, particularly in placing ELL students
at the lower levels of English language proficiency
in the learning/reading disability category
  • There are psychometric issues with the English
    language proficiency tests
  • Standardized achievement tests may not provide
    reliable and valid assessment of ELL students
  • Reliable and valid measures are needed to
    distinguish between learning disability and low
    level of English proficiency

36
Conclusions and Recommendation
4. Accommodation Issues
  • Accommodations
  • Must be relevant to the subgroups of students
  • Must be effective in reducing the performance gap
    between accommodated and non-accommodated
    students
  • Must be valid, that is, accommodations should not
    alter the construct being measured
  • The results could be combined with the
    assessments under standard conditions
  • Must be feasible in the national and state
    assessments

37
Now for a visual art representation of invalid
accommodations