Title: Testing Linguistic Minorities (Linguistically Diverse Populations)
1. Testing Linguistic Minorities (Linguistically Diverse Populations)
- Stephen G. Sireci
- April L. Zenisky
- Center for Educational Assessment, University of Massachusetts Amherst
- Presentation at CCSSO's 36th Annual Conference on Large-Scale Assessment, June 28, 2006
2. Purposes of this presentation
- Review psychometric issues in testing students with limited proficiency in the language in which the test is written.
- Summarize studies that looked into the effects of test modifications on ELL students' test performance.
- Provide suggestions for future research and practice in this area.
3. Interest in multilingualism is very different from the earlier part of the last century
- "If English was good enough for Jesus, it's good enough for the school children of Texas."
- Texas Governor James "Pa" Ferguson (1917), after vetoing a bill to finance the teaching of foreign languages in classrooms.
4. Multilingualism is not limited to K-12 assessment
- Popular Intelligence, Aptitude, and Personality Tests
- TIMSS-R, OECD/PISA, IEA
- SAT, PET, WORKKEYS, NAEP
- Credentialing Exams, e.g., Microsoft and Novell, and many others
Thus, measurement in only a single language is becoming less common!
5. However
- There are many issues involved in testing linguistically diverse populations, and this is a long-recognized problem in the psychometric community.
6. Including ELLs in Federal or State-Mandated Assessments
- Desire (legislative requirement) to include linguistic minorities in assessments
- Historically, ELLs were not included in accountability testing (Coltrane, 2002)
- Problem of English proficiency interfering with measurement of the construct of interest
7. Threats to valid test score interpretation
- Construct underrepresentation
- Construct-irrelevant variance
- "Tests are imperfect measures of constructs because they either leave out something that should be included ... or else include something that should be left out, or both" (Messick, 1989, p. 34)
8. Standards for Educational and Psychological Testing
- "Any test that employs language is, in part, a measure of their language skills ... test results for ELLs may not reflect accurately the qualities and competencies intended to be measured" (AERA et al., 1999, p. 91).
9. Validity Issues in Testing ELLs
- Standards for Educational and Psychological Testing
- 7.7: "In testing applications where the level of linguistic or reading ability is not part of the construct of interest, the linguistic or reading demands of the test should be kept to the minimum necessary for the valid assessment of the intended construct" (AERA et al., 1999, p. 82).
10. Rodriguez (1989)
- Clearly, a test written in English is inadequate to measure the performance of a person who does not understand English well
11. Diana v. California State Board of Education (1970)
- 9 Mexican-American students classified as mentally retarded
- Stanford-Binet
- WISC
- Upon retesting by a bilingual test administrator, their IQs increased 1 SD.
- Ruled that students must be tested in their native language
12. Rodriguez (1989)
- Another issue of potential bias is interpreting ELL performance on tests normed on nonminority, white, middle-class populations
- Testing behaviors are culturally learned behaviors (pp. 12-13).
13. So there are issues
- That's not new.
- What can be done?
14. Validity Issues in Testing ELLs
- Standards for Educational and Psychological Testing
- 9.1: "Testing practice should be designed to reduce threats to the reliability and validity of test score inferences that may arise from language differences" (AERA et al., 1999, p. 97).
15. Including ELLs in Federal or State-Mandated Assessments
- Strategies for inclusion
- Adapted (translated) tests
- Dual language test booklets
- Test modifications (accommodations, adaptations)
16. Test Accommodations for ELLs
- Linguistic modification
- Simplified English
- Modified English
- Dictionaries
- Customized
- Bilingual
- Glosses
- Dual-language, translated tests
- Extended time
17. Validity Issues in Test Accommodations
- Does the accommodation change the construct measured?
- But also
- Do standardized conditions inhibit measurement of the construct for some or all students?
18. Please note
- Adaptation versus accommodation
- Similar, if not identical, methodological issues
- Same methods can be used for evaluating
- Cross-cultural differences within a single language version of a test
- Test accommodations for individuals with disabilities
19. Important research questions for accommodated tests
- Has the accommodation changed the construct measured?
- Speed
- Different skill
- Do test scores from accommodated and non-accommodated administrations have the same meaning?
20. Research on test accommodations for ELLs
- Little empirical study (Abedi and colleagues conducted the most extensive research)
- Psychometric issues (Geisinger, 1994)
- Legal issues (Phillips, 1994)
21. Review of previous studies
- Sireci, Li, & Scarpati (2003)
- Commissioned review of the effects of test accommodation on test performance
- NAS/NRC BOTA
- Looked at both SWD and ELL
22. Characteristics of studies
Note: Literature reviews and issues papers are not included in this table.
23. Results: ELLs
- Glosses and customized dictionaries
- Seem to have positive effect
- Linguistic modification
- Equivocal: some studies show gains, others do not
24. Results: ELLs (2)
- Extended time
- Seems to help students, but confounded with other conditions
- Dual-language
- Unclear
- Most students used one language
- Test adaptations (translations)
- Studies looking at construct equivalence provide mixed results.
25. Discussion
- Review shows effects of test accommodations are mixed.
- Tremendous variability across
- Accommodation conditions and how they were implemented
- Student groups (within and between)
- Results
26. Discussion (2)
- Results for linguistic modification are promising, but inconsistent
- Future research should look at which types of simplification seem to work best
- Glosses and dictionaries had small, but consistent effects across studies.
- More studies are needed
27. Discussion (3)
- For translated/adapted tests and dual-language tests
- Not sure of increase in validity, but if tests are properly translated, results across different language versions can be comparable
- Solano-Flores et al. (2002) and others recommend concurrent development of tests in multiple languages.
28. Discussion (4)
- For extended time, positive effects often seen for ELL and non-ELL groups.
- Speededness factor
29. Future directions
- Universal test design
- Build tests that are accessible to all.
- i.e., that do not need to be accommodated.
- CBT could be particularly helpful in this regard.
30. Future directions
- UTD and language simplification are closely
related to just plain good test development
practices.
31. Future directions
- AERA et al. (1999) Standards: "it is important to consider language background in developing, selecting, and administering tests and in interpreting test performance" (p. 91).
- UTD and concurrent development (translation) meet this standard.
- We need valid assessments of ELLs to evaluate ELL instruction.
32. Suggestions in testing ELLs
- Coltrane (2002)
- Ensure tests reflect curriculum
- Teach test-taking skills to ELLs
- Use multiple measures
- Abedi (2001)
- Ensure students are tested in the language in which they are instructed
- Monitor accommodations
- More research is needed
33. Recommended reading
- Jamal's research!
- Strategies to Assess the Core Academic Knowledge of English Language Learners
- (Rabinowitz, Ananda, & Bell, 2004)
- http://www.testpublishers.org/journal.htm
- The Technical Adequacy of Assessments for Alternate Student Populations
- (Rabinowitz & Sato, 2005)
34. Thanks!
Please contact me (azenisky_at_educ.umass.edu) or Steve Sireci (sireci_at_acad.umass.edu) with questions / comments!