Title: Assessing Structural and Metric Equivalence: A Case Study
1. Assessing Structural and Metric Equivalence: A Case Study
- Fons J. R. van de Vijver
  - Tilburg University, the Netherlands, and North-West University (Potchefstroom Campus), South Africa
- Chantale Jeanrie
  - Laval University, Canada
2. Outline
- Theoretical and Methodological Background
  - Structural and metric equivalence in translations/adaptations
- Example
  - Adaptation of the California Psychological Inventory (CPI-434) for use among French-Canadians
- Conclusion
3. Theoretical and Methodological Background
- Crucial concept in translations/adaptations is equivalence
- Linguistic
  - Mapping of linguistic aspects of meaning (word meaning, sentence meaning)
- Psychological
  - Mapping of psychological meaning (serves the same psychological function in all languages?)
- A good translation/adaptation combines these considerations
4. Equivalence in Adaptations
- Structural Equivalence
  - Does the instrument measure the same underlying construct in all language versions? → factor analysis
- Metric Equivalence
  - Can scores be compared across all language versions? → Item Bias, also known as Differential Item Functioning (DIF)
5. Example
- Adaptation of the California Psychological Inventory (CPI-434) for use among French-Canadians (Jeanrie & Van de Vijver, in preparation)
- Project modeled on the Guidelines on Adapting Tests of the International Test Commission (www.intestcom.org) (Hambleton, 1994)
6. Participants
- 1129 English-speaking and 1018 French-speaking Canadians
- Mainly college and university students (social science and law)
- Majority of both language groups were female
- The English-Canadian group had an average age of 23.53 years (SD = 7.53), the French-Canadian group an average of 20.96 years (SD = 5.94)
7. Instrument
- The latest version of the California Psychological Inventory (CPI; Gough, 1996)
- 434 items, measuring 20 basic folk scales and 3 vector scales
- Scales: Do (Dominance), Cs (Capacity for Status), Sy (Sociability), Sp (Social Presence), Sa (Self-Acceptance), In (Independence), Em (Empathy), Re (Responsibility), So (Socialization), Sc (Self-Control), Gi (Good Impression), Cm (Communality), Wb (Well-being), To (Tolerance), Ac (Achievement via Conformity), Ai (Achievement via Independence), Ie (Intellectual Efficiency), Py (Psychological-Mindedness), Fx (Flexibility), F/M (Femininity/Masculinity)
- Vector scales are V1 (Externality/Internality), V2 (Norm-Doubting/Norm-Favoring), and V3 (Realization)
- Three scales are meant to detect response styles: faking good, faking bad, and random responding
- The response scale is dichotomous (true/false)
8. Translation/Adaptation Procedure
- Four independently working translators with an academic background in psychology or education
- Both English and French were represented as first languages in the group
- All were given written instructions on the kind of translation expected from them, as well as instructions on how to write test items
9. Adaptation Procedure
- Step 1
  - Each translated item was analyzed by a team of five (other) bilingual judges
  - A four-point scale was used to rate conceptual equivalence: "Compared to the meaning of the original item, the meaning of the translated item is 1) identical, 2) rather similar, 3) rather different, or 4) different."
- Step 2
  - Two researchers combined the results and prepared a preliminary version of the French CPI
  - Many items adapted, few items extensively changed
10. Adaptation Procedure (continued)
- Step 3
  - Pilot of the French version: two research assistants conducted two-hour interviews with twelve participants from Quebec and New Brunswick
- Step 4
  - Composition of the final instrument
11. Results: Internal Consistencies
- Median Cronbach's alpha of the 20 scales is .70 in the French-Canadian group and .69 in the English-Canadian group
- Values quite comparable to
  - each other (two scales showed significantly higher values in the French-Canadian group)
  - U.S.A. values (reported by Gough)
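For readers who want to see how an internal consistency of this kind is computed, the following is a minimal sketch of Cronbach's alpha for one scale of dichotomous items. The function name `cronbach_alpha` and the toy data are illustrative assumptions, not material from the study; the formula is the standard one, alpha = k/(k-1) × (1 - sum of item variances / variance of the total score).

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of item scores."""
    n_items = len(responses[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [
        variance([row[j] for row in responses]) for j in range(n_items)
    ]
    # Sample variance of the total scale score per respondent.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical true/false (1/0) answers of five respondents to a three-item scale.
toy_scale = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 0],
]
print(round(cronbach_alpha(toy_scale), 2))  # 0.46
```

In a two-group comparison such as the one reported here, the function would be applied to each scale separately within each language group, and the resulting values compared.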
12. Results: Construct Equivalence
- To what extent do the scales measure the same constructs in both cultural groups?
- We did not find unequivocal support for Gough's (empirically derived) scales
- 20 scales (Gough) → 31 scales (current study)
14. Equivalence Analyses
- Comparison of factors in 4 groups: male and female English-Canadian and French-Canadian samples
- [Figure: boxplot of values of Tucker's phi]
- Conclusion: strong evidence for structural equivalence
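Tucker's phi is a congruence coefficient between two columns of factor loadings: the sum of cross-products divided by the square root of the product of the sums of squares. The sketch below illustrates the computation only. The loadings are invented, the function name `tucker_phi` is an assumption, and it presumes the factors have already been rotated toward each other (for example by target rotation), a step the slides do not describe.

```python
import math


def tucker_phi(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two vectors of factor loadings."""
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    return cross / norm


# Hypothetical loadings of the same five scales on one factor in two groups.
english = [0.72, 0.65, 0.58, 0.10, -0.05]
french = [0.70, 0.61, 0.63, 0.05, 0.02]
print(round(tucker_phi(english, french), 3))
```

Values of roughly .90 to .95 and above are conventionally read as indicating factorial similarity. With four groups, one coefficient is obtained per factor for each pair of groups, which is the kind of distribution a boxplot can summarize.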
15. Item Bias/DIF
- Uniform and nonuniform bias studied
- Logistic regression analysis
  - Independent variables
    - Culture (2 levels), score level (4 levels)
  - Dependent variable
    - Item response (dichotomous)
- Indicators of bias
  - Effect size, evaluated as the partial correlation between the independent variable (culture or the culture × score level interaction) and the dependent variable; Cohen's cutoff values (conservative): .10, .25, and .40
  - Proportion of significantly biased items
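The logistic regression approach to item bias can be sketched as a comparison of three nested models for a single item: score level only, score level plus culture (uniform bias), and both plus their interaction (nonuniform bias). The code below is an illustration under stated assumptions, not the authors' analysis. It treats score level as a numeric predictor coded 0 to 3, uses likelihood-ratio chi-square tests instead of the partial-correlation effect size named on the slide, and runs on invented counts. All function names are hypothetical.

```python
import math


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def log_likelihood(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson; return its log-likelihood."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row))))
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * row[j] * row[k]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for row, yi in zip(X, y):
        mu = 1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row))))
        ll += math.log(mu) if yi else math.log(1.0 - mu)
    return ll


def dif_tests(level, culture, response):
    """Likelihood-ratio chi-squares (1 df each) for uniform and nonuniform bias."""
    base = [[1.0, s] for s in level]
    uniform = [[1.0, s, g] for s, g in zip(level, culture)]
    nonuniform = [[1.0, s, g, s * g] for s, g in zip(level, culture)]
    ll0 = log_likelihood(base, response)
    ll1 = log_likelihood(uniform, response)
    ll2 = log_likelihood(nonuniform, response)
    return {"uniform": 2 * (ll1 - ll0), "nonuniform": 2 * (ll2 - ll1)}


def expand(cells):
    """Turn {(culture, level): (n, n_endorsing)} counts into per-person lists."""
    level, culture, response = [], [], []
    for (g, s), (n, k) in cells.items():
        level += [float(s)] * n
        culture += [float(g)] * n
        response += [1] * k + [0] * (n - k)
    return level, culture, response


# Hypothetical item: culture 1 endorses it more often at every score level.
counts = {(0, 0): (50, 10), (0, 1): (50, 20), (0, 2): (50, 30), (0, 3): (50, 40),
          (1, 0): (50, 20), (1, 1): (50, 30), (1, 2): (50, 40), (1, 3): (50, 45)}
result = dif_tests(*expand(counts))
print({name: round(chi2, 2) for name, chi2 in result.items()})
```

Each statistic can be compared with 3.84, the .05 critical value of chi-square with one degree of freedom. In practice a statistics package would replace the hand-written fitting routine; it is spelled out here only to keep the example self-contained.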
16. (a) Mean Effect Size and (b) Proportion of Biased Items
(a) Mean Effect Size: M = .03, SD = .01
(b) Proportion of Biased Items: M = .61, SD = .09
17. Correlations of Bias Statistics and Item Characteristics
Note a: Double apostrophes indicate non-literal word usage.
18. Effect Sizes Before and After Removal of Biased Items
19. Conclusion
- Quality of an adaptation is the net result of the quality of various stages and a long chain of interdependent decisions
- Structural Equivalence
  - Strongly supported
- Metric Equivalence
  - Many items showed small bias; their removal hardly influences the size of the cross-cultural differences observed
20. Conclusion (continued)
- Analysis of nature of bias
  - More bias in items
    - that showed a larger difference in means across the two groups,
    - that had lower endorsement rates,
    - that contained words with apostrophes
- The removal of the biased items had a remarkably small effect on the size of the mean differences between the two groups
- Conclusion: combined expertise/skills in language, culture, and research methodology and statistics can yield equivalent instruments