Title: Measurement class 11
3. Measuring students' skills: PISA
- Cf. class 10: the OECD became the dominant source for the measurement of education understood as an outcome: individual skills
- 1995: International Adult Literacy Survey (IALS)
- 2000: PISA
- Cornerstone of these measurement tools: measuring skills, not performance
4. Measuring students' skills: PISA
- Psychometrics
- Measured score = real score + measurement error
- Historically, attempts at measuring intelligence (IQ)
- The skill is the hidden factor, partially revealed by tests
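The decomposition "measured score = real score + measurement error" can be illustrated with a small simulation. This is only a sketch: the score distribution (mean 500, SD 100), the error SD of 30 and the function name `simulate_measured_scores` are assumptions chosen for illustration, not PISA parameters.

```python
import random
import statistics

def simulate_measured_scores(true_scores, error_sd, seed=0):
    """Add independent, zero-mean Gaussian measurement error to each true score."""
    rng = random.Random(seed)
    return [t + rng.gauss(0.0, error_sd) for t in true_scores]

# Hypothetical "real" (hidden) skills: mean 500, SD 100.
rng = random.Random(1)
true_scores = [rng.gauss(500, 100) for _ in range(5000)]

# What the test reveals: the hidden skill blurred by error (assumed SD 30).
measured = simulate_measured_scores(true_scores, error_sd=30)

# Share of the measured variance that comes from the real score.
var_true = statistics.pvariance(true_scores)
var_measured = statistics.pvariance(measured)
reliability = var_true / var_measured
print(f"share of measured variance due to real skill: {reliability:.2f}")
```

Under these assumptions the share should be close to 100^2 / (100^2 + 30^2), i.e. a little above 0.9: the test reveals the hidden factor only partially.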
6. Measuring students' skills: PISA
- Germany: "PISA shock" (huge debate)
- England: "Are we not such dunces after all?" (Times, Dec. 5, 2000)
- Again, much debate about rankings
- So what does PISA tell us, and what does it not?
7. Measuring students' skills: PISA
- Goal: measuring the ability to use knowledge in a practical setting. "PISA's aim of tapping students' preparedness for life."
- Ex. reading: "The capacity of an individual to understand, use and reflect on written texts in order to achieve one's goals, to develop one's knowledge and potential, and to participate in society."
8. Measuring students' skills: PISA
- Math: "The capacity of an individual to identify and understand the role that mathematics plays in the world, to make well-founded judgements and to use and engage with mathematics in ways that meet the needs of that individual's life as a constructive, concerned and reflective citizen."
9. Measuring students' skills: PISA
- Items are designed by experts to measure certain skills
- They are then ranked on a difficulty scale according to the percentage of test students who succeeded
- Students in the actual survey (not the same students) are then given a score
- A student's score in turn predicts the probability of success on items of each difficulty level (5 levels)
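The link between a student's score and the probability of success at each difficulty level can be sketched as follows. The slides do not specify the response function, so the one-parameter logistic (Rasch-type) form used here is an assumption, as are the five difficulty values and the names `p_success`, `level_difficulty` and `success_profile`.

```python
import math

def p_success(ability, difficulty):
    """Assumed logistic link: probability of success rises with ability - difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Hypothetical item difficulties for the five levels, on an arbitrary logit scale.
level_difficulty = {1: -2.0, 2: -1.0, 3: 0.0, 4: 1.0, 5: 2.0}

def success_profile(ability):
    """Predicted probability of success on a typical item of each level."""
    return {level: p_success(ability, d) for level, d in level_difficulty.items()}

# A student slightly above the middle of the scale.
profile = success_profile(ability=0.5)
for level, p in profile.items():
    print(f"level {level}: {p:.2f}")
```

Whatever the exact functional form, the point is the one made on the slide: a single score summarizes a whole profile of success probabilities, decreasing as items get harder.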
12. Measuring students' skills: PISA
- Scores are standardized
- Average score of OECD countries: 500
- Standard deviation: 100 (20% of the mean)
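Standardization of this kind is a linear rescaling. A minimal sketch, assuming the raw scores of the reference population are available as a plain list (the function name `standardize` and the toy raw scores are illustrative, not PISA's actual procedure):

```python
import statistics

def standardize(raw_scores, target_mean=500.0, target_sd=100.0):
    """Linearly rescale scores so the reference group has the target mean and SD."""
    mean = statistics.fmean(raw_scores)
    sd = statistics.pstdev(raw_scores)
    return [target_mean + target_sd * (x - mean) / sd for x in raw_scores]

# Toy raw scores for a hypothetical reference population.
raw = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = standardize(raw)
print([round(s, 1) for s in scaled])
```

Only the position relative to the reference group survives this transformation: a score of 600 means "one standard deviation above the reference mean", nothing more absolute than that.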
13. Measuring students' skills: PISA
- Items are tested in different countries and eliminated if they present a cultural or gender bias
- In the end, is there a cultural bias?
- It has been convincingly argued that there is not
- Stimuli are 15% longer in French, but French Canadians do as well as the best English-speaking Canadian provinces
- Differences such as US / English-speaking Canada
- Type of question? Not really (the French score well on multiple-choice questions)
14. Measuring students' skills: PISA
- Other sources of bias?
- Only the 15-year-olds currently attending school are covered
- UK: 95%
- France: 90%
- Brazil: 55% (OECD partner state)
- Mexico: 54% (OECD member)
15. Measuring students' skills: PISA
- Other sources of bias?
- Target population: remote schools, disabled youth or non-native speakers could be left out, but within a limit of less than 5% of the population
- But some countries left out more than 5%
- Response rates? Very low in the UK in 2000 and 2003 -> results excluded from PISA international comparisons
16. Measuring students' skills: PISA
- Main source of uncertainty: sampling error
- Most countries: +/- 5 points on the average score
- Ex. France (Grenet, working paper)
- Ranked somewhere between 18th and 28th (out of 56)
- Mean score not statistically significantly different from that of 13 other countries (out of 55)
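Why a ranking can be this fuzzy becomes clearer with a simple test of the difference between two country means. This is a sketch under stated assumptions: independent samples, a normal approximation, and hypothetical means and standard errors (a standard error of 2.5 points corresponds roughly to a +/- 5 point interval).

```python
import math

def confidence_interval(mean, se, z=1.96):
    """Approximate 95% confidence interval for a country mean."""
    return (mean - z * se, mean + z * se)

def significantly_different(mean_a, se_a, mean_b, se_b, z=1.96):
    """Two-sided z-test on the difference of two independent means."""
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) > z * se_diff

# Hypothetical countries, each with a standard error of 2.5 points.
low, high = confidence_interval(500, 2.5)
print(f"95% interval for a mean of 500: {low:.1f} to {high:.1f}")
print(significantly_different(505, 2.5, 500, 2.5))  # 5-point gap
print(significantly_different(520, 2.5, 500, 2.5))  # 20-point gap
```

With these numbers a 5-point gap cannot be distinguished from sampling noise while a 20-point gap can, which is why a country's rank is better reported as a range than as a single position.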
17. Measuring students' skills: PISA
- Differences:
- between France and Australia: 32 points
- between Australia and Argentina: 136 points
- Difference between 2 levels: 70 points
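To put these gaps on a common footing, they can be expressed in standard deviations (using the standard deviation of 100 from the standardization of scores) and in proficiency levels (70 points per level). A worked conversion:

```latex
\begin{aligned}
\text{France vs. Australia:} \quad & \frac{32}{100} = 0.32 \ \text{SD}, \qquad \frac{32}{70} \approx 0.46 \ \text{level} \\
\text{Australia vs. Argentina:} \quad & \frac{136}{100} = 1.36 \ \text{SD}, \qquad \frac{136}{70} \approx 1.94 \ \text{levels}
\end{aligned}
```

The first gap is less than half a level; the second is almost two full levels.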
18. Measuring skills: PISA
- Educational context
- Skills, and not the contents of a curriculum, but the two are linked (skills are taught at school, even if only a little!)
- Ex. the first notions of probability come during the 5th year of high school in France
19. Measuring skills: PISA
- The 15-year-olds sampled by PISA did not all receive the same education -> what is being measured isn't the quality of education but many things (e.g. grade repetition). Ex. probabilities
20. Measuring skills: PISA
- Huge gap: ex. on the Reading scale, PISA 200
- On-time students in non-vocational classes score 560 on average (comparable to the Finnish score)
- Students having repeated once: 430 on average, the bottom of international rankings
- PISA sheds light on these structural differences and asks relevant policy questions
21. Measuring skills: PISA
- Motivation effect: PISA 2003 asked students to rate the effort they put into answering the questionnaire, on a 1 to 10 scale
- French students' average: 7/10
- 40th out of 41 participating countries
22. Measuring skills: PISA
- Binet's joke (he was one of the designers of the IQ test): "What is intelligence? Well, it's what my test measures!"
- The DESECO program (1999) asked non-psychologists (philosophers, sociologists, economists, ethnologists) what skills were needed to succeed in today's world
- There is a need for a definition of what is being measured, outside of the definition of the measurement tool
23. A word on non-response
- PISA: 3 causes at least
- Student can't answer -> measurement OK
- Student didn't have time -> intensity of the test, not skill -> must be taken into account in scoring
- Student didn't even try -> motivation effect, measurement problem
- Relatively OK for the assessment of skills within a school context
- Students are used to complying. Problem with adults!
24. A word on non-response
- In general:
- Total non-response: no survey at all
- Partial non-response: parts of the survey or individual items are missing
- High total non-response -> probable bias
- Partial non-response: what does it say?
25. A word on non-response
- Refusal to answer some questions because they are too private: not the most frequent case
- Ex. income. From brackets to actual amount
- Ex. questions on divorce, getting along with one's partner
- Very often, the question is meaningless for the respondent (Bourdieu)
- Ex. "What do you think of US foreign policy regarding Cuba?"
26. A word on non-response
- Answering an opinion question meaningfully means having an opinion, i.e. having already thought about the question
- Conclusions for survey design and for reading survey answers:
- Always leave the possibility to answer "don't know", as distinct from refusal to answer
- When reading tables / regression results, check the number of respondents
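Checking the number of respondents per item is easy to automate. A minimal sketch, assuming answers are stored in a list where the string "DK" codes an explicit "don't know" and `None` codes a refusal or missing answer (both codes, the function name `item_response_summary` and the toy answers are hypothetical):

```python
from collections import Counter

DONT_KNOW = "DK"   # hypothetical code for an explicit "don't know"
REFUSED = None     # hypothetical code for a refusal / missing answer

def item_response_summary(answers):
    """Count valid answers, "don't know" answers and refusals for one item."""
    if not answers:
        raise ValueError("no answers to summarize")
    counts = Counter()
    for a in answers:
        if a is REFUSED:
            counts["refused"] += 1
        elif a == DONT_KNOW:
            counts["dont_know"] += 1
        else:
            counts["valid"] += 1
    n = len(answers)
    return {
        "n_asked": n,
        "n_valid": counts["valid"],
        "dont_know_rate": counts["dont_know"] / n,
        "refusal_rate": counts["refused"] / n,
    }

# Toy answers to one opinion question.
answers = ["agree", "DK", None, "disagree", "DK", "agree", None, "DK"]
summary = item_response_summary(answers)
print(summary)
```

Here only 3 of the 8 people asked gave a substantive answer, so any percentage computed on this item describes those 3 respondents, not the sample.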
27. C. Measuring adults' skills: IALS
28. Measuring adults' skills: IALS
- IALS 1995: a case in point of measurement failure
- Idea: same as what was said in part A on PISA, but applied to adults
- Items of various difficulties; the level of the individual = the level of the questions he answers with .8 probability
- One single scale of literacy. Level 1: has difficulties; illiterate
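The .8 criterion can be made concrete with a worked equation. This assumes a logistic response function with ability \theta and item difficulty b on the same scale; the slides do not specify the functional form, so this is an illustration rather than the IALS scaling model.

```latex
P(\text{success}) = \frac{1}{1 + e^{-(\theta - b)}} = 0.8
\quad\Longrightarrow\quad
\theta - b = \ln\frac{0.8}{0.2} = \ln 4 \approx 1.39
```

Under this assumption, an individual is placed at the difficulty b = \theta - \ln 4, well below the difficulty he would master with probability 0.5 (b = \theta). The choice of the threshold therefore shifts everyone's measured level.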
30. Measuring adults' skills: IALS
- France left the IALS project and inflicted total censorship on the results
- Except that France was left in appendix tables -> articles like the IHT one
- How did that happen?
32. Measuring adults' skills: IALS
- Sampling method: random path -> dwelling
- Addresses are sampled from a sampling frame (census, phonebook)
- To avoid bias due to dwellings not in the sampling frame, another dwelling is actually sampled, with an itinerary to go from the first one to the actual one
- Ex. start from 48, bd Jourdan. Head North, take the second street to the left, count 5 buildings on the right side, enter the 6th one. Go to the 3rd floor and choose the 2nd apartment on the right side of the corridor
33. Measuring adults' skills: IALS
- The Kalton, Lyberg, Rempp audit (1995)
- Replacement rates (replacing the protocol dwelling by another) were very high
- Refusals: 45% (in addition to absent households, etc.)
34. Measuring adults' skills: IALS
- The Kalton, Lyberg, Rempp audit (1995)
- Replacements:

Address interviewed     N       Percentage
Protocol address 1      1363    45.5
1st replacement         792     26.4
2nd replacement         841     28.1
Total                   2996    100.0

- Non-response: 45.2%
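The percentages in the audit table can be recomputed from the counts, which also gives the overall share of interviews conducted at a replacement address. The counts are those from the table; the variable names are illustrative.

```python
# Interviews by type of address, from the audit table.
counts = {
    "Protocol address": 1363,
    "1st replacement": 792,
    "2nd replacement": 841,
}

total = sum(counts.values())

# Share of each address type among completed interviews, in percent.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Share of interviews that did not take place at the protocol address.
replaced_share = round(100 * (total - counts["Protocol address"]) / total, 1)

print(total)           # 2996
print(shares)          # 45.5, 26.4 and 28.1 percent
print(replaced_share)  # 54.5
```

More than half of the completed interviews (54.5%) took place at a replacement address rather than at the dwelling the protocol designated.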
35. Measuring adults' skills: IALS
- Probably an upward bias when the protocol is not strictly implemented
- Germany: 100% have no problem reading German
- Households (HH) were not selected at random
- Within HH, the most able / motivated HH member was selected in 5 to 10% of cases, contrary to protocol
36. Measuring adults' skills: IALS
- Motivation effect: very important
- The interviewer had nothing to do while the respondent filled in the booklet -> pressure on the respondent (theoretically, no time limit)
37. Measuring adults' skills: IALS
- Work by A. Blum and F. Guerin (1998)
- Interviewed the interviewers
- In 20% of HH, people seemed to answer without thinking, just to be done with the survey
- Studied the items: ambiguities, answers counted as wrong when the question was in fact perfectly understood. The onion example
38. Measuring adults' skills: IALS
- Conclusion on IALS
- Not one single defect, but a series of dysfunctions all along the measurement production chain:
- Sampling
- Item design
- Survey fieldwork
- Imputation of non-response
- Coding and calculation of scores (1 single scale)
39. Lessons to be learnt
- HMK 11:
- Always consider the VARIANCE, not just the mean / median
- Are levels relevant or only ranks (absolute vs. relative measurement)?
40. Lessons to be learnt
- IALS and HMK 11: non-response matters
- It is often non-ignorable and induces biases; try to assess them
- Partial non-response means something about the question
- PISA and IALS:
- ALWAYS read the technical documents / annexes before saying anything, if you can!
- Rankings are relevant only if the variable values are actually different
- Same idea as: look for the standard error