Title: Data Editing and Data Quality: Value for Money or Art for Art's Sake? The Polish Experience with the ESS
1. Data Editing and Data Quality: Value for Money or Art for Art's Sake? The Polish Experience with the ESS
- Zbyszek Sawinski
- Polish Academy of Sciences
Prague, June 25-28, 2007
2. Outline
- The overview of data processing
- Three types of data comparisons
- Which data are to be corrected
- How strongly data editing modifies the data
- Data transposition: fieldwork histories
- Conclusions: value for money or art for art's sake?
3. The overview of data processing
Contact Forms, Sampling data, Questionnaires
-> DATA ENTRY, EDITING
-> Contact Forms data file, Sampling file, Questionnaire data file
4. Three types of data comparisons
- Data collected twice
- Interrelated questions
- Data juxtaposed with external sources
5. Data collected twice
- Fieldwork data
- Interviews vs Contact Forms
- respondent's ID
- interviewer's ID
- date of interview
- start time
6. Interrelated questions
42
7. Data juxtaposed with external sources
- Key demographics
- Gender
- Age
- Size of domicile
8. Which data are to be corrected?
- Contradictory interviewer recordings are to be corrected in all cases.
  - e.g. different dates of interview recorded in the Questionnaire and in the Contact Form
  - e.g. interview end time comes earlier than interview start time
- If there is no basis for correcting the variables, then one or both should be coded as missing.
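The two examples above can be sketched as automated checks. This is only an illustration under stated assumptions, not the procedure actually used for the Polish ESS data: the record layout, the field names (`q_date`, `cf_date`, `start`, `end`) and the use of `None` as the missing code are all hypothetical, and the sketch assumes that no interview runs past midnight.

```python
from datetime import date, time

MISSING = None  # assumed missing-value code for this sketch


def edit_interviewer_recordings(rec):
    """Return (edited copy of rec, list of flags raised).

    rec is a dict with hypothetical keys:
      q_date     - interview date recorded in the Questionnaire
      cf_date    - interview date recorded in the Contact Form
      start, end - interview start and end times
    """
    out = dict(rec)  # never modify the raw record in place
    flags = []

    # The same fact was recorded twice, so the two dates must agree.
    if out["q_date"] != out["cf_date"]:
        flags.append("date_mismatch")
        # With no basis for preferring either source, code both as missing.
        out["q_date"] = out["cf_date"] = MISSING

    # The interview cannot end before it starts (assumes no midnight crossing).
    if out["start"] is not None and out["end"] is not None and out["end"] < out["start"]:
        flags.append("end_before_start")
        out["start"] = out["end"] = MISSING

    return out, flags
```

In practice the flags would first be used to look for a basis for correction (for example, a third source for the date); coding as missing is the fallback when none is found.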
9. Which data are to be corrected?
- Contradictory answers that come from the respondent
  - (1) First check whether the error may have been committed by the interviewer and whether it can be corrected
    - e.g. in the household grid an interviewer recorded the same gender for the respondent and for the respondent's partner.
  - (2) When an error cannot be corrected, code one or both answers as missing
    - e.g. in the household grid an interviewer recorded the current year (2006) as the year of a partner's birth.
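The two household-grid examples above could be screened along the following lines. This is a hedged sketch: the dictionary keys `gender` and `birth_year`, the flag names and the function itself are hypothetical. The first check only queues the case for manual review, because a matching gender is not necessarily an error; the second codes the impossible value as missing.

```python
SURVEY_YEAR = 2006  # fieldwork year mentioned on the slide


def check_household_grid(respondent, partner, survey_year=SURVEY_YEAR):
    """Return (edited copy of partner, list of review flags).

    respondent and partner are hypothetical dicts with the keys
    'gender' and 'birth_year'.
    """
    partner = dict(partner)  # work on a copy
    review = []

    # (1) Same gender recorded for respondent and partner: possibly an
    # interviewer error, so flag it for review instead of changing it.
    if respondent.get("gender") is not None and respondent["gender"] == partner.get("gender"):
        review.append("same_gender_as_partner")

    # (2) A partner born in the survey year cannot be repaired from the
    # grid alone, so the value is coded as missing.
    if partner.get("birth_year") == survey_year:
        partner["birth_year"] = None
        review.append("partner_birth_year_set_missing")

    return partner, review
```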
10. To correct or not to correct?
- Respondent refused to answer
  - impute the data when the value is known
- An answer is inconsistent with external data
  - e.g. year of birth (household grid: 1976, sampling file: 1967)
  - consistent with other answers -> do not change
  - inconsistent with other answers -> change
- We can never exclude the possibility that the interview was conducted with a person other than the selected one, whether by mistake or intentionally.
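The year-of-birth rule above can be written as a small decision function. This is a sketch under assumptions: the slide does not say what the value is changed to, so replacing it with the sampling-file value is an assumption, and the check against "other answers" is left to a caller-supplied function (`is_consistent_with_other_answers`, a hypothetical name).

```python
def resolve_birth_year(grid_year, sampling_year, is_consistent_with_other_answers):
    """Return (year to keep, reason).

    grid_year     - year of birth from the household grid
    sampling_year - year of birth from the sampling file
    is_consistent_with_other_answers - hypothetical callable(year) -> bool
        that checks a year against the rest of the interview
    """
    if grid_year == sampling_year:
        return grid_year, "agree"
    if is_consistent_with_other_answers(grid_year):
        # The interview is internally coherent: do not change the answer.
        return grid_year, "kept_interview_value"
    # Assumption: the external (sampling file) value replaces the answer.
    return sampling_year, "changed_to_sampling_file"
```

Keeping the interview value when it is internally coherent reflects the caveat on the slide: the interview may have been conducted with a person other than the selected one, in which case the sampling file describes someone else.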
11. Summary of data corrections
Note: The total number of interviews is 1721.
12. Top 10 items most often corrected
13. Transposition of data into fieldwork histories
Contact Forms
Interviewer 1
- day 1: time, a visit/no interview; time, a visit/no interview; time, an interview
- day 2: time, an interview; time, an interview; time, a visit/no interview; time, an interview
- . . .
Interviewer 2
- . . .
- day 2: time, an interview; time, a visit/no interview; time, an interview
- day 3: time, a visit/no interview; time, a visit/no interview; time, a visit/no interview; time, an interview
- . . .
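The transposition shown above, from one Contact Form record per visit to one chronological history per interviewer, can be sketched as a grouping step. The record keys (`interviewer`, `day`, `time`, `outcome`) are hypothetical, and the sketch assumes that days and times are stored in a form that sorts chronologically, such as ISO dates and `HH:MM` strings.

```python
from collections import defaultdict


def fieldwork_histories(contacts):
    """Group contact attempts into per-interviewer, per-day histories.

    contacts: iterable of dicts with hypothetical keys
      interviewer - interviewer ID
      day, time   - sortable values, e.g. '2006-10-02' and '16:40'
      outcome     - 'interview' or 'no interview'
    Returns {interviewer: {day: [(time, outcome), ...]}} in time order.
    """
    histories = defaultdict(lambda: defaultdict(list))
    ordered = sorted(contacts, key=lambda c: (c["interviewer"], c["day"], c["time"]))
    for c in ordered:
        histories[c["interviewer"]][c["day"]].append((c["time"], c["outcome"]))
    # Convert to plain dicts so the result prints and compares cleanly.
    return {interviewer: dict(days) for interviewer, days in histories.items()}
```

Read row by row, such a history shows each interviewer's working pattern: how many visits were made on a given day, at what times, and which of them ended in an interview.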
14. Fieldwork history: an interviewer following procedures
15. Fieldwork history: an interviewer NOT following procedures
16. Conclusions
- Opportunities offered by consistency checks are significantly limited.
  - Most interviewer errors and other types of errors cannot be identified during data processing. It is too late!
- Data improvement opportunities are limited, however...
  - 1. One can significantly reduce or even totally eliminate missing data and errors pertaining to key demographics.
  - 2. One can identify at least some questions which may be understood ambiguously or which pose other types of difficulties.
- Non-standard analyses of fieldwork data can provide fresh and useful insights into interviewers' work.
17. DATA EDITING
?
Value for money
Art for art's sake