Title: New Developments in Nonresponse Adjustment Methods
1New Developments in Nonresponse Adjustment Methods
- Fannie Cobben
- Statistics Netherlands
- Department of Methodology and Quality
2In this presentation
- Response selection model
- Use of response propensities
- Application to POLS 2002
- Discussion
3The response selection model
- Consists of two equations
- Response equation (dichotomous)
- Survey item equation (continuous)
- The error terms are allowed to be correlated
- The outcome is adjusted for a possible selection
bias
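The two equations can be sketched in the usual sample selection notation. The symbols below ($R_i^*$, $x_i$, $z_i$, $\beta$, $\gamma$, $\varepsilon$, $\sigma$, $\tau$) and the bivariate normal assumption are illustrative choices, not taken from the slides:

```latex
\begin{align*}
R_i^* &= x_i^\top \beta + \varepsilon_{1i}, & R_i &= 1 \text{ if } R_i^* > 0,\ 0 \text{ otherwise} \\
Y_i &= z_i^\top \gamma + \varepsilon_{2i}, & Y_i &\text{ observed only if } R_i = 1 \\
(\varepsilon_{1i}, \varepsilon_{2i}) &\sim N\!\left(0, \begin{pmatrix} 1 & \tau\sigma \\ \tau\sigma & \sigma^2 \end{pmatrix}\right)
\end{align*}
```

In this sketch, $\tau$ is the correlation between the error terms. With $\tau = 0$ the response equation carries no extra information about $Y$; a nonzero $\tau$ is what produces the selection bias the model adjusts for.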
4Extensions to response selection model
- Multiple selection equations, i.e. different response types
- Contact equation
- Participation equation
- Survey item equation
- Contact and participation are dependent and can both introduce a bias for the survey item
- If desirable, equations for other response types
- Categorical survey items
5Response selection model: contact and participation
6Advantages
- Model relationship with both R and Y
- Efficient use of auxiliary variables and paradata
- Closely follow fieldwork process
Disadvantages
- Model-based: dependent on distributional assumptions
- Issues of identification: exclusion restriction
7Response propensities (1)
- Assume: the sample is selected from a sampling frame by some random selection procedure
- Two groups
- R = 1: response
- R = 0: non-response
- X: auxiliary information, available for all elements, for instance from the sampling frame
- $\rho(X) = P(R = 1 \mid X)$ is the propensity score, for instance determined by a logistic regression, i.e. $\log\frac{\rho(X)}{1 - \rho(X)} = X^\top \beta$
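As a minimal sketch of how such propensities might be estimated, the code below fits a logistic regression of the response indicator on auxiliary information by plain gradient ascent. The function names (`fit_logistic`, `propensity`), the toy data, and the choice of optimiser are assumptions for illustration, not part of the presentation; in practice a statistical package would be used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, r, steps=5000, lr=0.5):
    """Fit P(R = 1 | X) by gradient ascent on the mean log-likelihood.

    X: list of covariate rows (without intercept)
    r: list of 0/1 response indicators
    Returns beta, where beta[0] is the intercept.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, ri in zip(X, r):
            row = [1.0] + list(xi)
            resid = ri - sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, xi):
    """Estimated response propensity for one element with covariates xi."""
    row = [1.0] + list(xi)
    return sigmoid(sum(b * v for b, v in zip(beta, row)))

# Hypothetical toy sample with one binary auxiliary variable:
# 10 elements with x = 0 (4 respond) and 10 with x = 1 (8 respond).
X = [[0]] * 10 + [[1]] * 10
r = [1] * 4 + [0] * 6 + [1] * 8 + [0] * 2

beta = fit_logistic(X, r)
rho_hat = [propensity(beta, xi) for xi in X]
```

With a single binary covariate the model is saturated, so the fitted propensities coincide with the observed response rates in the two groups (0.4 and 0.8 in this toy sample).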
8Response propensities (2)
- We can use the response propensities to adjust for nonresponse bias
- Directly
- Response propensity weighting
- Response propensity stratification
- Indirectly
- In combination with linear weighting
9Direct use of response propensities
- Response propensity weighting, Särndal (1981)
- Response propensity stratification
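The formulas on this slide did not survive, so the following is only a sketch of the two direct methods under stated assumptions. Weighting is taken to be the Horvitz-Thompson-type mean with each respondent weighted by 1 / (π_i · ρ_i); stratification is taken to be a class-share-weighted average of respondent means, which presumes an equal-probability sample and at least one respondent per class. All names and data are hypothetical.

```python
def propensity_weighted_mean(y, pi, rho, N):
    """(1/N) * sum over respondents of y_i / (pi_i * rho_i).

    y, pi, rho: values, inclusion probabilities and estimated
    response propensities of the respondents; N: population size.
    """
    return sum(yi / (p * r) for yi, p, r in zip(y, pi, rho)) / N

def propensity_stratified_mean(rho, responded, y, cuts):
    """Average of respondent means per propensity class, weighted by
    the class share in the full sample (assumes equal-probability
    sampling and at least one respondent in every class).

    rho, responded, y: one entry per sampled element (y is None for
    nonrespondents); cuts: increasing class boundaries on rho.
    """
    n = len(rho)
    sizes, resp_values = {}, {}
    for rho_i, r_i, y_i in zip(rho, responded, y):
        h = sum(rho_i >= c for c in cuts)  # index of the propensity class
        sizes[h] = sizes.get(h, 0) + 1
        if r_i:
            resp_values.setdefault(h, []).append(y_i)
    return sum(sizes[h] / n * sum(resp_values[h]) / len(resp_values[h])
               for h in sizes)

# Hypothetical sample of 10 from a population of 100 (pi = 0.1):
# 4 elements with propensity 0.5 (2 respond, y = 1) and
# 6 elements with propensity 1.0 (all respond, y = 0).
rho = [0.5] * 4 + [1.0] * 6
responded = [True, True, False, False] + [True] * 6
y = [1, 1, None, None] + [0] * 6

resp = [(yi, ri) for yi, ri, r in zip(y, rho, responded) if r]
weighted = propensity_weighted_mean(
    [yi for yi, _ in resp], [0.1] * len(resp), [ri for _, ri in resp], N=100)
stratified = propensity_stratified_mean(rho, responded, y, cuts=[0.75])
unadjusted = sum(yi for yi, _ in resp) / len(resp)
```

In this toy sample the unadjusted respondent mean is 0.25, while both adjusted estimates are 0.4, because the low-propensity group, where y = 1, is under-represented among respondents.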
10Indirect use of response propensities
- GREG estimator with adapted inclusion probabilities
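The slide's formula is not preserved, so the sketch below rests on an assumption: the inclusion probability π_i of each respondent is replaced by π_i · ρ_i, and a GREG estimator is then computed with those weights. For brevity it uses a single auxiliary variable with an intercept, in which case the GREG estimate of the mean reduces to the weighted regression estimator. The function name and the data are hypothetical.

```python
def greg_mean(y, x, pi, rho, x_pop_mean):
    """GREG estimate of the population mean of y for a model with an
    intercept and one auxiliary variable x, using design weights
    d_i = 1 / (pi_i * rho_i) for the respondents.

    x_pop_mean: known population mean of x.
    """
    d = [1.0 / (p * r) for p, r in zip(pi, rho)]
    d_sum = sum(d)
    y_bar = sum(di * yi for di, yi in zip(d, y)) / d_sum
    x_bar = sum(di * xi for di, xi in zip(d, x)) / d_sum
    # Weighted least-squares slope of y on x
    s_xy = sum(di * (xi - x_bar) * (yi - y_bar)
               for di, xi, yi in zip(d, x, y))
    s_xx = sum(di * (xi - x_bar) ** 2 for di, xi in zip(d, x))
    slope = s_xy / s_xx
    # Correct the weighted mean for the gap between the known
    # population mean of x and its weighted respondent mean
    return y_bar + slope * (x_pop_mean - x_bar)

# Hypothetical respondents for which y = 2 + 3x holds exactly
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]
pi = [0.1] * 4
rho = [0.5, 0.8, 0.4, 1.0]

estimate = greg_mean(y, x, pi, rho, x_pop_mean=2.0)
```

When y is an exact linear function of x, as in this toy data, the estimate equals 2 + 3 · 2.0 = 8 whatever the propensities are. This illustrates why the choice of weights matters less as the auxiliary variables explain more of the survey variable.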
11POLS 2002
- Integrated Survey on Household Living Conditions (in Dutch: Permanent Onderzoek LeefSituatie)
- Monthly, 3,000 persons are selected
- Questions on living conditions, safety, health
- Basic module for persons > 12 years
- Data file aggregated over 2002: n = 35,594 and n_r = 20,168 (57%)
- Survey variables: Employment, Education and Religion
- Numerous auxiliary variables, such as age, region, house value, social insurance, ethnicity, etc.
12Analysis POLS 2002
- Aim
- Compare different response propensity methods
- Compare regular GREG-estimator and other methods
- Two models
- Weighting model (relation with Y and R; Schouten, 2004)
Age15 + Houseval14 + Non-natives8 + Ethnicity7 + Region15 + Type hh4 + Telephone2   (1)
- Response model (relation with R; pseudo R² = 2.2%)
Age15 + Houseval14 + Urbanicity5 + Mar_staat4 + Ethnicity7 + Region15 + Type hh4 + Telephone2   (2)
13Results - Employment
- No significant differences between methods; clear difference with the response average
- GREG and propensity weighting with the same model give the same results
- Propensity weighting and propensity GREG give the highest estimate of the employed labour force
14Results - Education
15Results - Religion
- More differences between methods
- Propensity weighting and propensity GREG give the highest estimate of "no religion"
16Conclusions
- Remarks
- Almost the same variables in weighting model and response model
- Low pseudo R² for the response model
- Conclusions
- Small differences between methods
- No reduction of variance with direct response propensity methods; reduction of variance with the GREG estimator
- Propensity weighting and GREG estimator with the same model give the same results for employment and education
17Thank you for your attention!
18Total Survey Error
19Probability based surveys (2)
- Under-coverage in telephone surveys
- Two groups
- with telephone (C = 1)
- without telephone (C = 0)
- Non-response
- Two groups
- Respondents (C = 1)
- Non-respondents (C = 0)
20Discussion
- Which variables should be inserted in the different equations, and how should these variables be selected?
- How should the hierarchical structure of the model be constructed? For instance, which response types should be distinguished?
- Is it profitable to construct a model for every survey item? If so, how should we deal with this in practice?
- The response selection model assumes a correlation structure between response types and survey items. This correlation has a model-based interpretation, but how should it be interpreted in practice?
21Example: relationship between R and Y in the LFS
- Relation between the number of contact attempts
and estimated size of the labour force
[Figure: estimated size of the labour force by quarter, with separate series for 1 attempt, 2 attempts, 4 attempts and all attempts]