Measuring the Effects of Unit Nonresponse in Establishment Surveys


1
Measuring the Effects of Unit Nonresponse in
Establishment Surveys

Clyde Tucker and John Dixon
U.S. Bureau of Labor Statistics
David Cantor
Westat

2
Acknowledgement

We would like to thank Bob Groves and Mike Brick
for the use of their materials from their short
course Practical Tools for Nonresponse Bias
Studies.
We also thank Bob for the use of materials from
his 2006 POQ article Nonresponse Rates and
Nonresponse Bias in Household Surveys, Public
Opinion Quarterly, 70, 646-675.

3
Factors Affecting Nonresponse Outside Survey
Organization Control

Three clusters of factors identified as being
outside the control of the survey organization
(Willimack et al., 2002)
External environmental attributes (climate)
Characteristics of the sample unit
Characteristics of the establishment employee(s)
who decide whether to join a survey, priority of
responding, and length of participation
Effects attributable to all three factors are
widespread and substantially affect nonresponse
Even if nonresponse is not increasing, these
factors make it hard to maintain the status quo

4
Some Examples of These Factors

Downsizing decreases staff available to provide
data
Increased firm size due to mergers and
acquisitions increases complexity and reporting
burden
Gatekeeping poses significant barriers
Attitudes of owners or key managers toward
government and data confidentiality
Whether or not anyone in the firm actually uses
data products from the survey
Staff turnover
The growing accounting practice of using third
parties such as payroll processing or accounting
firms

5
Nonresponse Error for Sample Mean

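In simplest terms:

$$\bar{y}_r = \bar{y}_n + \frac{m}{n}\,(\bar{y}_r - \bar{y}_m)$$

or

Respondent Mean = Full Sample Mean
+ (Nonresponse Rate) x (Respondent Mean
- Nonrespondent Mean)

where n is the full sample size and m is the
number of nonrespondents.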
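
The identity on this slide can be checked numerically. The following is a minimal Python sketch, assuming made-up employment counts for six sampled establishments; the figures and the function name are hypothetical and do not come from the presentation.

```python
from statistics import mean

def nonresponse_bias(respondents, nonrespondents):
    """Return the bias of the respondent mean computed two ways:
    directly, and via (nonresponse rate) x (mean difference)."""
    full_sample = respondents + nonrespondents
    nonresponse_rate = len(nonrespondents) / len(full_sample)
    direct = mean(respondents) - mean(full_sample)
    via_identity = nonresponse_rate * (mean(respondents) - mean(nonrespondents))
    return direct, via_identity

# Hypothetical employment counts (illustrative only).
resp = [10.0, 20.0, 30.0, 40.0]
nonresp = [50.0, 90.0]

direct, via_identity = nonresponse_bias(resp, nonresp)
print(direct, via_identity)
```

In practice the nonrespondent values are unknown, which is why the study techniques later in the deck rely on frame data, benchmarks, or follow-up.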
6
Thinking Causally About Nonresponse Rates and
Nonresponse Error

Key scientific question concerns mechanisms of
response propensity that create covariance with
the survey variable

$$B(\bar{y}_r) \approx \frac{\sigma_{yp}}{\bar{p}}$$

where $\sigma_{yp}$ is the covariance between the
survey variable, y, and the response propensity,
p, and $\bar{p}$ is the mean response propensity
What mechanisms produce the covariance?
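
To make the role of the covariance concrete, here is a small Python sketch. It assumes each unit has a known response propensity (never true in practice) and compares the covariance-over-mean-propensity expression with the gap between the propensity-weighted mean and the full mean. The values of y and p are invented for illustration.

```python
from statistics import mean

def propensity_bias(y, p):
    """Approximate bias of the respondent mean: cov(y, p) / mean(p)."""
    y_bar, p_bar = mean(y), mean(p)
    cov_yp = mean([(yi - y_bar) * (pi - p_bar) for yi, pi in zip(y, p)])
    return cov_yp / p_bar

# Hypothetical survey values and response propensities:
# larger units are assumed less likely to respond.
y = [10.0, 20.0, 30.0, 40.0]
p = [0.9, 0.8, 0.6, 0.5]

# Mean of y when each unit is weighted by its chance of responding.
expected_resp_mean = sum(yi * pi for yi, pi in zip(y, p)) / sum(p)

bias = propensity_bias(y, p)
print(bias, expected_resp_mean - mean(y))
```

If every unit had the same propensity, the covariance would be zero and the sketch would report no bias regardless of how low the response rate was.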

7
Reporting Bias

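The relative bias provides a measure of the
magnitude of the bias. Interpreted much like a
percentage, it is useful for comparing bias
across survey measures that are on different
scales.

$$\mathrm{RelB}(\bar{y}_r) = \frac{B(\bar{y}_r)}{\bar{y}_r}$$

where $\mathrm{RelB}(\bar{y}_r)$ is the relative
bias with respect to the estimate, $\bar{y}_r$,
and $B(\bar{y}_r)$ is the bias of that estimate.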

8
Reporting Bias

The bias ratio provides an indication of how
confidence intervals are affected by bias

$$\text{Bias ratio} = \frac{B(\bar{y}_r)}{se(\bar{y}_r)}$$

where $se(\bar{y}_r)$ is the standard error of
the estimate.
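
As a hedged illustration of the two reporting measures, the Python sketch below computes a relative bias and a bias ratio from invented inputs. The `ci_coverage` helper is an added assumption, not part of the slides: it supposes the estimator is normally distributed and shows how a nonzero bias ratio lowers the actual coverage of a nominal 95% interval.

```python
from statistics import NormalDist

def relative_bias(bias, estimate):
    """Bias expressed as a fraction of the estimate."""
    return bias / estimate

def bias_ratio(bias, standard_error):
    """Bias expressed in standard-error units."""
    return bias / standard_error

def ci_coverage(ratio, level=0.95):
    """Actual coverage of a nominal two-sided interval,
    assuming a normally distributed estimator (an assumption)."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(0.5 + level / 2)
    return std_normal.cdf(z - ratio) - std_normal.cdf(-z - ratio)

# Hypothetical bias, estimate, and standard error.
bias, estimate, se = -1.5, 25.0, 1.5

rel_b = relative_bias(bias, estimate)
ratio = bias_ratio(bias, se)
coverage = ci_coverage(ratio)
print(rel_b, ratio, round(coverage, 3))
```

A small relative bias can still matter: here the bias is only 6% of the estimate, yet it equals one full standard error, so the interval covers the true value noticeably less often than its nominal level.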

9
What does the Stochastic View Imply?

Key issue is whether what influences survey
participation also influences the survey
variables
Increased nonresponse rates do not necessarily
imply increased nonresponse error, although
lower propensity will tend to increase error.
Hence, investigations are necessary to discover
whether the estimates of interest might be
subject to nonresponse errors because of a
correlation between p and y

10
Alternative Causal Models for Studies of
Nonresponse Rates and Nonresponse Bias
11
A More Specific Theory Relating Nonresponse to
Bias

Levels of bias will differ by subpopulations
Differences between estimates from the total
sample and just respondents will be greatest on
either end of the nonresponse continuum, but
potential bias greatest when response rates are
low
For example, bias in a business survey may be
greatest in the Services sector because it often
has the lowest response rates

14
Nonresponse Bias Study Techniques

Comparison to other estimates (benchmarking)
Nonresponse bias for estimates based on variables
available on sample
Studying variation within the respondent set
Altering the weighting adjustments

15
Weights and Response Rates

A base or selection weight is the inverse of the
probability of selection of the unit. The sum of
all the sampled units' base weights estimates the
population total.
When units are sampled using a complex sample
design, use (base) weights to compute response
rates that reflect the percentage of the sampled
population that responds. Unweighted rates are
useful for other purposes, such as describing
the effectiveness of the effort.
Weighted response rates are computed by summing
the units' base weights by disposition code
rather than summing the unweighted counts of
units.
In establishment surveys, it is useful to include
a measure of size (e.g., number of employees or
students) to account for the unit's relative
importance. The weight for computing response
rates is the base weight times the measure of
size.
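
The three kinds of response rate described on this slide can be sketched in a few lines of Python. The records, field names, and the simplification that every sampled unit is eligible (so the only disposition codes are "responded" and "did not respond") are assumptions made for illustration.

```python
def response_rate(units, weight_fn):
    """Share of the total weight that belongs to responding units."""
    total = sum(weight_fn(u) for u in units)
    responded = sum(weight_fn(u) for u in units if u["responded"])
    return responded / total

# Hypothetical sample: small firms have large base weights,
# large firms are sampled with near certainty.
units = [
    {"base_weight": 10.0, "employees": 5, "responded": True},
    {"base_weight": 10.0, "employees": 5, "responded": False},
    {"base_weight": 2.0, "employees": 100, "responded": True},
    {"base_weight": 1.0, "employees": 500, "responded": False},
]

unweighted = response_rate(units, lambda u: 1.0)
weighted = response_rate(units, lambda u: u["base_weight"])
size_weighted = response_rate(
    units, lambda u: u["base_weight"] * u["employees"]
)
print(unweighted, weighted, size_weighted)
```

With these made-up records the size-weighted rate is the lowest of the three because the largest establishment did not respond, which is the kind of information the unweighted rate hides.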

16
Weights and Nonresponse Analysis

A general rule is that weights should be used in
nonresponse analysis studies so that
relationships at the population level can be
examined. Guides for choosing the specific
weights to use are
Use base weights for nonresponse bias studies
that compare all sampled respondents and
nonrespondents. Weights adjusted for nonresponse
may be misleading in this situation.
Use fully adjusted weights for nonresponse bias
studies that compare survey estimates with data
from external sources. One important exception is
when the survey weights are poststratified. In
this case, weights prior to poststratification
are generally more appropriate.

17
1. Comparison to Other Estimates -- Benchmarking

Data or estimates from another source that are
closely related to respondent estimates used to
evaluate bias due to nonresponse in the survey
estimates
Assumes that the alternative data source has
different sources of measurement error and/or is
a superior measure to the target survey.

18
1. Benchmarking Survey Estimates to those from
Another Data Source

Another survey or administrative record system
may contain estimates of variables similar to
those being produced from the survey
Difference between estimates from survey and
other data source is an indicator of bias (both
nonresponse and other)

19
1. How to Conduct a Nonresponse Bias Benchmark
Study

Identify comparison estimates
surveys with very high response rates
administrative systems with different measurement
error properties
Assess major reasons why the survey estimates and
the estimates from the comparison sources differ
Compute estimates from the survey (using final
weights) and from the comparison source to be as
comparable as possible (often requires estimates
for domains)
The difference is an estimate of the direction,
or perhaps the magnitude, of the bias

20
Pros and Cons of Benchmark Comparison to
Estimate NR Bias

Pros
Relatively simple to do and often inexpensive
Estimates from survey use final weights and are
thus relevant
Gives an estimate of bias that may be important
to analysts
Cons
Estimated bias contains errors from the
comparison source as well as from the survey;
this is why it is very important that the
comparison source be highly accurate
Measurement properties are generally not
consistent for survey and comparison source;
this is often the largest source of error
Item nonresponse in both data sets reduces
comparability
Hard to find comparable data for establishment
surveys (IRS records?)
More common in household surveys

21
2. Using Variables on Respondents and
Non-respondents

Compare statistics available on both respondents
and non-respondents
The extent of any difference is an indication
of the bias

22
Possible Sources of Data on Respondents and
Non-respondents

Sampling frame variables
Matched variables from other data-sets
Screener information

23
Pros and Cons of Using Data on both Respondents
and non-respondents

Pros
Measurement properties for the variables are
consistent for respondents and nonrespondents
Bias is strictly due to nonresponse
Provides data on correlation between propensity
to respond and the variables
Cons
Bias estimates are for the variables only; only
variables highly correlated with the key survey
statistics are relevant
The method assumes no nonresponse adjustments are
made in producing the survey estimates; if
variables are highly correlated, then they could
be used in adjustment

24
The CES Study

J. Dixon and C. Tucker (ICES3), Assessing Bias
in Estimates of Employment
The Current Employment Statistics (CES) survey
collects employment, hours, and earnings monthly
from a current sample of over 300,000
establishments
Tracks the gains and losses in jobs in various
sectors of the economy
In this paper, nonresponse bias work on this
survey focuses on estimating bias for
establishment subpopulations with different
patterns of nonresponse using data from the 2003
CES and QCEW (state UI files)

25
Link relative estimate of employment (Y)

Let $Y_t$ be the estimate for a primary cell for
month t; then

$$Y_t = R_{t,t-1}\, Y_{t-1}$$

where $R_{t,t-1}$ is the ratio of the total sample
employment in month t to the total sample
employment in month t-1 for all sample units
reporting data for both months.
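
A minimal Python sketch of the link relative estimator follows. The establishment IDs, employment figures, and prior-month cell estimate are invented; the only rule taken from the slide is that the ratio uses units reporting in both months.

```python
def link_relative(current, previous):
    """Ratio of total employment in month t to month t-1,
    restricted to units that reported in both months."""
    matched = [unit for unit in current if unit in previous]
    return sum(current[u] for u in matched) / sum(previous[u] for u in matched)

# Hypothetical reported employment keyed by establishment ID.
month_prev = {"A": 100, "B": 50, "C": 30}   # C does not report in month t
month_curr = {"A": 110, "B": 55, "D": 20}   # D did not report in month t-1

ratio = link_relative(month_curr, month_prev)

# Carry the previous cell estimate forward by the link relative.
estimate_prev = 2000.0                      # hypothetical prior estimate
estimate_curr = ratio * estimate_prev
print(ratio, estimate_curr)
```

Because units C and D drop out of the ratio, any systematic difference in growth between matched and unmatched units feeds directly into the estimate, which is what the bias comparison on the next slide examines.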

26
Estimate of Bias

Using the most recent employment reports in the
QCEW (not CES) for both responders and
nonresponders
Compare the link relative for respondents to that
for nonrespondents
Results presented are not weighted by probability
of selection, but weighted results show similar
patterns
At this point, not comparing the link relative of
responders to the entire sample

27
Quantile Regression

Bias analysis performed at the establishment
level on subpopulations defined by size and
industry
Testing for the difference in employment between
CES responders and nonresponders:
$Y = a + Bx + e$, where x is an indicator of
nonresponse (essentially a t-test).
Since size of firm is theorized to relate to
nonresponse, the coefficients relating
nonresponse to employment are likely to be
different for different size firms.
Quantile regression examines the coefficients for
different quantiles of the distribution of the
sizes of firms.
Since industries can be expected to have
different patterns, the quantile regressions are
done by industry group.
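
The sketch below is a simplified stand-in for the quantile regressions described here, not the authors' model fit. It relies on the fact that when the only regressor is a binary nonresponse indicator, a quantile-regression slope at quantile tau can be taken as the difference between the two groups' tau-th sample quantiles. The employment figures are invented, and the inverse-CDF quantile definition is one choice among several.

```python
import math

def sample_quantile(values, tau):
    """Inverse-CDF sample quantile: smallest value with at least
    a share tau of the observations at or below it."""
    ordered = sorted(values)
    k = max(math.ceil(tau * len(ordered)) - 1, 0)
    return ordered[k]

def quantile_gap(responders, nonresponders, tau):
    """Nonresponder quantile minus responder quantile at tau,
    i.e. the slope B when x = 1 marks nonresponse."""
    return sample_quantile(nonresponders, tau) - sample_quantile(responders, tau)

# Hypothetical establishment employment within one industry group.
responders = [5, 8, 12, 20, 40, 90, 300, 1200]
nonresponders = [4, 6, 9, 15, 60, 200, 900, 5000]

gaps = {tau: quantile_gap(responders, nonresponders, tau)
        for tau in (0.25, 0.5, 0.75)}
print(gaps)
```

In this made-up example the gap changes sign across quantiles, which a single mean comparison (the t-test analogue) would average away. A full analysis with additional covariates would need a proper quantile-regression routine.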

28
Distribution of size and the quantile regression
curve
29
Quantile regression using the log of size.
30
MSA percent bias predicted by response rate for
Mining
31
MSA percent bias predicted by response rate for
Food Manufacturing
32
MSA percent bias predicted by response rate for
Retail Trade
33
MSA percent bias predicted by response rate for
Accommodation and Food Services
34
Hing (1987). Nonresponse bias in expense data
from the 1985 national nursing home survey.
Proceedings of the Survey Research Methods
Section, American Statistical Association,
401-405.

Purpose: Estimate cost of care in nursing homes
Target population: Nursing home facilities in
the U.S.
Sample design: Stratified list sample of
facilities, facilities sampled with probabilities
proportionate to estimated number of beds,
second-stage sample of residents and staff
Mode of data collection: In-person interview of
facility administrator, with drop-off
self-administered Expense questionnaire for
accountant
Response rate: Facility questionnaire 93%;
Expense questionnaire 68% of those responding to
Facility interview
Target estimate: Estimated cost of care
Nonresponse error measure: Comparison of Facility
questionnaire items for respondents and
nonrespondents of Expense questionnaire

35
Using the Facility Questionnaire to Estimate
Nonresponse Bias based on Participation in
Expense Questionnaire
Tables 2 and 5 from Hing (1987)
36
Conclusions

Smaller nursing homes are underrepresented; thus,
respondent estimates overestimate averages on
size-related attributes
Analysis suggested poststratification by
ownership type would significantly reduce biases
Limitation
Nonresponse bias estimate does not reflect
nonresponse on Facility questionnaire

37
3. Weighting Adjustments

Alter estimation weights and compare the
estimates using the various weights to evaluate
nonresponse bias. Weighting methods may include
poststratification, raking, calibration, logistic
regression, or even imputation.

38
Adjust Weights Using Model of Characteristics

Weighting can reduce nonresponse bias if the
weights are correlated with the estimate.
Auxiliary data in weighting that are good
predictors of the characteristic may give
alternative weights that have less bias. If the
estimates using the alternative weights do not
differ from the original estimates, then either
the nonresponse is not resulting in bias or the
auxiliary data does not reduce the bias.
If the estimates vary by the weighting scheme,
then the weighting approach should be carefully
examined and the one most likely to have lower
nonresponse bias should be used.

39
How to Conduct Nonresponse Bias Analysis Using
Weights From Modeling Characteristics

Use a weighting method such as calibration
estimation with the auxiliary variables to
produce alternative weights.
Compute the difference between the estimates
using the alternative weights and the estimates
from the regular weights as a measure of
nonresponse bias for the estimate.
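
The two steps above can be sketched in Python. Simple poststratification to known group totals is used here as the easiest special case of calibration; it is a substitute chosen for brevity, not necessarily the method used in any of the studies cited. All records, weights, and control totals are hypothetical.

```python
def weighted_mean(rows, weight_key):
    """Weighted mean of y using the named weight column."""
    total_weight = sum(r[weight_key] for r in rows)
    return sum(r[weight_key] * r["y"] for r in rows) / total_weight

def poststratify(rows, control_totals):
    """Scale regular weights so each group sums to its control total."""
    group_sums = {}
    for r in rows:
        group_sums[r["group"]] = group_sums.get(r["group"], 0.0) + r["w"]
    return [
        dict(r, w_alt=r["w"] * control_totals[r["group"]] / group_sums[r["group"]])
        for r in rows
    ]

# Hypothetical respondents with regular weights w.
respondents = [
    {"group": "small", "y": 10.0, "w": 50.0},
    {"group": "small", "y": 12.0, "w": 50.0},
    {"group": "large", "y": 100.0, "w": 10.0},
    {"group": "large", "y": 120.0, "w": 10.0},
    {"group": "large", "y": 140.0, "w": 10.0},
]
# Hypothetical known population counts: small firms are underrepresented.
controls = {"small": 150.0, "large": 30.0}

adjusted = poststratify(respondents, controls)
regular_est = weighted_mean(adjusted, "w")
alternative_est = weighted_mean(adjusted, "w_alt")
difference = alternative_est - regular_est
print(regular_est, alternative_est, difference)
```

A nonzero difference suggests the regular weights leave some nonresponse bias that the auxiliary variable can remove; a zero difference is ambiguous, for the reasons given in the pros and cons that follow.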

40
Pros and Cons of Comparing Alternative
Estimates Based on Modeling the Estimate

Pros
If good predictors are available, then it is
likely that the use of these in the weighting
will reduce the bias in the statistics being
evaluated
If the differences in the estimates are small, it
is evidence that nonresponse bias may not be
large
Cons
Recomputing weights may be expensive
If good correlates are not available then lack of
differences may be indicator of poor
relationships rather than the absence of bias
The approach is limited to statistics that have
high correlation with auxiliary data

41
4. Studying Variation Within the Respondents:
Level of Effort

Some nonresponse models assume that those units
that require more effort to respond (more
callbacks, incentives, refusal conversion) are
similar to the units that do not respond
Characteristics are estimated for respondents by
level of effort (e.g., response propensity
scores)
Models are fitted to see whether they fit the
data and can be used to estimate characteristics
of nonrespondents

42
Analyze Level of Effort

Associate level of effort data to respondents
(e.g., number of callbacks, ever refused, early
or late responder)
Compute statistics for each level of effort
separately (usually unweighted or base weights
only)
If there is a (linear) relationship between level
of effort and the statistic, then one may decide
to extrapolate to estimate the statistic for
those that did not respond
Often more appropriate to do the analysis
separately for major reasons for nonresponse

43
Pros and Cons of Using Level of Effort Analysis
to Estimate Bias

Pros
Simple to do, provided data collection systems
capture the pertinent information
In some surveys may provide a reasonable
indicator of the magnitude and direction of
nonresponse bias
Cons
Highly dependent on model assumptions that have
not been validated in many applications
Difficult to extrapolate to produce estimates of
nonresponse bias without other data

44
Brick et al. (2007). Relationship between length
of data collection period, field costs, and data
quality. Paper presented at ICES III, Montreal,
Canada.

Purpose: Estimate amount and condition of
research space for science and engineering
Target population: Colleges and universities
Sample design: Census
Mode of data collection: Web and mail
Response rate: 94%
Target estimate: Square feet
Nonresponse error measure: Relative bias when
level of effort is cut by 25%

45
Relative bias for Facilities Survey estimates for
academic institutions, by field and response level
46
Summary

The patterns are not consistent
The 75% response level generally exhibits larger
bias than the other response levels, but the
bias is generally not statistically significant
No significant bias if data collection were
terminated at the 88% response level

47
5. Followup of Nonrespondents

Use of respondent data obtained through
extraordinary efforts as a comparison to
respondent data obtained with traditional efforts
Effort may include callbacks, incentives,
change of mode, use of an elite corps of
interviewers

48
How to Do a Nonrespondent Followup Study

Define a set of recruitment techniques judged to
be superior to those in the ongoing effort
Determine whether budget permits use of those
techniques on all remaining active cases
If not, implement 2nd phase sample (described
later)
Implement enhanced recruitment protocol
Compare respondents obtained in enhanced protocol
with those in the initial protocol

49
Pros and Cons of Nonresponse Followup Study

Pros
Direct measures are obtained from previously
nonrespondent cases
Same measurements are used
Nonresponse bias on all variables can be
estimated
Cons
Rarely are followup response rates 100%
Requires extended data collection period

50
Marker et al. (2005). Terrorism Risk Insurance
Program Policyholders Survey. Final report
prepared for the Department of Treasury. Westat,
Rockville, MD.

Purpose: Estimate use of Terrorism Insurance by
Businesses
Target population: Businesses and state/local
government offices in the U.S.
Sample design: Stratified sample
Mode of data collection: Web and mail
Response rate: 17%
Target estimate: Estimated percent that have
insurance
Nonresponse error measure: Indicators of use of
terrorism insurance

51
Follow-up Procedures

Contacted a follow-up sample of 1,000
nonrespondents to the survey.
A shortened instrument was used to collect
critical measures
Interviews conducted by telephone

52
Selected comparisons from non-response follow-up
53
Summary

Out of 8 estimates, one was statistically
significant
Some indication that non-response leads to
overestimate of certain types of insurance
The significant difference may have been due to
measurement error on the follow-up instrument

54
Five Things You Should Remember from this Lecture

The three principal types of nonresponse bias
studies are
- Comparing surveys to external data
- Studying internal variation within the data
collection, and
- Contrasting alternative postsurvey adjusted
estimates
All three have strengths and weaknesses; using
multiple approaches simultaneously provides
greater understanding
Nonresponse bias is specific to a statistic, so
separate assessments may be needed for different
estimates
Auxiliary variables correlated with both the
likelihood of responding and key survey variables
are important for evaluation
Thinking about nonresponse before the survey is
important because different modes, frames, and
survey designs permit different types of studies