Measuring the Effects of Unit Nonresponse in Establishment Surveys - PowerPoint PPT Presentation

1
Measuring the Effects of Unit Nonresponse in
Establishment Surveys
  • Clyde Tucker and John Dixon
  • U.S. Bureau of Labor Statistics
  • David Cantor
  • Westat

2
Acknowledgement
  • We would like to thank Bob Groves and Mike Brick
    for the use of their materials from their short
    course Practical Tools for Nonresponse Bias
    Studies.
  • We also thank Bob for the use of materials from
    his 2006 POQ article Nonresponse Rates and
    Nonresponse Bias in Household Surveys, Public
    Opinion Quarterly, 70, 646-675.

3
Factors Affecting Nonresponse Outside Survey
Organization Control
  • Three clusters of factors identified as being
    outside the control of the survey organization
    (Willimack et al., 2002)
  • External environmental attributes (climate)
  • Characteristics of the sample unit
  • Characteristics of the establishment employee(s)
    who decide whether to join a survey, priority of
    responding, and length of participation
  • Effects attributable to all three factors are
    widespread and substantially affect nonresponse
  • Even if nonresponse is not increasing, these factors
    make it hard to maintain the status quo

4
Some Examples of These Factors
  • Downsizing decreases staff available to provide
    data
  • Increased firm size due to mergers and
    acquisitions increases complexity and reporting
    burden
  • Gatekeeping poses significant barriers
  • Attitudes of owners or key managers toward
    government and data confidentiality
  • Whether or not anyone in the firm actually uses
    data products from the survey
  • Staff turnover
  • The growing accounting practice of using third
    parties such as payroll processing or accounting
    firms

5
Nonresponse Error for Sample Mean
  • In simplest terms:

Respondent Mean = Full Sample Mean +
(Nonresponse Rate) × (Respondent Mean −
Nonrespondent Mean)
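
The identity on this slide can be checked numerically. The following is a minimal sketch, assuming an unweighted sample in which values are known for both respondents and nonrespondents; the function name and the employment figures are hypothetical.

```python
def respondent_mean_decomposition(respondent_values, nonrespondent_values):
    """Return the respondent mean, the full sample mean, and the nonresponse error."""
    n_resp = len(respondent_values)
    n_nonresp = len(nonrespondent_values)
    n_total = n_resp + n_nonresp

    respondent_mean = sum(respondent_values) / n_resp
    nonrespondent_mean = sum(nonrespondent_values) / n_nonresp
    full_sample_mean = (sum(respondent_values) + sum(nonrespondent_values)) / n_total
    nonresponse_rate = n_nonresp / n_total

    # Nonresponse error of the respondent mean:
    # (nonresponse rate) x (respondent mean - nonrespondent mean)
    error = nonresponse_rate * (respondent_mean - nonrespondent_mean)
    return respondent_mean, full_sample_mean, error


# Hypothetical employment counts for ten sampled establishments
respondents = [12, 40, 55, 80, 130, 210]
nonrespondents = [5, 8, 15, 20]

resp_mean, full_mean, error = respondent_mean_decomposition(respondents, nonrespondents)
print(f"respondent mean   = {resp_mean:.2f}")
print(f"full sample mean  = {full_mean:.2f}")
print(f"nonresponse error = {error:.2f}")
```

In practice the nonrespondent values are unknown, which is why the study techniques later in the deck rely on frame data, benchmarks, or followup samples.
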
6
Thinking Causally About Nonresponse Rates and
Nonresponse Error
  • Key scientific question concerns mechanisms of
    response propensity that create covariance with
    survey variable
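  • $\mathrm{Bias}(\bar{y}_r) \approx \sigma_{yp} / \bar{p}$,
    where $\sigma_{yp}$ is the covariance between the survey
    variable, y, and the response propensity, p, and
    $\bar{p}$ is the mean response propensity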
  • What mechanisms produce the covariance?
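
Under the stochastic view, the bias of the respondent mean is commonly approximated by the covariance between y and p divided by the mean propensity. The sketch below assumes that form and a small hypothetical population in which every unit's propensity is known (which is never the case in a real survey). It compares the approximation with the propensity-weighted mean minus the population mean.

```python
def expected_respondent_bias(y, p):
    """Compare cov(y, p) / mean(p) with the propensity-weighted mean minus the mean of y."""
    n = len(y)
    mean_y = sum(y) / n
    mean_p = sum(p) / n

    # Population covariance between the survey variable and the propensity
    cov_yp = sum((yi - mean_y) * (pi - mean_p) for yi, pi in zip(y, p)) / n
    approx_bias = cov_yp / mean_p

    # Propensity-weighted mean: what the respondent mean approximates in expectation
    weighted_mean = sum(pi * yi for yi, pi in zip(y, p)) / sum(p)
    return approx_bias, weighted_mean - mean_y


# Hypothetical population: larger values of y go with lower propensities
y = [10, 20, 30, 40, 50]
p = [0.9, 0.8, 0.6, 0.4, 0.3]
approx_bias, direct_bias = expected_respondent_bias(y, p)
print(f"covariance-based bias = {approx_bias:.3f}")
print(f"weighted-mean bias    = {direct_bias:.3f}")

# With a constant propensity there is no covariance and therefore no bias
flat_bias, _ = expected_respondent_bias(y, [0.5] * len(y))
print(f"bias with constant propensity = {flat_bias:.3f}")
```

The constant-propensity case illustrates why a low response rate alone does not imply bias: what matters is whether p and y move together.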

7
Reporting Bias
  • The relative bias provides a measure of the
    magnitude of the bias. Interpreted similar to a
    percent, it is useful in comparing bias from
    survey measures which are in different scales.
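  • $\mathrm{Rel\,B}(\bar{y}_r) = B(\bar{y}_r) / \bar{y}_r$
  • where $\mathrm{Rel\,B}(\bar{y}_r)$ is the relative bias with respect to
    the estimate, $\bar{y}_r$, and $B(\bar{y}_r)$ is its bias.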

8
Reporting Bias
  • The bias ratio provides an indication of how
    confidence intervals are affected by bias
  • Bias ratio = $B(\bar{y}_r) / \mathrm{SE}(\bar{y}_r)$,
    where $\mathrm{SE}(\bar{y}_r)$ is the standard error.
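
To make the two reporting measures concrete, the sketch below takes the relative bias as bias divided by the estimate and the bias ratio as bias divided by the standard error (the usual definitions, assumed here). It also shows how a nominal 95% confidence interval loses coverage as the bias ratio grows, assuming the estimator is normally distributed. All function names and input values are illustrative.

```python
import math


def relative_bias(bias, estimate):
    """Bias expressed as a fraction of the estimate."""
    return bias / estimate


def bias_ratio(bias, standard_error):
    """Bias expressed in standard-error units."""
    return bias / standard_error


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ci_coverage(ratio, z_crit=1.96):
    """Actual coverage of a nominal 95% interval when bias = ratio * SE."""
    return normal_cdf(z_crit - ratio) - normal_cdf(-z_crit - ratio)


# Hypothetical estimate, bias, and standard error
estimate, bias, se = 250.0, 5.0, 4.0
print(f"relative bias = {relative_bias(bias, estimate):.3f}")
print(f"bias ratio    = {bias_ratio(bias, se):.2f}")
for r in (0.0, 0.5, 1.0, 2.0):
    print(f"bias ratio {r:.1f}: coverage = {ci_coverage(r):.3f}")
```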

9
What does the Stochastic View Imply?
  • Key issue is whether what influences survey
    participation also influences the survey
    variables
  • Increased nonresponse rates do not necessarily
    imply increased nonresponse error, although
    lower propensity will tend to increase error.
  • Hence, investigations are necessary to discover
    whether the estimates of interest might be
    subject to nonresponse errors because of a
    correlation between p and y

10
Alternative Causal Models for Studies of
Nonresponse Rates and Nonresponse Bias
11
A More Specific Theory Relating Nonresponse to
Bias
  • Levels of bias will differ by subpopulations
  • Differences between estimates from the total
    sample and just respondents will be greatest on
    either end of the nonresponse continuum, but
    potential bias greatest when response rates are
    low
  • For example, bias in a business survey may be
    greatest in the Services sector because it often
    has the lowest response rates

14
Nonresponse Bias Study Techniques
  • Comparison to other estimates (benchmarking)
  • Nonresponse bias for estimates based on variables
    available on sample
  • Studying variation within the respondent set
  • Altering the weighting adjustments

15
Weights and Response Rates
  • A base or selection weight is the inverse of the
    probability of selection of the unit. The sum of
    all the sampled units' base weights estimates the
    population total.
  • When units are sampled using a complex sample
    design, we suggest using (base) weights to compute
    response rates that reflect the percentage of the
    sampled population that responds. Unweighted rates
    are useful for other purposes, such as describing
    the effectiveness of the effort.
  • Weighted response rates are computed by summing
    the units' base weights by disposition code
    rather than summing the unweighted counts of
    units.
  • In establishment surveys, it is useful to include
    a measure of size (e.g., number of employees or
    students) to account for the unit's relative
    importance. The weight for computing response
    rates is the base weight times the measure of
    size.
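
The three kinds of response rate described above can be sketched as follows. This is a minimal illustration that assumes every sampled unit is eligible and uses only two disposition codes (responded or not); the record layout and the numbers are hypothetical.

```python
def response_rate(units, weight=lambda unit: 1.0):
    """Sum of weights for responding units divided by the sum for all sampled units."""
    responded = sum(weight(u) for u in units if u["responded"])
    total = sum(weight(u) for u in units)
    return responded / total


# Hypothetical sample: one very large establishment sampled with certainty
# and several smaller ones sampled with lower probabilities
units = [
    {"base_weight": 1.0, "employees": 5000, "responded": False},
    {"base_weight": 10.0, "employees": 200, "responded": True},
    {"base_weight": 50.0, "employees": 20, "responded": True},
    {"base_weight": 50.0, "employees": 10, "responded": False},
]

unweighted = response_rate(units)
base_weighted = response_rate(units, lambda u: u["base_weight"])
# Base weight times the measure of size (number of employees)
size_weighted = response_rate(units, lambda u: u["base_weight"] * u["employees"])

print(f"unweighted rate    = {unweighted:.3f}")
print(f"base-weighted rate = {base_weighted:.3f}")
print(f"size-weighted rate = {size_weighted:.3f}")
```

In this made-up sample the size-weighted rate is the lowest of the three because the one nonresponding large establishment accounts for most of the employment.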

16
Weights and Nonresponse Analysis
  • A general rule is that weights should be used in
    nonresponse analysis studies so that
    relationships at the population level can be
    examined. Guides for choosing the specific
    weights to use are:
  • Use base weights for nonresponse bias studies
    that compare all sampled respondents and
    nonrespondents. Weights adjusted for nonresponse
    may be misleading in this situation.
  • Use fully adjusted weights for nonresponse bias
    studies that compare survey estimates with data
    from external sources. One important exception is
    when the survey weights are poststratified. In
    this case, weights prior to poststratification
    are generally more appropriate.

17
1. Comparison to Other Estimates -- Benchmarking
  • Data or estimates from another source that are
    closely related to respondent estimates used to
    evaluate bias due to nonresponse in the survey
    estimates
  • Assume that alternative data source has different
    sources of measurement error and/or is a superior
    measure to target survey.

18
1. Benchmarking Survey Estimates to those from
Another Data Source
  • Another survey or administrative record system
    may contain estimates of variables similar to
    those being produced from the survey
  • Difference between estimates from survey and
    other data source is an indicator of bias (both
    nonresponse and other)

19
1. How to Conduct a Nonresponse Bias Benchmark
Study
  • Identify comparison estimates
  • surveys with very high response rates
  • administrative systems with different measurement
    error properties
  • Assess major reasons why the survey estimates and
    the estimates from the comparison sources differ
  • Compute estimates from the survey (using final
    weights) and from the comparison source to be as
    comparable as possible (often requires estimates
    for domains)
  • The difference is an estimate of the direction,
    or perhaps the magnitude, of the bias

20
Pros and Cons of Benchmark Comparison to
Estimate NR Bias
  • Pros
  • Relatively simple to do and often inexpensive
  • Estimates from survey use final weights and are
    thus relevant
  • Gives an estimate of bias that may be important
    to analysts
  • Cons
  • Estimated bias contains errors from the
    comparison source as well as from the survey;
    this is why it is very important that the
    comparison source be highly accurate
  • Measurement properties are generally not
    consistent for survey and comparison source;
    this often is the largest source of error
  • Item nonresponse in both data sets reduces
    comparability
  • Hard to find comparable data for establishment
    surveys (IRS records?)
  • More common in household surveys

21
2. Using Variables on Respondents and
Non-respondents
  • Compare statistics available on both respondents
    and non-respondents
  • The extent to which there is a difference is an
    indication of the bias

22
Possible Sources of Data on Respondents and
Non-respondents
  • Sampling frame variables
  • Matched variables from other data-sets
  • Screener information

23
Pros and Cons of Using Data on both Respondents
and non-respondents
  • Pros
  • Measurement properties for the variables are
    consistent for respondents and nonrespondents
  • Bias is strictly due to nonresponse
  • Provides data on correlation between propensity
    to respond and the variables
  • Cons
  • Bias estimates are for the variables only;
    only variables highly correlated with the key
    survey statistics are relevant
  • The method assumes no nonresponse adjustments are
    made in producing the survey estimates; if
    variables are highly correlated, then they could
    be used in adjustment

24
The CES Study
  • J. Dixon and C. Tucker (ICES3), Assessing Bias
    in Estimates of Employment
  • Collects employment, hours and earnings monthly
    from a current sample of over 300,000
    establishments
  • Tracks the gains and losses in jobs in various
    sectors of the economy
  • In this paper, nonresponse bias work on this
    survey focuses on estimating bias for
    establishment subpopulations with different
    patterns of nonresponse using data from the 2003
    CES and QCEW (state UI files)

25
Link relative estimate of employment (Y)
  • Let $Y_t$ be the estimate for a primary cell for
    month t; then $Y_t = R_{t,t-1} Y_{t-1}$,
  • where $R_{t,t-1}$ is the ratio of the total sample
    employment in month t to the total sample
    employment in month t-1 for all sample units
    reporting data for both months.
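
A minimal sketch of the link relative calculation follows. It assumes unweighted sample employment and a simple dictionary per month keyed by a hypothetical unit identifier, with None marking a month in which the unit did not report. The production CES estimator may differ in details not covered on this slide.

```python
def link_relative(employment_prev, employment_curr):
    """Ratio of month t to month t-1 employment over units reporting in both months."""
    matched = [
        unit
        for unit, value in employment_curr.items()
        if value is not None and employment_prev.get(unit) is not None
    ]
    total_curr = sum(employment_curr[unit] for unit in matched)
    total_prev = sum(employment_prev[unit] for unit in matched)
    return total_curr / total_prev


# Hypothetical reports; units C and D each miss one month and are excluded
employment_prev = {"A": 100, "B": 50, "C": None, "D": 200}
employment_curr = {"A": 110, "B": 45, "C": 30, "D": None}

ratio = link_relative(employment_prev, employment_curr)

# Carry the previous month's cell estimate forward with the link relative
estimate_prev = 10000.0
estimate_curr = ratio * estimate_prev
print(f"link relative = {ratio:.4f}")
print(f"estimate for month t = {estimate_curr:.1f}")
```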

26
Estimate of Bias
  • Using the most recent employment reports in the
    QCEW (not CES) for both responders and
    nonresponders
  • Compare the link relative for respondents to that
    for nonrespondents
  • Results presented are not weighted by probability
    of selection, but weighted results show similar
    patterns
  • At this point, not comparing the link relative of
    responders to the entire sample

27
Quantile Regression
  • Bias analysis performed at the establishment
    level on subpopulations defined by size and
    industry
  • Testing for the difference in employment between
    CES responders and nonresponders: Y = a + Bx + e, where
    x is an indicator of nonresponse (essentially a
    t-test).
  • Since size of firm is theorized to relate to
    nonresponse, the coefficient relating
    nonresponse to employment is likely to be
    different for different size firms.
  • Quantile regression examines the coefficients for
    different quantiles of the distribution of the
    sizes of firms.
  • Since industries can be expected to have
    different patterns, the quantile regressions are
    done by industry group.
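
When the only regressor is a 0/1 nonresponse indicator, a quantile regression of Y = a + Bx + e has a simple closed form: at quantile tau, one minimizer of the check loss sets a to the tau-th quantile of employment among responders and a + B to the tau-th quantile among nonresponders. The sketch below uses that fact instead of a general solver. It is an illustration under those assumptions, not the authors' implementation, and the employment figures are hypothetical.

```python
import math


def sample_quantile(values, tau):
    """Smallest value with at least a share tau of the data at or below it."""
    ordered = sorted(values)
    # Small tolerance guards against floating-point error in tau * n
    k = max(1, math.ceil(tau * len(ordered) - 1e-9))
    return ordered[k - 1]


def check_loss(y, x, a, b, tau):
    """Quantile-regression objective for the model Y = a + B*x + e."""
    total = 0.0
    for yi, xi in zip(y, x):
        residual = yi - (a + b * xi)
        total += residual * (tau - (1.0 if residual < 0 else 0.0))
    return total


def quantile_fit_binary(y, x, tau):
    """Fit Y = a + B*x + e at quantile tau when x is a 0/1 indicator."""
    a = sample_quantile([yi for yi, xi in zip(y, x) if xi == 0], tau)
    q_nonresp = sample_quantile([yi for yi, xi in zip(y, x) if xi == 1], tau)
    return a, q_nonresp - a


# Hypothetical establishment employment; x = 1 marks a CES nonresponder
responders = [4, 9, 15, 22, 40, 75, 150, 600]
nonresponders = [3, 6, 10, 18, 35, 90, 260, 1200]
y = responders + nonresponders
x = [0] * len(responders) + [1] * len(nonresponders)

coefficients = {tau: quantile_fit_binary(y, x, tau) for tau in (0.25, 0.5, 0.75)}
for tau, (a, b) in coefficients.items():
    print(f"tau = {tau}: a = {a}, B = {b}")
```

In this made-up data the coefficient B changes sign across quantiles, which is the kind of size-dependent pattern that a single mean comparison would hide.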

28
Distribution of size and the quantile regression
curve
29
Quantile regression using the log of size.
30
MSA percent bias predicted by response rate for
Mining
31
MSA percent bias predicted by response rate for
Food Manufacturing
32
MSA percent bias predicted by response rate for
Retail Trade
33
MSA percent bias predicted by response rate for
Accommodation and Food Services
34
Hing (1987). Nonresponse bias in expense data
from the 1985 national nursing home survey.
Proceedings of the Survey Research Methods
Section, American Statistical Association,
401-405.
  • Purpose: Estimate cost of care in nursing homes
  • Target population: Nursing home facilities in
    U.S.
  • Sample design: Stratified list sample of
    facilities, facilities sampled with probabilities
    proportionate to estimated number of beds, second
    stage sample of residents and staff
  • Mode of data collection: In-person interview of
    facility administrator, with drop-off
    self-administered Expense questionnaire for
    accountant
  • Response rate: Facility questionnaire 93%; Expense
    questionnaire 68% of those responding to Facility
    interview
  • Target estimate: Estimated cost of care
  • Nonresponse error measure: Comparison of Facility
    questionnaire items for respondents and
    nonrespondents of Expense questionnaire

35
Using the Facility Questionnaire to Estimate
Nonresponse Bias based on Participation in
Expense Questionnaire
Tables 2 and 5 from Hing (1987)
36
Conclusions
  • Smaller nursing homes were underrepresented; thus,
    respondent estimates overestimate averages on
    size-related attributes
  • Analysis suggested poststratification by
    ownership type would significantly reduce biases
  • Limitation
  • Nonresponse bias estimate does not reflect
    nonresponse on Facility questionnaire

37
3. Weighting Adjustments
  • Alter estimation weights and compare the
    estimates using the various weights to evaluate
    nonresponse bias. Weighting methods may include
    poststratification, raking, calibration, logistic
    regression, or even imputation.

38
Adjust Weights Using Model of Characteristics
  • Weighting can reduce nonresponse bias if the
    weights are correlated with the estimate.
    Auxiliary data in weighting that are good
    predictors of the characteristic may give
    alternative weights that have less bias. If the
    estimates using the alternative weights do not
    differ from the original estimates, then either
    the nonresponse is not resulting in bias or the
    auxiliary data does not reduce the bias.
  • If the estimates vary by the weighting scheme,
    then the weighting approach should be carefully
    examined and the one most likely to have lower
    nonresponse bias should be used.

39
How to Conduct Nonresponse Bias Analysis Using
Weights From Modeling Characteristics
  • Use a weighting method such as calibration
    estimation with these variables to produce
    alternative weights.
  • Compute the difference between the estimates
    using the alternative weights and the estimates
    from the regular weights as a measure of
    nonresponse bias for the estimate.
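
The two steps above can be sketched with poststratification standing in for the weighting method (a simpler relative of calibration, chosen here only for brevity). The respondent values, strata, regular weights, and control totals are all hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean of the respondent values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def poststratify(weights, strata, control_totals):
    """Scale weights so that each stratum's weights sum to its control total."""
    stratum_sums = {}
    for w, s in zip(weights, strata):
        stratum_sums[s] = stratum_sums.get(s, 0.0) + w
    return [w * control_totals[s] / stratum_sums[s] for w, s in zip(weights, strata)]


# Hypothetical respondents: reported employment, size class, and regular weights
employment = [12, 15, 9, 140, 180]
size_class = ["small", "small", "small", "large", "large"]
regular_weights = [20.0, 20.0, 20.0, 5.0, 5.0]

# Hypothetical frame counts of establishments by size class (the auxiliary data)
control_totals = {"small": 90.0, "large": 10.0}

alternative_weights = poststratify(regular_weights, size_class, control_totals)

regular_estimate = weighted_mean(employment, regular_weights)
alternative_estimate = weighted_mean(employment, alternative_weights)
difference = alternative_estimate - regular_estimate

print(f"regular estimate     = {regular_estimate:.2f}")
print(f"alternative estimate = {alternative_estimate:.2f}")
print(f"difference           = {difference:.2f}")
```

A difference near zero would not by itself rule out bias; it could also mean the auxiliary variable is a poor predictor of the statistic.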

40
Pros and Cons of Comparing Alternative
Estimates Based on Modeling the Estimate
  • Pros
  • If good predictors are available, then it is
    likely that the use of these in the weighting
    will reduce the bias in the statistics being
    evaluated
  • If the differences in the estimates are small, it
    is evidence that nonresponse bias may not be
    large
  • Cons
  • Recomputing weights may be expensive
  • If good correlates are not available then lack of
    differences may be indicator of poor
    relationships rather than the absence of bias
  • The approach is limited to statistics that have
    high correlation with auxiliary data

41
4. Studying Variation Within the Respondents --
Level of Effort
  • Some nonresponse models assume that those units
    that require more effort to respond (more
    callbacks, incentives, refusal conversion) are
    similar to the units that do not respond
  • Characteristics are estimated for respondents by
    level of effort (e.g., response propensity
    scores)
  • Models are fitted to see whether they fit and can
    be used to estimate characteristics of nonrespondents

42
Analyze Level of Effort
  • Associate level of effort data to respondents
    (e.g., number of callbacks, ever refused, early
    or late responder)
  • Compute statistics for each level of effort
    separately (usually unweighted or base weights
    only)
  • If there is a (linear) relationship between level
    of effort and the statistic, then one may decide to
    extrapolate to estimate the statistic for those that
    did not respond
  • Often more appropriate to do the analysis
    separately for major reasons for nonresponse
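
A level-of-effort analysis along these lines might look like the sketch below. It assumes effort is measured by the number of contact attempts, computes an unweighted mean at each level, fits an ordinary least-squares line through the level means, and extrapolates one level beyond the observed maximum as a stand-in for nonrespondents. The data and the choice of extrapolation point are hypothetical, and the result is only as good as the assumption that nonrespondents continue the trend.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Hypothetical reported values grouped by number of contact attempts
values_by_effort = {
    1: [48, 52, 50],
    2: [44, 48],
    3: [42],
}

# Unweighted statistic for each level of effort
levels = sorted(values_by_effort)
level_means = [sum(values_by_effort[k]) / len(values_by_effort[k]) for k in levels]

slope, intercept = fit_line(levels, level_means)

# Extrapolate one step beyond the highest observed level of effort
nonrespondent_level = max(levels) + 1
extrapolated_mean = intercept + slope * nonrespondent_level

print(f"means by level of effort = {level_means}")
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"extrapolated nonrespondent mean = {extrapolated_mean:.2f}")
```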

43
Pros and Cons of Using Level of Effort Analysis
to Estimate Bias
  • Pros
  • Simple to do, provided data collection systems
    capture the pertinent information
  • In some surveys may provide a reasonable
    indicator of the magnitude and direction of
    nonresponse bias
  • Cons
  • Highly dependent on model assumptions that have
    not been validated in many applications
  • Difficult to extrapolate to produce estimates of
    nonresponse bias without other data

44
Brick et al. (2007). Relationship between length
of data collection period, field costs, and data
quality. Paper presented at ICES III, Montreal,
Canada.
  • Purpose: Estimate amount and condition of
    research space for science and engineering
  • Target population: Colleges and universities
  • Sample design: Census
  • Mode of data collection: Web and mail
  • Response rate: 94%
  • Target estimate: Square feet
  • Nonresponse error measure: Relative bias when
    level of effort is cut by 25%

45
Relative bias for Facilities Survey estimates for
academic institutions, by field and response level
46
Summary
  • The patterns are not consistent
  • The 75% response level generally exhibits larger
    bias than the other response levels, but
    generally not statistically significant
  • No significant bias if data collection were
    terminated at the 88% response level

47
5. Followup of Nonrespondents
  • Use of respondent data obtained through
    extraordinary efforts as comparison to
    respondent data obtained with traditional efforts
  • Effort may include callbacks, incentives,
    change of mode, use of elite corps of interviewers

48
How to Do a Nonrespondent Followup Study
  • Define a set of recruitment techniques judged to
    be superior to those in the ongoing effort
  • Determine whether budget permits use of those
    techniques on all remaining active cases
  • If not, implement 2nd phase sample (described
    later)
  • Implement enhanced recruitment protocol
  • Compare respondents obtained in enhanced protocol
    with those in the initial protocol

49
Pros and Cons of Nonresponse Followup Study
  • Pros
  • Direct measures are obtained from previously
    nonrespondent cases
  • Same measurements are used
  • Nonresponse bias on all variables can be
    estimated
  • Cons
  • Rarely are followup response rates 100%
  • Requires extended data collection period

50
Marker et al. (2005). Terrorism Risk Insurance
Program Policyholders Survey. Final report
prepared for the Department of Treasury. Westat,
Rockville, MD.
  • Purpose: Estimate use of Terrorism Insurance by
    Businesses
  • Target population: Businesses and state/local
    government offices in the U.S.
  • Sample design: Stratified sample
  • Mode of data collection: Web and mail
  • Response rate: 17%
  • Target estimate: Estimated percent that have
    insurance
  • Nonresponse error measure: Indicators of use of
    terrorism insurance

51
Follow-up Procedures
  • Conducted a follow-up of 1,000 non-respondents to the
    survey.
  • A shortened instrument was used to collect
    critical measures
  • Interviews conducted by telephone

52
Selected comparisons from non-response follow-up
53
Summary
  • Out of 8 estimates, one was statistically
    significant
  • Some indication that non-response leads to
    overestimate of certain types of insurance
  • The significant difference may have been due to
    measurement error on the follow-up instrument

54
Five Things You Should Remember from this Lecture
  • The three principal types of nonresponse bias
    studies are:
  • - Comparing surveys to external data,
  • - Studying internal variation within the data
    collection, and
  • - Contrasting alternative postsurvey adjusted
    estimates
  • All three have strengths and weaknesses; using
    multiple approaches simultaneously provides
    greater understanding
  • Nonresponse bias is specific to a statistic, so
    separate assessments may be needed for different
    estimates
  • Auxiliary variables correlated with both the
    likelihood of responding and key survey variables
    are important for evaluation
  • Thinking about nonresponse before the survey is
    important because different modes, frames, and
    survey designs permit different types of studies