Partially missing at random and ignorable inferences for parameter subsets with missing data - PowerPoint PPT Presentation

Title: Statistical Analysis of Repeated-Measures Data with Dropouts Author: Preferred Customer Last modified by: School of Public Health Created Date – PowerPoint PPT presentation

Slides: 32
Provided by: Preferred99

1
Partially missing at random and ignorable
inferences for parameter subsets with missing data
  • Roderick Little

2
Outline
  • Survey Bayesics in three slides
  • Inference with missing data: Rubin's (1976) paper
    on conditions for ignoring the missing-data
    mechanism
  • Rubin's standard conditions are sufficient but
    not necessary: example
  • Propose definitions of MAR, ignorability for
    likelihood (and Bayes) inference for subsets of
    parameters
  • Examples
  • Joint work with Sahar Zangeneh

3
Calibrated Bayes
  • Frequentists should be Bayesian
  • Bayes is optimal under assumed model
  • Bayesians should be frequentist
  • We never know the model (and all models are
    wrong)
  • Inferences should have good repeated sampling
    characteristics
  • Calibrated Bayes (e.g. Box 1980, Rubin 1984,
    Little 2012)
  • Inference based on a Bayesian model
  • Model chosen to yield inferences that are
    well-calibrated in a frequentist sense
  • Aim for posterior probability intervals that have
    (approximately) nominal frequentist coverage
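
The calibration aim in the last bullet can be checked by simulation. The following is a minimal sketch, assuming a normal model with known standard deviation and a flat prior on the mean (an illustrative setup, not taken from the slides); the function names and default settings are hypothetical.

```python
import random
import statistics


def posterior_interval(sample, sigma, z=1.96):
    """95% posterior interval for a normal mean with known sigma.

    Assumes a flat prior on the mean, so the posterior is
    N(sample mean, sigma^2 / n).
    """
    n = len(sample)
    center = statistics.fmean(sample)
    half_width = z * sigma / n ** 0.5
    return center - half_width, center + half_width


def coverage(true_mean=2.0, sigma=1.0, n=20, reps=4000, seed=1):
    """Fraction of repeated samples whose posterior interval covers true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lo, hi = posterior_interval(sample, sigma)
        hits += lo <= true_mean <= hi
    return hits / reps


if __name__ == "__main__":
    print(f"frequentist coverage of the 95% posterior interval: {coverage():.3f}")
```

In this simple model the posterior interval coincides with the classical confidence interval, so the simulated coverage should sit close to the nominal 95%. A misspecified model would generally show up as coverage that drifts away from nominal.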

4
Calibrated Bayes models for surveys should
incorporate sample design features
  • All models are wrong, some models are useful
  • Design-assisted: make the estimator more robust
  • Calibrated Bayes: make the model more robust;
    many models yield design-consistent estimates
  • Models that ignore features like survey weights
    are vulnerable to misspecification
  • But models can be successfully applied in survey
    setting, with attention to design features
  • Weighting, stratification, clustering
  • Capture design weights as covariates in the
    prediction model (e.g. Gelman 2007)

5
Benefits of Bayes
  • Unified approach to all problems
  • Avoids current approach -- inferential
    schizophrenia
  • Not asymptotic
  • Propagates errors in estimating parameters
  • Avoids frequentist pitfalls
  • Conditions on ancillaries
  • Obeys likelihood principle

6
(No Transcript)
7
There are those who predict
and those who weight
8
Rubin (1976 Biometrika)
  • Landmark paper (3700 citations, after being
    rejected by many journals!)
  • RL wrote his first (11 page) referee report, and
    an obscure discussion
  • Modeled the missing data mechanism by treating
    missingness indicators as random variables,
    assigning them a distribution
  • Sufficient conditions under which missing data
    mechanism can be ignored for likelihood and
    frequentist inference about parameters
  • Focus here on likelihood, Bayes

9
Ignoring the mechanism
  • Full likelihood
  • Likelihood ignoring mechanism
  • Missing data mechanism can be ignored for
    likelihood inference when
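
The formulas for this slide are not in the transcript. As a hedged sketch, the standard expressions (following Little and Rubin 2002) can be written as below. The notation is an assumption and may differ from the slide's: D = (D_obs, D_mis) is the data, M the missingness indicators, θ the parameters of the data model, and ψ the parameters of the missing-data mechanism.

```latex
% Full likelihood: models both the data and the missingness indicators
L_{\text{full}}(\theta, \psi \mid D_{\text{obs}}, M) \propto
  \int f(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\,
       f(M \mid D_{\text{obs}}, D_{\text{mis}}, \psi)\, dD_{\text{mis}}

% Likelihood ignoring the mechanism: models the data only
L_{\text{ign}}(\theta \mid D_{\text{obs}}) \propto
  \int f(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\, dD_{\text{mis}}

% The mechanism can be ignored for likelihood inference about theta when
% likelihood ratios agree for all theta, theta^*, psi
\frac{L_{\text{full}}(\theta, \psi \mid D_{\text{obs}}, M)}
     {L_{\text{full}}(\theta^{*}, \psi \mid D_{\text{obs}}, M)}
  = \frac{L_{\text{ign}}(\theta \mid D_{\text{obs}})}
         {L_{\text{ign}}(\theta^{*} \mid D_{\text{obs}})}
```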

10
Rubin's sufficient conditions for ignoring the
mechanism
  • Missing data mechanism can be ignored for
    likelihood inference when
  • (a) the missing data are missing at random (MAR)
  • (b) distinctness of the parameters of the data
    model and the missing-data mechanism
  • MAR is the key condition; without (b), inferences
    are valid but not fully efficient
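
A small simulation can illustrate condition (a). This is a minimal sketch under assumptions that are not in the slides: bivariate normal data with Y1 always observed, Y2 observed with a probability that depends only on Y1 (so the data are MAR), and a true mean of Y2 equal to 1. The ML estimate that ignores the mechanism uses the regression of Y2 on Y1 from the complete cases; all function names are hypothetical.

```python
import math
import random
import statistics


def simulate(n=20000, seed=7):
    """Generate (y1, y2, observed) with Y2 missing at random given Y1."""
    rng = random.Random(seed)
    y1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y2 = [1.0 + a + rng.gauss(0.0, 1.0) for a in y1]  # true mean of Y2 is 1
    # MAR: the chance that Y2 is observed depends only on the observed Y1
    observed = [rng.random() < 1.0 / (1.0 + math.exp(-a)) for a in y1]
    return y1, y2, observed


def estimates(y1, y2, observed):
    """Return (complete-case mean of Y2, ML mean of Y2 ignoring the mechanism)."""
    cc1 = [a for a, r in zip(y1, observed) if r]
    cc2 = [b for b, r in zip(y2, observed) if r]
    m1, m2 = statistics.fmean(cc1), statistics.fmean(cc2)
    # Least-squares slope of Y2 on Y1 among the complete cases
    slope = (sum((a - m1) * (b - m2) for a, b in zip(cc1, cc2))
             / sum((a - m1) ** 2 for a in cc1))
    # Regression (ML) estimate: shift the complete-case mean using all Y1 values
    ml_mean = m2 + slope * (statistics.fmean(y1) - m1)
    return m2, ml_mean


if __name__ == "__main__":
    cc_mean, ml_mean = estimates(*simulate())
    print(f"complete-case mean of Y2: {cc_mean:.3f}")
    print(f"ML mean ignoring the mechanism: {ml_mean:.3f}")
```

In this setup the complete-case mean is biased upward because units with larger Y1 respond more often, while the ML estimate that ignores the mechanism should land close to the true value of 1.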

11
Sufficient for ignorable is not the same as
ignorable
  • These definitions have come to define
    ignorability (e.g. Little and Rubin 2002)
  • However, Rubin (1976) described (a) and (b) as
    the "weakest simple and general conditions under
    which it is always appropriate to ignore the
    process that causes missing data".
  • These conditions are not necessary for ignoring
    the mechanism in all situations.

12
Example 1: Nonresponse with auxiliary data
(Figure: missing-data pattern diagram; labels "Or whole population N" and "Not linked")
13
MAR, ignorability for parameter subsets
  • MAR and ignorability are defined in terms of the
    complete set of parameters in the data model for
    D
  • It would be useful to have a definition of MAR
    that applies to subsets of parameters, including
    parameters of substantive interest.
  • A trivial example: it seems plausible that a
    nonignorable mechanism would be MAR for the
    parameters of distributions of variables that are
    not missing.
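
The trivial example can be sketched numerically. The setup below is an illustrative assumption, not the author's: Y1 is fully observed, Y2 is correlated with Y1, and the chance that Y2 is observed depends on Y2 itself (a nonignorable mechanism). Both variables have true mean 0; the function names are hypothetical.

```python
import math
import random
import statistics


def simulate_mnar(n=20000, seed=11):
    """Generate (y1, y2, observed) where missingness of Y2 depends on Y2."""
    rng = random.Random(seed)
    y1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y2 = [0.5 * a + rng.gauss(0.0, 1.0) for a in y1]
    # MNAR: the chance that Y2 is observed depends on the value of Y2
    observed = [rng.random() < 1.0 / (1.0 + math.exp(-b)) for b in y2]
    return y1, y2, observed


def summarize(y1, y2, observed):
    """Compare estimates that use all units with complete-case estimates."""
    return {
        "mean_y1_all_units": statistics.fmean(y1),
        "mean_y1_complete_cases": statistics.fmean(
            [a for a, r in zip(y1, observed) if r]),
        "mean_y2_complete_cases": statistics.fmean(
            [b for b, r in zip(y2, observed) if r]),
    }


if __name__ == "__main__":
    for name, value in summarize(*simulate_mnar()).items():
        print(f"{name}: {value:.3f}")
```

Here the mean of Y1 computed from all units is unaffected by the mechanism, whereas the complete-case means of both Y2 and Y1 are pulled away from 0. Inference about the marginal distribution of the fully observed variable can ignore the mechanism; inference about Y2 cannot.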

14
MAR, ignorability for parameter subsets
15
MAR, ignorability for parameter subsets
16
Partial MAR given a function of mechanism
17
Example 1: Auxiliary Survey Data
(Figure: missing-data pattern diagram; label "Not linked")
18
Ex. 2: MNAR Monotone Bivariate Data
  • Paper presents more interesting case with Y1, Y2
    blocks of variables and missing data in each
    block

(Figure: missing-data pattern diagram)
19
More generally
20
Ex. 3: Complete Case Analysis in Regression
(Figure: missing-data pattern diagram)
21
Ex. 4: A normal pattern-mixture model
(Figure: missing-data pattern diagram)
22
Ex. 5: Subsample ignorable likelihood
Little and Zhang (2011). Columns could be vectors.
Key: v = fully observed; ? = observed or missing

Pattern   Z   W   X   Y
P1        v   v   ?   ?
P2        v   ?   ?   ?
  • Interest concerns parameters of regression
    of Y on (Z,X,W)
  • Z complete, W and (X,Y) incomplete. W complete in
    P1.
  • Division of covariates into W, X is based on the
    following MNAR assumptions about the missing-data
    mechanism:
  • Pr(W complete) = fn(W, X, Z) (not Y)
  • (X,Y) MAR in the subsample with W fully
    observed (that is, P1)

23
Ex. 6: Auxiliary data, survey nonresponse
(Figure: missing-data pattern diagram; units indexed 1, ..., r, ..., n, ..., N; label "Not linked")
24
Simulation Study
25
Simulation Study: methods
  • CC: complete-case estimates based on the
    responding units
  • M1: ML based on a logistic regression with
    interaction for Y3
  • M2: ML based on an additive logistic regression
    for Y3
  • NR: weighting class estimates, where nonresponse
    weights are obtained based on Y1
  • PS: post-stratification weighted estimates
    based on Y2
  • NRPS: adjust weights using both Y1 and Y2. For
    the case of a categorical variable, this method
    is equivalent to Linear Calibration regression,
    or Generalized Raking estimates
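
The two weighting methods can be sketched as follows. This is a minimal illustration under standard textbook definitions, not the code used in the simulation study: the weighting class estimate gives each respondent in class c the weight n_c / r_c (sampled units over respondents), and the post-stratified estimate gives each respondent in stratum h the weight N_h / r_h (known population count over respondents). The function names and toy data are hypothetical.

```python
from collections import Counter


def weighting_class_mean(classes, responded, y):
    """Weighting class estimate of a mean.

    classes:   adjustment class of every sampled unit (e.g. levels of Y1)
    responded: True where the unit responded
    y:         outcome values; entries for nonrespondents are never read
    """
    n_c = Counter(classes)                                   # sampled per class
    r_c = Counter(c for c, r in zip(classes, responded) if r)  # respondents
    num = den = 0.0
    for c, r, value in zip(classes, responded, y):
        if r:
            weight = n_c[c] / r_c[c]
            num += weight * value
            den += weight
    return num / den


def post_stratified_mean(strata, y, pop_counts):
    """Post-stratified mean from respondent records only.

    strata:     post-stratum of each respondent (e.g. levels of Y2)
    pop_counts: known population count N_h for each post-stratum
    """
    r_h = Counter(strata)
    weights = [pop_counts[h] / r_h[h] for h in strata]
    return sum(w * value for w, value in zip(weights, y)) / sum(weights)


if __name__ == "__main__":
    classes = ["a", "a", "a", "a", "b", "b"]
    responded = [True, True, False, False, True, True]
    y = [1.0, 1.0, None, None, 3.0, 3.0]
    print(weighting_class_mean(classes, responded, y))
    print(post_stratified_mean(["x", "x", "y"], [1.0, 3.0, 10.0],
                               {"x": 50, "y": 50}))
```

In the toy data, class "a" has half its units responding, so its respondents are weighted up by 2 and the estimate moves below the unweighted respondent mean of 2.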

26
(No Transcript)
27
Simulation summary findings
  • When response depends on the Y1 × Y2 interaction,
    all methods do poorly
  • When data are MCAR, all methods do similarly well
  • Model-based methods remove almost all the bias
    and perform better when response does not depend
    on the Y1 × Y2 interaction
  • Qualitative patterns hold for different sample
    sizes

28
Frequentist inference
  • Rubin's (1976) sufficient conditions for
    ignorability for frequentist inference were even
    stronger (essentially MCAR)
  • These can be weakened too; for example,
    asymptotic frequentist inference based on ML and
    the observed information matrix works under the
    conditions given here
  • Small sample inference seems more problematic

29
Frequentist inference
  • Rubin's (1976) sufficient conditions for
    ignorability for frequentist inference were even
    stronger (essentially MCAR)
  • These can be weakened too; for example,
    asymptotic frequentist inference based on ML and
    the observed information matrix works under the
    conditions given here
  • Small sample inference is more complex

30
Summary
  • Proposed definitions of partial MAR, ignorability
    for subsets of parameters
  • Expands range of situations where missing data
    mechanism can be ignored
  • Though, in some cases, MAR analysis entails a
    loss of information
  • How much is lost is an interesting question,
    varies by context

31
References
  • Harel, O. and Schafer, J.L. (2009). Partial and
    Latent Ignorability in missing data problems.
    Biometrika, 2009, 1-14
  • Little, R.J.A. (1993). Pattern-Mixture Models for
    Multivariate Incomplete Data. JASA, 88, 125-134.
  • Little, R. J. A., and Rubin, D. B. (2002).
    Statistical Analysis with Missing Data (2nd ed.)
    Wiley.
  • Little, R.J. and Zangeneh, S.Z. (2013). Missing
    at random and ignorability for inferences about
    subsets of parameters with missing data.
    University of Michigan Biostatistics Working
    Paper Series.
  • Little, R. J. and Zhang, N. (2011). Subsample
    ignorable likelihood for regression analysis with
    missing data. JRSSC, 60, 4, 591-605.
  • Rubin, D. B. (1976). Inference and Missing Data.
    Biometrika 63, 581-592.