Title: Partially missing at random and ignorable inferences for parameter subsets with missing data
1 Partially missing at random and ignorable inferences for parameter subsets with missing data
2 Outline
- Survey Bayesics in three slides
- Inference with missing data: Rubin's (1976) paper on conditions for ignoring the missing-data mechanism
- Rubin's standard conditions are sufficient but not necessary: example
- Propose definitions of MAR, ignorability for likelihood (and Bayes) inference for subsets of parameters
- Examples
- Joint work with Sahar Zangeneh
3 Calibrated Bayes
- Frequentists should be Bayesian
- Bayes is optimal under assumed model
- Bayesians should be frequentist
- We never know the model (and all models are wrong)
- Inferences should have good repeated sampling characteristics
- Calibrated Bayes (e.g. Box 1980, Rubin 1984, Little 2012)
- Inference based on a Bayesian model
- Model chosen to yield inferences that are well-calibrated in a frequentist sense
- Aim for posterior probability intervals that have (approximately) nominal frequentist coverage
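The calibration idea on this slide can be checked by simulation. The sketch below is illustrative only and is not taken from the talk: it assumes a normal model with known standard deviation and a flat prior on the mean, so the posterior is N(ybar, sigma^2 / n), and it estimates how often the 95% posterior interval covers the true mean in repeated sampling. The function names and default values are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def posterior_interval(sample, sigma, level=0.95):
    """Posterior interval for a normal mean, known sigma, flat prior (assumed model).

    Under these assumptions the posterior is N(ybar, sigma^2 / n).
    """
    n = len(sample)
    ybar = statistics.fmean(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / n ** 0.5
    return ybar - half_width, ybar + half_width


def coverage(true_mean=2.0, sigma=1.0, n=20, reps=2000, seed=1):
    """Fraction of repeated samples whose posterior interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lo, hi = posterior_interval(sample, sigma)
        hits += lo <= true_mean <= hi
    return hits / reps


if __name__ == "__main__":
    # A well-calibrated model should give a value close to the nominal 0.95.
    print(coverage())
```

In this simple case the model is correct by construction, so coverage should be near nominal; the more interesting use of such a check is to rerun it with data generated from a model other than the one assumed.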
4 Calibrated Bayes models for surveys should incorporate sample design features
- All models are wrong, some models are useful
- Design-assisted: make the estimator more robust
- Calibrated Bayes: make the model more robust; many models yield design-consistent estimates
- Models that ignore features like survey weights are vulnerable to misspecification
- But models can be successfully applied in survey setting, with attention to design features
- Weighting, stratification, clustering
- Capture design weights as covariates in the prediction model (e.g. Gelman 2007)
5 Benefits of Bayes
- Unified approach to all problems
- Avoids current approach -- inferential schizophrenia
- Not asymptotic
- Propagates errors in estimating parameters
- Avoids frequentist pitfalls
- Conditions on ancillaries
- Obeys likelihood principle
7 There are those who predict and those who weight
8 Rubin (1976 Biometrika)
- Landmark paper (3700 citations, after being rejected by many journals!)
- RL wrote his first (11-page) referee report, and an obscure discussion
- Modeled the missing data mechanism by treating missingness indicators as random variables, assigning them a distribution
- Sufficient conditions under which missing data mechanism can be ignored for likelihood and frequentist inference about parameters
- Focus here on likelihood, Bayes
9 Ignoring the mechanism
- Full likelihood
- Likelihood ignoring mechanism
- Missing data mechanism can be ignored for
likelihood inference when
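The formulas on this slide did not survive in the text. As a hedged reconstruction in standard notation (assumed here, not copied from the slide), let D = (D_obs, D_mis) be the data, R the missingness indicators, θ the parameters of the data model and ψ the parameters of the mechanism. The two likelihoods are then usually written as:

```latex
\begin{align*}
L_{\text{full}}(\theta, \psi \mid D_{\text{obs}}, R)
  &\propto \int f(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\,
           f(R \mid D_{\text{obs}}, D_{\text{mis}}, \psi)\, dD_{\text{mis}}, \\
L_{\text{ign}}(\theta \mid D_{\text{obs}})
  &\propto \int f(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\, dD_{\text{mis}}.
\end{align*}
```

One common way to state the condition for ignoring the mechanism is that likelihood ratios for θ agree between the two:

```latex
\[
\frac{L_{\text{full}}(\theta, \psi \mid D_{\text{obs}}, R)}
     {L_{\text{full}}(\theta^{*}, \psi \mid D_{\text{obs}}, R)}
=
\frac{L_{\text{ign}}(\theta \mid D_{\text{obs}})}
     {L_{\text{ign}}(\theta^{*} \mid D_{\text{obs}})}
\quad \text{for all } \theta, \theta^{*}, \psi .
\]
```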
10 Rubin's sufficient conditions for ignoring the mechanism
- Missing data mechanism can be ignored for likelihood inference when
- (a) the missing data are missing at random (MAR)
- (b) distinctness of the parameters of the data model and the missing-data mechanism
- MAR is the key condition: without (b), inferences are valid but not fully efficient
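A short worked version of why (a) and (b) suffice may help. The notation is assumed for illustration and is not from the slide: D = (D_obs, D_mis) is the data, R the missingness indicators, θ the data-model parameters, ψ the mechanism parameters, and Ω denotes a parameter space.

```latex
\begin{align*}
\text{(a) MAR:}\quad
  & f(R \mid D_{\text{obs}}, D_{\text{mis}}, \psi) = f(R \mid D_{\text{obs}}, \psi)
    \quad \text{for all } D_{\text{mis}},\ \psi, \\
\text{(b) distinctness:}\quad
  & (\theta, \psi) \in \Omega_{\theta} \times \Omega_{\psi}, \\
\text{under (a):}\quad
  & \int f(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\,
         f(R \mid D_{\text{obs}}, D_{\text{mis}}, \psi)\, dD_{\text{mis}}
    = f(R \mid D_{\text{obs}}, \psi)
      \int f(D_{\text{obs}}, D_{\text{mis}} \mid \theta)\, dD_{\text{mis}} .
\end{align*}
```

Under (a) the full likelihood is the likelihood ignoring the mechanism multiplied by a factor that does not involve θ. Condition (b) adds that the allowed values of ψ place no restriction on θ, so that factor carries no information about θ.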
11 Sufficient for ignorable is not the same as ignorable
- These definitions have come to define ignorability (e.g. Little and Rubin 2002)
- However, Rubin (1976) described (a) and (b) as the "weakest simple and general conditions under which it is always appropriate to ignore the process that causes missing data".
- These conditions are not necessary for ignoring the mechanism in all situations.
12 Example 1: Nonresponse with auxiliary data
[Figure: missing-data pattern diagram; surviving labels: "Or whole population N", "Not linked"]
13 MAR, ignorability for parameter subsets
- MAR and ignorability are defined in terms of the complete set of parameters in the data model for D
- It would be useful to have a definition of MAR that applies to subsets of parameters, including parameters of substantive interest.
- A trivial example: it seems plausible that a nonignorable mechanism would be MAR for the parameters of distributions of variables that are not missing.
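The trivial example can be illustrated with a small simulation. This is a sketch under assumptions that are not in the talk: Y1 is fully observed, Y2 is missing with a probability that depends on Y2 itself (so the mechanism is MNAR), and both variables have true mean 0. The model, the logistic response function and the function name are all hypothetical.

```python
import math
import random
import statistics


def simulate(n=20000, seed=7):
    """Return (mean of Y1 over all units, mean of Y2 over units with Y2 observed)."""
    rng = random.Random(seed)
    y1_all, y2_observed = [], []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = 0.5 * y1 + rng.gauss(0.0, 1.0)
        # MNAR: the chance that Y2 is observed depends on the value of Y2 itself.
        p_observed = 1.0 / (1.0 + math.exp(-y2))
        y1_all.append(y1)  # Y1 is never missing
        if rng.random() < p_observed:
            y2_observed.append(y2)
    return statistics.fmean(y1_all), statistics.fmean(y2_observed)


if __name__ == "__main__":
    mean_y1, mean_y2_observed = simulate()
    # The Y1 mean uses every unit, so the mechanism does not distort it;
    # the observed-case Y2 mean is pulled upward by the MNAR selection.
    print(mean_y1, mean_y2_observed)
```

The estimate for the marginal mean of Y1 stays close to its true value even though the mechanism is nonignorable for the full parameter set, which is the intuition behind defining MAR for a subset of parameters.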
14 MAR, ignorability for parameter subsets
15 MAR, ignorability for parameter subsets
16 Partial MAR given a function of mechanism
17 Example 1: Auxiliary Survey Data
[Figure: missing-data pattern diagram; surviving label: "Not linked"]
18 Ex. 2: MNAR Monotone Bivariate Data
- Paper presents more interesting case with Y1, Y2 blocks of variables and missing data in each block
[Figure: missing-data pattern diagram]
19 More generally
20 Ex. 3: Complete Case Analysis in Regression
[Figure: missing-data pattern diagram]
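The detail of this slide is not in the text. A standard result that fits its title is that complete-case regression of Y on X stays valid when the probability of being a complete case depends on the covariate X but not on the outcome Y, and generally fails when it depends on Y. The sketch below illustrates that contrast under assumptions of my own: a simple linear model with intercept 1 and slope 2, a logistic selection function, and hypothetical function names.

```python
import math
import random


def ols_slope_intercept(xs, ys):
    """Least-squares slope and intercept for a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def complete_case_fit(n=20000, seed=11, depends_on="x"):
    """Fit Y on X using complete cases only.

    depends_on="x": a unit is complete with probability depending on X only.
    depends_on="y": a unit is complete with probability depending on Y.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # assumed true model
        driver = x if depends_on == "x" else y
        if rng.random() < 1.0 / (1.0 + math.exp(-driver)):
            xs.append(x)
            ys.append(y)
    return ols_slope_intercept(xs, ys)


if __name__ == "__main__":
    print("selection on X:", complete_case_fit(depends_on="x"))
    print("selection on Y:", complete_case_fit(depends_on="y"))
```

With selection on X the fitted coefficients should sit close to the assumed values; with selection on Y they should drift away from them.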
21 Ex. 4: A normal pattern-mixture model
[Figure: missing-data pattern diagram]
22 Ex. 5: Subsample ignorable likelihood
Little and Zhang (2011). Columns could be vectors.
Key: v = fully observed; ? = observed or missing
Pattern  Z  W  X  Y
P1       v  v  ?  ?
P2       v  ?  ?  ?
- Interest concerns parameters of regression of Y on (Z, X, W)
- Z complete, W and (X, Y) incomplete. W complete in P1.
- Division of covariates into W, X is based on the following MNAR assumptions about the missing-data mechanism:
- Pr(W complete) = fn(W, X, Z) (not Y)
- (X, Y) MAR in subsample with W fully observed (that is, P1)
23 Ex. 6: Auxiliary data, survey nonresponse
[Figure: missing-data pattern diagram; units indexed 1, ..., r, ..., n, ..., N; surviving label: "Not linked"]
24 Simulation Study
25 Simulation Study: methods
- CC: Complete Case estimates based on the responding units
- M1: ML based on a logistic regression with interaction for Y3
- M2: ML based on an additive logistic regression for Y3
- NR: Weighting class estimates where nonresponse weights are obtained based on Y1
- PS: Post-stratification weighted estimates based on Y2
- NRPS: Adjust weights using both Y1 and Y2. For the case of a categorical variable, this method is equivalent to Linear Calibration regression, or Generalized Raking estimates
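The CC, NR and PS estimators in this list can be sketched with one small helper. This is a minimal illustration with made-up numbers, not the simulation from the talk: respondents are grouped into classes, each class mean is weighted by a class total, and the totals are sample counts for weighting-class (NR) adjustment or population counts for post-stratification (PS). All names and data are hypothetical.

```python
def complete_case_mean(respondent_y):
    """CC: unweighted mean of the outcome over all respondents."""
    values = [y for ys in respondent_y.values() for y in ys]
    return sum(values) / len(values)


def class_weighted_mean(respondent_y, totals):
    """Weight each class's respondent mean by that class's total count.

    totals = sampled units per class      -> weighting-class (NR) estimate
    totals = population units per class   -> post-stratified (PS) estimate
    """
    grand_total = sum(totals[c] for c in respondent_y)
    weighted_sum = sum(
        totals[c] * sum(ys) / len(ys) for c, ys in respondent_y.items()
    )
    return weighted_sum / grand_total


# Hypothetical data: outcomes for respondents, grouped by class.
respondents = {"a": [1, 1], "b": [0, 0, 0, 1]}
sample_counts = {"a": 6, "b": 4}          # class a: 2 of 6 sampled units responded
population_counts = {"a": 500, "b": 500}  # assumed known population totals

if __name__ == "__main__":
    print(complete_case_mean(respondents))                      # 0.5
    print(class_weighted_mean(respondents, sample_counts))      # 0.7
    print(class_weighted_mean(respondents, population_counts))  # 0.625
```

Class a responds at a lower rate and has a higher outcome mean, so the complete-case mean understates the weighted estimates; in the slide's setup the NR classes would be formed from Y1 and the PS strata from Y2.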
27 Simulation summary: findings
- When response depends on Y1 Y2 interaction, all methods do poorly
- When data are MCAR, all methods do similarly well
- Model-based methods remove almost all the bias and perform better when response doesn't depend on Y1 Y2 interaction
- Qualitative patterns hold for different sample sizes
28 Frequentist inference
- Rubin's (1976) sufficient conditions for ignorability for frequentist inference were even stronger (essentially MCAR)
- These can be weakened too: for example, asymptotic frequentist inference based on ML and observed information matrix works under conditions given here
- Small sample inference seems more problematic
29 Frequentist inference
- Rubin's (1976) sufficient conditions for ignorability for frequentist inference were even stronger (essentially MCAR)
- These can be weakened too: for example, asymptotic frequentist inference based on ML and observed information matrix works under conditions given here
- Small sample inference is more complex
30 Summary
- Proposed definitions of partial MAR, ignorability for subsets of parameters
- Expands range of situations where missing data mechanism can be ignored
- Though, in some cases, MAR analysis entails a loss of information
- How much is lost is an interesting question, varies by context
31 References
- Harel, O. and Schafer, J.L. (2009). Partial and Latent Ignorability in missing data problems. Biometrika, 2009, 1-14.
- Little, R.J.A. (1993). Pattern-Mixture Models for Multivariate Incomplete Data. JASA, 88, 125-134.
- Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data (2nd ed.). Wiley.
- Little, R.J. and Zangeneh, S.Z. (2013). Missing at random and ignorability for inferences about subsets of parameters with missing data. University of Michigan Biostatistics Working Paper Series.
- Little, R.J. and Zhang, N. (2011). Subsample ignorable likelihood for regression analysis with missing data. JRSSC, 60, 4, 591-605.
- Rubin, D.B. (1976). Inference and Missing Data. Biometrika, 63, 581-592.