Partially missing at random and ignorable inferences for parameter subsets with missing data

Description:

Title: Statistical Analysis of Repeated-Measures Data with Dropouts Author: Preferred Customer Last modified by: School of Public Health Created Date – PowerPoint PPT presentation

Slides: 32

Provided by: Preferred99

Learn more at: https://www.stat.colostate.edu

1
Partially missing at random and ignorable
inferences for parameter subsets with missing data

Roderick Little

2
Outline

Survey Bayesics in three slides
Inference with missing data: Rubin's (1976) paper
on conditions for ignoring the missing-data
mechanism
Rubin's standard conditions are sufficient but
not necessary: an example
Propose definitions of MAR, ignorability for
likelihood (and Bayes) inference for subsets of
parameters
Examples
Joint work with Sahar Zangeneh

3
Calibrated Bayes

Frequentists should be Bayesian
Bayes is optimal under assumed model
Bayesians should be frequentist
We never know the model (and all models are
wrong)
Inferences should have good repeated sampling
characteristics
Calibrated Bayes (e.g. Box 1980, Rubin 1984,
Little 2012)
Inference based on a Bayesian model
Model chosen to yield inferences that are
well-calibrated in a frequentist sense
Aim for posterior probability intervals that have
(approximately) nominal frequentist coverage
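
As a rough illustration of what "well-calibrated" means, the sketch below checks the repeated-sampling coverage of a 95% posterior interval by simulation. The model (normal data with known variance and a flat prior on the mean) and all names and values (`coverage_of_posterior_interval`, `mu_true`, `n`, `reps`) are assumptions chosen for illustration, not taken from the talk.

```python
import numpy as np


def coverage_of_posterior_interval(mu_true=2.0, sigma=1.0, n=20,
                                   reps=20000, seed=0):
    """Estimate the frequentist coverage of a 95% posterior interval.

    Assumed model: y_i ~ N(mu, sigma^2) with sigma known and a flat prior
    on mu, so the posterior is mu | y ~ N(ybar, sigma^2 / n).
    """
    rng = np.random.default_rng(seed)
    z = 1.959963984540054  # 97.5% quantile of the standard normal
    # Draw `reps` independent samples of size n from the assumed true model.
    y = rng.normal(mu_true, sigma, size=(reps, n))
    ybar = y.mean(axis=1)
    half_width = z * sigma / np.sqrt(n)
    # Fraction of repeated samples whose posterior interval covers mu_true.
    covered = (ybar - half_width <= mu_true) & (mu_true <= ybar + half_width)
    return covered.mean()


coverage = coverage_of_posterior_interval()
print(f"estimated coverage of the 95% posterior interval: {coverage:.3f}")
```

In this simple model the posterior interval coincides with the classical confidence interval, so the estimated coverage should be close to the nominal 95%. In less tidy models, the same kind of check is one way to judge whether a Bayesian model is calibrated.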

4
Calibrated Bayes models for surveys should
incorporate sample design features

All models are wrong, some models are useful
Design-assisted: make the estimator more robust
Calibrated Bayes: make the model more robust
Many models yield design-consistent estimates
Models that ignore features like survey weights
are vulnerable to misspecification
But models can be successfully applied in survey
setting, with attention to design features
Weighting, stratification, clustering
Capture design weights as covariates in the
prediction model (e.g. Gelman 2007)

5
Benefits of Bayes

Unified approach to all problems
Avoids current approach -- inferential
schizophrenia
Not asymptotic
Propagates errors in estimating parameters
Avoids frequentist pitfalls
Conditions on ancillaries
Obeys likelihood principle

6
(No Transcript)
7
There are those who predict
and those who weight
8
Rubin (1976 Biometrika)

Landmark paper (3700 citations, after being
rejected by many journals!)
RL wrote his first (11 page) referee report, and
an obscure discussion
Modeled the missing data mechanism by treating
missingness indicators as random variables,
assigning them a distribution
Sufficient conditions under which missing data
mechanism can be ignored for likelihood and
frequentist inference about parameters
Focus here on likelihood, Bayes

9
Ignoring the mechanism

Full likelihood
Likelihood ignoring mechanism
Missing data mechanism can be ignored for
likelihood inference when

10
Rubin's sufficient conditions for ignoring the
mechanism

Missing data mechanism can be ignored for
likelihood inference when
(a) the missing data are missing at random (MAR)
(b) distinctness of the parameters of the data
model and the missing-data mechanism
MAR is the key condition: without (b), inferences
are valid but not fully efficient
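
Conditions (a) and (b) can be sketched in symbols. The notation is the standard one from Little and Rubin (2002) and is an assumption here, not taken from the slide: D = (D_obs, D_mis) is the data, M the missingness indicators, θ the data-model parameters, and ψ the mechanism parameters.

```latex
\[
\text{(a) MAR: } f(M \mid D_{\mathrm{obs}}, D_{\mathrm{mis}}, \psi)
  = f(M \mid D_{\mathrm{obs}}, \psi)
  \quad \text{for all } D_{\mathrm{mis}} \text{ and } \psi
\]
\[
\text{(b) Distinctness: } (\theta, \psi) \in \Omega_{\theta} \times \Omega_{\psi}
\]
\[
\text{Under (a): } L_{\mathrm{full}}(\theta, \psi \mid D_{\mathrm{obs}}, M) \propto
  f(M \mid D_{\mathrm{obs}}, \psi)
  \int f(D_{\mathrm{obs}}, D_{\mathrm{mis}} \mid \theta)\, dD_{\mathrm{mis}}
\]
```

Under MAR the first factor does not involve θ, so likelihood inference about θ can be based on the integral alone, which is the likelihood that ignores the mechanism. Distinctness ensures that the first factor carries no further information about θ through ψ.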

11
Sufficient for ignorable is not the same as
ignorable

These definitions have come to define
ignorability (e.g. Little and Rubin 2002)
However, Rubin (1976) described (a) and (b) as
the "weakest simple and general conditions under
which it is always appropriate to ignore the
process that causes missing data".
These conditions are not necessary for ignoring
the mechanism in all situations.

12
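Example 1: Nonresponse with auxiliary data
(Figure: missing-data pattern diagram with indicator values 0 0 0 1 1 and missing entries marked "?"; labels "Or whole population N" and "Not linked")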
13
MAR, ignorability for parameter subsets

MAR and ignorability are defined in terms of the
complete set of parameters in the data model for
D
It would be useful to have a definition of MAR
that applies to subsets of parameters, including
parameters of substantive interest.
A trivial example: it seems plausible that a
nonignorable mechanism would be MAR for the
parameters of distributions of variables that are
not missing.

14
MAR, ignorability for parameter subsets
15
MAR, ignorability for parameter subsets
16
Partial MAR given a function of mechanism
17
Example 1: Auxiliary Survey Data
(Figure: missing-data pattern diagram with indicator values 0 0 0 1 1 and missing entries marked "?"; label "Not linked")
18
Ex. 2: MNAR Monotone Bivariate Data

The paper presents a more interesting case, with Y1
and Y2 blocks of variables and missing data in each
block

(Figure: missing-data pattern diagram with indicator values 0 0 0 1 1 and missing entries marked "?")
19
More generally
20
Ex. 3: Complete Case Analysis in Regression
(Figure: missing-data pattern diagram with indicator values 0 0 0 0 1 1 and missing entries marked "?")
21
Ex. 4: A normal pattern-mixture model
(Figure: missing-data pattern diagram with indicator values 0 0 0 1 1 and missing entries marked "?")
22
Ex. 5: Subsample ignorable likelihood
Little and Zhang (2011). Columns could be vectors.
√ = fully observed, ? = observed or missing

Pattern  Z  W  X  Y
P1       √  √  ?  ?
P2       √  ?  ?  ?

Interest concerns parameters of the regression
of Y on (Z, X, W)
Z complete; W and (X, Y) incomplete. W complete in
P1.
Division of covariates into W and X is based on the
following MNAR assumptions about the missing-data
mechanism:
Pr(W complete) = fn(W, X, Z) (not Y)
(X, Y) MAR in the subsample with W fully
observed (that is, P1)
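
A small simulation can make the first assumption concrete. The sketch below is a simplified illustration, not the Little and Zhang (2011) procedure itself: it assumes scalar Z, X, and W, a linear regression of Y on (Z, X, W), missingness in W that depends on (W, X, Z) but not on Y, and, for simplicity, fully observed X and Y. The coefficient values, sample size, and function name are hypothetical.

```python
import numpy as np


def simulate_subsample_regression(n=200_000, seed=1):
    """Fit the regression of Y on (Z, X, W) using only units with W observed."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    x = rng.normal(size=n)
    w = rng.normal(size=n)
    # Assumed true coefficients: intercept, Z, X, W.
    beta = np.array([1.0, 0.5, -1.0, 2.0])
    y = beta[0] + beta[1] * z + beta[2] * x + beta[3] * w + rng.normal(size=n)

    # Pr(W complete) is a function of (W, X, Z) but not of Y.
    # Because it depends on W itself, W is missing not at random.
    logit = 0.5 + 1.0 * w - 0.5 * x + 0.5 * z
    w_observed = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

    # Least squares restricted to the subsample with W observed (pattern P1).
    design = np.column_stack([np.ones(n), z, x, w])[w_observed]
    beta_hat, *_ = np.linalg.lstsq(design, y[w_observed], rcond=None)
    return beta, beta_hat, w_observed.mean()


beta_true, beta_hat, frac_p1 = simulate_subsample_regression()
print("true coefficients:     ", beta_true)
print("subsample estimates:   ", np.round(beta_hat, 3))
print("fraction in pattern P1:", round(float(frac_p1), 3))
```

Because selection into the subsample depends only on the covariates, the conditional distribution of Y given (Z, X, W) is the same in the subsample as in the full data, so the subsample estimates should be close to the true coefficients even though W is MNAR.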

23
Ex. 6: Auxiliary data, survey nonresponse
(Figure: data pattern diagram with units indexed 1, ..., r, ..., n, ..., N and missing entries marked "?"; label "Not linked")
24
Simulation Study
25
Simulation Study methods

CC: Complete-case estimates based on the
responding units
M1: ML based on a logistic regression with
interaction for Y3
M2: ML based on an additive logistic regression
for Y3
NR: Weighting-class estimates, where nonresponse
weights are obtained based on Y1
PS: Post-stratification weighted estimates
based on Y2
NRPS: Adjust weights using both Y1 and Y2. For
categorical variables, this method is equivalent
to Linear Calibration regression, or Generalized
Raking estimates
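
To make the NR and PS adjustments concrete, here is a minimal sketch of a weighting-class estimate and a post-stratified estimate of a mean. It assumes simple random sampling, a categorical class variable, and known population proportions for post-stratification. The function names and the toy data are hypothetical and are not taken from the simulation study.

```python
import numpy as np


def weighting_class_mean(y, classes, responded):
    """Nonresponse-weighted mean of y.

    Respondents in each class get weight (sample count) / (respondent count),
    the inverse of the observed response rate in that class.
    """
    y, classes, responded = map(np.asarray, (y, classes, responded))
    total, weight_sum = 0.0, 0.0
    for c in np.unique(classes):
        in_class = classes == c
        resp = in_class & responded
        if resp.sum() == 0:
            raise ValueError(f"no respondents in class {c!r}")
        weight = in_class.sum() / resp.sum()
        total += weight * y[resp].sum()
        weight_sum += weight * resp.sum()
    return total / weight_sum


def post_stratified_mean(y, strata, responded, pop_props):
    """Post-stratified mean: respondent means per stratum, combined using
    known population proportions `pop_props` (dict: stratum -> proportion)."""
    y, strata, responded = map(np.asarray, (y, strata, responded))
    return sum(p * y[(strata == s) & responded].mean()
               for s, p in pop_props.items())


# Toy data: the outcome is unknown (nan) for nonrespondents.
y = np.array([1.0, 1.0, 0.0, np.nan, 0.0, 1.0, np.nan, np.nan])
cls = np.array([0, 0, 0, 0, 1, 1, 1, 1])
responded = ~np.isnan(y)

cc = y[responded].mean()                      # complete-case mean
wc = weighting_class_mean(y, cls, responded)  # weights from class response rates
ps = post_stratified_mean(y, cls, responded, {0: 0.25, 1: 0.75})
print(cc, wc, ps)
```

In this toy example the classes have different response rates and different respondent means, so the three estimates differ. The adjusted estimates move toward the class with the lower response rate or the larger population share.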

26
(No Transcript)
27
Simulation summary findings

When response depends on the Y1×Y2 interaction, all
methods do poorly
When data are MCAR, all methods do similarly well
Model-based methods remove almost all the bias
and perform better when response does not depend
on the Y1×Y2 interaction
Qualitative patterns hold for different sample
sizes

28
Frequentist inference

Rubin's (1976) sufficient conditions for
ignorability for frequentist inference were even
stronger (essentially MCAR)
These can be weakened too: for example,
asymptotic frequentist inference based on ML and
the observed information matrix works under the
conditions given here
Small sample inference seems more problematic

29
Frequentist inference

Rubin's (1976) sufficient conditions for
ignorability for frequentist inference were even
stronger (essentially MCAR)
These can be weakened too: for example,
asymptotic frequentist inference based on ML and
the observed information matrix works under the
conditions given here
Small sample inference is more complex

30
Summary

Proposed definitions of partial MAR, ignorability
for subsets of parameters
Expands range of situations where missing data
mechanism can be ignored
Though, in some cases, MAR analysis entails a
loss of information
How much is lost is an interesting question,
varies by context

31
References

Harel, O. and Schafer, J.L. (2009). Partial and
Latent Ignorability in missing data problems.
Biometrika, 2009, 1-14
Little, R.J.A. (1993). Pattern-Mixture Models for
Multivariate Incomplete Data. JASA, 88, 125-134.
Little, R. J. A., and Rubin, D. B. (2002).
Statistical Analysis with Missing Data (2nd ed.)
Wiley.
Little, R.J. and Zangeneh, S.Z. (2013). Missing
at random and ignorability for inferences about
subsets of parameters with missing data.
University of Michigan Biostatistics Working
Paper Series.
Little, R. J. and Zhang, N. (2011). Subsample
ignorable likelihood for regression analysis with
missing data. JRSSC, 60, 4, 591-605.
Rubin, D. B. (1976). Inference and Missing Data.
Biometrika 63, 581-592.