Handling Missing Data on ALSPAC

1
Handling Missing Data on ALSPAC
  • Paul Clarke
  • (CMPO, University of Bristol)
  • ALSPAC Social Science User Group meeting
  • 21 May 2008

2
Outline
  • What causes missing data?
  • Types of missing data
  • Methods for missing data: quick overview
  • ALSPAC Blitz on non-respondents
  • Investigating MNAR data in ALSPAC

3
Example ALSPAC analysis
  • At age 11
  • Outcome: Mood (ordinal, 3 categories)
    • Depressive symptoms, maternally rated
  • Main exposure: Physical activity (score)
    • Measured on actigraph, 3 days
  • Adjustment:
    • BMI (score)
    • Sex, Age at screening
  • Ordinal logistic regression
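
As a rough illustration of the analysis model named on this slide, the sketch below computes category probabilities for a three-category outcome under a proportional-odds (ordinal logistic) model. It assumes the common parameterisation P(Y <= k) = logistic(cutpoint_k - linear predictor). The function name `ordinal_probs`, the cutpoints and the coefficient values are hypothetical and are not estimates from ALSPAC.

```python
import math


def ordinal_probs(linear_predictor, cutpoints):
    """Category probabilities under a proportional-odds model.

    Assumes P(Y <= k) = logistic(cutpoints[k] - linear_predictor),
    with cutpoints given in increasing order.
    """
    def logistic(z):
        return 1.0 / (1.0 + math.exp(-z))

    # Cumulative probabilities for each category; the last one is 1 by definition.
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    # Differences of successive cumulative probabilities give category probabilities.
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


# Hypothetical coefficients and covariate values, for illustration only.
beta = {"phys_act": -0.3, "bmi": 0.1, "male": -0.2, "age": 0.05}
child = {"phys_act": 1.0, "bmi": 0.5, "male": 1.0, "age": 11.0}

eta = sum(beta[name] * child[name] for name in beta)
probs = ordinal_probs(eta, cutpoints=[0.5, 2.0])
print(probs)  # three probabilities, one per mood category
```

With this parameterisation a larger linear predictor shifts probability towards the higher categories, so a negative coefficient on physical activity would correspond to lower mood scores for more active children.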

4
Missing Value (MV) pattern¹
¹ All MV patterns with < 200 cases ignored
5
What causes missing data?
[Diagram; labels: Interviewer effectiveness, Incentive for participant, Loyalty, Letter, Telephone calls, Interviewer visits, Non-contact]
6
  • Result of processes leading to:
    • Refusal to answer questions (item)
    • Refusal to participate (unit)
    • No contact (unit)
    • Longitudinal-specific: attrition/drop-out
  • Non-response mechanism(s): NRM

7
Rubin's definitions¹
  • Missing Completely At Random (MCAR)
    • NRM independent of observed (and missing) variables
  • Missing At Random (MAR)
    • NRM depends only on observed variables
  • Missing Not At Random (MNAR)
    • NRM depends on missing variables too

¹ Little & Rubin (2002), Statistical Analysis with Missing Data
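
The three mechanisms can be contrasted with a small simulation. The sketch below is illustrative only: the data-generating model (y = x + noise, with x always observed), the response probabilities and the helper names `simulate`, `mean_y` and `slope` are all assumptions made for this example and have nothing to do with the ALSPAC data.

```python
import math
import random


def simulate(mechanism, n=20000, seed=1):
    """Simulate (x, y) pairs and return only the complete cases.

    x is always observed; y = x + noise may be missing.
    """
    rng = random.Random(seed)
    complete = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, 1.0)
        if mechanism == "MCAR":
            p_respond = 0.5                               # depends on nothing
        elif mechanism == "MAR":
            p_respond = 1.0 / (1.0 + math.exp(-2.0 * x))  # depends on observed x
        elif mechanism == "MNAR":
            p_respond = 1.0 / (1.0 + math.exp(-2.0 * y))  # depends on y itself
        else:
            raise ValueError("unknown mechanism: " + mechanism)
        if rng.random() < p_respond:
            complete.append((x, y))
    return complete


def mean_y(pairs):
    """Complete-case mean of y."""
    return sum(y for _, y in pairs) / len(pairs)


def slope(pairs):
    """Least-squares slope of y on x among the complete cases."""
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = mean_y(pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    return sxy / sxx


summary = {}
for mechanism in ("MCAR", "MAR", "MNAR"):
    cc = simulate(mechanism)
    summary[mechanism] = (mean_y(cc), slope(cc))
    print(mechanism, summary[mechanism])
```

In this setup the true mean of y is 0 and the true slope is 1. The complete-case mean should stay near 0 only under MCAR. The complete-case slope should stay near 1 under MCAR and MAR, because response depends at most on the predictor x, and should be attenuated under MNAR.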
8
Directed Acyclic Graph (DAG)
[DAG; nodes: Y, X, R, C]
  • R independent ⇒ data MCAR

9
MAR data
[DAG; nodes: Y, X, R, C]
  • R indirectly related to Y through X and C

10
Methods for MAR data
  • Complete-case analysis / listwise deletion
  • Weighting
    • Weighting classes, post-stratification
  • (Single) imputation methods
    • e.g. regression, hot-deck/nearest-neighbour
  • Multiple imputation methods
    • e.g. NORM, MICE
  • Semiparametric estimators
  • Semiparametric estimators
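
To make the weighting-class idea from the list above concrete, here is a minimal sketch. It assumes every sample member belongs to a known class (for example, defined by fully observed baseline variables) and that response is MAR given that class. Each respondent is weighted by the inverse of the response rate in their class. The function name `weighting_class_mean` and the toy data are hypothetical.

```python
from collections import defaultdict


def weighting_class_mean(records):
    """Weighted mean of y using weighting-class adjustment.

    records: iterable of (cls, responded, y); y is ignored for non-respondents.
    Classes with no respondents contribute nothing, so they should be merged
    with a neighbouring class before calling this function.
    """
    records = list(records)
    n_total = defaultdict(int)
    n_resp = defaultdict(int)
    for cls, responded, _ in records:
        n_total[cls] += 1
        if responded:
            n_resp[cls] += 1

    numerator = 0.0
    denominator = 0.0
    for cls, responded, y in records:
        if responded:
            weight = n_total[cls] / n_resp[cls]  # inverse of class response rate
            numerator += weight * y
            denominator += weight
    return numerator / denominator


# Toy data: class "A" responds fully, class "B" responds at a rate of 1 in 4.
toy = [("A", True, 1.0)] * 4 + [("B", True, 5.0)] + [("B", False, None)] * 3
weighted = weighting_class_mean(toy)
unweighted = sum(y for _, r, y in toy if r) / sum(1 for _, r, _ in toy if r)
print(unweighted, weighted)
```

In the toy data the single respondent in class "B" stands in for the three non-respondents in that class, which pulls the estimate away from the unweighted complete-case mean.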

11
Imputation in practice: pitfalls¹
  • Omitting the outcome
  • Imputing non-normal variables
  • MAR completely implausible
  • Convergence of iterative procedures

¹ Sterne et al. (2008), British Medical Journal
12
Complex methods
  • Analysis model
    • e.g. Ordinal logistic regression
  • Imputation model: Missing given Observed
  • ALL assume MAR data
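
When the imputation model is used for multiple imputation, the analysis model is fitted to each completed data set and the results are then combined. The slides do not say how; the sketch below assumes the standard combination known as Rubin's rules, and the function name `pool_rubin` and the example numbers are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine m >= 2 per-imputation estimates using Rubin's rules.

    estimates: point estimate from each completed data set
    variances: squared standard error from each completed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficient estimates from three imputed data sets.
estimate, std_error = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, std_error)
```

The between-imputation term is what carries the extra uncertainty due to the missing data; dropping it would make the pooled standard error too small.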

13
MAR data in reality
[DAG; nodes: ?, Y, X, R, C]
  • Unknown factors drive non-response
    • correlated with model predictors
    • but not with Y

14
Why is this important?
  • Weakness of MAR: How do we know?
  • Central problem: missing data is missing!
  • MAR is a leap of faith

15
MNAR data
[DAG; nodes: ?, Y, X, R, C]
  • Unknowns directly correlated with Y?

16
Physical activity example
[DAG; nodes: ?, Mood, Phys Act, R, (BMI, Sex, Age)]
  • NRM is mother-driven (child age 11)
  • Child must wear actigraph for 3 days
  • Mother must assess her child's mood

17
ALSPAC Blitz
  • Co-ordinated by Family Liaison Unit
  • 4 tranches, Nov 2007 to May 2008
  • Target: 5000 teenagers not in last 2 waves
  • Mini-clinic for the difficult-to-persuade

18
Proposed analysis
  • MAR is context dependent
  • Risky behaviours (Glyn Lewis et al.)
    • Outcomes: cannabis use, sexual practices, etc.
    • Risk factors: mental health, sensation seeking, etc.
  • Basic analysis
    • Compare follow-up with main sample
    • Still differences after adjustment?

19
Unit non-response
  • 100% follow-up rate unlikely!
  • Directly model NRM
  • Continuum of non-response
  • Hard-to-contact are less like main sample
  • Weighting scheme (Alho 1990; Wood et al. 2006)
  • Lower bound for MNAR bias

20
Item non-response
  • Parallel qualitative post
  • Items: questions on risky behaviours
  • What mechanisms drive non-response?
  • Test hypotheses from this project