Compliance and Causal Analysis, Lecture 2. Randomisation-based methods 1: likelihood estimation

1
Compliance and Causal Analysis, Lecture 2
Randomisation-based methods 1: likelihood
estimation
  • Ian White
  • MRC Biostatistics Unit, Cambridge, UK
  • ian.white_at_mrc-bsu.cam.ac.uk

2
Causal analysis in randomised trials
  • A randomised controlled trial is used to evaluate
    an intervention
  • No confounding because randomised group R is
    independent of all predictors of outcome
  • But we get departures from randomised
    intervention: actual treatment D differs from
    allocated treatment R
  • Want to infer causal effect of treatment D

3
Departures from randomised intervention
  • Sometimes called non-compliance
  • I prefer "departures":
  • avoids implicit value judgements
  • more precise: includes both
  • non-adherence (randomised to X, clinician
    prescribes X, patient does Y)
  • changes in prescribed treatment (randomised to
    X, clinician prescribes Y, patient does Y)

4
Types of departure from randomised intervention
  • Switches to other trial treatment
  • Changes to non-trial treatment
  • Includes changes to nothing in a comparative
    trial
  • Departures may be
  • all-or-nothing (either always get A or always get
    B)
  • or quantitative (e.g. dose changes)
  • or time-dependent (e.g. emergency operation)

→ FOCUS ON SWITCHES
5
Motivating example: MASS trial
  • Abdominal aortic aneurysms are often fatal if
    they rupture
  • May be repaired if detected before rupture
  • Reliably detected by ultrasound screening
  • MASS trial (Lancet, 2002): 67,800 men were
    randomised to invitation to screening or control
  • Main outcome was aneurysm-related mortality

6
MASS trial: Aneurysm-related mortality
7
Intention-to-treat analysis
  • Intention-to-treat analysis compares groups as
    randomised, ignoring any departures
  • respects the randomisation
  • avoids selection bias
  • Now the standard analysis, and rightly so
  • Answers an important pragmatic question, e.g. what
    is the public health impact of prescribing X?
  • Disadvantage: this may be the wrong question!

8
Disadvantage of ITT
  • "Doctor, doctor, how much will taking this tablet
    reduce my risk of heart disease?"
  • "I don't know, but prescribing it reduces
    disease risk by 10%"
  • "... on average"
  • "... that's on average over whether you take it or not"

9
Example: MASS trial (ctd)
  • Intention-to-treat (ITT) analysis: invitation
    to screening reduced aneurysm-related death
  • hazard ratio 0.58 (95% CI, 0.42 to 0.78), P=0.002
  • 20% of invited group didn't attend for screening
    (all-or-nothing non-compliance)
  • ITT measures the average benefit of screening in
    invitees
  • What is the benefit of screening in attenders?

10
Plan for this lecture
  • Basic idea for all-or-nothing compliance
  • Binary outcome: estimation
  • Normal outcome
  • Simple vs. ML estimation
  • Negative weights method
  • Back door method

11
1. Basic idea for all-or-nothing compliance
12
Notation
  • Randomise to E (experimental) or S (standard)
    treatment
  • S could be nothing / placebo
  • Potential treatments are E and S: assume everyone
    gets either all E or all S (all-or-nothing
    compliance)
  • Ri = 1/0 indicates randomisation of ith
    individual to E/S
  • Di = 1/0 indicates receipt of E/S

13
Theoretical framework
  • Ri = 1/0 indicates randomisation of ith
    individual to E/S
  • Di = 1/0 indicates actual receipt of E/S
  • Di(0) = 1/0 indicates receipt of E/S if
    randomised to S: observed in S arm,
    counterfactual in E arm
  • Di(1) = 1/0 indicates receipt of E/S if
    randomised to E: observed in E arm,
    counterfactual in S arm
  • So actual Di = Di(Ri)
  • The pair (Di(0), Di(1)) defines an individual's
    compliance type.

14
Compliance types
  • Always-takers (A): Di(0) = Di(1) = 1; always take
    E regardless of randomisation
  • Compliers (C): Di(0) = 0, Di(1) = 1; take whatever
    they were allocated to
  • Never-takers (N): Di(0) = Di(1) = 0; never take E
    regardless of randomisation
  • Defiers (D): Di(0) = 1, Di(1) = 0; take the
    opposite of what they were allocated to

15
Using compliance types
  • Note that compliance-type is a characteristic
    that is inherent to the individual before
    randomisation
  • unlike actual compliance, which is affected by
    randomisation
  • This means we can meaningfully adjust/stratify by
    compliance-type, but not by actual compliance
  • But unfortunately compliance-type is incompletely
    observed: requires careful statistical methods
  • Compliance types are an example of principal
    strata (Frangakis & Rubin, 2002)

16
What do we observe about compliance-types?
17
What do we observe about compliance-types?
Simplification if S arm can't get E (no
always-takers and no defiers)
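
The tables from these two slides did not survive in the transcript. As a rough sketch of what they presumably tabulate, the snippet below lists which compliance types are compatible with each observed (R, D) pair, first in general and then under the simplification where the S arm cannot get E. The helper name `consistent_types` and the one-letter type codes are illustrative choices, not taken from the slides.

```python
# (D(0), D(1)) for each compliance type, following the "Compliance types" slide.
TYPES = {"A": (1, 1), "C": (0, 1), "N": (0, 0), "D": (1, 0)}


def consistent_types(r, d, allowed="ACND"):
    """Compliance types compatible with observing receipt d under randomisation r."""
    # TYPES[t][r] is D(r) for type t; it must match the observed treatment d.
    return sorted(t for t in allowed if TYPES[t][r] == d)


# General case: all four types may exist.
general = {(r, d): consistent_types(r, d) for r in (0, 1) for d in (0, 1)}

# Simplified case: S arm cannot get E, so only compliers and never-takers exist.
simplified = {(r, d): consistent_types(r, d, allowed="CN")
              for r in (0, 1) for d in (0, 1)}

for (r, d), types in general.items():
    print(f"R={r}, D={d}: general {types}; S arm cannot get E {simplified[(r, d)]}")
```

In the simplified case, compliance type is fully observed in the E arm and completely unobserved in the S arm, which is what the method of subtraction later exploits.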
18
Defining true treatment effect
  • Could be defined in various ways
  • Use potential outcomes: Y(r,d) = outcome if
    randomised to r and received d
  • Y(r,d) may depend on r
  • e.g. never-takers of a counselling intervention
    might do worse in the E arm (where they refused
    the counselling) than in the S arm (where they
    weren't offered the counselling)
  • but in many settings it won't (exclusion
    restriction)

19
Defining true treatment effect
  • Average effect of treatment
  • E[Y(1,1) - Y(0,0)]
  • Average effect of treatment among the treated
  • E[Y(1,1) - Y(0,0) | D(1)=1]
  • For an always-taker, we will observe
  • Y(0,1) if we randomise to S
  • Y(1,1) if we randomise to E
  • but we cannot observe Y(0,0)
  • Is there a measure of causal effect that involves
    potentially observable outcomes?

20
Complier average causal effect (CACE)
  • ITT effect = E[Y(1,D(1)) - Y(0,D(0))]
  • Define average causal effect of treatment
    assignment in compliance-type t as
    E[Y(1,D(1)) - Y(0,D(0)) | T=t]
  • Complier Average Causal Effect (CACE)
  • E[Y(1,1) - Y(0,0) | T=C]
  • also Local Average Treatment Effect (LATE)
    (Angrist et al, 1996)
  • Never-taker average causal effect (NACE?)
  • E[Y(1,0) - Y(0,0) | T=N]
  • Always-taker average causal effect (AACE?)
  • E[Y(1,1) - Y(0,1) | T=A]
  • CACE measures treatment efficacy but only
    involves potentially observable outcomes (Imbens
    & Rubin, 1997)

21
CACE (2)
  • Ignore defiers (mainly for simplicity)
  • Can write ITT = wC CACE + wN NACE + wA AACE
  • where wC = P(compliance-type = C) etc.
  • It's often reasonable to assume NACE = AACE = 0, so
    that ITT = wC CACE
  • Gives a simple estimate of the CACE
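
A minimal sketch of the decomposition on this slide, assuming no defiers and the exclusion restrictions NACE = AACE = 0. The function names and the numbers (80% compliers, CACE of -0.5) are hypothetical and chosen only to show the arithmetic.

```python
def itt_from_components(w_c, cace, w_n=0.0, nace=0.0, w_a=0.0, aace=0.0):
    """ITT = wC*CACE + wN*NACE + wA*AACE (defiers ignored)."""
    return w_c * cace + w_n * nace + w_a * aace


def cace_simple(itt, w_c):
    """Simple estimate CACE = ITT / wC, valid only when NACE = AACE = 0."""
    if not 0.0 < w_c <= 1.0:
        raise ValueError("w_c must lie in (0, 1]")
    return itt / w_c


# Hypothetical illustration: 80% compliers, 20% never-takers with NACE = 0.
itt = itt_from_components(w_c=0.8, cace=-0.5, w_n=0.2, nace=0.0)
recovered_cace = cace_simple(itt, w_c=0.8)
print(itt, recovered_cace)
```

The ITT effect is the CACE diluted by the proportion of compliers, so dividing by wC undoes the dilution.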

22
CACE (3)
  • For binary outcomes, we can define the CACE on
    different scales.
  • Risk difference scale
  • CACE = E[Y(1,1) | T=C] - E[Y(0,0) | T=C]
  • Risk ratio scale
  • CACE = E[Y(1,1) | T=C] / E[Y(0,0) | T=C]
  • Odds ratio scale
  • CACE = O[Y(1,1) | T=C] / O[Y(0,0) | T=C]
  • where O[X] = E[X] / (1 - E[X])

23
2. Estimation
24
Vitamin A trial (Sommer and Zeger, 1991)
  • Vitamin A vs. control in Indonesian children
  • randomise villages (24000 children)
  • outcome is mortality
  • about 20% of villages didn't get their vitamin A
    supply
  • this analysis ignores clustering by village

25
ITT analysis
ITT odds ratio ≈ 46/77 = 0.60
26
Observed compliance (Vitamin A arm)
27
Inferred compliance (Control arm)
CACE odds ratio ≈ 12/43 = 0.28
28
Estimation: method of subtraction
[Diagram: the nNE never-takers seen in the E arm are subtracted from the S arm, if nE = nS]
29
Formally
Hence write down and maximise (log-)likelihood.
Note: 1-w, w = previous wC, wN
30
MLE
In this simple case, pCE is estimated as
dCE/nCE; pCS is estimated as (dS - dNE nS/nE) /
(nS - nNE nS/nE), provided both terms are > 0
(I'll assume this is true)
31
MLE
  • Use estimates of pCE and pCS to estimate CACE
  • as pCE - pCS on RD scale
  • as pCE / pCS on RR scale
  • as [pCE/(1-pCE)] / [pCS/(1-pCS)] on OR scale
  • On RD scale only, can show that MLEs obey CACE =
    ITT / (1-w)
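
The estimates above can be sketched in a few lines. The cell counts below are not in this transcript: they are the counts usually quoted for the Sommer and Zeger (1991) vitamin A trial, assumed here because they reproduce the percentages reported in the summary slide. Variable names mirror the slide notation (d = deaths, n = children; CE = compliers in E arm, NE = never-takers in E arm, S = standard arm).

```python
# ASSUMED counts (not in the transcript), taken from the published trial data.
n_ce, d_ce = 9675, 12     # E arm, received vitamin A: compliers
n_ne, d_ne = 2419, 34     # E arm, no vitamin A: never-takers
n_s, d_s = 11588, 74      # S arm: compliers and never-takers mixed

n_e = n_ce + n_ne
scale = n_s / n_e         # rescales E-arm never-taker counts to the S-arm size
w = n_ne / n_e            # estimated proportion of never-takers

# Method of subtraction: remove the (rescaled) never-takers from the S arm.
p_ce = d_ce / n_ce
p_cs = (d_s - d_ne * scale) / (n_s - n_ne * scale)

cace_rd = p_ce - p_cs
cace_rr = p_ce / p_cs
cace_or = (p_ce / (1 - p_ce)) / (p_cs / (1 - p_cs))

# On the risk-difference scale the estimate equals ITT / (1 - w).
itt_rd = (d_ce + d_ne) / n_e - d_s / n_s

print(f"pCE = {100 * p_ce:.2f}%, pCS = {100 * p_cs:.2f}%")
print(f"CACE: RD = {cace_rd:.5f}, RR = {cace_rr:.2f}, OR = {cace_or:.2f}")
print(f"ITT / (1 - w) = {itt_rd / (1 - w):.5f}")
```

The identity with ITT / (1 - w) holds only on the risk-difference scale; the ratio scales need pCE and pCS separately.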

32
CACE compared with other quantities
33
Vitamin A vs. control summary
  • ITT: 0.38% vs. 0.64%, RR = 0.60
  • CACE: 0.12% vs. 0.45%, RR = 0.28
  • Per-protocol: 0.12% vs. 0.64%, RR = 0.19
  • As-treated: 0.12% vs. 0.77%, RR = 0.16
  • Per-protocol (on-treatment) and as-treated are too
    extreme because of strong selection effect: 1.41%
    (untreated in treatment arm) vs. 0.64% (untreated
    in control arm)
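
To make explicit which groups each analysis compares, here is a small sketch of the ITT, per-protocol and as-treated risk ratios. As before, the cell counts are an assumption: they are the figures usually quoted for the Sommer and Zeger (1991) trial and are not listed in this transcript.

```python
# ASSUMED counts (not in the transcript), taken from the published trial data.
n_ce, d_ce = 9675, 12     # vitamin A arm, received vitamin A
n_ne, d_ne = 2419, 34     # vitamin A arm, did not receive it
n_s, d_s = 11588, 74      # control arm


def pct(deaths, n):
    """Mortality as a percentage."""
    return 100.0 * deaths / n


risk_e_all = pct(d_ce + d_ne, n_ce + n_ne)       # everyone randomised to vitamin A
risk_e_treated = pct(d_ce, n_ce)                 # treated, vitamin A arm
risk_e_untreated = pct(d_ne, n_ne)               # untreated, vitamin A arm
risk_s = pct(d_s, n_s)                           # control arm
risk_untreated_all = pct(d_ne + d_s, n_ne + n_s) # all untreated, both arms pooled

rr_itt = risk_e_all / risk_s                     # groups as randomised
rr_pp = risk_e_treated / risk_s                  # drops non-compliers
rr_at = risk_e_treated / risk_untreated_all      # groups by treatment received

print(f"ITT RR = {rr_itt:.2f}, per-protocol RR = {rr_pp:.2f}, as-treated RR = {rr_at:.2f}")
print(f"untreated: {risk_e_untreated:.2f}% (vitamin A arm) vs. {risk_s:.2f}% (control arm)")
```

The last line shows the selection effect: untreated children in the vitamin A arm have roughly twice the mortality of the control arm, so any analysis that treats them as comparable to controls exaggerates the benefit.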

34
Comparison of assumptions
  • Per-protocol and as-treated analyses assume
    random non-compliance
  • no association between compliance-type and
    outcome, once treatment effect is taken into
    account
  • CACE analysis assumes exclusion restriction
  • randomisation doesn't affect mean outcome for
    never-takers and always-takers
  • no assumption of comparability of different
    compliance-types
  • usually much more plausible

35
Extensions to CACE model (1)
  • Above we had all the S arm getting S
  • Easy to allow for S arm possibly getting E
    (Cuzick et al, 1997)
  • CACE is again estimable under
  • 2 exclusion restrictions: NACE = AACE = 0
  • Either no defiers or same causal effect in
    defiers as in compliers (DACE = -CACE)

36
Extensions to CACE model (2)
  • Introduce covariates
  • Covariates that predict Y improve precision (as
    in ITT analysis)
  • Covariates that predict D
  • also improve precision (unlike in ITT analysis)
    (Jo, 2002)
  • enable estimation of NACE etc. as well as CACE
    (Hirano et al, 2000)

37
Extensions to CACE model (3)
  • Define g = difference in outcome between
    never-takers and compliers after allowing for
    their differences in actual treatment
  • selection effect
  • difference in counterfactual outcomes
  • As-treated analysis assumes g = 0 (random
    non-compliance)
  • CACE and ITT analyses make no assumption about g
  • Could estimate g from data: leads to CACE
    analysis
  • Instead, introduce appropriate prior information
    about g (White, 2005)

38
Model for log odds of death (observed risk)
Use informative prior g ~ N(0, s^2) for various
values of s
39
Vitamin A trial: Bayesian CACE analyses
[Figure; labelled points: as-treated, CACE]
40
3. Normal outcome
41
Normal outcome
  • Can simply modify the method of subtraction:
    work with means instead of proportions
  • Link to instrumental variables (IV) method
    (lecture 5)
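
A minimal sketch of the method of subtraction applied to means, assuming no always-takers or defiers and the exclusion restriction for never-takers. The function name and the summary statistics are hypothetical, chosen so the answer is easy to check by hand.

```python
def cace_from_means(mean_ce, n_ce, mean_ne, n_ne, mean_s):
    """Method of subtraction for a continuous outcome.

    mean_ce, n_ce: outcome mean and size of compliers observed in the E arm
    mean_ne, n_ne: outcome mean and size of never-takers observed in the E arm
    mean_s:        outcome mean of the whole S arm (types not observed)
    """
    w = n_ne / (n_ce + n_ne)                     # proportion of never-takers
    # S-arm mean is a mixture: mean_s = (1 - w) * mean_cs + w * mean_ne
    mean_cs = (mean_s - w * mean_ne) / (1 - w)
    return mean_ce - mean_cs


# Hypothetical summary statistics.
mean_ce, n_ce = 1.0, 70
mean_ne, n_ne = -2.0, 30
mean_s = -0.6

cace = cace_from_means(mean_ce, n_ce, mean_ne, n_ne, mean_s)

# Same answer from the ratio form: ITT difference divided by proportion of compliers.
w = n_ne / (n_ce + n_ne)
itt = ((1 - w) * mean_ce + w * mean_ne) - mean_s
print(cace, itt / (1 - w))
```

The ratio form, ITT difference in outcome divided by the difference in treatment received, is the link to the instrumental variables estimator.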

42
Model
CACE = mCE - mCS
43
Likelihood
CACE = mCE - mCS
44
ML estimation
  • No closed form solution
  • EM algorithm is easy: compliance type as the
    missing data
  • usual problems in estimating standard errors
  • Newton-Raphson also fairly straightforward
  • No directly available software in Stata, but
    gllamm can be used: see lecture 6.
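
The model and likelihood slides did not survive extraction, so the following is only a sketch of one plausible version of the EM algorithm: compliers and never-takers only, never-takers with the same mean in both arms (exclusion restriction), and a common SD for all groups. Those modelling choices, the function names and the simulated data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(y, mean, sd):
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_cace_normal(y_e, d_e, y_s, n_iter=200):
    """EM for the CACE with compliance type in the S arm as the missing data."""
    comp_e = [y for y, d in zip(y_e, d_e) if d == 1]    # compliers, E arm
    never_e = [y for y, d in zip(y_e, d_e) if d == 0]   # never-takers, E arm
    n_total = len(y_e) + len(y_s)

    # Starting values: E arm is fully classified; S arm starts from its pooled mean.
    w = len(never_e) / len(y_e)
    m_ce = sum(comp_e) / len(comp_e)   # closed form, never changes below
    m_n = sum(never_e) / len(never_e)
    m_cs = sum(y_s) / len(y_s)
    sd = math.sqrt(sum((y - m_cs) ** 2 for y in y_s) / len(y_s))

    for _ in range(n_iter):
        # E-step: posterior probability that each S-arm subject is a never-taker.
        post = []
        for y in y_s:
            a = w * normal_pdf(y, m_n, sd)
            b = (1.0 - w) * normal_pdf(y, m_cs, sd)
            post.append(a / (a + b))
        # M-step: weighted proportions, means and common SD.
        sum_post = sum(post)
        w = (len(never_e) + sum_post) / n_total
        m_n = (sum(never_e) + sum(p * y for p, y in zip(post, y_s))) / (len(never_e) + sum_post)
        m_cs = sum((1.0 - p) * y for p, y in zip(post, y_s)) / (len(y_s) - sum_post)
        ss = (sum((y - m_ce) ** 2 for y in comp_e)
              + sum((y - m_n) ** 2 for y in never_e)
              + sum(p * (y - m_n) ** 2 + (1.0 - p) * (y - m_cs) ** 2
                    for p, y in zip(post, y_s)))
        sd = math.sqrt(ss / n_total)

    return {"w": w, "m_ce": m_ce, "m_cs": m_cs, "m_n": m_n, "sd": sd,
            "cace": m_ce - m_cs}


def simulate_arm(n, mean_complier, mean_never, p_never=0.3):
    """Simulated outcomes and receipt indicators (1 = complier) for one arm."""
    ys, ds = [], []
    for _ in range(n):
        never = random.random() < p_never
        ys.append(random.gauss(mean_never if never else mean_complier, 1.0))
        ds.append(0 if never else 1)
    return ys, ds


random.seed(1)
y_e, d_e = simulate_arm(2000, mean_complier=1.0, mean_never=-2.0)  # true CACE = 1
y_s, _ = simulate_arm(2000, mean_complier=0.0, mean_never=-2.0)    # types discarded
fit = em_cace_normal(y_e, d_e, y_s)
print({k: round(v, 3) for k, v in fit.items()})
```

This returns point estimates only; as the slide notes, standard errors need extra work (for example from the observed information or a bootstrap).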

45
4. Method comparisons
46
Comparison of CACE estimators
  • We've looked at
  • Simple estimation using ITT = wC CACE (for
    difference of means, not risk ratio or odds
    ratio)
  • Method of subtraction
  • Maximum likelihood
  • For binary outcomes, they all give the same
    answer
  • For continuous outcomes, ML estimation is
    different (potentially more efficient: see later)

47
Simple CACE vs. ITT
  • They estimate different parameters
  • But they test the same null hypothesis
  • Significance levels are equal
  • obvious from CACE = ITT / (1-w)
  • binary case explored in detail by Branson and
    Whitehead (2003): significance levels are equal
    when likelihood ratio test is performed

48
Vitamin A trial: profile likelihoods
Likelihood ratio test has same value for both
models
49
... and quadratic approximation (dotted)
Likelihood ratio test has same value for both
models but Wald test doesn't.
50
ML estimation of CACE vs. ITT
  • For binary outcome, significance levels are equal
  • For Normal outcome, significance levels aren't
    equal
  • CACE is more efficient whenever there's a
    non-zero selection effect or a non-zero treatment
    effect
  • the next slides are thanks to Taeko Becque

51
Asymptotic relative efficiency of CACE vs. ITT
(approximate)
q1 = CACE, q2 = selection effect, outcome SD = 1
52
Power of CACE and ITT analyses
Compliance rate = 50%; selection effect = -0.5;
CACE = -0.5; standard deviation = 1.5
53
Power and compliance rate
Sample size = 300; selection effect = -0.5; CACE = -0.5;
standard deviation = 1.5
54
Power including covariate: weak predictor of
compliance
55
Power including covariate: strong predictor of
compliance
56
Summary
57
5. Negative weights method (Kim and White, 2004)
58
Negative weights method
  • For simplicity take nE = nS
  • Recall that we subtracted the number of
    events/people in the NE cell (never-takers
    randomised to E) from the number of events/people
    in the S cell (all randomised to S)
  • Can also achieve this by including them in the S
    arm but with a weight -1
  • -nS/nE in general

59
Negative weights in Stata
  • Unfortunately many Stata commands are too
    sensible to allow negative weights
  • Exceptions are regress, logistic, cox
  • Illustration uses MASS data, pretending outcome is
    binary

60
    . use mass, clear
    . l
         rand   screen   event       n
            1        1       0   27104
            1        1       1      43
            1        0       0    6670
            1        0       1      22
            0        0       0   33848
            0        0       1     113
    . tab rand screen [fw=n], sum(event) mean
      Means and Number of Observations of AAA-related death?
     Invited to |          Screened?
     screening? |         0          1      Total
    ------------+--------------------------------
              0 | .00332735          .  .00332735
61
    . * CACE via negative weights
    . tab rand [fw=n]
     Invited to |
     screening? |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              0 |     33,961       50.09       50.09
              1 |     33,839       49.91      100.00
    ------------+-----------------------------------
          Total |     67,800      100.00
    . gen w = 1
    . replace w = -33961/33839 if rand>screen
    (2 real changes made)
    . l
         rand   screen   event       n       w

62
    . * ITT analysis
    . logistic event rand [fw=n], coef
    Logistic regression                       Number of obs =     67800
                                              LR chi2(1)    =     12.97
                                              Prob > chi2   =    0.0003
    Log likelihood = -1229.0537               Pseudo R2     =    0.0052
    -----------------------------------------------------------------------
     event |      Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
    -------+---------------------------------------------------------------
      rand |  -.5508119   .1558631   -3.53   0.000    -.856298   -.2453258
     _cons |  -5.702247    .094229  -60.51   0.000   -5.886933   -5.517562
    -----------------------------------------------------------------------
    . * CACE analysis
    . logistic event screen [iw=n*w], coef
    -----------------------------------------------------------------------
     event |      Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
    -------+---------------------------------------------------------------

63
Bootstrap standard error
  • Now we bundle the previous commands into a file
    negwt.ado
  • To speed things up, I use only 1/30 of the
    controls

64
    . negwt
    Logistic regression                       Number of obs =      1210
                                              LR chi2(1)    =     15.43
                                              Prob > chi2   =    0.0001
    Log likelihood = -410.28453               Pseudo R2     =    0.0185
    ------------------------------------------------------------------------
      event |      Coef.  Std. Err.       z   P>|z|   [95% Conf. Interval]
    --------+---------------------------------------------------------------
     screen |  -.7461635   .1952982   -3.82   0.000   -1.128941    -.363386
      _cons |  -1.787902   .1141955  -15.66   0.000   -2.011721   -1.564083
    ------------------------------------------------------------------------
    . bootstrap _b[screen], reps(1000): negwt
    ------------------------------------------------------------------------
     Var |  Reps   Observed   Bias   Std. Err.   [95% Conf. Interval]
    -----+------------------------------------------------------------------

65
Negative weights summary
  • Agrees exactly with direct method in this simple
    case
  • Easy to generalise e.g. to situations with
    covariates (weights would have to depend on
    covariates)
  • Naïve standard errors are too small: bootstrap
    needed
  • Formal rationale is via unbiased estimating
    equations (Abadie 2002)
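
As a rough check of the first point, the sketch below recomputes the control-arm complier risk two ways from the MASS counts listed in the Stata illustration: with negative weights and by direct subtraction. It works with weighted proportions rather than a fitted logistic model, and the function names are illustrative.

```python
# (rand, screen, event, n): the MASS counts from the Stata listing.
rows = [
    (1, 1, 0, 27104), (1, 1, 1, 43),
    (1, 0, 0, 6670), (1, 0, 1, 22),
    (0, 0, 0, 33848), (0, 0, 1, 113),
]

n_e = sum(n for rand, _, _, n in rows if rand == 1)
n_s = sum(n for rand, _, _, n in rows if rand == 0)


def weight(rand, screen):
    """Weight -nS/nE for never-takers in the invited arm (rand > screen), else 1."""
    return -n_s / n_e if rand > screen else 1.0


def weighted_risk(screen_value):
    """Weighted event proportion among rows with the given screening status."""
    num = sum(weight(r, s) * n * e for r, s, e, n in rows if s == screen_value)
    den = sum(weight(r, s) * n for r, s, e, n in rows if s == screen_value)
    return num / den


p_ce = weighted_risk(1)   # screened compliers, invited arm
p_cs = weighted_risk(0)   # control arm minus reweighted never-takers

# Direct method of subtraction for comparison.
d_ne, n_ne = 22, 6670 + 22
d_s = 113
p_cs_direct = (d_s - d_ne * n_s / n_e) / (n_s - n_ne * n_s / n_e)

cace_or = (p_ce / (1 - p_ce)) / (p_cs / (1 - p_cs))
print(f"pCS weighted = {p_cs:.6f}, direct = {p_cs_direct:.6f}, CACE OR = {cace_or:.3f}")
```

The point estimates coincide because the negative weight removes exactly the rescaled never-taker counts; the weights say nothing about the standard error, which is why the bootstrap is needed.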

66
6. Back-door method (Nagelkerke 2000)
67
Back door method
  • Idea: would like to regress Y on D, adjusting for
    U
  • It's enough to adjust for E that blocks every
    indirect path from D to Y (back door criterion)
  • Approximate E by the residual from a linear
    regression of D on R

68
Properties
  • Nagelkerke et al showed that the method agrees
    exactly with the instrumental variables method
    for a linear model
  • For non-linear models it only approximately
    agrees with the method of subtraction
  • Not clear whether standard errors are adequate or
    whether bootstrapping is needed
  • Easy to generalise to more complex settings
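
The first property can be checked numerically for a linear model. The sketch below simulates a hypothetical trial (all parameter values are made up), regresses Y on D and the first-stage residual E by ordinary least squares, and compares the coefficient of D with the instrumental variables (Wald ratio) estimate. The small OLS helpers are written out so the example has no dependencies.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(columns, y):
    """Least-squares coefficients of y on the columns; intercept comes first."""
    xs = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(xi, xj)) for xj in xs] for xi in xs]
    xty = [sum(u * v for u, v in zip(xi, y)) for xi in xs]
    return solve_linear(xtx, xty)


def mean(values):
    return sum(values) / len(values)


# Hypothetical trial: 20% never-takers who also have worse outcomes (selection).
random.seed(2)
r, d, y = [], [], []
for i in range(2000):
    ri = i % 2
    never = random.random() < 0.2
    di = 0 if never else ri                      # S arm cannot get E
    yi = 1.0 - 0.5 * di - (1.0 if never else 0.0) + random.gauss(0.0, 1.0)
    r.append(float(ri)); d.append(float(di)); y.append(yi)

# Back-door step 1: residual E from a linear regression of D on R.
a0, a1 = ols([r], d)
resid = [di - (a0 + a1 * ri) for di, ri in zip(d, r)]

# Back-door step 2: regress Y on D, adjusting for E.
beta_backdoor = ols([d, resid], y)[1]

# IV (Wald) estimate: ITT effect on Y divided by ITT effect on D.
y1 = [yi for yi, ri in zip(y, r) if ri == 1.0]
y0 = [yi for yi, ri in zip(y, r) if ri == 0.0]
d1 = [di for di, ri in zip(d, r) if ri == 1.0]
d0 = [di for di, ri in zip(d, r) if ri == 0.0]
beta_iv = (mean(y1) - mean(y0)) / (mean(d1) - mean(d0))

print(beta_backdoor, beta_iv)
```

The two numbers agree to floating-point precision because D equals its first-stage fitted value plus E, and E is orthogonal to R. No such exact identity holds for the logistic model used with the MASS data, which is why the slide describes that case as approximate.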

69
Back door method for MASS
    . reg screen rand [fw=n]
          Source |         SS     df          MS       Number of obs =   67800
    -------------+--------------------------------     F(  1, 67798) =       .
           Model |  10908.799      1   10908.799       Prob > F      =  0.0000
        Residual | 5368.59021  67798  .079185082       R-squared     =  0.6702
    -------------+--------------------------------     Adj R-squared =  0.6702
           Total | 16277.3892  67799  .240083028       Root MSE      =   .2814
    ------------------------------------------------------------------------------
          screen |     Coef.  Std. Err.       t   P>|t|    [95% Conf. Interval]
    -------------+----------------------------------------------------------------
            rand |    .80224   .0021614  371.16   0.000    .7980037    .8064764
           _cons |  5.55e-17    .001527    0.00   1.000   -.0029929    .0029929
    ------------------------------------------------------------------------------
    . predict E, residual
    . tab rand E [fw=n]

70
Back door method results
    . logistic event screen E [fw=n]
    Logistic regression                       Number of obs =     67800
                                              LR chi2(2)    =     20.04
                                              Prob > chi2   =    0.0000
    Log likelihood = -1225.5162               Pseudo R2     =    0.0081
    ------------------------------------------------------------------------
      event | Odds Ratio  Std. Err.       z   P>|z|   [95% Conf. Interval]
    --------+---------------------------------------------------------------
     screen |   .4738008    .094594   -3.74   0.000    .3203717    .7007087
          E |   1.015178    .295373    0.05   0.959    .5739573    1.795582
    ------------------------------------------------------------------------
  • Note tiny effect of E: no evidence of selection
    in these data

71
Method comparison
    . * ITT analysis
    . logistic event rand [fw=n]
    ------------------------------------------------------------------------
      event | Odds Ratio  Std. Err.       z   P>|z|   [95% Conf. Interval]
    --------+---------------------------------------------------------------
       rand |   .5764816   .0898522   -3.53   0.000    .4247315    .7824496
    ------------------------------------------------------------------------
    . * As-treated analysis
    . logistic event screen [fw=n]
    ------------------------------------------------------------------------
      event | Odds Ratio  Std. Err.       z   P>|z|   [95% Conf. Interval]
    --------+---------------------------------------------------------------
     screen |    .476156   .0834625   -4.23   0.000    .3377127    .6713534
    ------------------------------------------------------------------------
    . * Approximate CACE analysis
    . logistic event screen E [fw=n]
    ------------------------------------------------------------------------
      event | Odds Ratio  Std. Err.       z   P>|z|   [95% Conf. Interval]
    --------+---------------------------------------------------------------

72
Summary
  • We've explored a variety of methods for
    all-or-nothing treatment switches
  • randomise to E or S; everyone gets all E or all S
  • In lecture 4, we will extend to much more complex
    patterns of switching
  • e.g. get E just for 3 months, then S
  • Another problem is where some participants get no
    treatment at all (or a treatment other than E/S)
  • ITT difference depends on 2 effects (E vs.
    nothing, S vs. nothing)
  • Walter (2006) adapted the compliance-type
    approach but assumed equality between some
    compliance-types