Title: Interactions With Continuous Variables: Extensions of the Multivariable Fractional Polynomial Approach
1Interactions With Continuous Variables
Extensions of the Multivariable Fractional
Polynomial Approach
Willi Sauerbrei, Institute of Medical Biometry and
Informatics, University Medical Center Freiburg,
Germany
Patrick Royston, MRC Clinical Trials Unit,
London, UK
2Overview
- Issues in regression models
- (Multivariable) fractional polynomials (MFP)
- Interactions of continuous variable with
- Binary variable
- Continuous variable
- Time
- Summary
3Observational Studies
- Several variables, mix of continuous and (ordered) categorical variables
- Different situations
- prediction
- explanation
- Explanation is the main interest here
- Identify variables with (strong) influence on the outcome
- Determine functional form (roughly) for continuous variables
- The issues are very similar in different types of regression models (linear regression model, GLM, survival models ...)
Use subject-matter knowledge for modelling ...
... but for some variables, data-driven choice inevitable
4Regression models
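- $X = (X_1, \ldots, X_p)$ covariates, prognostic factors
- $g(X) = \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p$ (assuming effects are linear)
- normal errors (linear) regression model
- $Y$ normally distributed
- $E(Y \mid X) = \beta_0 + g(X)$
- $\mathrm{Var}(Y \mid X) = \sigma^2 I$
- logistic regression model
- $Y$ binary
- $\mathrm{logit}\, P(Y \mid X) = \ln \frac{P(Y = 1 \mid X)}{P(Y = 0 \mid X)} = \beta_0 + g(X)$
- survival times
- $T$ survival time (partly censored)
- Incorporation of covariates: $\lambda(t \mid X) = \lambda_0(t) \exp(g(X))$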
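The three model types listed on this slide share one linear predictor g(X) and differ only in how it is linked to the outcome. A minimal Python sketch of that idea follows; the function names and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def linear_predictor(x, beta):
    """g(x) = beta_1*x_1 + ... + beta_p*x_p (effects assumed linear)."""
    return sum(b * xi for b, xi in zip(beta, x))

def normal_mean(x, beta0, beta):
    """E(Y|X) = beta_0 + g(X) in the normal-errors regression model."""
    return beta0 + linear_predictor(x, beta)

def logistic_probability(x, beta0, beta):
    """P(Y=1|X), obtained by inverting logit P = beta_0 + g(X)."""
    return 1.0 / (1.0 + math.exp(-(beta0 + linear_predictor(x, beta))))

def relative_hazard(x, beta):
    """exp(g(X)): the factor that multiplies the baseline hazard lambda_0(t)."""
    return math.exp(linear_predictor(x, beta))

# Hypothetical covariate values and coefficients, for illustration only.
x = [1.0, 2.0]
beta = [0.5, -0.25]
beta0 = 0.0

print(linear_predictor(x, beta))
print(normal_mean(x, beta0, beta))
print(logistic_probability(x, beta0, beta))
print(relative_hazard(x, beta))
```

Replacing the linear terms in `linear_predictor` by transformed terms is all that changes when non-linear functional forms are introduced later in the talk.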
5Central issue
- To select or not to select (full model)?
- Which variables to include?
6Continuous variables: The problem
"Quantifying epidemiologic risk factors using non-parametric regression: model selection remains the greatest challenge"
Rosenberg PS et al, Statistics in Medicine 2003; 22:3369-3381
Discussion of issues in (univariate) modelling with splines
Trivial nowadays to fit almost any model
To choose a good model is much harder
7Alcohol consumption as risk factor for oral cancer
Rosenberg et al, StatMed 2003
8Building multivariable regression models
- Before dealing with the functional form, the easier problem of model selection:
- variable selection assuming that the effect of each continuous variable is linear
9Multivariable models - methods for variable
selection
- Full model
- variance inflation in the case of multicollinearity
- Stepwise procedures → prespecified ($\alpha_{in}$, $\alpha_{out}$) and actual significance level?
- forward selection (FS)
- stepwise selection (StS)
- backward elimination (BE)
- All subset selection → which criteria?
- $C_p$ (Mallows) $= SSE / \hat{\sigma}^2 - n + 2p$
- AIC (Akaike Information Criterion) $= n \ln(SSE / n) + 2p$
- BIC (Bayes Information Criterion) $= n \ln(SSE / n) + p \ln(n)$
- all of the form: fit + penalty
- Combining selection with Shrinkage
- Bayes variable selection
- Recommendations???
Central issue: MORE OR LESS COMPLEX MODELS?
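The "fit plus penalty" structure of the all-subset criteria can be made concrete with a few lines of Python. This is a minimal sketch for a normal-errors linear model, using Cp = SSE/sigma2_hat - n + 2p, AIC = n ln(SSE/n) + 2p and BIC = n ln(SSE/n) + p ln(n); the SSE values, sample size and model labels are hypothetical.

```python
import math

def mallows_cp(sse, sigma2_hat, n, p):
    """Mallows' Cp: SSE / sigma2_hat - n + 2p (sigma2_hat from the full model)."""
    return sse / sigma2_hat - n + 2 * p

def aic(sse, n, p):
    """Akaike Information Criterion: n * ln(SSE / n) + 2p."""
    return n * math.log(sse / n) + 2 * p

def bic(sse, n, p):
    """Bayes Information Criterion: n * ln(SSE / n) + p * ln(n)."""
    return n * math.log(sse / n) + p * math.log(n)

# Hypothetical candidate models: label -> (SSE, number of parameters p).
n = 100
candidates = {"3 variables": (120.0, 3), "5 variables": (115.0, 5)}

best_by_aic = min(candidates, key=lambda m: aic(candidates[m][0], n, candidates[m][1]))
best_by_bic = min(candidates, key=lambda m: bic(candidates[m][0], n, candidates[m][1]))
print("AIC prefers:", best_by_aic)
print("BIC prefers:", best_by_bic)
```

With these made-up numbers the two criteria disagree: the heavier penalty p ln(n) pushes BIC towards the smaller model, which is exactly the "more or less complex models" question.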
10Backward elimination is a sensible approach
- Significance level can be chosen
- Reduces overfitting
- Of course required
- Checks
- Sensitivity analysis
- Stability analysis
11Continuous variables: what functional form?
Traditional approaches
a) Linear function
- may be inadequate functional form
- misspecification of functional form may lead to wrong conclusions
b) "best" standard transformation
c) Step function (categorical data)
- Loss of information
- How many cutpoints?
- Which cutpoints?
- Bias introduced by outcome-dependent choice
12StatMed 2006, 25:127-141
13Continuous variables newer approaches
- Non-parametric (local-influence) models
- Locally weighted (kernel) fits (e.g. lowess)
- Regression splines
- Smoothing splines
- Parametric (non-local influence) models
- Polynomials
- Non-linear curves
- Fractional polynomials
- Intermediate between polynomials and non-linear
curves
14Fractional polynomial models
- Describe for one covariate, X
- Fractional polynomial of degree m for X with powers $p_1, \ldots, p_m$ is given by $FP_m(X) = \beta_1 X^{p_1} + \ldots + \beta_m X^{p_m}$
- Powers $p_1, \ldots, p_m$ are taken from a special set $\{-2, -1, -0.5, 0, 0.5, 1, 2, 3\}$
- Usually m = 1 or m = 2 is sufficient for a good fit
- Repeated powers ($p_1 = p_2$)
- $\beta_1 X^{p_1} + \beta_2 X^{p_1} \log X$
- 8 FP1, 36 FP2 models
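The counts of 8 FP1 and 36 FP2 models follow directly from the power set, and the repeated-power rule is easy to code. The sketch below is illustrative only: the function names are hypothetical, and it uses the usual FP convention that power 0 stands for log X, which requires X > 0.

```python
import math
from itertools import combinations_with_replacement

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp_term(x, p):
    """x**p, with the FP convention that power 0 means log(x). Needs x > 0."""
    return math.log(x) if p == 0 else x ** p

def fp_terms(x, powers):
    """Basis terms of an FP with the given powers.

    A repeated power contributes the previous term multiplied by log(x),
    e.g. powers (1, 1) give x and x*log(x).
    """
    terms = []
    previous = None
    for p in sorted(powers):
        if p == previous:
            terms.append(terms[-1] * math.log(x))
        else:
            terms.append(fp_term(x, p))
        previous = p
    return terms

fp1_models = [(p,) for p in POWERS]
fp2_models = list(combinations_with_replacement(POWERS, 2))

print(len(fp1_models), "FP1 models,", len(fp2_models), "FP2 models")
print(fp_terms(4.0, (0.5, 2)))   # distinct powers
print(fp_terms(4.0, (1, 1)))     # repeated power
```

The 36 FP2 models are the 28 pairs of distinct powers plus the 8 repeated-power pairs.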
15Examples of FP2 curves- varying powers
16Examples of FP2 curves- single power, different
coefficients
17Our philosophy of function selection
- Prefer simple (linear) model
- Use more complex (non-linear) FP1 or FP2 model if indicated by the data
- Contrasts to more local regression modelling
- Already starts with a complex model
18 Example: Prognostic factors. GBSG-study in node-positive breast cancer
- 299 events for recurrence-free survival time (RFS) in 686 patients with complete data
- 7 prognostic factors, of which 5 are continuous
19FP analysis for the effect of age
20Function selection procedure (FSP): Effect of age at 5% level?
Question                    Comparison               χ²     df  p-value
Any effect?                 Best FP2 versus null     17.61  4   0.0015
Linear function suitable?   Best FP2 versus linear   17.03  3   0.0007
FP1 sufficient?             Best FP2 vs. best FP1    11.20  2   0.0037
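The three questions in the table can be read as a sequence of tests that stops at the first non-significant comparison. The following Python sketch works under that reading. The function names are hypothetical, the chi-square tail probability is computed with the standard library only, and the deviance differences are the ones shown for age.

```python
import math

def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        return math.exp(-half) * sum(
            half ** k / math.factorial(k) for k in range(df // 2)
        )
    # Odd df: erfc term plus a finite sum with half-integer gamma functions
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (k - 0.5) / math.gamma(k + 0.5)
        for k in range(1, (df - 1) // 2 + 1)
    )
    return tail

def select_function(dev_fp2_vs_null, dev_fp2_vs_linear, dev_fp2_vs_fp1, alpha=0.05):
    """Return the simplest function that the best FP2 does not beat at level alpha."""
    steps = [
        (dev_fp2_vs_null, 4, "not included"),   # Any effect?
        (dev_fp2_vs_linear, 3, "linear"),       # Linear function suitable?
        (dev_fp2_vs_fp1, 2, "FP1"),             # FP1 sufficient?
    ]
    for deviance_difference, df, simpler_choice in steps:
        if chi2_sf(deviance_difference, df) >= alpha:
            return simpler_choice
    return "FP2"

for dev, df in [(17.61, 4), (17.03, 3), (11.20, 2)]:
    print(df, "df: p =", chi2_sf(dev, df))
print("Selected for age:", select_function(17.61, 17.03, 11.20))
```

All three comparisons are significant at the 5% level, so under this reading an FP2 function is selected for age.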
21Many predictors MFP
- With many continuous predictors selection of best FP for each becomes more difficult → MFP algorithm as a standardized way to variable and function selection
- (usually binary and categorical variables are also available)
- MFP algorithm combines
- backward elimination with
- FP function selection procedures
22Continuous factors: Different results with different analyses. Age as prognostic factor in breast cancer (adjusted)
[Figure: three analyses of age; P-values 0.9, 0.2, 0.001]
23Results similar? Nodes as prognostic factor in breast cancer (adjusted)
[Figure: three analyses of nodes; P-values 0.001, 0.001, 0.001]
24Multivariable FP
Final model in breast cancer example: age, grade, nodes, progesterone
- Model chosen out of
- more than a million possible models,
- one model selected
- Model
- Sensible?
- Interpretable?
- Stable?
- Bootstrap stability analysis (see R & S 2003)
25Example Risk factors
- Whitehall 1
- 17,370 male Civil Servants aged 40-64 years, 1670 (9.7%) died
- Measurements include: age, cigarette smoking, BP, cholesterol, height, weight, job grade
- Outcomes of interest: all-cause mortality at 10 years → logistic regression
26Whitehall 1 Systolic blood pressure Deviance
difference in comparison to a straight line for
FP(1) and FP(2) models
27Similar fit of several functions
28Presentation of models for continuous covariates
- The function + 95% CI gives the whole story
- Functions for important covariates should always be plotted
- In epidemiology, sometimes useful to give a more conventional table of results in categories
- This can be done from the fitted function
29Whitehall 1 Systolic blood pressure Odds ratio
from final FP(2) model LogOR 2.92 5.43X-2
14.30 X 2 log X Presented in categories
30Whitehall 1: MFP analysis
No variables were eliminated by the MFP algorithm.
Assuming a linear function, weight is eliminated by backward elimination.
31Interactions: Motivation I
- Detecting predictive factors (interaction with treatment)
- Don't investigate effects in separate subgroups!
- Investigation of treatment/covariate interaction requires statistical tests
- Care is needed to avoid over-interpretation
- Distinguish two cases
- Hypothesis generation: searching several interactions
- Specific predefined hypothesis
- For current bad practice - see Assmann et al (Lancet 2000)
32Motivation - II
- Continuous by continuous interactions
- usually linear by linear product term
- not sensible if main effect (prognostic effect) is non-linear
- mismodelling the main effect may introduce spurious interactions
33Detecting predictive factors (treatment-covariate interaction)
- Most popular approach
- Treatment effect in separate subgroups
- Has several problems (Assmann et al 2000)
- Test of treatment/covariate interaction required
- For binary covariate, standard test for interaction available
- Continuous covariate
- Often categorized into two groups
34Categorizing a continuous covariate
- How many cutpoints?
- Position of the cutpoint(s)
- Loss of information → loss of power
35Standard approach
- Based on binary predictor
- Need cut-point for continuous predictor
- Illustration - problem with cut-point approach
- TAM × ER interaction in breast cancer (GBSG-study)
36Treatment effect by subgroup
37New approaches for continuous covariates
- STEPP
- Subpopulation treatment effect pattern plots
- Bonetti Gelber 2000
- MFPI
- Multivariable fractional polynomial
- interaction approach
- Royston Sauerbrei 2004
38STEPP
- Sequences of overlapping subpopulations
- Two variants: sliding window, tail oriented
[Figure: subpopulations along the continuous covariate; 2g-1 subpopulations (here g = 8)]
39STEPP
Estimates in subpopulations
- No interaction → treatment effects similar in all subpopulations
- Plot effects in subpopulations
40STEPP
- Overlapping populations, therefore correlation between treatment effects in subpopulations
- Simultaneous confidence band and tests proposed
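To see where the 2g-1 overlapping subpopulations can come from, here is a small Python sketch of one plausible tail-oriented construction. It is an assumption made for illustration, not the definition from Bonetti and Gelber: the ordered sample is cut at g-1 points, and the subpopulations are the g-1 growing lower tails, the full population, and the g-1 shrinking upper tails. The function name and the toy data are hypothetical.

```python
def tail_oriented_subpopulations(values, g):
    """Return index lists for 2g-1 overlapping subpopulations.

    ASSUMPTION (illustrative only): cut the ordered sample into g roughly
    equal groups, then form g-1 lower tails, the full sample, and g-1
    upper tails.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    n = len(order)
    # g-1 cut positions splitting the ordered sample into g groups
    cuts = [round(j * n / g) for j in range(1, g)]
    lower_tails = [order[:c] for c in cuts]   # smallest values, growing
    upper_tails = [order[c:] for c in cuts]   # largest values, shrinking
    return lower_tails + [order] + upper_tails

# Toy covariate values, for illustration only.
covariate = [float(v) for v in range(80)]
subpopulations = tail_oriented_subpopulations(covariate, g=8)
print(len(subpopulations), "subpopulations")
print([len(s) for s in subpopulations])
```

Because neighbouring subpopulations share most of their patients, the estimated treatment effects are strongly correlated, which is why simultaneous confidence bands are needed.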
41MFPI
- Have one continuous factor X of interest
- Use other prognostic factors to build an adjustment model, e.g. by MFP
- Interaction part, with or without adjustment
- Find best FP2 transformation of X with same powers in each treatment group
- LRT of equality of regression coefficients
- Test against main effects model (no interaction) based on χ² with 2 df
- Distinguish
- predefined hypothesis
- hypothesis searching
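The 2 df of the MFPI interaction test correspond to two extra columns in the model: the products of the treatment indicator with the two FP2 terms. A minimal Python sketch of the covariate rows for one patient follows, leaving out the adjustment variables; the function names, the chosen powers and the covariate value are hypothetical.

```python
import math

def fp2_basis(x, p1, p2):
    """The two FP2 terms for x > 0; power 0 means log(x), and a repeated
    power gives x**p and x**p * log(x)."""
    def term(p):
        return math.log(x) if p == 0 else x ** p
    first = term(p1)
    second = first * math.log(x) if p1 == p2 else term(p2)
    return first, second

def mfpi_rows(x, treatment, powers):
    """Covariate rows for one patient (adjustment variables omitted).

    treatment is coded 0/1. The interaction row adds treatment * f1 and
    treatment * f2, which lets the FP2 coefficients differ between groups
    while the powers stay the same.
    """
    f1, f2 = fp2_basis(x, *powers)
    main_effects = [treatment, f1, f2]
    interaction = main_effects + [treatment * f1, treatment * f2]
    return main_effects, interaction

# Hypothetical powers and covariate value, for illustration only.
main_row, interaction_row = mfpi_rows(2.0, treatment=1, powers=(1, 2))
print(main_row)
print(interaction_row)
print("extra parameters tested:", len(interaction_row) - len(main_row))
```

Fitting both models and comparing their deviances gives the likelihood ratio test with 2 df; the fitting step itself depends on the regression model and is not shown here.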
42RCT Metastatic renal carcinoma
Comparison of MPA with interferon: N = 347, 322 deaths
43Overall: Interferon is better (p < 0.01)
- Is the treatment effect similar in all patients?
- Sensible questions?
- Yes, from our point of view
- Ten factors available for the investigation of
treatment covariate interactions
44MFPI
- Treatment effect function for WCC
- Only a result of complex (mis-)modelling?
45Does the MFPI model agree with the data? Check proposed trend
Treatment effect in subgroups defined by WCC
HR (Interferon to MPA; adjusted values similar)
Group    HR (interval)
overall  0.75 (0.60-0.93)
I        0.53 (0.34-0.83)
II       0.69 (0.44-1.07)
III      0.89 (0.57-1.37)
IV       1.32 (0.85-2.05)
46STEPP: Interaction with WCC
[Figure: sliding window (n1 = 25, n2 = 40) and tail oriented (g = 8)]
47STEPP as check of MFPI
[Figure: STEPP tail-oriented, g = 6]
48MFPI Type I error
- Random permutation of a continuous covariate (haemoglobin)
- → no interaction
- Distribution of P-value from test of interaction
- 1000 runs, Type I error 0.054
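Whether an observed rejection rate of 0.054 in 1000 runs is compatible with the nominal 5% level can be checked with the binomial standard error. The short Python calculation below is a sketch of that check; the variable names are arbitrary.

```python
import math

runs = 1000
nominal = 0.05
observed = 0.054

# Monte Carlo standard error of a rejection rate under the nominal level
standard_error = math.sqrt(nominal * (1.0 - nominal) / runs)
z = (observed - nominal) / standard_error
interval = (nominal - 1.96 * standard_error, nominal + 1.96 * standard_error)

print("standard error:", round(standard_error, 4))
print("z:", round(z, 2))
print("approx. 95% range under nominal level:", tuple(round(v, 3) for v in interval))
```

The observed rate lies well inside the range expected from simulation noise alone, so the result gives no indication of an inflated type I error.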
49Continuous by continuous interactions: MFPIgen
- Have Z1, Z2 continuous and X confounders
- Apply MFP to X, Z1 and Z2, forcing Z1 and Z2 into the model. FP functions f1(Z1) and f2(Z2) will be selected for Z1 and Z2
- Add term f1(Z1) × f2(Z2) to the model chosen and use LRT for test of interaction
- Often f1(Z1) and/or f2(Z2) are linear
- Check all pairs of continuous variables for an interaction
- Check (graphically) interactions for artefacts
- Use forward stepwise if more than one interaction remains
- Low significance level for interactions
50Interactions: Whitehall 1
- Consider only age and weight
- Main effects
- age: linear
- weight: FP2 (-1, 3)
- Interaction?
- Include age × weight^-1 + age × weight^3 into the model
- LRT: χ² = 5.27 (2 df, p = 0.07)
- → no (strong) interaction
51- Erroneously assume that the effect of weight is linear
- Interaction?
- Include age × weight into the model
- LRT: χ² = 8.74 (1 df, p = 0.003)
- → highly significant interaction
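The two analyses differ only in which product terms are added, and the number of product terms equals the degrees of freedom of the likelihood ratio test. The Python sketch below builds the products for one person and recomputes the two p-values from the reported χ² statistics; the function names and the age and weight values are hypothetical.

```python
import math

def fp_term(x, p):
    """x**p, with power 0 meaning log(x)."""
    return math.log(x) if p == 0 else x ** p

def interaction_terms(z1, powers1, z2, powers2):
    """All pairwise products of the FP terms of z1 and z2
    (distinct powers within each variable assumed)."""
    return [fp_term(z1, p) * fp_term(z2, q) for p in powers1 for q in powers2]

def p_value(chi2, df):
    """Chi-square upper tail probability for df = 1 or df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("only df = 1 or 2 handled in this sketch")

age, weight = 50.0, 80.0   # hypothetical values for one person

# age linear, weight FP2 (-1, 3): two product terms, 2 df
fp_products = interaction_terms(age, (1,), weight, (-1, 3))
# age linear, weight (wrongly) linear: one product term, 1 df
linear_products = interaction_terms(age, (1,), weight, (1,))

print(len(fp_products), "terms, p =", round(p_value(5.27, 2), 3))
print(len(linear_products), "term,  p =", round(p_value(8.74, 1), 3))
```

The recomputed p-values agree with the reported 0.07 and 0.003, which illustrates how a misspecified main effect of weight can produce an apparently strong interaction.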
52- Model check
- categorize age in 4 equal sized groups
- Compute running line smooth of the binary outcome on weight in each group
53Whitehall 1: check of age × weight interaction
54- Running line smooths are about parallel across age groups → no (strong) interactions
- smoothed probabilities are about equally spaced → effect of age is linear
55- Erroneously assume that the effect of weight is linear
- Estimated slopes of weight in age-groups indicate strong qualitative interaction between age and weight
56Whitehall 1
- P-values for two-way interactions from MFPIgen
[Table of P-values by FP transformations not recoverable]
chol × age highly significant
57Presentation of interactions: Whitehall 1, age × chol interaction
Effect (adjusted) for 10th, 35th, 65th and 90th centile
58Age × Chol interaction
- Chol low: age has an effect
- Chol high: age has no effect
- Age low: chol has an effect
- Age high: chol has no effect
59Age × Chol interaction
- Does the model fit? Check in 4 subgroups
Linearity of chol ok. But: slopes are not monotonically ordered. Lack of fit of linear × linear interaction
60- More complicated model?
- Interaction real?
- Validation in new data!
61Survival data
- effect of a covariate may vary in time → time by covariate interaction
62Extending the Cox model
- Cox model
- $\lambda(t \mid X) = \lambda_0(t) \exp(\beta X)$
- Relax PH-assumption
- dynamic Cox model
- $\lambda(t \mid X) = \lambda_0(t) \exp(\beta(t) X)$
- HR(x,t): function of X and time t
- Relax linearity assumption
- $\lambda(t \mid X) = \lambda_0(t) \exp(\beta f(X))$
63Causes of non-proportionality
- Effect gets weaker with time
- Incorrect modelling
- omission of an important covariate
- incorrect functional form of a covariate
- different survival model is appropriate
64Non-PH What can be done?
- Non-PH
- Does it matter?
- Is it real?
- Non-PH is large and real
- stratify by the factor
- $\lambda(t \mid X, V = j) = \lambda_j(t) \exp(X\beta)$
- effect of V not estimated, not tested
- for continuous variables grouping necessary
- Partition time axis
- Model non-proportionality by time-dependent covariate
65Example: Time-varying effects. Rotterdam breast cancer data
- 2982 patients
- 1 to 231 months follow-up time
- 1518 events for RFI (recurrence free interval)
- Adjuvant treatment with chemo- or hormonal therapy according to clinic guidelines
- 70% without adjuvant treatment
- Covariates
- continuous: age, number of positive nodes, estrogen, progesterone
- categorical: menopausal status, tumor size, grade
66- Treatment variables (chemo, hormon) will be analysed as usual covariates
- 9 covariates, partly strong correlation (age-meno; estrogen-progesterone; chemo, hormon - nodes)
- variable selection
- Use multivariable fractional polynomial approach for model selection in the Cox proportional hazards model
67Assessing PH-assumption
- Plots
- Plots of log(-log(S(t))) vs log t should be parallel for groups
- Plotting Schoenfeld residuals against time to identify patterns in regression coefficients
- Many other plots proposed
- Tests
- many proposed, often based on Schoenfeld residuals,
- most differ only in choice of time transformation
- Partition the time axis and fit models separately to each time interval
- Include time-by-covariate interaction terms in the model and estimate the log hazard ratio function
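The first diagnostic plot needs, for each group, the Kaplan-Meier estimate S(t) transformed to log(-log(S(t))) against log t. A minimal standard-library Python sketch of these plotting coordinates follows; the function names and the toy data are hypothetical, and no plotting library is assumed.

```python
import math

def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, S(t)) pairs.

    events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all observations tied at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

def log_minus_log(curve):
    """(log t, log(-log S(t))) wherever t > 0 and 0 < S(t) < 1."""
    return [
        (math.log(t), math.log(-math.log(s)))
        for t, s in curve
        if t > 0 and 0.0 < s < 1.0
    ]

# Toy data for one group, for illustration only.
curve = kaplan_meier([2, 4, 4, 6], [1, 1, 0, 1])
points = log_minus_log(curve)
print(curve)
print(points)
```

Computing `points` separately for each group and plotting them together shows whether the curves are roughly parallel, as expected under proportional hazards.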
68Smoothed Schoenfeld residuals multivariable MFP
model assuming PH
69Selected model with MFP
test of time-varying effect for different time
transformations
estimates
70Including time by covariate interaction: (semi-)parametric models for $\beta(t)$
- model $\beta(t) x = \beta x + \gamma x g(t)$
- calculate time-varying covariate $x g(t)$
- fit time-varying Cox model and test for $\gamma \neq 0$
- plot $\beta(t)$ against t
- g(t): which form?
- usual function, e.g. t, log(t)
- piecewise
- splines
- fractional polynomials
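A time-varying effect of this kind can be written as a fixed part plus a coefficient times a chosen function of time, here beta(t) = beta + gamma * g(t) with g(t) = log(t). The Python sketch below evaluates that function and the constructed covariate x * g(t). The symbol gamma, the function names and the coefficient values are assumptions made only for illustration.

```python
import math

def beta_t(t, beta, gamma, g=math.log):
    """Time-varying log hazard ratio beta(t) = beta + gamma * g(t)."""
    return beta + gamma * g(t)

def time_varying_covariate(x, t, g=math.log):
    """The constructed covariate x * g(t) that enters the time-varying Cox model."""
    return x * g(t)

# Hypothetical coefficients: an effect that gets weaker with time.
beta, gamma = 0.8, -0.2

for months in (1, 12, 60):
    log_hr = beta_t(months, beta, gamma)
    print(months, "months: log HR =", round(log_hr, 3),
          " HR =", round(math.exp(log_hr), 3))
```

With a negative gamma the log hazard ratio shrinks towards zero as time passes, the pattern described as "effect gets weaker with time"; testing whether gamma differs from zero is the test for a time-varying effect.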
71MFPTime algorithmMotivation
- Multivariable strategy required to select
- Variables which have influence on outcome
- For continuous variables: determine functional form of the influence (usual linearity assumption sensible?)
- Proportional hazards assumption sensible, or does a time-varying function fit the data better?
72MFPTime algorithm (1)
- Determine (time-fixed) MFP model M0
- possible problems
- variable included, but effect is not constant in time
- variable not included because of short term effect only
- Consider short term period only
- Additional to M0 significant variables?
- This gives M1
73MFPTime algorithm (2)
- For all variables (with transformations) selected from full time-period and short time-period
- Investigate time function for each covariate in forward stepwise fashion - may use small P value
- Adjust for covariates from selected model
- To determine time function for a variable
- compare deviance of models (χ²) from
- FPT2 to null (time fixed effect): 4 DF
- FPT2 to log: 3 DF
- FPT2 to FPT1: 2 DF
- Use strategy analogous to stepwise to add time-varying functions to MFP model M1
74Development of the model
75Time-varying effects in final model
76Final model includes time-varying functions for progesterone (log(t)) and tumor size (log(t)).
Prognostic ability of the Index vanishes in time
77Software sources MFP
- Most comprehensive implementation is in Stata
- Command mfp has been part of Stata since Stata 8 (now Stata 10)
- Versions for SAS and R are available
- SAS
- www.imbi.uni-freiburg.de/biom/mfp
- R version available on CRAN archive
- mfp package
- Extensions to investigate interactions
- So far only in Stata
78Concluding comments MFP
- FPs use full information - in contrast to a priori categorisation
- FPs search within flexible class of functions (FP1 and FP2: 44 models)
- MFP is a well-defined multivariate model-building strategy; it combines search for transformations with BE
- Important that model reflects medical knowledge, e.g. monotonic / asymptotic functional forms
79Towards recommendations for model-building by
selection of variables and functional forms for
continuous predictors under several assumptions
Sauerbrei et al. SiM 2007
80Interactions
- Interactions are often ignored by analysts
- Continuous × categorical has been studied in FP context because clinically very important
- Continuous × continuous is more complex
- Interaction with time important for long-term FU survival data
81MFP extensions
- MFPI: treatment/covariate interactions
- In contrast to STEPP it avoids categorisation
- MFPIgen: interaction between two continuous variables
- MFPT: time-varying effects in survival data
82Summary
- Getting the big picture right is more important than optimising aspects and ignoring others
- strong predictors
- strong non-linearity
- strong interactions
- strong non-PH in survival model
83References
Harrell FE jr. (2001) Regression Modeling Strategies. Springer.
Royston P, Altman DG. (1994) Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling (with discussion). Applied Statistics, 43, 429-467.
Royston P, Altman DG, Sauerbrei W. (2006) Dichotomizing continuous predictors in multiple regression: a bad idea. Statistics in Medicine, 25, 127-141.
Royston P, Sauerbrei W. (2004) A new approach to modelling interactions between treatment and continuous covariates in clinical trials by using fractional polynomials. Statistics in Medicine, 23, 2509-2525.
Royston P, Sauerbrei W. (2005) Building multivariable regression models with continuous covariates, with a practical emphasis on fractional polynomials and applications in clinical epidemiology. Methods of Information in Medicine, 44, 561-571.
Royston P, Sauerbrei W. (2007) Improving the robustness of fractional polynomial models by preliminary covariate transformation: a pragmatic approach. Computational Statistics and Data Analysis, 51, 4240-4253.
Royston P, Sauerbrei W. (2008) Multivariable Model-Building: A pragmatic approach to regression analysis based on fractional polynomials for continuous variables. Wiley.
Sauerbrei W. (1999) The use of resampling methods to simplify regression models in medical statistics. Applied Statistics, 48, 313-329.
Sauerbrei W, Meier-Hirmer C, Benner A, Royston P. (2006) Multivariable regression model building by using fractional polynomials: Description of SAS, STATA and R programs. Computational Statistics and Data Analysis, 50, 3464-3485.
Sauerbrei W, Royston P. (1999) Building multivariable prognostic and diagnostic models: transformation of the predictors by using fractional polynomials. Journal of the Royal Statistical Society A, 162, 71-94.
Sauerbrei W, Royston P, Binder H. (2007) Selection of important variables and determination of functional form for continuous predictors in multivariable model building. Statistics in Medicine, 26, 5512-5528.
Sauerbrei W, Royston P, Look M. (2007) A new proposal for multivariable modelling of time-varying effects in survival data based on fractional polynomial time-transformation. Biometrical Journal, 49, 453-473.
Sauerbrei W, Royston P, Zapien K. (2007) Detecting an interaction between treatment and a continuous covariate: a comparison of two approaches. Computational Statistics and Data Analysis, 51, 4054-4063.
Schumacher M, Holländer N, Schwarzer G, Sauerbrei W. (2006) Prognostic Factor Studies. In Crowley J, Ankerst DP (ed.), Handbook of Statistics in Clinical Oncology, Chapman & Hall/CRC, 289-333.