Title: Prognostic Factor Analyses
1Prognostic Factor Analyses
- Yue-Cune Chang
2Introduction
- Prognostic factor analyses (PFAs) are studies
that attempt to assess the relative importance of
several predictor variables simultaneously. The
need to prognosticate is basic to clinical
reasoning, but most of us are unable to account
quantitatively for the effects of more than one
or two variables at a time. Using the formality
of a PFA, the additional structure provided by a
statistical model, and thoughtful displays of
data and effect estimates, one can extend
quantitative accounting to many predictor
variables.
- PFAs are usually based on data in which
investigators did not control the predictor
variables or confounders, as they would in an
experimental design.
- The validity of PFAs depends on the absence of
strong selection bias, on the correctness of the
statistical models employed, and on having
observed, recorded, and analyzed appropriately
the important predictor variables.
3Why do we need prognostic factor analyses?
- To learn the relative importance of several
variables that are simultaneously associated with
disease outcome. This is especially important for
diseases that are treated imperfectly, such as AIDS,
cardiovascular disease, and cancer.
- To improve the design of clinical trials, e.g. by
adjusting for the severity of the disease and
treating severity as a prognostic factor.
- To detect possible interaction effects
between treatment and covariates or among
the prognostic factors themselves.
- To assess clinical landmarks during the course of
an illness and to decide whether changes in treatment
strategy are warranted. For example, when
monitoring the CD4 lymphocyte count over time
in HIV-positive patients, a drop below some threshold
may indicate a change in treatment. A threshold such as
this could be determined by an appropriate PFA.
4Types of Prognostic Factors
- Prognostic factors can be continuous measures,
ordinal, binary, or categorical. Most often,
prognostic factors are recorded at study entry
or time 0 (baseline) with respect to follow-up
time and remain constant, e.g. sex, treatment.
- Some prognostic factors may change their value
over time; these are called time-dependent
covariates (TDC).
- There are two types of time-dependent covariates:
intrinsic or internal (i.e. those that exist
within the study subject), and extrinsic or external
(i.e. those that exist independently of the study
subject, e.g. the environmental levels of toxins
in a cancer study).
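A time-dependent covariate is often handled by splitting each subject's follow-up into intervals during which the covariate is constant. The sketch below illustrates that layout for an internal covariate. The (start, stop, value, event) row format, the function name `split_follow_up`, and the CD4 values are assumptions made for illustration; none of them come from the slides.

```python
from typing import List, Tuple

def split_follow_up(
    end_time: float,
    event: int,
    changes: List[Tuple[float, float]],
) -> List[Tuple[float, float, float, int]]:
    """Split one subject's follow-up into (start, stop, value, event) rows.

    `changes` lists (time, new_value) pairs sorted by time, with the first
    pair at time 0 giving the baseline value.
    """
    rows = []
    for i, (start, value) in enumerate(changes):
        if start >= end_time:
            break  # measurements after the end of follow-up are ignored
        # The interval ends at the next measurement or at the end of follow-up.
        next_time = changes[i + 1][0] if i + 1 < len(changes) else end_time
        stop = min(next_time, end_time)
        # The event indicator is attached only to the final interval.
        is_last = stop == end_time
        rows.append((start, stop, value, event if is_last else 0))
    return rows

# Hypothetical subject: CD4 count measured at weeks 0, 12 and 30; death at week 40.
rows = split_follow_up(40, 1, [(0, 520), (12, 410), (30, 180)])
for row in rows:
    print(row)
```

Each row can then be treated as a separate observation period by software that accepts interval-format survival data.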
5Model-Based Methods
- One of the most powerful and flexible methods for
assessing the effects of more than one prognostic
factor simultaneously is a statistical model.
- Statistical models can be used to describe a
plausible mathematical relationship between the
predictors and the observed endpoint in terms of
one or more model parameters that have convenient
clinical interpretations.
6Using statistical models, the investigator must
- be knowledgeable about the subject matter and
interpretation
- collect and verify complete data
- consult an experienced statistical expert to
guide/do the analysis
- make a plan for dealing with decision points
during the analysis.
7- Models combine theory and data
- A model is any construct that combines
theoretical knowledge, represented by
equation(s), with empirical knowledge,
represented by data.
- The scale of measurements (coding) may be
important
- The first step in a PFA is to code the
measurements (variable values) in a numerically
appropriate way. Even qualitative variables can
be represented by the proper choice of numerical
coding. Ordinal and qualitative variables often
need to be re-coded as binary indicator or
dummy variables that facilitate group
comparisons. A variable with N levels requires
N-1 binary dummy variables to compare levels in a
regression model. Each dummy variable implies a
comparison between a specific level and the
reference level (which is omitted from the
model).
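The N-1 dummy-variable coding described above can be sketched in a few lines. The function name `dummy_code`, the `is_` column prefix, and the three-level example variable are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

def dummy_code(values: List[str], reference: str) -> Dict[str, List[int]]:
    """Return N-1 indicator columns for a categorical variable with N levels."""
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in data")
    columns = {}
    for level in levels:
        if level == reference:
            continue  # the reference level is omitted from the model
        # 1 where the observation is at this level, 0 otherwise
        columns[f"is_{level}"] = [1 if v == level else 0 for v in values]
    return columns

# Hypothetical three-level variable; "low" is the reference level.
stage = ["low", "high", "medium", "low", "high"]
dummies = dummy_code(stage, reference="low")
print(dummies)
```

With three levels the function returns two columns. In a regression, each column's coefficient then compares its level with the omitted reference level.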
8Some widely used statistical models in clinical
trials
- The appropriate statistical models to
employ in PFAs are dictated by the specific
type of data and biological questions.
- Generalized Linear Models
- General Linear Models
- Regression, ANOVA, ANCOVA
- Logistic Regression
- Log-linear Model
- Proportional Hazards Model (PH Models)
- Cox's regression models (Cox's PH Model)
9(Flowchart for choosing an analysis method; only the labels below are recoverable)
- Decision node: Is Y Normal, Binomial, or Poisson? (Yes/No)
- Methods named: Generalized Linear Model; GEE Methods of Generalized Linear Model; Wilcoxon Rank Sum test, Kruskal-Wallis test
10Use Appropriate Statistical Model
- Depends on the specific type of data and
biological questions (study purposes)
- General Linear Model
- Logistic Regression
- Log-linear Model
11Cox's Proportional Hazards Model
- In Cox's PH model, the logarithm of the
hazard ratio is assumed to be constant over time
and equal to a linear combination of the predictor
variables
- Hazard rate: instantaneous failure (death) rate
- Example: age-specific death rate
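The model can be written out as follows. The notation is the conventional one and is assumed here, since the slides give no symbols: $T$ is the failure time, $x_1, \dots, x_p$ are the predictors, $\beta_1, \dots, \beta_p$ are the regression coefficients, and $h_0(t)$ is the baseline hazard.

```latex
% Hazard rate: instantaneous failure rate at time t, given survival to t
h(t \mid x) = \lim_{\Delta t \to 0}
  \frac{\Pr(t \le T < t + \Delta t \mid T \ge t,\ x)}{\Delta t}

% Proportional hazards model: baseline hazard times a log-linear term
h(t \mid x) = h_0(t)\, \exp(\beta_1 x_1 + \cdots + \beta_p x_p)

% Log hazard ratio relative to baseline: linear in x, free of t
\log \frac{h(t \mid x)}{h_0(t)} = \beta_1 x_1 + \cdots + \beta_p x_p
```

Under this form, $\exp(\beta_j)$ is the hazard ratio for a one-unit increase in $x_j$ with the other predictors held fixed.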
12Coding of Variables for Brain Tumor Clinical Trial
- Treat: 0 = placebo, 1 = polymer
- Resect75: 0 = <75% resection, 1 = >75% resection
- Age: Age in years
- Interval: Years from diagnosis
- Karn (PS): 0 = <70, 1 = ≥70
- Race: 1 = White, 0 = other
- Local: 1 = Local, 0 = Whole Brain Irradiation
- Sex: 1 = Male, 0 = Female
- Nitro: 1 = Previous Nitrosourea, 0 = None
- Weeks: Survival time in weeks
- Event: 1 = Death, 0 = Alive
- Path: 1 = Glioblastoma, 2 = Anaplastic Astrocytoma, 3 = Oligodendroglioma, 4 = other
- Grade: 1 = Active, 0 = Quiescent
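A coding scheme like the one above is usually applied once, in a single function, so that every record is coded the same way. The sketch below covers a subset of the variables. The raw field names (`treatment`, `percent_resected`, and so on) are hypothetical. Assigning a value of exactly 75 to the upper Resect75 category is also an assumption, because the slide does not say how that boundary is handled.

```python
from typing import Dict, Union

Raw = Dict[str, Union[str, float]]

def code_patient(raw: Raw) -> Dict[str, Union[int, float]]:
    """Map one raw record (hypothetical field names) to the coded variables."""
    return {
        "Treat": 1 if raw["treatment"] == "polymer" else 0,
        # Boundary handling at exactly 75 is an assumption (see lead-in).
        "Resect75": 1 if raw["percent_resected"] >= 75 else 0,
        "Age": raw["age_years"],
        "Karn": 1 if raw["karnofsky"] >= 70 else 0,
        "Sex": 1 if raw["sex"] == "male" else 0,
        "Event": 1 if raw["status"] == "dead" else 0,
    }

# Made-up record for illustration only.
example = {
    "treatment": "polymer",
    "percent_resected": 80,
    "age_years": 54,
    "karnofsky": 60,
    "sex": "female",
    "status": "dead",
}
coded = code_patient(example)
print(coded)
```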
15Cox's Proportional Hazards Model Using SPSS V.10.0
- (SPSS output not transcribed; annotated items: Age/10, P-values, Hazard Rates Ratio)
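The Age/10 label suggests that age was entered in units of 10 years. Assuming that reading, the sketch below shows how the rescaling changes the coefficient and the hazard ratio without changing the fitted model. The coefficient value is hypothetical and is not taken from the SPSS output.

```python
import math

def hazard_ratio(beta: float, units: float = 1.0) -> float:
    """Hazard ratio for an increase of `units` in a predictor with log-hazard coefficient `beta`."""
    return math.exp(beta * units)

beta_per_year = 0.03  # hypothetical coefficient, not taken from the SPSS output
beta_per_decade = 10 * beta_per_year  # coefficient when age is entered as Age/10

hr_year = hazard_ratio(beta_per_year)
hr_decade = hazard_ratio(beta_per_decade)

# The per-decade hazard ratio is the per-year hazard ratio raised to the 10th power.
print(round(hr_year, 3), round(hr_decade, 3), round(hr_year ** 10, 3))
```

Reporting age per decade gives a hazard ratio that is further from 1 and usually easier to read clinically than a per-year value.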
17Logistic Regression Models Using STATA V8.0
19General Linear Models (ANCOVA) Using STATA V8.0
- Example: Effect of three teaching methods
on students' scores after adjusting for IQ.
- Method 1: Uses the standard lecture format.
- Method 2: Uses short movie clips at the
beginning of each period.
- Method 3: Uses a short interactive computer
module at the end of the period.
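The idea of comparing the three methods after adjusting for IQ can be illustrated with the classical ANCOVA adjusted means. This is a minimal sketch that assumes a common slope of score on IQ across the methods. It is not the STATA analysis from the slides, and the (IQ, score) pairs are made up.

```python
from statistics import mean
from typing import Dict, List, Tuple

def ancova_adjusted_means(
    groups: Dict[str, List[Tuple[float, float]]]
) -> Tuple[float, Dict[str, float]]:
    """Return the pooled within-group slope and covariate-adjusted group means.

    `groups` maps a group label to (covariate, outcome) pairs. A common
    slope across groups (no group-by-covariate interaction) is assumed.
    """
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    sxy = sxx = 0.0
    group_means = {}
    for label, pairs in groups.items():
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        group_means[label] = (mx, my)
        # Accumulate within-group sums of cross-products and squares.
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    # Shift each group mean to the value expected at the overall mean covariate.
    adjusted = {
        label: my - slope * (mx - grand_x)
        for label, (mx, my) in group_means.items()
    }
    return slope, adjusted

# Made-up (IQ, score) pairs for illustration only; not the data from the slides.
data = {
    "lecture": [(90, 60), (100, 65), (110, 70)],
    "movie": [(100, 70), (110, 75), (120, 80)],
    "computer": [(90, 70), (100, 75), (104, 77)],
}
slope, adjusted = ancova_adjusted_means(data)
print(slope, adjusted)
```

In this made-up data the raw mean score is higher for the movie group than for the computer group, but the order reverses after adjustment because the movie group has higher IQs.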
26Building parsimonious models
- Quantitative prognostic factor assessment can be
thought of as the process of constructing a
parsimonious statistical model. These models are
most useful when
- (1) they contain a few clinically relevant and
interpretable predictors,
- (2) the parameters or coefficients are
estimated with a reasonably high degree of precision,
- (3) the predictive factors each carry
independent information about prognosis,
- (4) the model is consistent with other
clinical and biological data.
- Constructing models that meet these criteria is
usually not a simple or automatic process. We can
use information in the data themselves
(data-based variable selection), clinical
knowledge (clinically based variable selection),
or both.
27- Don't use automated procedures.
- Resolve missing data
- Screen factors for importance in univariable
regressions
- Build multiple regressions
- Correlated predictors may be a problem (depends)
- The only reasonable way for the methodologist to
resolve these difficulties is to work with
clinicians who have expert knowledge of the
predictor variables, based on either other
studies or pre-clinical data. Their opinion, when
guided by statistical evidence, is necessary for
building a good model. Even when a particular set
of predictors appears to offer a slight statistical
improvement in fit over another, one should
generally prefer the set with the clearest
clinical interpretation. Of course, if the
statistical evidence concerning a particular
predictor or set of predictors is very strong,
then these models should be preferred or studied
very carefully to understand the mechanisms,
which may lead to new biological findings.
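Two of the checks listed on this slide, resolving missing data and looking for correlated predictors, can be supported by simple summaries. The sketch below is a screening aid that produces items to discuss with clinicians; it is not an automated selection procedure. The 0.8 threshold, the function names, and the predictor values are arbitrary assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple

Column = List[Optional[float]]

def missing_counts(data: Dict[str, Column]) -> Dict[str, int]:
    """Count missing (None) values for each predictor."""
    return {name: sum(v is None for v in col) for name, col in data.items()}

def pearson(x: Column, y: Column) -> float:
    """Pearson correlation using only rows where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)

def correlated_pairs(
    data: Dict[str, Column], threshold: float = 0.8
) -> List[Tuple[str, str, float]]:
    """List predictor pairs whose absolute correlation reaches the threshold."""
    names = list(data)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(data[a], data[b])
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 3)))
    return flagged

# Made-up predictor values for illustration only.
predictors = {
    "age": [45, 60, None, 52, 70],
    "interval": [1.0, 2.5, 0.5, 1.5, 3.0],
    "karn": [90, 70, 80, None, 60],
}
counts = missing_counts(predictors)
flagged = correlated_pairs(predictors)
print(counts)
print(flagged)
```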
28- Adjusted Analyses of Comparative Trials
- Investigators might consider adjusting
estimated treatment effects for prognostic
factors that meet one of the following criteria:
- (1) Factors that (by chance) are
statistically significantly unbalanced between the
treatment groups,
- (2) Factors that are strongly associated with
the outcome, whether (significantly) unbalanced or
not,
- (3) Factors for which one wants to demonstrate
that they do not artificially create the
treatment effect,
- (4) Factors of known clinical importance whose
effects one wants to illustrate and quantify.
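One simple way to see what adjustment does is to compare a crude odds ratio with a stratified one. The sketch below uses the Mantel-Haenszel estimator as a stand-in for the regression-based adjustment discussed in these slides. The two strata represent levels of a hypothetical binary prognostic factor that is unbalanced between arms, and all counts are made up.

```python
from typing import List, Tuple

# Each 2x2 table is (a, b, c, d):
#   a = treated with event,  b = treated without event,
#   c = control with event,  d = control without event.
Table = Tuple[int, int, int, int]

def crude_odds_ratio(strata: List[Table]) -> float:
    """Odds ratio from the table obtained by pooling all strata."""
    a = sum(t[0] for t in strata)
    b = sum(t[1] for t in strata)
    c = sum(t[2] for t in strata)
    d = sum(t[3] for t in strata)
    return (a * d) / (b * c)

def mantel_haenszel_odds_ratio(strata: List[Table]) -> float:
    """Mantel-Haenszel odds ratio, adjusted for the stratifying factor."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Made-up counts: the factor is strongly related to outcome and unbalanced between arms.
strata = [(20, 10, 10, 10), (4, 16, 5, 40)]
crude = crude_odds_ratio(strata)
adjusted_or = mantel_haenszel_odds_ratio(strata)
print(round(crude, 2), round(adjusted_or, 2))
```

In these made-up counts the odds ratio is 2 within each stratum, but the crude estimate is larger because the imbalanced factor is ignored. This corresponds to criteria (1) and (2) above.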
29Thanks for Your Attention