Title: Attribute Interactions in Medical Data Analysis
1Attribute Interactions in Medical Data Analysis
- A. Jakulin (1), I. Bratko (1,2), D. Smrke (3), J. Demšar (1), B. Zupan (1,2,4)
- (1) University of Ljubljana, Slovenia.
- (2) Jožef Stefan Institute, Ljubljana, Slovenia.
- (3) Dept. of Traumatology, University Clinical Center, Ljubljana, Slovenia.
- (4) Dept. of Human and Mol. Genetics, Baylor College of Medicine, USA.
2Overview
- Interactions
  - Correlation can be generalized to more than 2 attributes, to capture interactions: higher-order regularities.
- Information theory
  - A non-parametric approach for measuring association and uncertainty.
- Applications
  - Automatic selection of informative visualizations uncovers previously unseen structure in medical data.
  - Automatic constructive induction of new features.
- Results
  - Better predictive models for hip arthroplasty.
  - Better understanding of the data.
3Attribute Dependencies
[Figure: diagram of the label C (outcome, diagnosis) and the attributes (features) A and B, annotated with the importance of attribute A and the importance of attribute B.]
4Shannon's Entropy
[Figure: entropy diagram for attribute A and label C.]
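As a rough illustration of the quantities this slide refers to, the sketch below estimates Shannon entropy and mutual information from observed frequencies. The function names and the toy columns `A` and `C` are made up for this example and are not taken from the talk.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy H(X) in bits, estimated from observed frequencies."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# Toy data (hypothetical): attribute A determines label C completely.
A = [0, 0, 1, 1]
C = [0, 0, 1, 1]

print(entropy(C))                # 1.0 bit of uncertainty about the label
print(mutual_information(A, C))  # 1.0 bit: A removes all of it
```

Frequencies are used as plug-in probability estimates here; with few instances such estimates can be unreliable.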
5Interaction Information
$$I(A;B;C) = I(AB;C) - I(A;C) - I(B;C) = I(A;B \mid C) - I(A;B)$$
- Interaction information can be:
  - NEGATIVE: redundancy among attributes (negative int.)
  - NEGLIGIBLE: no interaction
  - POSITIVE: synergy between attributes (positive int.)
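The sign convention can be checked on toy data. The sketch below assumes the definition I(A;B;C) = I(AB;C) - I(A;C) - I(B;C) and plug-in frequency estimates. The function names and both data sets are hypothetical and chosen only to produce one synergistic and one redundant case.

```python
from collections import Counter
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more equally long columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def interaction_information(a, b, c):
    """I(A;B;C) = I(AB;C) - I(A;C) - I(B;C)."""
    ab = list(zip(a, b))  # treat the pair (A, B) as a single attribute
    return (mutual_information(ab, c)
            - mutual_information(a, c)
            - mutual_information(b, c))


# Synergy: C is the XOR of A and B, so neither attribute helps alone.
A = [0, 0, 1, 1]
B = [0, 1, 0, 1]
C_xor = [0, 1, 1, 0]
print(interaction_information(A, B, C_xor))   # 1.0 (positive)

# Redundancy: B is a copy of A, and both equal the label.
A2 = [0, 0, 1, 1]
B2 = [0, 0, 1, 1]
C_copy = [0, 0, 1, 1]
print(interaction_information(A2, B2, C_copy))  # -1.0 (negative)
```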
6History of Interaction Information
- (Partial) history of independent reinventions:
  - McGill '54 (Psychometrika): interaction information
  - Han '80 (Information and Control): multiple mutual information
  - Yeung '91 (IEEE Trans. Inf. Theory): mutual information
  - Grabisch & Roubens '99 (game theory): Banzhaf interaction index
  - Matsuda '00 (Physical Review E): higher-order mutual inf.
  - Brenner et al. '00 (Neural Computation): average synergy
  - Demšar '02 (machine learning): relative information gain
  - Bell '03 (NIPS02, ICA2003): co-information
  - Jakulin '03 (machine learning): interaction gain
7Utility of Interaction Information
- Visualization of interactions in data
  - Interaction graphs, dendrograms
- Construction of predictive models
  - Feature construction, combination, selection
- Case studies
  - Predicting the success of hip arthroplasty (HHS).
  - Predicting the contraception method used from demographic data (CMC).
- Predictive modeling helps us focus only on interactions that involve the outcome.
8Interaction Matrix for CMC Domain
An attribute's information gain
Illustrates the interaction information for all pairs of attributes. Red: positive, blue: negative, green: independent.
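A matrix of this kind could be computed along the following lines. This is a minimal sketch, not the code behind the slide: it assumes that the diagonal holds each attribute's information gain about the label and that off-diagonal cells hold I(A;B;C) = I(AB;C) - I(A;C) - I(B;C). The attribute names and data are invented (they are not the CMC data).

```python
from collections import Counter
from itertools import combinations, product
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (bits) of one or more equally long columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())


def info_gain(a, label):
    """I(A;C): how much attribute A tells us about the label."""
    return entropy(a) + entropy(label) - entropy(a, label)


def interaction(a, b, label):
    """I(A;B;C) = I(AB;C) - I(A;C) - I(B;C)."""
    return (info_gain(list(zip(a, b)), label)
            - info_gain(a, label) - info_gain(b, label))


def interaction_matrix(attributes, label):
    """Nested dict: diagonal = information gain, off-diagonal = interaction."""
    names = list(attributes)
    matrix = {name: {} for name in names}
    for name in names:
        matrix[name][name] = info_gain(attributes[name], label)
    for r, s in combinations(names, 2):
        matrix[r][s] = matrix[s][r] = interaction(attributes[r], attributes[s], label)
    return matrix


# Hypothetical toy domain: the label encodes both (a XOR b) and c.
rows = list(product([0, 1], repeat=3))
toy = {
    "a": [r[0] for r in rows],
    "b": [r[1] for r in rows],
    "c": [r[2] for r in rows],
    "c_copy": [r[2] for r in rows],
}
label = [2 * r[2] + (r[0] ^ r[1]) for r in rows]

m = interaction_matrix(toy, label)
for r in toy:
    print(r, [round(m[r][s], 3) for s in toy])
```

In this toy domain the pair (a, b) comes out positive, the pair (c, c_copy) negative, and the pair (a, c) at zero, which corresponds to the red, blue and green cells described on the slide.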
9Interaction Graphs
10Interaction Dendrogram
[Figure: interaction dendrogram. Legend: weakly interacting vs. strongly interacting attributes; cluster tightness ranges from loose to tight.]
11Interpreting the Dendrogram
12Application to the Harris hip score prediction (HHS)
13Attribute Structure for HHS
Bipolar endoprosthesis and short duration of operation significantly increase the chances of a good outcome.
Presence of neurological disease is a high risk factor only in the presence of other complications during operation.
[Figure: attribute structure for HHS, contrasting a structure discovered from data with one designed by the physician. Visible group labels: late complications, rehabilitation.]
14A Positive Interaction
Both attributes are useless alone, but useful together. They should be combined into a single feature (e.g. with a classification tree, a rule or a Cartesian product attribute). These two attributes are also correlated: correlation doesn't imply redundancy.
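One way to see why combining helps is to build the Cartesian product attribute and compare a simple majority-vote lookup on each feature. The sketch below uses made-up XOR data and hypothetical helper names, and it reports training accuracy only. It illustrates the idea and is not the procedure used in the study.

```python
from collections import Counter, defaultdict


def cartesian_product(a, b):
    """Combine two attributes into one whose values are (a, b) pairs."""
    return list(zip(a, b))


def lookup_accuracy(feature, label):
    """Training accuracy of predicting the majority label per feature value."""
    by_value = defaultdict(Counter)
    for value, y in zip(feature, label):
        by_value[value][y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(label)


# Hypothetical data: the label is the XOR of the two attributes.
a = [0, 0, 1, 1] * 2
b = [0, 1, 0, 1] * 2
label = [x ^ y for x, y in zip(a, b)]

print(lookup_accuracy(a, label))                        # 0.5
print(lookup_accuracy(b, label))                        # 0.5
print(lookup_accuracy(cartesian_product(a, b), label))  # 1.0
```

The joint attribute has as many values as there are value pairs, so each value is supported by fewer instances. This is the fragmentation risk that the summary slide mentions.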
15A Negative Interaction
(Figure annotation: very few instances!)
Once we know the wife's or the husband's education, the other attribute will not provide much new information. But it does provide some, if you know how to use it! Feature combination may work: feature selection throws data away.
16Prediction of HHS
- Brier score: probabilistic evaluation (K classes, N instances)
- Models
  - Tree-Augmented NBC: 0.227 ± 0.018
  - Naïve Bayesian classifier: 0.223 ± 0.014
  - General Bayesian net: 0.208 ± 0.006
  - Simple feature selection with NBC: 0.196 ± 0.012
  - FSS with background concepts: 0.196 ± 0.011
  - 10 top interactions → FSS: 0.189 ± 0.011
  - Tree-Augmented NB: 0.207 ± 0.017
  - Search for feature comb.: 0.185 ± 0.012
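The slide's Brier score formula did not survive in the text, so the sketch below uses one common multi-class form as an assumption: the mean over the N instances of the squared distance between the predicted distribution over the K classes and the one-hot encoding of the true class. Other normalizations exist (for example, halving the sum), so the absolute values need not match the scores listed above. Lower is better.

```python
def brier_score(probabilities, true_classes):
    """Multi-class Brier score (assumed form, see the note above).

    probabilities: N lists, each with K predicted class probabilities.
    true_classes:  N integer class indices in range(K).
    """
    total = 0.0
    for probs, true in zip(probabilities, true_classes):
        total += sum((p - (1.0 if k == true else 0.0)) ** 2
                     for k, p in enumerate(probs))
    return total / len(true_classes)


# Hypothetical predictions for a two-class problem.
confident_and_right = [[1.0, 0.0], [0.0, 1.0]]
uninformative = [[0.5, 0.5], [0.5, 0.5]]
truth = [0, 1]

print(brier_score(confident_and_right, truth))  # 0.0
print(brier_score(uninformative, truth))        # 0.5
```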
17The Best Model
These two (not very logical) combinations of features are only worth a 0.2 loss in performance.
The endoprosthesis and operation duration interaction provides little information that wouldn't already be provided by these attributes: it interacts negatively with the model.
18A Causal Diagram
[Figure: causal diagram over the nodes pulmonary disease, loss of consciousness, sitting ability, injury operation time, late luxation, HHS, luxation, diabetes, neurological disease, and hospitalization duration.]
19Orange
20Summary
- Visualization methods attempt to:
  - Summarize the relationships between attributes in data (interaction graph, interaction dendrogram, interaction matrix).
  - Assist the user in exploring the domain and constructing classification models (interactive interaction analysis).
- What to do with interactions:
  - Do make use of interactions! (rules, trees, dependency models)
    - Myopia: naïve Bayesian classifier, linear SVM, perceptron, feature selection, discretization.
  - Do not assume an interaction when there isn't one!
    - Fragmentation: classification trees, rules, general Bayesian networks, TAN.