Attribute Interactions in Medical Data Analysis - PowerPoint PPT Presentation

Slides: 21
Provided by: AleksJ5
Transcript and Presenter's Notes

Title: Attribute Interactions in Medical Data Analysis


1
Attribute Interactions in Medical Data Analysis
  • A. Jakulin (1), I. Bratko (1,2), D. Smrke (3), J. Demšar (1),
    B. Zupan (1,2,4)
  • (1) University of Ljubljana, Slovenia.
  • (2) Jožef Stefan Institute, Ljubljana, Slovenia.
  • (3) Dept. of Traumatology, University Clinical Center,
    Ljubljana, Slovenia.
  • (4) Dept. of Human and Mol. Genetics, Baylor College of
    Medicine, USA.

2
Overview
  • Interactions
      • Correlation can be generalized to more than 2
        attributes, to capture interactions: higher-order
        regularities.
  • Information theory
      • A non-parametric approach for measuring
        association and uncertainty.
  • Applications
      • Automatic selection of informative visualizations:
        uncover previously unseen structure in medical data.
      • Automatic constructive induction of new features.
  • Results
      • Better predictive models for hip arthroplasty.
      • Better understanding of the data.

3
Attribute Dependencies
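[Diagram: two attributes (features), A and B, each linked to the label C (outcome, diagnosis). The link between A and C represents the importance of attribute A; the link between B and C represents the importance of attribute B.]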
4
Shannon's Entropy
[Diagram: entropy of attribute A and of label C.]
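The slide's own formulas did not survive in this transcript. As a hedged illustration only, the sketch below uses the standard definitions of Shannon entropy, H(X) = -sum p(x) log2 p(x), and of mutual information, I(A;C) = H(A) + H(C) - H(A,C), both estimated from observed frequencies. The function names and the toy data are assumptions made for this example and do not come from the presentation.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy H(X) in bits, estimated from a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Toy data (made up for illustration): attribute A and label C.
A = ["yes", "yes", "no", "no"]
C = ["good", "good", "bad", "bad"]

print(entropy(C))                # uncertainty about the label, in bits
print(mutual_information(A, C))  # how much knowing A reduces that uncertainty
```

Here the mutual information between an attribute and the label plays the role of the "importance" of that attribute.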
5
Interaction Information
I(A;B;C) = I(AB;C) - I(B;C) - I(A;C) = I(A;B|C) - I(A;B)
  • Interaction information can be:
      • NEGATIVE: redundancy among attributes (negative
        interaction)
      • NEGLIGIBLE: no interaction
      • POSITIVE: synergy between attributes (positive
        interaction)
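A minimal sketch of how the sign of interaction information could be checked on discrete data, assuming the decomposition I(A;B;C) = I(AB;C) - I(A;C) - I(B;C) expanded into joint entropies estimated from frequencies. The helper names and the two toy datasets are hypothetical and are not taken from the presentation.

```python
from collections import Counter
from math import log2

def H(*columns):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def interaction_information(a, b, c):
    """I(A;B;C) = I(AB;C) - I(A;C) - I(B;C), written out in entropies."""
    return (H(a, b) + H(a, c) + H(b, c)
            - H(a) - H(b) - H(c) - H(a, b, c))

# Synergy: C is the XOR of A and B, so neither attribute helps alone.
A = [0, 0, 1, 1]
B = [0, 1, 0, 1]
C_xor = [a ^ b for a, b in zip(A, B)]
print(interaction_information(A, B, C_xor))  # positive

# Redundancy: B duplicates A, and C is determined by A alone.
A_copy = list(A)
C_same = list(A)
print(interaction_information(A, A_copy, C_same))  # negative
```

On the XOR data the value is one bit of synergy; on the duplicated attribute it is one bit of redundancy.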

6
History of Interaction Information
  • (Partial) history of independent reinventions:
  • McGill '54 (Psychometrika) - interaction information
  • Han '80 (Information & Control) - multiple mutual
    information
  • Yeung '91 (IEEE Trans. Inf. Theory) - mutual information
  • Grabisch & Roubens '99 (game theory) - Banzhaf
    interaction index
  • Matsuda '00 (Physical Review E) - higher-order mutual
    inf.
  • Brenner et al. '00 (Neural Computation) - average
    synergy
  • Demšar '02 (machine learning) - relative information
    gain
  • Bell '03 (NIPS02, ICA2003) - co-information
  • Jakulin '03 (machine learning) - interaction gain

7
Utility of Interaction Information
  • Visualization of interactions in data
  • Interaction graphs, dendrograms
  • Construction of predictive models
  • Feature construction, combination, selection
  • Case studies
  • Predicting the success of hip arthroplasty (HHS).
  • Predicting the contraception method used from
    demographic data (CMC).
  • Predictive modeling helps us focus only on
    interactions that involve the outcome.

8
Interaction Matrix for CMC Domain
An attribute's information gain
Illustrates the interaction information for all
pairs of attributes. Red: positive, blue:
negative, green: independent.
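One way such a matrix could be computed is sketched below. It assumes that the diagonal holds each attribute's information gain with respect to the label and that every off-diagonal cell holds the interaction information of an attribute pair with the label; that layout is an assumption, since the figure itself is not in the transcript. The attribute names and data are invented for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2

def H(*columns):
    """Joint Shannon entropy (bits) of one or more discrete columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def interaction_matrix(attributes, label):
    """Map (name, name) pairs to information gain (diagonal) or
    interaction information with the label (off-diagonal)."""
    matrix = {}
    for name, col in attributes.items():
        matrix[(name, name)] = H(col) + H(label) - H(col, label)
    for x, y in combinations(attributes, 2):
        a, b = attributes[x], attributes[y]
        value = (H(a, b) + H(a, label) + H(b, label)
                 - H(a) - H(b) - H(label) - H(a, b, label))
        matrix[(x, y)] = matrix[(y, x)] = value
    return matrix

# Hypothetical toy domain: the label simply repeats attribute "A".
attributes = {
    "A": [0, 0, 1, 1],
    "A_copy": [0, 0, 1, 1],
    "B": [0, 1, 0, 1],
}
label = [0, 0, 1, 1]

m = interaction_matrix(attributes, label)
for key in sorted(m):
    print(key, round(m[key], 3))
```

A plotting layer would then colour positive cells red, negative cells blue, and near-zero cells green, following the legend above.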
9
Interaction Graphs
10
Interaction Dendrogram
[Dendrogram annotations: attributes range from weakly interacting to strongly interacting; cluster tightness ranges from loose to tight.]
11
Interpreting the Dendrogram
12
Application to the Harris hip score prediction
(HHS)
13
Attribute Structure for HHS
Bipolar endoprosthesis and short duration of
operation significantly increase the chances of
a good outcome.
Presence of neurological disease is a high risk
factor only in the presence of other
complications during operation.
[Figure labels: late complications; rehabilitation; discovered from data; designed by the physician.]
14
A Positive Interaction
Both attributes are useless alone, but useful
together. They should be combined into a single
feature (e.g. with a classification tree, a rule
or a Cartesian product attribute). These two
attributes are also correlated: correlation
doesn't imply redundancy.
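The Cartesian product attribute mentioned above can be sketched on toy data. This example is an illustration under stated assumptions: it uses an XOR label rather than the HHS data, the two toy attributes are uncorrelated (unlike the pair on the slide), and all names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def information_gain(attribute, label):
    """I(attribute; label) = H(attribute) + H(label) - H(attribute, label)."""
    return (entropy(attribute) + entropy(label)
            - entropy(list(zip(attribute, label))))

# Toy data: the label is the XOR of two binary attributes.
A = [0, 0, 1, 1]
B = [0, 1, 0, 1]
C = [a ^ b for a, b in zip(A, B)]

# Cartesian product attribute: one new value per (A, B) combination.
AB = list(zip(A, B))

print(information_gain(A, C))   # A alone
print(information_gain(B, C))   # B alone
print(information_gain(AB, C))  # the combined feature
```

Each attribute alone has zero information gain here, while the combined feature determines the label completely.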
15
A Negative Interaction
very few instances!
Once we know the wife's or the husband's
education, the other attribute will not provide
much new information. But they do provide some,
if you know how to use it! Feature combination
may work; feature selection throws data away.
16
Prediction of HHS
  • Brier score - probabilistic evaluation (K
    classes, N instances)
  • Models
      • Tree-Augmented NBC                  0.227 ± 0.018
      • Naïve Bayesian classifier           0.223 ± 0.014
      • General Bayesian net                0.208 ± 0.006
      • Simple feature selection with NBC   0.196 ± 0.012
      • FSS with background concepts        0.196 ± 0.011
      • 10 top interactions → FSS           0.189 ± 0.011
      • Tree-Augmented NB                   0.207 ± 0.017
      • Search for feature comb.            0.185 ± 0.012
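The Brier score formula on this slide was lost in the transcript. The sketch below assumes the common multi-class form, the mean over N instances of the squared differences between the predicted probabilities and the 0/1 indicator of the true class, summed over K classes. The exact normalisation used by the authors is not recoverable from the transcript and may differ; the function name and the example predictions are made up.

```python
def brier_score(probabilities, true_classes):
    """Multi-class Brier score: lower is better, 0 is a perfect forecast.

    probabilities: one list of K class probabilities per instance
    true_classes:  the index (0..K-1) of the true class per instance
    """
    n = len(true_classes)
    total = 0.0
    for probs, truth in zip(probabilities, true_classes):
        for k, p in enumerate(probs):
            target = 1.0 if k == truth else 0.0
            total += (p - target) ** 2
    return total / n

# Hypothetical predictions for three instances and two classes.
predicted = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
actual = [0, 1, 0]
print(brier_score(predicted, actual))
```

Because the score penalises over-confident wrong probabilities, it rewards models that are well calibrated and not only those that rank the classes correctly.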

17
The Best Model
These two (not very logical) combinations of
features are only worth 0.2 loss in performance.
The endoprosthesis and operation duration
interaction provides little information that
wouldn't already be provided by these attributes:
it interacts negatively with the model.
18
A Causal Diagram
[Causal diagram over the nodes: pulmonary disease, loss of consciousness, sitting ability, injury operation time, late luxation, HHS, luxation, diabetes, neurological disease, hospitalization duration.]
19
Orange
20
Summary
  • Visualization methods attempt to:
      • Summarize the relationships between attributes in
        data (interaction graph, interaction dendrogram,
        interaction matrix).
      • Assist the user in exploring the domain and
        constructing classification models (interactive
        interaction analysis).
  • What to do with interactions:
      • Do make use of interactions! (rules, trees,
        dependency models)
          • Myopia: naïve Bayesian classifier, linear SVM,
            perceptron, feature selection, discretization.
      • Do not assume an interaction when there isn't
        one!
          • Fragmentation: classification trees, rules,
            general Bayesian networks, TAN.