Title: Results of the Causality Challenge


1
Results of the Causality Challenge
  • Isabelle Guyon, Clopinet
  • Constantin Aliferis and Alexander Statnikov,
    Vanderbilt Univ.
  • André Elisseeff and Jean-Philippe Pellet, IBM
    Zürich
  • Gregory F. Cooper, University of Pittsburgh
  • Peter Spirtes, Carnegie Mellon

2
Causal discovery
What affects
  • Which actions will have beneficial effects?

3
Systemic causality
4
Feature Selection
Predict Y from features X1, X2, ... Select the most
predictive features.
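A minimal sketch of the feature selection setting on this slide, assuming a simple filter that ranks features by absolute Pearson correlation with Y. The toy data, the helper names `pearson` and `rank_features`, and the choice of correlation as the ranking criterion are illustrative assumptions, not the methods used in the challenge.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, y):
    """Return feature names sorted from most to least correlated with y."""
    scores = {name: abs(pearson(col, y)) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: Y depends strongly on X1, weakly on X2, not on X3.
rng = random.Random(0)
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * a + 0.5 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

ranking = rank_features({"X1": x1, "X2": x2, "X3": x3}, y)
print(ranking)
```

A filter like this only measures predictive association; it says nothing about whether a selected feature is a cause or an effect of Y.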
5
Causation
Predict the consequences of actions. Under
manipulations by an external agent, some
features are no longer predictive.
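The point of this slide can be illustrated with a small simulation, assuming a hypothetical three-variable chain cause -> target -> effect. In observational data both the cause and the effect correlate with the target; when an external agent sets the effect independently of the target, only the cause stays predictive. The model, noise levels, and function names are assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlations_with_target(n, manipulated, rng):
    """Sample the chain cause -> target -> effect and correlate each feature with the target.

    If `manipulated` is True, the effect is set by an external agent
    (drawn at random) instead of being generated from the target.
    """
    cause, effect, target = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        y = c + rng.gauss(0, 0.5)
        if manipulated:
            e = rng.gauss(0, 1)        # value imposed from outside, ignores y
        else:
            e = y + rng.gauss(0, 0.5)  # natural mechanism: effect of y
        cause.append(c)
        effect.append(e)
        target.append(y)
    return {"cause": pearson(cause, target), "effect": pearson(effect, target)}


rng = random.Random(1)
observational = correlations_with_target(2000, manipulated=False, rng=rng)
manipulated = correlations_with_target(2000, manipulated=True, rng=rng)
print("observational:", observational)
print("manipulated:  ", manipulated)
```

A predictor trained on the observational data would lean on the effect feature, which is exactly the feature that stops carrying information once it is manipulated.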
6
Challenge Design
7
Available data
  • A lot of observational data.
  • Correlation ≠ Causality!
  • Experiments are often needed, but:
    • Costly
    • Unethical
    • Infeasible
  • This challenge: semi-artificial data
    • Re-simulated data
    • Real data with artificial probes

8
Four tasks
9
On-line feed-back
10
Difficulties
  • Violated assumptions
    • Causal sufficiency
    • Markov equivalence
    • Faithfulness
    • Linearity
    • Gaussianity
  • Overfitting (statistical complexity)
    • Finite sample size
  • Algorithm efficiency (computational complexity)
    • Thousands of variables
    • Tens of thousands of examples

11
Evaluation
  • Fulfillment of an objective
    • Prediction of a target variable
    • Predictions under manipulations
  • Causal relationships
    • Existence
    • Strength
    • Degree

12
Setting
  • Predict a target variable (on training and test
    data).
  • Return the set of features used.
  • Flexibility:
    • Sorted or unsorted list of features
    • Single prediction or table of results
  • Complete entry = xxx0, xxx1, xxx2 results (for at
    least one dataset).

13
Metrics
  • Results ranked according to the test set
    target prediction performance: Tscore.
  • We also assess the feature set directly with an
    Fscore, not used for ranking.
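As a rough illustration of a test-set prediction score, the sketch below computes the area under the ROC curve (AUC) from predicted scores and binary labels. Treating Tscore as an AUC-like measure is an assumption here (the deck later shows an "AUC distribution" slide); the function name and the toy labels and scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as one half.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and predicted scores.
test_labels = [1, 1, 0, 0, 1, 0]
test_scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]

tscore = auc(test_scores, test_labels)
print(round(tscore, 3))
```

The pairwise form is quadratic in the number of examples; it is chosen for readability, not speed.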

14
Toy Examples
15
Causality assessment with manipulations
16
Causality assessment with manipulations
LUCAS1 manipulated
17
Causality assessment with manipulations
LUCAS2 manipulated
18
Goal driven causality
  • We define:
    • V = variables of interest
      (e.g. Markov blanket (MB), direct causes, ...)
  • We assess causal relevance: Fscore = f(V,S).

19
Causality assessment without manipulation?
20
Using artificial probes
LUCAP0 natural
[Figure: causal graph with nodes Anxiety, Peer Pressure, Born an Even Day,
Smoking, Genetics, Yellow Fingers, Lung Cancer, Attention Disorder, Allergy,
Coughing, Fatigue, Car Accident]
21
Using artificial probes
LUCAP12 manipulated
22
Scoring using probes
  • What we can compute (Fscore):
    • Negative class = probes (here, all non-causes,
      all manipulated).
    • Positive class = other variables (may include
      causes and non-causes).
  • What we want (Rscore):
    • Positive class = causes.
    • Negative class = non-causes.
  • What we get (asymptotically):
    • Fscore = (NTruePos/NReal) Rscore + 0.5
      (NTrueNeg/NReal)
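A small check of the asymptotic relation, read here as Fscore = (NTruePos/NReal) Rscore + 0.5 (NTrueNeg/NReal). The sketch assumes both scores are AUC-like ranking measures with ties counted as one half, and that probes receive the same distribution of ranking scores as real non-causes. The toy scores and variable names are hypothetical and were chosen so that this assumption holds exactly.

```python
def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical feature-relevance scores produced by some ranking method.
cause_scores = [3.0, 1.5]      # real variables that are true causes
noncause_scores = [2.0, 1.0]   # real variables that are not causes
probe_scores = [2.0, 1.0]      # artificial probes, scored like real non-causes

# What we want: causes versus all non-causes (real non-causes and probes).
rscore = auc(cause_scores, noncause_scores + probe_scores)

# What we can compute: real variables versus probes.
fscore = auc(cause_scores + noncause_scores, probe_scores)

n_true_pos = len(cause_scores)
n_true_neg = len(noncause_scores)
n_real = n_true_pos + n_true_neg

# Fscore predicted from Rscore by the relation stated above.
predicted = (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)

print(rscore, fscore, predicted)
```

The intuition: a real variable is a true cause with probability NTruePos/NReal, in which case it outranks a probe with probability Rscore; otherwise it is a non-cause and outranks a probe only half of the time. On finite data with real rankings the two sides agree only approximately.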

23
Results
24
Challenge statistics
  • Start: December 15, 2007.
  • End: April 30, 2008.
  • Total duration: 20 weeks.
  • Last (complete) entry ranked

Number of ranked entrants
Number of ranked submissions
25
Learning curves
26
AUC distribution
27
REGED
28
SIDO
29
CINA
30
MARTI
31
Pairwise comparisons
32
Top ranking methods
  • According to the rules of the challenge:
    • Yin Wen Chang: SVM => best prediction accuracy on
      REGED and CINA. Prize: 400, donated by Microsoft.
    • Gavin Cawley: Causal Explorer + linear ridge
      regression ensembles => best prediction accuracy
      on SIDO and MARTI. Prize: 400, donated by
      Microsoft.
  • According to pairwise comparisons:
    • Jianxin Yin and Prof. Zhi Geng's group: Partial
      Orientation and Local Structural Learning => best
      on Pareto front, new original causal discovery
      algorithm. Prize: free WCCI 2008 registration.

33
Pairwise comparisons
[Figure panels: REGED, SIDO, CINA, MARTI]
34
Conclusion
  • We have found good correlation between causation
    and prediction under manipulations.
  • Several algorithms have demonstrated
    effectiveness at discovering causal
    relationships.
  • We still need to investigate what makes them fail
    in some cases.
  • We need to capitalize on the power of classical
    feature selection methods.