Title: Results of the Causality Challenge
1. Results of the Causality Challenge
- Isabelle Guyon, Clopinet
- Constantin Aliferis and Alexander Statnikov, Vanderbilt University
- André Elisseeff and Jean-Philippe Pellet, IBM Zürich
- Gregory F. Cooper, University of Pittsburgh
- Peter Spirtes, Carnegie Mellon
2. Causal discovery
- What affects
- Which actions will have beneficial effects?
3. Systemic causality
4. Feature Selection
[Figure: features X and target Y]
- Predict Y from features X1, X2, ...
- Select the most predictive features (a minimal sketch follows below).
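A minimal sketch of this step, assuming a scikit-learn-style workflow on synthetic data (nothing here is the challenge's actual pipeline):

```python
# Rank features by univariate relevance to Y and keep the most predictive ones.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# Keep the 5 features most associated with the target.
selector = SelectKBest(mutual_info_classif, k=5).fit(X, y)
X_sel = selector.transform(X)

# Predict Y from the selected features.
score = cross_val_score(LogisticRegression(max_iter=1000), X_sel, y, cv=5).mean()
print("selected features:", np.flatnonzero(selector.get_support()))
print("cross-validated accuracy: %.3f" % score)
```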
5. Causation
- Predict the consequences of actions.
- Under manipulations by an external agent, some features are no longer predictive (toy simulation below).
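A toy simulation of this point (our own illustration, not challenge data): a noisy effect of Y predicts Y observationally, but stops being predictive once an external agent sets it independently of Y.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
y = rng.integers(0, 2, n)            # target (e.g. Lung Cancer)
effect = y ^ (rng.random(n) < 0.1)   # noisy consequence of Y

print("observational corr:", np.corrcoef(effect, y)[0, 1])  # high (~0.8)

# Manipulation: the agent forces 'effect' at random, cutting the Y -> effect arrow.
effect_manip = rng.integers(0, 2, n)
print("manipulated corr  :", np.corrcoef(effect_manip, y)[0, 1])  # ~0
```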
6. Challenge Design
7. Available data
- A lot of observational data.
- Correlation ≠ causality!
- Experiments are often needed, but they may be
  - Costly
  - Unethical
  - Infeasible
- This challenge: semi-artificial data
  - Re-simulated data
  - Real data with artificial probes
8. Four tasks (REGED, SIDO, CINA, MARTI)
9. On-line feedback
10. Difficulties
- Violated assumptions
  - Causal sufficiency
  - Markov equivalence
  - Faithfulness
  - Linearity
  - Gaussianity
- Overfitting (statistical complexity)
  - Finite sample size
- Algorithm efficiency (computational complexity)
  - Thousands of variables
  - Tens of thousands of examples
11. Evaluation
- Fulfillment of an objective
  - Prediction of a target variable
  - Predictions under manipulations
- Causal relationships
  - Existence
  - Strength
  - Degree
12. Setting
- Predict a target variable (on training and test data).
- Return the set of features used.
- Flexibility:
  - Sorted or unsorted list of features
  - Single prediction or table of results
- A complete entry includes results on all three dataset versions xxx0, xxx1, xxx2 (for at least one dataset).
13. Metrics
- Results are ranked according to the test-set target prediction performance (Tscore).
- We also assess the feature set directly with an Fscore, not used for ranking (sketch below).
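A sketch of the Tscore under the assumption, consistent with the AUC plots later in the deck, that prediction performance is measured by the area under the ROC curve:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def tscore(y_true, y_pred):
    """Test-set prediction performance: area under the ROC curve."""
    return roc_auc_score(y_true, y_pred)

# Tiny usage example with made-up predictions.
y_true = np.array([0, 0, 1, 1, 1])
y_pred = np.array([0.1, 0.4, 0.35, 0.8, 0.9])
print("Tscore (AUC): %.3f" % tscore(y_true, y_pred))
```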
14. Toy Examples
15. Causality assessment with manipulations
16. Causality assessment with manipulations
[Figure: LUCAS1 manipulated]
17. Causality assessment with manipulations
[Figure: LUCAS2 manipulated]
18. Goal-driven causality
- We define V = variables of interest (e.g., the Markov blanket, direct causes, ...).
- We assess causal relevance with a score Fscore = f(V, S), where S is the set of features returned (see the sketch below).
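One hypothetical choice of f, for illustration only: the AUC of the submitted feature ranking against membership in V. The challenge's actual Fscore uses the probe-based variant on slide 22; the variable names here are made up from the LUCAS example.

```python
from sklearn.metrics import roc_auc_score

def fscore(V, ranked_features, all_features):
    """AUC of a ranked feature list w.r.t. the set V of variables of interest."""
    # Features ranked earlier get a higher score; unranked features get 0.
    rank = {f: len(ranked_features) - i for i, f in enumerate(ranked_features)}
    scores = [rank.get(f, 0) for f in all_features]
    truth = [1 if f in V else 0 for f in all_features]
    return roc_auc_score(truth, scores)

V = {"Smoking", "Genetics"}  # e.g. direct causes of Lung Cancer
ranked = ["Smoking", "Yellow Fingers", "Genetics"]
everything = ["Smoking", "Genetics", "Yellow Fingers", "Allergy", "Anxiety"]
print("Fscore: %.2f" % fscore(V, ranked, everything))
```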
19. Causality assessment without manipulation?
20. Using artificial probes
[Figure: LUCAP0 (natural distribution) causal graph, with variables Anxiety, Peer Pressure, Born an Even Day, Smoking, Genetics, Yellow Fingers, Lung Cancer, Attention Disorder, Allergy, Coughing, Fatigue, Car Accident]
21. Using artificial probes
[Figure: LUCAP1 and LUCAP2, manipulated]
22. Scoring using probes
- What we can compute (Fscore):
  - Negative class = probes (here, all non-causes, all manipulated).
  - Positive class = other variables (may include causes and non-causes).
- What we want (Rscore):
  - Positive class = causes.
  - Negative class = non-causes.
- What we get (asymptotically):
  - Fscore = (NTruePos/NReal) × Rscore + 0.5 × (NTrueNeg/NReal)
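A small Monte-Carlo check of this asymptotic relation (our own sketch: feature scores are drawn from Gaussians, and probes are assumed to be distributed like real non-causes):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n_causes, n_noncauses, n_probes = 200, 800, 1000

s_causes = rng.normal(1.0, 1.0, n_causes)        # causes tend to be ranked higher
s_noncauses = rng.normal(0.0, 1.0, n_noncauses)  # real non-causes
s_probes = rng.normal(0.0, 1.0, n_probes)        # probes: known, manipulated non-causes

# Fscore: real variables (positive) vs. probes (negative).
s_real = np.concatenate([s_causes, s_noncauses])
fscore = roc_auc_score([1] * len(s_real) + [0] * n_probes,
                       np.concatenate([s_real, s_probes]))

# Rscore: causes (positive) vs. non-causes (negative).
rscore = roc_auc_score([1] * n_causes + [0] * n_noncauses,
                       np.concatenate([s_causes, s_noncauses]))

n_real = n_causes + n_noncauses
print("Fscore        : %.3f" % fscore)
print("decomposition : %.3f"
      % ((n_causes / n_real) * rscore + 0.5 * (n_noncauses / n_real)))
```

The directly computed Fscore and the decomposition above agree to within sampling noise.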
23. Results
24. Challenge statistics
- Start: December 15, 2007.
- End: April 30, 2008.
- Total duration: 20 weeks.
[Table: last (complete) entry ranked; number of ranked entrants; number of ranked submissions]
25. Learning curves
26. AUC distribution
27. REGED
28. SIDO
29. CINA
30. MARTI
31. Pairwise comparisons
32. Top ranking methods
- According to the rules of the challenge:
  - Yin-Wen Chang: SVM → best prediction accuracy on REGED and CINA. Prize: $400, donated by Microsoft.
  - Gavin Cawley: Causal Explorer + linear ridge regression ensembles → best prediction accuracy on SIDO and MARTI. Prize: $400, donated by Microsoft.
- According to pairwise comparisons:
  - Jianxin Yin and Prof. Zhi Geng's group: Partial Orientation and Local Structural Learning → best on the Pareto front, a new original causal discovery algorithm. Prize: free WCCI 2008 registration.
33. Pairwise comparisons
[Figure: pairwise comparison results on REGED, SIDO, CINA, MARTI]
34. Conclusion
- We found a good correlation between causation and prediction under manipulations.
- Several algorithms demonstrated effectiveness in discovering causal relationships.
- We still need to investigate what makes them fail in some cases.
- We need to capitalize on the power of classical feature selection methods.