Judea Pearl - PowerPoint PPT Presentation

Slides: 42
Provided by: CSD5153

1
THE MATHEMATICS OF CAUSE AND EFFECT
  • Judea Pearl
  • University of California
  • Los Angeles
  • (www.cs.ucla.edu/judea)

2
REFERENCES ON CAUSALITY
  • Home page (tutorials, lectures, slides, publications, and blog):
    www.cs.ucla.edu/judea/
  • Background information and comprehensive treatment: Causality
    (Cambridge University Press, 2000)
  • General introduction: http://bayes.cs.ucla.edu/IJCAI99/
  • Gentle introductions for empirical scientists:
    ftp://ftp.cs.ucla.edu/pub/stat_ser/r338.pdf
    ftp://ftp.cs.ucla.edu/pub/stat_ser/Test_pea-final.pdf
  • Direct and indirect effects:
    ftp://ftp.cs.ucla.edu/pub/stat_ser/R271.pdf

3
OUTLINE
  • Causality: antiquity to robotics
  • Modeling: statistical vs. causal
  • Causal models and identifiability
  • Inference to three types of claims:
  • Effects of potential interventions
  • Claims about attribution (responsibility)
  • Claims about direct and indirect effects

4
ANTIQUITY TO ROBOTICS
"I would rather discover one causal relation than be King of Persia."
Democritus (430-380 BC)
"Development of Western science is based on two great achievements: the
invention of the formal logical system (in Euclidean geometry) by the Greek
philosophers, and the discovery of the possibility to find out causal
relationships by systematic experiment (during the Renaissance)."
A. Einstein, April 23, 1953
5
THE BASIC PRINCIPLES
Causation = encoding of behavior under interventions
Interventions = surgeries on mechanisms
Mechanisms = stable functional relationships = equations + graphs
6
TRADITIONAL STATISTICAL INFERENCE PARADIGM
e.g., infer whether customers who bought product A would also buy product B.
Q = P(B | A)
7
FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES
Probability and statistics deal with static relations.
[Diagram: Data, joint distribution P, change, joint distribution P′, and
inference of Q(P′) (aspects of P′)]
What happens when P changes? e.g., infer whether customers who bought
product A would still buy A if we were to double the price.
8
FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES
What remains invariant when P changes, say, to satisfy P′(price = 2) = 1?
[Diagram: Data, joint distribution P, change, joint distribution P′, and
inference of Q(P′) (aspects of P′)]
Note: P′(v) ≠ P(v | price = 2). P does not tell us how it ought to change.
e.g., curing symptoms vs. curing diseases
e.g., analogy: mechanical deformation
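The gap between conditioning on price = 2 and setting price = 2 can be illustrated with a small sketch. Everything in it is an assumption made for illustration only: the background variable `u` (high or low demand), the pricing rule, and the buying rule are hypothetical and do not come from the slides.

```python
# Hypothetical toy market: u = 1 means high demand, u = 0 means low demand.
P_U = {0: 0.5, 1: 0.5}

def price(u):
    """Assumed pricing mechanism: sellers charge 2 when demand is high."""
    return 2 if u == 1 else 1

def buys(u, p):
    """Assumed buying mechanism: high-demand customers always buy,
    low-demand customers buy only at price 1."""
    return 1 if (u == 1 or p == 1) else 0

def p_buy_given_price(p_obs):
    """Conditioning: P(buy = 1 | price = p_obs) in the unchanged model."""
    num = sum(w * buys(u, price(u)) for u, w in P_U.items() if price(u) == p_obs)
    den = sum(w for u, w in P_U.items() if price(u) == p_obs)
    return num / den

def p_buy_do_price(p_set):
    """Intervening: the price mechanism is replaced by price = p_set."""
    return sum(w * buys(u, p_set) for u, w in P_U.items())

seeing = p_buy_given_price(2)   # only high-demand customers ever see price 2
doing = p_buy_do_price(2)       # everyone faces price 2
```

In this sketch the observed distribution alone gives `seeing`, while `doing` needs the mechanisms, which is the sense in which P does not tell us how it ought to change.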
9
FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES (CONT)

10
FROM STATISTICAL TO CAUSAL ANALYSIS: 1. THE DIFFERENCES (CONT)
  • Causal assumptions cannot be expressed in the
    mathematical language of standard statistics.

12
FROM STATISTICAL TO CAUSAL ANALYSIS: 2. THE MENTAL BARRIERS
  • Every exercise of causal analysis must rest on
    untested, judgmental causal assumptions.
  • Every exercise of causal analysis must invoke
    non-standard mathematical notation.

13
TWO PARADIGMS FOR CAUSAL INFERENCE
Observed: P(X, Y, Z, ...)
Conclusions needed: P(Y_x = y), P(X_y = x | Z = z), ...
How do we connect observables, X, Y, Z, ..., to counterfactuals Y_x, X_z,
Z_y, ...?
N-R (Neyman-Rubin) model: counterfactuals are primitives, new variables.
Super-distribution P(X, Y, ..., Y_x, X_z, ...). X, Y, Z constrain Y_x, Z_y, ...
Structural model: counterfactuals are derived quantities. Subscripts modify a
data-generating model.
14
THE STRUCTURAL MODEL PARADIGM
[Diagram: Data, data-generating model M, joint distribution, and inference of
Q(M) (aspects of M)]
M = oracle for computing answers to Q's.
e.g., infer whether customers who bought product A would still buy A if we
were to double the price.
15
FAMILIAR CAUSAL MODEL: ORACLE FOR MANIPULATION
[Figure: diagram with nodes X, Y, Z, labeled INPUT and OUTPUT]
16
STRUCTURAL CAUSAL MODELS
  • Definition: A structural causal model is a 4-tuple
    ⟨V, U, F, P(u)⟩, where
  • V = {V1, ..., Vn} are observable variables
  • U = {U1, ..., Um} are background variables
  • F = {f1, ..., fn} are functions determining V,
    vi = fi(v, u)
  • P(u) is a distribution over U
  • P(u) and F induce a distribution P(v) over
    observable variables
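A minimal sketch of the definition, assuming a hypothetical two-variable model: the background variables, the functions `f_x` and `f_y`, and all probabilities below are invented for illustration. It shows how P(u) and F together induce P(v).

```python
from itertools import product

# Hypothetical P(u) over U = (U1, U2), with U1 and U2 independent.
P_U = {(u1, u2): p1 * p2
       for (u1, p1), (u2, p2) in product([(0, 0.7), (1, 0.3)],
                                         [(0, 0.9), (1, 0.1)])}

def f_x(u):
    """x = f_X(u): X simply copies U1 (an assumed mechanism)."""
    return u[0]

def f_y(x, u):
    """y = f_Y(x, u): Y equals X unless U2 flips it (an assumed mechanism)."""
    return x ^ u[1]

def induced_distribution(p_u):
    """Push P(u) through F to obtain P(v) over the observables (X, Y)."""
    p_v = {}
    for u, p in p_u.items():
        x = f_x(u)
        y = f_y(x, u)
        p_v[(x, y)] = p_v.get((x, y), 0.0) + p
    return p_v

P_V = induced_distribution(P_U)
```

Enumerating U is feasible only because the sketch is tiny; with continuous or high-dimensional U one would sample instead.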

17
CAUSAL MODELS AND COUNTERFACTUALS
Definition: The sentence "Y would be y (in situation u), had X been x,"
denoted Y_x(u) = y, means: the solution for Y in a mutilated model M_x
(i.e., the equations for X replaced by X = x) with input U = u is equal
to y.
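The definition can be read as a procedure: replace the equation for X, keep everything else, and solve with the same u. The sketch below assumes a hypothetical model with two equations (X = U1, Y = X xor U2); the variable names and the `do` argument are illustrative, not notation from the slides.

```python
def solve(u, do=None):
    """Solve the model for input u. `do` maps a variable name to a forced
    value, i.e. it replaces that variable's equation (the mutilated model)."""
    do = do or {}
    v = {}
    v["X"] = do["X"] if "X" in do else u["U1"]
    v["Y"] = do["Y"] if "Y" in do else v["X"] ^ u["U2"]
    return v

def counterfactual_y(x, u):
    """Y_x(u): the solution for Y in the mutilated model M_x with input U = u."""
    return solve(u, do={"X": x})["Y"]

u = {"U1": 0, "U2": 1}
actual = solve(u)                        # what actually happens in situation u
y_had_x_been_1 = counterfactual_y(1, u)  # what Y would be, had X been 1
```

Here the actual world has X = 0 and Y = 1, while Y would have been 0 had X been 1, with the same background situation u in both cases.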

18
APPLICATIONS
  • Predicting effects of actions and policies
  • Learning causal relationships from assumptions and data
  • Troubleshooting physical systems and plans
  • Finding explanations for reported events
  • Generating verbal explanations
  • Understanding causal talk
  • Formulating theories of causal thinking

19
AXIOMS OF CAUSAL COUNTERFACTUALS
Y would be y, had X been x (in state U = u)
  • Definiteness
  • Uniqueness
  • Effectiveness
  • Composition
  • Reversibility

20
RULES OF CAUSAL CALCULUS
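  • Rule 1 (ignoring observations):
    P(y | do(x), z, w) = P(y | do(x), w)
  • Rule 2 (action/observation exchange):
    P(y | do(x), do(z), w) = P(y | do(x), z, w)
  • Rule 3 (ignoring actions):
    P(y | do(x), do(z), w) = P(y | do(x), w)
  • Each rule applies only when Y and Z are d-separated given X and W
    in a suitably modified version of the graph.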

21
DERIVATION IN CAUSAL CALCULUS
[Graph: Smoking → Tar → Cancer, with Genotype (unobserved) affecting both
Smoking and Cancer]
P(c | do(s)) = Σ_t P(c | do(s), t) P(t | do(s))              (Probability Axioms)
             = Σ_t P(c | do(s), do(t)) P(t | do(s))          (Rule 2)
             = Σ_t P(c | do(s), do(t)) P(t | s)              (Rule 2)
             = Σ_t P(c | do(t)) P(t | s)                     (Rule 3)
             = Σ_s′ Σ_t P(c | do(t), s′) P(s′ | do(t)) P(t | s)   (Probability Axioms)
             = Σ_s′ Σ_t P(c | t, s′) P(s′ | do(t)) P(t | s)  (Rule 2)
             = Σ_s′ Σ_t P(c | t, s′) P(s′) P(t | s)          (Rule 3)
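The last line of the derivation contains only observational quantities, so it can be checked numerically. The sketch below assumes a hypothetical parameterization of the Smoking, Tar, Cancer graph with a hidden Genotype; all numbers are invented. It computes the final expression from the observed joint P(s, t, c) and compares it with the interventional probability computed directly from the full model.

```python
from itertools import product

# Hypothetical parameters; G (genotype) is hidden from the analyst.
P_G = {0: 0.5, 1: 0.5}

def p_s(s, g):            # P(s | g)
    p1 = 0.8 if g else 0.2
    return p1 if s else 1 - p1

def p_t(t, s):            # P(t | s)
    p1 = 0.9 if s else 0.1
    return p1 if t else 1 - p1

def p_c(c, t, g):         # P(c | t, g)
    p1 = 0.1 + 0.5 * t + 0.3 * g
    return p1 if c else 1 - p1

# Observed joint P(s, t, c), with G summed out.
P = {(s, t, c): sum(P_G[g] * p_s(s, g) * p_t(t, s) * p_c(c, t, g) for g in P_G)
     for s, t, c in product((0, 1), repeat=3)}

def prob(**fixed):
    """Probability of the event given by keyword arguments s, t, c."""
    return sum(p for (s, t, c), p in P.items()
               if all({"s": s, "t": t, "c": c}[k] == v for k, v in fixed.items()))

def front_door(c, s):
    """Sum over s' and t of P(c | t, s') P(s') P(t | s), from observed data only."""
    total = 0.0
    for t in (0, 1):
        p_t_given_s = prob(s=s, t=t) / prob(s=s)
        inner = sum(prob(s=s2, t=t, c=c) / prob(s=s2, t=t) * prob(s=s2)
                    for s2 in (0, 1))
        total += inner * p_t_given_s
    return total

def truth(c, s):
    """P(c | do(s)) computed from the full model, including the hidden G."""
    return sum(P_G[g] * p_t(t, s) * p_c(c, t, g) for g in P_G for t in (0, 1))

naive = prob(s=1, c=1) / prob(s=1)   # P(c = 1 | s = 1), confounded by G
```

With these numbers the observational estimate agrees with the model's interventional value, while the plain conditional probability `naive` does not.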
22
THE BACK-DOOR CRITERION
Graphical test of identification: P(y | do(x)) is identifiable in G if there
is a set Z of variables such that Z d-separates X from Y in G_X (G with the
arrows emanating from X removed).
[Figure: two graphs over nodes Z1, ..., Z6, X, and Y, one of them marking the
set Z]
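When such a set Z exists, the standard adjustment formula P(y | do(x)) = Σ_z P(y | x, z) P(z) applies. The formula is not spelled out on the slide, and the single-confounder model below (Z → X, Z → Y, X → Y) and its numbers are assumptions chosen only to illustrate it.

```python
from itertools import product

# Hypothetical model with one observed confounder Z.
P_Z = {0: 0.6, 1: 0.4}

def p_x(x, z):            # P(x | z)
    p1 = 0.7 if z else 0.2
    return p1 if x else 1 - p1

def p_y(y, x, z):         # P(y | x, z)
    p1 = 0.1 + 0.4 * x + 0.3 * z
    return p1 if y else 1 - p1

# Observed joint P(z, x, y).
P = {(z, x, y): P_Z[z] * p_x(x, z) * p_y(y, x, z)
     for z, x, y in product((0, 1), repeat=3)}

def backdoor_adjust(y, x):
    """Sum over z of P(y | x, z) P(z), using only the observed joint."""
    total = 0.0
    for z in (0, 1):
        p_z = sum(p for (z_, _, _), p in P.items() if z_ == z)
        p_xz = sum(p for (z_, x_, _), p in P.items() if z_ == z and x_ == x)
        total += P[(z, x, y)] / p_xz * p_z
    return total

adjusted = backdoor_adjust(1, 1)
p_x1 = sum(p for (_, x_, _), p in P.items() if x_ == 1)
naive = sum(p for (_, x_, y_), p in P.items() if x_ == 1 and y_ == 1) / p_x1
```

In this sketch `adjusted` recovers the effect built into the model, whereas `naive` (plain conditioning on x) is inflated by the confounder.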
23
RECENT RESULTS ON IDENTIFICATION
  • do-calculus is complete
  • Complete graphical criterion for identifying
    causal effects (Shpitser and Pearl, 2006).
  • Complete graphical criterion for empirical
    testability of counterfactuals
    (Shpitser and Pearl, 2007).

24
DETERMINING THE CAUSES OF EFFECTS (The
Attribution Problem)
  • Your Honor! My client (Mr. A) died BECAUSE
    he used that drug.

25
DETERMINING THE CAUSES OF EFFECTS (The
Attribution Problem)
  • Your Honor! My client (Mr. A) died BECAUSE
    he used that drug.
  • Court to decide if it is MORE PROBABLE THAN
    NOT that A would be alive BUT FOR the drug!
  • PN = P(? | A is dead, took the drug) > 0.50

26
THE PROBLEM
  • Semantical problem: What is the meaning of PN(x, y)?
  • "Probability that event y would not have
    occurred if it were not for event x, given that x
    and y did in fact occur."

27
THE PROBLEM
  • Semantical problem: What is the meaning of PN(x, y)?
  • "Probability that event y would not have
    occurred if it were not for event x, given that x
    and y did in fact occur."
  • Answer: computable from M
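As a rough illustration of what "computable from M" can mean, the sketch below evaluates PN = P(Y_x′ = y′ | X = x, Y = y) by enumerating the background variables of a fully specified model. The model is hypothetical: X = U1, and y occurs if x occurred or if some other cause U2 is present.

```python
from itertools import product

# Hypothetical P(u): U1 drives X, U2 is an alternative cause of Y.
P_U = {(u1, u2): p1 * p2
       for (u1, p1), (u2, p2) in product([(0, 0.5), (1, 0.5)],
                                         [(0, 0.8), (1, 0.2)])}

def f_x(u):
    return u[0]

def f_y(x, u):
    """Assumed mechanism: Y occurs if X occurred or the other cause is present."""
    return int(x or u[1])

def probability_of_necessity():
    """P(Y would be 0 had X been 0 | X = 1 and Y = 1), by enumeration over u."""
    num = den = 0.0
    for u, p in P_U.items():
        x = f_x(u)
        if x == 1 and f_y(x, u) == 1:      # evidence: x and y did in fact occur
            den += p
            if f_y(0, u) == 0:             # Y in the mutilated model with X = 0
                num += p
    return num / den

PN = probability_of_necessity()
```

In this toy model Y would still have occurred without X only when the alternative cause is present, so PN equals the probability that it is absent.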

29
TYPICAL THEOREMS (Tian and Pearl, 2000)
  • Bounds given combined nonexperimental and
    experimental data
  • Identifiability under monotonicity (combined
    data): corrected excess-risk-ratio

30
CAN FREQUENCY DATA DECIDE LEGAL RESPONSIBILITY?
                    Experimental          Nonexperimental
                    do(x)     do(x′)      x         x′
Deaths (y)          16        14          2         28
Survivals (y′)      984       986         998       972
Total               1,000     1,000       1,000     1,000
  • Nonexperimental data: drug usage predicts longer
    life
  • Experimental data: drug has negligible effect on
    survival
  • Plaintiff: Mr. A is special.
  • He actually died
  • He used the drug by choice
  • Court to decide (given both data):
    Is it more probable than not that A would be
    alive but for the drug?

31
SOLUTION TO THE ATTRIBUTION PROBLEM
  • Combined data tell more than each study alone
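One way to see this is to plug the counts from the drug table into the bounds on PN as I recall them from the cited Tian and Pearl (2000) result: max{0, [P(y) − P(y_x′)] / P(x, y)} ≤ PN ≤ min{1, [P(y′_x′) − P(x′, y′)] / P(x, y)}. The formula is not reproduced in this transcript, so treat it as an assumption; the sketch also assumes the nonexperimental sample is the pooled 2,000 subjects, so that P(x) = 1/2.

```python
from fractions import Fraction

N = 1000  # subjects per column in the table

# Experimental data: P(y_x′) = P(death | do(no drug)).
p_y_do_xp = Fraction(14, N)
p_yp_do_xp = 1 - p_y_do_xp                # P(y′_x′)

# Nonexperimental data, pooled over both columns (2N subjects).
p_xy = Fraction(2, 2 * N)                 # P(x, y): took the drug and died
p_xp_yp = Fraction(972, 2 * N)            # P(x′, y′): no drug and survived
p_y = Fraction(2 + 28, 2 * N)             # P(y): died

lower = max(Fraction(0), (p_y - p_y_do_xp) / p_xy)
upper = min(Fraction(1), (p_yp_do_xp - p_xp_yp) / p_xy)
```

Under these assumptions both bounds come out to 1, so the combined data would settle the attribution question for Mr. A even though neither study does so on its own.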

32
EFFECT DECOMPOSITION
  • What is the semantics of direct and indirect
    effects?
  • What are their policy-making implications?
  • Can we estimate them from data? Experimental
    data?

33
WHY DECOMPOSE EFFECTS?
  • Direct (or indirect) effect may be more
    transportable.
  • Indirect effects may be prevented or controlled.
  • Direct (or indirect) effect may be forbidden

[Figure: two example graphs, one over Pill, Pregnancy, and Thrombosis, the
other over Gender, Qualification, and Hiring]
34
SEMANTICS BECOMES NONTRIVIAL IN NONLINEAR MODELS (even when the model is
completely specified)
[Graph: X → Z → Y and X → Y]
z = f(x, ε1), y = g(x, z, ε2)
Dependent on z?
Void of operational meaning?
35
THE OPERATIONAL MEANING OF DIRECT EFFECTS
[Graph: X → Z → Y and X → Y]
z = f(x, ε1), y = g(x, z, ε2)
Natural Direct Effect of X on Y: the expected change in Y per unit change of
X, when we keep Z constant at whatever value it attains before the change.
In linear models, NDE = Controlled Direct Effect.
36
THE OPERATIONAL MEANING OF INDIRECT EFFECTS
[Graph: X → Z → Y and X → Y]
z = f(x, ε1), y = g(x, z, ε2)
Natural Indirect Effect of X on Y: the expected change in Y when we keep X
constant, say at x0, and let Z change to whatever value it would have under a
unit change in X.
In linear models, NIE = TE - DE.
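The two definitions can be sketched for a fully specified model z = f(x, ε1), y = g(x, z, ε2). The functions, the noise distribution, and the helper name `effects` below are hypothetical; the natural effects are computed as NDE = E[Y(x1, Z(x0))] − E[Y(x0, Z(x0))] and NIE = E[Y(x0, Z(x1))] − E[Y(x0, Z(x0))], which is my reading of the verbal definitions above.

```python
def effects(f, g, noise, x0, x1):
    """Return (TE, NDE, NIE) for z = f(x, e1), y = g(x, z, e2).
    `noise` is a list of (e1, e2, weight) triples whose weights sum to 1."""
    def expect(x_for_y, x_for_z):
        # E[ g(x_for_y, f(x_for_z, e1), e2) ] over the noise distribution
        return sum(w * g(x_for_y, f(x_for_z, e1), e2) for e1, e2, w in noise)
    base = expect(x0, x0)
    te = expect(x1, x1) - base     # total effect
    nde = expect(x1, x0) - base    # Z held at the value it had under x0
    nie = expect(x0, x1) - base    # X held at x0, Z responds as if X were x1
    return te, nde, nie

# Hypothetical noise: e1, e2 independent and uniform on {-1, +1}.
noise = [(e1, e2, 0.25) for e1 in (-1, 1) for e2 in (-1, 1)]

# Linear model: the total effect splits into direct plus indirect.
linear = effects(lambda x, e1: 2 * x + e1,
                 lambda x, z, e2: 3 * x + 0.5 * z + e2,
                 noise, 0, 1)

# Nonlinear model with an X-Z interaction: the split no longer adds up.
interaction = effects(lambda x, e1: x + e1,
                      lambda x, z, e2: x * z + e2,
                      noise, 0, 1)
```

In the linear case NIE = TE − NDE holds; in the interaction case the total effect is nonzero even though both natural effects vanish, which is one reason the semantics needs care in nonlinear models.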
37
POLICY IMPLICATIONS OF INDIRECT EFFECTS
What is the indirect effect of X on Y?
The effect of Gender on Hiring if sex discrimination is eliminated.
[Graph: X, Z, and Y with function f; one link is marked IGNORE]
38
SEMANTICS AND IDENTIFICATION OF NESTED
COUNTERFACTUALS
Consider the quantity Q. Given ⟨M, P(u)⟩, Q is well defined: given u, Z_x(u)
is the solution for Z in M_x (call it z), and Y_xz(u) is the solution for Y
in M_xz. Can Q be estimated from data?
39
GENERAL PATH-SPECIFIC EFFECTS (Def.)
[Figure: two graphs over X, Z, W, and Y]
Form a new model specific to the active subgraph g.
Definition: g-specific effect.
Nonidentifiable even in Markovian models.
40
EFFECT DECOMPOSITION SUMMARY
  • Graphical conditions for estimability from
    experimental / nonexperimental data.
  • Graphical conditions hold in Markovian models.
  • Useful in answering new types of policy
    questions involving mechanism blocking instead
    of variable fixing.

41
CONCLUSIONS
  • Structural-model semantics, enriched with logic
    and graphs, provides:
  • Complete formal basis for causal reasoning
  • Powerful and friendly causal calculus
  • Lays the foundations for asking more difficult
    questions: What is an action? What is free
    will? Should robots be programmed to have this
    illusion?