Title: Causality
1. Causality
- Computational Systems Biology Lab
- Arizona State University
- Michael Verdicchio
- 26 March 2008
- With some slides and slide content from
- Judea Pearl, Chitta Baral, Xin Zhang
2. Talking Points
- Conditional Independence
- Definitions
- Interpretation
- Notation
- d-Separation
- Graphical Probabilistic Models
- Causal Graphs
- Causal Modeling Framework
3. Conditional Independence
4. Conditional Independence
5. Conditional Independence (Notation)
6. Talking Points
- Conditional Independence
- d-Separation
- Three rules
- Probabilistic implications
- Graphical Probabilistic Models
- Causal Graphs
- Causal Modeling Framework
7. d-Separation
- d-separation is a criterion for deciding, from a given causal graph, whether a set X of variables is independent of another set Y, given a third set Z
8. d-Separation Rule 1
- Rule 1 (Unconditional Separation): two nodes are d-connected if there is an unblocked path between them
- Path: a sequence of edges, ignoring directionality
- Unblocked: the path contains no head-to-head arrows (colliders)
9. d-Separation Rule 1
- One collider, at t
- x-r-s-t is unblocked, so x and t are d-connected
- t-u-v-y is unblocked, so t and y are d-connected
- So are all the pairs x-r, x-s, r-s, t-u, etc.
- x and y are not d-connected, since we can't trace a path between them without hitting the collider; hence they are d-separated
- So too are x-u, x-v, r-u, etc.
10. d-Separation Rule 2
- Rule 2: x and y are d-connected, conditioned on a set Z, if there is a collider-free path between x and y that traverses no member of Z
- If no such path exists, we say that x and y are d-separated by Z
- We also say then that every path between x and y is "blocked" by Z
11. d-Separation Rule 2
- Let Z be the set {r, v}
- By Rule 2, x and y are d-separated by Z, along with x-s, u-y, s-u, etc.
- The path x-r-s is blocked by Z, along with u-v-y and s-t-u
- Only s-t and u-t remain d-connected conditioned on Z
- The path s-t-u is also blocked, since t is a collider; that path was already blocked by Rule 1
12. d-Separation Rule 3
- Rule 3: if a collider is a member of the conditioning set Z, or has a descendant in Z, then it no longer blocks any path that traces this collider
- Called "explaining away": observing the common effect of two independent causes lets one cause explain away the other
- dead battery → car won't start ← no gas
13. d-Separation Rule 3
- Let Z be the set {r, p}
- By Rule 3, s and y are d-connected given Z
- The collider at t has a descendant (p) in Z
- This unblocks the path s-t-u-v-y
- x and u are still d-separated by Z
- The linkage at t is unblocked
- But the one at r is blocked by Rule 2 (since r is in Z)
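The "explaining away" behavior of Rule 3 can be checked numerically. Below is a minimal sketch with an assumed toy model (not from the slides): two independent fair coins x and y feed a collider t defined as their logical OR. Marginally x and y are independent, but once we condition on the collider t, learning y changes our belief about x.

```python
from itertools import product

# Collider x -> t <- y: x and y are independent fair coins,
# and t is the deterministic OR of its two causes.
def joint(x, y, t):
    p = 0.5 * 0.5                       # P(x) * P(y): independent causes
    return p if t == (x or y) else 0.0  # t = OR(x, y) with certainty

def prob(pred):
    """Sum the joint over all assignments satisfying a predicate."""
    return sum(joint(x, y, t)
               for x, y, t in product([0, 1], repeat=3) if pred(x, y, t))

# Unconditionally, x and y are independent (Rule 1: t blocks the path).
p_xy = prob(lambda x, y, t: x == 1 and y == 1)
p_x = prob(lambda x, y, t: x == 1)
p_y = prob(lambda x, y, t: y == 1)
print(p_xy == p_x * p_y)  # True

# Conditioning on the collider t unblocks the path (Rule 3):
p_t = prob(lambda x, y, t: t == 1)
p_x_given_t = prob(lambda x, y, t: x == 1 and t == 1) / p_t
p_ty = prob(lambda x, y, t: t == 1 and y == 1)
p_x_given_ty = prob(lambda x, y, t: x == 1 and t == 1 and y == 1) / p_ty
print(p_x_given_t, p_x_given_ty)
```

Given t = 1 (the effect happened), learning y = 1 lowers the probability of x = 1 from 2/3 to 1/2: one cause explains away the other, exactly the dead-battery/no-gas pattern above.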
14. Theorem 1
- Probabilistic implication of d-separation
15. Theorem 2
16. Talking Points
- Conditional Independence
- d-Separation
- Graphical Probabilistic Models
- Bayesian Networks
- Causation vs. Correlation
- Causal Graphs
- Causal Modeling Framework
17. Graphical Probabilistic Models
- Provide a convenient means of expressing substantive assumptions
- Facilitate economical representation of joint probability functions
- Facilitate efficient inferences from observations
- → Bayesian Networks
18. Bayesian Networks
- Chain Rule of Probability
- With Markov Assumption
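The formulas on this slide did not survive the transcript, but the idea can be sketched numerically. The chain rule writes the joint as P(x1,...,xn) = Π P(xi | x1,...,xi-1); the Markov assumption shrinks each conditional to P(xi | parents(xi)). Below is a minimal sketch for an assumed three-variable chain A → B → C with made-up illustrative CPT numbers:

```python
from itertools import product

# Hypothetical CPTs for the chain A -> B -> C (numbers are illustrative).
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # p_c_given_b[b][c]

# Chain rule plus the Markov assumption: each factor conditions only
# on the variable's parents, so P(a,b,c) = P(a) P(b|a) P(c|b)
# rather than the unrestricted P(a) P(b|a) P(c|a,b).
def joint(a, b, c):
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

total = sum(joint(a, b, c) for a, b, c in product([0, 1], repeat=3))
print(round(total, 10))  # 1.0 -- the local CPTs compose into a valid joint
```

The economy the slide mentions shows up in the factor sizes: three small tables instead of one table with 2^3 entries.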
19. Causation vs. Correlation
- Correlation
- Rain and a falling barometer
- Rain does not cause barometer to fall
- Barometer falling does not cause rain
- Causation
- Rain causes mud
- Other events may also cause mud
- Denote causal relationship graphically with
directed edges
20. Talking Points
- Conditional Independence
- d-Separation
- Graphical Probabilistic Models
- Causal Graphs
- Effects of intervention
- Causal Bayesian Networks
- Causal Stability
- Causal Modeling Framework
21. Directed (Causal) Graphs
- A and B are causally independent
- C, D, E, and F are causally dependent on A and B
- A and B are direct causes of C
- A and B are indirect causes of D, E, and F
- If C is prevented from changing with A and B, then A and B will no longer cause changes in D, E, and F
22. Causal Graphs
- Using DAGs as carriers of conditional independence relationships does not imply causation
- So build DAG models around causal rather than associative information
- "If conditional independence judgments are byproducts of stored causal relationships, then representing these causal relationships directly would be a more natural and more reliable way of expressing what we know or believe about the world." --Judea Pearl
23. Causal Graphs
- Another advantage
- Probabilities alone do not predict the effects of interventions
- Causal networks can serve as oracles for interventions
- Example: to model "turn the sprinkler on", remove the edges pointing into the sprinkler node
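The sprinkler example above can be sketched as graph surgery. Below is a minimal sketch, with an assumed parent-list representation of the familiar sprinkler network (the variable names are illustrative, not from the slide): intervening on a variable deletes the edges pointing into it while leaving everything downstream intact.

```python
# Hypothetical sprinkler network, stored as parent lists.
parents = {
    "Season":    [],
    "Sprinkler": ["Season"],
    "Rain":      ["Season"],
    "Wet":       ["Sprinkler", "Rain"],
}

def do(graph, var):
    """Mutilated graph for an intervention on var: the intervened
    variable stops listening to its usual causes, so all edges
    pointing into it are removed."""
    mutilated = {v: list(ps) for v, ps in graph.items()}
    mutilated[var] = []
    return mutilated

g_do = do(parents, "Sprinkler")
print(g_do["Sprinkler"])  # [] -- Season no longer influences Sprinkler
print(g_do["Wet"])        # ['Sprinkler', 'Rain'] -- downstream edges kept
```

This is why the graph can act as an oracle: observation uses the original graph, while an action uses the mutilated one, and the two give different predictions.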
24. Causal Graphs
- The ability of causal graphs to predict the effects of actions requires a stronger set of assumptions in network construction
- These assumptions must rest on causal knowledge, not just associational knowledge
- These assumptions are encapsulated in the following definition of Causal Bayesian Networks
25. Definition
26. Definition
27. Definition
28. Causal Stability
- We now have a semantic basis for notions like causal effect and causal influence
- Causal relationships are ontological, describing objective physical constraints in our world
- Probabilistic relationships are epistemic, reflecting what we know or believe about the world
- Thus, causal relationships should remain unaltered as long as no changes have happened in the environment -- even when our knowledge about the environment changes
29. Causal Stability
- For example, causal statement S1:
- "turning the sprinkler on would not affect the rain"
- Versus probabilistic statement S2:
- "the state of the sprinkler is independent of (or unassociated with) the state of the rain"
- The network above shows two ways S2 can become false, while S1 remains true
30. Talking Points
- Conditional Independence
- d-Separation
- Graphical Probabilistic Models
- Causal Graphs
- Causal Modeling Framework
- Causal Structure
- Causal Model
- IC Algorithm
31. Causal Structure
32. Causal Model
33. Causal Model
- Once a causal model M is formed, it defines a joint probability distribution P(M) over the variables in the system
- This distribution reflects some features of the causal structure
- Each variable must be independent of its grandparents, given the values of its parents
- We may recover the topology D of the DAG from features of the probability distribution
34. IC Algorithm (Inductive Causation)
- IC algorithm (Pearl)
- Based on variable dependencies
- Find all pairs of variables that are dependent on each other (applying a standard statistical test to the database)
- Eliminate (as much as possible) indirect dependencies
- Determine the directions of the dependencies
35. Comparing Abduction, Deduction and Induction
- Deduction: major premise: All balls in the box are black; minor premise: These balls are from the box; conclusion: These balls are black
- Deduction schema: A → B, A; therefore B
- Abduction: rule: All balls in the box are black; observation: These balls are black; explanation: These balls are from the box
- Abduction schema: A → B, B; therefore possibly A
- Induction: case: These balls are from the box; observation: These balls are black; hypothesized rule: All balls in the box are black
- Induction schema: whenever A then B (but not vice versa); therefore possibly A → B
- Induction goes from specific cases to general rules; abduction and deduction both go from one part of a specific case to another part of the case, using general rules (in different ways)
- Source: http://www.csee.umbc.edu/ypeng/F02671/lecture-notes/Ch15.ppt
36. IC Algorithm (Cont'd)
- Input:
- P, a stable distribution on a set V of variables
- Output:
- A pattern H(P) compatible with P
- A pattern is a partially directed DAG:
- some edges are directed, and
- some edges are undirected
37. IC Algorithm Step 1
- For each pair of variables a and b in V, search for a set Sab such that (a ⊥ b | Sab) holds in P; in other words, a and b should be independent in P, conditioned on Sab
- Construct an undirected graph G such that vertices a and b are connected with an edge if and only if no such set Sab can be found
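Step 1 can be sketched with a perfect conditional-independence oracle standing in for the statistical tests on the database. The sketch below assumes the true structure is the chain a → b → c, whose only independence is a ⊥ c given {b}; the oracle is hand-coded accordingly (all names here are illustrative).

```python
from itertools import chain, combinations

variables = ["a", "b", "c"]

def indep(x, y, given):
    """Perfect CI oracle for the assumed chain a -> b -> c."""
    return {x, y} == {"a", "c"} and "b" in given

def ic_step1(variables, indep):
    edges = set()
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        subsets = chain.from_iterable(
            combinations(others, k) for k in range(len(others) + 1))
        # Connect x-y iff NO separating set S_xy exists.
        if not any(indep(x, y, set(s)) for s in subsets):
            edges.add(frozenset((x, y)))
    return edges

skeleton = ic_step1(variables, indep)
print(sorted(sorted(e) for e in skeleton))  # [['a', 'b'], ['b', 'c']]
```

The a-c edge is dropped because the separating set {b} was found; the two adjacent pairs survive as the undirected skeleton G.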
38. IC Algorithm Step 2
- For each pair of nonadjacent variables a and b with a common neighbor c, check whether c ∈ Sab
- If it is, then continue
- Else add arrowheads pointing at c
- i.e., a → c ← b
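Step 2 can be sketched the same way. Below, the skeleton and the separating sets are assumed to come from Step 1 and are hard-coded for the collider a → c ← b (names are illustrative): a and b are nonadjacent, share the neighbor c, and were separated by the empty set, so c is not in Sab and gets both arrowheads.

```python
# Assumed Step 1 output for the collider a -> c <- b.
skeleton = {frozenset(("a", "c")), frozenset(("c", "b"))}
sep = {frozenset(("a", "b")): set()}  # S_ab: a and b separated by {}

def ic_step2(skeleton, sep, variables):
    arrows = set()
    for pair, s in sep.items():
        a, b = sorted(pair)
        for c in variables:
            adjacent_to_both = (frozenset((a, c)) in skeleton and
                                frozenset((c, b)) in skeleton)
            # Orient a -> c <- b only when c is NOT in the separating set.
            if adjacent_to_both and c not in s:
                arrows.add((a, c))
                arrows.add((b, c))
    return arrows

arrows = ic_step2(skeleton, sep, ["a", "b", "c"])
print(sorted(arrows))  # [('a', 'c'), ('b', 'c')]
```

Had c been inside Sab (a chain or fork), the edges would have stayed undirected, which is exactly the "if it is, then continue" branch.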
39. Example
40. IC Algorithm Step 3
- In the partially directed graph that results, orient as many of the undirected edges as possible, subject to two conditions:
- The orientation should not create a new v-structure
- The orientation should not create a directed cycle
41. Rules required to obtain a maximally oriented pattern
- R1: orient b - c into b → c whenever there is an arrow a → b such that a and c are nonadjacent
42. Rules required to obtain a maximally oriented pattern
- R2: orient a - b into a → b whenever there is a chain a → c → b
43. Rules required to obtain a maximally oriented pattern
- R3: orient a - b into a → b whenever there are two chains a - c → b and a - d → b such that c and d are nonadjacent
44. Rules required to obtain a maximally oriented pattern
- R4: orient a - b into a → b whenever there are two chains a - c → d and c → d → b such that c and b are nonadjacent
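The orientation rules are simple closure rules over a pattern. As a minimal sketch, here is R1 alone, with the pattern represented as a set of directed arrows plus a set of undirected edges (the representation and example graph are assumptions, not from the slides): given a → b with b - c undirected and a, c nonadjacent, b - c must become b → c, or the orientation b ← c would create a new v-structure at b.

```python
def apply_r1(arrows, undirected, variables):
    """Repeatedly apply R1: orient b - c into b -> c whenever a -> b
    exists and a, c are nonadjacent."""
    arrows, undirected = set(arrows), set(undirected)
    changed = True
    while changed:
        changed = False
        for a, b in list(arrows):
            for c in variables:
                nonadjacent = (
                    (a, c) not in arrows and (c, a) not in arrows and
                    frozenset((a, c)) not in undirected)
                if frozenset((b, c)) in undirected and nonadjacent:
                    undirected.discard(frozenset((b, c)))
                    arrows.add((b, c))  # orient b -> c
                    changed = True
    return arrows, undirected

# a -> b, undirected b - c, with a and c nonadjacent:
arrows, undirected = apply_r1({("a", "b")}, {frozenset(("b", "c"))},
                              ["a", "b", "c"])
print(sorted(arrows), sorted(undirected))  # [('a', 'b'), ('b', 'c')] []
```

R2 through R4 have the same shape: scan for their trigger pattern, orient one edge, and repeat until nothing changes, yielding the maximally oriented pattern of Step 3.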
45. Next Time
- Using a variant of the IC algorithm to learn causal gene regulatory networks
- Again, thanks to Chitta Baral, Andrew Moore and Xin Zhang, all of whom got most of their stuff from Judea Pearl