Title: Cause and Independence
1. Lecture 7
2. Cause in trees
- We noted previously that cause can be found by identifying the root nodes of the network (an expert is required for this).
- However, it is also possible (in theory) to determine cause statistically.
3. Possible configurations for a triplet
4. Conditional Independence
- Remember that, for configurations of type 1 and type 2, the nodes A and B are conditionally independent (given C).
5. Marginal Independence
- However, for the type 3 triplet, the nodes A and B are only independent if there is no information on C.
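As a brief illustration (a standard result about the triplet configurations, not taken verbatim from the slides): in the serial and diverging configurations conditioning on C separates A from B, while in the converging configuration (both arrows pointing into C) A and B are independent only when C is unobserved.

```latex
% Types 1 and 2 (e.g. A -> C -> B or A <- C -> B): independent given C
P(A, B \mid C) = P(A \mid C)\, P(B \mid C)

% Type 3 (A -> C <- B): independent only marginally
P(A, B) = P(A)\, P(B), \qquad P(A, B \mid C) \neq P(A \mid C)\, P(B \mid C) \text{ in general}
```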
6. Determining marginal independence
- Given a data set for our triplet A-C-B we can measure the dependence of A and B using all the data (i.e. with no information on C).
- If this is low we may suspect a multiple parent.
7. Determining marginal independence
- Alternatively, we can partition our data according to the states of C, and then compute a set of dependency values (one for each state of C).
- If any of these is high we may suspect a multiple parent.
8. Practical Computation
- Partition the data according to the states of the middle node and calculate the dependency for each set.
- If Dep(A,B) is small, and some Dep(A,B | C=cj) is large, assume a multiple parent (see the sketch below).
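A minimal sketch of how these dependency values might be computed, assuming the dependency measure is mutual entropy (mutual information), as slide 15 suggests; the function names and the (A, C, B) sample layout are illustrative assumptions.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Dep(X, Y): mutual entropy of two variables from a list of (x, y) samples."""
    n = len(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    pxy = Counter(pairs)
    return sum((c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def conditional_dependencies(triples):
    """Dep(X, Y | M=mj) for each state mj of the middle node M.
    'triples' holds (x, m, y) samples with the conditioning node in the middle."""
    by_m = {}
    for x, m, y in triples:
        by_m.setdefault(m, []).append((x, y))
    return {m: mutual_information(pairs) for m, pairs in by_m.items()}
```

Dep(A,B) over all the data is then simply mutual_information([(a, b) for a, c, b in triples]).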
9. Algorithm for determining causal directions
- For each triplet A-C-B in the tree:
- Test to see if A and B are independent, but have some dependency given C.
- For any such triplet set the arrow directions A → C ← B (a sketch follows below).
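A sketch of that loop under the same assumptions as above; the thresholds are illustrative, since (as slide 15 notes) mutual entropy only gives a degree of dependence, and data_for is a hypothetical accessor returning the (a, c, b) samples for a triplet.

```python
def find_colliders(tree_triplets, data_for, low_thresh=0.05, high_thresh=0.5):
    """Orient A -> C <- B for every triplet whose end nodes look marginally
    independent but dependent given the middle node.  Thresholds are illustrative."""
    arrows = []
    for (A, C, B) in tree_triplets:
        triples = data_for(A, C, B)                      # (a, c, b) samples for this triplet
        marginal = mutual_information([(a, b) for a, c, b in triples])
        conditional = conditional_dependencies(triples)  # Dep(A, B | C=cj)
        if marginal < low_thresh and max(conditional.values()) > high_thresh:
            arrows += [(A, C), (B, C)]                   # both arrows point into C
    return arrows
```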
10. Continuing to find causal links
- Having obtained some arrows in the network, we can now extend the procedure to resolve the case where a node has a known parent.
- If A is the known parent of B and the edge B-C is undirected: when A and C are independent given B, B is the parent of C; otherwise vice versa (see the sketch below).
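A sketch of that propagation rule, reusing conditional_dependencies from above; the threshold is again an illustrative assumption.

```python
def propagate_direction(triples, thresh=0.05):
    """Given an oriented edge A -> B and an undirected edge B - C, decide the
    direction of B - C.  'triples' holds (a, b, c) samples with the shared
    node B in the middle."""
    cond = conditional_dependencies(triples)   # Dep(A, C | B=bj) for each state bj
    if max(cond.values()) < thresh:
        return ("B", "C")   # A and C independent given B: chain A -> B -> C, so B is C's parent
    return ("C", "B")       # otherwise a collider A -> B <- C, so C is B's parent
```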
11. We start with an undirected tree
12. Test for multiple parents
13. Propagate the arrows where possible
14. Continue until no propagation is possible
15. Problems in determining cause
- Mutual entropy is a continuous function; it tells us only the degree to which variables are dependent. Thus we need thresholds to decide whether a node has multiple parents.
- We may find a few (or no) cases of multiple parents.
16. Problem break
- Given the following data, what arc directions would you give to the triple A-B-C?
- If we ignore B then A and C are completely independent. However, given B=b0 or B=b1 there is complete dependence. Thus the causal picture is A → B ← C.
If you believe Pearl!
17. Structure and Parameter Learning
- One of the good features of Bayesian networks is that they combine both structure and parameters.
- We can express our knowledge (if any) about the data by choosing a structure.
- We then optimise the performance by adjusting the parameters (link matrices).
18. Pure Parameter Learning
- Neural networks are a class of inference systems which offer just parameter learning.
- Generally it is very difficult to embed knowledge into a neural net, or to infer a structure once the learning phase is complete.
19. Pure Structural Learning
- Traditional rule-based inference systems have just structure (sometimes with a rudimentary parameter mechanism).
- They do offer structure modification through methods such as rule induction.
- However, they are difficult to optimise using large data sets.
20. Small is beautiful
- The joint probability of the variables in a Bayesian network is simply the product of the conditional probabilities and the priors of the root(s) (see the factorisation below).
- If the network is an exact model of the data then it must represent the dependency exactly.
- However, using a spanning tree algorithm this may not be the case.
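For concreteness, the general factorisation and one illustrative instance; the particular four-variable tree shape is an assumption, chosen to match the multi-tree example later on (root D).

```latex
P(X_1, \dots, X_n) = \prod_i P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr),
\qquad\text{e.g.}\quad
P(A, B, C, D) = P(D)\, P(B \mid D)\, P(C \mid D)\, P(A \mid B).
```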
21. Small is beautiful
- In particular, we may not be able to insert an arc between two nodes with some dependency because it would form a loop.
- The effect of unaccounted dependencies is likely to be more pronounced as the number of variables in the network increases.
22. Minimal spanning tree approach
- A variant on the spanning tree for cases where the class node is known was proposed by Enrique Sucar.
- This requires a measure of the quality of the tree.
- A simple approach to this is to test the predictive ability of the network.
23. The steps are as follows
- 1. Build a spanning tree and obtain an ordering of the nodes starting at the root.
- 2. Remove all arcs.
- 3. Add arcs in the order of the magnitude of their dependency.
- 4. If the predictive ability of the network is good enough (or the nodes are all joined) stop, otherwise go to step 3 (see the sketch below).
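A sketch of that greedy loop; fit_parameters and predictive_accuracy stand in for whatever parameter estimation and quality measure are actually used, and are assumptions rather than part of Sucar's method.

```python
def improve_tree(nodes, candidate_arcs, dependency, data, target_accuracy):
    """Greedy arc-adding loop from slide 23.  'candidate_arcs' come from the
    spanning tree, 'dependency' maps an arc to its dependency score, and
    fit_parameters / predictive_accuracy are assumed helper functions."""
    arcs = []                                              # step 2: start with no arcs
    remaining = sorted(candidate_arcs, key=dependency, reverse=True)
    while remaining:
        arcs.append(remaining.pop(0))                      # step 3: add the strongest remaining arc
        model = fit_parameters(nodes, arcs, data)
        if predictive_accuracy(model, data) >= target_accuracy:
            break                                          # step 4: good enough, stop
    return arcs                                            # otherwise all nodes end up joined
```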
24. Multi-trees
- An interesting way of reducing the size of Bayesian classifier networks was proposed by Heckerman.
- Here the data set is partitioned according to the states of the root node(s).
25. Example of multi-trees
- Given a data set with the root identified as D:
- a1,b1,c1,d1   a2,b1,c1,d1   a2,b1,c2,d1   a2,b2,c1,d1
- a1,b2,c2,d2   a2,b1,c1,d2   a2,b2,c2,d2   a1,b1,c1,d2
- a1,b1,c1,d3   a2,b2,c2,d3   a2,b2,c1,d3   a2,b2,c2,d3
- Since D has three states we split the data into 3 sets.
26. Example of multi-trees, part 2
- Data set for D=d1:
- a1,b1,c1   a2,b1,c1   a2,b1,c2   a2,b2,c1
- Data set for D=d2:
- a1,b2,c2   a2,b1,c1   a2,b2,c2   a1,b1,c1
- Data set for D=d3:
- a1,b1,c1   a2,b2,c2   a2,b2,c1   a2,b2,c2
- We then use the spanning tree methodology to find three trees with 3 variables rather than one tree with 4 variables (see the sketch below).
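A minimal sketch of the partitioning step using the twelve rows from slide 25; the function name and tuple layout are illustrative.

```python
def partition_by_root(rows, root_index=3):
    """Split (a, b, c, d) rows into one data set per state of the root D,
    dropping the root value from each retained row."""
    subsets = {}
    for row in rows:
        d = row[root_index]
        subsets.setdefault(d, []).append(row[:root_index] + row[root_index + 1:])
    return subsets

rows = [  # the twelve samples from slide 25
    ("a1", "b1", "c1", "d1"), ("a2", "b1", "c1", "d1"), ("a2", "b1", "c2", "d1"), ("a2", "b2", "c1", "d1"),
    ("a1", "b2", "c2", "d2"), ("a2", "b1", "c1", "d2"), ("a2", "b2", "c2", "d2"), ("a1", "b1", "c1", "d2"),
    ("a1", "b1", "c1", "d3"), ("a2", "b2", "c2", "d3"), ("a2", "b2", "c1", "d3"), ("a2", "b2", "c2", "d3"),
]
subsets = partition_by_root(rows)   # {'d1': [...], 'd2': [...], 'd3': [...]} with (a, b, c) rows
```

Each subset is then handed to the spanning-tree learner to produce its own three-variable tree.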
27. Example of multi-trees, part 3
The resulting trees will have different structures and different conditional probabilities (when some causal relation has been established).
28. Using multi-trees
- For a given data point (ai, bj, ck) we calculate the joint probability using each tree found:
- Evidence for d1 is P(ai, bj, ck | D=d1)
- Evidence for d2 is P(ai, bj, ck | D=d2)
- Evidence for d3 is P(ai, bj, ck | D=d3)
- The evidence can be normalised to form a distribution over D (see the sketch below).
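A sketch of the normalisation step; tree_models is an assumed mapping from each state of D to a function returning the joint probability of (a, b, c) under that state's tree.

```python
def distribution_over_root(point, tree_models):
    """Normalise the per-tree evidence into a distribution over the root D."""
    evidence = {d: model(point) for d, model in tree_models.items()}
    total = sum(evidence.values())
    return {d: e / total for d, e in evidence.items()}

# e.g. distribution_over_root(("a1", "b1", "c1"), tree_models) -> {'d1': ..., 'd2': ..., 'd3': ...}
```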