Title: Exact Inference
1 Lecture 11
2 Methods of Exact Inference
- Exact inference means modelling all the dependencies in the data.
- This is only computationally feasible for small or sparse networks.
- Methods
- Cutset Conditioning (Pearl)
- Node Combination
- Join Trees (Lauritzen and Spiegelhalter)
3 Cutset Conditioning
- Pearl suggested a method for propagating probabilities in multiply connected networks.
- The first step is to identify a cutset of nodes which, if instantiated, makes propagation safe.
4 Cutset Conditioning Example 1
- Given the Asia network, the minimum cutset consists of node L. If L is instantiated, propagation is safe.
5 Cutset Conditioning Example 2
- If data is not available on L, the network can be broken into two networks, one for each possible instantiated state of L.
- Both networks are singly connected.
6 Cutset Conditioning Example 3
- Probabilities can be propagated in both networks safely.
- A final value for the probability can be found by weighting the results according to the prior probability distribution for L.
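The weighting step can be sketched in a few lines. This is only an illustration, assuming no other evidence has been entered, so that the weights are simply the prior P(L). The numbers and the names `prior_L`, `p_x_given_L` and `combine` are hypothetical and are not taken from the Asia network.

```python
# Hypothetical values for illustration only (not from the Asia network).
prior_L = {"l1": 0.3, "l2": 0.7}        # prior distribution P(L)

# Result of propagating in each singly connected network:
# P(X = true | L = l) for some query node X.
p_x_given_L = {"l1": 0.9, "l2": 0.2}


def combine(prior, conditional):
    """Weight each per-network result by the prior of the cutset state."""
    return sum(prior[state] * conditional[state] for state in prior)


p_x = combine(prior_L, p_x_given_L)
print(round(p_x, 4))
```

With these made-up numbers the result is 0.3 × 0.9 + 0.7 × 0.2 = 0.41, which is the law of total probability applied over the states of L.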
7 Problems with cutset conditioning 1
- The method relies on finding a small cutset.
- If the cutset consists of three nodes, each with 8 states, then the probability propagations need to be carried out $8^3 = 512$ times.
- Thus the computation time expands quickly.
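The count can be checked by enumerating the joint instantiations of the cutset. The sketch below assumes three cutset nodes with 8 states each, as in the slide; the variable names are illustrative.

```python
from itertools import product
from math import prod

state_counts = [8, 8, 8]          # three cutset nodes, 8 states each

# One propagation run is needed per joint instantiation of the cutset.
n_runs = prod(state_counts)
instantiations = list(product(*(range(n) for n in state_counts)))

print(n_runs, len(instantiations))
```

Adding a fourth 8-state node to the cutset would multiply the number of runs by 8 again, which is why the method depends on the cutset being small.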
8 Problems with cutset conditioning 2
- For the previous example we need to make 512 networks, each with a separate instantiation of the three cutset variables.
- We also need to find priors for all these possible instantiations.
- There may not be enough data to do this reliably.
9 Joining dependent variables
- If C and D are dependent (given A), we can combine them into one new variable by joining them.
10 Instantiating Joined Variables
- If we instantiate one of C or D, but not the other (e.g. $D = d_2$), we must either re-calculate the link matrix
- $P(C, D \mid A) \Rightarrow P(C \mid A)\big|_{D = d_2}$
- or instantiate several states
- Given CD has states
- $c_1d_1, c_2d_1, c_3d_1, c_1d_2, c_2d_2, c_3d_2$
- If we instantiate $D = d_2$ we have
- $\lambda(CD) = (0, 0, 0, 1, 1, 1)$
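The evidence vector over the joined states can be built mechanically. The sketch below assumes the state ordering listed above (C varies fastest, D slowest); the function name `evidence_vector` is hypothetical.

```python
c_states = ["c1", "c2", "c3"]
d_states = ["d1", "d2"]

# Joined states in the order c1d1, c2d1, c3d1, c1d2, c2d2, c3d2
# (D is the outer loop, so it varies slowest).
cd_states = [(c, d) for d in d_states for c in c_states]


def evidence_vector(joint_states, observed_d):
    """Return 1 for every joined state consistent with the observed D, else 0."""
    return [1 if d == observed_d else 0 for (_, d) in joint_states]


lam = evidence_vector(cd_states, "d2")
print(lam)
```

Instantiating D therefore switches on all three joined states that contain d2, rather than a single state.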
11 Limitation of joining variables
- If two variables are to be joined, C having $|C|$ states and D having $|D|$ states, the new variable CD will have $|C| \times |D|$ states.
- The increased number of states is undesirable and limits the applicability of this approach.
- In the limit a fully connected network degenerates into one node!
12 Join Trees
- Join trees are a generalisation of the previous method.
- Given a network, with all dependencies included as arcs, the objective is to find a way of joining the variables such that propagation is possible.
- This idea was originally pioneered by Lauritzen and Spiegelhalter.
13 Example
- We shall use the following medical example, which is taken from Neapolitan's first book.
14 The join tree
- The join tree has the same joint probability distribution as the original network.
15 Join tree nodes
- The nodes of the original network are joined according to how the probability matrices are grouped.
16 Instantiation and propagation
- Propagation is done in just two passes, up followed by down. The messages are not the same as in Pearl's algorithm.
17 Potential Representations
- Grouping the link matrices of the original network to form a join tree could be done in a number of ways, and each is called a potential representation.
- A potential representation is a number of subsets $W_i$ (nodes in the join tree) of our variables (nodes in the original network), and a function $\psi$ with the property
- $P(V) = K \prod_i \psi(W_i)$ for some constant $K$
18 Potential Representations
- In our previous example
- $\psi(W_1) = P(A)\,P(B \mid A)\,P(C \mid A)$
- $\psi(W_2) = P(D \mid B, C)$
- $\psi(W_3) = P(E \mid C)$
- $P(V) = P(A, B, C, D, E)$
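To see that this grouping reproduces the joint distribution, the product of the three potentials can be summed over all configurations. This is a sketch only: the variables are assumed to be binary and every number in the conditional probability tables is invented for illustration.

```python
from itertools import product

# Hypothetical CPTs over binary variables (values 0/1); numbers are illustrative.
p_a = {0: 0.6, 1: 0.4}                                      # P(A)
p_b_given_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}    # [a][b] -> P(B=b | A=a)
p_c_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}    # [a][c] -> P(C=c | A=a)
p_d1_given_bc = {(0, 0): 0.1, (0, 1): 0.6,
                 (1, 0): 0.7, (1, 1): 0.95}                 # (b, c) -> P(D=1 | b, c)
p_e_given_c = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}    # [c][e] -> P(E=e | C=c)


def psi_w1(a, b, c):
    """Potential on W1 = {A, B, C}."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]


def psi_w2(b, c, d):
    """Potential on W2 = {B, C, D}."""
    p1 = p_d1_given_bc[(b, c)]
    return p1 if d == 1 else 1.0 - p1


def psi_w3(c, e):
    """Potential on W3 = {C, E}."""
    return p_e_given_c[c][e]


def joint(a, b, c, d, e):
    """P(A, B, C, D, E) as the product of the three potentials."""
    return psi_w1(a, b, c) * psi_w2(b, c, d) * psi_w3(c, e)


total = sum(joint(*values) for values in product((0, 1), repeat=5))
print(round(total, 10))
```

Because each potential is a product of conditional probabilities from the network, the products sum to 1 over all 32 configurations, as a joint distribution must.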
19 Intersections
- For a potential representation, given an ordering, we can define some intersection sets
- $S_i = W_i \cap (W_1 \cup W_2 \cup W_3 \cup \dots \cup W_{i-1})$
- (The variables in $W_i$ which are in any lower index set)
- $R_i = W_i \setminus S_i$
- (The variables in $W_i$ that have not appeared already)
- NB: lower index $\Rightarrow$ parent
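These two sets are easy to compute for a given ordering. The sketch below uses the three subsets of the running example, W1 = {A, B, C}, W2 = {B, C, D}, W3 = {C, E}; the function name `intersection_sets` is hypothetical.

```python
def intersection_sets(subsets):
    """For each W_i in order, return (S_i, R_i).

    S_i: variables of W_i already present in a lower-index set.
    R_i: variables of W_i appearing for the first time.
    """
    seen = set()
    result = []
    for w in subsets:
        s = w & seen
        r = w - s
        result.append((s, r))
        seen |= w
    return result


W = [{"A", "B", "C"}, {"B", "C", "D"}, {"C", "E"}]
for i, (s, r) in enumerate(intersection_sets(W), start=1):
    print(f"S{i} = {sorted(s)}, R{i} = {sorted(r)}")
```

For this ordering S1 is empty, S2 = {B, C} with R2 = {D}, and S3 = {C} with R3 = {E}.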
20 Intersection Sets
21 The running intersection property
- To permit propagation, the potential representation must have the running intersection property.
- Given an ordered set of subsets $W_1, \dots, W_p$ of our variables V, for every $W_i$ with $i > 1$ there is a set $W_j$ with $j < i$ such that
- $S_i = W_i \cap (W_1 \cup W_2 \cup W_3 \cup \dots \cup W_{i-1}) \subseteq W_j$
- If this is so we can write $P(V) = P(W_1) \prod_{i \ge 2} P(R_i \mid S_i)$
22 Running Intersection Property
[Figure: first possibility for the ordering, with the first possible join tree]
23 Running Intersection Property - another tree
24 Summary
- The ordering is set up so that each node of the join tree has an associated set of variables $R \cup S$.
- R is a set of variables that have not been seen above.
- S is a set of variables which must appear in the immediate parents.
- Thus we will have a link matrix $P(R \mid S)$.
25 Finding the ordered subset of the nodes
- An ordered subset of the variables can always be found from a given causal graph as follows
- 1. Moralise the graph (join any unjoined parents)
- 2. Triangulate the graph
- 3. Identify the cliques of the resulting graph
- 4. For each variable X choose one clique $Cl_i$ such that $\{X\} \cup Pa(X) \subseteq Cl_i$ (Pa(X) means parents of X)
- 5. Define the function $\psi$ as follows: $\psi(Cl_i)$ is the product of $P(X \mid Pa(X))$ over the variables X allocated to $Cl_i$
26 1, 2. Moralising and triangulation
- After moralising, this graph is already triangulated, so there is no need for a triangulation step.
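Moralisation itself is a short procedure: drop the arc directions and connect every pair of parents that share a child. The sketch below assumes the example network has the structure implied by its conditional probabilities P(A), P(B|A), P(C|A), P(D|B,C) and P(E|C); the function name `moralise` is hypothetical.

```python
from itertools import combinations

# Parents of each node, read off the example's conditional probabilities.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["C"]}


def moralise(parents):
    """Return the undirected edges of the moral graph."""
    edges = set()
    for child, pa in parents.items():
        for p in pa:                      # keep every arc, without direction
            edges.add(frozenset((p, child)))
        for p, q in combinations(pa, 2):  # "marry" parents of a common child
            edges.add(frozenset((p, q)))
    return edges


moral_edges = moralise(parents)
print(sorted("".join(sorted(e)) for e in moral_edges))
```

The only edge added is B-C, joining the two parents of D. That edge is also a chord of the cycle A-B-D-C, which is why no further triangulation is required here.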
27 3. Finding the Cliques
- A clique is a maximal set of nodes in which every
node is connected to every other
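For a graph this small the cliques can be found by brute force. The sketch below assumes the moral graph of the example has the edges A-B, A-C, B-C, B-D, C-D and C-E; it tests every subset of nodes, so it is only suitable for tiny graphs, and the function names are hypothetical.

```python
from itertools import combinations

nodes = ["A", "B", "C", "D", "E"]
# Assumed edges of the moralised example graph.
edges = {frozenset(e) for e in ["AB", "AC", "BC", "BD", "CD", "CE"]}


def is_complete(subset):
    """True if every pair of nodes in the subset is connected."""
    return all(frozenset(pair) in edges for pair in combinations(subset, 2))


def maximal_cliques(nodes):
    """Enumerate all complete subsets and keep those not contained in a larger one."""
    complete = [set(s)
                for k in range(1, len(nodes) + 1)
                for s in combinations(nodes, k)
                if is_complete(s)]
    return [c for c in complete if not any(c < other for other in complete)]


cliques = maximal_cliques(nodes)
print(sorted(sorted(c) for c in cliques))
```

The maximal complete subsets are {A, B, C}, {B, C, D} and {C, E}. The exhaustive search is exponential in the number of nodes, so a larger network would need a proper clique-finding algorithm.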
28 4. Allocating variables to Cliques
- We need to allocate the variables to cliques such
that their original parents are in the same
clique
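One way to sketch the allocation is to give each variable the first clique that contains it together with all of its parents. The clique names `Cl1` to `Cl3`, the parent sets and the function `allocate` are assumptions made for this illustration, based on the example network.

```python
# Assumed structure of the example network and its cliques.
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["C"]}
cliques = {"Cl1": {"A", "B", "C"}, "Cl2": {"B", "C", "D"}, "Cl3": {"C", "E"}}


def allocate(parents, cliques):
    """Map each variable X to the first clique containing {X} and Pa(X)."""
    allocation = {}
    for x, pa in parents.items():
        family = {x, *pa}
        # Raises StopIteration if no clique contains the whole family.
        allocation[x] = next(name for name, members in cliques.items()
                             if family <= members)
    return allocation


allocation = allocate(parents, cliques)
print(allocation)
```

Here A, B and C go to Cl1, D goes to Cl2 and E goes to Cl3. When a family fits in more than one clique, any one of them may be chosen; this sketch simply takes the first.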
29 5. Initialising Potential functions
- After allocating the variables we initialise
clique potential functions from the conditional
probabilities
30 Defining the tree of cliques
- We now find an ordering of the cliques with the running intersection property. That is
- There is some ordering such that for each $i > 1$ there is a $j < i$ with
- $S_i = Cl_i \cap (Cl_1 \cup Cl_2 \cup \dots \cup Cl_{i-1}) \subseteq Cl_j$
- $Cl_j$ can be taken as a parent of $Cl_i$
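31 Applying this to our example
- $S_2 = Cl_2 \cap Cl_1 = \{B, C, D\} \cap \{A, B, C\} = \{B, C\}$
- Clearly $\{B, C\} \subseteq \{A, B, C\}$
- So we choose $Cl_1$ to be the parent of $Cl_2$
- $S_3 = Cl_3 \cap (Cl_1 \cup Cl_2) = \{C, E\} \cap \{A, B, C, D\} = \{C\}$
- We have that $\{C\}$ is a subset of both $\{A, B, C\}$ and $\{B, C, D\}$, so either $Cl_1$ or $Cl_2$ can be the parent of $Cl_3$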
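The same check can be run mechanically for any ordering of cliques. The sketch below lists, for each clique after the first, every earlier clique that contains its intersection set and could therefore serve as its parent; the function name `possible_parents` is hypothetical.

```python
def possible_parents(cliques):
    """For an ordered list of cliques, map each index i >= 2 (1-based)
    to the earlier cliques that contain S_i."""
    result = {}
    seen = set(cliques[0])
    for i in range(1, len(cliques)):
        s_i = cliques[i] & seen                  # S_i for this clique
        result[i + 1] = [j + 1 for j in range(i) if s_i <= cliques[j]]
        seen |= cliques[i]
    return result


order = [{"A", "B", "C"}, {"B", "C", "D"}, {"C", "E"}]
candidates = possible_parents(order)
print(candidates)
```

For this ordering the second clique can only hang from the first, while the third can hang from either, matching the hand calculation. The ordering has the running intersection property exactly when no list of candidates is empty.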
32 The Join tree
- The join tree is now completed.
- In all future probability calculations we use
just the join tree, disregarding our original
network