Title: Inference in Bayesian Networks
1. Inference in Bayesian Networks
MIND Lab Seminars on Reasoning and Planning under Uncertainty
- Ugur Kuter
- MIND Lab.
- 8400 Baltimore Avenue, Ste. 200
- College Park, Maryland, 20742
- Web site for the seminars: http://www.cs.umd.edu/users/ukuter/uncertainty/
2. From Last Episode
- A Bayesian Network is a DAG
  - Nodes represent the events
  - Arcs represent the causal influences between the linked events
  - The strength (i.e., the quantification) of a causal link is defined directly by the conditional probabilities on the linked events
- Conditional Independence in Bayesian Networks
  - The event a d-separates the event b from c if, along every undirected path between b and c, there is an event w that satisfies the following:
    - if w does not have converging arrows, then w is equal to a
    - if w has converging arrows, then a is equal neither to w nor to any of w's descendants
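The d-separation test above can be sketched in Python by enumerating the undirected paths directly. This is only a minimal illustration for small networks: the dictionary encoding (event mapped to its list of children), the function names, and the example networks are assumptions made here, not part of the seminar material.

```python
def descendants(dag, node):
    """All events reachable from `node` by following arcs forward."""
    seen, stack = set(), list(dag.get(node, ()))
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(dag.get(n, ()))
    return seen


def undirected_paths(dag, start, goal):
    """Yield every simple path from start to goal, ignoring arc direction."""
    nbrs = {}
    for parent, children in dag.items():
        for child in children:
            nbrs.setdefault(parent, set()).add(child)
            nbrs.setdefault(child, set()).add(parent)

    def walk(path):
        if path[-1] == goal:
            yield path
            return
        for n in nbrs.get(path[-1], ()):
            if n not in path:
                yield from walk(path + [n])

    yield from walk([start])


def d_separated(dag, b, c, evidence):
    """True if the set `evidence` blocks every undirected path between b and c."""
    for path in undirected_paths(dag, b, c):
        blocked = False
        for prev, w, nxt in zip(path, path[1:], path[2:]):
            # w has converging arrows on this path if both neighbours point at it
            converging = w in dag.get(prev, ()) and w in dag.get(nxt, ())
            if converging:
                if w not in evidence and not (descendants(dag, w) & evidence):
                    blocked = True
            elif w in evidence:
                blocked = True
        if not blocked:
            return False
    return True


# Hypothetical examples: a chain a -> b -> c and a collider a -> c <- b with c -> d
chain = {"a": ["b"], "b": ["c"]}
collider = {"a": ["c"], "b": ["c"], "c": ["d"]}
print(d_separated(chain, "a", "c", {"b"}))       # evidence on b blocks the chain
print(d_separated(collider, "a", "b", {"d"}))    # evidence on a descendant of c opens the path
```

Path enumeration is exponential in the worst case, so this form is only meant to mirror the definition, not to scale.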
3. Inference over Bayesian Networks
- Inference: computing/updating our belief in some designated query events, given the values for some evidence
  - Given that I know a and c occurred in the world, what is the probability that e and b will occur or have occurred?
- Exact vs. Approximate Inference
- Predictive vs. Diagnostic Inference
- Different inference algorithms for different structures of the network models
  - Singly-Connected Networks
  - Multiply-Connected Networks
[Figure: example network over the events a, b, c, d, e]
4. Exact Inference in Singly-Connected Networks
- Singly-connected networks
  - there is at most one path between each pair of events
  - e.g., chains, trees, poly-trees (a.k.a. forests)
  - Note: no loops
- Predictive inference is done via the chain rule of probability theory
  - $P(a, b, \ldots, f) = P(a)\,P(b \mid a) \cdots P(f \mid a, b, \ldots, e)$
- Diagnostic inference is done via the chain rule and Bayes' rule
  - $P(a \mid b) = P(b \mid a)\,P(a) / P(b)$
[Figure: example singly-connected network over the events a, b, c, d, e, f]
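As a small numeric check of the two rules on this slide, the sketch below starts from a joint distribution over two Boolean events a and b and recovers it with the chain rule, then inverts the conditional with Bayes' rule. The probabilities and function names are made up for illustration.

```python
import math

# Hypothetical joint distribution P(a, b) over two Boolean events
joint = {(True, True): 0.24, (True, False): 0.06,
         (False, True): 0.14, (False, False): 0.56}


def p_a(a):
    """Marginal P(a)."""
    return sum(p for (x, _), p in joint.items() if x == a)


def p_b(b):
    """Marginal P(b)."""
    return sum(p for (_, y), p in joint.items() if y == b)


def p_b_given_a(b, a):
    """Conditional P(b | a), read off the joint."""
    return joint[(a, b)] / p_a(a)


def p_a_given_b(a, b):
    """Diagnostic direction via Bayes' rule: P(a | b) = P(b | a) P(a) / P(b)."""
    return p_b_given_a(b, a) * p_a(a) / p_b(b)


# Chain rule: P(a, b) = P(a) P(b | a) for every assignment
chain_rule_holds = all(
    math.isclose(joint[(a, b)], p_a(a) * p_b_given_a(b, a))
    for a in (True, False) for b in (True, False)
)
print(chain_rule_holds, p_a_given_b(True, True))
```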
5. Exact Inference over Chains
- Example: three-event chain a → b → c
- Prediction: given evidence on a, what is the probability of c?
  - $P(c \mid a) = \sum_{B \in \{\text{true}, \text{false}\}} P(c \mid B, a)\,P(B \mid a) = \sum_{B \in \{\text{true}, \text{false}\}} P(c \mid B)\,P(B \mid a)$
- Diagnosis: given evidence on c, what is the probability of a?
  - Use Bayes' rule to compute this conditional probability
  - $P(a \mid c) = P(c \mid a)\,P(a) / P(c)$
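[Figure: the chain a → b → c, shown with evidence on a and query c for prediction, and with evidence on c and query a for diagnosis]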
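The prediction and diagnosis computations for a three-event chain can be sketched as follows. The conditional probability tables are hypothetical numbers chosen for illustration; only the two formulas come from the slide.

```python
B = (True, False)

# Hypothetical CPTs for the chain a -> b -> c
p_a = {True: 0.3, False: 0.7}                       # P(a)
p_b_given_a = {True: {True: 0.8, False: 0.2},       # p_b_given_a[a][b] = P(b | a)
               False: {True: 0.1, False: 0.9}}
p_c_given_b = {True: {True: 0.9, False: 0.1},       # p_c_given_b[b][c] = P(c | b)
               False: {True: 0.2, False: 0.8}}


def predict_c(c, a):
    """Prediction: P(c | a) = sum over b of P(c | b) P(b | a)."""
    return sum(p_c_given_b[b][c] * p_b_given_a[a][b] for b in B)


def prior_c(c):
    """P(c) = sum over a of P(c | a) P(a), needed as the Bayes denominator."""
    return sum(predict_c(c, a) * p_a[a] for a in B)


def diagnose_a(a, c):
    """Diagnosis: P(a | c) = P(c | a) P(a) / P(c)."""
    return predict_c(c, a) * p_a[a] / prior_c(c)


print(predict_c(True, True))    # approximately 0.76
print(diagnose_a(True, True))   # approximately 0.228 / 0.417
```

Note that the intermediate event b is summed out in both directions; diagnosis only adds the Bayes inversion on top of the prediction.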
6. Exact Inference over Trees and Poly-Trees
- Pearl's Message Passing Techniques
- Three specific parameters for computing the belief of an event, say c (here with parents a and b)
  - Conditional Probability Table for c
    - i.e., $P(c \mid \mathrm{parents}(c))$
  - Predictive Support
    - the probability of the parents of c, given all the evidence connected to a and b, except via c
    - $\pi(c) = \sum_{\mathrm{parents}(c)} P(c \mid \mathrm{parents}(c))\,P(\mathrm{parents}(c) \mid \text{all evidence connected to the parents, except via } c)$
  - Diagnostic Support
    - the probability of all of the evidence connected to the children of c, except via c, given c
    - $\lambda(c) = P(\text{all evidence connected to the children of } c \text{, except via } c \mid c)$
[Figure: poly-tree over the events a, b, c, d, e, f, annotated with π(c) and λ(c) at c]
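To make the roles of the predictive and diagnostic support concrete, here is a minimal sketch for one event c with parents a and b and a single observed child d. It assumes no evidence above c (so the parents keep their priors) and combines the two supports by the usual normalized product, belief(c) proportional to π(c) λ(c). The network fragment, the numbers, and the names `pi`, `lam`, and `belief` are all assumptions for illustration, not the seminar's full message passing algorithm.

```python
from itertools import product

B = (True, False)

# Hypothetical fragment: a -> c <- b, c -> d, with d observed to be True
p_a = {True: 0.2, False: 0.8}
p_b = {True: 0.5, False: 0.5}
p_c_true = {(True, True): 0.9, (True, False): 0.6,    # P(c=True | a, b)
            (False, True): 0.5, (False, False): 0.1}
p_d_true = {True: 0.7, False: 0.2}                    # P(d=True | c)


def p_c(c, a, b):
    """CPT lookup P(c | a, b)."""
    return p_c_true[(a, b)] if c else 1.0 - p_c_true[(a, b)]


def pi(c):
    """Predictive support: sum over parent values of P(c | a, b) P(a) P(b)."""
    return sum(p_c(c, a, b) * p_a[a] * p_b[b] for a, b in product(B, B))


def lam(c):
    """Diagnostic support: probability of the evidence d=True given c."""
    return p_d_true[c]


def belief(c):
    """Normalized product of predictive and diagnostic support."""
    unnormalized = {v: pi(v) * lam(v) for v in B}
    return unnormalized[c] / sum(unnormalized.values())


print(pi(True), belief(True))
```

The sum inside `pi` ranges over every combination of parent values, which is where the cost of the messages comes from.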
7. Characteristics of Message Passing
- The algorithm uses the locality principle, i.e., it considers information only from an event's immediate parents and children
- If the number of parents is small, then message passing quickly converges to an equilibrium (or near-equilibrium)
- Otherwise, the algorithm is not feasible
  - since the computation of the messages is exponential in the number of parents of an event
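The exponential cost can be seen by counting how many parent value combinations a single message has to sum over. The helper below is a hypothetical illustration that assumes every event takes the same number of values (two by default).

```python
def parent_configurations(num_parents, values_per_event=2):
    """Number of parent value combinations one event's CPT must cover."""
    return values_per_event ** num_parents


for k in (2, 10, 30):
    # Each additional Boolean parent doubles the number of terms in the sum
    print(k, parent_configurations(k))
```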
8. Summary: Exact Inference Algorithms over Singly-Connected Networks
- Inference is done based on
  - Predictive support for the occurrence of an event
    - Given the ancestors, how is our belief in the event affected?
  - Diagnostic support for the occurrence of an event
    - Given the descendants, how is our belief in the event affected?
- Different inference mechanisms for different network structures
  - Chains: simple application of Bayes' rule and the chain rule
  - Trees and poly-trees: message passing techniques
    - Computationally expensive in most cases, infeasible in some cases
9. Multiply-Connected Networks (MCNs)
- Chains and trees are the simplest probabilistic network models
  - Given any two events in the model, there is one and only one causal path in the network from one of those events to the other
  - This makes inference easier
- In many problems, we need to consider multiple paths of information flow between events
- Example: Cancer network
[Figure: Cancer network over the events Cancer, Chemical Disorders, Brain Tumor, Coma, Headaches]
10. Exact Inference over MCNs
- The inference techniques for singly-connected networks do not work for MCNs
  - They are prone to counting the same information multiple times
- The most popular ways to deal with this problem in MCNs:
  - Clustering Methods
    - Ad-hoc clustering techniques
    - Junction trees
11. Clustering Methods
- Clustering methods transform an MCN into a probabilistically equivalent poly-tree
- Such a transformation is done by merging several events in the MCN into a single compound event in order to break the information flow over multiple paths
- Probabilistic equivalence is guaranteed by computing the joint probability distribution of the events that are merged into a compound event
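12. Ad-Hoc Clustering Example
[Figure: the original network over a, b, c, d, e, and the clustered network over a, bc, d, e, in which b and c are merged into the compound event bc]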
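Merging b and c into the compound event bc requires a joint table for the pair. The sketch below assumes, hypothetically, that a is the only parent of both b and c and that there is no arc between b and c, so that P(b, c | a) = P(b | a) P(c | a). The numbers are invented for illustration.

```python
from itertools import product

B = (True, False)

# Hypothetical CPTs of the events to be merged, indexed as [a][value]
p_b_given_a = {True: {True: 0.8, False: 0.2}, False: {True: 0.1, False: 0.9}}
p_c_given_a = {True: {True: 0.3, False: 0.7}, False: {True: 0.6, False: 0.4}}


def compound_cpt():
    """CPT of the compound event bc: P(b, c | a) = P(b | a) P(c | a)."""
    return {
        a: {(b, c): p_b_given_a[a][b] * p_c_given_a[a][c]
            for b, c in product(B, B)}
        for a in B
    }


cpt_bc = compound_cpt()
for a in B:
    # Each row is a distribution over the four joint values of (b, c)
    print(a, cpt_bc[a], sum(cpt_bc[a].values()))
```

The compound event has four values instead of two, which hints at why the compound tables can grow quickly as more events are merged.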
13. After Clustering
- Once we cluster the events of an MCN, we can use any exact inference algorithm developed for singly-connected networks
- Clustering reduces the size of the network, sometimes exponentially
- However, the computation required for inference is not necessarily reduced
  - Building the compound CPTs may still take exponential time
14. Junction Trees
- The transformation in the clustering example we have discussed is ad hoc
  - We just looked at the network and merged events such that we avoided information flow to the same event through multiple paths
- The motivation behind the junction tree methods is to provide a systematic and efficient way to do clustering
  - Moralization
  - Triangulation
  - Restructuring
  - Belief Update
[Figure: example multiply-connected network over the events a, b, c, d, e, f]
15. Moralization and Triangulation
- Considering the undirected version of the network, marry (i.e., link) the parent nodes that have a common child
- Then triangulate every cycle produced by the marriages
- The objective of moralization and triangulation is to break every longer cycle into cycles of length 3
[Figure: the example network over a, b, c, d, e, f, before and after moralization and triangulation]
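The moralization step can be sketched as follows. The arcs of the slide's example network are not recoverable from the text, so the DAG below is a hypothetical one over the same six events; the child-to-parents encoding and the function name `moralize` are also assumptions. Triangulation is not implemented here.

```python
from itertools import combinations

# Hypothetical DAG over a..f, encoded as child -> list of parents
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["b", "d"],
    "f": ["d"],
}


def moralize(parents):
    """Drop arc directions and marry every pair of parents sharing a child."""
    edges = set()
    for child, ps in parents.items():
        for p in ps:
            edges.add(frozenset((p, child)))      # original arc, undirected
        for p, q in combinations(ps, 2):
            edges.add(frozenset((p, q)))          # marriage between co-parents
    return edges


moral = moralize(parents)
print(sorted("".join(sorted(e)) for e in moral))
```

In this assumed network the only new link is the marriage between b and c, since the co-parents b and d of e are already linked; the resulting graph happens to need no further triangulation.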
16. Restructuring
- Identify all maximal cliques in the network
  - In this example, we have abc, bcd, bde, and df
- Identify the separators between the maximal cliques
  - bc between abc and bcd
  - bd between bcd and bde
  - d between (1) bde and df, and (2) bcd and df
[Figure: the moralized and triangulated network over a, b, c, d, e, f]
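Finding the maximal cliques and their separators can be sketched with a basic Bron-Kerbosch enumeration, a standard clique-listing technique that the slides do not name. The undirected edge list below is an assumption: it is one graph whose maximal cliques are exactly abc, bcd, bde, and df, as listed on the slide.

```python
def maximal_cliques(nbrs):
    """Bron-Kerbosch enumeration (without pivoting) of all maximal cliques."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))   # r cannot be extended: it is maximal
            return
        for v in list(p):
            expand(r | {v}, p & nbrs[v], x & nbrs[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(nbrs), set())
    return found


def separator(clique1, clique2):
    """Events shared by two cliques."""
    return frozenset(clique1) & frozenset(clique2)


# Assumed moralized and triangulated network over a..f
edges = ["ab", "ac", "bc", "bd", "cd", "be", "de", "df"]
nbrs = {}
for u, v in edges:
    nbrs.setdefault(u, set()).add(v)
    nbrs.setdefault(v, set()).add(u)

cliques = maximal_cliques(nbrs)
print(sorted("".join(sorted(c)) for c in cliques))
print("".join(sorted(separator("abc", "bcd"))))
```

The cliques abc and bde also share the single event b, which the slide does not list as a separator.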
17. Restructuring (cont'd)
- Create a new network where the cliques of the original network are compound nodes
[Figure: the network over a, b, c, d, e, f, and the new network over the compound nodes abc, bcd, bde, df]
The generated network is always a poly-tree (or a poly-graph), called a Junction Tree (or a Junction Graph)
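The slides do not say how the links between compound nodes are chosen. One common construction, used here purely as an assumption, is to link cliques by a maximum-weight spanning tree in which the weight of a candidate link is the size of its separator. The sketch applies a Kruskal-style procedure to the four cliques of the example.

```python
from itertools import combinations


def junction_tree(cliques):
    """Maximum-weight spanning tree over cliques; weight = separator size."""
    candidates = sorted(
        ((len(set(c1) & set(c2)), c1, c2)
         for c1, c2 in combinations(cliques, 2) if set(c1) & set(c2)),
        key=lambda t: -t[0],                 # largest separators first
    )
    root = {c: c for c in cliques}           # minimal union-find structure

    def find(c):
        while root[c] != c:
            c = root[c]
        return c

    tree = []
    for _, c1, c2 in candidates:
        r1, r2 = find(c1), find(c2)
        if r1 != r2:                         # link only if it joins two components
            root[r1] = r2
            tree.append((c1, c2, "".join(sorted(set(c1) & set(c2)))))
    return tree


tree = junction_tree(["abc", "bcd", "bde", "df"])
for c1, c2, sep in tree:
    print(c1, "-", sep, "-", c2)
```

Because the separator d is shared by two candidate links, df could equally be attached to bde; the tie is broken here by the order in which the candidate links are listed.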
18. Belief Update in Junction Trees
- The CPTs for the nodes in the junction tree are computed by the cross-products of the CPTs from the original network
  - Example: $P(b \mid a)\,P(c \mid a)\,P(d \mid b, c)$
- Then, the belief update can be done by using exact inference methods for singly-connected trees
[Figure: junction tree over the compound nodes abc, bcd, bde, df]
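As an illustration of building a compound node's table from the original CPTs, the sketch below multiplies P(a), P(b | a), and P(c | a) into a table for the compound node abc, and then sums out a to obtain the table over the separator bc that would be passed towards bcd. Which CPT belongs to which compound node, and all of the numbers, are assumptions made for this example.

```python
from itertools import product

B = (True, False)

# Hypothetical CPTs from the original network, indexed as [a][value]
p_a = {True: 0.3, False: 0.7}
p_b_given_a = {True: {True: 0.8, False: 0.2}, False: {True: 0.1, False: 0.9}}
p_c_given_a = {True: {True: 0.3, False: 0.7}, False: {True: 0.6, False: 0.4}}


def table_abc():
    """Table of the compound node abc: P(a) P(b | a) P(c | a) = P(a, b, c)."""
    return {
        (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]
        for a, b, c in product(B, repeat=3)
    }


def marginal_bc(table):
    """Sum out a to get the table over the separator bc."""
    out = {}
    for (a, b, c), p in table.items():
        out[(b, c)] = out.get((b, c), 0.0) + p
    return out


abc = table_abc()
bc = marginal_bc(abc)
print(len(abc), sum(abc.values()))
print(bc[(True, True)])   # approximately 0.114
```

The table for abc already has eight entries for three Boolean events, so the compound tables grow exponentially with the clique size.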
19. Summary
- We have covered the basics of exact inference in singly- and multiply-connected networks
- Pros: The outcome of the inference is exact probability distributions
- Cons: The computations are generally exponential (either the inference or building the compound CPTs)
- Next Week: Approximate Inference Methods