Title: Homework 3 returned; solutions posted
1. 10/22
- Homework 3 returned; solutions posted
- Homework 4 socket opened
- Project 3 assigned
- Mid-term on Wednesday
- (Optional) Review session Tuesday
2. Conjunctive queries are essentially computing joint distributions on sets of query variables. A special case of computing the full joint on the query variables is finding just the query variable configuration that is most likely given the evidence. There are two special cases here:
- MPE (Most Probable Explanation): most likely assignment to all other variables given the evidence
  - Mostly involves max/product operations
- MAP (Maximum a posteriori): most likely assignment to some variables given the evidence
  - Can involve max/product/sum operations
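The difference between the two queries can be sketched by brute-force enumeration. The example below assumes the Cloudy/Sprinkler/Rain/WetGrass network with commonly used textbook CPT values; the numbers and the function names `mpe` and `map_rain` are illustrative assumptions, not taken from the slides.

```python
from itertools import product

# Assumed CPT values for the sprinkler network (illustrative only).
P_C = 0.5                                   # P(C = true)
P_S = {True: 0.1, False: 0.5}               # P(S = true | C)
P_R = {True: 0.8, False: 0.2}               # P(R = true | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}   # P(W = true | S, R)

def bern(p_true, value):
    """Probability that a binary variable takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    """Full joint P(c, s, r, w) as a product of CPT entries."""
    return (bern(P_C, c) * bern(P_S[c], s) *
            bern(P_R[c], r) * bern(P_W[(s, r)], w))

def mpe(w):
    """MPE: maximize the joint over ALL non-evidence variables (max/product)."""
    return max(product([True, False], repeat=3),
               key=lambda csr: joint(*csr, w))

def map_rain(w):
    """MAP over Rain only: sum out C and S, then maximize (max/product/sum)."""
    def score(r):
        return sum(joint(c, s, r, w)
                   for c, s in product([True, False], repeat=2))
    return max([True, False], key=score)

print("MPE (C, S, R) given W=true:", mpe(True))
print("MAP value of R given W=true:", map_rain(True))
```

MPE only needs max and product, while MAP has to sum out the variables that are neither queried nor observed before maximizing.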
3. Overview of BN Inference Algorithms
TONS OF APPROACHES
- Exact Inference
  - Complexity
    - NP-hard (actually #P-complete, since we count models)
    - Polynomial for singly connected networks (one path between each pair of nodes)
  - Algorithms
    - Enumeration
    - Variable elimination
      - Avoids the redundant computations of enumeration
    - Many others, such as message-passing algorithms, constraint-propagation-based algorithms, etc.
- Approximate Inference
  - Complexity
    - NP-hard for both absolute and relative approximation
  - Algorithms
    - Based on stochastic simulation
      - Sampling from empty networks
      - Rejection sampling
      - Likelihood weighting
      - MCMC, and many more
4. Network Topology & Complexity of Inference
- Multiply-connected networks: inference is NP-hard
  - (Figure: Cloudy, Sprinklers, Rain, Wetgrass)
  - Can be converted to singly connected (by merging nodes)
- Singly connected networks (polytrees: at most one path between any pair of nodes): inference is polynomial
  - (Figure: Cloudy, Sprinklers+Rain (takes 4 values, 2x2), Wetgrass)
- The size of the merged network can be exponentially larger (so polynomial inference on that network isn't exactly God's gift)
5. Examples of singly connected networks include Markov chains and hidden Markov models
9. $f_A(a,b,e)\, f_J(a)\, f_M(a)$
10. Complexity depends on the size of the largest factor, which in turn depends on the order in which variables are eliminated.
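The effect of elimination order can be sketched by tracking only factor scopes (which variables each factor mentions), without any numbers. The star-shaped network below (a hub `H` with children `X1`..`X4`) and the helper `max_factor_width` are hypothetical, chosen just to make the contrast visible.

```python
def max_factor_width(scopes, order):
    """Return the largest number of variables appearing in any factor
    (initial or intermediate product) when eliminating `order` in sequence."""
    factors = [frozenset(s) for s in scopes]
    widest = max(len(f) for f in factors)
    for var in order:
        touching = [f for f in factors if var in f]
        rest = [f for f in factors if var not in f]
        if not touching:
            continue
        # Multiplying all factors that mention `var` yields a factor
        # over the union of their scopes.
        product_scope = frozenset().union(*touching)
        widest = max(widest, len(product_scope))
        # Summing out `var` removes it from the scope.
        factors = rest + [product_scope - {var}]
    return widest

# Hypothetical star network: P(H) and P(Xi | H) for i = 1..4; query is X4.
scopes = [{"H"}] + [{"H", f"X{i}"} for i in range(1, 5)]

leaves_first = ["X1", "X2", "X3", "H"]
hub_first = ["H", "X1", "X2", "X3"]

print(max_factor_width(scopes, leaves_first))  # largest factor has 2 variables
print(max_factor_width(scopes, hub_first))     # largest factor has 5 variables
```

Eliminating the hub first forces a single factor over every child, whose table size grows exponentially with the number of children; eliminating the leaves first never builds a factor over more than two variables.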
12. Variable Elimination and Irrelevant Variables
- Suppose we asked the query P(J | A=t)
  - This is the probability that John calls given that the Alarm went off
  - We know that this is a simple lookup into the CPT in our Bayes net.
  - But the variable elimination algorithm is going to sum over the three other variables unnecessarily
  - In those cases, the factors will be degenerate (will sum to 1; see next slide)
  - This problem can be even more prominent if we had many other variables in the network
- Qn: How can we make variable elimination wake up and avoid this unnecessary work?
- General answer is to
  - (a) identify variables that are irrelevant given the query and evidence
    - In P(J | A), we should be able to see that e, b, m are irrelevant and remove them
  - (b) remove the irrelevant variables from the network
- A variable v is irrelevant for a query P(X | E) if X ⊥ v | E (i.e., X is conditionally independent of v given E).
  - We can use Bayes Ball or d-separation (DSEP) notions to figure out irrelevant variables v
  - There are a couple of easier sufficient conditions for irrelevance (both of which are special cases of Bayes Ball/d-separation).
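As a sanity check on the claim that P(J | A=t) is just a CPT lookup, the sketch below computes it the long way, by summing the full joint over the other variables. The CPT numbers are the commonly used textbook values for the burglary network and are assumed here; the slides do not list them.

```python
from itertools import product

# Assumed CPT values for the burglary/alarm network (illustrative only).
P_B, P_E = 0.001, 0.002                       # P(B = true), P(E = true)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A = true | B, E)
P_J = {True: 0.90, False: 0.05}               # P(J = true | A)
P_M = {True: 0.70, False: 0.01}               # P(M = true | A)

def bern(p_true, value):
    """Probability that a binary variable takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """Full joint as a product of the five CPT entries."""
    return (bern(P_B, b) * bern(P_E, e) * bern(P_A[(b, e)], a) *
            bern(P_J[a], j) * bern(P_M[a], m))

def p_j_given_a(j, a):
    """P(J=j | A=a) by brute-force summation over B, E, M."""
    tf = [True, False]
    numerator = sum(joint(b, e, a, j, m) for b, e, m in product(tf, repeat=3))
    evidence = sum(joint(b, e, a, jj, m)
                   for b, e, jj, m in product(tf, repeat=4))
    return numerator / evidence

print(p_j_given_a(True, True))   # agrees with the CPT entry P_J[True]
```

The sums over B, E, and M cancel between numerator and denominator, which is why that work is unnecessary for this query.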
13. Sufficient Condition 1
In general, any leaf node that is not a query or evidence variable is irrelevant (and can be removed). Once it is removed, others may be seen to be irrelevant.
Irrelevant variables can be dropped from the network before starting the query off.
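A minimal sketch of this pruning rule, assuming the network is given as a dictionary from each node to its list of parents. The function name `prune_leaves` and the single-letter node names for the burglary network are illustrative choices.

```python
def prune_leaves(parents, keep):
    """Repeatedly drop leaf nodes that are neither query nor evidence.

    parents: dict mapping node -> list of parent nodes
    keep:    set of query and evidence variables
    Returns the set of nodes that remain after pruning.
    """
    nodes = set(parents)
    while True:
        # A node is a leaf if no remaining node lists it as a parent.
        has_child = {p for n in nodes for p in parents[n] if p in nodes}
        removable = {n for n in nodes if n not in has_child and n not in keep}
        if not removable:
            return nodes
        nodes -= removable

# Burglary network structure: B -> A <- E, A -> J, A -> M
burglary = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}

print(prune_leaves(burglary, {"J", "B"}))  # M is dropped
print(prune_leaves(burglary, {"B"}))       # J and M go first, then A, then E
```

The second call shows the cascading effect: once J and M are removed, A becomes a leaf and can be removed as well.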
14. Sufficient Condition 2
Note that condition 2 doesn't subsume condition 1. In particular, it won't allow us to say that M is irrelevant for the query P(J | B).
15. Notice that sampling methods could in general be used even when we don't know the Bayes net (and are just observing the world)! We should strive to make the sampling more efficient given that we know the Bayes net.
23. Generating a Sample from the Network
<C, S, R, W>
Network → Samples → Joint distribution
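Generating one sample <C, S, R, W> amounts to sampling each variable in topological order, conditioned on the values already drawn for its parents. The sketch below assumes commonly used textbook CPT values for this network; the numbers and the name `sample_network` are illustrative.

```python
import random

# Assumed P(W = true | S, R) table (illustrative values).
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}

def sample_network(rng):
    """Draw one sample <C, S, R, W>, parents before children."""
    c = rng.random() < 0.5                     # P(C = true)
    s = rng.random() < (0.1 if c else 0.5)     # P(S = true | C)
    r = rng.random() < (0.8 if c else 0.2)     # P(R = true | C)
    w = rng.random() < P_W[(s, r)]             # P(W = true | S, R)
    return c, s, r, w

rng = random.Random(0)
samples = [sample_network(rng) for _ in range(20000)]

# Relative frequencies of the samples approximate the joint distribution.
frac_cloudy = sum(c for c, _, _, _ in samples) / len(samples)
print("Estimated P(C=true):", frac_cloudy)
```

Counting how often each configuration appears among the samples gives an estimate of the joint, from which any query can be answered approximately.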
25. That is, the rejection sampling method doesn't really use the Bayes network that much.
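A minimal sketch of rejection sampling makes the point concrete: the network is used only to generate samples, and the evidence is used only afterwards, as a filter. The CPT values and function names below are assumptions for illustration.

```python
import random

# Assumed P(W = true | S, R) table (illustrative values).
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}
C, S, R, W = 0, 1, 2, 3   # positions of the variables in a sample tuple

def prior_sample(rng):
    """Draw <C, S, R, W> from the network, ignoring any evidence."""
    c = rng.random() < 0.5
    s = rng.random() < (0.1 if c else 0.5)
    r = rng.random() < (0.8 if c else 0.2)
    w = rng.random() < P_W[(s, r)]
    return c, s, r, w

def rejection_sample(query, evidence, n, rng):
    """Estimate P(query = true | evidence); also return how many samples survived."""
    accepted = hits = 0
    for _ in range(n):
        sample = prior_sample(rng)
        # Throw away every sample that disagrees with the evidence.
        if all(sample[i] == v for i, v in evidence.items()):
            accepted += 1
            hits += sample[query]
    estimate = hits / accepted if accepted else float("nan")
    return estimate, accepted

estimate, accepted = rejection_sample(R, {S: True}, 50000, random.Random(1))
print("Estimated P(R=true | S=true):", estimate, "from", accepted, "accepted samples")
```

With these assumed numbers, P(S=true) is 0.3, so roughly 70% of the generated samples are discarded; the waste gets worse as the evidence becomes less likely.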
27. Notice that to attach the likelihood to the evidence, we are using the CPTs in the Bayes net. (Model-free empirical observation, in contrast, either gives you a sample or not; we can't get fractional samples.)
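Likelihood weighting can be sketched as follows: evidence variables are fixed rather than sampled, and each sample is weighted by the CPT probability of the evidence given its sampled parents. The CPT values and the helper names are assumptions for illustration.

```python
import random

# Assumed P(W = true | S, R) table (illustrative values).
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.0}

def prob_true(var, values):
    """P(var = true | parents), looked up from the assumed CPTs."""
    if var == "C":
        return 0.5
    if var == "S":
        return 0.1 if values["C"] else 0.5
    if var == "R":
        return 0.8 if values["C"] else 0.2
    return P_W[(values["S"], values["R"])]   # var == "W"

def weighted_sample(evidence, rng):
    """Return one sample consistent with the evidence, plus its weight."""
    values, weight = {}, 1.0
    for var in ("C", "S", "R", "W"):          # topological order
        p = prob_true(var, values)
        if var in evidence:
            values[var] = evidence[var]       # fix the evidence value...
            weight *= p if evidence[var] else 1.0 - p   # ...and weight by its likelihood
        else:
            values[var] = rng.random() < p    # sample non-evidence variables
    return values, weight

def likelihood_weighting(query, evidence, n, rng):
    """Estimate P(query = true | evidence) as a ratio of summed weights."""
    total = hit = 0.0
    for _ in range(n):
        values, weight = weighted_sample(evidence, rng)
        total += weight
        if values[query]:
            hit += weight
    return hit / total

estimate = likelihood_weighting("R", {"S": True, "W": True}, 50000, random.Random(2))
print("Estimated P(R=true | S=true, W=true):", estimate)
```

No sample is thrown away, but each one counts fractionally, and computing that fraction is exactly where the CPTs are needed.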
37. MCMC not covered
40. Note that the other parents of z_j are part of the Markov blanket
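For reference, a node's Markov blanket consists of its parents, its children, and its children's other parents. A minimal sketch, assuming the network is stored as a dictionary from node to parent list and using the usual sprinkler network structure as an assumed example:

```python
def markov_blanket(parents, node):
    """Parents, children, and children's other parents of `node`."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child])   # co-parents of each child
    blanket.discard(node)                # the node itself is not in its blanket
    return blanket

# Assumed structure: Cloudy -> Sprinkler, Cloudy -> Rain,
# Sprinkler -> WetGrass, Rain -> WetGrass
sprinkler_net = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "WetGrass": ["Sprinkler", "Rain"],
}

print(markov_blanket(sprinkler_net, "Sprinkler"))
```

Rain appears in the blanket of Sprinkler only because the two share the child WetGrass.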
42. Case Study: Pathfinder System
- Domain: lymph node diseases
  - Deals with 60 diseases and 100 disease findings
- Versions
  - Pathfinder I: a rule-based system with logical reasoning
  - Pathfinder II: tried a variety of approaches for uncertainty
    - Simple Bayes reasoning outperformed
  - Pathfinder III: simple Bayes reasoning, but reassessed probabilities
  - Pathfinder IV: a Bayesian network was used to handle a variety of conditional dependencies.
    - Deciding vocabulary: 8 hours
    - Devising the topology of the network: 35 hours
    - Assessing the (14,000) probabilities: 40 hours
    - Physician experts liked assessing causal probabilities
- Evaluation: 53 referral cases
  - Pathfinder III: 7.9/10
  - Pathfinder IV: 8.9/10. Saves one additional life in every 1000 cases!
- A more recent comparison shows that Pathfinder now outperforms the experts who helped design it!!