Title: Bayesian networks practice
1. Bayesian networks practice
2. Semantics
We order the variables according to the topological order of the given Bayes net.
Suppose we have the variables X1, …, Xn. The probability for them to have the values x1, …, xn respectively is P(xn, …, x1).
P(xn, …, x1) is short for P(Xn = xn ∧ … ∧ X1 = x1).
- E.g.,
- P(j ∧ m ∧ a ∧ ¬b ∧ ¬e)
- = P(j | a) P(m | a) P(a | ¬b, ¬e) P(¬b) P(¬e)
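This factorization can be checked numerically. A minimal Python sketch, assuming the standard textbook CPT values for the burglary network (P(b) = 0.001, P(e) = 0.002, P(a | ¬b, ¬e) = 0.001, P(j | a) = 0.90, P(m | a) = 0.70), which are not listed on these slides:

```python
# P(j, m, a, ~b, ~e) = P(j|a) P(m|a) P(a|~b,~e) P(~b) P(~e)
# CPT values below are the usual textbook numbers (assumed, not from the slides)
p_j_given_a = 0.90       # P(JohnCalls=true | Alarm=true)
p_m_given_a = 0.70       # P(MaryCalls=true | Alarm=true)
p_a_given_nb_ne = 0.001  # P(Alarm=true | Burglary=false, Earthquake=false)
p_nb = 1 - 0.001         # P(Burglary=false)
p_ne = 1 - 0.002         # P(Earthquake=false)

# Multiply the factors in the order given by the chain rule above
joint = p_j_given_a * p_m_given_a * p_a_given_nb_ne * p_nb * p_ne
print(round(joint, 8))  # 0.00062811
```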
3. Inference in Bayesian Networks
- The basic task is to compute the posterior probability of a query variable, given some observed event - that is, some assignment of values to a set of evidence variables.
- Notation:
  - X denotes the query variable.
  - E denotes the set of evidence variables E1, …, Em, and e is a particular event, i.e. an assignment to the variables in E.
  - Y denotes the set of the remaining variables (hidden variables).
- A typical query asks for the posterior probability P(x | e1, …, em).
- E.g., we could ask: What's the probability of a burglary if both Mary and John call, P(burglary | johncalls, marycalls)?
4. Classification
- We compute and compare the following posteriors, one per class value.
- However, how do we compute them?
- What about the hidden variables Y1, …, Yk?
5. Inference by enumeration
Example: P(burglary | johncalls, marycalls)?
(Abbrev. P(b | j, m))
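Inference by enumeration can be sketched directly: sum the joint probability over the hidden variables (earthquake and alarm) for each value of burglary, then normalize. The CPT values below are the standard textbook ones, assumed here since the slides do not list them:

```python
from itertools import product

# Assumed CPTs for the burglary network (standard textbook values)
P_b = {True: 0.001, False: 0.999}
P_e = {True: 0.002, False: 0.998}
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}  # P(a=true | b, e)
P_j = {True: 0.90, False: 0.05}  # P(j=true | a)
P_m = {True: 0.70, False: 0.01}  # P(m=true | a)

def joint(b, e, a, j, m):
    """Full joint via the chain-rule factorization of the network."""
    pa = P_a[(b, e)] if a else 1 - P_a[(b, e)]
    pj = P_j[a] if j else 1 - P_j[a]
    pm = P_m[a] if m else 1 - P_m[a]
    return P_b[b] * P_e[e] * pa * pj * pm

# P(b | j, m): enumerate the hidden variables e and a, then normalize
scores = {}
for b in (True, False):
    scores[b] = sum(joint(b, e, a, True, True)
                    for e, a in product((True, False), repeat=2))
norm = sum(scores.values())
print(round(scores[True] / norm, 4))  # 0.2842
```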
6. Another example
Once the right topology has been found, the probability table associated with each node is determined. Estimating such probabilities is fairly straightforward and is similar to the approach used by naïve Bayes classifiers.
7. High Blood Pressure
- Suppose we learn that the new patient has high blood pressure.
- What's the probability he has heart disease under this condition?
8. High Blood Pressure (Cont'd)
9. High Blood Pressure (Cont'd)
10. High Blood Pressure, Healthy Diet, and Regular Exercise
11. High Blood Pressure, Healthy Diet, and Regular Exercise (Cont'd)
12. High Blood Pressure, Healthy Diet, and Regular Exercise (Cont'd)
The model therefore suggests that eating
healthily and exercising regularly may reduce a
person's risk of getting heart disease.
13. Weather data
What is the Bayesian Network corresponding to
Naïve Bayes?
14. Effects and Causes vs. Evidence and Class
- Why does Naïve Bayes have this graph?
- Because in Naïve Bayes we compute
- P(play=yes | E) =
- P(Outlook=Sunny | play=yes)
- × P(Temp=Cool | play=yes)
- × P(Humidity=High | play=yes)
- × P(Windy=True | play=yes)
- × P(play=yes) / P(E)
- i.e., we are interested in the probabilities of our evidence observations given the class.
- Of course, play isn't a cause of outlook, temperature, humidity, and windy.
- However, play is the class, and knowing that it has a certain value will influence the observational evidence probability values.
- For example, if play=yes, and we know that the playing happens indoors, then it is more probable (than without this class information) that the outlook is observed to be rainy.
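The Naïve Bayes product above can be checked with a short Python sketch, using the raw counts of the well-known 14-day weather dataset (e.g., 2 of the 9 play=yes days are sunny; these counts are assumed from that standard dataset):

```python
# Naive Bayes scores on the standard 14-day weather data
# (raw relative-frequency estimates, no smoothing)
p_yes = 9 / 14
likelihoods_yes = (2 / 9,  # P(Outlook=Sunny  | play=yes)
                   3 / 9,  # P(Temp=Cool      | play=yes)
                   3 / 9,  # P(Humidity=High  | play=yes)
                   3 / 9)  # P(Windy=True     | play=yes)
score_yes = p_yes
for p in likelihoods_yes:
    score_yes *= p

p_no = 5 / 14
likelihoods_no = (3 / 5, 1 / 5, 4 / 5, 3 / 5)  # same evidence given play=no
score_no = p_no
for p in likelihoods_no:
    score_no *= p

# Normalizing replaces the unknown 1/P(E) factor
print(round(score_yes / (score_yes + score_no), 3))  # 0.205
```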
15. Right or Wrong Topology?
- In general, there is no right or wrong graph topology.
- Of course, the probabilities calculated from the data will be different for different graphs.
- Some graphs will induce better classifiers than others.
- If you reverse the arrows in the previous figure, you get a pure causal graph,
- whose induced classifier might have a better or worse estimated error (through cross-validation) than the Naïve Bayes one, depending on the data.
- If the topology is constructed manually, we (humans) tend to prefer the causal direction.
- In domains such as medicine, the graphs are usually less complex in the causal direction.
16. Weka suggestion
How does Weka find the shape of the graph? It fixes an order of attributes (variables) and then adds and removes arcs until it gets the smallest estimated error (through cross-validation). By default it starts with a Naïve Bayes network. It also maintains a score of graph complexity, trying to keep the complexity low.
18. Laplace correction: better change it to 1, to be compatible with the counter initialization in Naïve Bayes.
It is going to start with a Naïve Bayes graph and then try to add/remove arcs.
You can change it to 2, for example. If you do, then the max number of parents for a node will be 2.
19. Play probability table
Based on the data:
P(play=yes) = 9/14, P(play=no) = 5/14
Let's correct with Laplace:
P(play=yes) = (9+1)/(14+2) = .625
P(play=no) = (5+1)/(14+2) = .375
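The same correction can be written as a small helper; a minimal sketch, where k is assumed to be the number of distinct values the variable can take:

```python
def laplace(count, total, k):
    """Laplace-corrected estimate: (count + 1) / (total + k),
    where k is the number of values the variable can take."""
    return (count + 1) / (total + k)

# play has 2 values (yes/no); 9 yes days and 5 no days out of 14
p_yes = laplace(9, 14, 2)  # (9+1)/(14+2) = 0.625
p_no = laplace(5, 14, 2)   # (5+1)/(14+2) = 0.375
print(p_yes, p_no)  # 0.625 0.375 -- still sums to 1
```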
20. Outlook probability table
Based on the data:
P(outlook=sunny | play=yes) = (2+1)/(9+3) = .25
P(outlook=overcast | play=yes) = (4+1)/(9+3) = .417
P(outlook=rainy | play=yes) = (3+1)/(9+3) = .333
P(outlook=sunny | play=no) = (3+1)/(5+3) = .5
P(outlook=overcast | play=no) = (0+1)/(5+3) = .125
P(outlook=rainy | play=no) = (2+1)/(5+3) = .375
21. Windy probability table
Based on the data, let's find the conditional probabilities for windy:
P(windy=true | play=yes, outlook=sunny) = (1+1)/(2+2) = .5
22. Windy probability table
Based on the data:
P(windy=true | play=yes, outlook=sunny) = (1+1)/(2+2) = .5
P(windy=true | play=yes, outlook=overcast) = 0.5
P(windy=true | play=yes, outlook=rainy) = 0.2
P(windy=true | play=no, outlook=sunny) = 0.4
P(windy=true | play=no, outlook=overcast) = 0.5
P(windy=true | play=no, outlook=rainy) = 0.75
23. Final figure
Classify it:
24. Classification I
Classify it:
P(play=yes | outlook=sunny, temp=cool, humidity=high, windy=true)
= α P(play=yes)
  × P(outlook=sunny | play=yes)
  × P(temp=cool | play=yes, outlook=sunny)
  × P(humidity=high | play=yes, temp=cool)
  × P(windy=true | play=yes, outlook=sunny)
= α × 0.625 × 0.25 × 0.4 × 0.2 × 0.5
= α × 0.00625
25. Classification II
Classify it:
P(play=no | outlook=sunny, temp=cool, humidity=high, windy=true)
= α P(play=no)
  × P(outlook=sunny | play=no)
  × P(temp=cool | play=no, outlook=sunny)
  × P(humidity=high | play=no, temp=cool)
  × P(windy=true | play=no, outlook=sunny)
= α × 0.375 × 0.5 × 0.167 × 0.333 × 0.4
= α × 0.00417
26. Classification III
Classify it:
P(play=yes | outlook=sunny, temp=cool, humidity=high, windy=true) = α × 0.00625
P(play=no | outlook=sunny, temp=cool, humidity=high, windy=true) = α × 0.00417
α = 1/(0.00625 + 0.00417) = 95.969
P(play=yes | outlook=sunny, temp=cool, humidity=high, windy=true)
= 95.969 × 0.00625 = 0.60
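The normalization step can be reproduced in a few lines of Python, using the table values from the previous two slides:

```python
# Posterior for the augmented network on slides 24-26,
# built from the Laplace-corrected tables computed earlier
score_yes = 0.625 * 0.25 * 0.4 * 0.2 * 0.5     # = 0.00625
score_no = 0.375 * 0.5 * 0.167 * 0.333 * 0.4   # ~ 0.00417

# alpha makes the two scores sum to 1
alpha = 1 / (score_yes + score_no)  # ~ 95.96 (the slide, using the rounded
                                    # 0.00417, gets 95.969)
print(round(alpha * score_yes, 2))  # 0.6
```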
27. Classification IV (missing values or hidden variables)
P(play=yes | temp=cool, humidity=high, windy=true)
= α Σoutlook P(play=yes)
  × P(outlook | play=yes)
  × P(temp=cool | play=yes, outlook)
  × P(humidity=high | play=yes, temp=cool)
  × P(windy=true | play=yes, outlook)
(next slide)
28. Classification V (missing values or hidden variables)
P(play=yes | temp=cool, humidity=high, windy=true)
= α Σoutlook P(play=yes) × P(outlook | play=yes) × P(temp=cool | play=yes, outlook)
  × P(humidity=high | play=yes, temp=cool) × P(windy=true | play=yes, outlook)
= α [ P(play=yes) × P(outlook=sunny | play=yes) × P(temp=cool | play=yes, outlook=sunny)
      × P(humidity=high | play=yes, temp=cool) × P(windy=true | play=yes, outlook=sunny)
    + P(play=yes) × P(outlook=overcast | play=yes) × P(temp=cool | play=yes, outlook=overcast)
      × P(humidity=high | play=yes, temp=cool) × P(windy=true | play=yes, outlook=overcast)
    + P(play=yes) × P(outlook=rainy | play=yes) × P(temp=cool | play=yes, outlook=rainy)
      × P(humidity=high | play=yes, temp=cool) × P(windy=true | play=yes, outlook=rainy) ]
= α (0.625 × 0.25 × 0.4 × 0.2 × 0.5
   + 0.625 × 0.417 × 0.286 × 0.2 × 0.5
   + 0.625 × 0.333 × 0.333 × 0.2 × 0.2)
= α × 0.01645
29. Classification VI (missing values or hidden variables)
P(play=no | temp=cool, humidity=high, windy=true)
= α Σoutlook P(play=no) × P(outlook | play=no) × P(temp=cool | play=no, outlook)
  × P(humidity=high | play=no, temp=cool) × P(windy=true | play=no, outlook)
= α [ P(play=no) × P(outlook=sunny | play=no) × P(temp=cool | play=no, outlook=sunny)
      × P(humidity=high | play=no, temp=cool) × P(windy=true | play=no, outlook=sunny)
    + P(play=no) × P(outlook=overcast | play=no) × P(temp=cool | play=no, outlook=overcast)
      × P(humidity=high | play=no, temp=cool) × P(windy=true | play=no, outlook=overcast)
    + P(play=no) × P(outlook=rainy | play=no) × P(temp=cool | play=no, outlook=rainy)
      × P(humidity=high | play=no, temp=cool) × P(windy=true | play=no, outlook=rainy) ]
= α (0.375 × 0.5 × 0.167 × 0.333 × 0.4
   + 0.375 × 0.125 × 0.333 × 0.333 × 0.5
   + 0.375 × 0.375 × 0.4 × 0.333 × 0.75)
= α × 0.0208
30. Classification VII (missing values or hidden variables)
P(play=yes | temp=cool, humidity=high, windy=true) = α × 0.01645
P(play=no | temp=cool, humidity=high, windy=true) = α × 0.0208
α = 1/(0.01645 + 0.0208) = 26.846
P(play=yes | temp=cool, humidity=high, windy=true) = 26.846 × 0.01645 = 0.44
P(play=no | temp=cool, humidity=high, windy=true) = 26.846 × 0.0208 = 0.56
I.e. P(play=yes | temp=cool, humidity=high, windy=true) is 44% and P(play=no | temp=cool, humidity=high, windy=true) is 56%. So, we predict play=no.
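The whole marginalization can be reproduced in Python, using the Laplace-corrected table values from the earlier slides:

```python
# Marginalizing out the unobserved outlook variable (slides 28-30):
# sum the per-outlook products, then normalize across the two classes
yes_terms = [
    0.625 * 0.25 * 0.4 * 0.2 * 0.5,      # outlook = sunny
    0.625 * 0.417 * 0.286 * 0.2 * 0.5,   # outlook = overcast
    0.625 * 0.333 * 0.333 * 0.2 * 0.2,   # outlook = rainy
]
no_terms = [
    0.375 * 0.5 * 0.167 * 0.333 * 0.4,     # outlook = sunny
    0.375 * 0.125 * 0.333 * 0.333 * 0.5,   # outlook = overcast
    0.375 * 0.375 * 0.4 * 0.333 * 0.75,    # outlook = rainy
]
score_yes, score_no = sum(yes_terms), sum(no_terms)
alpha = 1 / (score_yes + score_no)
print(round(alpha * score_yes, 2), round(alpha * score_no, 2))  # 0.44 0.56
```

Since 0.56 > 0.44, the prediction is play=no, matching the slide.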