Title: Representation and Inference of Probabilistic Knowledge
1 Representation and Inference of Probabilistic Knowledge
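2 Conditional Probability
- Let $A$ and $B$ be two events. The conditional probability $P(A \mid B)$ is defined to be
$P(A \mid B) = P(A, B) / P(B)$, provided $P(B) > 0$.
- Bayes theorem
- $P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$
- Multiplication rule
- $P(A_1, A_2, \ldots, A_n) = P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1, A_2) \cdots P(A_n \mid A_1, A_2, \ldots, A_{n-1})$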
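Both identities can be checked numerically on a small joint distribution. The sketch below is only an illustration: the three binary events, the weights, and the helper names `prob` and `cond` are assumptions, not part of the slides.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over three binary events (A, B, C);
# the weights are made up for illustration and are normalised to sum to 1.
weights = [1, 2, 3, 4, 5, 6, 7, 8]
joint = {
    outcome: Fraction(w, sum(weights))
    for outcome, w in zip(product([0, 1], repeat=3), weights)
}

NAMES = ("a", "b", "c")

def prob(**fixed):
    """Marginal probability that the named events take the given values."""
    return sum(
        p for outcome, p in joint.items()
        if all(outcome[NAMES.index(k)] == v for k, v in fixed.items())
    )

def cond(target, given):
    """Conditional probability P(target | given) = P(target, given) / P(given)."""
    return prob(**target, **given) / prob(**given)

# Bayes theorem: P(A|B) P(B) = P(B|A) P(A)
lhs = cond({"a": 1}, {"b": 1}) * prob(b=1)
rhs = cond({"b": 1}, {"a": 1}) * prob(a=1)

# Multiplication rule: P(A,B,C) = P(A) P(B|A) P(C|A,B)
chain = prob(a=1) * cond({"b": 1}, {"a": 1}) * cond({"c": 1}, {"a": 1, "b": 1})
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a floating-point tolerance.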
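3
- Let $A$ and $B_1, B_2, \ldots, B_n$ be events defined on the same space, and let $B_1, B_2, \ldots, B_n$ satisfy the following 3 conditions:
  - $B_i \cap B_j = \emptyset$ for all $i \neq j$;
  - $B_1 \cup B_2 \cup \cdots \cup B_n$ is the whole sample space;
  - $P(B_i) > 0$ for every $i$.
- Then, we have
  - (1) $P(A) = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)$
  - (2) $P(B_j \mid A) = \dfrac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)}$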
4
- Proof of (1)
- Proof of (2)
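The proofs themselves are not in the transcript. One standard derivation, assuming that (1) is the law of total probability and (2) is the Bayes formula over events $B_1, \ldots, B_n$ that are mutually exclusive, cover the whole sample space, and have positive probability, can be sketched as:

```latex
\begin{align*}
P(A) &= P\Bigl(\bigcup_{i=1}^{n} (A \cap B_i)\Bigr)
      = \sum_{i=1}^{n} P(A \cap B_i)
      = \sum_{i=1}^{n} P(A \mid B_i)\,P(B_i), \tag{1}\\
P(B_j \mid A) &= \frac{P(A \cap B_j)}{P(A)}
      = \frac{P(A \mid B_j)\,P(B_j)}{\sum_{i=1}^{n} P(A \mid B_i)\,P(B_i)}. \tag{2}
\end{align*}
```

The first line uses the fact that the sets $A \cap B_i$ are disjoint and their union is $A$; the second line applies the definition of conditional probability and then substitutes (1) in the denominator, which requires $P(A) > 0$.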
5 Conditional Independence
- Let $A$, $B$, $C$ be events defined on the same space.
- We say that $A$ and $B$ are conditionally independent given $C$ if and only if
$P(A \mid B, C) = P(A \mid C)$
- Another necessary and sufficient condition of conditional independence is
$P(A, B \mid C) = P(A \mid C)\,P(B \mid C)$
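The link between the two conditions follows from the multiplication rule applied inside the conditioning on $C$. A short sketch, assuming $P(B, C) > 0$ so that every conditional probability is defined:

```latex
\begin{align*}
P(A, B \mid C) &= P(A \mid B, C)\,P(B \mid C)
   && \text{(multiplication rule, given } C\text{)} \\
 &= P(A \mid C)\,P(B \mid C)
   && \text{if } P(A \mid B, C) = P(A \mid C).
\end{align*}
```

Conversely, dividing $P(A, B \mid C) = P(A \mid C)\,P(B \mid C)$ by $P(B \mid C)$ gives $P(A \mid B, C) = P(A \mid C)$, so each condition implies the other.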
8 Conditionally Independent Random Variables
- Two random variables $X$ and $Y$ are said to be conditionally independent given random variable $Z$ if
$\mathrm{Prob}(X = x, Y = y \mid Z = z) = \mathrm{Prob}(X = x \mid Z = z)\,\mathrm{Prob}(Y = y \mid Z = z)$
for all possible combinations $(x, y, z)$.
- We will use $P(x, y \mid z)$ to denote the probability $\mathrm{Prob}(X = x, Y = y \mid Z = z)$.
9 Theorems Regarding Conditionally Independent Random Variables
- If $P(x \mid y, z, w) = P(x \mid w)$ for all possible combinations $(w, x, y, z)$, then $P(x \mid y, w) = P(x \mid w)$.
- Proof
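The proof is not in the transcript. One way to obtain the result is to marginalise over $z$, as sketched here (a reconstruction, not necessarily the argument given on the slide):

```latex
\begin{align*}
P(x \mid y, w) &= \sum_{z} P(x, z \mid y, w) \\
  &= \sum_{z} P(x \mid y, z, w)\,P(z \mid y, w) \\
  &= P(x \mid w) \sum_{z} P(z \mid y, w) = P(x \mid w).
\end{align*}
```

The third step uses the hypothesis $P(x \mid y, z, w) = P(x \mid w)$, and the last step uses the fact that $P(z \mid y, w)$ sums to 1 over $z$.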
10 Definition of the Bayesian Network
- A Bayesian network of a set of random variables $X_1, X_2, \ldots, X_n$ employs an acyclic directed network
along with associated conditional probability
tables to record the essential information for
computing the probability of every possible
instance $\langle X_1 = s_1, X_2 = s_2, \ldots, X_n = s_n \rangle$.
11 Continued
- In a Bayesian network, every random variable is
represented by a node. Every node is associated
with a conditional probability table that
specifies the conditional probability
distribution of the random variable.
12 Theoretical Background of the Bayesian Network
- According to the properties of conditional probability,
$P(s_1, s_2, \ldots, s_n) = \prod_{k=1}^{n} P(s_k \mid s_1, s_2, \ldots, s_{k-1})$
- Therefore, by recording all the probability distribution functions $P(X_k \mid X_1, X_2, \ldots, X_{k-1})$, we
can compute the probability of every instance $P(s_1, s_2, \ldots, s_n)$.
13 An Example of the Bayesian Network
[Figure: network over the following nodes]
- $X_1$: Family history of high blood pressure
- $X_2$: Healthy diet
- $X_3$: Doing exercise regularly
- $X_4$: High blood pressure at age of 50
- $X_5$: Overweight at age of 50
- $X_6$: Die due to heart attack
- $X_7$: Die due to stroke
14 An Example of a Complete Bayesian Network
[Figure: complete Bayesian network over $X_1$, $X_2$, $X_3$, $X_4$]
15 Continued
- Let $d_k$ denote the number of possible outcomes of random variable $X_k$. Then, the conditional
probability table of $X_k$ has $\prod_{i=1}^{k-1} d_i$ rows
and $d_k - 1$ columns, or $(d_k - 1)\prod_{i=1}^{k-1} d_i$
entries in total.
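The table sizes for a complete network can be tabulated with a few lines of code. This is a minimal sketch, assuming that every $X_k$ has all of $X_1, \ldots, X_{k-1}$ as parents and that one column is dropped because each row sums to 1; the function name `cpt_shape_complete` is hypothetical.

```python
from math import prod

def cpt_shape_complete(d, k):
    """Rows, columns and entries of the CPT of X_k (1-based) when all of
    X_1..X_{k-1} are its parents; d[i-1] is the number of outcomes of X_i."""
    rows = prod(d[:k - 1])      # one row per joint instance of X_1..X_{k-1}
    cols = d[k - 1] - 1         # the last column is implied: each row sums to 1
    return rows, cols, rows * cols

d = [2, 2, 2, 2]                # hypothetical: four binary variables
shapes = [cpt_shape_complete(d, k) for k in range(1, len(d) + 1)]
total = sum(entries for _, _, entries in shapes)
```

For four binary variables the total is 15, which equals $2^4 - 1$, the number of free parameters of an unrestricted joint distribution. A complete network therefore saves nothing, which is why smaller parent sets matter.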
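16
- Let $\mathrm{Parents}(X_k) \subseteq \{X_1, X_2, \ldots, X_{k-1}\}$ denote the set of
random variables that makes
$P(X_k \mid X_1, X_2, \ldots, X_{k-1}) = P(X_k \mid \mathrm{Parents}(X_k))$
for every possible instance of $X_1, X_2, \ldots, X_{k-1}$. That is, $X_k$ and
$\{X_1, X_2, \ldots, X_{k-1}\} \setminus \mathrm{Parents}(X_k)$ are conditionally
independent given instances of $\mathrm{Parents}(X_k)$.
- In the worst case, we have $\mathrm{Parents}(X_k) = \{X_1, X_2, \ldots, X_{k-1}\}$ for
every random variable $X_k$.
- With $\mathrm{Parents}(X_k)$, we can write
$P(s_1, s_2, \ldots, s_n) = \prod_{k=1}^{n} P(s_k \mid \text{instance of } \mathrm{Parents}(X_k))$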
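The factorisation over parent sets translates directly into code. The sketch below assumes a hypothetical three-node network $X_1 \to X_3 \leftarrow X_2$ with binary variables and made-up table entries; the names `parents`, `cpt`, and `joint` are illustrative only.

```python
from itertools import product

# Hypothetical three-node network X1 -> X3 <- X2 with binary variables.
parents = {"X1": (), "X2": (), "X3": ("X1", "X2")}

# cpt[node][parent values] = P(node = 1 | parents); the numbers are made up.
cpt = {
    "X1": {(): 0.3},
    "X2": {(): 0.6},
    "X3": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.9},
}

def joint(instance):
    """P(s1, ..., sn) as the product of P(sk | instance of Parents(Xk))."""
    p = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(instance[q] for q in pa)]
        p *= p_one if instance[node] == 1 else 1.0 - p_one
    return p

# Enumerate all instances and check that the probabilities sum to 1.
instances = [dict(zip(parents, values)) for values in product([0, 1], repeat=3)]
total = sum(joint(s) for s in instances)
```

Storing only $P(\text{node} = 1 \mid \text{parents})$ mirrors the observation that a table for a variable with $d_k$ outcomes needs only $d_k - 1$ columns.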
17 An Example of the Bayesian Network
[Figure: the network over $X_1$ to $X_7$ from the earlier example, shown again]
18 Constructing a Good Bayesian Network
- Because the order of the random variables has a
significant impact on the complexity of the
Bayesian network constructed, it is desirable to
find an optimal order.
19
- For example, let $Y$ and $Z$ be two independent
Bernoulli random variables. We want to transmit
the outcomes of $Y$ and $Z$ to a remote site and want
to add an error detection code. Let $W$ denote the
value of the error detection code. Then, we can
use the following Bayesian network to model this
process.
[Figure: network over $Y$, $Z$, $W$]
20
- However, if we place $W$, $Y$, $Z$ in another order, we obtain a different network.
[Figure: network over $W$, $Y$, $Z$]
21 Continued
- In practice, people employ cause-and-effect
reasoning to construct Bayesian networks
that can be easily interpreted.
22 Continued
- Automatic construction of Bayesian networks is a
complicated problem, and the process is likely to
yield a network that is hard to interpret because
some arcs reverse the partial order of cause and effect.
23 The Singly Connected Bayesian Network
- In a singly connected Bayesian network, also
known as a polytree, there exists at most one
undirected path between any two nodes in the
network.
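The definition suggests a simple test: ignore arc directions and check that no arc ever joins two nodes that are already connected. The sketch below does this with a union-find structure; the function name `is_singly_connected` is hypothetical, and the input is assumed to be a directed acyclic graph with no repeated arcs.

```python
def is_singly_connected(nodes, arcs):
    """True if the undirected skeleton has at most one path between any two
    nodes, i.e. it contains no undirected cycle (checked with union-find)."""
    root = {v: v for v in nodes}

    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]   # path halving keeps the trees shallow
            v = root[v]
        return v

    for u, v in arcs:
        ru, rv = find(u), find(v)
        if ru == rv:                  # u and v were already connected
            return False
        root[ru] = rv                 # merge the two components
    return True

# Hypothetical examples: A -> C <- B, C -> D is a polytree;
# adding A -> D creates a second undirected path between A and D.
polytree = is_singly_connected("ABCD", [("A", "C"), ("B", "C"), ("C", "D")])
not_polytree = is_singly_connected(
    "ABCD", [("A", "C"), ("B", "C"), ("C", "D"), ("A", "D")]
)
```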
24 An Example of Polytrees
25 Independence and Conditional Independence Cases in a Singly Connected Bayesian Network
- There are 3 basic types of independence and
conditional independence in a polytree.
- Type 1: $X_i$ and $X_j$ are connected through a common
descendant. Then, $X_i$ and $X_j$ are independent.
26
- Note that $X_i$ and $X_j$ may be conditionally
dependent given an instance of $X_k$, as the
following example demonstrates.
[Table: conditional probability table at $X_3$]
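The effect can be reproduced with a made-up table (not the one from the slide). As a minimal sketch, assume $X_1$ and $X_2$ are independent fair bits and their common child is $X_3 = X_1 \lor X_2$; all names below are illustrative.

```python
from fractions import Fraction
from itertools import product

quarter = Fraction(1, 4)
# Joint over (x1, x2, x3): X1, X2 independent fair bits, X3 = X1 OR X2.
joint = {(x1, x2, x1 | x2): quarter for x1, x2 in product([0, 1], repeat=2)}

def prob(pred):
    """Probability of the set of instances selected by the predicate."""
    return sum(p for s, p in joint.items() if pred(s))

# Marginally, X1 and X2 are independent.
p_x1 = prob(lambda s: s[0] == 1)
p_x2 = prob(lambda s: s[1] == 1)
p_x1_x2 = prob(lambda s: s[0] == 1 and s[1] == 1)

# Given X3 = 1, learning X2 = 1 changes the probability of X1 = 1.
p_x1_given_x3 = prob(lambda s: s[0] == 1 and s[2] == 1) / prob(lambda s: s[2] == 1)
p_x1_given_x2_x3 = prob(lambda s: s == (1, 1, 1)) / prob(
    lambda s: s[1] == 1 and s[2] == 1
)
```

Here $P(X_1 = 1 \mid X_3 = 1) = 2/3$ but $P(X_1 = 1 \mid X_2 = 1, X_3 = 1) = 1/2$, so the two parents are dependent once the common descendant is observed.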
27
$X_i$ and $X_j$ are conditionally independent given an
instance of $X_k$.
28
$X_i$ and $X_j$ are conditionally independent given an
instance of $X_k$.
29 A Special Case of Type-1 Conditional Independence
Both $X_i$ and $X_j$ have no parents. Then, $X_i$ and $X_j$
are independent.
30
- Proof
- Without loss of generality, we can assume that $i > j$.
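The remaining steps are not in the transcript. One way to complete the argument, assuming $i > j$ and the ordering convention $\mathrm{Parents}(X_i) \subseteq \{X_1, \ldots, X_{i-1}\}$, is to note that an empty parent set gives $P(s_i \mid s_1, \ldots, s_{i-1}) = P(s_i)$ and then to sum out every variable before $X_i$ other than $X_j$:

```latex
\begin{align*}
P(s_i, s_j)
  &= \sum_{\{s_m \,:\, m < i,\ m \neq j\}} P(s_1, \ldots, s_{i-1})\,
     P(s_i \mid s_1, \ldots, s_{i-1}) \\
  &= P(s_i) \sum_{\{s_m \,:\, m < i,\ m \neq j\}} P(s_1, \ldots, s_{i-1})
   = P(s_i)\,P(s_j).
\end{align*}
```

This is a reconstruction and may differ from the derivation on the slide.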
31 Corollary of the Special Case
$X_i$ has no parents and $X_j$ has only one level of
ancestors. Then, $X_i$ and $X_j$ are independent.
32
- Proof: There are two possibilities.
- If $i > j$, then
- If $i < j$, then
33 Extension of the Corollary
Here, $X_i$ has no parents. Then, $X_i$ and $X_j$ are
independent.
34 Proof of Type-1 Independence
We can apply the procedure conducted before
to the branch containing $X_i$.
35 Generalization of the Type-1 Independence
- The type-1 independence can be generalized as follows:
$X_{k_1}, X_{k_2}, \ldots, X_{k_n}$ are connected through a common
descendant, but on different paths. Then, $X_{k_1}, X_{k_2}, \ldots, X_{k_n}$
are jointly independent.
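36
- Note that, even if we have
- $P(s_{k_1}, s_{k_2}) = P(s_{k_1})\,P(s_{k_2})$
- $P(s_{k_2}, s_{k_3}) = P(s_{k_2})\,P(s_{k_3})$
- $P(s_{k_1}, s_{k_3}) = P(s_{k_1})\,P(s_{k_3})$
- it is not necessarily true that
- $P(s_{k_1}, s_{k_2}, s_{k_3}) = P(s_{k_1})\,P(s_{k_2})\,P(s_{k_3})$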
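37
- On the other hand, joint independence implies
pairwise independence. For example,
$P(s_{k_1}, s_{k_2}) = \sum_{s_{k_3}} P(s_{k_1}, s_{k_2}, s_{k_3}) = P(s_{k_1})\,P(s_{k_2}) \sum_{s_{k_3}} P(s_{k_3}) = P(s_{k_1})\,P(s_{k_2})$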
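38 An Example of Pairwise Independence
- Let $X$ and $Y$ be two random variables that
correspond to tossing an unbiased coin two times.
Let $Z = X \oplus Y$. Then
- $\mathrm{Prob}(Z = 0) = \mathrm{Prob}(X = 0, Y = 0) + \mathrm{Prob}(X = 1, Y = 1) = 1/2$
- $\mathrm{Prob}(X = 0, Z = 0) = \mathrm{Prob}(X = 0, Y = 0) = 1/4 = \mathrm{Prob}(X = 0)\,\mathrm{Prob}(Z = 0)$
- Therefore, $X$, $Y$ and $Z$ are pairwise
independent. However, $\mathrm{Prob}(X = 0, Y = 0, Z = 1) = 0$ and
$\mathrm{Prob}(X = 0)\,\mathrm{Prob}(Y = 0)\,\mathrm{Prob}(Z = 1) = 1/8$
- Hence, $X$, $Y$ and $Z$ are not jointly independent.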
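The example is small enough to verify by enumeration. The following sketch builds the joint distribution of two fair coin tosses and their exclusive-or, then checks every pairwise product and the one triple used above; the helper name `prob` is an assumption.

```python
from fractions import Fraction
from itertools import product

# X, Y: two fair coin tosses; Z = X XOR Y. Each (x, y) pair has probability 1/4.
joint = {(x, y, x ^ y): Fraction(1, 4) for x, y in product([0, 1], repeat=2)}
INDEX = {"x": 0, "y": 1, "z": 2}

def prob(**fixed):
    """Probability that the named variables take the given values."""
    return sum(
        p for s, p in joint.items()
        if all(s[INDEX[k]] == v for k, v in fixed.items())
    )

# Pairwise independence: P(a, b) = P(a) P(b) for every pair and every value.
pairwise = all(
    prob(**{a: u, b: v}) == prob(**{a: u}) * prob(**{b: v})
    for a, b in [("x", "y"), ("x", "z"), ("y", "z")]
    for u, v in product([0, 1], repeat=2)
)

# Joint independence fails: the instance (0, 0, 1) is impossible.
lhs = prob(x=0, y=0, z=1)
rhs = prob(x=0) * prob(y=0) * prob(z=1)
```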
39
- Let us prove only the joint independence case that
involves 3 nodes. The cases that involve
more than 3 nodes can be proved based on the same
idea.
- Without loss of generality, we can assume that $k_3 > k_2 > k_1$.
40
- We start with the case in which $X_{k_1}$, $X_{k_2}$ and $X_{k_3}$
all have no parents.
- Since $X_{k_1}$ and $X_{k_2}$ are not parents of $X_{k_3}$,
$P(X_{k_3} \mid X_{k_1}, X_{k_2}) = P(X_{k_3}) \Rightarrow P(X_{k_1}, X_{k_2}, X_{k_3}) = P(X_{k_1}, X_{k_2})\,P(X_{k_3}) = P(X_{k_1})\,P(X_{k_2})\,P(X_{k_3})$
- Then, we can apply the procedure we used before
in proving the 2-node case to complete the proof.
41 Another Extension of the Type-1 Independence
$X_i$ and $X_j$ are connected to $X_c$ through the same
parent of $X_c$, and $X_k$ is connected to $X_c$ through
another parent of $X_c$. Then
$P(s_i, s_j, s_k) = P(s_i, s_j)\,P(s_k)$
$P(s_k \mid s_i, s_j) = P(s_k)$
42 A Special Case of Type-2 Conditional Independence
$X_i$ and $X_j$ are conditionally independent given an
instance of $X_k$.
44 A Special Case of Type-3 Conditional Independence
$X_i$ and $X_j$ are conditionally independent given an
instance of $X_k$.
46 D-Separation in General Bayesian Networks
- We have proved 3 basic types of
independence/conditional independence in
polytrees.
- Let $S_1$, $S_2$ and $E$ be 3 sets of nodes in a general
Bayesian network. If every undirected path from
a node in $S_1$ to a node in $S_2$ is d-separated by $E$, then
any subset of $S_1$ and any subset of $S_2$ are conditionally
independent given $E$.
47
- $S_1$ and $S_2$ are d-separated by $E$ if, for every
undirected path from a node $X$ in $S_1$ to a node $Y$
in $S_2$, there is a node $Z$ on the path for which
one of the following 3 conditions holds.
- $Z$ is in $E$ and $Z$ has one arrow on the path leading
in and one arrow leading out.
- $Z$ is in $E$ and $Z$ has both arrows on the path
leading out.
- Neither $Z$ nor any descendant of $Z$ is in $E$, and
both arrows on the path lead into $Z$.
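The three conditions can be applied mechanically to a single path. The sketch below is an illustration under stated assumptions: the network is given as a list of arcs, `path` is a valid undirected path (consecutive nodes are joined by an arc in one direction or the other), and the function names `descendants` and `path_is_blocked` are hypothetical. Testing whether two sets are d-separated would require applying it to every undirected path between them.

```python
def descendants(children, node):
    """All nodes reachable from `node` by following arcs forward."""
    seen, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def path_is_blocked(arcs, path, evidence):
    """True if some interior node Z of the undirected `path` satisfies one of
    the three conditions, given the evidence set (E in the text)."""
    arcs, evidence = set(arcs), set(evidence)
    children = {}
    for u, v in arcs:
        children.setdefault(u, []).append(v)
    for prev, z, nxt in zip(path, path[1:], path[2:]):
        into_z = ((prev, z) in arcs) + ((nxt, z) in arcs)  # arrows entering Z
        if into_z == 2:
            # Condition 3: both arrows lead into Z; Z and all of its
            # descendants must be outside E.
            if z not in evidence and not (descendants(children, z) & evidence):
                return True
        elif z in evidence:
            # Conditions 1 and 2: one arrow in and one out, or both out,
            # with Z in E.
            return True
    return False

# Hypothetical network: Y -> W <- Z and W -> V.
arcs = [("Y", "W"), ("Z", "W"), ("W", "V")]
blocked_no_evidence = path_is_blocked(arcs, ["Y", "W", "Z"], set())
blocked_given_w = path_is_blocked(arcs, ["Y", "W", "Z"], {"W"})
blocked_given_v = path_is_blocked(arcs, ["Y", "W", "Z"], {"V"})
```

With no evidence the path through the common child is blocked; observing the child, or one of its descendants, unblocks it.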
49 Examples of D-Separation
[Figures: two paths blocked by a node $Z \in E$]
50 Examples of D-Separation
[Figure: a path blocked by a node $Z$ such that $Z$ and all its descendants are not in $E$]