Title: Evidence and Message Passing
1. Lecture 3
- Evidence and Message Passing
2. Evidence
- It is useful to introduce the notion of evidence.
- P(E|S,D) ∝ P(E) P(S|E) P(D|E)
- The product P(E) P(S|E) P(D|E) is the evidence for E and has a value for each state of E
3. Evidence
- Evidence can further be divided into two parts
- Prior evidence, from the parents of a node
- π(E) = P(E)
- Likelihood evidence, from the children
- λ(E) = P(S|E) P(D|E)
4. Conventions for writing evidence
- We write it as a vector, e.g. λ(E) = (λ(e1), λ(e2), λ(e3)), or as scalar values for the individual states λ(e1), λ(e2), λ(e3)
- and at any node we combine the evidence by multiplication: evidence(ej) = π(ej) λ(ej)
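A minimal numerical sketch of this combination rule (the three states and all of the values below are illustrative, not taken from the lecture):

```python
# Combine prior (pi) and likelihood (lambda) evidence for a three-state node E,
# then normalise to a posterior distribution.  All numbers are hypothetical.

pi_E  = [0.5, 0.3, 0.2]   # prior evidence pi(e1), pi(e2), pi(e3)
lam_E = [0.1, 0.4, 0.4]   # likelihood evidence lambda(e1), lambda(e2), lambda(e3)

# Evidence is combined at the node by state-wise multiplication: pi(ej) * lambda(ej)
ev_E = [p * l for p, l in zip(pi_E, lam_E)]

# Evidence is un-normalised probability; dividing by its sum gives the posterior
alpha = 1.0 / sum(ev_E)
posterior_E = [alpha * e for e in ev_E]

print(ev_E)         # approx [0.05, 0.12, 0.08]
print(posterior_E)  # approx [0.2, 0.48, 0.32]
```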
5. Why use evidence
- Evidence is simply un-normalised probability.
- Once we have amassed all the evidence for a variable, we can convert it into a posterior probability.
- Using evidence gives us a mathematical simplification in developing equations for complex networks.
6. Calculating λ Evidence
- Given that S and D have been instantiated, say S = s4 and D = d2, we can look up the λ evidence for E in the link matrix
- P(E|S,D) ∝ P(E) P(S|E) P(D|E)
- λ(e1) = P(s4|e1) P(d2|e1)
- λ(e2) = P(s4|e2) P(d2|e2)
- λ(e3) = P(s4|e3) P(d2|e3)
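A small sketch of this lookup with hypothetical link matrices (the real matrices are not given here, and S is shown with four states rather than its seven purely for brevity):

```python
# Hypothetical link matrices: P_S_given_E[s][e] = P(s | e), similarly for D.
# E has three states; the sizes of S and D here are for illustration only.
P_S_given_E = [
    [0.1, 0.2, 0.3],
    [0.2, 0.3, 0.3],
    [0.3, 0.3, 0.2],
    [0.4, 0.2, 0.2],
]
P_D_given_E = [
    [0.6, 0.5, 0.3],
    [0.4, 0.5, 0.7],
]

s_index = 3   # S instantiated to s4 (0-based index)
d_index = 1   # D instantiated to d2

# lambda(ei) = P(s4 | ei) * P(d2 | ei): one row from each link matrix, multiplied state-wise
lam_E = [P_S_given_E[s_index][e] * P_D_given_E[d_index][e] for e in range(3)]
print(lam_E)   # approx [0.16, 0.1, 0.14]
```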
7. Calculating Evidence
- Given evidence for E we can now calculate the evidence for C by using a weighted average of the probabilities
- P(C|E,F) ∝ P(C) P(E|C) P(F|C)
- λ(c1) = (λ(e1) P(e1|c1) + λ(e2) P(e2|c1) + λ(e3) P(e3|c1)) P(f2|c1)
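Continuing the sketch with hypothetical numbers for C's link matrices (C is assumed here to have two states, and F is instantiated to f2):

```python
# Hypothetical link matrices for C's children: P_E_given_C[e][c] = P(e | c), etc.
P_E_given_C = [
    [0.7, 0.2],
    [0.2, 0.3],
    [0.1, 0.5],
]
P_F_given_C = [
    [0.8, 0.4],
    [0.2, 0.6],
]

lam_E = [0.16, 0.10, 0.14]   # lambda evidence for E from the previous sketch
f_index = 1                  # F instantiated to f2

lam_C = []
for c in range(2):
    # weighted average of lambda(E) over the states of E, conditional on C = c
    from_E = sum(lam_E[e] * P_E_given_C[e][c] for e in range(3))
    # F is instantiated, so its contribution is simply P(f2 | c)
    from_F = P_F_given_C[f_index][c]
    lam_C.append(from_E * from_F)

print(lam_C)   # approx [0.0292, 0.0792]
```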
8. The Conditioning Equation
- Generalising, we calculate the evidence using the conditioning equation
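A sketch of the conditioning equation in the form suggested by the example above, for a node C whose children are E and F (in general, one such sum per child, multiplied together):

```latex
\lambda(c_i) \;=\; \Big(\sum_j \lambda(e_j)\,P(e_j \mid c_i)\Big)\Big(\sum_k \lambda(f_k)\,P(f_k \mid c_i)\Big)
```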
9. Conditioning at the leaf nodes
- For node E, with the leaf nodes instantiated, the conditioning equation becomes
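With S = s4 and D = d2 (as in the earlier example) each sum in the conditioning equation collapses to a single term, giving for instance:

```latex
\lambda(e_i) \;=\; P(s_4 \mid e_i)\,P(d_2 \mid e_i)
```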
10. Instantiation and Evidence
- In the simple case, for leaf nodes we have a known state for that node. Recall that we defined the eye separation measure as having seven states:
11. Instantiation and Evidence 2
- So if we take a measurement, which for example is 0.61, we instantiate the corresponding state (s4). This is equivalent to setting the evidence as follows.
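In other words, all of the λ evidence is placed on the instantiated state; with the seven states of S this can be written as:

```latex
\lambda(S) \;=\; \big(\lambda(s_1),\ldots,\lambda(s_7)\big) \;=\; (0,\,0,\,0,\,1,\,0,\,0,\,0)
```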
12. Virtual evidence
- Sometimes, when we make a measurement it is possible to express uncertainty about it by distributing the evidence values. For example, instead of setting λ(s4) = 1 we could use
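For instance (purely illustrative values, not those from the original slide), the evidence could be spread over the neighbouring states:

```latex
\lambda(S) \;=\; (0,\,0,\,0.2,\,1,\,0.2,\,0,\,0)
```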
13. Virtual Evidence requires conditioning
- If we use virtual evidence then we must use the
conditioning equation
14. Problem Break
- Given the following virtual evidence, write down an expression for the λ evidence for state e1 of E (which has two children S and D)
15. Solution
- Putting in the virtual evidence gives
- λ(e1) = (P(s1|e1) + 0.2 P(s2|e1)) (0.5 P(d3|e1) + P(d4|e1))
- (I can't be bothered to multiply this out)
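A small sketch of evaluating this expression numerically, using hypothetical conditional probabilities for state e1 (none are given in the problem) and the virtual evidence implied by the solution:

```python
# Hypothetical conditional probabilities for state e1 of E
P_s_given_e1 = [0.1, 0.2, 0.3, 0.2, 0.1, 0.05, 0.05]   # P(s1|e1) ... P(s7|e1)
P_d_given_e1 = [0.25, 0.25, 0.3, 0.2]                   # P(d1|e1) ... P(d4|e1)

# Virtual evidence implied by the solution:
# lambda(s1) = 1, lambda(s2) = 0.2, lambda(d3) = 0.5, lambda(d4) = 1, all others 0
lam_S = [1, 0.2, 0, 0, 0, 0, 0]
lam_D = [0, 0, 0.5, 1]

# Conditioning equation: lambda(e1) = (sum_j lambda(sj) P(sj|e1)) * (sum_k lambda(dk) P(dk|e1))
lam_e1 = (sum(l * p for l, p in zip(lam_S, P_s_given_e1)) *
          sum(l * p for l, p in zip(lam_D, P_d_given_e1)))
print(lam_e1)   # (0.1 + 0.2*0.2) * (0.5*0.3 + 0.2) = 0.14 * 0.35 = 0.049
```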
16. No evidence
- Sometimes, we may not have data for a node. Propagation can still be carried out, but for all states the evidence for the node is the same, i.e. λ(si) = 1 for every state si.
17. No evidence and the conditioning equation
18. Upward Propagation
- In the last lecture we discussed a very simple network, namely the Bayesian Classifier
- In Bayesian classifiers the top of the tree is usually a hypothesis and the evidence is all propagated upwards.
- Tree structured networks can be used in other ways
19. Consider again the cat example
- Previously we found the evidence of there being a cat in the picture. Suppose now we want to ask whether there is a pair of eyes.
20. Case 1
- Suppose we know there is a cat in the picture. Node C is in state c1 (cat = true) and thus its other children (just F in this case) cannot affect its value.
21. Looking at the link matrix
- The link matrix reduces to a vector, since the state of C is known. This is, in effect, a prior probability of E
- In vector notation P(C) = (1, 0)
- In effect P(E) = P(E|C) P(C) is one column of the link matrix
22. Simplified Network
- Since we know the state of C, P(E|C) effectively gives us a prior probability of E: P(E)
23. Case 2
- More interestingly, we might not know for certain that there was a cat in the picture.
- Clearly the geometric evidence from below still stands, but instead of having a prior probability of a pair of eyes we need to determine the evidence from the cat node that there is a pair of eyes.
- This is the most general case for inference in trees
24. A π message from C to E
- Suppose for a given picture we calculate
- the λ evidence for C from F as
- λF(C) = (λF(c1), λF(c2)) = (0.3, 0.2)
- and the prior probability of C
- P(C) = (0.6, 0.4)
- the evidence for C (excluding that from E) is
- πE(C) = (0.18, 0.08)
- then the π evidence at E is calculated using
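A sketch of the calculation implied here: the evidence for C excluding that from E is formed first, and the π evidence at E then follows by conditioning on C (the optional normalisation step is discussed on the next slide):

```latex
\pi_E(c_j) \;=\; P(c_j)\,\lambda_F(c_j), \qquad
\pi(e_i) \;=\; \sum_j P(e_i \mid c_j)\,\pi_E(c_j)
```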
25. Normalisation of evidence in a tree
- Although not necessary, the evidence can always be normalised to a posterior probability distribution, indicated P'(C). We could also calculate π(E) as follows
- P'(cj) = α P(cj) λF(cj)
- where α is a normalising constant making Σj P'(cj) = 1
- The π message sent to E is simply
- π(ei) = Σj P(ei|cj) P'(cj)
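A numerical sketch putting these pieces together, with the numbers from the previous slide and a hypothetical link matrix P(E|C):

```python
# Evidence at C excluding that from E (numbers from the previous slide)
P_C     = [0.6, 0.4]     # prior probability of C
lam_F_C = [0.3, 0.2]     # lambda evidence for C from F
pi_E_C  = [p * l for p, l in zip(P_C, lam_F_C)]            # [0.18, 0.08]

# Optional normalisation: alpha makes the values sum to 1, giving P'(C)
alpha = 1.0 / sum(pi_E_C)
P_dash_C = [alpha * v for v in pi_E_C]                     # approx [0.692, 0.308]

# Hypothetical link matrix P_E_given_C[i][j] = P(ei | cj)
P_E_given_C = [
    [0.7, 0.2],
    [0.2, 0.3],
    [0.1, 0.5],
]

# pi message to E, with and without normalising the evidence at C
pi_E      = [sum(P_E_given_C[i][j] * pi_E_C[j] for j in range(2)) for i in range(3)]
pi_E_norm = [sum(P_E_given_C[i][j] * P_dash_C[j] for j in range(2)) for i in range(3)]

print(pi_E)        # approx [0.142, 0.060, 0.058]
print(pi_E_norm)   # approx [0.546, 0.231, 0.223] - larger, but same relative magnitudes
```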
26. Magnitude and evidence
- If we don't normalise the evidence at node C then we will send a different π message to E than if we do normalise.
- In the example it is larger if we normalise.
- However, the magnitude of the evidence is not relevant; it is the relative magnitudes of the evidence for the states of a node that carry the information.
27. General form of the π evidence
28. General form of the π evidence
- However, π evidence can also be computed with a matrix multiplication
- π(E) = P(E|C) πE(C)
- where
- π(E) is a vector expressing the π evidence for E
- πE(C) is a vector expressing all the evidence for C except that from E. (In lecture 2 we used P-E(C).)
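The same calculation written as a matrix-vector product, sketched with numpy and the same hypothetical link matrix as before:

```python
import numpy as np

# Hypothetical link matrix P(E|C): rows are states of E, columns are states of C
P_E_given_C = np.array([[0.7, 0.2],
                        [0.2, 0.3],
                        [0.1, 0.5]])

pi_E_C = np.array([0.18, 0.08])   # all the evidence for C except that from E

pi_E = P_E_given_C @ pi_E_C       # pi(E) = P(E|C) pi_E(C)
print(pi_E)                       # approx [0.142 0.06  0.058]
```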
29. Generality of Propagation in trees
- Probability propagation is completely flexible.
- We can instantiate any subset of the nodes and
calculate the probability distribution over the
states of the other nodes.
30. Priors and Likelihood in Networks
- Note that now we can associate the notion of prior and likelihood with the evidence being propagated
- λ(Bi) is the likelihood evidence for Bi
- π(Bi) is the prior evidence for Bi
31. The network structure is prior knowledge
- This notion of prior and likelihood is slightly different from our previous usage and reflects the fact that the network represents our prior knowledge for inference.
- The π evidence at a root node is the same as the prior probability of that root node.
32. Incorporating more nodes
- One of the best features of Bayesian Networks is that we can incorporate new nodes as the data becomes available.
- Recall that we had information from the computer vision process as to how likely the extracted circles were.
- This could simply be treated as another node
33. Adding a node doesn't change a network