Connectionist Computing COMP 30230 - PowerPoint PPT Presentation

1
Connectionist Computing COMP 30230
  • Gianluca Pollastri
  • office: 2nd floor, UCD CASL
  • email: gianluca.pollastri_at_ucd.ie

2
Credits
  • Geoffrey Hinton, University of Toronto.
  • borrowed some of his slides for Neural Networks
    and Computation in Neural Networks courses.
  • Ronan Reilly, NUI Maynooth.
  • slides from his CS4018.
  • Paolo Frasconi, University of Florence.
  • slides from tutorial on Machine Learning for
    structured domains.

3
Lecture notes
  • http://gruyere.ucd.ie/2009_courses/30230/
  • Strictly confidential...

4
Books
  • No book covers large fractions of this course.
  • Parts of chapters 4, 6, (7), 13 of Tom Mitchell's
    Machine Learning
  • Parts of chapter V of MacKay's Information
    Theory, Inference, and Learning Algorithms,
    available online at
  • http://www.inference.phy.cam.ac.uk/mackay/itprnn/book.html
  • Chapter 20 of Russell and Norvig's Artificial
    Intelligence: A Modern Approach, also available
    at
  • http://aima.cs.berkeley.edu/newchap20.pdf
  • More materials later..

5
Make a Boltzmann Machine
  • http://gruyere.ucd.ie/2009_courses/30230/boltzmann.doc
  • Due on March 6th
  • 30!
  • -5 every day late

6
d-separation
  • Two variables A and B are d-separated given the
    evidence e if, on every path between A and B, there
    is an intermediate variable V such that either
  • the connection at V is serial or diverging and V is
    instantiated by e, or
  • the connection at V is converging and neither V nor
    any of V's descendants has received evidence.

7
P(U)
  • If we can store the joint table P(U), we can easily
    update our beliefs for all the variables composing
    U when we receive evidence. This is done by
  • inserting evidence
  • marginalising
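
The two steps above can be sketched on a tiny joint table. This is a minimal illustration, not code from the course: the two variables, their state names and all probabilities are made up, and the helper names `insert_evidence`, `marginalise` and `normalise` are hypothetical.

```python
# Hypothetical joint table P(Rain, Grass); the numbers are invented for illustration.
joint = {
    ("rain", "wet"): 0.27,
    ("rain", "dry"): 0.03,
    ("no_rain", "wet"): 0.14,
    ("no_rain", "dry"): 0.56,
}

def insert_evidence(table, index, value):
    """Zero out every configuration that contradicts the finding."""
    return {cfg: (p if cfg[index] == value else 0.0) for cfg, p in table.items()}

def marginalise(table, index):
    """Sum out every variable except the one at position `index`."""
    out = {}
    for cfg, p in table.items():
        out[cfg[index]] = out.get(cfg[index], 0.0) + p
    return out

def normalise(table):
    """Rescale a table so that its entries sum to 1."""
    total = sum(table.values())
    return {k: v / total for k, v in table.items()}

with_evidence = insert_evidence(joint, 1, "wet")      # P(Rain, Grass = wet)
posterior = normalise(marginalise(with_evidence, 0))  # P(Rain | Grass = wet)
print(posterior)
```

The catch is that the table has one entry per joint configuration, so it grows exponentially with the number of variables.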

8
Bayesian Networks
  • A BN consists of
  • a set of variables/nodes and a set of directed
    edges between nodes.
  • each variable has a finite set of mutually
    exclusive states.
  • the overall graph is a DAG
  • to each node A with parents B1, B2, .., Bn there
    is attached a conditional probability table (CPT)
    P(A | B1, B2, .., Bn).
  • Each variable is independent of its
    non-descendants given its parents.

9
Chain rule for BN
  • Let BN be a Bayesian Network over U = {A1, .. ,
    An}. The joint probability distribution P(U) is
    the product of the conditional probabilities that
    label the BN.
  • If pa(Ai) = parents of Ai, then
    $P(U) = \prod_{i=1}^{n} P(A_i \mid pa(A_i))$
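
As a quick check of the chain rule, the sketch below builds the joint distribution of a small network as a product of CPTs. The network (A -> B, A -> C), the binary states and every number are assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical network A -> B, A -> C over binary variables; numbers are made up.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.3, (1, 1): 0.7}  # key: (b, a)
p_c_given_a = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.2, (1, 1): 0.8}  # key: (c, a)

# Chain rule: P(A, B, C) = P(A) * P(B | A) * P(C | A)
joint = {
    (a, b, c): p_a[a] * p_b_given_a[(b, a)] * p_c_given_a[(c, a)]
    for a, b, c in product((0, 1), repeat=3)
}

total = sum(joint.values())
print(total)  # should be 1 up to floating-point rounding
```

The three CPTs hold 2 + 4 + 4 numbers, while the joint table holds 8; the gap widens quickly as the network grows.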

10
Probability updates in BN
  • Now P(U) is factorised into a set of smaller
    tables.
  • Given evidence e, how can we update P(A) to
    P(A | e) for each variable in U, using the BN, i.e.
    how can we insert findings and marginalise?

11
Marginalising in BN
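  • What we want to do is compute the marginal of a
    variable Ai:
    $P(A_i) = \sum_{U \setminus \{A_i\}} P(U)$
  • Which means, by the chain rule,
    $P(A_i) = \sum_{U \setminus \{A_i\}} \prod_{j=1}^{n} P(A_j \mid pa(A_j))$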

12
distributing
  • In the marginalisation, which is a sum over a
    product of CPTs, we want to distribute the sums so
    that we perform the smallest possible number of
    operations.
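
A small sketch of what distributing the sums buys, assuming a chain A -> B -> C -> D in which every variable has K = 3 states and, for brevity, all three edges share the same made-up CPT. The naive version sums the full joint; the distributed version pushes each sum inside the product and eliminates one variable at a time.

```python
from itertools import product

K = 3  # states per variable (an assumption for this example)

# Made-up numbers: P(A), and one CPT reused for B|A, C|B and D|C.
p_a = [0.2, 0.5, 0.3]
cpt = [[0.7, 0.2, 0.1],   # cpt[parent][child]; each row sums to 1
       [0.1, 0.8, 0.1],
       [0.3, 0.3, 0.4]]

def naive_marginal():
    """P(D) = sum over A, B, C of P(A) P(B|A) P(C|B) P(D|C): K**4 terms."""
    p_d = [0.0] * K
    for a, b, c, d in product(range(K), repeat=4):
        p_d[d] += p_a[a] * cpt[a][b] * cpt[b][c] * cpt[c][d]
    return p_d

def distributed_marginal():
    """Same quantity with the sums pushed inwards: about 3 * K**2 terms."""
    message = p_a
    for _ in range(3):  # eliminate A, then B, then C
        message = [
            sum(message[parent] * cpt[parent][child] for parent in range(K))
            for child in range(K)
        ]
    return message

print(naive_marginal())
print(distributed_marginal())
```

Both functions return the same distribution; only the number of multiplications differs.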

13
example
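[Figure: elimination example; the labels give the domains of the tables involved: {A1, A2, A3, A4}, {A3, A4, A5}, {A4, A5}, {A4, A5}, {A3, A4}]
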
14
marginalisation by elimination
  • We now know that we can marginalise a probability
    distribution with respect to a variable (or set of
    variables) by successive eliminations.
  • Not all elimination sequences carry the same
    complexity.
  • Now the task is finding the best elimination
    sequence.

15
Domain graph
  • We say that two variables are members of the same
    domain if they appear in the same conditional
    probability table of a BN.
  • The domain graph for a set of variables is a
    graph with one node for each variable and an
    undirected edge between any two variables that
    are members of the same domain.
  • This is sometimes also called the moral (or
    moralised) graph of the BN.
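
The definition above translates directly into code: every CPT covers a node and its parents, so all members of that family get linked. A minimal sketch, assuming the BN is given as a `{node: [parents]}` dictionary; the function name `domain_graph` and the example network are hypothetical.

```python
def domain_graph(parents):
    """Undirected domain (moral) graph of a BN given as {node: [parents]}."""
    adj = {v: set() for v in parents}
    for child, pas in parents.items():
        family = [child, *pas]  # the variables that share one CPT
        for u in family:
            for v in family:
                if u != v:
                    adj[u].add(v)
    return adj

# Hypothetical network: A -> C <- B, C -> D
parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
graph = domain_graph(parents)
print(graph)
```

Note the edge between A and B: they are not linked in the BN, but they appear together in the CPT of C.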

16
example
[Figure: example domain graph over the variables A, B, C, D, E, F, G]

17
elimination from a domain graph
  • We eliminate a variable A from a domain graph G
    by the following procedure
  • add a link (fill-in) between any two neighbours
    of A
  • remove A
  • The new domain graph is called G-A. It can be
    shown that G-A is the domain graph for P(U\A).
  • This means that we can perform variable
    elimination on the domain graph.
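
The procedure above can be sketched as follows, assuming the graph is stored as a `{node: set_of_neighbours}` dictionary. The function name `eliminate` and the star-shaped example graph are illustrative assumptions.

```python
from itertools import combinations

def eliminate(adj, node):
    """Return (G - node, fill_ins) for a graph given as {node: set(neighbours)}."""
    neighbours = adj[node]
    # A fill-in is needed for every pair of neighbours not already linked.
    fill_ins = [(u, v) for u, v in combinations(sorted(neighbours), 2)
                if v not in adj[u]]
    # Copy the graph without the eliminated node.
    new_adj = {u: set(vs) - {node} for u, vs in adj.items() if u != node}
    for u, v in fill_ins:
        new_adj[u].add(v)
        new_adj[v].add(u)
    return new_adj, fill_ins

# Hypothetical graph: a star with centre C and leaves A, B, D.
star = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}
_, fill_centre = eliminate(star, "C")      # centre first: three fill-ins
after_leaf, fill_leaf = eliminate(star, "A")  # leaf first: none
print(fill_centre, fill_leaf)
```

Eliminating the centre links all the leaves to each other, while eliminating a leaf adds nothing.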

18
example eliminate A
[Figure: the domain graph before and after eliminating A]

19
example eliminate B
[Figure: the domain graph before and after eliminating B]

20
example eliminate C
[Figure: the domain graph before and after eliminating C]

21
example eliminate C first
[Figure: the original domain graph before and after eliminating C first]
order matters!

22
induced graph
  • Given a graph G and an elimination order s, the
    induced graph (or s-completion of G) is the graph
    Gs obtained by augmenting G with all the fill-ins
    associated with s.

23
triangulated graph
  • An undirected graph G is triangulated if every
    cycle with more than three links has a chord (a
    link connecting two nodes not being neighbours in
    the cycle).
  • A graph G' is said to be a triangulation of G if
    G' is triangulated, and G is a subgraph of G'
    over the same nodes.
  • Any s-completion of G is a triangulation of G.
  • A graph is triangulated if, and only if, it has
    an elimination sequence without fill-ins.
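
The last statement suggests a simple test: keep removing any node whose neighbours are already pairwise linked (so eliminating it needs no fill-in) and see whether the graph can be emptied. The sketch below is one straightforward way to do this, not an optimised algorithm; the function name and the two example graphs are assumptions.

```python
from itertools import combinations

def is_triangulated(adj):
    """True if the graph {node: set(neighbours)} has a fill-in-free elimination order."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    while adj:
        # Look for a node whose neighbours are pairwise connected.
        candidate = next(
            (u for u, nbrs in adj.items()
             if all(b in adj[a] for a, b in combinations(nbrs, 2))),
            None,
        )
        if candidate is None:
            return False  # every remaining node would need a fill-in
        for v in adj.pop(candidate):
            adj[v].discard(candidate)
    return True

# A 4-cycle A-B-C-D-A has no chord; adding the link A-C triangulates it.
square = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
chorded = {"A": {"B", "C", "D"}, "B": {"A", "C"},
           "C": {"A", "B", "D"}, "D": {"A", "C"}}
print(is_triangulated(square), is_triangulated(chorded))
```

Picking any fill-in-free node greedily is safe here, because removing such a node from a triangulated graph leaves a triangulated graph.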

24
induced graph and cliques
  • Every time we eliminate a node A (eliminate a
    variable A) we create a clique, i.e. a fully
    connected subgraph, containing all neighbours
    of A.
  • Every maximal clique in an induced graph
    corresponds to an intermediate factor in the
    computations.
  • Every factor stored during the process is defined
    over a subset of some maximal clique in the graph.

25
example
Elimination order: A, B, C, D, E, F, G. No fill-ins needed.
[Figure: the domain graph over A, B, C, D, E, F, G]

26
induced width
  • The size of the largest clique in the induced
    graph is an indicator for the complexity of
    variable elimination
  • This quantity is called the induced width of the
    graph according to the specified ordering (many
    texts subtract one, so that a tree has induced
    width 1)
  • Finding a good ordering for a graph is equivalent
    to finding the minimal induced width of the graph
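
A minimal sketch of how the width of a given ordering can be computed. It uses the convention that the width is the largest number of neighbours a node has at the moment it is eliminated, i.e. the largest clique size minus one. The function name and the star-shaped example are assumptions.

```python
from itertools import combinations

def induced_width(adj, order):
    """Largest neighbourhood met while eliminating `order` from {node: set(neighbours)}."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    width = 0
    for node in order:
        nbrs = adj.pop(node)
        width = max(width, len(nbrs))
        for u in nbrs:
            adj[u].discard(node)
        for u, v in combinations(nbrs, 2):  # add the fill-ins
            adj[u].add(v)
            adj[v].add(u)
    return width

# Hypothetical graph: a star with centre C and leaves A, B, D.
star = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}
centre_first = induced_width(star, ["C", "A", "B", "D"])
leaves_first = induced_width(star, ["A", "B", "D", "C"])
print(centre_first, leaves_first)
```

On the same graph the two orderings give widths 3 and 1, which is the sense in which the ordering determines the cost.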

27
Elimination on Trees
  • Suppose we have a tree: a network where each
    variable has at most one parent
  • All the factors involve at most two variables
  • Thus, the domain graph is also a tree

28
Elimination on Trees
  • We can maintain the tree structure by eliminating
    extreme variables in the tree

[Figure: a tree over the variables A, B, C, D, E, F, G, before and after eliminating an extreme variable]

29
Elimination on Trees
  • Formally, for any tree, there is an elimination
    ordering with induced width 1
  • Inference on trees is linear in the number of
    variables

30
PolyTrees
  • A polytree is a network where there is at most
    one (undirected) path from one variable to another
  • Inference in a polytree is linear in the
    representation size of the network
  • This assumes tabular CPT representation
  • Check it if you wish..

31
General Networks
  • What do we do when the network is not a polytree?
  • If the network has an (undirected) cycle, the
    induced width for any ordering is greater than 1

32
Example
  • Eliminating A, B, C, D, E, ...

33
Example
  • Eliminating H, G, E, C, F, D, E, A

[Figure: example network over the variables A, B, C, D, E, F, G, H]

34
General Networks
  • It can be shown that finding an ordering that
    minimises the induced width is NP-Hard
  • However,
  • There are reasonable heuristics for finding good
    orderings
  • There are provable approximations to the best
    induced width
  • If the graph has a small induced width, there are
    algorithms that find it in polynomial time
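
One commonly used heuristic is greedy min-fill: at every step eliminate the node that needs the fewest fill-ins. The sketch below is an illustration of that idea under the assumption that the graph is a `{node: set(neighbours)}` dictionary; it is not necessarily the heuristic intended in the lecture, and it carries no optimality guarantee.

```python
from itertools import combinations

def count_fill_ins(adj, node):
    """Number of links that eliminating `node` would add."""
    return sum(1 for u, v in combinations(adj[node], 2) if v not in adj[u])

def min_fill_order(adj):
    """Greedy elimination order; ties are broken alphabetically."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    order = []
    while adj:
        node = min(sorted(adj), key=lambda n: count_fill_ins(adj, n))
        order.append(node)
        nbrs = adj.pop(node)
        for u in nbrs:
            adj[u].discard(node)
        for u, v in combinations(nbrs, 2):  # add the fill-ins
            adj[u].add(v)
            adj[v].add(u)
    return order

# Hypothetical graph: a star with centre C and leaves A, B, D.
star = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}
order = min_fill_order(star)
print(order)
```

On the star the heuristic removes leaves while the centre would still need fill-ins, so no link is ever added.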

35
Junction tree
  • If G is a triangulated graph, and C1 .. Ck are
    its cliques, a junction tree for G is a tree
  • whose nodes are C1 .. Ck
  • such that each node on the path between Ci and Cj
    contains Ci ∩ Cj.
  • The edge between two nodes is labelled with the
    intersection of the two nodes (the separator).
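
The path condition above can be checked mechanically. The sketch below uses the clique labels that appear in the example on the next slide; the tree edges are partly an assumption, in particular attaching the clique {X, A} to {A, L, B}, since the transcript does not show which clique it hangs from. The function names are hypothetical.

```python
from itertools import combinations

def path(tree, start, goal, seen=frozenset()):
    """Path between two nodes of a tree given as {node: set(neighbours)}, or None."""
    if start == goal:
        return [start]
    seen = seen | {start}
    for nxt in tree[start]:
        if nxt not in seen:
            rest = path(tree, nxt, goal, seen)
            if rest is not None:
                return [start] + rest
    return None

def is_junction_tree(cliques, tree):
    """Check that every node on the path between Ci and Cj contains Ci & Cj."""
    for i, j in combinations(cliques, 2):
        shared = cliques[i] & cliques[j]
        route = path(tree, i, j)
        if route is None or any(not shared <= cliques[k] for k in route):
            return False
    return True

cliques = {name: frozenset(name) for name in ["TV", "ALT", "ALB", "BLS", "ABD", "XA"]}
# Assumed edges; the placement of XA is a guess.
good_tree = {"TV": {"ALT"}, "ALT": {"TV", "ALB"}, "ALB": {"ALT", "BLS", "ABD", "XA"},
             "BLS": {"ALB"}, "ABD": {"ALB"}, "XA": {"ALB"}}
# Moving XA next to TV breaks the condition: TV does not contain A.
bad_tree = {"TV": {"ALT", "XA"}, "ALT": {"TV", "ALB"}, "ALB": {"ALT", "BLS", "ABD"},
            "BLS": {"ALB"}, "ABD": {"ALB"}, "XA": {"TV"}}
print(is_junction_tree(cliques, good_tree), is_junction_tree(cliques, bad_tree))
```

An equivalent way to state the condition is that, for each variable, the cliques containing it form a connected part of the tree.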

36
example from triangulated graph to junction tree
[Figure: junction tree with cliques {T,V}, {A,L,T}, {B,L,S}, {A,L,B}, {X,A}, {A,B,D} and separators {T}, {B,L}, {A,L}, {A}, {A,B}]

37
Junction trees and triangulated graphs
  • A graph is triangulated if and only if it has a
    junction tree.
  • This means that for any BN, for each elimination
    order there is a junction tree.

38
Junction Tree
  • BN -> Domain graph -> induced triangulated graph
    -> Junction Tree
  • JT can be used to
  • represent BN
  • insert and propagate findings
  • marginalise down to any variable
  • details in the Jensen & Lauritzen paper on the web
    site