Dynamic Programming - PowerPoint PPT Presentation

About This Presentation
Title:

Dynamic Programming

Description:

Dynamic Programming & Hidden Markov Models. Alan Yuille Dept. Statistics UCLA Goal of this Talk This talk introduces one of the major algorithms: dynamic programming ... – PowerPoint PPT presentation

Number of Views:102
Avg rating:3.0/5.0
Slides: 25
Provided by: aa2177
Category:

less

Transcript and Presenter's Notes

Title: Dynamic Programming


1
Dynamic Programming Hidden Markov Models.
  • Alan Yuille
  • Dept. Statistics UCLA

2
Goal of this Talk
1. Chair
  • This talk introduces one of the major algorithms
    dynamic programming (DP).
  • Then describe how it can be used in conjunction
    with EM for learning.

3
Dynamic Programming
  • Dynamic Programming exploits the graphical
    structure of the probability distribution. It can
    be applied to any structure without closed loops.
  • Consider the two-headed coin example given in Tom
    Griffiths talk (Monday).

4
Probabilistic Grammars
  • By the Markov Condition
  • Hence we can exploit the graphical structure to
    efficiently compute

The structure means that the sum over x2 drops
out. We need only sum over x1 and x3. Only four
operations instead of eight.
5
Dynamic Programming Intuition
  • Suppose you wish to travel to Boston from Los
    Angeles by car.
  • To determine the cost of going via Chicago you
    only need to calculate the shortest cost from Los
    Angeles to Chicago and then, independently, the
    shortest cost from Chicago to Boston.
  • Decomposing the route in this way gives an
    efficient algorithm which is polynomial in the
    number of nodes and feasible for computation.

6
Dynamic Programming Diamond
  • Compute the shortest cost from A to B.

7
Application to a 1-dim chain.
  • Consider a distribution defined on a 1-dim chain.
  • Important property directed and undirected
    graphs are equivalent (for 1-dim chain).
  • P(A,B) P(AB) P(B)
  • or P(A,B) P(BA) P(A)
  • For these simple graphs with two nodes -- you
    cannot distinguish causation from correlation
    without intervention (Wus lecture Friday).
  • For this lecture we will treat a simple
    one-dimensional cover directed and undirected
    models simultaneously. (Translating between
    directed and undirected is generally possible for
    graphs without closed loops but has subtleties).

8
Probability distribution on 1-D chain
9
1-D Chain.
10
1-Dim Chain
  • (Proof by induction).

11
1-Dim Chain
  • We can also use DP to compute other properties
    e.g. to convert the distribution from undirected
    form
  • To directed form

12
1-Dim Chain

13
Special Case 1-D Ising Spin Model
14
Dynamic Programming Summary
  • Dynamic Programming can be applied to perform
    inference on all graphical models defined on
    trees The key insight is that, for trees, we can
    define an order on the nodes (not necessarily
    unique) and process nodes in sequence (never
    needing to return to a node that have already
    been processed).

15
Extensions of Dynamic Programming
  • What to do if you have a graph with closed loops?
  • There are a variety of advanced ways to exploit
    the graphical structure and obtain efficient
    exact algorithms.
  • Prof. Adnan Darwiche (CS, UCLA) is an expert on
    this topic. There will be an introduction to his
    SamIam code.
  • Also can use approximate methods like BP.

16
Junction Trees.
  • It is also possible to take a probability
    distribution defined on a graph with closed loops
    and reformulate it as a distribution on a new
    nodes without closed loops. (Lauritzen and
    Spiegelhalter 1990).
  • This lead to a variety of algorithm generally
    known as junction trees.
  • This is not a universal solution because the
    resulting new graphs may have too many nodes to
    make them practical.
  • Google junction trees to find nice tutorials on
    junction trees.

17
Graph Conversion
  • Convert graph by a set of transformations.

18
Triangles Augmented Variables
  • From triangles to ordered triangles.

Original Variables Loops
Augmented Variables No Loops
19
Summary of Dynamic Programming.
  • Dynamic Programming can be used to efficiently
    compute properties of a distribution for graphs
    defined on trees.
  • Directed graphs on trees can be reformulated as
    undirected graphs on trees, and vice versa.
  • DP can be extended to apply to graphs with closed
    loops by restructuring the graphs (junction
    trees).
  • It is an active research area to determine
    efficient inference algorithms which exploit the
    graphical structures of these models.
  • Relationship between DP and reinforcement
    learning (week 2).
  • DP and A. DP and pruning.

20
HMMs Learning and Inference
  • So far we have considered inference only.
  • This assumes that the model is known.
  • How can we learn the model?
  • For 1D models -- this uses DP and EM.

21
A simple HMM for Coin Tossing
  • Two coins, one biased and the other fair, with
    the coins switched occasionally.
  • The observable 0,1 is whether the coin is head or
    tails.
  • The hidden state A,B is which coin is used.
  • There are unknown transition probabilities
    between the hidden states A and B, and unknown
    probabilities for the observations conditioned on
    the hidden states.
  • The learning task is to estimate these
    probabilities from a sequence of measurements.

22
HMM for Speech
23
HMM for Speech
24
HMM Summary
  • HMM define a class of markov models with hidden
    variables. Used for speech recognition, and many
    other applications.
  • Tasks involving HMMs involve learning,
    inference, and model selection.
  • These can often be performed by algorithms based
    on EM and DP.
Write a Comment
User Comments (0)
About PowerShow.com