Title: Introduction to Hidden Markov Models
1 Introduction to Hidden Markov Models
- Wang Rui
- State Key Lab of CAD&CG
- 2004-5-26
2 Outline
- Example ---- Video Texture
- Markov Chains
- Hidden Markov Models
- Example ---- Motion Texture
3 Example ---- Video Texture
video clip
video texture
4 The approach
- How do we find good transitions?
5 Finding good transitions
- Compute the L2 distance D_{i,j} between all pairs of frames
frame i vs. frame j
- Similar frames make good transitions
6 Fish Tank
7 Mathematical model of Video Texture
- A sequence of random variables: ADEABEDADBCAD
- A sequence of random variables: BDACBDCACDBCADCBADCA
- Mathematical model: the future is independent of the past given the present.
- Markov Model
8 Markov Property
- Formal definition
- Let X_0, X_1, ..., X_N be a sequence of random variables taking values in a state set {s_k}.
- X fulfills the Markov property iff P(X_m = s_m | X_0 = s_0, ..., X_{m-1} = s_{m-1}) = P(X_m = s_m | X_{m-1} = s_{m-1}) for all m.
- Informal definition
- The future is independent of the past given the present.
9 History
- Markov chain theory developed around 1900.
- Hidden Markov Models developed in late 1960s.
- Used extensively in speech recognition in the 1960s-70s.
- Introduced to computer science in 1989.
Applications
- Bioinformatics.
- Signal Processing
- Data analysis and Pattern recognition
10 Markov Chain
- A Markov chain is specified by
- A state space S = {s_1, s_2, ..., s_n}
- An initial distribution a_0
- A transition matrix A
- where A = (a_ij), with a_ij = P(q_t = s_j | q_{t-1} = s_i)
- Graphical Representation
- as a directed graph where
- Vertices represent states
- Edges represent transitions with positive
probability
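As a concrete illustration of the (S, a_0, A) specification above, here is a minimal Python sketch; the three states and all probabilities are invented for the example, not taken from the slides.

```python
import numpy as np

states = ["A", "B", "C"]                      # state space S (hypothetical)
a0 = np.array([0.5, 0.3, 0.2])                # initial distribution a_0
A = np.array([[0.1, 0.6, 0.3],                # A[i, j] = P(q_t = s_j | q_{t-1} = s_i)
              [0.4, 0.2, 0.4],
              [0.3, 0.3, 0.4]])

def sample_chain(length, rng=np.random.default_rng(0)):
    """Draw a state sequence from the Markov chain."""
    q = [rng.choice(len(states), p=a0)]       # start state ~ a_0
    for _ in range(length - 1):
        q.append(rng.choice(len(states), p=A[q[-1]]))  # next state ~ row of A
    return [states[i] for i in q]

print(sample_chain(10))
```

In the directed-graph view, each row of A lists the outgoing edge weights of one vertex; rows sum to 1.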
11 Probability Axioms
- Marginal probability: sum the joint probability over the other variable, P(X = x) = Σ_y P(X = x, Y = y)
- Conditional probability: P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)
12 Calculating with Markov chains
- Probability of an observation sequence
- Let X = x_0 x_1 ... x_L be an observation sequence from the Markov chain (S, a_0, A); then
- P(X) = a_0(x_0) · a_{x_0 x_1} · a_{x_1 x_2} · ... · a_{x_{L-1} x_L}
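A small sketch of this computation; the chain below is hypothetical, and the product formula is the one stated above.

```python
import numpy as np

a0 = np.array([0.5, 0.3, 0.2])            # initial distribution over states 0, 1, 2
A = np.array([[0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4],
              [0.3, 0.3, 0.4]])            # transition matrix

def chain_sequence_prob(x):
    """P(X) = a_0(x_0) * prod_t A[x_{t-1}, x_t] for a fully observed chain."""
    p = a0[x[0]]
    for prev, cur in zip(x[:-1], x[1:]):
        p *= A[prev, cur]
    return p

print(chain_sequence_prob([0, 1, 1, 2, 0]))  # probability of one observed path
```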
15 Motivation of Hidden Markov Models
- Hidden states
- The state of the entity we want to model is often not observable.
- The state is then said to be hidden.
- Observables
- Sometimes we can instead observe the state of entities influenced by the hidden state.
- A system can be modeled by an HMM if
- The sequence of hidden states is Markov
- The sequence of observations is independent (or Markov) given the hidden states
16 Hidden Markov Model
- Definition
- Set of states S = {s_1, s_2, ..., s_N}
- Observation symbols V = {v_1, v_2, ..., v_M}
- Transition probabilities A between any two states
- a_ij = P(q_t = s_j | q_{t-1} = s_i)
- Emission probabilities B within each state
- b_j(O_t) = P(O_t = v_j | q_t = s_j)
- Start probabilities π = a_0
- Use λ = (A, B, π) to denote the parameter set of the model.
17 Generating a sequence by the model
- Given an HMM, we can generate a sequence of length n as follows (see the sampling sketch after the figure):
- Start at state q_1 according to prob a_{0 q_1}
- Emit letter o_1 according to prob b_{q_1}(o_1)
- Go to state q_2 according to prob a_{q_1 q_2}
- ... until emitting o_n
[Figure: trellis of hidden states 1..N over time, with start probabilities a_0i, transitions a_ij, and emissions b_i(o_t) producing o_1, o_2, o_3, ..., o_n]
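A minimal sampling sketch of the procedure above; the two-state HMM (states, symbols, and all probabilities) is hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical HMM parameters lambda = (A, B, pi); values are made up.
pi = np.array([0.6, 0.4])                  # start probabilities a_0
A = np.array([[0.7, 0.3],                  # a_ij = P(q_t = j | q_{t-1} = i)
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],             # b_j(k) = P(O_t = v_k | q_t = j)
              [0.1, 0.3, 0.6]])

def generate(n, rng=np.random.default_rng(1)):
    """Sample a state path q_1..q_n and observations o_1..o_n from the HMM."""
    states, obs = [], []
    q = rng.choice(len(pi), p=pi)                    # start at q_1 ~ a_0
    for _ in range(n):
        obs.append(rng.choice(B.shape[1], p=B[q]))   # emit o_t ~ b_{q_t}
        states.append(q)
        q = rng.choice(len(pi), p=A[q])              # move to the next state ~ a_{q_t, .}
    return states, obs

print(generate(8))
```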
18 Example
19 Calculating with Hidden Markov Models
- Consider one such fixed state sequence Q = q_1 q_2 ... q_T
- The observation sequence O for this Q has probability
- P(O | Q, λ) = Π_{t=1}^{T} b_{q_t}(O_t)
20 The probability of such a state sequence Q can be written as
- P(Q | λ) = a_{0 q_1} a_{q_1 q_2} a_{q_2 q_3} ... a_{q_{T-1} q_T}
The probability that O and Q occur simultaneously is simply the product of the above two terms, i.e.,
- P(O, Q | λ) = P(O | Q, λ) P(Q | λ)
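A sketch of the two factors and their product for one fixed path; the parameters are the same hypothetical two-state model as before, restated here so the snippet runs on its own.

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])

Q = [0, 0, 1, 1]        # a fixed state sequence q_1..q_T
O = [0, 1, 2, 2]        # the corresponding observations O_1..O_T

# P(Q | lambda) = a_{0 q1} * prod_t a_{q_{t-1} q_t}
p_Q = pi[Q[0]] * np.prod([A[Q[t - 1], Q[t]] for t in range(1, len(Q))])
# P(O | Q, lambda) = prod_t b_{q_t}(O_t)
p_O_given_Q = np.prod([B[Q[t], O[t]] for t in range(len(Q))])

print("P(Q) =", p_Q, " P(O|Q) =", p_O_given_Q, " P(O,Q) =", p_Q * p_O_given_Q)
```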
21 Example
22 The three main questions on HMMs
- Evaluation
- GIVEN an HMM (S, V, A, B, π) and a sequence O,
- FIND P(O | λ)
- Decoding
- GIVEN an HMM (S, V, A, B, π) and a sequence O,
- FIND the sequence Q of states that maximizes P(O, Q | λ)
- Learning
- GIVEN an HMM (S, V, A, B, π) with unspecified transition/emission probabilities, and a sequence of observations O,
- FIND parameters λ = (a_ij, b_i(·)) that maximize P(O | λ)
23 Evaluation
- Find the likelihood that a sequence is generated by the model
- A straightforward way
- The probability of O is obtained by summing over all possible state sequences Q, giving
- P(O | λ) = Σ_Q P(O | Q, λ) P(Q | λ)
- Complexity is O(N^T), so the direct calculation is infeasible
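A brute-force sketch of this sum over all N^T state sequences, using the same hypothetical model; it is only practical for tiny T, which is the point of the complexity remark above.

```python
import itertools
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
O = [0, 1, 2, 2]
N, T = len(pi), len(O)

def joint(Q):
    """P(O, Q | lambda) for one state sequence Q."""
    p = pi[Q[0]] * B[Q[0], O[0]]
    for t in range(1, T):
        p *= A[Q[t - 1], Q[t]] * B[Q[t], O[t]]
    return p

# P(O | lambda) = sum over all N^T possible state sequences
p_O = sum(joint(Q) for Q in itertools.product(range(N), repeat=T))
print("P(O) =", p_O)
```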
24 The Forward Algorithm
- A more elaborate algorithm
- The Forward Algorithm
[Figure: forward-algorithm trellis of states 1..N over observations o_1, o_2, o_3, ..., o_n, with transition probabilities a_0i and a_ij]
25 The Forward Algorithm
- The forward variable: α_t(i) = P(O_1 O_2 ... O_t, q_t = s_i | λ)
- We can compute α_t(i) for all i = 1..N and all t
- Initialization
- α_1(i) = a_{0i} b_i(O_1), i = 1..N
- Iteration
- α_{t+1}(j) = [ Σ_{i=1}^{N} α_t(i) a_ij ] b_j(O_{t+1}), t = 1..T-1, j = 1..N
- Termination
- P(O | λ) = Σ_{i=1}^{N} α_T(i)
[Figure: forward-algorithm trellis over states 1..N and observations o_1, o_2, o_3, ..., o_n]
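A sketch of the forward recursion above, on the same hypothetical model; alpha[t, i] stores α_{t+1}(i) with 0-based indexing.

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
O = [0, 1, 2, 2]

def forward(O):
    T, N = len(O), len(pi)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]                      # initialization: a_{0i} b_i(O_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]  # iteration: [sum_i alpha_t(i) a_ij] b_j(O_{t+1})
    return alpha, alpha[-1].sum()                   # termination: P(O) = sum_i alpha_T(i)

alpha, p_O = forward(O)
print("P(O) =", p_O)   # matches the brute-force sum, at O(N^2 T) cost
```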
26 The Backward Algorithm
- The backward variable: β_t(i) = P(O_{t+1} O_{t+2} ... O_T | q_t = s_i, λ)
- Similarly, we can compute β_t(i) for all i = 1..N and all t
- Initialization
- β_T(i) = 1, i = 1..N
- Iteration
- β_t(i) = Σ_{j=1}^{N} a_ij b_j(O_{t+1}) β_{t+1}(j), t = T-1..1
- Termination
- P(O | λ) = Σ_{i=1}^{N} a_{0i} b_i(O_1) β_1(i)
[Figure: backward-algorithm trellis over states 1..N and observations o_1, o_2, o_3, ..., o_n]
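A matching sketch of the backward recursion, again on the hypothetical model; it recovers the same P(O | λ) as the forward pass.

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
O = [0, 1, 2, 2]

def backward(O):
    T, N = len(O), len(pi)
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                      # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])    # iteration: sum_j a_ij b_j(O_{t+1}) beta_{t+1}(j)
    return beta, np.sum(pi * B[:, O[0]] * beta[0])      # termination: sum_i a_{0i} b_i(O_1) beta_1(i)

beta, p_O = backward(O)
print("P(O) =", p_O)   # same value as the forward algorithm
```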
27 Forward, α_t(i) vs. Backward, β_t(i)
[Figure: forward and backward recursions on the trellis]
28 Decoding
- Decoding
- GIVEN an HMM and a sequence O:
- Suppose we know the parameters of the Hidden Markov Model and the observed sequence of observations O_1, O_2, ..., O_T.
- FIND the sequence Q of states that maximizes P(Q | O, λ)
- That is, determine the sequence of states q_1, q_2, ..., q_T that is optimal in some meaningful sense (i.e. best explains the observations).
29 - Consider P(Q | O, λ) = P(O, Q | λ) / P(O | λ)
- So the Q that maximizes the above probability
- is equivalently the Q that maximizes P(O, Q | λ)
- A best-path-finding problem
[Figure: trellis of states 1..N over observations o_1, o_2, o_3, ..., o_n]
30 Viterbi Algorithm
- A dynamic programming algorithm
- Initialization
- δ_1(i) = a_{0i} b_i(O_1), i = 1..N
- ψ_1(i) = 0
- Recursion
- δ_t(j) = max_i [δ_{t-1}(i) a_ij] b_j(O_t), t = 2..T, j = 1..N
- ψ_t(j) = argmax_i [δ_{t-1}(i) a_ij], t = 2..T, j = 1..N
- Termination
- P* = max_i δ_T(i)
- q*_T = argmax_i δ_T(i)
- Traceback
- q*_t = ψ_{t+1}(q*_{t+1}), t = T-1, T-2, ..., 1
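A sketch of the Viterbi recursion on the same hypothetical model; delta and psi follow the definitions above with 0-based indexing.

```python
import numpy as np

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
O = [0, 1, 2, 2]

def viterbi(O):
    T, N = len(O), len(pi)
    delta = np.zeros((T, N))           # delta_t(j): best score of any path ending in j at time t
    psi = np.zeros((T, N), dtype=int)  # psi_t(j): backpointer to the best previous state
    delta[0] = pi * B[:, O[0]]         # initialization
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A            # delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, O[t]]    # recursion
    q = [int(delta[-1].argmax())]      # termination: best final state
    for t in range(T - 1, 0, -1):      # traceback
        q.append(int(psi[t][q[-1]]))
    return delta[-1].max(), q[::-1]

p_star, q_star = viterbi(O)
print("best path", q_star, "with P(O, Q*) =", p_star)
```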
31 Learning
- Estimation of the parameters of a Hidden Markov Model
- 1. Both the sequence of observations O and the sequence of states Q are observed: learn λ = (A, B, π)
- 2. Only the sequence of observations O is observed: learn Q and λ = (A, B, π)
32 - Given O and Q, the likelihood is given by
- L(λ) = P(O, Q | λ) = a_{0 q_1} b_{q_1}(O_1) · Π_{t=2}^{T} a_{q_{t-1} q_t} b_{q_t}(O_t)
- and the log-likelihood is given by
- log L(λ) = log a_{0 q_1} + Σ_{t=2}^{T} log a_{q_{t-1} q_t} + Σ_{t=1}^{T} log b_{q_t}(O_t)
33 In this case the Maximum Likelihood estimates are the observed transition frequencies, a_ij = n_ij / Σ_k n_ik (where n_ij counts transitions from s_i to s_j), and the MLE of b_i is computed from the observations o_t for which q_t = s_i.
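A counting sketch of these Maximum Likelihood estimates for the fully observed case (case 1 on slide 31); the toy state and observation sequences are invented.

```python
import numpy as np

N, M = 2, 3                             # number of states / observation symbols (toy sizes)
Q = [0, 0, 1, 1, 0, 1, 1, 0]            # observed state sequence
O = [0, 1, 2, 2, 0, 1, 2, 0]            # observed emissions

A_counts = np.zeros((N, N))
B_counts = np.zeros((N, M))
for t in range(len(Q)):
    B_counts[Q[t], O[t]] += 1           # count emissions of each symbol in each state
    if t > 0:
        A_counts[Q[t - 1], Q[t]] += 1   # count transitions s_i -> s_j

# MLE: normalize the counts row-wise (a_ij = n_ij / sum_k n_ik, similarly for b_i)
A_hat = A_counts / A_counts.sum(axis=1, keepdims=True)
B_hat = B_counts / B_counts.sum(axis=1, keepdims=True)
print(A_hat)
print(B_hat)
```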
34 - Only the sequence of observations O is observed
- It is difficult to find the Maximum Likelihood estimates directly from the likelihood function.
- The techniques that are used are:
- 1. The Segmental K-means Algorithm
- 2. The Baum-Welch (E-M) Algorithm
35 The Baum-Welch (E-M) Algorithm
- The E-M algorithm was originally designed to handle missing observations.
- In this case the missing observations are the states q_1, q_2, ..., q_T.
- Assuming a model, the states are estimated by finding their expected values under this model (the E part of the E-M algorithm).
36 - With these values the model is re-estimated by Maximum Likelihood Estimation (the M part of the E-M algorithm).
- The process is repeated until the estimated model converges.
37 The E-M Algorithm
- Let P(O, Q | λ) denote the joint distribution of O and Q.
- Consider the function Q(λ, λ') = Σ_Q P(Q | O, λ') log P(O, Q | λ).
- Starting with an initial estimate λ^(0), a sequence of estimates λ^(m) is formed by finding λ^(m+1) to maximize Q(λ, λ^(m)) with respect to λ.
38 - The sequence of estimates λ^(m)
- converges to a local maximum of the likelihood P(O | λ).
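A compact sketch of one Baum-Welch (E-M) iteration, combining the forward and backward passes shown earlier; the starting parameters and the observation sequence are hypothetical, and practical issues (scaling for long sequences, multiple training sequences) are ignored.

```python
import numpy as np

# Hypothetical starting model and observations.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
O = [0, 1, 2, 2, 0, 1]

def baum_welch_step(pi, A, B, O):
    T, N = len(O), len(pi)
    # E-step: forward and backward variables.
    alpha = np.zeros((T, N)); beta = np.zeros((T, N))
    alpha[0] = pi * B[:, O[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, O[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, O[t + 1]] * beta[t + 1])
    p_O = alpha[-1].sum()

    gamma = alpha * beta / p_O                     # gamma_t(i) = P(q_t = i | O, lambda)
    xi = np.zeros((T - 1, N, N))                   # xi_t(i, j) = P(q_t = i, q_{t+1} = j | O, lambda)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, O[t + 1]] * beta[t + 1] / p_O

    # M-step: re-estimate parameters from the expected counts.
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[np.array(O) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B, p_O

pi, A, B, p_O = baum_welch_step(pi, A, B, O)
print("P(O) under the previous model:", p_O)
```

Repeating this step and monitoring P(O | λ) gives the iterative procedure described on slides 35-38: the likelihood is non-decreasing and converges to a local maximum.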