Title: WML2007 Hidden Markov Model
1 WML2007 Hidden Markov Model
3 Outline
- Hidden Markov Model
- Hierarchical HMM
- Layered HMM
- Continuous HMM
- Semi-continuous HMM
- Solving HMM problems
- Forward and backward algorithm
- Baum-Welch algorithm
- Viterbi Algorithm
- Particle Filter
- Conclusion and Comments
4 Hidden Markov Model
- Dynamic system
- The system's state is a function of time
- Regular Markov Model
- The state is directly visible.
- Hidden Markov Model
- The state is invisible, but variables influenced by the state are visible.
- Hidden: x(t) takes values X1, X2, X3
- Observable: y(t) takes values Y1, Y2, Y3, Y4
5 Hidden Markov Model
- Markov property
- The hidden variable x(t) (at time t) depends only on the value of the hidden variable x(t-1) (at time t-1). Governed by the state transition function
- The value of the observed variable y(t) depends only on the value of the hidden variable x(t) (both at time t). Governed by the likelihood function
P(x(t) | x(t-1))
P(y(t) | x(t))
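As a concrete illustration, these two relations together with an initial state distribution fully specify a discrete HMM. A minimal Python sketch (the numbers are made-up illustrative values, not parameters from the slides):

import numpy as np

# Hidden states X1, X2, X3 and observable symbols Y1..Y4, as in the figure above
states = ["X1", "X2", "X3"]
symbols = ["Y1", "Y2", "Y3", "Y4"]

pi = np.array([0.6, 0.3, 0.1])             # initial state probabilities P(x(0))
A = np.array([[0.7, 0.2, 0.1],             # state transition function P(x(t) | x(t-1))
              [0.3, 0.5, 0.2],
              [0.2, 0.3, 0.5]])
B = np.array([[0.5, 0.2, 0.2, 0.1],        # likelihood function P(y(t) | x(t))
              [0.1, 0.4, 0.4, 0.1],
              [0.2, 0.1, 0.2, 0.5]])

# Each row is a probability distribution and must sum to one.
assert np.allclose(A.sum(axis=1), 1) and np.allclose(B.sum(axis=1), 1)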
6 Solving HMM: HMM Decoding Problem
- Determine the hidden state sequence (x(0), x(1), ..., x(t)) from the observation sequence (y(0), y(1), ..., y(t))
7 Example: Player Pose Estimation
- Hidden states
- Visible observations
- Feature extraction (segmentation, skeleton, body part localization)
8 Some Applications of Hidden Markov Models
- Especially well known for their application in temporal pattern recognition
- Speech recognition
- Handwriting recognition
- Gesture recognition
- Bioinformatics
9 Layered Hidden Markov Model
- Layered Hidden Markov Model (LHMM)
- A statistical model consisting of N levels of HMMs
- Can be transformed into a single, more complex HMM
10 Layered Hidden Markov Model
11 Layered Hidden Markov Model
- LHMM vs. single HMM
- A smaller amount of data is required to achieve performance comparable to that of the single HMM
- Any layer of the LHMM can be retrained separately without altering the other layers of the LHMM
- For example, the lower layers, which are more sensitive to changes in the environment (such as the type of sensors, sampling rate, etc.), can be retrained separately
12 Continuous HMM
- Discrete HMM
- The likelihood function p(y_t | x_t) is discrete
- Continuous HMM
- The likelihood function p(y_t | x_t) is continuous
- Usually a mixture of Gaussians is used to approximate the continuous likelihood function p(y_t | x_t)
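A small Python sketch of such a mixture-of-Gaussians approximation for one hidden state (the weights, means and variances are illustrative assumptions):

import numpy as np

def gaussian_pdf(y, mean, var):
    # Univariate Gaussian density
    return np.exp(-0.5 * (y - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

def mixture_likelihood(y, weights, means, variances):
    # p(y_t | x_t) approximated as a weighted sum of Gaussian components
    return sum(w * gaussian_pdf(y, m, v) for w, m, v in zip(weights, means, variances))

# Example: a 3-component mixture for one hidden state (illustrative parameters)
weights = [0.5, 0.3, 0.2]
means = [0.0, 2.0, 5.0]
variances = [1.0, 0.5, 2.0]
print(mixture_likelihood(1.2, weights, means, variances))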
13 Semi-continuous HMM
- Model the discrete HMM with a mixture-of-Gaussians model
- Usually only the most significant Gaussians are selected to model the discrete likelihood function p(y_t | x_t)
(Figure: p(y_t | x_t) plotted against y_t)
14 Three Basic Problems
- The Evaluation Problem: given an HMM Φ and a sequence of observations, what is the probability that the model generates the observations?
- The Learning Problem: given an HMM Φ and a set of observations, how can we adjust the HMM parameters Φ to maximize the joint probability (likelihood) P(Y | Φ)?
- The Decoding Problem (decode the hidden states; state estimation): given an HMM Φ and a sequence of observations, what is the most likely state sequence that produces the observations?
15 Existing Algorithms
- HMM Evaluation
- Forward algorithm
- Backward algorithm
- HMM Learning
- Baum-Welch Algorithm
- HMM Decoding (state estimation)
- Viterbi algorithm
- Particle Filter
16 HMM Evaluation: Forward Algorithm
- The evaluation problem
- P(Y_1, ..., Y_T | Φ) = ?
- Naively a problem of O(N^T)
- Forward algorithm
- Solves the evaluation problem in a recursive style
- Forward probability
- The probability of producing Y_1, ..., Y_{t-1} while ending up in state s_i
- At each time iteration, the only probability that has to be calculated is the forward probability
17 HMM Evaluation: Forward Algorithm
Initial state probabilities; state transition probabilities A = {a_ij}; symbol emission probabilities B = {b_ij(k)}
Initialization
Induction
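A minimal Python sketch of the forward recursion, assuming state-conditioned emission probabilities b_i(k) (rather than the arc-conditioned b_ij(k) above) and the convention that alpha[t, i] covers Y_1, ..., Y_t:

import numpy as np

def forward(pi, A, B, obs):
    # alpha[t, i] = P(Y_1, ..., Y_t, x(t) = s_i | model)
    # pi: (N,) initial probabilities, A: (N, N) transition probabilities a_ij,
    # B: (N, M) emission probabilities b_i(k), obs: observation symbol indices
    N, T = len(pi), len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # Initialization
    for t in range(1, T):                             # Induction
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha, alpha[-1].sum()                     # second value: P(Y_1, ..., Y_T | model)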
18 Calculating Observation probability
19 HMM Evaluation: Backward Algorithm
- Backward probability
- The probability of producing the sequence Y_t, ..., Y_T, given that at time t we are in state s_i.
20 HMM Evaluation: Backward Algorithm
Initialization
Induction
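A matching Python sketch of the backward recursion, under the same assumptions as the forward sketch and with beta[t, i] covering Y_{t+1}, ..., Y_T:

import numpy as np

def backward(A, B, obs):
    # beta[t, i] = P(Y_{t+1}, ..., Y_T | x(t) = s_i)
    N, T = A.shape[0], len(obs)
    beta = np.ones((T, N))                            # Initialization: beta[T-1, i] = 1
    for t in range(T - 2, -1, -1):                    # Induction, running backward in time
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta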
21 Calculating Observation probability
- Traced in a similar way, except that the direction is from t = T down to t = 1 (backward)
22 α-β trellis
23 Remarks
- The forward algorithm and the backward algorithm both have complexity O(TN^2)
24 HMM Learning: Baum-Welch algorithm
- HMM Learning problem
- Given an HMM Φ and a set of observations, how can we adjust the HMM parameters Φ to maximize the joint probability (likelihood) P(Y | Φ)?
- Baum-Welch algorithm
- Similar to the general EM algorithm
- The updated parameters are obtained by maximizing the auxiliary function Q
25 HMM Learning: Baum-Welch algorithm
- The probability of taking the transition from state i to state j at time t, given the model and the observation sequence Y_1, ..., Y_T
26 HMM Learning: Baum-Welch algorithm
- The equations for re-estimating the parameters
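A Python sketch of one such re-estimation step, assuming state-conditioned emissions b_i(k) and reusing the alpha/beta arrays from the forward and backward sketches above:

import numpy as np

def baum_welch_step(pi, A, B, obs, alpha, beta):
    # One re-estimation step, given alpha/beta from the forward/backward sketches above.
    obs = np.asarray(obs)
    T, N = alpha.shape
    likelihood = alpha[-1].sum()                      # P(Y_1, ..., Y_T | current model)
    # xi[t, i, j]: probability of taking the transition i -> j at time t
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * A * B[:, obs[t + 1]][None, :]
                 * beta[t + 1][None, :]) / likelihood
    # gamma[t, i]: probability of being in state i at time t
    gamma = alpha * beta / likelihood
    new_pi = gamma[0]
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
    return new_pi, new_A, new_B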
27 Remarks
- Recursive algorithm
- Unsupervised learning
28 HMM Decoding: Viterbi Algorithm
- The Decoding Problem (decode the hidden states; state estimation): given an HMM Φ and a sequence of observations, what is the most likely state sequence that produces the observations?
29 HMM Decoding: Viterbi Algorithm
- Viterbi Algorithm
- A best-path-finding algorithm based on dynamic programming
- Recursive algorithm
- The probability of the most likely state sequence at time t+1 which has generated the observations Y_1, ..., Y_t and ends in state j
The best path
30 HMM Decoding: Viterbi Algorithm
- Modified forward algorithm
- Example
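A minimal Python sketch of this modified forward recursion (max in place of sum, plus backtracking), using the common convention where delta[t, j] covers Y_1, ..., Y_t and ends in state j:

import numpy as np

def viterbi(pi, A, B, obs):
    # delta[t, j]: probability of the most likely state sequence that generates
    # Y_1, ..., Y_t and ends in state j; psi stores the best predecessor for backtracking.
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):                             # forward recursion with max instead of sum
        scores = delta[t - 1][:, None] * A            # scores[i, j]: best path passing through i -> j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]                  # backtrack the best path
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], delta[-1].max()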
31 Particle Filter for Solving HMM Decoding
- Suitable for HMMs whose relations are continuous probability densities
- Goal: estimate the posterior density p(x(t) | y(t), y(t-1), ..., y(0))
P(x(t) | x(t-1))
P(y(t) | x(t))
32 Particle Filter
- Particle filter
- Sampling-Importance-Resampling Filter (SIRF)
- Approx. of posterior density
- Estimation with SIRF
33 Particle Filter
- Applications
- Target tracking
- Navigation
- Blind deconvolution of digital communication channels
- Joint channel estimation
- Detection in Rayleigh fading channels
- Digital enhancement of speech and audio signals
- Time-varying spectrum estimation
- Computer vision
- Portfolio allocation
- Sequential estimation of signals under model uncertainty
- ...
34 Sampling
- Particle generation with importance-function sampling
- Many choices for the importance function
- Optimal choice
- Transition density p(x_n | x_{n-1})
- Local linearizations and Gaussian approximations of the optimal choice
35 Importance
- The weight-updating equation can be shown to be:
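In the usual SIR notation (a sketch; q is the importance function and w_n^(i) is the weight of particle i at time n), the update is commonly written as

w_n^{(i)} \propto w_{n-1}^{(i)} \frac{p(y_n \mid x_n^{(i)}) \, p(x_n^{(i)} \mid x_{n-1}^{(i)})}{q(x_n^{(i)} \mid x_{n-1}^{(i)}, y_n)}

When the importance function is the transition density p(x_n | x_{n-1}), as listed on the sampling slide, this reduces to w_n^(i) ∝ w_{n-1}^(i) p(y_n | x_n^(i)).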
36 Resampling
- To solve the degeneracy problem
- Many existing resampling methods
- Systematic resampling
- Residual resampling
- Residual systematic resampling (proposed)
37 Resampling Algorithm: Systematic Resampling
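A minimal Python sketch of systematic resampling (the weights are assumed to be normalized; a single uniform draw is stratified over N evenly spaced positions):

import numpy as np

def systematic_resample(weights):
    # Returns the indices of the particles to keep; weights must sum to one.
    N = len(weights)
    positions = (np.random.uniform() + np.arange(N)) / N
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0                              # guard against round-off
    indices = np.zeros(N, dtype=int)
    i = j = 0
    while i < N:
        if positions[i] < cumulative[j]:
            indices[i] = j
            i += 1
        else:
            j += 1
    return indices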
38 Particle Filter (SIRF) Procedure
Initialization (sampling and assigning uniform weights)
Importance calculation
Resampling
Sampling
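A bootstrap-style Python sketch of the SIRF loop for a scalar state. The functions prior_sample, transition_sample and likelihood are hypothetical placeholders for the model at hand, and systematic_resample is the sketch from the resampling slide above; the transition density is used as the importance function, so the weights are updated by the likelihood alone:

import numpy as np

def sirf(prior_sample, transition_sample, likelihood, observations, n_particles):
    # prior_sample(n): n draws from the initial state distribution
    # transition_sample(x): one draw from p(x(t) | x(t-1) = x)
    # likelihood(y, x): evaluates p(y(t) | x(t))
    particles = prior_sample(n_particles)                                     # Initialization
    weights = np.full(n_particles, 1.0 / n_particles)                         # uniform weights
    estimates = []
    for y in observations:
        particles = np.array([transition_sample(x) for x in particles])      # Sampling
        weights = weights * np.array([likelihood(y, x) for x in particles])  # Importance
        weights /= weights.sum()
        estimates.append(np.sum(weights * particles))                        # posterior-mean estimate
        keep = systematic_resample(weights)                                   # Resampling
        particles = particles[keep]
        weights = np.full(n_particles, 1.0 / n_particles)
    return estimates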
39 Conclusion and Comments
- HMM
- Can be used to exploit temporal relationships
- HMM Evaluation
- Forward and backward algorithms
- HMM Learning
- Baum-Welch algorithm
- HMM Decoding
- Viterbi algorithm
- Particle filter, for relations that are non-deterministic with continuous probability densities
40 Conclusion and Comments
- Different problems → different machine learning tools are suitable
- Good features are also important
- Combination of several kinds of machine learning algorithms
- Example: behavior analysis
- Raw image → feature extraction → SVM or AdaBoost → HMM