1
CS479/679 Pattern Recognition, Spring 2006, Prof. Bebis
  • Hidden Markov Models (HMMs)
  • Chapter 3 (Duda et al.) Section 3.10

2
Hidden Markov Models (HMMs)
  • Sequential patterns
  • The order of the data points is irrelevant.
  • No explicit sequencing ...
  • Temporal patterns
  • The result of a time process (e.g., time series).
  • Can be represented by a number of states.
  • States at time t are influenced directly by
    states in previous time steps (i.e., correlated).

3
Hidden Markov Models (HMMs)
  • HMMs are appropriate for problems that have an
    inherent temporality.
  • Speech recognition
  • Gesture recognition
  • Human activity recognition

4
First-Order Markov Models
  • They are represented by a graph where every node
    corresponds to a state ωi.
  • The graph can be fully-connected with self-loops.
  • Links between nodes ωi and ωj are associated with
    a transition probability
  • P(ω(t+1) = ωj | ω(t) = ωi) = aij
  • which is the probability of going to state ωj
    at time t+1 given that the state at time t was ωi
    (first-order model).

5
First-Order Markov Models (contd)
  • The following constraints should be satisfied:
  • aij ≥ 0 and Σj aij = 1 for every state ωi.
  • Markov models are fully described by their
    transition probabilities aij.

6
Example Weather Prediction Model
  • Assume three weather states:
  • ω1: Precipitation (rain, snow, hail, etc.)
  • ω2: Cloudy
  • ω3: Sunny

[Figure: state transition diagram and 3x3 transition matrix A = (aij) over the states ω1, ω2, ω3]
7
Computing P(ωT) for a sequence of states ωT
  • Given a sequence of states ωT = (ω(1), ω(2), ...,
    ω(T)), the probability that the model generated
    ωT is equal to the product of the corresponding
    transition probabilities:
  • P(ωT) = Π(t=1..T) P(ω(t) | ω(t-1))
  • where P(ω(1) | ω(0)) = P(ω(1)) is the prior
    probability of the first state.

8
Example Weather Prediction Model (contd)
  • What is the probability that the weather for
    eight consecutive days is
  • sun-sun-sun-rain-rain-sun-cloudy-sun ?
  • ω8 = (ω3, ω3, ω3, ω1, ω1, ω3, ω2, ω3)
  • P(ω8) = P(ω3)P(ω3|ω3)P(ω3|ω3)P(ω1|ω3)P(ω1|ω1)
  • P(ω3|ω1)P(ω2|ω3)P(ω3|ω2) = 1.536 x 10^-4
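A minimal numeric sketch of this computation in Python. The transition values below are an assumption: the actual matrix on the slide was not transcribed, so the sketch uses the classic three-state weather values (rain/cloudy/sunny) that reproduce the 1.536 x 10^-4 result.

import numpy as np

# Assumed transition matrix A[i, j] = aij = P(w_j at t+1 | w_i at t)
# (classic weather-example values; order: w1 = rain, w2 = cloudy, w3 = sunny)
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

def sequence_probability(states, A, prior):
    """P(w^T) = P(w(1)) * product over t of P(w(t) | w(t-1))."""
    p = prior[states[0]]
    for prev, curr in zip(states, states[1:]):
        p *= A[prev, curr]
    return p

# sun-sun-sun-rain-rain-sun-cloudy-sun as 0-based indices (w1=0, w2=1, w3=2)
seq = [2, 2, 2, 0, 0, 2, 1, 2]
prior = np.array([0.0, 0.0, 1.0])           # assume day 1 is known to be sunny
print(sequence_probability(seq, A, prior))  # -> about 1.536e-04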

9
Limitations of Markov models
  • In Markov models, each state is uniquely
    associated with an observable event.
  • Once an observation is made, the state of the
    system is trivially retrieved.
  • Such models are too restrictive to be useful in
    most practical applications.

10
Hidden States and Observations
  • Assume that observations are a probabilistic
    function of each state.
  • Each state can generate a number of outputs
    (i.e., observations) according to a unique
    probability distribution.
  • Each observation can potentially be generated at
    any state.
  • The state sequence is not directly observable.
  • It can only be approximated from the sequence of
    observations.

11
First-order HMMs
  • We augment the model so that when it is in
    state ω(t) it also emits some symbol v(t)
    (visible state) from a set of possible symbols.
  • We have access to the visible symbols only; the
    states ω(t) are unobservable (hidden).

12
Example Weather Prediction Model (contd)

[Figure: weather HMM with observations v1 = temperature, v2 = humidity, etc.]
13
First-order HMMs
  • For every sequence of hidden states, there is
    an associated sequence of visible states:
  • ωT = (ω(1), ω(2), ..., ω(T)) → VT = (v(1),
    v(2), ..., v(T))
  • When the model is in state ωj at time t, the
    probability of emitting a visible state vk at
    that time is denoted as
  • P(v(t) = vk | ω(t) = ωj) = bjk, where
  • Σk bjk = 1 for every state ωj (observation probabilities).

14
Absorbing State
  • Given a state sequence and its corresponding
    observation sequence
  • ωT = (ω(1), ω(2), ..., ω(T)) → VT = (v(1),
    v(2), ..., v(T))
  • we assume that ω(T) = ω0 is some absorbing
    state, which uniquely emits the symbol v(T) = v0.
  • Once it enters the absorbing state, the system
    cannot escape from it.

15
HMM Formalism
  • An HMM is defined by {O, V, P, A, B}:
  • O = {ω1, ..., ωn} are the possible states
  • V = {v1, ..., vm} are the possible observations
  • P = {πi} are the prior state probabilities
  • A = {aij} are the state transition probabilities
  • B = {bjk} are the observation probabilities

16
Some Terminology
  • Causal: the probabilities depend only upon
    previous states.
  • Ergodic: every state has a non-zero probability
    of occurring given some starting state.

[Figure: left-right HMM]
17
Coin toss example
  • You are in a room with a barrier (e.g., a
    curtain) through which you cannot see what is
    happening.
  • On the other side of the barrier is another
    person who is performing a coin (or multiple
    coin) toss experiment.
  • The other person will tell you only the result of
    the experiment, not how he obtained that result!!
  • e.g., VT = HHTHTTHH...T = (v(1), v(2), ..., v(T))

18
Coin toss example (contd)
  • Problem: derive an HMM model to explain the
    observed sequence of heads and tails.
  • The coins represent the states; these are hidden
    because we do not know which coin was tossed each
    time.
  • The outcome of each toss represents an
    observation.
  • A likely sequence of coins may be inferred from
    the observations.
  • As we will see, the state sequence will not be
    unique in general.

19
Coin toss example: 1-fair-coin model
  • There are 2 states, each associated with either
    heads (state 1) or tails (state 2).
  • The observation sequence uniquely defines the
    states (the model is not hidden).

[Figure: 1-fair-coin model and its observation sequence]
20
Coin toss example: 2-fair-coins model
  • There are 2 states but neither state is uniquely
    associated with either heads or tails (i.e., each
    state can be associated with a different fair
    coin).
  • A third coin is used to decide which of the fair
    coins to flip.

[Figure: 2-fair-coins model and its observation sequence]
21
Coin toss example: 2-biased-coins model
  • There are 2 states with each state associated
    with a biased coin.
  • A third coin is used to decide which of the
    biased coins to flip.

[Figure: 2-biased-coins model and its observation sequence]
22
Coin toss example: 3-biased-coins model
  • There are 3 states, each associated with a
    biased coin.
  • Which coin to flip is decided in some other way
    (e.g., by tossing other coins).

[Figure: 3-biased-coins model and its observation sequence]
23
Which model is best?
  • Since the states are not observable, the best we
    can do is select the model that best explains the
    data.
  • Longer observation sequences make it easier to
    select the best model.

24
Classification Using HMMs
  • Given an observation sequence VT and a set of
    candidate models, choose the model with the
    highest posterior probability.
  • Using the Bayes formula:
    P(model j | VT) = P(VT | model j) P(model j) / P(VT)
  • Since P(VT) is the same for all models, choose
    the model that maximizes P(VT | model j) P(model j).
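A brief Python sketch of this classification rule: each candidate HMM is scored by P(VT | model) · P(model) using the forward recursion developed later in the slides, and the highest-scoring model wins. The two toy models and their priors are made-up values for illustration only.

import numpy as np

def likelihood(model, obs):
    """P(V^T | model) via the forward recursion (see the evaluation slides)."""
    A, B, prior = model            # transition, emission, prior probabilities
    alpha = prior * B[:, obs[0]]
    for v in obs[1:]:
        alpha = (alpha @ A) * B[:, v]
    return alpha.sum()

# Two hypothetical candidate HMMs (values chosen arbitrarily for illustration)
models = {
    "model_1": (np.array([[0.7, 0.3], [0.4, 0.6]]),
                np.array([[0.9, 0.1], [0.2, 0.8]]),
                np.array([0.6, 0.4])),
    "model_2": (np.array([[0.5, 0.5], [0.5, 0.5]]),
                np.array([[0.5, 0.5], [0.5, 0.5]]),
                np.array([0.5, 0.5])),
}
model_priors = {"model_1": 0.5, "model_2": 0.5}

obs = [0, 1, 1, 0]                 # an observed sequence of visible symbols
best = max(models, key=lambda m: likelihood(models[m], obs) * model_priors[m])
print(best)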
25
Main Problems in HMMs
  • Evaluation
  • Determine the probability P(VT) that a particular
    sequence of visible states VT was generated by a
    given model (based on dynamic programming).
  • Decoding
  • Given a sequence of visible states VT, determine
    the most likely sequence of hidden states ωT that
    led to those observations (based on dynamic
    programming).
  • Learning
  • Given a set of visible observations, determine
    aij and bjk (based on EM algorithm).

26
Evaluation
  • P(VT) = Σr P(VT | ωrT) P(ωrT)
  • where the sum is over all rmax = c^T possible
    sequences of hidden states ωrT of length T
    (i.e., the number of possible state sequences,
    with c the number of states).
27
Evaluation (contd)
  • P(ωrT) = Πt P(ω(t) | ω(t-1)) and
    P(VT | ωrT) = Πt P(v(t) | ω(t)), therefore
  • P(VT) = Σr Πt P(v(t) | ω(t)) P(ω(t) | ω(t-1))
  • (enumerate all possible transitions to determine
    how good the model is)
28
Example Evaluation

(enumerate all possible transitions to determine
how good the model is)
29
Computational Complexity
  • Evaluating P(VT) by direct enumeration requires
    on the order of c^T · T operations, which is
    infeasible even for small problems.
  • The forward algorithm below computes P(VT)
    recursively in O(c^2 T) operations.
30
Recursive computation of P(VT) (HMM Forward)

[Figure: trellis showing states ωi at time t connected to states ωj at time t+1, from ω(1) to ω(T), with emitted symbols v(1), ..., v(t), v(t+1), ..., v(T)]
31
Recursive computation of P(VT) (HMM Forward)
(contd)
  • Using marginalization, define
    αj(t) = P(v(1), ..., v(t), ω(t) = ωj); then
  • αj(t) = Σi P(v(1), ..., v(t), ω(t-1) = ωi, ω(t) = ωj)
           = [Σi αi(t-1) aij] bjk, with vk = v(t).
32
Recursive computation of P(VT) (HMM Forward)
(contd)
  • The recursion is initialized and computed as follows:
  • αj(t) = 0, if t = 0 and ωj is not the initial state
  • αj(t) = 1, if t = 0 and ωj is the initial state
  • αj(t) = [Σi αi(t-1) aij] bjk (with vk = v(t)), otherwise
33
Recursive computation of P(VT) (HMM Forward)
(contd)
  • The complete forward algorithm:
  • initialize t ← 0, aij, bjk, visible sequence VT, α(0)
  • for t ← t + 1
  •   for j ← 0 to c do
  •     αj(t) ← bjk [Σi αi(t-1) aij]  (with vk = v(t))
  • until t = T
  • return P(VT) ← α0(T)
  • (i.e., corresponds to the final state ω0 = ω(T))
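The same forward recursion written as a short Python sketch. It uses 0-based time indexing and no absorbing state (P(VT) is obtained by summing αj(T) over all states), which is a common variant of the algorithm on the slide; all names and the toy numbers are illustrative.

import numpy as np

def forward(A, B, prior, obs):
    """Return P(V^T) and the table alpha[t, j] = P(v(1..t), w(t) = w_j)."""
    T, c = len(obs), A.shape[0]
    alpha = np.zeros((T, c))
    alpha[0] = prior * B[:, obs[0]]                    # initialization
    for t in range(1, T):
        # alpha_j(t) = [sum_i alpha_i(t-1) * a_ij] * b_j,v(t)
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha[-1].sum(), alpha

# Toy usage with assumed two-state numbers
A = np.array([[0.6, 0.4], [0.3, 0.7]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
prior = np.array([0.5, 0.5])
p, alpha = forward(A, B, prior, [0, 1, 0])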
34
Example

[Figure: forward-algorithm trellis with states ω0, ω1, ω2, ω3 at each time step]
35
Example (contd)
  • Similarly for t = 2, 3, 4
  • Finally:

[Figure: completed trellis for the example observation sequence VT; the computation yields P(VT) = 0.00108]
36
The backward algorithm (HMM backward)
  • Define βi(t) = P(v(t+1), ..., v(T) | ω(t) = ωi),
    the probability that the model will emit the
    remaining observations given that it is in state
    ωi at time t.
  • Recursion: βi(t) = Σj aij bjk βj(t+1), with vk = v(t+1).

[Figure: trellis showing state ωi at time t, states ωj at time t+1, ..., ω(T), with symbols v(t), v(t+1), ..., v(T)]
37
The backward algorithm (HMM backward) (contd)
  • Initialization: βi(T) = 1 if ωi is the final
    (absorbing) state ω0, and 0 otherwise.
  • Then, for t = T-1, ..., 1:
    βi(t) = Σj βj(t+1) aij bjk, with vk = v(t+1).

[Figure: backward trellis from ω(t) to ω(t+1) through ω(T), with symbols v(t), v(t+1), ..., v(T)]
38
The backward algorithm (HMM backward) (contd)
  • The recursion runs backwards from t = T down to
    t = 1; P(VT) is then recovered from the β value
    of the known initial state at t = 1.
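A matching Python sketch of the backward recursion, using the same 0-based conventions as the forward sketch above (βi(T) = 1 for all states instead of an absorbing state); names and the toy numbers are illustrative.

import numpy as np

def backward(A, B, obs):
    """Return the table beta[t, i] = P(v(t+1 .. T) | w(t) = w_i)."""
    T, c = len(obs), A.shape[0]
    beta = np.ones((T, c))                             # beta_i(T) = 1
    for t in range(T - 2, -1, -1):
        # beta_i(t) = sum_j a_ij * b_j,v(t+1) * beta_j(t+1)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

# Toy usage: P(V^T) recovered as sum_i prior_i * b_i,v(1) * beta_i(1)
A = np.array([[0.6, 0.4], [0.3, 0.7]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
prior = np.array([0.5, 0.5])
obs = [0, 1, 0]
beta = backward(A, B, obs)
p = (prior * B[:, obs[0]] * beta[0]).sum()             # same P(V^T) as the forward pass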
39
Decoding
  • We need to use an optimality criterion to solve
    this problem (there are several possible ways of
    solving it since there are various optimality
    criteria we could use).
  • Algorithm 1: choose the states ω(t) which are
    individually most likely (i.e., maximize the
    expected number of correct individual states).

40
Decoding Algorithm 1 (contd)
  • For each time step t, compute the posterior
    probability γi(t) = P(ω(t) = ωi | VT)
    = αi(t) βi(t) / P(VT)
    and choose the state ωi that maximizes it.
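A short Python sketch of Algorithm 1 (individually most likely states). It recomputes αi(t) and βi(t) inline, using the same recursions as the earlier sketches, so the example is self-contained; all names are illustrative.

import numpy as np

def posterior_decode(A, B, prior, obs):
    """Return, for each t, the state maximizing P(w(t) = w_i | V^T)."""
    T, c = len(obs), A.shape[0]
    alpha = np.zeros((T, c))
    beta = np.ones((T, c))
    alpha[0] = prior * B[:, obs[0]]
    for t in range(1, T):                              # forward pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):                     # backward pass
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta                               # proportional to P(w(t)=w_i, V^T)
    gamma /= gamma.sum(axis=1, keepdims=True)          # gamma_i(t) = alpha_i beta_i / P(V^T)
    return gamma.argmax(axis=1)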
41
Decoding Algorithm 2
  • Algorithm 2: at each time step t, find the state
    that has the highest probability αi(t).
  • It uses the forward algorithm with minor changes.

42
Decoding Algorithm 2 (contd)


43
Decoding Algorithm 2 (contd)

44
Decoding Algorithm 2 (contd)
  • There is no guarantee that the path is a valid
    one.
  • The path might imply a transition that is not
    allowed by the model.


[Figure: example path selected by Algorithm 2; one of its steps implies a transition that is not allowed because a32 = 0]
45
Decoding Algorithm 3
  • Algorithm 3: find the single best sequence of
    hidden states as a whole, i.e., the ωT that
    maximizes P(ωT | VT).
  • This avoids the invalid-path problem of Algorithm
    2: the forward sum over i is replaced by a max,
    and back-pointers are kept so that the best path
    can be recovered (Viterbi-style decoding).
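The remaining slides for Algorithm 3 were not transcribed. Assuming it is the standard best-path (Viterbi-style) decoder sketched above, a minimal Python version could look like this; all names are illustrative.

import numpy as np

def viterbi(A, B, prior, obs):
    """Return the single most likely state sequence for obs (best-path decoding)."""
    T, c = len(obs), A.shape[0]
    delta = np.zeros((T, c))           # delta[t, j] = prob. of best path ending in w_j at t
    psi = np.zeros((T, c), dtype=int)  # back-pointers to the best predecessor state
    delta[0] = prior * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A           # scores[i, j] = delta_i(t-1) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    # Backtrack from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]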
46
Decoding Algorithm 3 (contd)

47
Decoding Algorithm 3 (contd)

48
Decoding Algorithm 3 (contd)

49
Learning
  • Use EM.
  • Update the parameters aij and bjk iteratively to
    better explain the observed training sequences.

50
Learning (contd)
  • Idea: start from an initial guess of aij and bjk,
    compute the expected number of times each
    transition and emission is used on the training
    data, re-estimate aij and bjk from these
    expectations, and repeat until convergence.
51
Learning (contd)
  • Define the probability of transitioning from ωi
    to ωj at step t, given VT:
  • γij(t) = P(ω(t-1) = ωi, ω(t) = ωj | VT)
           = αi(t-1) aij bjk βj(t) / P(VT), with vk = v(t)
  • (expectation step)
52
Learning (contd)
  • Summing γij(t) over t gives the expected number
    of transitions from ωi to ωj; summing over both
    j and t gives the expected total number of
    transitions out of ωi.
53
Learning (contd)
  • Re-estimate the transition probabilities:
  • âij = [Σt γij(t)] / [Σt Σk γik(t)]
  • (maximization step)
54
Learning (contd)
  • Re-estimate the observation probabilities:
  • b̂jk = [Σ over t with v(t)=vk of Σi γij(t)] / [Σt Σi γij(t)]
  • (maximization step)
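Putting the expectation and maximization steps together, here is a compact Python sketch of a single Baum-Welch (EM) iteration for one training sequence. It recomputes the forward/backward tables inline; the single-sequence restriction and all names are simplifying assumptions, not the slides' exact formulation.

import numpy as np

def baum_welch_step(A, B, prior, obs):
    """One EM iteration: returns re-estimated (A, B) for a single observation sequence."""
    obs = np.asarray(obs)
    T, c = len(obs), A.shape[0]

    # Forward and backward tables (same recursions as the earlier sketches)
    alpha = np.zeros((T, c)); beta = np.ones((T, c))
    alpha[0] = prior * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    p_obs = alpha[-1].sum()                      # P(V^T) under the current model

    # E-step: xi[t, i, j] = P(w(t) = w_i, w(t+1) = w_j | V^T)
    xi = np.zeros((T - 1, c, c))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]
    xi /= p_obs
    gamma = alpha * beta / p_obs                 # gamma[t, i] = P(w(t) = w_i | V^T)

    # M-step: expected counts -> new transition and emission probabilities
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        B_new[:, k] = gamma[obs == k].sum(axis=0) / gamma.sum(axis=0)
    return A_new, B_new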
55
Difficulties
  • How do we decide on the number of states and the
    structure of the model?
  • Use domain knowledge; otherwise it is a very hard
    problem!
  • What about the length of the observation
    sequences?
  • They should be sufficiently long to guarantee that
    all state transitions appear a sufficient number
    of times.
  • A large amount of training data is necessary to
    learn the HMM parameters.