Hidden Markov Models (HMM) in Sequence Analysis

Slides: 34
Provided by: sch17

Transcript and Presenter's Notes

1
Hidden Markov Models (HMM) in Sequence Analysis
2
Current Applications
  • Multiple Sequence Alignment
  • PFAM: protein families database of alignments
    and HMMs
  • HMMpro at www.netid.com (Baldi, Chauvin and
    Mittal-Henkle)
  • HMMER at hmmer.wustl.edu (S. Eddy)
  • SAM (Karplus et al.) at www.cse.ucsc.edu/research/
    compbio/sam.html
  • Gene finding (GLIMMER)
  • Motif/Promoter region finding

3
Markov chains
  • The Basics

4
A Markov Chain of Weather
5
A Markov chain
For all i: P(x_i | x_1, …, x_{i-1}) = P(x_i | x_{i-1}).
The current state of the chain depends only on the
previous state, not on anything earlier (no memory).
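The Markov property above can be sketched with a tiny simulation of the weather chain. The deck's actual weather diagram is not in the transcript, so the states and transition values below are illustrative assumptions.

```python
import random

# Hypothetical 3-state weather Markov chain; these transition values
# are assumptions, not the figures from the original slide.
states = ["sunny", "cloudy", "rainy"]
trans = {
    "sunny":  {"sunny": 0.7, "cloudy": 0.2, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}

def simulate(start, n, rng=random.Random(0)):
    """Generate n states; each depends only on the previous one."""
    chain = [start]
    for _ in range(n - 1):
        prev = chain[-1]
        nxt = rng.choices(states, weights=[trans[prev][s] for s in states])[0]
        chain.append(nxt)
    return chain

print(simulate("sunny", 5))
```

Each step samples the next state from the row of the transition matrix indexed by the current state, which is exactly the "no memory" property.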
7
Weather and Seaweed
Hidden states: the (true) states of a system that
may be described by a Markov process (e.g., the
weather). Observable states (symbols): the states
of the process that are 'visible' (e.g., seaweed
dampness).
8
Emission Probability
Output matrix (emission probabilities): the
probability of observing a particular observable
symbol given that the hidden model is in a
particular hidden state.
Initial distribution: the probability of the
(hidden) model being in a particular hidden state
at time t = 1.
State transition matrix: the probability of each
hidden state given the previous hidden state.
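The three parameter sets just defined can be written out for the weather/seaweed example. The deck's description mentions the symbols dry, dryish, damp and soggy; all numeric values below are illustrative assumptions.

```python
# The three HMM parameter sets from the slide, for the weather/seaweed
# example. Every number here is an illustrative assumption.
hidden = ["sunny", "rainy"]
symbols = ["dry", "dryish", "damp", "soggy"]

# Initial distribution: P(hidden state at time t = 1)
pi = {"sunny": 0.6, "rainy": 0.4}

# State transition matrix: P(state_t | state_{t-1})
A = {"sunny": {"sunny": 0.8, "rainy": 0.2},
     "rainy": {"sunny": 0.3, "rainy": 0.7}}

# Emission matrix: P(observed symbol | hidden state)
E = {"sunny": {"dry": 0.5, "dryish": 0.25, "damp": 0.15, "soggy": 0.1},
     "rainy": {"dry": 0.05, "dryish": 0.15, "damp": 0.35, "soggy": 0.45}}
```

Each of the three objects is a probability distribution (or one distribution per hidden state), so every row must sum to 1.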
11
Ex2. HMM for Sequence Alignment
An HMM with 3 states that emits residue pairs:
(M)atch: emit an aligned pair.
(D)elete1: emit a residue in seq. 1 and a gap in seq. 2.
(D)elete2: the converse of D1.
Emission prob.: if in M, emit the same letter with
probability 0.24 (for each letter); if in D1 or D2,
emit all letters uniformly.

Transition matrix A:
      M     D1    D2
M     0.9   0.05  0.05
D1    0.95  0.05  0
D2    0.95  0     0.05
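A state path through this alignment HMM can be scored by multiplying emission and transition terms. This is a sketch: starting in state M with probability 1 is an assumption (the deck does not give a begin state), and the 1/20 uniform emission assumes a 20-letter amino-acid alphabet.

```python
# Transition matrix from the slide's table.
A = {"M":  {"M": 0.9,  "D1": 0.05, "D2": 0.05},
     "D1": {"M": 0.95, "D1": 0.05, "D2": 0.0},
     "D2": {"M": 0.95, "D1": 0.0,  "D2": 0.05}}

# Per the slide: M emits an identical aligned pair with prob 0.24
# per letter; D1/D2 emit one residue uniformly (1/20 assumed).
def emit(state):
    return 0.24 if state == "M" else 1 / 20

def path_prob(path):
    """Multiply emission and transition terms along the state path."""
    p = emit(path[0])   # assumption: the path begins in its first state
    for prev, cur in zip(path, path[1:]):
        p *= A[prev][cur] * emit(cur)
    return p

print(path_prob(["M", "M", "D1", "M"]))
```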
12
Ex3. HMM for gene finding
An HMM for unspliced genes. x = non-coding DNA,
c = coding state.
13
HMM Illustration
  • An Occasionally Dishonest Casino

15
A sequence of rolls by the casino player:
6 4 6 6 6 1 6 6 6 6 6 3 1 6 3 1 6 1 6 6 6 6
18
Question 1: Evaluation

GIVEN: a sequence of rolls by the casino player:
6 4 6 6 6 1 6 6 6 6 6 3 1 6 3 1 6 1 6 6 6 6

QUESTION: how likely is this sequence, given our
model of how the casino works?

This is the EVALUATION problem in HMMs.
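The evaluation problem is solved by the forward algorithm, which sums P(x, π) over all state paths. A sketch for the casino model; the parameters (stay 0.95, switch 0.05, loaded die showing 6 half the time and other faces 1/10) match the deck's later worked example but should be read as assumptions.

```python
# Forward algorithm: P(x | M) for the occasionally dishonest casino.
states = ["F", "L"]                       # F = fair die, L = loaded die
init = {"F": 0.5, "L": 0.5}               # assumed uniform start
A = {"F": {"F": 0.95, "L": 0.05},         # assumed stay/switch probs
     "L": {"F": 0.05, "L": 0.95}}

def emit(state, roll):
    if state == "F":
        return 1 / 6                      # fair die: uniform
    return 1 / 2 if roll == 6 else 1 / 10 # loaded die favours 6

def forward(rolls):
    """Sum over all state paths: P(x | M) = sum over pi of P(x, pi)."""
    f = {s: init[s] * emit(s, rolls[0]) for s in states}
    for roll in rolls[1:]:
        f = {s: emit(s, roll) * sum(f[t] * A[t][s] for t in states)
             for s in states}
    return sum(f.values())

rolls = [6, 4, 6, 6, 6, 1, 6, 6, 6, 6, 6, 3, 1, 6, 3, 1, 6, 1, 6, 6, 6, 6]
print(forward(rolls))
```

The recursion reuses the summed probabilities from the previous position, so the cost is linear in the sequence length rather than exponential in the number of paths.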
20
Question 3: Learning

GIVEN: a sequence of rolls by the casino player:
1 2 4 5 5 2 6 4 6 2 1 4 6 1 4 6 1 3 6 1 3 6 6 6
1 6 6 4 6 6 1 6 3 6 6 1 6 3 6 6 1 6 3 6 1 6 5 1
5 6 1 5 1 1 5 1 4 6 1 2 3 5 6 2 3 4 4

QUESTION: How loaded is the loaded die? How fair
is the fair die? How often does the casino player
change from fair to loaded, and back?

This is the LEARNING question in HMMs.
Lecture 4, Thursday April 10, 2003
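When the state path is hidden, this learning question is typically answered with the Baum-Welch (EM) algorithm. If a labeled state path were available, maximum-likelihood estimates are simply normalized counts; the sketch below shows that supervised counting case, with invented labels.

```python
from collections import Counter

# Supervised parameter estimation by counting (NOT Baum-Welch): the
# labeled rolls and states below are invented for illustration.
rolls  = [6, 6, 6, 2, 3, 1, 6, 6]
states = ["L", "L", "L", "F", "F", "F", "L", "L"]

trans = Counter(zip(states, states[1:]))  # counts of (from, to) pairs
emis  = Counter(zip(states, rolls))       # counts of (state, symbol)

def transition_prob(a, b):
    """ML estimate a_ab = count(a -> b) / count(a -> anything)."""
    total = sum(c for (s, _), c in trans.items() if s == a)
    return trans[(a, b)] / total

def emission_prob(s, x):
    """ML estimate e_s(x) = count(s emits x) / count(s emits anything)."""
    total = sum(c for (t, _), c in emis.items() if t == s)
    return emis[(s, x)] / total

print(transition_prob("L", "F"))  # how often the player switches away
print(emission_prob("L", 6))      # how loaded is the loaded die?
```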
24
The three main questions on HMMs

1. Evaluation
GIVEN an HMM M and a sequence x,
FIND Prob[x | M]

2. Decoding
GIVEN an HMM M and a sequence x,
FIND the sequence π of states that maximizes P[x, π | M]

3. Learning
GIVEN an HMM M with unspecified transition/emission
probabilities, and a sequence x,
FIND parameters θ = (e_i(.), a_ij) that maximize P[x | θ]
25
(Diagram: fair/loaded die HMM; transition
probabilities 0.90 and 0.10.)
26
Transition / Emission Probability
  • Hidden state space: S = {0 (fair), 1 (loaded)}
  • Observable symbols: {1, 2, 3, 4, 5, 6}
  • At position i in a sequence of N tosses, the
    transition prob is a_kl = P(π_i = l | π_{i-1} = k)
  • The emission prob is e_k(b) = P(x_i = b | π_i = k)

27
  • Joint probability of a sequence x of length N and
    a corresponding state sequence (a parse) π:
    P(x, π) = a_{0,π_1} ∏_{i=1..N} e_{π_i}(x_i) a_{π_i,π_{i+1}}
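With this product form, the joint probability of one parse is a handful of multiplications. A sketch for the casino model; the parameters (uniform start, stay 0.95, loaded die showing 6 half the time and other faces 1/10) are assumptions consistent with the deck's worked example.

```python
# P(x, pi) for the casino HMM: product of initial, emission and
# transition terms. Parameters are assumptions (see lead-in).
init = {"F": 0.5, "L": 0.5}
A = {"F": {"F": 0.95, "L": 0.05}, "L": {"F": 0.05, "L": 0.95}}

def emit(state, roll):
    return 1 / 6 if state == "F" else (1 / 2 if roll == 6 else 1 / 10)

def joint_prob(rolls, parse):
    """P(x, pi) = init[pi_1] * prod_i e_{pi_i}(x_i) * a_{pi_{i-1}, pi_i}."""
    p = init[parse[0]] * emit(parse[0], rolls[0])
    for i in range(1, len(rolls)):
        p *= A[parse[i - 1]][parse[i]] * emit(parse[i], rolls[i])
    return p

print(joint_prob([6, 6], ["L", "L"]))
```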

28
The Optimal State Path
  • The state sequence that maximizes the joint
    probability (the most probable path given the
    observed sequence): π* = argmax_π P(x, π)

29
Finding the Optimal Path
  • The joint probability is multiplicative
  • ⇒ The log joint probability is additive

30
Finding the Optimal Path
  • The joint probability is multiplicative
  • ⇒ The log joint probability is additive
  • Use dynamic programming!
  • The Viterbi Algorithm
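A minimal Viterbi sketch for the casino model, working in log space so the scores add as the slide suggests. The parameters (uniform start, stay 0.95, loaded die showing 6 half the time and other faces 1/10) are assumptions consistent with the deck's worked example.

```python
from math import log

# Viterbi decoding for the occasionally dishonest casino (assumed params).
states = ["F", "L"]
init = {"F": 0.5, "L": 0.5}
A = {"F": {"F": 0.95, "L": 0.05}, "L": {"F": 0.05, "L": 0.95}}

def emit(state, roll):
    return 1 / 6 if state == "F" else (1 / 2 if roll == 6 else 1 / 10)

def viterbi(rolls):
    """Return the most probable state path pi* = argmax_pi P(x, pi)."""
    v = {s: log(init[s]) + log(emit(s, rolls[0])) for s in states}
    back = []                               # back-pointers for traceback
    for roll in rolls[1:]:
        ptr, nv = {}, {}
        for s in states:
            prev = max(states, key=lambda t: v[t] + log(A[t][s]))
            ptr[s] = prev
            nv[s] = v[prev] + log(A[prev][s]) + log(emit(s, roll))
        back.append(ptr)
        v = nv
    path = [max(states, key=lambda s: v[s])]  # best final state
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

print(viterbi([3, 1, 5, 1, 1, 6, 6, 6, 6, 1]))
```

Dynamic programming keeps one best score per state per position, so the cost is O(N × |S|²) instead of exponential in N.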

31
Deriving the Viterbi Algorithm
  • Denote the maximum joint probability of x and π up
    to position i that ends with hidden state π_i = k
    by v_k(i)
  • The corresponding best parse ending with state k
    is recorded alongside it (for traceback)

32
Deriving the Viterbi Algorithm
  • Denote the maximum joint prob of x and p up to
    position N that ends with by
  • Or (3.8) in Durbins book pp 55.

33
A short sequence of rolls
3 1 5 1 1 6 6 6 6 1
Fair:   v(1) = 0.5 × 1/6 = 1/12;  v(2) = 1/12 × 0.95 × 1/6 ≈ 0.013; …
Loaded: v(1) = 0.5 × 1/10 = 1/20; v(2) = 1/20 × 0.95 × 1/10 = 0.00475; …
34
Optimal Path of fair/loaded die