Hidden Markov Models (HMM) in Sequence Analysis

Provided by: sch17

1
Hidden Markov Models (HMM) in Sequence Analysis
2
Current Applications

Multiple Sequence Alignment
PFAM: Protein families database of alignments and HMMs
HMMpro at www.netid.com (Baldi, Chauvin and Mittal-Henkle)
HMMER at hmmer.wustl.edu (S. Eddy)
SAM (Karplus et al.) at www.cse.ucsc.edu/research/compbio/sam.html
Gene finding (GLIMMER)
Motif/Promoter region finding

3
Markov chains

The Basics

4
A Markov Chain of Weather
5
A Markov chain
For all L, $P(x_L \mid x_{L-1}, \ldots, x_1) = P(x_L \mid x_{L-1})$
The current state of the chain depends only on the previous state, not on the earlier history (no memory)
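Because of the no-memory property, the probability of a whole state sequence factorizes into an initial term times one-step transitions. A minimal Python sketch of that product, assuming a three-state weather chain; the state names and every probability below are illustrative placeholders, since the slide's own values were not transcribed:

```python
# Hypothetical weather chain: names and numbers are illustrative assumptions.
INITIAL = {"sunny": 0.5, "cloudy": 0.3, "rainy": 0.2}
TRANSITION = {
    "sunny":  {"sunny": 0.6, "cloudy": 0.3, "rainy": 0.1},
    "cloudy": {"sunny": 0.3, "cloudy": 0.4, "rainy": 0.3},
    "rainy":  {"sunny": 0.2, "cloudy": 0.3, "rainy": 0.5},
}


def chain_probability(path):
    """P(x_1..x_L) = P(x_1) * product over i >= 2 of P(x_i | x_{i-1})."""
    prob = INITIAL[path[0]]
    for prev, cur in zip(path, path[1:]):
        prob *= TRANSITION[prev][cur]
    return prob


# 0.5 (start sunny) * 0.6 (sunny -> sunny) * 0.1 (sunny -> rainy)
print(chain_probability(["sunny", "sunny", "rainy"]))
```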
6
(No Transcript)
7
Weather and Seaweed
Hidden states: the (true) states of a system that may be described by a Markov process (e.g., the weather).
Observable states (symbols): the states of the process that are 'visible' (e.g., seaweed dampness: dry, dryish, damp, soggy).
8
Emission Probability
Output matrix (emission probability): contains the probability of observing a particular observable state given that the hidden model is in a particular hidden state.
Initial distribution: contains the probability of the (hidden) model being in a particular hidden state at time t = 1.
State transition matrix: holds the probability of a hidden state given the previous hidden state.
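The three ingredients (initial distribution, state transition matrix, output matrix) are enough to generate data from an HMM. The following is a rough Python sketch for the weather/seaweed example; the hidden-state names and all numeric values are assumptions made for illustration, not values from the slides:

```python
import random

# All names and probabilities here are illustrative assumptions.
HIDDEN = ("sunny", "cloudy", "rainy")
OBSERVED = ("dry", "dryish", "damp", "soggy")
INITIAL = [0.6, 0.3, 0.1]                 # P(hidden state at t = 1)
TRANSITION = [                            # row: previous state, column: next state
    [0.50, 0.30, 0.20],
    [0.25, 0.50, 0.25],
    [0.20, 0.30, 0.50],
]
EMISSION = [                              # row: hidden state, column: observed symbol
    [0.60, 0.20, 0.15, 0.05],
    [0.25, 0.25, 0.25, 0.25],
    [0.05, 0.10, 0.35, 0.50],
]


def sample_hmm(length, rng):
    """Draw a hidden weather path and the seaweed symbols it emits."""
    states, symbols = [], []
    state = rng.choices(range(len(HIDDEN)), weights=INITIAL)[0]
    for _ in range(length):
        states.append(HIDDEN[state])
        symbol = rng.choices(range(len(OBSERVED)), weights=EMISSION[state])[0]
        symbols.append(OBSERVED[symbol])
        state = rng.choices(range(len(HIDDEN)), weights=TRANSITION[state])[0]
    return states, symbols


weather, seaweed = sample_hmm(7, random.Random(0))
print(weather)   # hidden: an observer would not see this
print(seaweed)   # visible: the only data an observer gets
```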
9
(No Transcript)
10
(No Transcript)
11
Ex2.HMM for Sequence Alignment
HMM with 3 states that emits residue pairs:
(M)atch: emit an aligned pair
(D)elete1: emit a residue in seq. 1 and a gap in seq. 2
(D)elete2: the converse of D1
Emission prob.: if in M, emit the same letter with probability 0.24 (for each letter); if in D1 or D2, emit all letters uniformly.

M   0.9
D1  0.05
D2  0.05

Transition matrix A
      M      D1     D2
M     0.9    0.05   0.05
D1    0.95   0.05   0
D2    0.95   0      0.05
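To make the pair HMM concrete, the joint probability of one gapped alignment can be computed by walking its columns. This is only a sketch under several assumptions that the slide does not state: a four-letter alphabet (consistent with 4 × 0.24 < 1), the unlabeled values 0.9 / 0.05 / 0.05 read as start probabilities, and the leftover 0.04 of match-state mass spread evenly over the 12 mismatched pairs:

```python
ALPHABET = "ACGT"                                   # assumption: 4-letter alphabet
INITIAL = {"M": 0.9, "D1": 0.05, "D2": 0.05}        # assumption: start probabilities
TRANSITION = {
    "M":  {"M": 0.9,  "D1": 0.05, "D2": 0.05},
    "D1": {"M": 0.95, "D1": 0.05, "D2": 0.0},
    "D2": {"M": 0.95, "D1": 0.0,  "D2": 0.05},
}
MISMATCH = (1.0 - 4 * 0.24) / 12                    # assumption: uniform over mismatches


def emission(state, a, b):
    """Probability that `state` emits the column (a, b)."""
    if state == "M":
        return 0.24 if a == b else MISMATCH
    return 1.0 / len(ALPHABET)                      # D1 / D2: uniform over letters


def alignment_probability(row1, row2):
    """Joint probability of an alignment given as two equal-length gapped rows."""
    prob, prev = 1.0, None
    for a, b in zip(row1, row2):
        state = "D1" if b == "-" else "D2" if a == "-" else "M"
        prob *= INITIAL[state] if prev is None else TRANSITION[prev][state]
        prob *= emission(state, a, b)
        prev = state
    return prob


# State path for this alignment: M, D1, M, D2, M
print(alignment_probability("ACG-T", "A-GCT"))
```

Note that the zero entries in A forbid a D1 column directly followed by a D2 column (and vice versa), so such alignments get probability 0.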
12
Ex3. HMM for gene finding
An HMM for unspliced genes. x: non-coding DNA; c: coding state.
13
HMM Illustration

An Occasionally Dishonest Casino

14
(No Transcript)
15
A sequence of rolls by the casino player
1245526462146146136136661664661636616366163616515615115146123562344
16
(No Transcript)
17
(No Transcript)
18
Question 1

Evaluation
GIVEN
A sequence of rolls by the casino player
1245526462146146136136661664661636616366163616515615115146123562344
QUESTION
How likely is this sequence, given our model of how the casino works?
This is the EVALUATION problem in HMMs
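The evaluation question is usually answered with the forward algorithm, which sums over all hidden paths without enumerating them. A minimal Python sketch, assuming the casino parameters implied by the arithmetic on the short-roll slide (start 0.5/0.5, stay 0.95, switch 0.05, fair die 1/6 per face, loaded die 1/2 for a six and 1/10 otherwise); the function names are hypothetical, and the brute-force sum is included only as a cross-check:

```python
from itertools import product

STATES = ("F", "L")                                  # F = fair, L = loaded
INITIAL = {"F": 0.5, "L": 0.5}                       # assumed start probabilities
TRANSITION = {"F": {"F": 0.95, "L": 0.05},
              "L": {"F": 0.05, "L": 0.95}}           # assumed switching rates


def emit(state, roll):
    """Emission probability of `roll` from the fair or the loaded die."""
    if state == "F":
        return 1 / 6
    return 0.5 if roll == 6 else 0.1


def forward(rolls):
    """P(rolls | model), summed over all hidden paths in O(N * K^2) time."""
    f = {k: INITIAL[k] * emit(k, rolls[0]) for k in STATES}
    for roll in rolls[1:]:
        f = {l: emit(l, roll) * sum(f[k] * TRANSITION[k][l] for k in STATES)
             for l in STATES}
    return sum(f.values())


def brute_force(rolls):
    """Same quantity by enumerating all K^N paths (feasible only for short N)."""
    total = 0.0
    for path in product(STATES, repeat=len(rolls)):
        p = INITIAL[path[0]] * emit(path[0], rolls[0])
        for i in range(1, len(rolls)):
            p *= TRANSITION[path[i - 1]][path[i]] * emit(path[i], rolls[i])
        total += p
    return total


rolls = [3, 1, 5, 1, 1, 6, 6, 6, 6, 1]
print(forward(rolls), brute_force(rolls))            # the two should agree
```

For long sequences the raw products underflow, so a practical version would rescale each column or work with logarithms.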
19
(No Transcript)
20
Question 3

Learning
GIVEN
A sequence of rolls by the casino player
1245526462146146136136661664661636616366163616515615115146123562344
QUESTION
How loaded is the loaded die? How fair is the fair die? How often does the casino player change from fair to loaded, and back?
This is the LEARNING question in HMMs
Lecture 4, Thursday April 10, 2003
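One standard way to attack the learning question, though the slide does not name it, is Baum-Welch (expectation-maximization): compute expected transition and emission counts with forward and backward passes, renormalize, and repeat. The sketch below is an unscaled pure-Python version that is adequate for the 67 rolls above; the starting guess and all function names are assumptions, and the result is a local optimum that depends on that guess:

```python
ROLLS = [int(c) for c in
         "1245526462146146136136661664661636616366163616515615115146123562344"]
STATES = (0, 1)            # hypothetical labels: 0 = "fair-like", 1 = "loaded-like"
SYMBOLS = range(1, 7)


def forward(x, init, a, e):
    """f[i][k] = P(x_1..x_i, state_i = k)."""
    f = [[init[k] * e[k][x[0]] for k in STATES]]
    for sym in x[1:]:
        prev = f[-1]
        f.append([e[l][sym] * sum(prev[k] * a[k][l] for k in STATES)
                  for l in STATES])
    return f


def backward(x, a, e):
    """b[i][k] = P(x_{i+1}..x_N | state_i = k)."""
    b = [[1.0 for _ in STATES]]
    for sym in reversed(x[1:]):
        nxt = b[0]
        b.insert(0, [sum(a[k][l] * e[l][sym] * nxt[l] for l in STATES)
                     for k in STATES])
    return b


def baum_welch_step(x, init, a, e):
    """One EM update; also returns P(x) under the parameters passed in."""
    f, b = forward(x, init, a, e), backward(x, a, e)
    px, n = sum(f[-1]), len(x)
    # Expected counts (the common 1/px factor cancels when rows are normalized).
    trans = [[sum(f[i][k] * a[k][l] * e[l][x[i + 1]] * b[i + 1][l]
                  for i in range(n - 1)) for l in STATES] for k in STATES]
    emis = [{s: sum(f[i][k] * b[i][k] for i in range(n) if x[i] == s)
             for s in SYMBOLS} for k in STATES]
    new_init = [f[0][k] * b[0][k] / px for k in STATES]
    new_a = [[trans[k][l] / sum(trans[k]) for l in STATES] for k in STATES]
    new_e = [{s: emis[k][s] / sum(emis[k].values()) for s in SYMBOLS}
             for k in STATES]
    return new_init, new_a, new_e, px


# Assumed starting guess: sticky states, second die mildly biased toward six.
init = [0.5, 0.5]
a = [[0.9, 0.1], [0.1, 0.9]]
e = [{s: 1 / 6 for s in SYMBOLS}, {s: (0.25 if s == 6 else 0.15) for s in SYMBOLS}]
likelihoods = []
for _ in range(20):
    init, a, e, px = baum_welch_step(ROLLS, init, a, e)
    likelihoods.append(px)

print("P(six | state 1) after training:", e[1][6])
print("transition matrix after training:", a)
```

Each iteration cannot decrease P(x), which is a handy sanity check on an implementation. With a single short sequence the estimates are noisy, so they should be read as an illustration of the procedure rather than as the casino's true parameters.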
21
(No Transcript)
22
(No Transcript)
23
(No Transcript)
24
The three main questions on HMMs
1. Evaluation
GIVEN a HMM M, and a sequence x,
FIND $P[x \mid M]$
2. Decoding
GIVEN a HMM M, and a sequence x,
FIND the sequence $\pi$ of states that maximizes $P[x, \pi \mid M]$
3. Learning
GIVEN a HMM M, with unspecified transition/emission probs., and a sequence x,
FIND parameters $\theta = (e_i(\cdot), a_{ij})$ that maximize $P[x \mid \theta]$
25
0.90
0.10
26
Transition / Emission Probability

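Hidden state space: $S = \{0 \text{ (fair)}, 1 \text{ (loaded)}\}$
Observable symbols: $\Sigma = \{1, 2, 3, 4, 5, 6\}$
At position i in a sequence of N tosses,
transition prob: $a_{kl} = P(\pi_i = l \mid \pi_{i-1} = k)$
emission prob: $e_k(b) = P(x_i = b \mid \pi_i = k)$

Joint probability of a sequence x of length N and a corresponding state sequence (a parse) $\pi$:
$P(x, \pi) = P(\pi_1)\, e_{\pi_1}(x_1) \prod_{i=2}^{N} a_{\pi_{i-1}\pi_i}\, e_{\pi_i}(x_i)$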
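The joint probability of a roll sequence and one particular parse is a product of one start term, N emissions and N - 1 transitions. A short Python sketch, again assuming the casino parameters implied by the short-roll slide (start 0.5/0.5, stay 0.95, switch 0.05, fair 1/6, loaded 1/2 for a six and 1/10 otherwise); `joint_probability` is a hypothetical helper name:

```python
INITIAL = {"F": 0.5, "L": 0.5}                       # assumed start probabilities
TRANSITION = {"F": {"F": 0.95, "L": 0.05},
              "L": {"F": 0.05, "L": 0.95}}           # assumed switching rates


def emit(state, roll):
    """Emission probability of `roll` from the fair (F) or loaded (L) die."""
    if state == "F":
        return 1 / 6
    return 0.5 if roll == 6 else 0.1


def joint_probability(rolls, parse):
    """P(x, parse): start term times alternating transitions and emissions."""
    prob = INITIAL[parse[0]] * emit(parse[0], rolls[0])
    for i in range(1, len(rolls)):
        prob *= TRANSITION[parse[i - 1]][parse[i]] * emit(parse[i], rolls[i])
    return prob


rolls = [3, 1, 5, 1, 1, 6, 6, 6, 6, 1]
for parse in ("FFFFFFFFFF", "FFFFFLLLLL", "LLLLLLLLLL"):
    print(parse, joint_probability(rolls, parse))
```

Comparing a handful of parses this way shows why an exhaustive search is hopeless in general: there are 2^N parses for N rolls.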

28
The Optimal State Path

The state sequence that maximizes the joint probability (equivalently, the most probable path given the observed sequence):
$\pi^* = \arg\max_{\pi} P(x, \pi)$

29
Finding the Optimal Path

The joint probability is multiplicative
⇒ The log joint probability is additive

30
Finding the Optimal Path

The joint probability is multiplicative
⇒ The log joint probability is additive
Use dynamic programming!
The Viterbi Algorithm

31
Deriving the Viterbi Algorithm

Denote the maximum joint prob. of x and $\pi$ up to position N that ends with hidden state k by $V_k(N)$;
$\pi^*_k(N)$ is the corresponding best parse ending with k

32
Deriving the Viterbi Algorithm

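Denote the maximum joint prob. of x and $\pi$ up to position N that ends with hidden state k by $V_k(N)$. With $a_{kl}$ the transition prob. from k to l and $e_l$ the emission prob. in state l,
$V_l(N+1) = e_l(x_{N+1}) \max_k \left[ V_k(N)\, a_{kl} \right]$
or (3.8) in Durbin's book, p. 55.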
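The recursion translates directly into a dynamic program. The sketch below works in log space, so products become sums, and keeps back-pointers to recover the best parse. It assumes the casino parameters implied by the short-roll slide (start 0.5/0.5, stay 0.95, switch 0.05, fair 1/6, loaded 1/2 for a six and 1/10 otherwise); function names are hypothetical:

```python
import math

STATES = ("F", "L")                                   # F = fair, L = loaded
LOG_INIT = {k: math.log(0.5) for k in STATES}         # assumed start probabilities
LOG_TRANS = {k: {l: math.log(0.95 if k == l else 0.05) for l in STATES}
             for k in STATES}                         # assumed switching rates


def log_emit(state, roll):
    """Log emission probability for the fair or the loaded die."""
    if state == "F":
        return math.log(1 / 6)
    return math.log(0.5 if roll == 6 else 0.1)


def viterbi(rolls):
    """Return (best parse as a string, its log joint probability)."""
    v = {k: LOG_INIT[k] + log_emit(k, rolls[0]) for k in STATES}
    back = []                                         # one pointer dict per step
    for roll in rolls[1:]:
        ptr, nxt = {}, {}
        for l in STATES:
            best = max(STATES, key=lambda k: v[k] + LOG_TRANS[k][l])
            ptr[l] = best
            nxt[l] = log_emit(l, roll) + v[best] + LOG_TRANS[best][l]
        back.append(ptr)
        v = nxt
    last = max(STATES, key=lambda k: v[k])
    path = [last]
    for ptr in reversed(back):                        # trace pointers backwards
        path.append(ptr[path[-1]])
    return "".join(reversed(path)), v[last]


path, logp = viterbi([3, 1, 5, 1, 1, 6, 6, 6, 6, 1])
print(path, math.exp(logp))
```

The cost is O(N · K²) for N rolls and K hidden states, versus K^N parses for exhaustive search.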

33
A short sequence of rolls
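Rolls: 3 1 5 1 1 6 6 6 6 1
Each cell lists the candidate arriving from Fair, then the candidate arriving from Loaded (-- where the slide omits it).

Pos  Roll  Fair                                     Loaded
1    3     0.5 × 1/6 = 1/12                         0.5 × 1/10 = 1/20
2    1     1/12 × 0.95/6 = 0.013, 1/20 × 0.05/6     1/12 × 0.05/10, 0.00475
3    5     0.0021, --                               --, 0.00045
4    1     0.0003, --                               0.0021 × 0.05/10, 0.00045 × 0.95/10 = 4e-5
5    1     5e-5, --                                 3e-4 × 0.05/10 = 15e-7, 4e-6
6    6     7.5e-6, --                               5e-5 × 0.05/2 = 1.25e-6, 1e-7
7    6     1.2e-6, 2.1e-8                           1.9e-7, 6e-7
8    6     1.9e-7, 5e-9                             3e-8, 3e-7
9    6     3e-8, 2.4e-9                             --, 1.4e-7
10   1     5e-9, 1e-9                               1.5e-10, 1.3e-8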
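The table can be recomputed mechanically, which is a useful check on hand arithmetic. A small Python sketch that fills the Viterbi values in plain probability space, assuming the parameters visible in the table's own arithmetic (start 0.5/0.5, stay 0.95, switch 0.05, fair 1/6, loaded 1/2 for a six and 1/10 otherwise); `viterbi_table` is a hypothetical name:

```python
STATES = ("F", "L")                                   # F = fair, L = loaded


def emit(state, roll):
    """Emission probability of `roll` from the fair or the loaded die."""
    if state == "F":
        return 1 / 6
    return 0.5 if roll == 6 else 0.1


def trans(k, l):
    """Assumed transition probability: stay 0.95, switch 0.05."""
    return 0.95 if k == l else 0.05


def viterbi_table(rolls):
    """table[i][state] = best joint probability of a parse ending in `state`."""
    table = [{k: 0.5 * emit(k, rolls[0]) for k in STATES}]
    for roll in rolls[1:]:
        prev = table[-1]
        table.append({l: emit(l, roll) * max(prev[k] * trans(k, l) for k in STATES)
                      for l in STATES})
    return table


rolls = [3, 1, 5, 1, 1, 6, 6, 6, 6, 1]
table = viterbi_table(rolls)
for pos, (roll, column) in enumerate(zip(rolls, table), start=1):
    print(f"{pos:2d}  roll {roll}  Fair {column['F']:.3g}  Loaded {column['L']:.3g}")
```

Run on the ten rolls, this gives roughly 5.2e-9 (Fair) and 2.0e-8 (Loaded) in the last column. The Loaded row appears to part ways with the slide's entries from the sixth roll on, where 4e-6 × 0.95/2 ≈ 1.9e-6 would exceed the 1.25e-6 candidate arriving from Fair; the slide's closing value of 1.3e-8 corresponds to the parse FFFFFLLLLL.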
34
Optimal Path of fair/loaded die
