Hidden Markov Model (HMM) - PowerPoint PPT Presentation

1
Hidden Markov Model (HMM)
  • HMMs offer a more systematic approach to
    estimating model parameters
  • An HMM can be thought of as a kind of dynamic
    statistical profile
  • Like an ordinary profile, it is built by
    analyzing the distribution of amino acids in a
    training set of related proteins
  • The topology of an HMM can be visualized as a
    finite state machine

2
HMM multiple sequence alignment
  • Assume the following sequences
  • ACCG, CTG, CTG, CG
  • What is the best alignment?


ACCG
-CTG
-CTG
-C-G
3
Hidden Markov Model (HMM)
[Figure: profile HMM topology. Begin and End states flank a chain of match
states emitting A, C, C, G, with insert states and delete states alongside.
Movement from stage n to n+1 occurs with a certain transition probability.]
4
Hidden Markov Model (HMM)
  • More than one path leads to the same result

[Figure: the same profile HMM topology, illustrating that more than one
path through the states can generate the same sequence.]
5
Hidden Markov Model (HMM)
  • The probability of a given path through the
    model is obtained as the sum of loge (transition
    probabilities) along that path
  • The model is a hidden Markov model because the
    path that generated the sequence is hidden
  • Transition probabilities are obtained by training
    on a set of sequences
  • Training is initialized with estimated transition
    probabilities
  • All possible paths generating a given sequence
    are visited in proportion to the estimated
    transition probabilities
  • Counting the number of times each transition is
    visited during the above step provides improved
    transition probabilities
  • The procedure is repeated until the parameters no
    longer change significantly
  • The Viterbi algorithm is then used on the trained
    HMM to determine the best path
  • The Viterbi algorithm is a form of dynamic
    programming
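The decoding step above can be sketched with a minimal Viterbi implementation. This is an illustrative toy, not the slides' profile HMM: the two states (a "match" state M and an "insert" state I), the alphabet, and all probabilities below are invented for the example.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Best log-probability and state path for obs, via dynamic programming."""
    # v[t][s]: best log-prob of any path ending in state s after t+1 symbols.
    v = [{s: math.log(start_p[s] * emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        v.append({})
        back.append({})
        for s in states:
            # Working in log space: products of probabilities become sums of logs.
            prev = max(states, key=lambda p: v[t - 1][p] + math.log(trans_p[p][s]))
            v[t][s] = v[t - 1][prev] + math.log(trans_p[prev][s] * emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back the best path from the best final state.
    last = max(states, key=lambda s: v[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, back[t][path[0]])
    return v[-1][last], path

# Toy model: M favours emitting A, I favours emitting C (invented numbers).
states = ("M", "I")
start_p = {"M": 0.9, "I": 0.1}
trans_p = {"M": {"M": 0.8, "I": 0.2}, "I": {"M": 0.5, "I": 0.5}}
emit_p = {"M": {"A": 0.7, "C": 0.3}, "I": {"A": 0.2, "C": 0.8}}
logp, path = viterbi("AAC", states, start_p, trans_p, emit_p)
```

As the slides note, this is ordinary dynamic programming: each cell keeps only the best-scoring predecessor, so the best overall path is recovered by a single traceback.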

6
Hidden Markov Model (HMM)
  • HMM is a general technique that can be applied to
    many different problems
  • Multiple sequence alignment
  • Identification of conserved domains
  • Gene prediction
  • Protein secondary structure prediction

7
PAM log odds score
  • PAM matrices are usually converted into log odds
    matrices
  • A log odds score gives the ratio of the
    hypothesis that the change represents an
    authentic evolutionary variation to the
    hypothesis that the change occurred by random
    sequence variation (no biological significance)
  • Example: Phe -> Tyr
  • Phe-Tyr score in PAM250: 0.15
  • Frequency of Phe in the data: 0.04
  • Ratio: 0.15/0.04 = 3.75
  • log10(3.75) = 0.57
  • Tyr -> Phe: log10(0.20/0.03) = 0.83
  • Log odds score: (0.57 + 0.83)/2 x 10 = 7
    (multiplied by 10 to remove fractional values)
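The arithmetic above can be checked directly. This sketch simply reproduces the slide's example numbers (0.15, 0.04, 0.20, 0.03) for the Phe/Tyr pair in PAM250:

```python
import math

# Forward direction: Phe -> Tyr, change probability over Phe's background frequency.
phe_to_tyr = math.log10(0.15 / 0.04)  # ~ 0.57
# Reverse direction: Tyr -> Phe, change probability over Tyr's background frequency.
tyr_to_phe = math.log10(0.20 / 0.03)  # ~ 0.82
# Average the two directions, then multiply by 10 to remove fractional values.
score = round((phe_to_tyr + tyr_to_phe) / 2 * 10)
print(score)
```

Averaging the two directions makes the score symmetric, which is why a single entry serves for both Phe-Tyr and Tyr-Phe in the matrix.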

8
PAM matrices
  • In the PAM1 matrix, the summed probability of
    each amino acid changing (or staying the same)
    is 1
  • To obtain a PAM matrix for distance N, the PAM1
    matrix is multiplied by itself N times
  • PAM250 represents a level of 250% change
    (corresponding to about 20% similarity, as some
    positions may change several times, perhaps even
    reverting to the original residue)
  • Computer simulations have shown that PAM250
    provides a better-scoring alignment than
    lower-numbered PAM matrices for distantly related
    (14-27% similarity) proteins.
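The extrapolation step can be sketched with a toy matrix. The 3x3 "PAM1" below is invented for illustration; real PAM matrices are 20x20 and built from Dayhoff's mutation data.

```python
import numpy as np

# Toy 3-"residue" PAM1-style matrix: each row sums to 1, i.e. the summed
# probability of a residue changing or staying the same is 1.
pam1 = np.array([
    [0.990, 0.005, 0.005],
    [0.010, 0.980, 0.010],
    [0.005, 0.005, 0.990],
])
# A PAMN matrix is obtained by multiplying PAM1 by itself N times.
pam250 = np.linalg.matrix_power(pam1, 250)
```

After exponentiation the rows still sum to 1 (the matrix stays stochastic), while the diagonal entries shrink: positions accumulate change, including multiple hits that can revert to the original residue.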