Title: Bayesian Evolutionary Distance
1Bayesian Evolutionary Distance
- P. Agarwal and D.J. States. Bayesian evolutionary distance. Journal of Computational Biology 3(1):1-17, 1996.
2Determining time of divergence
- Goal: Determine when two aligned sequences X and Y diverged from a common ancestor
AGTTGAC
ACTTGCC
- Model:
- Mutation only
- Independence
- Markov process
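The example pair can be reduced to the two numbers that matter for an ungapped comparison: the alignment length n and the mismatch count k. A minimal sketch, assuming the second sequence reads ACTTGCC; `count_mismatches` is an illustrative helper name, not something from the source.

```python
def count_mismatches(x, y):
    """Return (n, k): alignment length and number of mismatched columns."""
    if len(x) != len(y):
        raise ValueError("an ungapped alignment needs equal-length sequences")
    k = sum(1 for a, b in zip(x, y) if a != b)
    return len(x), k

# Example pair from the slide (second sequence assumed to be ACTTGCC).
n, k = count_mismatches("AGTTGAC", "ACTTGCC")
print(n, k)  # 7 2
```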
3Divergence points have different probabilities
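[Figure: sequences X and Y descending from a common ancestor, with the probability of each divergence point plotted against time]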
4DNA PAM matrices
- Similar to Dayhoff PAM matrices
- PAM 1 corresponds to 1% mutation
- 1% change ≈ 10 million years
- Simplification: uniform mutation rates among nucleotides
- $m_{ij} = \alpha$ if $i = j$
- $m_{ij} = \beta$ if $i \neq j$
- Can modify to handle different transition/transversion rates
- Transitions (A↔G or C↔T) have higher probability than transversions
- PAM x = (PAM 1)^x
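To make the matrix construction concrete, here is a minimal sketch in plain Python. It assumes the uniform model with 1% total change per PAM unit, so the diagonal entry of PAM 1 is 0.99 and each off-diagonal entry is 0.01/3. The names `pam1`, `mat_mul`, and `pam` are illustrative, not from the source.

```python
def pam1(change=0.01):
    """Uniform PAM 1: diagonal 1 - change, off-diagonal change / 3 (assumed model)."""
    off = change / 3.0
    return [[1.0 - change if i == j else off for j in range(4)] for i in range(4)]

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(4)) for j in range(4)]
            for i in range(4)]

def pam(x, change=0.01):
    """PAM x = (PAM 1)^x by repeated squaring; x is a non-negative integer."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    base = pam1(change)
    while x > 0:
        if x & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        x >>= 1
    return result

m = pam(50)
print(round(m[0][0], 4), round(m[0][1], 4))  # match and mismatch entries at PAM 50
```

Handling different transition and transversion rates would only change `pam1`; the powering step stays the same.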
5DNA PAM 1
[Figure: DNA PAM 1 substitution matrix over the nucleotides A, C, T, G]
6DNA PAM x
[Figure: DNA PAM x substitution matrix over the nucleotides A, C, T, G]
7DNA PAM x
- As $x \to \infty$, the match entry $\alpha(x)$ and the mismatch entry $\beta(x)$ of PAM x both tend to 1/4
- Assume $p_i = 1/4$ for i = A, C, T, G
- Leads to simple match/mismatch scoring scheme
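The limit can be checked with a short derivation, sketched here under the uniform model. Write PAM 1 as $M = (\alpha - \beta) I + \beta J$, where $\alpha$ and $\beta$ are the match and mismatch entries of PAM 1 (assumed symbol names), $I$ is the identity, and $J$ is the all-ones $4 \times 4$ matrix. Because $J^2 = 4J$ and every row of $M^x$ sums to 1:

$$
\lambda = \alpha - \beta, \qquad
\alpha(x) = \tfrac{1}{4} + \tfrac{3}{4}\lambda^{x}, \qquad
\beta(x) = \tfrac{1}{4} - \tfrac{1}{4}\lambda^{x}.
$$

Since $0 < \lambda < 1$ whenever $\alpha > \beta > 0$, $\lambda^{x} \to 0$ and both entries tend to 1/4.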
8DNA PAM x Scoring
[Equation: log-odds match and mismatch scores at PAM x]
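A standard log-odds construction gives two scores, one for a match and one for a mismatch. The following is a sketch under the uniform background $p_i = 1/4$ and may differ in notation from the original slide; $\alpha(x)$ and $\beta(x)$ are assumed names for the match and mismatch entries of PAM x, and the base of the logarithm is left open.

$$
s_{\text{match}}(x) = \log\frac{p_i\,\alpha(x)}{p_i\,p_i} = \log 4\alpha(x), \qquad
s_{\text{mismatch}}(x) = \log\frac{p_i\,\beta(x)}{p_i\,p_j} = \log 4\beta(x).
$$

For small x the match score is positive and the mismatch score is strongly negative; as x grows, both scores approach 0.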
9DNA PAM
10DNA PAM n Scoring
Log-odds score of alignment of length n with k
mismatches
Odds score of same alignment
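Both scores can be computed directly from n, k, and x. The sketch below assumes the uniform model with 1% change per PAM unit, natural logarithms, and the closed form for the PAM x entries; the function names are illustrative, not from the source.

```python
import math

def pam_entries(x, change=0.01):
    """Match and mismatch probabilities (alpha(x), beta(x)) under the uniform model."""
    lam = 1.0 - 4.0 * change / 3.0   # alpha - beta for PAM 1
    return 0.25 + 0.75 * lam ** x, 0.25 - 0.25 * lam ** x

def log_odds_score(n, k, x):
    """Log-odds score of an ungapped alignment with n columns and k mismatches."""
    alpha, beta = pam_entries(x)
    return (n - k) * math.log(4.0 * alpha) + k * math.log(4.0 * beta)

def odds_score(n, k, x):
    """Odds score of the same alignment: the exponential of the log-odds score."""
    alpha, beta = pam_entries(x)
    return (4.0 * alpha) ** (n - k) * (4.0 * beta) ** k

print(log_odds_score(7, 2, 10), odds_score(7, 2, 10))
```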
11Probability of k mismatches at distance x
Note: Need odds score here, not log-odds!
12Expected evolutionary distance given k mismatches
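Expected distance: sum over all candidate distances x, each weighted by Pr(x | k)
Pr(x | k) follows from Pr(k | x) by Bayes' theorem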
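Written out, the two steps take the following form. This is a sketch assuming a discrete set of candidate distances; the likelihood $\Pr(k \mid x)$ is taken to be proportional to the odds score of the alignment, with a constant factor that does not depend on x and therefore cancels in the ratio.

$$
E[x \mid k] = \sum_{x} x \,\Pr(x \mid k), \qquad
\Pr(x \mid k) = \frac{\Pr(k \mid x)\,\Pr(x)}{\Pr(k)}, \qquad
\Pr(k) = \sum_{x'} \Pr(k \mid x')\,\Pr(x').
$$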
13Assumptions
- Consider only a finite number of values of x, e.g., 1, 10, 25, 50, etc.
- In theory, could consider any number of values
- Flat prior: All values of x are equally likely
- If M values are considered, Pr(x) = 1/M
14Calculating Pr(k) and Pr(x|k)
15Calculating the distance
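The whole calculation fits in a few lines once the odds score is available. This is a minimal sketch, assuming the uniform model with 1% change per PAM unit, a flat prior over a hypothetical grid of distances, and a likelihood proportional to the odds score; none of the function names come from the source.

```python
def pam_entries(x, change=0.01):
    """Match and mismatch probabilities at PAM x under the uniform model."""
    lam = 1.0 - 4.0 * change / 3.0
    return 0.25 + 0.75 * lam ** x, 0.25 - 0.25 * lam ** x

def odds_score(n, k, x):
    """Odds score of an ungapped alignment with n columns and k mismatches."""
    alpha, beta = pam_entries(x)
    return (4.0 * alpha) ** (n - k) * (4.0 * beta) ** k

def posterior(n, k, distances):
    """Pr(x | k) over a finite grid with a flat prior Pr(x) = 1/M."""
    prior = 1.0 / len(distances)
    joint = [odds_score(n, k, x) * prior for x in distances]
    pr_k = sum(joint)   # Pr(k), up to a constant that cancels below
    return [j / pr_k for j in joint]

def expected_distance(n, k, distances):
    """Posterior mean of x given k mismatches in n columns."""
    post = posterior(n, k, distances)
    return sum(x * p for x, p in zip(distances, post))

grid = [1, 10, 25, 50, 100, 200]   # hypothetical grid of PAM distances
print(expected_distance(100, 0, grid))    # identical sequences
print(expected_distance(100, 40, grid))   # 40% mismatches
```

For long alignments the raw odds score can overflow or underflow a float; working with log-odds and subtracting the maximum before exponentiating would be a natural refinement.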
16Ungapped local alignments
An ungapped local alignment of sequences X and Y
is a pair of equal-length substrings of X and Y
Only matches and mismatches; no gaps
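Because there are no gaps, every ungapped local alignment lies on a single diagonal of the comparison matrix, so the best one can be found by a maximum-subarray scan along each diagonal. The sketch below uses hypothetical scores of +1 for a match and -1 for a mismatch; in the PAM setting these would be replaced by the distance-dependent match and mismatch scores. `best_ungapped_local` is an illustrative name.

```python
def best_ungapped_local(x, y, match=1.0, mismatch=-1.0):
    """Return (score, i, j, length) of the highest-scoring ungapped local alignment.

    The alignment pairs x[i:i+length] with y[j:j+length].
    """
    best = (0.0, 0, 0, 0)
    # Each diagonal d = j - i holds all substring pairs with a fixed offset.
    for d in range(-(len(x) - 1), len(y)):
        i, j = (-d, 0) if d < 0 else (0, d)
        run, start_i, start_j = 0.0, i, j
        while i < len(x) and j < len(y):
            run += match if x[i] == y[j] else mismatch
            if run <= 0.0:
                # Restart: a non-positive prefix never improves a later segment.
                run, start_i, start_j = 0.0, i + 1, j + 1
            elif run > best[0]:
                best = (run, start_i, start_j, i - start_i + 1)
            i += 1
            j += 1
    return best

print(best_ungapped_local("TTACGTAA", "GGACGTCC"))  # (4.0, 2, 2, 4)
```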
17Ungapped local alignments
[Figure: two candidate ungapped local alignments, A and B]
18Which alignment is better?
Answer depends on evolutionary distance
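A small numerical comparison shows the dependence. The two alignments below are hypothetical stand-ins (they are not the alignments from the figure): A is short and exact, B is longer but contains mismatches. The sketch assumes the uniform model with 1% change per PAM unit and natural-log scores.

```python
import math

def log_odds_score(n, k, x, change=0.01):
    """Log-odds score of an ungapped alignment (n columns, k mismatches) at PAM x."""
    lam = 1.0 - 4.0 * change / 3.0
    alpha = 0.25 + 0.75 * lam ** x
    beta = 0.25 - 0.25 * lam ** x
    return (n - k) * math.log(4.0 * alpha) + k * math.log(4.0 * beta)

# Hypothetical alignments as (n, k): A is short and exact, B is longer with mismatches.
ALIGNMENTS = {"A": (10, 0), "B": (30, 6)}

def better_alignment(x):
    """Name of the alignment with the higher log-odds score at distance x."""
    return max(ALIGNMENTS, key=lambda name: log_odds_score(*ALIGNMENTS[name], x))

for x in (1, 50):
    print(x, better_alignment(x))
```

At a small distance mismatches are heavily penalized, so the short exact alignment A scores higher; at a larger distance mismatches are cheap and the longer alignment B wins.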