Title: An Evolutionary Space-Time Model with Varying Among-Site Dependencies
1An Evolutionary Space-Time Model with Varying
Among-Site Dependencies
2Evolutionary models
- The aim of an evolutionary model is to describe
(often in probabilistic terms) the evolutionary
biological reality
http://tolweb.org
3Likelihood
- An evolutionary model enables us to compute the
likelihood that a certain scenario describes the
biological reality we observe.
- For instance, what is the likelihood that
GVLLMEIFALTQFQRRGNQANAFSFGKDIFIRQFVPGCIRDGYTFVGPV
GVLLTELLALTKIKQRANDKFAFSFGKESFIEPFVPGCVSEPYAIMIFV
GVLLMRFTTLNPIHKRGSKAFAFSFGKDSLVGTFVPGCIPLAYSIVTPV
GVLLMRLATLSLMHQRGSKASAFSFGKDSLIGPFVPGCIQLAYNVISPV
GVVLVALITLIPIKKRGTQVFAFSFGHDEFIRTFVPGCVEDNFDQILRI
GVVLMVLYTLSRIKKRGTQPVTFSFGEDEFLRIFVPGCANDTFELLMQL
GVLLMGLFPMKHIEKRGFQALAFSFGNDAFIRPFVPGCIEEGYPVLAPL
describes
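To make the notion of likelihood concrete, here is a minimal sketch that scores two of the aligned sequences above under an equal-rates (Poisson) amino-acid model with sites treated as independent. This simplified model, the function names, and the candidate distances are assumptions chosen for illustration; they are not the model presented in this talk.

```python
import math

N_STATES = 20  # number of amino acids


def match_prob(t):
    """P(same amino acid after distance t) under an equal-rates model."""
    decay = math.exp(-N_STATES / (N_STATES - 1) * t)
    return 1.0 / N_STATES + (N_STATES - 1) / N_STATES * decay


def mismatch_prob(t):
    """P(change to one specific other amino acid after distance t)."""
    decay = math.exp(-N_STATES / (N_STATES - 1) * t)
    return 1.0 / N_STATES - decay / N_STATES


def pairwise_log_likelihood(seq_a, seq_b, t):
    """Log-likelihood of two aligned sequences at distance t (independent sites)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    log_l = 0.0
    for a, b in zip(seq_a, seq_b):
        p = match_prob(t) if a == b else mismatch_prob(t)
        # Uniform stationary frequency 1/N_STATES for the first residue.
        log_l += math.log(p / N_STATES)
    return log_l


# First two sequences of the alignment shown on the slide.
SEQ_A = "GVLLMEIFALTQFQRRGNQANAFSFGKDIFIRQFVPGCIRDGYTFVGPV"
SEQ_B = "GVLLTELLALTKIKQRANDKFAFSFGKESFIEPFVPGCVSEPYAIMIFV"

# Hypothetical candidate distances (expected substitutions per site).
candidates = [0.01, 0.1, 0.3, 0.7, 1.5, 5.0]
best_t = max(candidates, key=lambda t: pairwise_log_likelihood(SEQ_A, SEQ_B, t))
print("most likely candidate distance:", best_t)
```

Comparing the log-likelihood across candidate scenarios (here, distances) is the same operation that a full phylogenetic model performs over trees, rates, and other parameters.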
4Markov process
- Most models today assume a Markov process over
time, i.e. over the phylogenetic tree
Time
5Inference with a model
- Using the evolutionary model we can compute the
likelihood of the data, and we can use this to
infer different biological properties:
phylogenetic tree, ancestral states,
evolutionary rate
6Site-specific evolutionary rates
- High rate: fast-evolving site → variable
- Low rate: slow-evolving site → conserved
7Space-time: sites in a protein are dependent
- Sites are independent
- Sites are dependent: conserved regions in a
protein; there is an interaction amongst sites
8Space and time
- A Markov process is assumed both in time and in
space (spatial relation)
[Figure: Time axis; Space axis with Position 1, Position 2, Position 3]
10A Markov process over the rates
- If we have 4 possible rate categories, the Markov
process is described by
[Figure: transitions from position i to position i+1]
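As an illustration of what such a process looks like in practice, the sketch below builds a 4x4 row-stochastic transition matrix over rate categories and moves a distribution one position along the sequence. The parameter `stay_prob` and the symmetric form of the matrix are hypothetical choices for this example, not values taken from the talk.

```python
def make_rate_transition_matrix(n_categories=4, stay_prob=0.7):
    """Build P where P[a][b] = P(category b at position i+1 | category a at position i).

    stay_prob is a hypothetical parameter: the chance that the next
    position keeps the same rate category as the current one.
    """
    move_prob = (1.0 - stay_prob) / (n_categories - 1)
    return [
        [stay_prob if a == b else move_prob for b in range(n_categories)]
        for a in range(n_categories)
    ]


def step(distribution, transition):
    """Propagate a distribution over rate categories from position i to i+1."""
    n = len(distribution)
    return [
        sum(distribution[a] * transition[a][b] for a in range(n))
        for b in range(n)
    ]


P = make_rate_transition_matrix()
# Start at position i in the slowest category and look one position ahead.
next_distribution = step([1.0, 0.0, 0.0, 0.0], P)
print(next_distribution)
```

With a high `stay_prob`, neighbouring positions tend to share a rate category, which is one simple way to express spatial dependence between sites.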
12A Hidden Markov model (HMM)
- This leads to an HMM over the evolutionary rates
Yang 1995; Churchill and Felsenstein 1996
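The likelihood of an alignment under an HMM over rates can be computed with the standard forward algorithm. The sketch below assumes that the per-site likelihoods under each rate category (the emissions) have already been computed on the tree; the numbers used here are made up for illustration, and the function name is hypothetical.

```python
import math


def forward_log_likelihood(site_likelihoods, initial, transition):
    """Log-likelihood of all sites under an HMM over rate categories.

    site_likelihoods[i][k]: P(column i | rate category k), assumed precomputed.
    initial[k]:             P(rate category k at the first position).
    transition[j][k]:       P(category k at position i+1 | category j at position i).
    """
    n_states = len(initial)
    log_l = 0.0
    alpha = [initial[k] * site_likelihoods[0][k] for k in range(n_states)]
    for i in range(len(site_likelihoods)):
        if i > 0:
            alpha = [
                sum(alpha[j] * transition[j][k] for j in range(n_states))
                * site_likelihoods[i][k]
                for k in range(n_states)
            ]
        # Rescale at every position to avoid numerical underflow.
        scale = sum(alpha)
        log_l += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_l


# Made-up example: 3 sites, 2 rate categories (slow, fast).
example_likelihoods = [[0.020, 0.005], [0.015, 0.004], [0.001, 0.010]]
example_initial = [0.5, 0.5]
example_transition = [[0.9, 0.1], [0.1, 0.9]]
print(forward_log_likelihood(example_likelihoods, example_initial, example_transition))
```

If every row of the transition matrix equals the initial distribution, the result reduces to the usual independent-sites likelihood, which is a convenient sanity check.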
13But are adjacent sites dependent?
15An HMM with hyper-states
[Figure: the DI model with two hyper-states, Dependence (D model) and Independence (I model), between position i and position i+1]
Stern and Pupko 2006
19Validating the model
- Likelihood analysis
- In-depth study of a biological example: rate
inference and dependence inference
- Simulation studies
20Likelihood analysis
- 84 protein datasets analysed: in 60 of 84 the DI
model outperformed the D model; in 81 of 84 the
DI model outperformed the I model (LRT, AIC)
- Datasets where the improvement was not
significant tended to be small (few sequences or
short sequence length)
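For reference, the two comparison criteria named above can be computed as follows. The log-likelihoods and parameter counts in the example are hypothetical placeholders, not results from the 84 datasets.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a preferred model."""
    return 2 * n_params - 2 * log_likelihood


def lrt_statistic(log_l_null, log_l_alt):
    """Likelihood ratio test statistic for a null model nested in an alternative."""
    return 2.0 * (log_l_alt - log_l_null)


# Hypothetical values for one dataset (illustration only).
log_l_simple, k_simple = -10250.0, 2    # e.g. a simpler, nested model
log_l_complex, k_complex = -10238.0, 4  # e.g. a richer model

print("AIC simple: ", aic(log_l_simple, k_simple))
print("AIC complex:", aic(log_l_complex, k_complex))
print("LRT statistic:", lrt_statistic(log_l_simple, log_l_complex))
```

For nested models, and under the usual regularity conditions, the LRT statistic is compared against a chi-square distribution whose degrees of freedom equal the difference in the number of free parameters.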
21In-depth analysis of the Potassium channel
Extracellular
Tetrameric pore-forming protein
Intracellular
22In-depth analysis of the Potassium channel
View from extracellular side
Extracellular
23Using DI to analyse the K-channel
[Figure legend: dark = conserved, light = variable]
24Dependence inference
- The DI model enables not only inferring the rate
at each site but also inferring whether this
position is dependent or independent of the
previous position
[Figure labels: position 1, position 2 (shown twice)]
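Inferring a hidden state at each position is typically done by posterior decoding with the forward-backward algorithm. The sketch below is generic: it returns, for every position, the posterior probability of each hidden state. Treating the hidden state as just "dependent" or "independent" is a simplifying assumption for this example (in the DI model the state would presumably also carry the rate category), and all numbers are made up.

```python
def posterior_state_probabilities(site_likelihoods, initial, transition):
    """Posterior P(state k at position i | all sites) via forward-backward.

    site_likelihoods[i][k]: P(column i | hidden state k), assumed precomputed.
    """
    n_sites = len(site_likelihoods)
    n_states = len(initial)

    # Forward pass, normalised at each position for numerical stability.
    forward = []
    for i in range(n_sites):
        if i == 0:
            alpha = [initial[k] * site_likelihoods[0][k] for k in range(n_states)]
        else:
            prev = forward[-1]
            alpha = [
                sum(prev[j] * transition[j][k] for j in range(n_states))
                * site_likelihoods[i][k]
                for k in range(n_states)
            ]
        total = sum(alpha)
        forward.append([a / total for a in alpha])

    # Backward pass, also normalised at each position.
    backward = [None] * n_sites
    backward[-1] = [1.0] * n_states
    for i in range(n_sites - 2, -1, -1):
        beta = [
            sum(
                transition[k][j] * site_likelihoods[i + 1][j] * backward[i + 1][j]
                for j in range(n_states)
            )
            for k in range(n_states)
        ]
        total = sum(beta)
        backward[i] = [b / total for b in beta]

    # Combine and renormalise; the per-position scaling factors cancel here.
    posteriors = []
    for i in range(n_sites):
        unnorm = [forward[i][k] * backward[i][k] for k in range(n_states)]
        total = sum(unnorm)
        posteriors.append([u / total for u in unnorm])
    return posteriors


# Made-up example with two hidden states: 0 = dependent, 1 = independent.
likelihoods = [[0.020, 0.008], [0.018, 0.006], [0.002, 0.009]]
posteriors = posterior_state_probabilities(
    likelihoods, [0.5, 0.5], [[0.8, 0.2], [0.2, 0.8]]
)
for position, probs in enumerate(posteriors, start=1):
    print(position, probs)
```

A position would then be called dependent or independent according to whichever state has the higher posterior probability.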
25Sites which were inferred as independent
[Figure legend: dark = conserved, light = variable]
26Summary
- The DI model may more accurately model the
biological reality, but requires a larger dataset
to support it.
- The DI model enables implicit study of relations
between structure and evolutionary rate.
Stern and Pupko, 2006 Mol. Biol. Evol.
Future enhancements
- Explicitly model dependencies along the 3D
structure of the protein: belief propagation.
27Acknowledgements
- Dr. Tal Pupko
- Lab members: Itay Mayrose, Adi Doron-Faigenboim,
Nimrod Rubinstein, Eyal Privman, Osnat Zomer
- This study was supported by an Israeli Science
Foundation grant
- THANK YOU!