Title: SYSTEMS BIOLOGY
1SYSTEMS BIOLOGY
- Lukasz Huminiecki, DPhil
- Nobel medical institute, Karolinska, Stockholm
- Ludwig Institute for Cancer Research, Uppsala
2Please, tell me who you are!
- Raise your hand if you are
- Computational biologist/bioinformatician
- Computer scientist/mathematician
- Undergraduate
- Post-doc
3WHAT IS SYSTEMS BIOLOGY?
What do you think?
- "Systems biology is the coordinated study of
biological systems by (1) investigating the
components of cellular networks and their
interactions, (2) applying experimental
high-throughput and whole-genome techniques, and
(3) integrating computational methods with
experimental efforts." (First sentence of the
Preface to Klipp E et al., Systems Biology in
Practice, WILEY-VCH, 2005.)
4Back to the Roots?
Before the era of the molecular revolution
physiology-oriented biologists were much more
used to looking at living things as systems.
In fact, early critics argued that molecular
approaches are too reductionist, attempting to
explain complex biological phenomena through the
actions of a few genes or proteins.
There is a cyclical element to all progress!
5Four areas of systems biology on which I will
focus today
- Analysis of expression patterns
- Mathematical modeling
- Phylogenetics
- Web-resources and data integration
6PART 1: EXPRESSION PATTERN EVOLUTION
7Classic view of evolution through gene
duplication
- Susumu Ohno, 1970. Evolution by Gene Duplication. Springer, Berlin
- "Natural selection merely modified while redundancy created"
- The neo-functionalization model
8Genome-scale tests (1)
9Genome-scale tests (2)
- Nembaware et al. 2002 Impact of the presence of
paralogs on sequence divergence in a set of
mouse-human orthologs. Genome Research
10Gene Expression Atlas
- http://expression.gnf.org
- 101 human (microchip U95A) and 89 mouse (microchip U74A) Affymetrix experiments
- Huminiecki L, Lloyd AT, Wolfe KH. Congruence of tissue expression profiles from Gene Expression Atlas, SAGEmap and TissueInfo databases. BMC Genomics. 2003 Jul 29;4(1):31
- Mapping to Ensembl via LocusLink
- TRIBE families and Ka/Ks calculations using yn00 from PAML
11Huminiecki et al. Congruence of tissue
expression profiles from GEA, SAGEmap and
TissueInfo databases. BMC Genomics
12R vs. Ks in paralogs
13One-to-one orthologs
14Human or mouse duplication
15Cumulative plots
16Randomisation test
The percentages indicate the ratios between the
fractions of genes having a particular R-value in
sets of orthologues with the human (163 sets) or
mouse (139 sets) duplication versus the group of
one-to-one orthologues (1,324 pairs).
17Sub-functionalisation
- Force et al. argue that neo-functionalisation alone could not account for the high accumulation of duplicated genes in eukaryotes
- Duplication-degeneration-complementation (DDC)
- Should lead to tissue-specific expression!
18Tissue-specific genes evolve faster and are more
likely to belong to large gene families
19Gene expression patterns are, in evolutionary
perspective, surprisingly labile!
20Literature
- Khaitovich P, Weiss G, Lachmann M, Hellmann I, Enard W, Muetzel B, Wirkner U, Ansorge W, Paabo S. A neutral model of transcriptome evolution. PLoS Biol. 2004 May;2(5):E132. Epub 2004 May 11.
- Huminiecki L, Wolfe KH. Divergence of spatial gene expression profiles following species-specific gene duplications in human and mouse. Genome Res. 2004 Oct;14(10A):1870-9.
- Jordan IK, Marino-Ramirez L, Koonin EV. Evolutionary significance of gene expression divergence. Gene. 2005 Jan 17;345(1):119-26. Epub 2004 Dec 29.
- Khaitovich P, Paabo S, Weiss G. Toward a neutral evolutionary model of gene expression. Genetics. 2005 Jun;170(2):929-39. Epub 2005 Apr 16.
- Liao BY, Zhang J. Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol Biol Evol. 2006 Mar;23(3):530-40. Epub 2005 Nov 9.
21The take home message
- An entirely new paradigm is emerging in evolutionary biology: expression patterns can change dramatically in the course of evolution.
- This impacts on our understanding of biodiversity, human origins, and drug discovery.
22Broad goals of collaboration with Pfizer
- We aim towards a set of heuristic rules to
identify the most druggable GPCRs and the best
model species in which to conduct preclinical
tests. By "druggable" we mean those that
possess any one, or a combination, of the
characteristics favourable to drug development,
such as (1) conserved sequence, (2)
tissue-specificity, and (3) expression domain not
overlapping with other members of the family.
Conserved sequence suggests that function is
the same, and that drugs will have similar
efficacy. A tissue-specific gene facilitates
targeting into specific organs or tumour types,
and is less likely to engage in multiple
functions - both of these features are likely to
result in advantageous toxicological profiles.
Non-overlapping expression domain minimises the
possibility of functional redundancy. Finally,
the best animal model for preclinical trials is
likely to be the species with the most human-like
expression pattern of the target gene, especially
in tissues directed for therapeutic intervention,
as well as in toxicologically important organs,
such as heart, lung, liver, kidney, and brain.
23Specific goals of collaboration with Pfizer
- Generate high quality RNA preparations from 20 organs from duplicate male and female rat, guinea pig and dog samples, for comparison with commercial human RNA samples.
- Using qPCR techniques, determine the expression profile of at least 25 genes (with representatives from the histaminergic, serotonergic, and adrenergic GPCR families) in each of these tissues.
- Analyse data to consider congruence in expression profiles between species from an evolutionary bioinformatics perspective, in addition to gaining a deeper understanding of the degree of human-animal model translation and therefore of the suitability of animal species used for functional efficacy and toxicological studies at Pfizer.
24Results RNA isolation
[Figure: panels a), b), c)]
Polytron/RNAeasy with additional acid phenol step
and DNAaway for difficult tissues
25The Database
- [Database schema diagram. Tables named in the diagram: cumulative genes, RT_samples, run, gene, RT_summary, tissue, preps; arrows link the tables.]
- Fields listed in the diagram: id, assay, symbol, prep, tissue, species, rt, ct, date, tissue_index, tissue_name, technician, kit, samples, donor, description, ratio, dilution, yield, housekeep_actb, housekeep_hprt, count, run
26The Ct value
- Two-tube comparative method with virtual housekeeping gene
- Amplification assumed to be exponential with 100% efficiency, Cts scaled accordingly
- Figure panels: a) histogram of Ct-values for over 6000 reactions, b) standard deviations in triplicates, c) ACTB plotted against HPRT1.
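The comparative Ct calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "virtual housekeeping gene" is taken here to be the mean Ct of ACTB and HPRT1 (the slide does not define it), the function names are hypothetical, and the Ct values are made up.

```python
from statistics import mean


def virtual_housekeeping_ct(ct_actb, ct_hprt1):
    """Ct of the virtual housekeeping gene.

    Assumption: the virtual gene is the mean Ct of the two reference genes.
    """
    return mean([ct_actb, ct_hprt1])


def relative_expression(ct_target, ct_actb, ct_hprt1):
    """Expression of the target relative to the virtual housekeeping gene.

    With 100% efficiency the product doubles every cycle, so a Ct
    difference of d cycles corresponds to a 2**d fold difference.
    """
    delta_ct = ct_target - virtual_housekeeping_ct(ct_actb, ct_hprt1)
    return 2.0 ** (-delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    print(relative_expression(ct_target=25.0, ct_actb=20.0, ct_hprt1=22.0))
```

A target that crosses the threshold four cycles later than the reference is therefore scored as 2^-4 of the reference level.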
27Results overview
- Tissue RNAs from rat, guinea pig and dog were isolated. Human RNAs were purchased from Clontech.
- Human, rat and canine expression profiles of just under 40 genes have been examined thus far. Approximately 8 thousand assays have been performed.
- A number of striking differences in expression patterns have been revealed.
- Thus far, the most remarkable expression shifts have been observed in heart and aorta, among histamine, prostacyclin and adrenergic beta receptors. Numerous changes were also localised to the uterus.
- Apart from divergent expression patterns, mean expression levels also appeared rather different for many genes.
- Differences in expression between prostanoid receptors may have implications for the pharmacology of troublesome COX-2 inhibitors (such as Celebrex, Bextra, and Vioxx).
28PART 2: MODELING
29Mathematical Modeling of Metabolism and Gene Expression
- Dr. Edda Klipp
- Kinetic Modeling Group
- Lecture in the series
- Genes and Genomes: the Future of Biology
30What is a model?
- Yeast, mouse as models for human
- Verbal explanation
- A sequence of letters (ATTCGAGGTATA) for a DNA sequence
- Wiring scheme
- Mathematical description: Boolean network, differential equations, stochastic equations
- Abstraction
- (Simplified) representation allowing for understanding
Edda Klipp, Kinetic Modeling Group
31Why modeling?
Experimental observations: many simple and complex processes
- isolated enzymatic reaction
- temporal processes in metabolic networks
- pattern of gene expression and regulation
Even the behavior of simple systems can usually
not be predicted intuitively and from experience.
The behavior of complex dynamical processes cannot
be predicted with sufficient precision just
from experience. For prediction and explanation
of processes one needs a model.
32Why modeling?
- Advantages
- Time scales may be stretched or compressed.
- Solution algorithms / computer programmes can often be used independently of the actually modeled system.
- Costs of modeling are lower than for experiments.
- Representation of quantities that are experimentally hidden.
- No risk for real systems, no interaction between investigation and system.
33Why modeling?
- Burning questions
- How is cellular response to environmental changes and stress regulated?
- How should a cell be treated to yield a high output of a desired product (Biotechnology)?
- Where should a drug operate to cure a disease (Health care)?
- Is our knowledge about a network/pathway complete?
34Structure of the system
[Figure: network of metabolites S1 to S6 with external (Sext) and mitochondrial (Smito) pools, fast and slow reactions, and the boundary of the system]
Variables, parameters, constants
State variables - set of variables describing the
system completely
Dimension of the system: number of independent
state variables
How many variables are used in my model?
- too few: system is under-determined
- too many: system is over-determined and may be contradictory
Do the units of variables and parameters etc. fit together?
35Biological processes are complex phenomena
Central dogma of molecular biology
Gene -> mRNA -> Proteins -> Cellular processes
36Direction of discovery
known -> to be predicted
Structure -> Function
- Protein interactions -> Biochemical action
- Metabolic pathways -> Concentration changes
- Enzyme sets -> Influence of perturbations; possible behavior, bifurcations
Function -> Structure
- Transmission of a signal -> Sequence of signaling compounds
- Time course of concentrations -> Possible protein interactions
37Concept of state
The state of a system is a snapshot of the system
at a given time that contains enough information
to predict the behaviour of the system for all
future times. The state of the system is
described by the set of variables that must be kept
track of in a model. Different models of gene
regulation have different representations of the
state:
- Boolean model: a state is a list containing, for each gene involved, whether it is expressed (1) or not expressed (0)
- Differential equation model: a list of concentrations of each chemical entity
- Probabilistic model: a current probability distribution and/or a list of actual numbers of molecules of each type
Each model defines what it means by the state of
a system. Given the current state, the model
predicts what state(s) can occur next.
38Kinetics: change of state
A -> B (rate constant k)
Deterministic, continuous time and state (e.g. ODE model): the concentration of A decreases and the concentration of B increases. The concentration change per time interval dt is given by dA/dt = -k A, dB/dt = k A.
Probabilistic, discrete time and state: transformation of a molecule of type A into a molecule of type B. The probability of this event in a time interval dt is given by k a dt, where a is the number of molecules of type A.
Deterministic, discrete time and state (e.g. Boolean network model): presence (or activity) of B at time t+1 depends on presence (or activity) of A at time t.
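The deterministic and the probabilistic view of A -> B can be compared with a minimal Python sketch. It assumes first-order kinetics (dA/dt = -k A; event probability k a dt per small time step), uses plain Euler steps rather than a production integrator, and all function names and parameter values are hypothetical.

```python
import math
import random


def analytic_a(a0, k, t):
    """Exact solution of dA/dt = -k*A, for comparison."""
    return a0 * math.exp(-k * t)


def simulate_ode(a0, k, t_end, dt=0.001):
    """Deterministic view: Euler integration of dA/dt = -k*A, dB/dt = +k*A."""
    a, b = float(a0), 0.0
    for _ in range(int(round(t_end / dt))):
        flux = k * a * dt          # amount converted in this time step
        a -= flux
        b += flux
    return a, b


def simulate_stochastic(n_a, k, t_end, dt=0.001, seed=0):
    """Probabilistic view: one A -> B event per step with probability k*a*dt.

    Only a reasonable approximation while k*a*dt is much smaller than 1.
    """
    rng = random.Random(seed)
    a, b = n_a, 0
    for _ in range(int(round(t_end / dt))):
        if rng.random() < k * a * dt:
            a -= 1
            b += 1
    return a, b


if __name__ == "__main__":
    print(simulate_ode(1.0, 1.0, 1.0), analytic_a(1.0, 1.0, 1.0))
    print(simulate_stochastic(100, 1.0, 1.0))
```

Both versions conserve A + B; the stochastic run scatters around the deterministic curve, with larger relative fluctuations when molecule numbers are small.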
39Boolean Models
(discrete, deterministic)
(George Boole, 1815-1864)
Each gene can assume one of two states: expressed (1) or not expressed (0)
Background:
- Not enough information for more detailed description
- Increasing complexity and computational effort for more specific models
Replacement of continuous functions (e.g. Hill
function) by step function
40Boolean Models
- Boolean network is characterized by
- the number of nodes (genes) N
- the number of inputs per node (regulatory
interactions) k
The dynamics are described by rules: if the input value(s) at time t is/are ..., then the output value at t+1 is ...
Boolean networks always have a finite number of possible states and, therefore, a finite number of state transitions.
[Figure: example network topologies, including a linear chain (A, B, C, D) and a ring]
41Boolean Models
Truth functions for one input p
input p | 0 | p | not p | 1
0       | 0 | 0 | 1     | 1
1       | 0 | 1 | 0     | 1
rule    | 0 | 1 | 2     | 3
Example: A -> B with B(t+1) = not (A(t)) is rule 2
[Second table: inputs p, q and output]
42Boolean Models
Boolean network with four nodes a, b, c, d
[Figure legend: gene, protein; transcription, translation, repression, activation]
a(t+1) = a(t)
b(t+1) = (not c(t)) and d(t)
c(t+1) = a(t) and b(t)
d(t+1) = not c(t)
State transitions (states written as abcd):
0000 -> 0001   0001 -> 0101   0010 -> 0000   0011 -> 0000
0100 -> 0001   0101 -> 0101   0110 -> 0000   0111 -> 0000
1000 -> 1001   1001 -> 1101   1010 -> 1000   1011 -> 1000
1100 -> 1011   1101 -> 1111   1110 -> 1010   1111 -> 1010
Cycle: 1000 -> 1001 -> 1101 -> 1111 -> 1010 -> 1000
Steady state: 0101
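The transition table for this four-node network can be regenerated mechanically from the update rules. The following Python sketch does so; the function names are hypothetical, and states are written in the order a, b, c, d.

```python
from itertools import product


def step(state):
    """Apply the four update rules synchronously to a state (a, b, c, d)."""
    a, b, c, d = state
    return (
        a,                        # a(t+1) = a(t)
        int((not c) and d),       # b(t+1) = (not c(t)) and d(t)
        int(a and b),             # c(t+1) = a(t) and b(t)
        int(not c),               # d(t+1) = not c(t)
    )


def label(state):
    """Write a state as a string such as '0101'."""
    return "".join(str(x) for x in state)


def transition_table():
    """Successor of each of the 2**4 = 16 states."""
    return {label(s): label(step(s)) for s in product((0, 1), repeat=4)}


def trajectory(start):
    """Follow a state until it repeats; return (transient, attractor)."""
    seen = []
    state = start
    while state not in seen:
        seen.append(state)
        state = step(state)
    first = seen.index(state)     # where the attractor begins
    return [label(s) for s in seen[:first]], [label(s) for s in seen[first:]]


if __name__ == "__main__":
    for source, target in sorted(transition_table().items()):
        print(source, "->", target)
    print(trajectory((1, 0, 0, 0)))
    print(trajectory((0, 0, 0, 0)))
```

Starting from 1000 the trajectory stays on the five-state cycle, while starting from 0000 it passes through 0001 and settles in the steady state 0101.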
43Boolean Models
- The number of states is finite (2^N for N nodes), as well as the number of state changes.
- The system may reach steady states or cycles.
- Not every state can be reached from every other state.
- The successor state is unique, the predecessor state is not.
- Advantages: easy description with simple rules, no parameters, computationally not demanding
- Drawbacks: no intermediate values
44Description with Differential Equations
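X + DNA -> X-DNA (rate constant k1)
X-DNA -> X + DNA (rate constant k-1)
dS/dt = f(S), with S: vector of concentrations, f: function(s), often non-linear
Nucleic acids + DNA -> mRNA + DNA (rate constant k1)
mRNA -> Nucleic acids (rate constant k-1)
Amino acids + mRNA -> Proteins + mRNA (rate constant k2)
Proteins -> Amino acids (rate constant k-2)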
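The transcription/translation scheme can be turned into two ordinary differential equations and integrated numerically. The Python sketch below is one possible reading, not the lecturer's model: it assumes the DNA, nucleic acid and amino acid pools are constant and absorbed into k1 and k2, so that dm/dt = k1 - k_1*m and dp/dt = k2*m - k_2*p. It uses a simple Euler scheme, and the parameter values are hypothetical.

```python
def simulate_expression(k1, k_1, k2, k_2, t_end, dt=0.01, m0=0.0, p0=0.0):
    """Euler integration of mRNA (m) and protein (p) concentrations."""
    m, p = m0, p0
    for _ in range(int(round(t_end / dt))):
        dm = k1 - k_1 * m          # transcription minus mRNA degradation
        dp = k2 * m - k_2 * p      # translation minus protein degradation
        m += dm * dt
        p += dp * dt
    return m, p


def steady_state(k1, k_1, k2, k_2):
    """Concentrations at which both derivatives vanish."""
    m_ss = k1 / k_1
    p_ss = k2 * m_ss / k_2
    return m_ss, p_ss


if __name__ == "__main__":
    params = dict(k1=2.0, k_1=1.0, k2=5.0, k_2=0.5)   # hypothetical values
    print(simulate_expression(t_end=50.0, **params))
    print(steady_state(**params))
```

After a sufficiently long time the simulated values approach the analytic steady state m = k1/k_1, p = k1*k2/(k_1*k_2).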
45Basic Elements of Biochemical Networks
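[Figure: network of metabolites S1 to S4 connected by reactions v1 to v5, with simulated time courses of S1 to S4 (axes: St versus Time)]
Initial concentrations: S1(0) = 0, S2(0) = 0, S3(0) = 0, S4(0) = 1
Parameters: p1 = 1, p2 = 1, p3 = 1, p4 = 0.5, p5 = 0.5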
46ODE - concept of steady state
To restrict modeling to the main aspects, often the
asymptotic behaviour of dynamic systems is
analyzed (behaviour after a sufficiently long time).
[Figure: possible asymptotic behaviours, plotted as Variable versus Time]
- In many relevant situations the system will reach a steady state.
47Data Bases
- GO (Gene Ontology), http://www.geneontology.org: functional description of gene products
- KEGG (Kyoto Encyclopedia of Genes and Genomes), http://www.genome.ad.jp/kegg/: reference knowledge base offering information about genes and proteins, biochemical compounds and reactions, and pathways
- BRENDA (Comprehensive Enzyme Information System), http://www.brenda.uni-koeln.de: curated database containing functional data for individual enzymes
- NCBI (National Center for Biotechnology Information), http://www.ncbi.nlm.nih.gov/: provides several databases
- molecular databases, with information about nucleotide sequences, proteins, genes, molecular structures, and gene expression
- taxonomy database: names and lineages of more than 130,000 organisms
- SPAD (Signaling PAthway Database), http://www.grt.kyushu-u.ac.jp/spad/index.html: information about signaling pathways (schemes, links)
- JWS Online, model database, http://jjj.biochem.sun.ac.za/database/index.html: published models, implemented in Mathematica; models can be simulated
- Biomodels, model database, http://www.biomodels.net/: published models, implemented in SBML
48Modeling Tools
- BALSA
- BASIS
- BIOCHAM
- BioCharon
- biocyc2SBML
- BioGrid
- BioModels
- BioNetGen
- BioPathway Explorer
- Bio Sketch Pad
- BioSens
- BioSPICE Dashboard
- BioSpreadsheet
- BioTapestry
- BioUML
- BSTLab
- CADLIVE
- CellDesigner
- Cellerator
- Cytoscape
- DBsolve
- Dizzy
- E-CELL
- ecellJ
- ESS
- FluxAnalyzer
- Fluxor
- Gepasi
- INSILICO discovery
- JACOBIAN
- Jarnac
- JDesigner
- JigCell
- JWS Online
- Karyote
- KEGG2SBML
- Kinsolver
- libSBML
- MMT2
- Modesto
- Moleculizer
- Monod
- Narrator
- NetBuilder
- Oscill8
- PANTHER Pathway
- PathArt
- PathScout
- PathwayLab
- Pathway Tools
- PathwayBuilder
- PaVESy
- PNK
- Reactome
- ProcessDB
- PROTON
- pysbml
- SBMLmerge
- SBMLR
- SBMLSim
- SBMLToolbox
- SBToolbox
- SBW
- SCIpath
- Sigmoid
- SigPath
- SigTran
- SIMBA
- SimBiology
- Simpathica
- SimWiz
- SmartCell
- SRS Pathway Editor
- StochSim
- STOCKS
- TERANODE Suite
http://sbml.org
49Conclusions
- Mathematical models of cellular processes allow for a testable representation of experimental knowledge.
- Models clarify systemic and dynamic properties of the investigated object.
- Models allow simulating processes independent of the experiment.
- Modeling reveals regulatory properties of cellular networks
- Osmostress response
- The role of channel Fps1 in osmoresponse
- The ability to respond to repeated stimulation and the contribution of phosphatases
- Feedback loops / signal integration and separation
- Models can have predictive value
- Mutant phenotypes
- Effect of intervention
- Integration of external signals to cell cycle progression
- Critical cell size for G1/S transition
50Process of model development
- Analysis of the objects to be modeled
- Formulation of the scientific PROBLEMS
- Design of a simple model
- as cartoon
- in mathematical terms
- Solve the respective (mathematical) problems
- Comparison of results with real system (EXPERIMENT)
- Difference
- Iterative enhancement of the models (structure, parameters, ...)
Example: distribution of molecules on both sides of a membrane
Ai <-> Ao
dAi/dt = f(Ai, Ao, C, p)
"If we did not make models, we would not
know why they are wrong."
51Modeling
Mathematical Models for Cellular Processes
Structural knowledge, experimental data
System analysis, simulation, parameter identification
Metabolic and Regulatory Networks
ODE-Systems
System understanding, prediction
52Basic Elements of Biochemical Networks
[Figure: example pathway with the metabolites Glucose-1-P, Glucose-6-P and Fructose-6-P, linked by reactions v1 and v2, which are catalysed by phosphoglucomutase (G1P/G6P) and glucose-phosphate isomerase (G6P/F6P); transport links the system to its surroundings]
Design of structured metabolic models (example: G1P, G6P, F6P with rates v1, v2)
1. Determination of system limits (system versus external)
2. Balancing: concentration change = production - degradation +/- transport
3. Assignment of kinetics: rate as a function of concentrations and parameters
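The three design steps map directly onto code: a system boundary (which species are state variables), a balance written with a stoichiometric matrix, and a rate law for each reaction. The Python sketch below is an illustration only. It assumes a closed system G1P <-> G6P <-> F6P with no transport, reversible mass-action kinetics, and hypothetical rate constants; none of these choices comes from the slide.

```python
# Step 1: system limits. State variables are G1P, G6P and F6P; no transport.
SPECIES = ["G1P", "G6P", "F6P"]

# Step 2: balancing. Rows are species, columns are reactions v1 and v2.
N = [
    [-1, 0],    # G1P is consumed by v1
    [1, -1],    # G6P is produced by v1 and consumed by v2
    [0, 1],     # F6P is produced by v2
]


def rates(s, p):
    """Step 3: kinetics (assumed reversible mass action)."""
    g1p, g6p, f6p = s
    v1 = p["k1f"] * g1p - p["k1r"] * g6p
    v2 = p["k2f"] * g6p - p["k2r"] * f6p
    return [v1, v2]


def derivatives(s, p):
    """Concentration changes dS/dt = N * v."""
    v = rates(s, p)
    return [sum(n_ij * v_j for n_ij, v_j in zip(row, v)) for row in N]


def simulate(s0, p, t_end, dt=0.001):
    """Euler integration of the balance equations."""
    s = list(s0)
    for _ in range(int(round(t_end / dt))):
        ds = derivatives(s, p)
        s = [x + dx * dt for x, dx in zip(s, ds)]
    return s


if __name__ == "__main__":
    params = {"k1f": 2.0, "k1r": 1.0, "k2f": 1.0, "k2r": 1.0}  # hypothetical
    print(dict(zip(SPECIES, simulate([1.0, 0.0, 0.0], params, 50.0))))
```

Because the columns of N sum to zero, the total concentration is conserved, which is a quick sanity check on the balancing step.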
53Hypothesis Generation
Possible theoretical approaches
- Structure -> Function: modelling of systems dynamics
- Function -> Structure: evolutionary optimization
Network, control pattern, parameters; homeostasis, appropriate response, experimental data
- establish a mathematical model of the network
- define a performance function
- calculate parameters optimizing the performance function
- compare prediction with experimental data
54Model examples - Metabolism
In Vivo Analysis of Metabolic Dynamics in S. cerevisiae. M. Rizzi, M. Baltes, U. Theobald, M. Reuss. Biotechnol Bioeng. 55: 592-608, 1997.
Representation of metabolism in the KEGG database: www.kegg/kegg2.jp
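55Model examples - Signaling pathways
[Figure: G-protein cycle with activated receptor (Ra), GDP-bound Gabg, GTP-bound Ga plus Gbg, GDP-bound Ga; GTP/GDP exchange, release of phosphate (P), signal]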
56Common properties
The cellular network has a high degree of connectivity.
The processes are reactions and molecular interactions:
- binding
- intramolecular transformations
- release
Differences in modeling of different parts are due to appropriate approximations.
57Concentrations
Signalling: proteins low, 100-300 nmol/L (10^3-10^4 molecules per cell); proteins act as catalysts and substrates
Metabolism: enzymes low, metabolites higher, e.g. ATP 2 mmol/L
58Network Characteristics
- Signaling
- Reactions can be
- catalysed by enzymes
- autocatalytic.
- The network is given by
- the existing proteins
- and their interactions.
- Metabolism
- All reactions are catalysed by enzymes.
- The network is determined by the existing enzymes
- (which do not necessarily interact).
- Metabolites need not be present initially.
59Network Characteristics
Signaling
[Figure: MAPK -> MAPK-P -> MAPK-PP, each phosphorylation step converting ATP to ADP, with phosphate (P) released on dephosphorylation]
State changes: change in phosphorylation states, coding of information. But: conservation (MAPK + MAPK-P + MAPK-PP = const.) in the considered time window.
Metabolism
[Figure: Glucose -> Gluc 6-P -> Fruc 6-P -> Fruc 1,6-PP, with ATP converted to ADP at the phosphorylation steps]
Important feature: flux through the pathway, (final) transformation of metabolites. Phosphorylation: energy transfer.
60Rate equations are a choice of the modeler
Signaling
[Figure: MAPKK-PP catalyses MAPK -> MAPK-P, ATP -> ADP]
Mass action kinetics. Catalyst and substrate have about the same concentration (E ≈ S). Binding is slow compared to intramolecular rearrangements. First-order kinetics.
Metabolism
[Figure: hexokinase catalyses Glucose -> Gluc 6-P, ATP -> ADP, Mg2+]
Typical choice: Michaelis-Menten kinetics, E + S <-> ES -> E + P (steps labelled fast and slow). Requirement: E << S.
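The two rate laws contrasted here can be written down in their textbook forms. The Python sketch below is a minimal illustration with hypothetical parameter values; it is not tied to hexokinase or to the MAPK cascade.

```python
def mass_action_rate(k, enzyme, substrate):
    """Bimolecular mass action: the rate is proportional to both concentrations."""
    return k * enzyme * substrate


def michaelis_menten_rate(vmax, km, substrate):
    """Michaelis-Menten: saturating rate, valid when enzyme << substrate."""
    return vmax * substrate / (km + substrate)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for s in (0.1, 1.0, 10.0, 100.0):
        print(s, mass_action_rate(0.5, 2.0, s), michaelis_menten_rate(10.0, 2.0, s))
```

The mass-action rate keeps growing with substrate concentration, whereas the Michaelis-Menten rate levels off at vmax and equals vmax/2 when the substrate concentration equals km.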
61Spatial effects
Signaling: well stirred??? Low number of molecules, highly organised complexes, often membrane-bound. Spatial effects should be considered (a problem with ODEs), at least as compartmentalisation.
Metabolism: well stirred. Molecules are considered to meet with a probability according to their concentration (mass action). Spatial effects are usually neglected.
62Temporal characterisation
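Signaling (Heinrich et al., 2002): transition time, duration, amplitude
[Figure: signal time course (axis: time) with transition time, duration and amplitude marked]
Metabolism
- Time constants for reactions, e.g. A <-> B with rate constants k and k-
- Time constants for metabolites (definition acc. to Llorens et al. 1998), involving the stoichiometric coefficients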
63Conclusions
- Models for metabolism and signaling can use the same design principles.
- Metabolism and signaling may take place in
- different areas of the cells
- different regions of the concentration space
- different time scales
- Signaling models have to account for the hierarchy in the system.
- Regulatory couplings (feedback) distribute control in both cases.
64EXAMPLE: TGFbeta signal transduction, the SMAD
engine
65Overview of the pathway
- Ligand dimer binds to receptor heterotetramer (type I and II receptors, both Ser/Thr kinases)
- r-SMAD1/5/8 versus r-SMAD 2/3
- Phosphorylated r-SMAD binds SMAD4 and travels to the nucleus
- Ubiquitylation (SMURF1-dependent and independent)
66LET'S LOOK UP THE TGFbeta PATHWAY!
67Example Vilar et al. 2006, PLoS Computational
Biology
- Signal Processing in the TGFbeta Superfamily
Ligand/Receptor Network
6814 ligands, 5 type II and 7 type I receptors:
this results in 50 different ligand/receptor
complexes
Figure 2
From Vilar et al. 2006, 2(1): 0036-0045, PLoS
Computational Biology
69Unusual features of the TGFbeta pathway
- Simple core transduction engine (two SMAD channels: 2/3 and 1/5/8) but very complex, diverse responses (42 ligands, 5 type II and 7 type I receptors, 300 target genes)
- Receptors are constitutively internalised and recycled: only approx. 10% are present on the plasma membrane at any time
- Comparatively late activation peak: approx. 60 minutes (compare with EGFR: only 5 minutes)
- Several negative feedback loops, including
- constitutive degradation
- ligand-induced degradation (Smad7-Smurf2)
70Figure 3 (parameter values: klid = 1/4 min^-1, ki = 1/3 min^-1; time points 30 min and 60 min marked)
From Vilar et al. 2006, 2(1): 0036-0045, PLoS
Computational Biology
71Sources of experimental data
- Mitchell H, Choudhury A, Pagano RE, Leof EB. Ligand-dependent and independent transforming growth factor-beta receptor recycling regulated by clathrin-mediated endocytosis and Rab11. Mol Biol Cell, 2004, 15: 4166-4178
- Recycling rate: Figure 3 (approx. 30 min)
- Internalisation rate: Figure 4
- Di Guglielmo GM, Le Roy C, Goodfellow AF, Wrana JL. Distinct endocytic pathways regulate TGF-beta receptor signalling and turnover. Nat Cell Biol, 2003, 5: 410-421
- Internalisation rate: Table 1 - receptors are internalised through the clathrin pathway and lipid-caveolar compartments with similar rates
- Degradation rate: Figure 3, approx. 400 min
72Figure 3, Mitchell et al.
Figure 3. TGF-beta receptors recycle at the same
rate in the presence and absence of ligand. (A)
Mb202 1-18 cells were processed for imaging and
fluorescence quantitation as in Figure 2, B and
C, except 10 ng/ml GM-CSF was included in both
incubations. Bar, 10 µm. (B) Cultures were
labeled with 125I-Fab anti-GM-CSF receptor- for
2 h at 4°C in the presence ( ) or absence ( ) of
10 ng/ml GM-CSF. After washing and incubation at
37°C for 30 min (in the presence or absence of 10
ng/ml GM-CSF), labeled receptor antibody was
removed by acid wash and the cultures returned to
37°C. () Results are expressed as percentage of
the total cell-associated radioactive counts
after the first acid strip and before further
incubation at 37°C, and indicate the mean ± SD of
two experiments done in duplicate.
73Figure 4, Mitchell et al.
- Figure 4. TGF-beta receptors internalize at the
same rate regardless of activation state. Mb202
1-18 cells were prebound with radiolabeled
antibody in the presence ( ) or absence ( ) of 10
ng/ml GM-CSF as in Figure 3B and then incubated
at 37°C for the indicated times. Surface antibody
was removed by acid treatment at 4°C, after which
cells were processed to determine internalized
radioactivity (see Materials and Methods).
Results are expressed as percentage of total
cell-associated radioactive counts before
incubation at 37°C and indicate the mean ± SD of
two experiments done in duplicate.
74Table 1, Di Guglielmo et al. Quantitation of
TGF-beta receptor distribution by immunoelectron
microscopy
75Figure 4
From Vilar et al. 2006, 2(1): 0036-0045, PLoS
Computational Biology
76Plasma membrane concentrations
- lRiRii - ligand/heterotetramer receptor complex
- l - ligand
- Ri - receptor type I
- Rii - receptor type II
- ka - ligand/receptor complex formation rate
- kcd - constitutive degradation rate
- klid - ligand-induced degradation rate
- ki - internalisation rate
77Endosomal concentrations
- lRiRii - ligand/heterotetramer receptor complex
- Ri - receptor type I
- Rii - receptor type II
- ki - internalisation rate
- kr - recycling rate
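To make the role of these rate constants concrete, here is a rough Python sketch of a two-compartment receptor trafficking model built from the symbols defined on these two slides. The equation form is an assumption written for illustration and is not transcribed from the slides or from Vilar et al.: surface complexes form at rate ka, are lost by constitutive (kcd) and ligand-induced (klid) degradation, are internalised at rate ki, and everything in the endosome recycles to the surface as free receptor at rate kr. The values of ki and klid follow the figure annotation quoted earlier, kr = 1/30 per min follows the approx. 30 min recycling time, and kcd, ka, the ligand level and the receptor production rates p_ri and p_rii are hypothetical.

```python
PARAMS = {
    "ka": 1.0,        # complex formation rate (hypothetical)
    "kcd": 1 / 36,    # constitutive degradation, per min (hypothetical)
    "klid": 1 / 4,    # ligand-induced degradation, per min
    "ki": 1 / 3,      # internalisation, per min
    "kr": 1 / 30,     # recycling, per min
    "p_ri": 8.0,      # production of type I receptor (hypothetical)
    "p_rii": 4.0,     # production of type II receptor (hypothetical)
}


def derivatives(y, ligand, p):
    """Rates of change for surface and endosomal (suffix _in) species."""
    lrr, ri, rii, lrr_in, ri_in, rii_in = y
    binding = p["ka"] * ligand * ri * rii
    d_lrr = binding - (p["kcd"] + p["klid"] + p["ki"]) * lrr
    d_ri = p["p_ri"] - binding - (p["kcd"] + p["ki"]) * ri + p["kr"] * (ri_in + lrr_in)
    d_rii = p["p_rii"] - binding - (p["kcd"] + p["ki"]) * rii + p["kr"] * (rii_in + lrr_in)
    d_lrr_in = p["ki"] * lrr - p["kr"] * lrr_in
    d_ri_in = p["ki"] * ri - p["kr"] * ri_in
    d_rii_in = p["ki"] * rii - p["kr"] * rii_in
    return [d_lrr, d_ri, d_rii, d_lrr_in, d_ri_in, d_rii_in]


def no_ligand_steady_state(p):
    """Without ligand, production balances constitutive degradation."""
    ri = p["p_ri"] / p["kcd"]
    rii = p["p_rii"] / p["kcd"]
    ratio = p["ki"] / p["kr"]      # endosomal / surface receptor at steady state
    return [0.0, ri, rii, 0.0, ratio * ri, ratio * rii]


def simulate(y0, ligand, p, t_end, dt=0.01):
    """Euler integration after a step increase in ligand."""
    y = list(y0)
    for _ in range(int(round(t_end / dt))):
        dy = derivatives(y, ligand, p)
        y = [v + d * dt for v, d in zip(y, dy)]
    return y


if __name__ == "__main__":
    y0 = no_ligand_steady_state(PARAMS)
    print(simulate(y0, ligand=0.01, p=PARAMS, t_end=60.0))
```

With these rate constants the unstimulated steady state places kr/(ki + kr) = 1/11 of the free receptors on the surface, in line with the roughly 10% quoted among the unusual features of the pathway.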
78Slower rates for internalisation and recycling (ki = 1/10 min^-1, kr = 1/100 min^-1)
Late and long
Figure 5
From Vilar et al. 2006, 2(1): 0036-0045, PLoS
Computational Biology
79 CIR makes the difference
Figure 6
From Vilar et al. 2006, 2(1): 0036-0045, PLoS
Computational Biology
80PART 3: PHYLOGENETICS
81WHAT I WILL TALK ABOUT
- A BIT OF THEORY
- EXAMPLES (CRISPs AND SMADs)
- MEGA PACKAGE - HOWTO
- INTERPRETATIONS (what can a simple BLAST search,
multiple sequence alignment, or a tree, tell me
about BIOLOGY)
82First Things First (definitions)
- Phylogenetic analysis
- Phylogenetic tree
- rooted
- unrooted
- Homology
- paralogy
- orthology
- one-to-one
- co-orthology
- Nucleotide substitutions
- synonymous
- non-synonymous
83What is phylogenetics?
A phylogenetic analysis of a family of related nucleic acid or protein sequences is a determination of how the family members might have been derived during evolution.
Phylogenetic tree: a graphical representation that depicts evolutionary relationships between a set of related sequences. Most-alike sequences are placed at the outer ends of two branches that are joined below into a lower common branch, representing their derivation from an ancestral sequence. An unrooted tree does not provide information on the common ancestor of the group.
84The simplest tree
[Figure: a rooted tree drawn along an axis of evolutionary time, with the ancestral gene (ancestral species) at the root, a node, and branches leading to Gene A (Species A) and Gene B (Species B)]
85Who is Who of -ologs
Homologs: genes whose sequences are so similar that they almost certainly arose from a common ancestor gene.
(1) Orthologs are genes in different species that arose from a single gene in the most recent common ancestor of those species, that is, by a process of speciation.
(2) Paralogs, on the other hand, are genes in the same species that arose from a single gene in an ancestral species by a process of duplication.
86[Figure: gene tree along an axis of evolutionary time, from an ancestral gene to genes A1, A2, A2b, B1 and B2, with paralogs, co-orthologs and 1:1 orthologs marked]
87Synonymous or non-synonymous?
Non-synonymous substitution: a nucleotide substitution that results in an amino acid change (dn)
Synonymous substitution: a silent nucleotide substitution, often in the third codon position, that does not result in an amino acid change (ds)
dn/ds: the simplest test for the rate of evolution (< 1, > 1, = 1)
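The usual reading of the three dn/ds cases can be expressed as a small Python helper. This is a simplified sketch: the function name and the tolerance around 1 are arbitrary assumptions, and a real analysis would rely on a statistical test (for example the likelihood-based methods in PAML) rather than a fixed cut-off.

```python
def classify_selection(dn, ds, tolerance=0.05):
    """Return (omega, label) for a pair of substitution rates.

    omega < 1 suggests purifying selection, omega > 1 positive selection,
    and omega close to 1 neutral evolution. The tolerance is arbitrary.
    """
    if ds <= 0:
        raise ValueError("ds must be positive to form the ratio dn/ds")
    omega = dn / ds
    if abs(omega - 1.0) <= tolerance:
        return omega, "neutral evolution"
    if omega < 1.0:
        return omega, "purifying selection"
    return omega, "positive selection"


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    print(classify_selection(0.02, 0.40))
    print(classify_selection(0.60, 0.30))
```

Most protein-coding genes fall well below 1, so a ratio above 1 on a particular branch is what makes a case worth a closer look.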
88EXAMPLE: cysteine-rich secretory proteins (CRISPs)
89There are three CRISP genes in human, rat and
mouse. However, their nomenclature is misleading
- None of the genes are simple one-to-one orthologs
- A single ancestral gene at the base of the vertebrate lineage was most likely subject to two rounds of gene duplication before the human/rodent split, but the picture is complicated by species-specific duplications and lineage-specific losses
- A surprisingly high number of changes in gene expression patterns have occurred during the evolution of the CRISP family. For detailed discussion, please see (Huminiecki and Wolfe, Genome Research, 2004)
91EXAMPLE: TGFbeta signal transduction, the SMAD
engine
92Overview of the pathway
- Ligand dimer binds to receptor heterotetramer (type I and II receptors, both Ser/Thr kinases)
- r-SMAD1/5/8 versus r-SMAD 2/3
- Phosphorylated r-SMAD binds SMAD4 and travels to the nucleus
- Ubiquitylation (SMURF1-dependent and independent)
93Interesting phylogenetic phenomena
- DPP/BMP Type-1 receptor and an r-SMAD found in a non-bilaterian cnidarian (Acropora millepora): has the pathway evolved in a context other than dorsoventral patterning?
- Two SMAD4 in frogs: XSMAD4α and XSMAD4β. Also worms could have two co-SMADs (Sma-4 and Daf-3), but only one SMAD4 expected in mammals!
94What is the ancestral SMAD?
- Hypothesis: an ancestral SMAD (CoRe-SMAD) worked as a homodimer. The gene duplicated and gave rise to an r-SMAD and a co-SMAD
- But where did the i-SMADs come from?
- i-SMADs evolve faster (evidence: average dn/ds, length of protein branches, missing phosphorylation motif, and L3 sequence not conserved between DAD and i-SMAD6, 7)
- (((mad, dsmad2), medea),dad)
- (((((SMAD1,SMAD5), SMAD9),SMAD2, SMAD3), SMAD4), SMAD6, SMAD7)
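The two parenthesised trees are written in a Newick-like notation without branch lengths, which is easy to read programmatically. The short Python sketch below uses a hand-written recursive parser with hypothetical function names; it only handles plain leaf names and is not a replacement for a full Newick reader.

```python
def parse_newick(text):
    """Parse a parenthesised tree into nested lists of leaf names."""
    s = "".join(text.split()).rstrip(";")   # drop whitespace and a final ';'
    pos = 0

    def node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1                         # consume "("
            children = [node()]
            while s[pos] == ",":
                pos += 1                     # consume ","
                children.append(node())
            if s[pos] != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1                         # consume ")"
            return children
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos]                  # a leaf name

    tree = node()
    if pos != len(s):
        raise ValueError("unexpected trailing characters")
    return tree


def leaves(tree):
    """List leaf names from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


if __name__ == "__main__":
    fly = parse_newick("(((mad, dsmad2), medea),dad)")
    human = parse_newick("(((((SMAD1,SMAD5), SMAD9),SMAD2, SMAD3), SMAD4), SMAD6, SMAD7)")
    print(fly)
    print(leaves(human))
```

In the nested-list form the grouping is explicit: in the fly tree mad and dsmad2 (r-SMADs) pair first, medea (co-SMAD) joins them, and dad (i-SMAD) branches off outside that group.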
95Amino-acid PAM matrix, neighbour joining tree
Fascinating C. elegans SMADs
96Positive selection in sma/daf branches?
- Sma genes control body size, while daf genes control dauer formation. Lengths of protein branches suggested that daf genes underwent a period of very fast protein evolution. Could it be positive selection in response to environmental change? dn/ds test positive!
Daf    | corresponding SMAD | evidence
Daf-3  | co-SMAD(?)         | nj_PAM, newfeld2_MH1_ml
       | i-SMAD             | newfeld2_p-loop_degenerate
Daf-8  | r-SMAD             | nj_PAM, newfeld2_p-loop
Daf-14 | co-SMAD(?)         | nj_PAM
       | co-SMAD            | newfeld2_p-loop(2S)
Tag-68 | i-SMAD             | nj_PAM
97Interpretations of phylogenies: how could all this
help in my project?
I will propose just a few ideas. Please join in, voice your
suggestions, discuss your favourite gene family!!!
98Application 1: Evolutionary Saga, or my gene family over the eons
- Is the gene family present in bacteria, yeast, plants, non-bilaterian animals? To find out, just run a BLAST search against GenBank and read the names of the species with hits. Can one infer from this how old the family is?
- How many gene duplication events, and when did they occur? Have there been any deletions? Has the intron number changed, or are there no introns (suggestive of retroposition)?
- Can these events be correlated with the development of a new body plan, new organs, or novel physiology? Is this correlation supported by the sites of expression?
99Metazoan phylogeny
Vertebrates (fish, birds, mammals)
2R?
TGFbeta
FGF
Wnt
Cephalochordates
Urochordates
Hemichordates
Bilateria
Echinoderms (sea urchins, starfish)
Nematodes (?)
Arthropods (insects)
Annelids (leeches)
Molluscs (gastropods)
Flatworms
Cnidaria (jellyfish, coral)
Porifera (sponges)
100Expansion of the signal transduction toolkit
        | Cnidaria and porifera | C. elegans | Drosophila | Human
TGFbeta | 1(?)                  | 4          | -          | 27
Wnt     | >1                    | 5          | 7          | 18
FGF     | -                     | 1          | 1          | 23
Increased anatomical complexity (diversification
of body plans and body parts)
101Application 2: My Gene and the Genome, or how does my favourite gene compare to other members of the gene family?
- How many related genes, how similar, and in what physical location in the genome (most duplications are tandem, head-to-tail)?
- Evidence for functional redundancy? (important for knockouts)
- Tissue-specific expression patterns, or do they overlap (expression.gnf.org)?
- Genomic context (www.ensembl.org)
102Application 3: Special Sites in my Gene
- Multiple sequence alignment: regions of conservation, regions of change
- Important for the design of my next deletion mutant, hybridization probe, or a set of primers
- Visual inspection of the multiple sequence alignment will be sufficient in most cases (check out Pfam or ENSEMBL for precomputed alignments of your favourite family, www.ensembl.org)
103Reference: Bioinformatics: Sequence and Genome Analysis. David W. Mount, CSHL lab manual series. Great introduction to the field
104Reference: Molecular Evolution and Phylogenetics. Masatoshi Nei, Sudhir Kumar. Nuts and bolts of tree drawing methods
105Reference: From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. S. Carroll, J. Grenier, S. Weatherbee. Interpretations