Model-based investigation of bacterial metabolism using gene essentiality data.

1
Model-based investigation of bacterial metabolism
using gene essentiality data.
  • PhD defense: Maxime Durot
  • PhD prepared in the Computational Systems Biology
    Group at Genoscope
  • under the supervision of Vincent Schachter and
    Jean Weissenbach

2
Motivation & goals of the thesis
3
Metabolism
Picture: Roche Applied Science
http://www.expasy.org/tools/pathways/
4
Information from two scales
[Figure: genome, metabolism, and phenotype, placed between the molecular scale and the cellular scale]
5
Mutant phenotyping experiments
[Figure: a wild-type bacterium with its genome and wild-type growth phenotype, next to a knock-out mutant with one deleted gene and its mutant growth phenotype]
  • Mutant phenotype
  • No growth: gene is essential in the tested
    environment
  • Growth: gene is dispensable in the tested
    environment
  • Experiments are performed genome-wide for a
    growing number of organisms (Gerdes et al, Curr
    Opin Biotechnol 2006)

6
Confronting the two scales is complex
7
Modeling metabolism can help
(Stelling, Curr Opin Microbiol. 2004)
8
The constraint-based modeling framework
  • Key concepts
  • variable of interest: reaction fluxes

[Figure: example network with external metabolites A(ext), B(ext), P(ext), internal metabolites A, B, C, D, P, and reactions R1 to R9]
9
The constraint-based modeling framework
  • Key concepts
  • variable of interest: reaction fluxes

[Figure: the example network with a flux value on each reaction in place of its label; values as listed on the slide: 0.5, 1.5, 1, 0, 0, 0.5, 0.5, 1, 1]
10
The constraint-based modeling framework
  • Key concepts
  • variable of interest: reaction fluxes
  • constraint-based approach: applying constraints
    to the model reduces the possible flux
    distributions

[Figure: example network (reactions R1 to R9) next to the space of admissible flux distributions, with axes v1, v2, v3]
11
The constraint-based modeling framework
  • Key concepts
  • variable of interest: reaction fluxes
  • constraint-based approach: applying constraints
    to the model reduces the possible flux
    distributions
  • Classical constraints
  • metabolism in steady state: metabolite
    concentrations remain constant
  • some reactions are irreversible
  • flux values are bounded by a maximal value

[Figure: example network (reactions R1 to R9) next to the space of admissible flux distributions]
Applicable at genome scale
12
The constraint-based modeling framework
  • Key concepts
  • variable of interest: reaction fluxes
  • constraint-based approach: applying constraints
    to the model reduces the possible flux
    distributions
  • explore the space of admissible flux
    distributions
  • Classical constraints
  • metabolism in steady state: metabolite
    concentrations remain constant
  • some reactions are irreversible
  • flux values are bounded by a maximal value

[Figure: example network (reactions R1 to R9) next to the space of admissible flux distributions]
Applicable at genome scale
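
As a minimal sketch of the classical constraints listed above, the Python snippet below tests whether a flux distribution is admissible: steady state, irreversibility, and bounded flux values. The toy network, its stoichiometry, its bounds, and the function name are assumptions made for illustration; they are not the network drawn on the slide.

```python
# Hypothetical toy network (NOT the one on the slide):
#   R1: A(ext) -> A   uptake, irreversible
#   R2: A -> B        irreversible
#   R3: A -> C        irreversible
#   R4: B <-> C       reversible
#   R5: C -> P(ext)   secretion, irreversible
REACTIONS = ["R1", "R2", "R3", "R4", "R5"]

# Stoichiometric matrix: one row per internal metabolite, one column per reaction.
S = {
    "A": [1, -1, -1, 0, 0],
    "B": [0, 1, 0, -1, 0],
    "C": [0, 0, 1, 1, -1],
}

# (lower, upper) flux bounds; a lower bound of 0 encodes irreversibility.
BOUNDS = {
    "R1": (0, 10), "R2": (0, 10), "R3": (0, 10),
    "R4": (-10, 10), "R5": (0, 10),
}


def is_admissible(fluxes, tol=1e-9):
    """Return True if the flux distribution satisfies all constraints."""
    v = [fluxes[r] for r in REACTIONS]
    # Steady state: production and consumption of each metabolite cancel out.
    for coeffs in S.values():
        if abs(sum(c * x for c, x in zip(coeffs, v))) > tol:
            return False
    # Irreversibility and maximal values: every flux stays within its bounds.
    return all(BOUNDS[r][0] <= fluxes[r] <= BOUNDS[r][1] for r in REACTIONS)


balanced = {"R1": 1.5, "R2": 0.5, "R3": 1.0, "R4": 0.5, "R5": 1.5}
unbalanced = {"R1": 1.5, "R2": 0.5, "R3": 0.5, "R4": 0.5, "R5": 1.5}
backwards = {"R1": 1, "R2": 2, "R3": -1, "R4": 2, "R5": 1}

print(is_admissible(balanced))    # True: balanced and within bounds
print(is_admissible(unbalanced))  # False: metabolite A accumulates
print(is_admissible(backwards))   # False: R3 runs against its direction
```

Each constraint removes flux vectors from the candidate set, which is the sense in which applying constraints reduces the possible flux distributions.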
13
Models and gene essentiality datasets
  • Constraint-based models can predict growth
    phenotypes for genetic and environmental
    perturbations (Price et al, Nat Rev Microbiol
    2004)(Durot et al, FEMS Microbiol Rev 2009)
  • Gene essentiality datasets have been used to
    provide rough assessments of metabolic models
    (Covert et al, Nature 2004)(Joyce et al,
    J Bacteriol 2006)
  • Compute predictive accuracy for gene essentiality
    prediction
  • List of inconsistencies, used as a starting point
    for curation
  • Can gene essentiality datasets be used more
    systematically for metabolic model assessment &
    refinement?

14
Objectives of the thesis
  1. Develop a framework for the refinement of
    metabolic models using gene essentiality data

15
Context: the Metabolic Thesaurus project
  • Acinetobacter baylyi ADP1
  • γ-proteobacteria, Pseudomonadales group
  • Nutritionally versatile, strictly aerobic
  • Non-pathogenic
  • Evidence of xenobiotic degradation capabilities
  • Experimental context
  • Reliable genome annotation (Barbe et al, Nucleic
    Acids Res 2004)
  • Comprehensive knock-out mutant collection (de
    Berardinis et al, Mol Syst Biol 2008)
  • Phenotyping capability: complete conditional
    essentiality datasets on several media (de
    Berardinis et al, Mol Syst Biol 2008)

16
Objectives of the thesis
  • Develop a framework for the refinement of
    metabolic models using gene essentiality data
  • Application to Acinetobacter baylyi metabolism
  • reconstruct a global metabolic model from its
    genome annotation
  • assess and refine the model using mutant
    phenotypes
  • point out poorly understood metabolic events
    requiring further experimental investigation

17
Outline
  • A/ A formal framework for comparing predicted and
    experimental gene essentialities
  • B/ Reconstruction and refinement of A. baylyi
    metabolic model using mutant phenotypes
  • C/ Automated reasoning with metabolic models and
    essentiality data

18
A/ A formal framework for comparing predicted and
experimental gene essentialities
19
Model refinement using experimental data
Improved metabolic reconstruction
20
Formal representation of a metabolic model
  • Model refinement using large-scale genetics data
    requires
  • Computer generation of variants of models
  • Understanding the impact of model variations on
    phenotype predictions
  • Problem
  • Constraint-based models appear to be complex
    mathematical objects
  • An appropriate representation of metabolic models
    is required to perform automated reasoning with
    essentiality

21
Formal representation of a metabolic model
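[Figure: model components in sequence: genetic background, GPR, set of reactions fulfilling the modeling constraints]
  • Boolean gene-reaction associations (GPR)

[Figure: GPR example with four levels: genes (g1, g2), proteins (p1, p2), complex (c1), reactions (r1, r2). Boolean rules: r1 = g1; r2 = g1 and g2]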
22
Formal representation of a metabolic model
[Figure: model components in sequence: genetic background, GPR, set of reactions fulfilling the modeling constraints, metabolites of the medium, producible metabolites]
  • Boolean gene-reaction associations (GPR)
  • Set of metabolic reactions (NETWORK)

23
Formal representation of a metabolic model
[Figure: model components in sequence: genetic background, GPR, set of reactions fulfilling the modeling constraints, metabolites of the medium, producible metabolites, essential biomass precursors]
  • Boolean gene-reaction associations (GPR)
  • Set of metabolic reactions (NETWORK)
  • List of essential biomass precursors (BIOMASS)
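
The three components can be sketched in a few lines of Python. Everything below is a hypothetical toy model with invented gene, reaction, and metabolite names. Producibility is computed by simple forward propagation from the medium, which is a simplified stand-in for the constraint-based test used in the thesis, not a reimplementation of it.

```python
GENES = ["g1", "g2", "g3", "g4"]

# GPR: Boolean rule per reaction (True = gene present).
GPR = {
    "r1": lambda g: g["g1"],              # r1 = g1
    "r2": lambda g: g["g1"] and g["g2"],  # r2 = g1 and g2 (enzyme complex)
    "r3": lambda g: g["g3"] or g["g4"],   # r3 = g3 or g4 (isozymes)
}

# NETWORK: reaction -> (substrates, products).
NETWORK = {
    "r1": ({"glucose"}, {"a"}),
    "r2": ({"a"}, {"b"}),
    "r3": ({"a"}, {"c"}),
}

# BIOMASS: essential biomass precursors.
BIOMASS = {"b", "c"}


def active_reactions(deleted):
    """Reactions whose Boolean rule still holds after deleting the genes."""
    state = {g: g not in deleted for g in GENES}
    return {r for r, rule in GPR.items() if rule(state)}


def producible(medium, reactions):
    """Metabolites reachable from the medium (simplified propagation)."""
    found = set(medium)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            substrates, products = NETWORK[r]
            if substrates <= found and not products <= found:
                found |= products
                changed = True
    return found


def predicts_growth(deleted, medium):
    """Growth is predicted if every biomass precursor is producible."""
    return BIOMASS <= producible(medium, active_reactions(deleted))


def essential_genes(medium):
    """Genes whose single deletion abolishes predicted growth."""
    return [g for g in GENES if not predicts_growth({g}, medium)]


print(essential_genes({"glucose"}))  # ['g1', 'g2']
```

In this toy model g3 and g4 are each dispensable because they back each other up, while deleting both at once is predicted lethal.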

24
Predicting mutant phenotypes
genetic perturbation
25
Confronting model predictions with experiments
  • Comparison of predictions with experiments reveals
    inconsistencies

26
Classifying inconsistencies according to likely
cause & correction type
Type of inconsistency: False essential / False dispensable
GPR
  • False essential: decrease impact of gene deletion
    on reaction set (add an alternate enzyme; gene is
    a non-essential subunit of a complex; reaction may
    occur spontaneously)
  • False dispensable: increase impact of gene
    deletion on reaction set (remove an isozyme; form
    a complex instead of isozyme; gene has an
    additional essential role)
NETWORK
  • False essential: augment reaction set (add an
    alternate pathway)
  • False dispensable: reduce reaction set (remove or
    block an alternate pathway)
BIOMASS
  • False essential: reduce biomass requirements
    (remove a biomass precursor)
  • False dispensable: augment biomass requirements
    (add a biomass precursor)
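
A small Python sketch of how this classification could be encoded. It assumes that "false essential" means predicted essential but observed dispensable, and "false dispensable" the reverse. The function names and the dictionary layout are hypothetical; the correction directions are transcribed from the slide.

```python
def classify(predicted_essential, observed_essential):
    """Label one gene by comparing model prediction with experiment."""
    if predicted_essential and not observed_essential:
        return "false essential"
    if observed_essential and not predicted_essential:
        return "false dispensable"
    return "consistent"


# Correction direction per inconsistency type and model component.
CORRECTIONS = {
    "false essential": {
        "GPR": "decrease impact of gene deletion on reaction set",
        "NETWORK": "augment reaction set",
        "BIOMASS": "reduce biomass requirements",
    },
    "false dispensable": {
        "GPR": "increase impact of gene deletion on reaction set",
        "NETWORK": "reduce reaction set",
        "BIOMASS": "augment biomass requirements",
    },
}


def candidate_corrections(predicted_essential, observed_essential):
    """Return the correction directions to examine, or {} if consistent."""
    label = classify(predicted_essential, observed_essential)
    return CORRECTIONS.get(label, {})


print(classify(True, False))
print(candidate_corrections(False, True)["NETWORK"])
```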
27
B/ Reconstruction and refinement of A. baylyi
metabolic model using mutant phenotypes
28
A. baylyi model reconstruction
  • Two-step process
  • Identify all metabolic reactions occurring in the
    cell
  • Adapt representation to modeling requirements

29
1/ Metabolic network reconstruction
30
2/ Adapt to modeling requirements
  • Specific developments made for A. baylyi model
  • Automated expansion of generic pathways
  • Inference of enzyme complexes by homology to E.
    coli

31
Initial model reconstruction
  • 859 reactions using 697 metabolites, linked with
    787 genes
  • 109 metabolites that are exchangeable with the
    environment

32
Evidence supporting the enzymatic function of
model genes
33
Experimental datasets
Dataset 1
  • Growth phenotypes of wild-type strain on 190
    carbon sources
  • Results
  • Growth on 45 carbon sources
  • No growth on remaining 145 carbon sources
Dataset 2
  • Genome-wide gene essentialities from A. baylyi
    mutant collection construction
  • Selection on succinate minimal medium
  • Gene essentiality results

(de Berardinis et al, Mol Syst Biol 2008)
34
Iterative refinement of A. baylyi model
Initial reconstruction from: genome annotation, pathway databases, literature
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v1
35
Model refinement using dataset 1
iAbaylyi v1
86% overall prediction accuracy
24 / 45 (53%) correctly predicted carbon sources
140 / 145 (97%) correctly predicted non carbon sources
36
Iterative refinement of A. baylyi model
Initial reconstruction from: genome annotation, pathway databases, literature
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v1
Model accuracy: 88% on dataset 1
37
Iterative refinement of A. baylyi model
Initial reconstruction from: genome annotation, pathway databases, literature
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v1
Model accuracy: 88% on dataset 1
Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)

Gene       Status
ACIAD0001  NA
ACIAD0002  Essential
ACIAD0003  Dispensable
ACIAD0004  Essential
ACIAD0005  Dispensable
ACIAD0006  Dispensable
38
Model refinement using dataset 2
iAbaylyi v2
88% overall prediction accuracy
187 / 251 (75%) correctly predicted essential genes
489 / 516 (95%) correctly predicted dispensable genes
40
Iterative refinement of A. baylyi model
Initial reconstruction from: genome annotation, pathway databases, literature
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v1
Model accuracy: 88% on dataset 1
Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)

Gene       Status
ACIAD0001  NA
ACIAD0002  Essential
ACIAD0003  Dispensable
ACIAD0004  Essential
ACIAD0005  Dispensable
ACIAD0006  Dispensable

Dataset 3: growth phenotypes of A. baylyi mutant collection on 8 minimal media, quantitative growth measure (2350 strains x 8 media)
41
Model refinement using dataset 3
iAbaylyi v3
93% overall prediction accuracy
16 / 36 (44%) correctly predicted gene phenotypes with at least 1 essentiality
406 / 419 (97%) correctly predicted gene phenotypes with no essentiality
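
The overall accuracies quoted for the three datasets can be recomputed from the per-class counts on the slides. The sketch below assumes that overall accuracy is the number of correct predictions over both classes divided by the total number of predictions; the function name is hypothetical.

```python
def overall_accuracy(correct_pos, total_pos, correct_neg, total_neg):
    """Fraction of correct predictions over both classes."""
    return (correct_pos + correct_neg) / (total_pos + total_neg)


# (correct, total) for the positive class, then for the negative class.
COUNTS = {
    "dataset 1": (24, 45, 140, 145),    # carbon / non carbon sources
    "dataset 2": (187, 251, 489, 516),  # essential / dispensable genes
    "dataset 3": (16, 36, 406, 419),    # with / without essentiality
}

PERCENT = {name: round(100 * overall_accuracy(*c)) for name, c in COUNTS.items()}
print(PERCENT)  # {'dataset 1': 86, 'dataset 2': 88, 'dataset 3': 93}
```

The rounded values match the 86, 88, and 93 reported on the slides. In each dataset the negative class is predicted more accurately than the positive class.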
43
GPR correction example
  • ACIAD0661 (hisG) and ACIAD1257 (hisZ) were
    initially assigned as isozymes for the ATP
    phosphoribosyltransferase reaction.
  • Observed essentiality of both genes suggests they
    are both necessary for the activity.
  • Further examination of the literature confirms
    that both proteins form an enzymatic complex
    (Sissler et al, PNAS 1999)

[Figure: PRPP -> phosphoribosyl-ATP (ATP phosphoribosyltransferase, GPR: ACIAD0661 OR ACIAD1257) -> histidine -> protein. Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
44
GPR correction example
[Figure: the same pathway before and after correction: PRPP -> phosphoribosyl-ATP (ATP phosphoribosyltransferase) -> histidine -> protein, with the GPR changed from ACIAD0661 OR ACIAD1257 to ACIAD0661 AND ACIAD1257. Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
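
The effect of this correction on predictions can be illustrated with a short Python sketch. It assumes that the ATP phosphoribosyltransferase reaction is itself required for growth, so a gene is predicted essential exactly when its deletion inactivates the reaction. The function names are hypothetical; the gene identifiers come from the slide.

```python
GENES = ("ACIAD0661", "ACIAD1257")  # hisG, hisZ


def rule_isozymes(present):
    """Initial GPR: ACIAD0661 OR ACIAD1257."""
    return present["ACIAD0661"] or present["ACIAD1257"]


def rule_complex(present):
    """Corrected GPR: ACIAD0661 AND ACIAD1257."""
    return present["ACIAD0661"] and present["ACIAD1257"]


def predicted_essential(rule, gene):
    """True if deleting this gene alone switches the reaction off."""
    present = {g: g != gene for g in GENES}
    return not rule(present)


before = {g: predicted_essential(rule_isozymes, g) for g in GENES}
after = {g: predicted_essential(rule_complex, g) for g in GENES}
print(before)  # both genes predicted dispensable
print(after)   # both genes predicted essential
```

With the OR rule each gene masks the loss of the other, which produces two false-dispensable predictions; the AND rule reproduces the observed essentiality of both genes.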
45
Network correction example
  • ACIAD0822-0824 (gatABC) annotated as an
    aspartyl/glutamyl-tRNA amidotransferase
  • gatABC are essential: only way to produce
    asparagine.
  • ACIAD1920 (glnS) catalyzes direct charging of
    glutamine on its tRNA
  • Essentiality of ACIAD1920 suggests that the gatABC
    pathway is not effective for glutamine

[Figure: tRNA-charging routes. Asparagine: aspartate -> aspartate-tRNA(asn) (ACIAD0609) -> asparagine-tRNA(asn) (ACIAD0822 AND ACIAD0823 AND ACIAD0824) -> protein. Glutamine: glutamate -> glutamate-tRNA(gln) (ACIAD3371 OR ACIAD0272) -> glutamine-tRNA(gln) (ACIAD0822 AND ACIAD0823 AND ACIAD0824) -> protein, and glutamine -> glutamine-tRNA(gln) (ACIAD1920) -> protein. Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
46
Network correction example
[Figure: the same routes before and after correction. The corrected network keeps aspartate -> aspartate-tRNA(asn) (ACIAD0609) -> asparagine-tRNA(asn) (ACIAD0822 AND ACIAD0823 AND ACIAD0824) and glutamine -> glutamine-tRNA(gln) (ACIAD1920), and no longer includes the route glutamate -> glutamate-tRNA(gln) (ACIAD3371 OR ACIAD0272) -> glutamine-tRNA(gln). Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
47
A. baylyi model refinement
48
Online prediction of mutant phenotypes
(Le Fèvre et al, Bioinformatics 2009)
49
C/ Automated reasoning with metabolic models and
essentiality data
50
Automated reasoning on gene-reaction associations
GPR
  • Use phenotypes as specifications for
    gene-reaction associations
  • Assume NETWORK and BIOMASS parts of the model are
    correct
  • For each inconsistency
  • search all GPRs compatible with experimental data

51
1/ Deduce impact scenarios from phenotypes
  • Equivalent view of gene-reaction associations
  • Deletion impact
  • Impact(deletion of G1,...,Gn) = R1,...,Rp
    inactivated
  • Key idea
  • Phenotypes of reaction deletions can be predicted
  • Compatible deletion impacts must follow the
    rules:
  • lethal gene deletions must impact an essential
    reaction set
  • viable gene deletions must not impact any
    essential reaction set
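
The two rules can be written as a compatibility check. In this sketch, "impacting an essential reaction set" is read as inactivating every reaction of a set whose joint loss prevents growth; that reading, the data layout, and all names are assumptions made for illustration.

```python
def is_compatible(impact, phenotypes, essential_sets):
    """Check a proposed deletion impact against observed phenotypes.

    impact:         deletion (tuple of genes) -> set of inactivated reactions
    phenotypes:     deletion (tuple of genes) -> "lethal" or "viable"
    essential_sets: reaction sets whose joint inactivation prevents growth
    """
    for deletion, phenotype in phenotypes.items():
        # A deletion hits if it inactivates all reactions of some essential set.
        hits = any(ess <= impact[deletion] for ess in essential_sets)
        # Lethal deletions must hit; viable deletions must not.
        if hits != (phenotype == "lethal"):
            return False
    return True


# Hypothetical example: R1 alone is essential; R2 and R3 back each other up.
essential_sets = [{"R1"}, {"R2", "R3"}]
phenotypes = {("G1",): "lethal", ("G2",): "viable"}

scenario_a = {("G1",): {"R1"}, ("G2",): {"R2"}}
scenario_b = {("G1",): {"R2"}, ("G2",): {"R2"}}

print(is_compatible(scenario_a, phenotypes, essential_sets))  # True
print(is_compatible(scenario_b, phenotypes, essential_sets))  # False
```

Scenario b fails because the lethal G1 deletion only inactivates R2, which is not sufficient on its own to prevent growth in this toy setting.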

52
1/ Deduce impact scenarios from phenotypes
  • For each inconsistency, generate all possible
    impact scenarios
  • Closed-world assumption
  • the set of genes potentially linked to a reaction
    is known

53
1/ Deduce impact scenarios from phenotypes
[Figure: impact scenario 1]
54
1/ Deduce impact scenarios from phenotypes
[Figure: impact scenario 2]
55
1/ Deduce impact scenarios from phenotypes
[Figure: impact scenario 3]
56
1/ Deduce impact scenarios from phenotypes
[Figure: impact scenario 4]
57
2/ Implement proposed impacts with GPR
  • Choose an impact scenario
  • For each reaction, find Boolean rules
    implementing the impacts
  • analogy to logic circuit design
  • GPR specificity: no negation rule
  • monotonic increasing Boolean function (F(0,0) ≤
    F(1,0) ≤ F(1,1))
  • constrains the possible implementations

58
2/ Implement proposed impacts with GPR
  • Specifications for R1
  • G1 deletion does not impact R1
  • G2 deletion does not impact R1
  • G3 deletion does impact R1

[Figure: scenario 1, with genes G1, G2, G3, G4 and reactions R1, R2]
  • Truth table for R1

From the specifications:
G1  G2  G3  GPR
0   0   0
1   0   0
0   1   0
1   1   0   0
0   0   1
1   0   1   1
0   1   1   1
1   1   1

After applying monotony:
G1  G2  G3  GPR
0   0   0   0
1   0   0   0
0   1   0   0
1   1   0   0
0   0   1
1   0   1   1
0   1   1   1
1   1   1   1
59
2/ Implement proposed impacts with GPR
  • Multiple solutions
  • Generate all possible cases

G1  G2  G3  GPR
0   0   0   0
1   0   0   0
0   1   0   0
1   1   0   0
0   0   1   ?
1   0   1   1
0   1   1   1
1   1   1   1
GPR = G3 (if ? = 1)
GPR = G3 and (G1 or G2) (if ? = 0)
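
The search for compatible rules can be reproduced by brute force on this three-gene example. The sketch below enumerates all truth tables over G1, G2, G3, keeps those that match the single-deletion specifications for R1 and are monotone (no negation), and reports the rows left undetermined. It is a minimal illustration with hypothetical names, not the AutoGPR implementation.

```python
from itertools import product

GENES = ("G1", "G2", "G3")
ROWS = list(product((0, 1), repeat=len(GENES)))  # 1 = gene present

# Specifications for R1 taken from the single deletions on the slide.
SPEC = {
    (0, 1, 1): 1,  # G1 deletion does not impact R1
    (1, 0, 1): 1,  # G2 deletion does not impact R1
    (1, 1, 0): 0,  # G3 deletion does impact R1
}


def is_monotone(table):
    """No negation: adding genes can never switch the reaction off."""
    return all(
        table[a] <= table[b]
        for a in ROWS
        for b in ROWS
        if all(x <= y for x, y in zip(a, b))
    )


def compatible_tables():
    """Yield every monotone truth table that satisfies the specifications."""
    for outputs in product((0, 1), repeat=len(ROWS)):
        table = dict(zip(ROWS, outputs))
        if all(table[row] == out for row, out in SPEC.items()) and is_monotone(table):
            yield table


solutions = list(compatible_tables())
undetermined = [row for row in ROWS if len({t[row] for t in solutions}) > 1]

print(len(solutions))  # 2
print(undetermined)    # [(0, 0, 1)]
```

The two surviving tables are those of G3 and of G3 and (G1 or G2). They differ only on row (0, 0, 1), that is, G1 and G2 absent with G3 present.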
60
2/ Implement proposed impacts with GPR
  • Multiple solutions
  • Generate all possible cases
  • Choose closest behavior to the original GPR
  • Propose experiment to fully determine the Boolean
    rule
  • G1, G2 double deletion here

G1  G2  G3  GPR
0   0   0   0
1   0   0   0
0   1   0   0
1   1   0   0
0   0   1   ?
1   0   1   1
0   1   1   1
1   1   1   1
61
Comparing AutoGPR proposals with expert
interpretations
  • Comparison with manual corrections of A. baylyi
    model

62
Comparing AutoGPR proposals with expert
interpretations
  • Comparison for S. cerevisiae model
  • iND750 model predictions compared with gene
    essentiality data on 8 environments (Duarte et
    al, Genome Res 2004)
  • Inconsistent predictions were manually
    interpreted (not corrected)

63
Number of generated proposals for A. baylyi
64
Reducing complexity
  • First, simply test the existence of GPR
    corrections
  • Impose similar reactions to have similar GPR

65
Examining corrections across environments
  • GPR corrections can contradict each other across
    environments

(Durot et al, BMC Syst Biol 2008)
  • Possible interpretations
  • Inconsistencies between experimental conditions
  • Error in NETWORK or BIOMASS model components
  • GPR are not constant across environments
  • Conditional expression of genes
  • Regulatory interactions intervene

66
Conclusion & perspectives
67
Main contributions
  • Reconstruction of a global metabolic model of A.
    baylyi
  • Development of a framework for interpreting
    inconsistent growth phenotype predictions
  • Systematic interpretation of A. baylyi mutant
    phenotypes using its metabolic model
  • Design of an automated method to reason on GPR
    corrections from gene essentialities

68
Perspectives
  • A. baylyi metabolic model
  • Tool to integrate further experimental data
  • RNA-seq, metabolomics on A. baylyi and mutants
  • Metabolic model reconstruction
  • Automate the reconstruction process from genome
    annotation
  • Systematically assess model correctness using
    high-throughput experimental data
  • -> Microme European project to be started

69
Acknowledgments
Supervisors: Vincent Schachter, Jean Weissenbach
Metabolic Thesaurus experimental work: Marcel Salanoubat, Véronique de Berardinis, Alain Perret, Marielle Besnard, Christophe Lechaplais, Agnès Pinet
Acinetobacter baylyi annotation: Claudine Médigue, David Vallenet, Valérie Barbe, Georges Cohen, Nuria Fonknechten, Annett Kreimeyer
Computational Systems Biology group: François Le Fèvre, Gilles Vieira, Richard Baran, Pierre-Yves Bourguignon, Serge Smidtas (former members)
70
Discussion