Title: Major Challenges for Computational Biology
1. Major Challenges for Computational Biology
- Representing incredibly complex biological knowledge
- Making predictions from complex models (i.e., reasoning)
- Checking or revising models using minuscule amounts of data
- Dealing with pervasive uncertainty in knowledge and data
2. Challenge 1: Representing Biological Models
A computationally manipulable biological model must be:
- Process-based (describe complex biochemical and biomechanical mechanisms)
- Combined qualitative and quantitative (analytical models are unknown and parameters are very difficult to determine)
- Abstract/hierarchical (need to ignore details at many levels)
- Causal and reason-able (need to describe and/or compute chains of effect)
3. Represent Biochemical Reactions
Yet Another Qualitative Biochemical Model
Simplified molecular models (only the necessary features -- NO GRAPHICS!):
  (specific molecule glucose ((6 C)))
Abstract qualitative reactions (pathways are not explicitly represented):
  (cytosolic reaction (glucose atp) -> (g6p adp) (enzyme Hexokinase 22-40C))
Reactions are relegated to cellular compartments:
  (transport reaction pyruvate cytosolic -> mitochondrial)
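The three kinds of statement above (molecule, compartment-bound reaction, transport) can be mirrored as plain data records. The following is only a minimal sketch under stated assumptions: the class names, field names, and the helper `reactions_in` are hypothetical and are not taken from the original system, which used a Lisp-style notation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Molecule:
    """A simplified molecule: only the features the model needs."""
    name: str
    carbons: int


@dataclass(frozen=True)
class Reaction:
    """A qualitative reaction tied to one cellular compartment."""
    compartment: str
    substrates: Tuple[str, ...]
    products: Tuple[str, ...]
    enzyme: Optional[str] = None


@dataclass(frozen=True)
class Transport:
    """Movement of one molecule between two compartments."""
    molecule: str
    source: str
    target: str


# The three example statements from the slide, as records.
glucose = Molecule("glucose", carbons=6)
hexokinase_step = Reaction("cytosolic", ("glucose", "atp"), ("g6p", "adp"),
                           enzyme="Hexokinase")
pyruvate_import = Transport("pyruvate", "cytosolic", "mitochondrial")


def reactions_in(compartment, reactions):
    """Return the reactions that take place in the given compartment."""
    return [r for r in reactions if r.compartment == compartment]
```

Note that pathways are still not represented explicitly here: a pathway would be computed from the reaction records rather than stored.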
4. Example Reactions from Glycolysis and the TCA Cycle
CYTOSOLIC: glucose + ATP ---Hexokinase--> glucose 6-phosphate + ADP
CYTOSOLIC: 1,3-bisphosphoglycerate + ADP ---Phosphoglycerate kinase--> 3-phosphoglycerate + ATP
MITOCHONDRIAL: isocitrate + NAD+ ---Isocitrate dehydrogenase--> a-ketoglutarate + NADH + H+ + CO2
MITOCHONDRIAL: succinyl CoA + GDP + phosphate ---Succinyl CoA synthase--> succinate + GTP + CoA
5. A Simplified Model of Photosynthetic Regulation
How do plants modify their photosynthetic apparatus in high light?
(Although qualitative, the model relates continuous variables, as with formalisms from qualitative physics; e.g., Forbus, 1984.)
6. Dealing with Temporal Phenomena
Some biological processes occur over extended periods of time. To deal with such phenomena, we need methods that:
- Represent biological models with time-delayed effects
- Utilize these time-delayed models to make predictions
- Evaluate alternative models in terms of their fit to data
- Carry out search through the space of alternative models
We have extended our framework to handle qualitative causal models with time delays and have done initial evaluations.
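To make "qualitative causal model with time delays" concrete, here is a small sketch under stated assumptions. It treats a model as a list of signed, delayed links and predicts when each variable first changes, and in which direction, after a perturbation. The function `predict`, the edge format, and the variable names are all hypothetical; the original framework is not described in enough detail here to reproduce it.

```python
import heapq


def predict(edges, start, direction=+1):
    """Predict the first qualitative change of each variable.

    edges: list of (source, target, sign, delay); sign is +1 or -1.
    Returns {variable: (time_of_first_change, direction_of_change)}.
    Later, possibly conflicting, influences are ignored in this sketch.
    """
    first_change = {}
    queue = [(0, start, direction)]
    while queue:
        time, var, sign = heapq.heappop(queue)
        if var in first_change:
            continue  # an earlier influence already reached this variable
        first_change[var] = (time, sign)
        for source, target, edge_sign, delay in edges:
            if source == var:
                heapq.heappush(queue, (time + delay, target, sign * edge_sign))
    return first_change


# Hypothetical model: stress raises a sensor signal after 1 time step,
# which raises gene_a after 2 more steps and lowers gene_b after 3.
toy_edges = [
    ("stress", "sensor", +1, 1),
    ("sensor", "gene_a", +1, 2),
    ("sensor", "gene_b", -1, 3),
    ("gene_b", "gene_a", -1, 1),
]
prediction = predict(toy_edges, "stress")
```

Predictions of this form (a direction of change and a time of onset per variable) are the kind of output that could then be compared against time-series data when evaluating alternative models.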
7. A Regulatory Model with Time Delays
This model predicts the system's qualitative behavior over time.
10. Photosynthesis (light reactions)
Photosynthesis: The Turing Test of biological knowledge representation
http://www.bio.ic.ac.uk/research/barber/photosystemII.html
11. The sad pinnacle of current representational practice
From KEGG, the Kyoto Encyclopedia of Genes and Genomes
12. The sad pinnacle of current representational practice
From GenNav, the NIH Gene Ontology Browser
13. Doing better: Detailed qualitative representation
(photosynthesis isa process with
  inputs (chloroplast-inside.water everywhere.light chloroplast-outside.nadph
          chloroplast-outside.adp chloroplast-outside.pi)
  outputs (chloroplast-outside.atp chloroplast-outside.nadph everywhere.o2)
  implemented-by photosystem)
(photosystem composition (psii antenna-array atpase pq-pool))
(light-absorption isa process with
  inputs (everywhere.light)
  outputs (chlorophyll.energy)
  function absorption
  implemented-by chlorophyll)
(light-energy-concentration isa process with
  outputs psii.energy
  driver chlorophyll.energy
  function concentration
  implemented-by antenna-array)
(psii-water-breakdown isa process with
  inputs (chloroplast-inside.water)
  driver psii.energy
  outputs (psii.e- psii.e- chloroplast-inside.h chloroplast-inside.o2)
  function molecular-splitting
  implemented-by psii)
(psii-pq-reduction isa process with
  inputs (psii.e- chloroplast-membrane.h chloroplast-membrane.plastoquinone)
  outputs (chloroplast-membrane.plastoquinol)
  function reduction
  implemented-by psii
  inhibited-by dcmu)
14. Challenge 2: Reasoning on Models
16. Explanation by pathway tracing
(track-object 'chloroplast-inside.water)
Tracking CHLOROPLAST-INSIDE.WATER
  -> PHOTOSYNTHESIS
Tracking CHLOROPLAST-OUTSIDE.ATP
Tracking CHLOROPLAST-OUTSIDE.NADPH
Tracking EVERYWHERE.O2
  -> PSII-WATER-BREAKDOWN
Tracking PSII.E-
  -> PSII-PQ-REDUCTION
Tracking CHLOROPLAST-MEMBRANE.PLASTOQUINOL
  -> E-FUNNLING-PSII-TO-PSI
Tracking PSI.E-
  -> PSI-NADPH-FORMATION
Tracking CHLOROPLAST-INSIDE.H
  -> ATP-FORMATION
Tracking CHLOROPLAST-INSIDE.O2
  -> O2-DIFFUSSION
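A trace like the one above can be produced by following an object into every process that consumes it and then tracking that process's outputs in turn. The sketch below is an assumption about how such tracing might work, not the original `track-object` implementation. The dictionary layout and the rule "visit each object and process once" are hypothetical, and only a few of the processes from the deck are included, so the trace is shorter than the one on the slide.

```python
# A subset of the process definitions from the qualitative model.
PROCESSES = {
    "photosynthesis": {
        "inputs": ["chloroplast-inside.water", "everywhere.light",
                   "chloroplast-outside.nadph", "chloroplast-outside.adp",
                   "chloroplast-outside.pi"],
        "outputs": ["chloroplast-outside.atp", "chloroplast-outside.nadph",
                    "everywhere.o2"],
    },
    "psii-water-breakdown": {
        "inputs": ["chloroplast-inside.water"],
        "driver": "psii.energy",
        "outputs": ["psii.e-", "chloroplast-inside.h", "chloroplast-inside.o2"],
    },
    "psii-pq-reduction": {
        "inputs": ["psii.e-", "chloroplast-membrane.h",
                   "chloroplast-membrane.plastoquinone"],
        "outputs": ["chloroplast-membrane.plastoquinol"],
    },
}


def track_object(obj, processes, seen=None, depth=0):
    """Return trace lines for obj, following it through consuming processes."""
    seen = set() if seen is None else seen
    indent = "  " * depth
    lines = [indent + "Tracking " + obj]
    seen.add(obj)
    for name, proc in processes.items():
        uses = obj in proc.get("inputs", ()) or obj == proc.get("driver")
        if not uses or name in seen:
            continue  # not a consumer, or already explained
        seen.add(name)
        lines.append(indent + " -> " + name)
        for product in proc.get("outputs", ()):
            if product not in seen:
                lines.extend(track_object(product, processes, seen, depth + 1))
    return lines


trace = track_object("chloroplast-inside.water", PROCESSES)
```

Calling `print("\n".join(trace))` shows the indented trace; indentation here reflects recursion depth, which is a presentational choice of this sketch.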
18. Find Pathways in Likelihood Order by µArray-Guided Search
Solution for Fructose environment (Target: Malate)
fructose ---Fructokinase--> fructose 1-phosphate
fructose 1-phosphate ---Fructose 1-phosphate aldolase--> glyceraldehyde + dihydroxyacetone phosphate
dihydroxyacetone phosphate ---Isomerase--> glyceraldehyde 3-phosphate
phosphate + NAD+ + glyceraldehyde 3-phosphate ---Triose phosphate dehydrogenase--> 1,3-bisphosphoglycerate
1,3-bisphosphoglycerate + ADP ---Phosphoglycerate kinase--> 3-phosphoglycerate + ATP
3-phosphoglycerate ---Phosphoglyceromutase--> 2-phosphoglycerate
2-phosphoglycerate ---Enolase--> phosphoenolpyruvate + H2O
phosphoenolpyruvate + ATP ---Pyruvate kinase--> pyruvate + ADP
malate + NAD+ ---Malate dehydrogenase--> oxaloacetate + NADH + H+
pyruvate + NAD+ + CoA ---NIL--> NADH + H+ + CO2 + acetyl CoA
acetyl CoA + oxaloacetate ---Citrate synthase--> citrate + CoA
citrate ---Aconitase--> isocitrate
isocitrate + NAD+ ---Isocitrate dehydrogenase--> a-ketoglutarate + NADH + H+ + CO2
a-ketoglutarate + NAD+ + CoA ---a-ketoglutarate dehydrogenase complex--> succinyl CoA + NADH + H+ + CO2
succinyl CoA + GDP + phosphate ---Succinyl CoA synthase--> succinate + GTP + CoA
succinate + FAD ---Succinate dehydrogenase--> fumarate + FADH2
fumarate + H2O ---Fumarase--> malate
Solution for Glucose environment (Target: Malate)
glucose + ATP ---Hexokinase--> glucose 6-phosphate + ADP
glucose 6-phosphate ---Phosphoglucomutase--> fructose 6-phosphate
fructose 6-phosphate + ATP ---Phosphofructokinase--> fructose 1,6 bisphosphate + ADP
fructose 1,6 bisphosphate ---Aldolase--> dihydroxyacetone phosphate + glyceraldehyde 3-phosphate
phosphate + NAD+ + glyceraldehyde 3-phosphate ---Triose phosphate dehydrogenase--> 1,3-bisphosphoglycerate
1,3-bisphosphoglycerate + ADP ---Phosphoglycerate kinase--> 3-phosphoglycerate + ATP
(same as above from this point onward)
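One way to read "likelihood order" is as a best-first search in which each enzyme carries a likelihood of being active (for instance, derived from its expression level on the array) and a pathway's likelihood is the product over its steps. The sketch below makes exactly those assumptions. The function `best_pathway`, the likelihood values, and the "hypothetical shortcut enzyme" are illustrative inventions, and cofactors are ignored so that each reaction links one main substrate to one main product.

```python
import heapq
import math


def best_pathway(reactions, likelihood, source, target):
    """Uniform-cost search; cost of a step is -log(enzyme likelihood).

    reactions: list of (substrate, product, enzyme).
    Returns (list of enzymes, pathway likelihood), or (None, 0.0).
    """
    queue = [(0.0, source, [])]
    done = set()
    while queue:
        cost, metabolite, path = heapq.heappop(queue)
        if metabolite == target:
            return path, math.exp(-cost)
        if metabolite in done:
            continue
        done.add(metabolite)
        for substrate, product, enzyme in reactions:
            if substrate == metabolite and product not in done:
                # Enzymes without a measurement get a low default likelihood.
                step_cost = -math.log(likelihood.get(enzyme, 0.01))
                heapq.heappush(queue, (cost + step_cost, product, path + [enzyme]))
    return None, 0.0


reactions = [
    ("fructose", "fructose 1-phosphate", "Fructokinase"),
    ("fructose 1-phosphate", "dihydroxyacetone phosphate",
     "Fructose 1-phosphate aldolase"),
    ("dihydroxyacetone phosphate", "glyceraldehyde 3-phosphate", "Isomerase"),
    # Not in the deck: an invented poorly expressed alternative route.
    ("fructose", "glyceraldehyde 3-phosphate", "hypothetical shortcut enzyme"),
]
# Invented likelihoods standing in for expression-derived evidence.
likelihood = {
    "Fructokinase": 0.9,
    "Fructose 1-phosphate aldolase": 0.8,
    "Isomerase": 0.9,
    "hypothetical shortcut enzyme": 0.1,
}
path, p = best_pathway(reactions, likelihood, "fructose",
                       "glyceraldehyde 3-phosphate")
```

With these numbers the three-step route wins over the one-step shortcut because its enzymes are all well supported. Returning several pathways in descending likelihood would need a k-best variant of the same search.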
19. Simulate natural or experimental knockouts and propose adaptive bridges.
glucose + ATP ---Hexokinase--> glucose 6-phosphate + ADP
glucose 6-phosphate ---Phosphoglucomutase--> fructose 6-phosphate
fructose 6-phosphate + ATP ---Phosphofructokinase--> fructose 1,6 bisphosphate + ADP
fructose 1,6 bisphosphate ---Aldolase--> dihydroxyacetone phosphate + glyceraldehyde 3-phosphate
phosphate + NAD+ + glyceraldehyde 3-phosphate ---Triose phosphate dehydrogenase--> 1,3-bisphosphoglycerate
1,3-bisphosphoglycerate + ADP ---Phosphoglycerate kinase--> 3-phosphoglycerate + ATP
3-phosphoglycerate ---Phosphoglyceromutase--> 2-phosphoglycerate
2-phosphoglycerate ---Enolase--> phosphoenolpyruvate + H2O
phosphoenolpyruvate + ATP ---Pyruvate kinase--> pyruvate + ADP
malate + NAD+ ---Malate dehydrogenase--> oxaloacetate + NADH + H+
pyruvate + NAD+ + CoA ---NIL--> NADH + H+ + CO2 + acetyl CoA
acetyl CoA + oxaloacetate ---Citrate synthase--> citrate + CoA
citrate ---Aconitase--> isocitrate
isocitrate + NAD+ ---Isocitrate dehydrogenase--> a-ketoglutarate + NADH + H+ + CO2
a-ketoglutarate + NAD+ + CoA ---a-ketoglutarate dehydrogenase complex--> succinyl CoA + NADH + H+ + CO2
succinyl CoA + GDP + phosphate ---Succinyl CoA synthase--> succinate + GTP + CoA
succinate + FAD ---Succinate dehydrogenase--> fumarate + FADH2
fumarate + H2O ---Fumarase--> malate
Knockout:
1,3-bisphosphoglycerate + ADP ---Phosphoglycerate kinase--> 3-phosphoglycerate + ATP
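A knockout can be simulated by removing a reaction and checking whether the target is still producible from the environment. The following sketch assumes a simple forward-closure rule (a reaction fires once all its substrates are available); the function `producible`, the starting environment, and the way the bridge is added are hypothetical choices for illustration, and only a fragment of the pathway is modeled.

```python
def producible(reactions, environment, knocked_out=()):
    """Return every species reachable from the environment.

    reactions: list of (substrates, products, enzyme).
    knocked_out: enzymes whose reactions are disabled.
    """
    available = set(environment)
    changed = True
    while changed:
        changed = False
        for substrates, products, enzyme in reactions:
            if enzyme in knocked_out:
                continue
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available


# Fragment of the glycolysis steps around the knocked-out reaction.
fragment = [
    (("1,3-bisphosphoglycerate", "ADP"), ("3-phosphoglycerate", "ATP"),
     "Phosphoglycerate kinase"),
    (("3-phosphoglycerate",), ("2-phosphoglycerate",), "Phosphoglyceromutase"),
    (("2-phosphoglycerate",), ("phosphoenolpyruvate", "H2O"), "Enolase"),
]
# Assumed starting pool, chosen only for illustration.
pool = {"glyceraldehyde 3-phosphate", "1,3-bisphosphoglycerate", "ADP"}
knockout = {"Phosphoglycerate kinase"}

wild_type = producible(fragment, pool)
mutant = producible(fragment, pool, knockout)

# A candidate bridging reaction with no known enzyme, labeled "bridge" here.
bridge = (("glyceraldehyde 3-phosphate",), ("3-phosphoglycerate",), "bridge")
rescued = producible(fragment + [bridge], pool, knockout)
```

In this toy setting the knockout blocks phosphoenolpyruvate and the candidate bridge restores it, which is the test a proposed adaptive bridge would need to pass.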
20. Given Inactivated Reactions, Propose Bridging Reactions
Abstract Chemical Knowledge
Constrained Search
glucose + ATP ---Hexokinase--> glucose 6-phosphate + ADP
Abstract Balance:
- glucose: 6 Carbons, 0 Phosphates
- ATP: 3 Phosphates
- glucose 6-phosphate: 6 Carbons, 1 Phosphate
- ADP: 2 Phosphates
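The abstract balance idea can be checked mechanically: a candidate reaction is plausible only if the abstract features (here carbons and phosphates) add up on both sides. This is a minimal sketch; the table `ABSTRACT` holds just the four species and counts shown on the slide, and the function names are hypothetical.

```python
from collections import Counter

# Abstract features per species: C = carbons, P = phosphates.
# Only the counts given on the slide are recorded.
ABSTRACT = {
    "glucose": {"C": 6, "P": 0},
    "ATP": {"P": 3},
    "glucose 6-phosphate": {"C": 6, "P": 1},
    "ADP": {"P": 2},
}


def totals(species):
    """Sum the abstract features over one side of a reaction."""
    total = Counter()
    for name in species:
        total.update(ABSTRACT[name])
    return +total  # unary plus drops zero counts so comparisons are clean


def is_balanced(substrates, products):
    """True if both sides carry the same abstract feature totals."""
    return totals(substrates) == totals(products)


hexokinase_ok = is_balanced(["glucose", "ATP"], ["glucose 6-phosphate", "ADP"])
unbalanced = is_balanced(["glucose"], ["glucose 6-phosphate"])
```

A constrained search for bridging reactions could use a check like `is_balanced` as a filter, discarding candidate reactions that create or lose carbons or phosphates.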
21. Given Inactivated Reactions, Propose Bridging Reactions
Knockout:
1,3-bisphosphoglycerate + ADP ---Phosphoglycerate kinase--> 3-phosphoglycerate + ATP
25 plausible (single) bridging reactions are proposed:
<CYTOSOLIC: glyceraldehyde 3-phosphate -----> 3-phosphoglycerate>
<CYTOSOLIC: dihydroxyacetone phosphate -----> 3-phosphoglycerate>
<CYTOSOLIC: fructose 1,6 bisphosphate -----> phosphoenolpyruvate + 3-phosphoglycerate>
<CYTOSOLIC: fructose 1,6 bisphosphate -----> 2-phosphoglycerate + 3-phosphoglycerate>
<CYTOSOLIC: fructose 1,6 bisphosphate -----> 3-phosphoglycerate + 3-phosphoglycerate>
<CYTOSOLIC: ATP + fructose 1,6 bisphosphate -----> ADP + 1,3-bisphosphoglycerate + 3-phosphoglycerate>
<CYTOSOLIC: fructose 1,6 bisphosphate -----> glyceraldehyde 3-phosphate + 3-phosphoglycerate>
<CYTOSOLIC: fructose 1,6 bisphosphate -----> dihydroxyacetone phosphate + 3-phosphoglycerate>
<CYTOSOLIC: ADP + fructose 1,6 bisphosphate -----> ATP + CO2 + acetyl + 3-phosphoglycerate>
<CYTOSOLIC: ADP + 1,3-bisphosphoglycerate -----> ATP + 3-phosphoglycerate>
<CYTOSOLIC: ADP + fructose 1,6 bisphosphate -----> ATP + pyruvate + 3-phosphoglycerate>
<CYTOSOLIC: ADP + fructose 1,6 bisphosphate -----> ATP + glycerate + 3-phosphoglycerate>
<CYTOSOLIC: ADP + fructose 1,6 bisphosphate -----> ATP + glyceraldehyde + 3-phosphoglycerate>
<CYTOSOLIC: ADP + fructose 1,6 bisphosphate -----> ATP + dihydroxyacetone + 3-phosphoglycerate>
<CYTOSOLIC: ATP + glucose 6-phosphate -----> ADP + phosphoenolpyruvate + 3-phosphoglycerate>
<CYTOSOLIC: ATP + glucose 6-phosphate -----> ADP + 2-phosphoglycerate + 3-phosphoglycerate>
<CYTOSOLIC: ATP + glucose 6-phosphate -----> ADP + 3-phosphoglycerate + 3-phosphoglycerate>
<CYTOSOLIC: ATP + glucose 6-phosphate -----> ADP + glyceraldehyde 3-phosphate + 3-phosphoglycerate>
<CYTOSOLIC: ATP + glucose 6-phosphate -----> ADP + dihydroxyacetone phosphate + 3-phosphoglycerate>
<CYTOSOLIC: glucose 6-phosphate -----> CO2 + acetyl + 3-phosphoglycerate>
<CYTOSOLIC: glucose 6-phosphate -----> pyruvate + 3-phosphoglycerate>
<CYTOSOLIC: glucose 6-phosphate -----> glycerate + 3-phosphoglycerate>
<CYTOSOLIC: glucose 6-phosphate -----> glyceraldehyde + 3-phosphoglycerate>
<CYTOSOLIC: glucose 6-phosphate -----> dihydroxyacetone + 3-phosphoglycerate>
<CYTOSOLIC: glucose + ATP -----> 1,3-bisphosphoglycerate + 3-phosphoglycerate>
22. Challenge 3: Checking and Revising Models given Limited Data
Prediction: What sort of metabolic systems might we expect to find in a methane-rich environment? How might organisms acclimate or adapt to increased temperature or UV radiation?
Experiment Planning: What observations would indicate sensory-system failure in abnormal cell death?
Explanation: Why do we observe bleaching of plant cells under high light conditions?
23. Interactive Guidance from Scientists
Background knowledge
Experimental data
Discovery
Updated models
24. Knowledge-lean (de novo) Discovery
Knowledge
Data
A Useful Model
Intense Data Use
Simplified Model Space
Search
Efficient Search Control
25. The Data: Analyzing Acclimation Dynamics
[Figure labels: Stress (e.g., High Light); Acclimation/Adaptation; Cell Density; Time; Initial Equilibrium; Sampling mRNA/cDNA; Statistical Annotation. Source: www.affymetrix.com/]
26. How Evaluation Works
Evaluation = agreement with predicted relations among partial correlations + a measure of model complexity
1. Pathway structure predicts covariation among expression levels, and correlation over time among derivatives.
2. Graph structure provides a complexity score, and a distance score from the given base model.
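As an illustration of "predicted relations among partial correlations": a chain A -> B -> C predicts that A and C are correlated, but that the correlation largely vanishes once B is held fixed. The sketch below checks that prediction on synthetic data. It assumes a first-order partial correlation computed from Pearson correlations; the data, the seed, and the function names are made up for the example and say nothing about the actual scoring used in the original system.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic expression levels generated from the chain A -> B -> C.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(2000)]
b = [ai + rng.gauss(0, 1) for ai in a]
c = [bi + rng.gauss(0, 1) for bi in b]

marginal = pearson(a, c)         # clearly non-zero under the chain model
conditional = partial_corr(a, c, b)  # close to zero under the chain model
```

A model whose structure predicts a near-zero partial correlation where the data show a large one would score poorly on the agreement term; the complexity term is not modeled in this sketch.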
29. How many models are there for the C. reinhardtii chip?
$4^{\frac{1}{2}(8000^2 - 8000)} = 4^{31996000}$
Not to mention needing 28000 observations to distinguish any two of them completely!
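The size of that model space can be checked with a few lines of arithmetic. This sketch assumes, as one reading of the slide's formula, that there are 8000 genes on the chip and 4 possible relations for each unordered pair of genes; both numbers are interpretations of the slide rather than confirmed facts.

```python
import math

genes = 8000                  # assumed number of genes on the chip
relations_per_pair = 4        # assumed number of possible relations per gene pair

# Number of unordered gene pairs: (n^2 - n) / 2.
pairs = genes * (genes - 1) // 2

# The model count is relations_per_pair ** pairs. Rather than building that
# huge integer, count its decimal digits with logarithms.
digits = math.floor(pairs * math.log10(relations_per_pair)) + 1
```

Under these assumptions the number of candidate models has on the order of nineteen million decimal digits, which is why exhaustive, knowledge-lean search over the full model space is not an option.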
30. The Data: What you really get
32. Biologists
Go out and bring us more data!
Jump naked into a vat of hot phenol!
34. Shrager's second law of computational biology: If you think that you need more data... You need more knowledge!
36. Knowledge-Rich Computational Discovery
Data
Knowledge
A Useful Model
Constrained Model Space
Search
37. Adding knowledge: Limiting search to subsystems.
38. Adding knowledge: Annotate the theory in terms of Views.
What are Views?
Conceptually coherent, possibly complex, units of partially abstract knowledge that can be incrementally mixed into an existing model (by View Application), updating the model in accord with the principles represented in the view. (A.k.a. Schemas, Scripts)
Some Views in Cell Biology:
- Transcriptional Regulation
- Operon Attenuation
- Chemical Cycle
- Transposon Insertion
- Feedback Regulation
- Allosteric Modulation
- Protein Assembly
- Signal Transduction
39. Graphical Representation of Regulatory Models
[Figure: regulatory graph. Node labels: Light, RR, NBLR, NBLA, PBS, DFR, psbA1, psbA2, cpcB, Health, Photosynthetic activity. Several "-" labels appear in the graph.]
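40. Adding knowledge: Annotate the theory in terms of Views.
[Figure: regulatory graph. Node labels: Light, RR, NBLR, NBLA, PBS, DFR, psbA1, psbA2, cpcB, Health, Photosynthetic activity, with the added labels energy, damage, Signal Cascade, and Signal Detection. Several "-" labels appear in the graph.]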
42. A Revised Model of Photosynthesis Regulation
Changes to the model improve its match to the
expression data.
Similar changes adapt the model to expression
data from mutants.
43. Interactive Discovery: The Biologist's Roles
- Provide representations and biological concepts, possibly in abstract terms.
- Focus search by providing initial models using the above representations and concepts.
- Guide search interactively.
- Focus attention on problematic aspects of the model.
- Run discriminating experiments.
- Make hard (subjective) choices.
44. Interactive Discovery: The Computer's Role
The discovery system must be able to:
- Deal with incomplete, ambiguous, and abstract knowledge
- Search the region near a given model using biologically plausible operators and within given constraints
- Search fast enough to interact effectively
- Produce explanations
- Formulate discriminating experiments
45. Challenge 4: Dealing with pervasive uncertainty
47. Homology of hexokinase across species
www.bio.davidson.edu/Biology
48. Where sequences come from
49. What chromatograms really look like
53. Homology of hexokinase across species
Maybe!
www.bio.davidson.edu/Biology
D @ p.3
Alignment @ p.025
55. [Figure residue: the label P.031 repeated nine times, with the annotation "Maybe!"]