Title: Semantic Aggregation, Integration, and Inference of Pathway Data
1Semantic Aggregation, Integration, and Inference of Pathway Data
(Pedantic Aggravation, Irritation, and Interference)
- Co-Destructors
- Joanne Luciano, PhD
- jluciano_at_biopathways.org
- Jeremy Zucker
- zucker_at_research.dfci.harvard.edu
ISMB 2005 Tutorial, Detroit, Michigan, June 25th, 2005
http://www.biopathways.org/ismb2005tutorial-am6/
2Overview
- Introduction (45 minutes)
- Time Out (15 minutes)
- Workshop Case Studies Exercises (2 hrs 15 minutes)
- Subdivide into groups of triads and dyads with 5 minute breaks in between case studies
- Case Study I (40 minutes)
- Case Study II (40 minutes)
- Case Study III (40 minutes)
- Time Out (15 minutes)
- Lessons Learned (30 minutes)
- Lessons Not Yet Learned (take home)
3Introduction (45 minutes)
- Semantic Aggregation, Integration and Inference of Pathway Data
- Pathway Data (domain)
- Why do we care? (motivation)
- What is it?
- What does it look like?
- Definitions & Disclaimers
- Methodology
4Pathway Data: Why do we care?
- Pathway Research has Broad Impact
- Drug Discovery (pathway of target, safety)
- Basic Science (identify pathways)
- Disease Research (cancer pathways)
- Environmental Research (microbial research)
- Combine knowledge from multiple sources
- Whole is greater than the sum of its parts
- Biological knowledge is fragmented
- Need database to manage resources
5Different types of pathways (different strokes for different folks, it's OK.)
Glycolysis
Protein-Protein
Apoptosis
Lac Operon
Molecular Interaction Networks
Gene Regulation
Signaling Pathways
Metabolic Pathways
The Main Categories
6Different representations of the same pathways
<!ELEMENT reaction (substrate,product)>
<!ATTLIST reaction name %keggid.type; #REQUIRED>
<!ATTLIST reaction type %reaction-type.type; #REQUIRED>
<!ELEMENT substrate EMPTY>
<!ATTLIST substrate name %keggid.type; #REQUIRED>
<!ELEMENT product EMPTY>
<!ATTLIST product name %keggid.type; #REQUIRED>
Starts at α-D-Glucose 1P
KEGG Reference Pathway: GLYCOLYSIS
7Different representations of the same pathways
reactions.dat: This file lists all chemical reactions in the PGDB.
Attributes: UNIQUE-ID, TYPES, COMMON-NAME, ACTIVATORS, BASAL-TRANSCRIPTION-VALUE, DBLINKS, DELTAG0, DEPRESSORS, EC-LIST, EC-NUMBER, ENZYMATIC-REACTION, EQUILIBRIUM-CONSTANT, IN-PATHWAY, INHIBITORS, LEFT, MOVED-IN, MOVED-OUT, OFFICIAL-EC?, REACTANTS, REQUIREMENTS, RIGHT, SIGNAL, SPECIES, SPONTANEOUS?, STIMULATORS, SYNONYMS
Starts at β-D-glucose 6-phosphate
BioCyc Reference Pathway: GLYCOLYSIS
8Different representations of the same pathways
<reaction name="R_alpha_D_glucose_6_phosphate_D_fructose_6_phosphate" id="R_163457">
  <listOfReactants>
    <speciesReference species="R_30537_alpha_D_Glucose_6_phosphate" />
  </listOfReactants>
  <listOfProducts>
    <speciesReference species="R_29512_D_Fructose_6_phosphate" />
  </listOfProducts>
  <listOfModifiers>
    <modifierSpeciesReference species="R_163455_glucose_6_phosphate_isomerase_dimer_name_copied_from_complex_in_Homo_sapiens_" />
  </listOfModifiers>
</reaction>
DatabaseObject 41245
Event 8285
Reaction 6598
ConcreteReaction 4034
GenericReaction 2564
Reactome Pathway: GLYCOLYSIS
9Different representations of the same pathways
Does not compute. Pretty, but useless.
Reactions clickable, but...
Starts at Glucose (but it doesn't matter)
BioCarta Reference Pathway GLYCOLYSIS
10Pathway Data (domain)
How bad is it? Pathway Databases
So many pathway databases, so little time.
Graphic from Mike Cary and Gary Bader
11Definitions & Disclaimers
- Aggregation
- 2 (or more) data sources, different schemas. How are they related? By creating explicit cross-references between them.
- Integration
- 2 (or more) data sources, same schema. How does it all fit together? Requires creating a standard schema, semantic mapping, and instance merging (entity resolution).
- Inference
- 1 (or more) data sources, one schema. Creating new instances or new relationships from existing data + rules.
- (Evidence code type kind of inference)
- Disclaimer
- Controlled Vocabulary scope this tutorial
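The three definitions can be contrasted on toy data. This is only a sketch: the record layouts, the function names, and the rule are made up for illustration and are not taken from the tutorial materials.

```python
# Toy records; every identifier and field name here is illustrative.
go_terms = {"GO:0004347": "glucose-6-phosphate isomerase activity"}
enzymes = {"5.3.1.9": "glucose-6-phosphate isomerase"}

# Aggregation: different schemas, linked by an explicit cross-reference table.
go_to_ec = [("GO:0004347", "5.3.1.9")]

def aggregate(go_id):
    """Follow cross-references from a GO term to enzyme records."""
    return [enzymes[ec] for g, ec in go_to_ec if g == go_id and ec in enzymes]

# Integration: same schema, so records about the same entity are merged.
source_a = [{"ec": "5.3.1.9", "synonyms": {"phosphoglucose isomerase"}}]
source_b = [{"ec": "5.3.1.9", "synonyms": {"PGI"}}]

def integrate(*sources):
    """Merge records that share a key (a crude form of entity resolution)."""
    merged = {}
    for record in (r for src in sources for r in src):
        entry = merged.setdefault(record["ec"], {"ec": record["ec"], "synonyms": set()})
        entry["synonyms"] |= record["synonyms"]
    return merged

# Inference: derive a new relationship from existing data plus a rule.
catalyzes = {("PGI", "R1")}          # enzyme catalyzes reaction
in_pathway = {("R1", "glycolysis")}  # reaction belongs to pathway

def infer_participates():
    """Rule: catalyzes(e, r) and in_pathway(r, p) => participates_in(e, p)."""
    return {(e, p) for e, r in catalyzes for r2, p in in_pathway if r == r2}
```

Aggregation leaves both sources untouched and only adds links; integration produces one merged record per entity; inference adds facts that neither source stated.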
12Our methodology
- Define the goal of the project
- What questions are you trying to answer?
- Take stock of current information resources
- Experts
- Tools
- Data sources
- Scope the work to get from B to A
- Aggregation, Integration, or Inference?
- Data Cleaning
13Assembling Knowledge: Aggregation, Integration, Inference
When it comes to data cleaning, there's no such thing as a free lunch. Tim Berners-Lee
Some tasks are specific to a use case, some are common to more than one, and there's no escaping others.
14Time Out (15 minutes)
15Workshop Case Studies Exercises (2 hrs 15 minutes)
- Break into groups of triads and dyads
- Aggregation Workshop (45 minutes)
- Aggregation: The Siderean Demo
- Group Exercise 1: Pedantic Aggravation
- Integration Workshop (45 minutes)
- Integration: BioPAX Initiative
- Group Exercise 2: Pedantic Irritation
- Inference Workshop (45 minutes)
- Inference: Flux Balance Analysis
- Group Exercise 3: Pedantic Interference
16Methodology
- Define the goal of the integration
- How will the integrated data be used?
- This defines the level of integration, from syntactic through semantic
- Take stock of current resources
- This defines your starting point
- Database sources, programmers, lab access, collaborators
- Scope the work to get from B to A
- Data Profiling
- Resource Profiling
17Case Study I: The Siderean Demo (Aggregation)
- Question: What drugs can be used as candidates for treating B-cell lymphoma patients?
- By comparing gene expression patterns between patients with and without B-cell lymphoma, a top biomarker was found: BRKCB-1
18Seamark Demonstration: Identification of new drug candidates for BRKCB-1
1. Differentiate different forms of disease
2. Identify patient subgroups
3. Identify top biomarkers
4. Identify function
5. Identify biological and chemical properties and disease associations of biomarker
6. Identify documents
7. Identify role in metabolic pathways
8. Identify compounds that interact
9. Identify and compare function in other organisms
10. Identify any prior art
Gene
GO.rdf
Enzyme
Enzymes.rdf
19Seamark Demonstration Identification of new
drug candidates for BRKCB-1
Gene
GO.rdf
GO2Enzyme.rdf
Enzyme
Enzymes.rdf
20Seamark Demonstration Identification of new
drug candidates for BRKCB-1
Gene
MIM Id
OMIM.rdf
GO.rdf
GO2Enzyme.rdf
Enzyme
Enzymes.rdf
21Seamark Demonstration Identification of new
drug candidates for BRKCB-1
GO2OMIM.rdf
Gene
MIM Id
OMIM.rdf
GO.rdf
GO2Enzyme.rdf
Enzyme
Enzymes.rdf
24Aggregation Methodology
- For each data source
- Motivation: Why is this database included? What will it allow the user to do?
- Source: Where can the original data be found/downloaded?
- Original format: RDF, XML, flat file
- Link values: What predicate classes found in the data will be used to link to other data sources? What sources will it link to?
- Transformation: How was the data changed to expose the links?
- Notes: Instructive examples of usage, caveats, etc.
25Aggregation Methodology
- GO to Probe Set
- Motivation: Links the Probe Set to the Genes
- Source: Affymetrix, GeneCruiser
- Original format: text
- Link values: identified 42 genes to drive the demo
- Transformation: Manually created from output of GeneCruiser
- Notes: Scalable solution still needed.
26GO to Probe Set
27GO to Enzymes
- Motivation: Links between GO and Enzyme
- Source: ec2go.txt
- Original format: flat file
- Link values: GO ID and EC number are referenced
- Transformation: Loaded the flat file into a relational table and issued an SQL query to create RDF
- Notes: Need authoritative LSID
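A minimal sketch of the GO-to-Enzyme transformation, assuming the usual ec2go line layout (`EC:x.x.x.x > GO:term name ; GO:nnnnnnn`, with `!` comment lines). It skips the relational table and SQL step described above and goes straight from flat file to triples. The URI prefixes and the `hasEnzyme` predicate are placeholders, not authoritative identifiers.

```python
def parse_ec2go(lines):
    """Yield (ec_number, go_id) pairs from ec2go-style mapping lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):   # skip blanks and header comments
            continue
        left, _, right = line.partition(" > ")
        go_id = right.rsplit(";", 1)[-1].strip()
        if left.startswith("EC:") and go_id.startswith("GO:"):
            yield left[3:], go_id

# Placeholder URI prefixes; the slide notes that authoritative LSIDs are still needed.
EC_BASE = "http://example.org/ec/"
GO_BASE = "http://example.org/go/"
LINK = "http://example.org/vocab#hasEnzyme"

def to_ntriples(pairs):
    """Render each (EC, GO) pair as one N-Triples statement linking GO to EC."""
    return [f"<{GO_BASE}{go_id.replace(':', '_')}> <{LINK}> <{EC_BASE}{ec}> ."
            for ec, go_id in pairs]

sample = [
    "! example header comment",
    "EC:5.3.1.9 > GO:glucose-6-phosphate isomerase activity ; GO:0004347",
]
triples = to_ntriples(parse_ec2go(sample))
print("\n".join(triples))
```

The link values (GO ID and EC number) become the subject and object of each triple, which is what exposes the cross-reference to an RDF aggregator.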
28Gene Ontology to Enzymes
29Group Exercise IYou-do-it Aggregation
- Aggregation or Aggravation?
30Time Out (15 minutes)
35Case study II: Integration (The BioPAX initiative)
- A. How to create a standard pathway representation?
- B. Resources
- Pathway Databases/data providers
- Granting agencies/program managers
- Software tools/tool developers
- Ontologies/ontology experts
- Leadership/dedicated group of users
36Case study II: Integration (The BioPAX initiative)
- A. How to create a standard pathway representation?
- B. Resources
- Pathway Databases/data providers
- Granting agencies/program managers
- Software tools/tool developers
- Ontologies/ontology experts
- Leadership/dedicated group of lusers
37Methodology part C: Take over the world or have the world take over itself?
- Develop bridging technologies
- Develop pathway representation standard within the Life Science community (BioPAX) (Social Engineering!)
- Utilize Semantic Web Integration Technologies (LSID, RDF/OWL)
38Exchange Formats in Pathway Data Space(Scope)
Graphic from Mike Cary and Gary Bader
39BioPAX Objectives
- Accommodate existing database representations
- Integration and exchange of pathway data
- Interchange through a common (standard) representation
- Provide a basis for future databases
- Enable development of tools for searching and
reasoning over the data
40BioPAX Motivation
>180 DBs and tools
Application
Database
User
Before BioPAX
With BioPAX
Common format will make data more accessible,
promoting data sharing and distributed curation
efforts
41BioPAX: Biological PAthway eXchange
- An abstract data model for biological pathway integration
- Initiative arose from the community
42Biological pathways of the Cell What is a
Pathway?
Glycolysis
Apoptosis
Lac Operon
Protein-Protein
Molecular Interaction Networks
Gene Regulation
Metabolic Pathways
Signaling Pathways
BioPAX Level 1
BioPAX Level 2
43Data integration with BioPAX
- Multiple kinds of pathway databases
- metabolic
- molecular interactions
- signal transduction
- Constructs designed for integration
- DB References
- XRefs (Publication, Unification, Relationship)
- synonyms
- provenance
- OWL DL to enable reasoning
44BioPAX Biochemical Reaction
OWL (schema)
Instances (Individuals) (data)
phosphoglucose isomerase
5.3.1.9
45BioPAX uses other ontologies
- Use pointers to existing ontologies to provide supplemental annotation where appropriate
- Cellular location → GO Component
- Cell type → Cell.obo
- Organism → NCBI taxon DB
- Incorporate other standards where appropriate
- Chemical structure → SMILES, CML, InChI
46BioPAX Ontology Overview
a set of interactions
parts
how the parts are known to interact
Level 1 v1.0 (July 7th, 2004)
47BioPAX Ontology Top Level
- Pathway
- A set of interactions
- E.g. Glycolysis, MAPK, Apoptosis
- Interaction
- A set of entities and some relationship between them
- E.g. Reaction, Molecular Association, Catalysis
- Physical Entity
- A building block of simple interactions
- E.g. Small molecule, Protein, DNA, RNA
Graphic from Gary Bader
48BioPAX Ontology Root
- Root class Entity
- Any concept referred to as a discrete biological
unit when describing pathways. This is the root
class for all biological concepts in the
ontology, which include pathways, interactions
and physical entities
49Metabolic Pathways
- Interaction sub-classes
- Definition
- An entity that defines a single biochemical interaction between two or more entities.
- An interaction cannot be defined without the entities it relates.
50Metabolic Pathways
- Interaction sub-classes
- Definition: Two terms exist under interaction: control and conversion. In future BioPAX levels, this list may be extended to include other classes, such as genetic interactions.
- Examples: Enzyme catalysis controls a biochemical reaction, transport catalysis controls transport, and a small molecule that inhibits a pathway by an unknown mechanism controls the pathway.
51Group Exercise II: Semantic Integration
- Mapping of BioCarta Metabolic pathway to BioPAX.
52Case study III: Inference of a metabolic flux model from an annotated genome
- A. How to infer steady-state flux distributions in single-cell organisms?
- B. Information sources
- Stoichiometric matrix
- Thermodynamic constraints
- Nutrient uptake rates
- Biomass composition
- Gene-protein-reaction associations
53Case study III
- A. How to infer steady-state metabolic flux distributions in single-cell organisms?
- B. Information sources
- Stoichiometric matrix
- Thermodynamic constraints
- Nutrient uptake rates
- Biomass composition
- Gene-protein-reaction associations
54What is Metabolic Flux Analysis?
- Starts with the metabolic network
- Assumes steady-state behavior
- Constrain with Thermodynamics
- Add Nutrient conditions
- Choose an objective: Biomass growth
- Predicts growth rate for mutant and wild-type
organisms under different conditions.
55Start with the metabolic network
56Stoichiometric Matrix: Representation of the metabolic network
R1: → A
R2: A → B
R3: A → C
R4: B + E → 2D
R5: → E
R6: 2B → C + F
R7: C → D
R8: D →
R9: F →
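The reaction list above can be turned into a stoichiometric matrix mechanically. A small sketch, assuming the arrows and plus signs lost in extraction read as "R4: B + E → 2D" and so on; the dictionary layout and function name are illustrative.

```python
# Reactions as (substrates, products) with stoichiometric coefficients.
REACTIONS = {
    "R1": ({}, {"A": 1}),
    "R2": ({"A": 1}, {"B": 1}),
    "R3": ({"A": 1}, {"C": 1}),
    "R4": ({"B": 1, "E": 1}, {"D": 2}),
    "R5": ({}, {"E": 1}),
    "R6": ({"B": 2}, {"C": 1, "F": 1}),
    "R7": ({"C": 1}, {"D": 1}),
    "R8": ({"D": 1}, {}),
    "R9": ({"F": 1}, {}),
}
METABOLITES = ["A", "B", "C", "D", "E", "F"]

def stoichiometric_matrix(reactions, metabolites):
    """Return S as a list of rows: S[i][j] is the net coefficient of metabolite i in reaction j."""
    S = [[0] * len(reactions) for _ in metabolites]
    for j, (substrates, products) in enumerate(reactions.values()):
        for m, n in substrates.items():
            S[metabolites.index(m)][j] -= n   # consumed: negative entry
        for m, n in products.items():
            S[metabolites.index(m)][j] += n   # produced: positive entry
    return S

S = stoichiometric_matrix(REACTIONS, METABOLITES)
for name, row in zip(METABOLITES, S):
    print(name, row)
```

The result has 6 rows (metabolites) and 9 columns (reactions), so the steady-state system has more unknown fluxes than balance equations.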
57What is a metabolic flux?
Source fluxes
Metabolite Pool
Sink fluxes
58What is a metabolic flux?
For a reaction of stoichiometry R2: A → B, the rate of reaction, or flux, is equal to [equation not preserved]
For a reaction of stoichiometry R4: B + E → 2D, the flux is equal to [equation not preserved]
59What is a metabolic flux?
For a reaction of stoichiometry R4: B + E → 2D, the rate of reaction, or flux, is equal to [equation not preserved]
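The slide equations did not survive extraction. One standard way to write the flux of a reaction taken in isolation is the rate of change of any participant divided by its signed stoichiometric coefficient; whether the original slides used exactly this form is an assumption.

```latex
% R2: A -> B
v_2 = -\frac{d[A]}{dt} = \frac{d[B]}{dt}

% R4: B + E -> 2D  (the factor 1/2 comes from the coefficient of D)
v_4 = -\frac{d[B]}{dt} = -\frac{d[E]}{dt} = \frac{1}{2}\,\frac{d[D]}{dt}
```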
60At steady-state, nonlinear dynamics simplify to
linear fluxes.
[Figure: a small network over Aext, A, B and C, labeled with rate constants k1-k3, P1-P3, and fluxes v1-v3]
61At steady-state, the sum of the fluxes that
produce a metabolite is equal to the sum of the
fluxes that consume it.
[Figure: network over Aext, A, B and C with fluxes v1, v2 and v3]
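For metabolite A in this small network, the steady-state statement can be written out as below, assuming (from the figure labels) that v1 produces A from Aext and that v2 and v3 consume A toward B and C. The general matrix form uses S for the stoichiometric matrix and v for the flux vector.

```latex
\frac{d[A]}{dt} = v_1 - v_2 - v_3 = 0
\quad\Longrightarrow\quad
v_1 = v_2 + v_3

% For the whole network, one such equation per metabolite:
S\,v = 0
```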
62Stoichiometric Matrix: more unknowns than equations
63How to determine the metabolic capabilities of a
network?
[Figure: example network over metabolites A-F with uptake fluxes v1 and v5, internal fluxes v2, v3, v4, v6, v7, biomass flux v8, and waste flux v9]
64Using Elementary modes to study the steady-state behavior
[Figure: three copies of the example network (metabolites A-F, fluxes v1-v9)]
65How to draw conclusions about the behavior of the
metabolic network?
[Figure: example network over metabolites A-F with uptake fluxes v1 and v5, internal fluxes v2, v3, v4, v6, v7, biomass flux v8, and waste flux v9]
66Optimal wild-type flux distribution
[Figure: example network with flux values; four fluxes labeled 10 and one labeled 20, marked "Optimal Growth Flux"]
67Optimal mutant flux distribution
[Figure: example network with one reaction blocked (STOP); four fluxes labeled 10]
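The wild-type and mutant optima can be reproduced on the toy network with a small linear program: maximize the biomass flux v8 subject to S v = 0 and flux bounds. This sketch makes several assumptions: uptake limits of 10 on v1 and v5 (chosen to match the figure values), all reactions irreversible, and the knockout placed on v4 (the position of the STOP sign was lost in extraction). It also replaces a real LP solver with brute-force vertex enumeration, which is only reasonable for a bounded problem this small.

```python
from fractions import Fraction
from itertools import combinations

# Rows: metabolites A..F; columns: fluxes v1..v9 of the toy network.
S = [
    [1, -1, -1,  0, 0,  0,  0,  0,  0],   # A
    [0,  1,  0, -1, 0, -2,  0,  0,  0],   # B
    [0,  0,  1,  0, 0,  1, -1,  0,  0],   # C
    [0,  0,  0,  2, 0,  0,  1, -1,  0],   # D
    [0,  0,  0, -1, 1,  0,  0,  0,  0],   # E
    [0,  0,  0,  0, 0,  1,  0,  0, -1],   # F
]
N = 9
BIOMASS = 7  # index of v8

def solve(rows, rhs):
    """Gauss-Jordan elimination with exact fractions; None if the system is singular."""
    n = len(rows)
    M = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def fba(upper, objective=BIOMASS):
    """Maximize one flux subject to S v = 0 and 0 <= v <= upper (None = no upper limit)."""
    bounds = []  # (flux index, value): the flux equals this value when the bound is active
    for j in range(N):
        bounds.append((j, 0))
        if upper[j] is not None:
            bounds.append((j, upper[j]))
    best = None
    # A vertex is fixed by the 6 balance equations plus 3 active bounds.
    for active in combinations(bounds, N - len(S)):
        rows = [list(r) for r in S] + [[int(k == j) for k in range(N)] for j, _ in active]
        rhs = [0] * len(S) + [value for _, value in active]
        v = solve(rows, rhs)
        if v is None:
            continue
        feasible = all(x >= 0 and (upper[j] is None or x <= upper[j])
                       for j, x in enumerate(v))
        if feasible and (best is None or v[objective] > best[objective]):
            best = v
    return best

wild_type = fba([10, None, None, None, 10, None, None, None, None])
mutant = fba([10, None, None, 0, 10, None, None, None, None])  # v4 forced to zero
print("wild-type biomass flux v8:", float(wild_type[BIOMASS]))
print("mutant biomass flux v8:", float(mutant[BIOMASS]))
```

Under these assumptions the wild type routes everything through v2 and v4, while the mutant falls back on v3 and v7 and halves the biomass flux.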
68Suboptimal mutant flux distribution
[Figure: example network with one reaction blocked (STOP); one flux labeled 10, three labeled 6.7, and three labeled 3.3]
69Case III Integrated Metabolic flux model
- good flux balance model
- implicit schema
- literature curated biochemical reactions
- 904 enzymatic reactions
- gene, enzyme-reaction associations
70Model vs. Exper., Glucose limited
(fluxes in mmol/gr DM h normalized to glucose
uptake flux)
(Segrè, Vitkup and Church, PNAS 2002)
71Low Glucose Limited, High Glucose Limited, Nitrogen Limited
[Figure: three panels with axis νi (exper); correlation coefficients shown, in order of appearance: 0.91, 0.97, 0.78]
72Max growth (optimal), Min Adjust. (suboptimal)
[Figure: scatter plot of v_i (theor) against v_i (exper), axes from -50 to 250, points numbered 1 to 17; Corr. coeff. 0.564, P-value 0.007]
73The power of a model lies in its ability to
distinguish between competing hypotheses
74Bugs in model assumptions revealed by comparison
to experiment
75Data Profiling of Flux Model
- Incorrect constraints (reversibility)
- Incorrect Nutrient conditions
- Incorrect Biomass composition
- Incorrect protein function predictions
76Data profiling of Flux Predictions
- Incorrect hypothesis
- (FBA vs MOMA vs ROOM)
- Incorrect network architecture
- (Gene knockouts)
- Incorrect modeling assumptions
- (steady state assumption,
- gene expression profiles)
77Fixing the problems you find
- Requires different amounts of time, money, and expertise
- Enzyme Genomics project
- Community annotation projects
- Adopt-a-Genome project
- High-throughput experiments
- Pathway hole filling algorithms
78Syntactic Bugs revealed by mass balance
- maltodextrin + phosphate → maltotetraose + glucose-1-phosphate
- maltotetraose + glucose-1-phosphate → Glycogens + phosphate
- Glycogens + phosphate → maltodextrin + glucose-1-phosphate
- ---------------------------------------------------------------
- phosphate → glucose-1-phosphate !!!
- Solution: Check that each reaction is balanced
- Molecular weights: WATER 18 daltons
- Chemical formulae: (C 6) (H 12) (O 6)
- Atomic structure:
4-(1-D-ribitylamino)-5-amino-2,6-dihydroxypyrimidine
5-amino-6-ribitylamino-2,4(1H,3H)-pyrimidinedione
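Both checks on this slide are easy to automate. A sketch, assuming the three reactions read as reconstructed above with unit coefficients; the function names and the formula dictionaries are illustrative.

```python
from collections import Counter

def net_reaction(reactions):
    """Sum (substrates, products) pairs and return net coefficients (negative = consumed)."""
    net = Counter()
    for substrates, products in reactions:
        for name in substrates:
            net[name] -= 1
        for name in products:
            net[name] += 1
    return {name: n for name, n in net.items() if n != 0}

cycle = [
    (["maltodextrin", "phosphate"], ["maltotetraose", "glucose-1-phosphate"]),
    (["maltotetraose", "glucose-1-phosphate"], ["Glycogens", "phosphate"]),
    (["Glycogens", "phosphate"], ["maltodextrin", "glucose-1-phosphate"]),
]
print(net_reaction(cycle))  # the cycle turns phosphate into glucose-1-phosphate

def is_balanced(substrate_formulae, product_formulae):
    """Compare element counts on both sides; formulae are dicts like {"C": 6, "H": 12, "O": 6}."""
    left, right = Counter(), Counter()
    for formula in substrate_formulae:
        left.update(formula)
    for formula in product_formulae:
        right.update(formula)
    return left == right
```

Summing a suspected cycle exposes impossible net conversions; checking each reaction's element counts pinpoints which reaction is at fault.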
79Semantic bugs revealed by chemical structure
EcoCyc 7.5 Pathway Riboflavin and FMN and FAD
biosynthesis
No place to go!
4-(1-D-ribitylamino)-5-amino-2,6-dihydroxypyrimidine
80Semantic bugs revealed by chemical structure
EcoCyc 8.0 Pathway Riboflavin and FMN and FAD
biosynthesis
Synonyms
4-(1-D-ribitylamino)-5-amino-2,6-dihydroxypyrimidine
81Bugs in Network structure revealed by Forward and
Backward chaining
Known Nutrient set
Fired Reaction
Unfired Reaction
Essential compounds
Missing essential compound
Biomass
82Bugs in Network structure revealed by Forward and
Backward chaining
Unproduced metabolite
Precursor metabolite
Essential compounds
Missing essential compound
Biomass
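Forward chaining from the nutrient set and a backward step from a missing compound can be sketched as follows. The reaction set is a made-up variant of the toy network (R10 and compounds G and H are hypothetical additions), and the function names are illustrative.

```python
def forward_chain(reactions, nutrients):
    """Fire every reaction whose substrates are all available until nothing new is produced."""
    available = set(nutrients)
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in fired and set(substrates) <= available:
                fired.add(name)
                available |= set(products)
                changed = True
    return available, fired

def blocked_precursors(reactions, compound, available):
    """Backward step: for each reaction producing `compound`, list substrates not yet available."""
    return {name: sorted(set(subs) - available)
            for name, (subs, prods) in reactions.items() if compound in prods}

reactions = {
    "R2": (["A"], ["B"]),
    "R3": (["A"], ["C"]),
    "R4": (["B", "E"], ["D"]),
    "R6": (["B"], ["C", "F"]),
    "R7": (["C"], ["D"]),
    "R10": (["G"], ["H"]),   # hypothetical reaction with an unproduced precursor
}
nutrients = {"A"}
essentials = {"D", "H"}

available, fired = forward_chain(reactions, nutrients)
missing = essentials - available
print("unfired reactions:", sorted(set(reactions) - fired))
print("missing essential compounds:", sorted(missing))
for compound in missing:
    print(compound, "is blocked by", blocked_precursors(reactions, compound, available))
```

Unfired reactions and missing essential compounds point at gaps in the network; the backward step names the precursor metabolites that nothing produces.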
83Case III Semantic Aggregation Case study
- Prochlorococcus marinus MED4
- Most abundant species in the ocean
- Responsible for a significant portion of photosynthetic carbon fixation.
- Iron hypothesis: Possible solution to global warming?
- Need to understand details of metabolic network
84Group Exercise III Inference
- Metabolic Inference
- Inference of a metabolic reaction from a pathway
85Time Out (15 minutes)
86Lessons Learned(30 minutes)
- What did you learn?
- Discussion
- "A good representation is the key to good problem solving." (Patrick Winston)
- "Standard is better than best." (Gerald J Sussman)
- "The great thing about standards is that there are so many from which to choose." (Unknown)
- "Above all, one must develop a feeling for the organism." (Barbara McClintock)
- "Someone does it once, everybody benefits." (Eric Miller, W3C Semantic Web Activity Lead)
- Remember: people, process, technology. However, without people there isn't any process or technology, so it's all social engineering.
87Discussions(30 minutes)
- Does Inference subsume Integration?
- Does Integration subsume Aggregation?
- Overlaying Microarray data on to pathways
- BioPAX used to make SBML more easily integratable
- Schema level / instance level problems
88Bridging Chemistry and Molecular Biology
- Different Views have different semantics: Lenses
- When there is a correspondence between objects, a semantic binding is possible
Uniprot:P49841
Apply Correspondence Rule: if ?target.xref.lsid == ?bpxprot.xref.lsid then ?target.correspondsTo ?bpxprot
Source: Eric Neumann, Haystack BioDASH Demo
http://www.w3.org/2005/04/swls/BioDash/Demo/
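The correspondence rule amounts to a join on shared cross-reference identifiers. A minimal sketch; the `Entity` class, the entity names, and the LSID string format are assumptions for illustration and are not taken from the BioDASH demo.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str
    xref_lsids: set
    corresponds_to: list = field(default_factory=list)

def apply_correspondence_rule(targets, bpx_proteins):
    """If a target and a BioPAX protein share an xref LSID, bind target -> protein."""
    for target in targets:
        for protein in bpx_proteins:
            if target.xref_lsids & protein.xref_lsids:
                target.corresponds_to.append(protein.name)
    return targets

# Hypothetical LSID spelling for the Uniprot accession shown on the slide.
lsid = "urn:lsid:uniprot.org:uniprot:P49841"
targets = [Entity("drug_target_1", {lsid}), Entity("drug_target_2", {"urn:lsid:example.org:x:1"})]
proteins = [Entity("bpx_protein_1", {lsid})]

apply_correspondence_rule(targets, proteins)
print([(t.name, t.corresponds_to) for t in targets])
```

Only the target that shares an identifier with a pathway protein gets a binding; the other view is left unchanged.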
89SBML integration using BioPAX
- Use BioPAX to address SBML's data integration issues
- Different data types, same representation
- Same data, different representations
- External references
- Synonyms
- Provenance
90A problem: same representation, different semantics (SBML)
Protein-Protein Interaction
<reaction id="pyruvate_dehydrogenase_cplx">
  <listOfReactants>
    <speciesRef species="PdhA"/>
    <speciesRef species="PdhB"/>
  </listOfReactants>
  <listOfProducts>
    <speciesRef species="Pyruvate_dehydrogenase_E1"/>
  </listOfProducts>
</reaction>
Biochemical Reaction
<reaction id="pyruvate_dehydrogenase_rxn">
  <listOfReactants>
    <speciesRef species="NADP"/>
    <speciesRef species="CoA"/>
    <speciesRef species="pyruvate"/>
  </listOfReactants>
  <listOfProducts>
    <speciesRef species="NADPH"/>
    <speciesRef species="acetyl-CoA"/>
    <speciesRef species="CO2"/>
  </listOfProducts>
  <listOfModifiers>
    <modifierSpeciesRef species="pyruvate_dehydrogenase_E1"/>
  </listOfModifiers>
</reaction>
91SBML annotated with BioPAX
<sbml xmlns:bp="http://www.biopax.org/release1/biopax-release1.owl"
      xmlns:owl="http://www.w3.org/2002/07/owl#"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <listOfSpecies>
    <species id="PdhA" metaid="PdhA">
      <annotation>
        <bp:protein rdf:ID="PdhA"/>
      </annotation>
    </species>
    <species id="NADP" metaid="NADP">
      <annotation>
        <bp:smallMolecule rdf:ID="NADP"/>
      </annotation>
    </species>
  </listOfSpecies>
  <listOfReactions>
    <reaction id="pyruvate_dehydrogenase_cplx">
      <annotation>
        <bp:complexAssembly rdf:ID="pyruvate_dehydrogenase_cplx"/>
      </annotation>
    </reaction>
  </listOfReactions>
</sbml>
Species is protein; protein is PdhA
Species is small molecule; small molecule is NADP
92BioPAX External References
<species id="pyruvate" metaid="pyruvate">
  <annotation xmlns:bp="http://biopax.org/release1/biopax-release1.owl">
    <bp:smallMolecule rdf:ID="pyruvate">
      <bp:Xref>
        <bp:unificationXref rdf:ID="unificationXref119">
          <bp:DB>LIGAND</bp:DB>
          <bp:ID>c00022</bp:ID>
        </bp:unificationXref>
      </bp:Xref>
    </bp:smallMolecule>
  </annotation>
</species>
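External references like this one are what make instance merging possible: two species from different models that carry the same (DB, ID) pair can be treated as the same molecule. A sketch of pulling the pair out with Python's standard XML parser; the namespace declarations are moved to the root element so the fragment parses on its own, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET

BP = "http://biopax.org/release1/biopax-release1.owl"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

SPECIES_XML = f"""
<species id="pyruvate" metaid="pyruvate" xmlns:bp="{BP}" xmlns:rdf="{RDF}">
  <annotation>
    <bp:smallMolecule rdf:ID="pyruvate">
      <bp:Xref>
        <bp:unificationXref rdf:ID="unificationXref119">
          <bp:DB>LIGAND</bp:DB>
          <bp:ID>c00022</bp:ID>
        </bp:unificationXref>
      </bp:Xref>
    </bp:smallMolecule>
  </annotation>
</species>
"""

def unification_xrefs(species_xml):
    """Return (database, identifier) pairs for every bp:unificationXref in a species element."""
    root = ET.fromstring(species_xml.strip())

    def tag(local):
        # ElementTree spells namespaced tags as {namespace-uri}localname
        return f"{{{BP}}}{local}"

    return [(x.findtext(tag("DB")), x.findtext(tag("ID")))
            for x in root.iter(tag("unificationXref"))]

print(unification_xrefs(SPECIES_XML))
```

Species from two SBML files could then be grouped by these pairs instead of by their local `id` attributes, which differ between models.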
93BioPAX Synonyms
<species id="pyruvate" metaid="pyruvate">
  <annotation xmlns:bp="http://biopax.org/release1/biopax-release1.owl">
    <bp:smallMolecule rdf:ID="pyruvate">
      <bp:SYNONYMS>2-oxo-propionic acid</bp:SYNONYMS>
      <bp:SYNONYMS>2-oxopropanoate</bp:SYNONYMS>
      <bp:SYNONYMS>BTS</bp:SYNONYMS>
      <bp:SYNONYMS>pyruvic acid</bp:SYNONYMS>
    </bp:smallMolecule>
  </annotation>
</species>
94Lessons Not Yet Learned(Take home exercise)
95Feedback
- Our goal is to have you walk away with a clear understanding of how to approach any database integration project
- To provide
- A methodology to scope and plan the project
- An understanding of what to expect
- Some specific examples to illustrate what is common to all integration projects (data cleaning) and what is specific to a particular task (i.e. to give you a sense of it)
- Some first-hand experience at pedantic aggravation, irritation and interference
- How did we do? Please let us know how we can improve this tutorial.
96Thank You! Joanne and Jeremy