Title: Developing a protein-interactions ontology
1 Developing a protein-interactions ontology
- Esther Ratsch
- European Media Laboratory
2 PIOG
- Protein Interactions Ontology Group
- Computer scientists
- Philipp Cimiano Lavin (IMS Stuttgart, EML Heidelberg)
- Isabel Rojas (EML Heidelberg)
- Computational linguists
- Uwe Reyle (IMS Stuttgart)
- Jasmin Saric (EML Heidelberg)
- Biologists
- Esther Ratsch (EML Heidelberg)
- Jörg Schultz (MPI for Molecular Genetics Berlin)
- Ulrike Wittig (EML Heidelberg)
3 Motivation
- Why protein interactions?
- protein function analysis
- larger datasets
- Why an ontology?
- clear domain model
- storage and understanding of data
- information retrieval from text
- retrieve hidden information, inferencing
4 What is a signal transduction pathway?
- signal from outside is transduced to the nucleus
- often phosphorylation cascade
[Figure: pathway schematic with labels "signal", "change", "transcription"]
5 Why are they important?
- control of cellular processes
- communication between cells
- response to environmental changes
- regulatory network
- stable system, single mutations may be overridden by another pathway
- complex network enables complex behaviour
6 Jak-Stat pathway
[Figure: Jak-Stat pathway schematic with labels "P", "STAT monomers", "tyrosine residues", "nucleus", "target genes"]
7 General approach
- Identify scope of the ontology
- Identify concepts involved and their properties
- How to represent them?
- Define rules and constraints
- Formalisation
Scope Concepts Representation
Rules/Constraints Formalisation
8 The scope
- Ontology that represents interactions between proteins and other cellular compounds
- Restriction on molecular detail: amino acids
- Concentration on signal transduction pathways in the initial phase
- No quantitative properties are modeled
9 Identify concepts: Interacting compounds
- Different kinds of compounds: proteins, genes/DNA, ions, ...
- Composition of compounds, e.g. amino acids, domains
10 Properties of compounds
- Characteristics: molecular weight, sequence, isoelectric point, ...
- Interaction potential: modifications, location, binding partners
11 Identify concepts: Interactions I
- Different types of interactions: phosphorylation, binding, translocation, ...
- Other classification: grouping of > 100 verbs (Swissprot) → 11 non-disjoint classes
- Control/Regulation
- Biochemical Interactions
- Logical Interactions
- Bind/Dissociate
- Formation
- Integrity
- Availability
- Change of Location
- Modification of Structure
- Special Processes/Reactions
- Order
12 Representation of proteins
- General characteristics: sequence, molecular weight, ...
- Protein state
- location
- list of modifications
- list of binding partners
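The representation above could be sketched as a small data structure. This is only an illustration, assuming Python dataclasses; the names `Protein`, `ProteinState` and `Modification`, as well as the example values, are hypothetical and not taken from the ontology itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Modification:
    """A modification at one residue position (illustrative structure)."""
    kind: str
    position: int


@dataclass(frozen=True)
class ProteinState:
    """The changeable part of a protein: location, modifications, partners."""
    location: str
    modifications: FrozenSet[Modification] = frozenset()
    binding_partners: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class Protein:
    """General characteristics plus the current protein state."""
    name: str
    state: ProteinState
    sequence: Optional[str] = None            # general characteristic
    molecular_weight: Optional[float] = None  # general characteristic


# Hypothetical example values, chosen only for illustration.
stat = Protein(name="Stat", state=ProteinState(location="cytoplasm"))
```

Keeping the state separate from the general characteristics means an interaction only has to produce a new `ProteinState`, while sequence and molecular weight stay untouched.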
13 Representation of interactions
- Event with pre- and postconditions
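One possible reading of "event with pre- and postconditions" is sketched below. It is a minimal sketch, not the group's formalisation: the `Event` class, the flat dictionary used as state, and the translocation example are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, str]  # hypothetical flat state, e.g. {"location": "cytoplasm"}


@dataclass
class Event:
    """An interaction modelled as an event with pre- and postconditions."""
    name: str
    preconditions: List[Callable[[State], bool]]
    effect: Callable[[State], State]  # produces the state after the event
    postconditions: List[Callable[[State], bool]]

    def apply(self, state: State) -> State:
        # All preconditions must hold on the state before the event.
        if not all(check(state) for check in self.preconditions):
            raise ValueError(f"precondition of {self.name} not met")
        new_state = self.effect(state)
        # All postconditions must hold on the state after the event.
        if not all(check(new_state) for check in self.postconditions):
            raise ValueError(f"postcondition of {self.name} not met")
        return new_state


# Illustrative event: translocation from the cytoplasm to the nucleus.
translocation = Event(
    name="translocation",
    preconditions=[lambda s: s["location"] == "cytoplasm"],
    effect=lambda s: {**s, "location": "nucleus"},
    postconditions=[lambda s: s["location"] == "nucleus"],
)

after = translocation.apply({"location": "cytoplasm"})
```

Applying the same event to a state that is already in the nucleus raises a `ValueError`, which is the kind of check a consistency test could build on.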
14 Rules and constraints
- Simple hierarchies: nucleolus inside nucleus; Stat1 is a Stat is a protein
- Rules for the definition of interactions
- Consistency checking
- Knowledge retrieval
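The simple hierarchies named above can be sketched as parent links that are followed transitively. The dictionaries `IS_A` and `INSIDE` and the helper functions are hypothetical names; only the three facts in them come from the slide.

```python
from typing import Dict, List

# Each concept points to its direct parent; the entries are the slide's examples.
IS_A: Dict[str, str] = {"Stat1": "Stat", "Stat": "protein"}
INSIDE: Dict[str, str] = {"nucleolus": "nucleus"}


def ancestors(concept: str, relation: Dict[str, str]) -> List[str]:
    """Follow a single-parent relation transitively and collect all ancestors."""
    found: List[str] = []
    while concept in relation:
        concept = relation[concept]
        if concept in found:  # guard against accidental cycles
            break
        found.append(concept)
    return found


def holds(child: str, parent: str, relation: Dict[str, str]) -> bool:
    """True if `child` is related to `parent` directly or transitively."""
    return parent in ancestors(child, relation)
```

A retrieval question such as "is Stat1 a protein?" then becomes `holds("Stat1", "protein", IS_A)`, even though that fact is never stored directly.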
15 Rules and constraints: example
- Protein A is phosphorylated by B at position X.
- A and B are located in the same compartment
- A was not modified at X before
- A is phosphorylated at X afterwards
- B is a protein kinase, which is a protein
- depending on X, B is either an S/T-kinase or a Y-kinase
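The rules for this example could be checked roughly as follows. This is a self-contained sketch under stated assumptions: the flattened `Protein` class, the one-letter residue codes, and the proteins "A" and "B" with a tyrosine at position 701 are illustrative, and residues other than serine, threonine and tyrosine are simply rejected.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Protein:
    name: str
    location: str
    residues: Dict[int, str] = field(default_factory=dict)  # position -> amino acid
    phosphorylated: Set[int] = field(default_factory=set)   # modified positions
    kinase_type: Optional[str] = None  # "S/T", "Y", or None if not a kinase


def required_kinase(residue: str) -> str:
    """Serine/threonine need an S/T-kinase, tyrosine needs a Y-kinase."""
    if residue in ("S", "T"):
        return "S/T"
    if residue == "Y":
        return "Y"
    raise ValueError(f"residue {residue!r} is not handled in this sketch")


def phosphorylate(a: Protein, b: Protein, x: int) -> None:
    """Apply 'A is phosphorylated by B at position X', checking the rules first."""
    if b.kinase_type is None:
        raise ValueError("B is not a protein kinase")
    if a.location != b.location:
        raise ValueError("A and B are not in the same compartment")
    if x in a.phosphorylated:
        raise ValueError("A is already modified at X")
    if x not in a.residues:
        raise ValueError("no residue known at position X")
    if b.kinase_type != required_kinase(a.residues[x]):
        raise ValueError("kinase type does not match the residue at X")
    a.phosphorylated.add(x)  # postcondition: A is phosphorylated at X


# Hypothetical proteins: A carries a tyrosine at position 701, B is a Y-kinase.
a = Protein("A", "cytoplasm", residues={701: "Y"})
b = Protein("B", "cytoplasm", kinase_type="Y")
phosphorylate(a, b, 701)
```

Calling `phosphorylate(a, b, 701)` a second time fails, because A is then already modified at X.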
16 Formalisation: phosphorylation
- Phosphorylation of a protein by a kinase at a distinct residue
- S/T-kinase phosphorylation
17 Challenges met
- Multidisciplinarity of the group
- Different vocabularies → clear expression, fewer ambiguities
- Different goals, different needs → not restricted to one goal
- Different experiences → mutual benefit
- Domain
18 Complexity of the domain
- Granularity of information
- detail of compound part
- protein: Stat
- domain: SH2-domain
- amino acid: tyrosine 701
- detail of protein identity
- protein family: Jak, Stat
- protein type: Jak2, Stat5
- organism-specific protein: Jak2_human, Jak2_rat
19 Complexity of the domain II
- description detail
- "not known": no data available
- "doesn't have": no binding partners
- "don't care": not important for a certain interaction
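The three cases above need distinct encodings, otherwise "no data" and "no binding partners" collapse into one value. A minimal sketch, assuming `None` for "not known", an empty set for "doesn't have" and a marker object for "don't care"; the names `DONT_CARE` and `matches` are hypothetical.

```python
from typing import FrozenSet, Optional, Union


class DontCare:
    """Marker for 'not important for a certain interaction'."""

    def __repr__(self) -> str:
        return "DONT_CARE"


DONT_CARE = DontCare()

Partners = Optional[FrozenSet[str]]        # None = not known, frozenset() = has none
Pattern = Union[DontCare, FrozenSet[str]]  # what an interaction requires


def matches(required: Pattern, actual: Partners) -> Optional[bool]:
    """Return True/False when decidable, None when the data is not known."""
    if isinstance(required, DontCare):
        return True   # attribute is irrelevant for this interaction
    if actual is None:
        return None   # no data available, so the check stays undecided
    return required == actual
```

Returning `None` for the undecided case keeps missing data from being silently treated as a failed or a passed check.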
20 What comes next?
- Go on with development of ontology
- Projects using the ontology
- integration in larger ontology on metabolic pathways
- application to TIGERSearch (see poster)
21 Acknowledgements
- Protein Interactions Ontology Group
- Computer scientists
- Philipp Cimiano Lavin (IMS Stuttgart, EML Heidelberg)
- Isabel Rojas (EML Heidelberg)
- Computational linguists
- Uwe Reyle (IMS Stuttgart)
- Jasmin Saric (EML Heidelberg)
- Biologists
- Esther Ratsch (EML Heidelberg)
- Jörg Schultz (MPI for Molecular Genetics Berlin)
- Ulrike Wittig (EML Heidelberg)