Bioinformatics in the 90s

Slides: 33
Provided by: Hybrig

1
Bioinformatics in the 90s
  • Origins: data storage needs related to the
    sequencing effort...
  • ...but storage was hardly enough; additional
    needs:
  • Assembly, comparison and annotation of sequences
  • Prediction of genes
  • Reconstruction of evolutionary trees
  • Modeling and prediction of 3D structures
  • ...
  • IT: online databases and software tools
  • Science: modeling, computational
    representations, algorithms

2
The post-genomic phase transition
  • Availability of complete genome sequences
  • High-throughput experimental techniques yield new
    types of results:
  • SNPs (single nucleotide polymorphisms)
  • mRNA expression levels (DNA chips)
  • Systematic determination of 3D structure
  • Protein expression levels
  • Protein-protein interactions
  • Systematic mutagenesis
  • ...
  • New needs and opportunities:
  • Processing and analysis of each type of data
  • Integration of heterogeneous data
  • Reconstruction and simulation of cellular
    mechanisms

3
Corporate Information
Founded: December 1997
Headquarters and laboratories: Central Paris
Employees: 60 as of end 2001
Intellectual property: 57 patents on technology, interactions and targets
Equity raised: c. €30 million
Ownership: Advent (B), Alafi (US), Apax (F), Auriga (F), IMH (D), Health Cap (S), Lombard-Odier (CH), Medicis (D), Rendex (B)
4
Hybrigenics business strategy
  • Own drug discovery programs
  • in the fields of infectious diseases, cancer and
    metabolic disorders
  • the resulting novel validated targets being
    exploited for the Company's own product pipeline
  • Collaboration and licensing agreements with
    biopharmaceutical companies
  • in any disease field
  • for out-licensing

5
Hybrigenics discovery programs
Cancer: proteins involved in basic cellular
functions; proteins involved in apoptosis;
proteins involved in cell cycle regulation
Metabolic disorders / Obesity: proteins
involved in adipogenesis
Anti-infectious diseases
Antibacterial: essential proteins of the
pathogens
HIV, HCV: protein-protein interactions
between the host cell and the pathogen
6
The Helicobacter pylori Genome
From Tomb et al. (1997), Nature 388:539-47
1,667,867 base pairs; 1,590 predicted ORFs
Less than 20% with assigned biological
functions (500 with no database match; 250 with
structural homology but totally unknown function)
7
The protein-protein interaction map of
Helicobacter pylori
285 baits (261 proteins)
2 million prey fragments
20 million interactions/bait
PBS filtering (identification of false positives)
Over 1,200 interactions; over 1,500 SIDs
Nature (2001) 409:211-215.
Connectivity: 46.6% of proteome; 3.36
interactions/bait; reproducibility >95%
8
Target Identification: Hybrigenics' PIM Technology
Platform
New Generation of Reliable High-Throughput 2-Hybrid
in Yeast and E. coli
PIMBuilder in-house Production Management System
PBS Scoring Technology
VirtualPIM Prediction
PIMRider platform
9
Hybrigenics Target Discovery Process
Target Identification
Target Pre-Validation
Target Validation
Selected Pathology and Mechanism of Action
10
In-silico Target Validation Platform
  • Goals:
  • Validate protein interactions and SIDs
  • Evaluate "target potential" and druggability
  • Provide functional context for target candidates
  • Prioritize "promising" candidates for biological
    validation
  • Means:
  • Integrate PIMs with functional clues of different
    origins
  • Predict novel biological information
  • Computer-aided decision process
  • Provide comprehensive "decision-oriented" view
    of functional clues
  • Automated filtering
  • Output:
  • Prevalidated targets and functional context

11
The Genostar platform: a modular software platform
for exploratory genomics
The Geno Consortium: Pasteur Institute
(Paris), National Institute for Research in
Computer Science (INRIA, Grenoble), Genome Express
(Grenoble), Hybrigenics
  • Genostar technology:
  • Rich object-based knowledge representation system
    (objects, relations, tasks and strategies)
  • Modular architecture
  • Domain-specific biological modeling

12
Genolink: viewing biological data as a graph of
relations
Genolink Composite Graph
Vertices: biological entities
Edges: similarity, interaction or association
links
Sequence Similarity Links
Profile Similarity Links
Domain Inclusion Links
Tissue Expression Links
Subcell Location Links
Protein Interaction Links
Preprocessing
Genomic data
mRNA Expression data
Interaction data
Sub-Cellular Location
Domain data
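
A composite graph of this kind can be sketched as a multigraph whose edges carry a link type. The snippet below is only an illustration of the idea, not Genolink's actual implementation; the class name `CompositeGraph`, the protein names and the link-type strings are all hypothetical.

```python
from collections import defaultdict


class CompositeGraph:
    """Vertices are biological entities; each edge carries a link type."""

    def __init__(self):
        # vertex -> set of (neighbor, link_type) pairs
        self._adj = defaultdict(set)

    def add_link(self, a, b, link_type):
        """Add an undirected typed link between entities a and b."""
        self._adj[a].add((b, link_type))
        self._adj[b].add((a, link_type))

    def neighbors(self, vertex, link_type=None):
        """Return neighbors of vertex, optionally restricted to one link type."""
        return sorted(
            {n for n, t in self._adj[vertex] if link_type is None or t == link_type}
        )

    def link_types(self, a, b):
        """Return every type of link connecting a and b."""
        return sorted(t for n, t in self._adj[a] if n == b)


# Hypothetical example data
graph = CompositeGraph()
graph.add_link("protA", "protB", "protein_interaction")
graph.add_link("protA", "protB", "tissue_expression")
graph.add_link("protA", "protC", "sequence_similarity")

print(graph.neighbors("protA"))
print(graph.neighbors("protA", "protein_interaction"))
print(graph.link_types("protA", "protB"))
```

Keeping the link type on the edge rather than building one graph per data source lets a single query mix evidence types, for example "interaction partners that are also co-expressed".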
13
From PIMs to Pathways
Combine PIMs and external data to reconstruct
biological pathways
PIM annotation, pathways expansion
PIM: network of interaction links
Context-dependent homology
Common Data Model
Functional Classifications
Pathways Databases
PIMs
14
The BioPathways Consortium
  • Mission:
  • Foster development of pathways informatics and
    systems biology
  • Goals:
  • Scientific community buildup, standards
    recommendation, public outreach,
    industry-academia collaboration support,
    coordination with other groups
  • Means:
  • Forum open to interested participants (academics,
    pharmas, biotechs, software vendors)
  • Achievements:
  • Launched June 2000 by 3rd Millennium (Boston) and
    Hybrigenics (Paris)
  • 1st Meeting at ISMB 2000 -> Work Groups
  • 2nd Meeting at PSB 2001 -> First results on
    evaluation of pathways representations
  • 3rd Meeting: Satellite Meeting of ISMB 2001,
    Copenhagen -> Focus on ontologies and pathways
    reconstruction (>150 attendees), new workgroups
  • Several sponsors (pharmas, biotechs, IT
    companies)
  • Over 200 participants from academia and industry

15
Functional annotation
  • Objective: assign one or more functions to a
    gene or protein of known sequence
  • Traditional methods:
  • Experimental results
  • Variations on a theme: propagation of
    experimentally derived annotations via
    sequence similarity
  • Function?
  • Local and precise (e.g. protein P is an
    enzyme catalyzing reaction R)
  • Global and vague: membership in a high-level
    biological process (e.g. P is involved in
    glucose degradation)
  • What is propagated: keywords, a node of a
    functional classification tree
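
Propagating annotations through sequence similarity can be sketched as a best-hit transfer. This is a simplified illustration under stated assumptions: the function name `propagate_annotations`, the similarity scale (0 to 1) and the 0.5 threshold are hypothetical, and real pipelines use alignment scores or E-values instead.

```python
def propagate_annotations(similarities, known_annotations, min_score=0.5):
    """Transfer annotations to unannotated proteins from their best-scoring
    annotated neighbor, provided the similarity reaches min_score.

    similarities: dict mapping (query, subject) -> similarity score in [0, 1]
    known_annotations: dict mapping protein -> set of keywords
    Returns a dict mapping each newly annotated protein to
    (keywords, source protein, score).
    """
    best_hit = {}
    for (query, subject), score in similarities.items():
        if query in known_annotations or subject not in known_annotations:
            continue  # only propagate from annotated to unannotated proteins
        if score < min_score:
            continue
        if query not in best_hit or score > best_hit[query][1]:
            best_hit[query] = (subject, score)
    return {
        query: (set(known_annotations[subject]), subject, score)
        for query, (subject, score) in best_hit.items()
    }


# Hypothetical data
known = {"P1": {"glucose degradation"}, "P2": {"DNA repair"}}
sims = {
    ("X1", "P1"): 0.9,
    ("X1", "P2"): 0.6,
    ("X2", "P2"): 0.3,  # below threshold: X2 stays unannotated
}
inferred = propagate_annotations(sims, known)
print(inferred)
```

Recording the source protein and score alongside the transferred keywords keeps the origin of each annotation traceable, which matters because the propagated function may be either local and precise or global and vague.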

16
An effort toward consensus: Gene Ontology
Fig. 1 Examples of Gene Ontology. Three
examples illustrate the structure and style used
by GO to represent the gene ontologies and to
associate genes with nodes within an ontology.
The ontologies are built from a structured,
controlled vocabulary. The illustrations are the
products of work in progress and are subject to
change when new evidence becomes available. For
simplicity, not all known gene annotations
have been included in the figures. a, Biological
process ontology. This section illustrates a
portion of the biological process ontology
describing DNA metabolism. Note that a node may
have more than one parent; for example, DNA
ligation has three parents: DNA-dependent DNA
replication, DNA repair and DNA recombination.
b, Molecular function ontology. The ontology is
not intended to represent a reaction pathway, but
instead reflects conceptual categories
of gene-product function. A gene product can be
associated with more than one node within an
ontology, as illustrated by the MCM proteins.
These proteins have been shown to bind chromatin
and to possess ATP-dependent DNA helicase
activity, and are annotated to both nodes. c,
Cellular component ontology. The ontologies are
designed for a generic eukaryotic cell, and are
flexible enough to represent the known
differences between diverse organisms.
The Gene Ontology Consortium (2000) Nature Genet.
25:25-29
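
The multiple-parent property described in the caption means the ontology is a directed acyclic graph rather than a tree. The sketch below illustrates this with the DNA ligation example. The three parents come from the caption; placing them directly under "DNA metabolism" is an assumption made for illustration, and the names `PARENTS` and `ancestors` are hypothetical.

```python
# child term -> set of parent terms (a term may have several parents)
PARENTS = {
    "DNA ligation": {
        "DNA-dependent DNA replication",
        "DNA repair",
        "DNA recombination",
    },
    # Assumed for illustration: the three parents sit under "DNA metabolism".
    "DNA-dependent DNA replication": {"DNA metabolism"},
    "DNA repair": {"DNA metabolism"},
    "DNA recombination": {"DNA metabolism"},
    "DNA metabolism": set(),
}


def ancestors(term, parents=PARENTS):
    """Return all terms reachable from term by following parent links."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


print(sorted(ancestors("DNA ligation")))
```

A gene product annotated to "DNA ligation" is then implicitly associated with every ancestor term, which is how annotations at different levels of detail can be compared.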
17
The dogma
Sequence
Structure
Function
18
...and the experiments
Cellular context
Perturbation technologies
?
Sequence
?
Structure
Observation technologies
?
Function
Phenotype
Perturbation-observation pair: false positives,
false negatives, statistical processing,
formalization of the conclusion
19
Integration of heterogeneous data
  • Joint use of functional clues from a variety of
    experimental approaches to:
  • Validate the biological relevance of interactions
  • Determine the function of proteins
  • Validate targets in silico
  • Examples:
  • Interaction + expression
  • Interaction + 3D structure
  • Location + expression
  • Phylogenetic profiles + domain fusion
  • Recent problem, a bottleneck for drug discovery
    efforts
  • Frontier for the bioinformatics community:
  • Technology: normalization, formats, ontologies
  • Science: automate (some) biological reasoning?
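
As one illustration of combining interaction and expression data, an interaction can be considered better supported when the two partners have correlated expression profiles. This is a sketch of that general idea, not the method used in the presentation; the gene names, the expression values and the 0.8 threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def supported_interactions(interactions, expression, min_corr=0.8):
    """Keep interactions whose partners have correlated expression profiles."""
    return [
        (a, b)
        for a, b in interactions
        if pearson(expression[a], expression[b]) >= min_corr
    ]


# Hypothetical expression levels across four conditions
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}
interactions = [("geneA", "geneB"), ("geneA", "geneC")]
print(supported_interactions(interactions, expression))
```

Co-expression is only one functional clue; a low correlation does not rule out a real interaction, so a filter like this is better used for prioritizing than for rejecting.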

20
Evaluating pathways representations
  • Vincent Schächter, Hybrigenics, Paris
  • Aviv Regev, Tel-Aviv University
  • BioPathways Formalisms Workgroup

21
Evaluation scope: untangling the web...
  • Large body of literature, focusing on different
    biological phenomena and different theoretical
    issues
  • A typical article on pathways may include one or
    more of the following:
  • A data-model, describing (a fraction of) the
    pathway universe of discourse
  • A formalism, used to describe the data-model and
    to express algorithms / functions
  • Description of algorithms based on
    characteristics of both the formalism and the
    data model
  • Description of implementations of data-storage
    functionalities and/or of some of the above
    algorithms

22
Excerpt from target evaluation list: non-DE
formalisms
  • Petri nets (basic, hybrid, self-modifying,
    time-dependent, hierarchical, mobile)
  • Process algebra (basic and stochastic
    pi-calculus)
  • Markup languages (CellML and SBML)
  • Biocalculus
  • Regulatory grammars (Collado-Vides)
  • Semiotes (Kazic)
  • Statecharts (Kam, Holcombe)
  • Boolean networks (basic, multi-level)
  • Hierarchical networks (Bodnar)
  • Neural networks (Mjolsness)
  • Molecular graph reaction networks (McCaskill)
  • Molecular interaction maps (Kohn)
  • Electrical circuits (Keane)

23
A few examples of discrete representations
  • Object-oriented models:
  • Queries on all types of networks
  • Reconstruction, but with the problem of
    incomplete information
  • Boolean networks:
  • Qualitative simulation, reconstruction from
    expression data
  • Applied to regulatory networks
  • Petri nets:
  • Finer-grained qualitative simulation, formal
    analysis of behavior
  • Applied to regulatory networks
  • Possible application to metabolic and
    signaling networks with extensions
    (self-modifying PN, hybrid PN)
  • Process algebras:
  • Simulation, formal analysis, reconstruction
  • Applied to signaling and regulatory networks
    (metabolism with stochastic extensions)
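
Qualitative simulation with a Boolean network can be sketched in a few lines. The three-gene network and its rules below are hypothetical, and the update scheme is the basic synchronous one, where all genes are updated at once from the previous state.

```python
def step(state, rules):
    """Synchronous update: every gene is recomputed from the previous state."""
    return {gene: rule(state) for gene, rule in rules.items()}


def simulate(state, rules, max_steps=20):
    """Iterate until a previously seen state recurs; return the trajectory."""
    trajectory = [state]
    seen = {tuple(sorted(state.items()))}
    for _ in range(max_steps):
        state = step(state, rules)
        trajectory.append(state)
        key = tuple(sorted(state.items()))
        if key in seen:
            break  # fixed point or cycle reached
        seen.add(key)
    return trajectory


# Hypothetical three-gene regulatory network
rules = {
    "A": lambda s: s["A"],                 # A is a constant input
    "B": lambda s: not s["A"],             # A inhibits B
    "C": lambda s: s["A"] and not s["B"],  # C needs A and the absence of B
}
initial = {"A": True, "B": True, "C": False}
for t, s in enumerate(simulate(initial, rules)):
    print(t, s)
```

Because the state space is finite, every trajectory ends in a fixed point or a cycle; those attractors are what a qualitative analysis typically reports.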

24
The position of formalisms in the context of
pathways informatics
  • Pathway construction (pathway generation,
    pathway selection): supported by a
    construction-oriented formalism and data-model
  • Dynamics (simulation, analysis): supported by a
    dynamics-oriented formalism and data-model
  • Data storage and retrieval (query language):
    supported by a database-oriented formalism and
    data-model
  • Each formalism and data-model expresses the core
    representation / ontology (biological scope,
    formal expressiveness)
25
Evaluate and compare: a modular approach
  • Evaluate expressiveness/ease of use of
    representation relative to specific
    goals/functionalities
  • Compare representations in the categories for
    which they were designed
  • Reduce each category to a set of evaluation items
    that can be rated and compared as objectively as
    possible

26
Core representation / Ontology
  • Conceptual structure of the universe of
    discourse (abstract and concrete entities,
    relations, hierarchies...)
  • Constrains scope of phenomena that can be
    described, and thus queried, analyzed, and
    reconstructed.
  • Often implicit in a given pathway representation:
    need to extract...
  • Possible evaluation schemes:
  • 1. Compare features of ontology
  • 2. Expressiveness: benchmark set of biological
    situations
  • 3. Translation of data models into common
    formalisms, then comparison

How do you represent "gene A inhibits gene B"
in your data model?
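
To make the benchmark question concrete, here is how the same statement might look in two different data models. Both are hypothetical sketches (the names `Relation`, `inhibitors_of` and `next_gene_b` are invented for illustration): a typed relation that is easy to query, and a Boolean update rule that is easy to simulate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """One directed, typed relation between two biological entities."""
    source: str
    target: str
    kind: str  # e.g. "inhibits" or "activates"


# Model 1: a typed edge in a graph of relations (supports queries)
relations = {Relation("geneA", "geneB", "inhibits")}


def inhibitors_of(target, relations):
    """Query: which entities inhibit the given target?"""
    return sorted(
        r.source for r in relations
        if r.target == target and r.kind == "inhibits"
    )


# Model 2: a Boolean update rule (supports simulation)
def next_gene_b(state):
    """geneB is expressed at the next step only if geneA is absent."""
    return not state["geneA"]


print(inhibitors_of("geneB", relations))
print(next_gene_b({"geneA": True}))
```

The first model says nothing about dynamics and the second cannot answer "what inhibits geneB?" without inspecting code, which is the kind of difference in expressiveness such a benchmark is meant to expose.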
27
Conceptual Model, Biological Scope: Evaluation
Items
28
Core Representation, Formal Expressiveness:
Evaluation Items
29
Data Storage and Retrieval
  • Storage and retrieval of data:
    database-related functionalities
  • Extremes: relational or OO models vs., e.g., most
    simulation-oriented formalisms...
  • A data-retrieval-oriented formalism can be used
    "below" other formalisms
  • Query language:
  • Retrieve information within a structured,
    homogeneous, compositional framework
  • Shifting boundary with analysis and
    reconstruction algorithms
  • Evaluation items / sub-categories:
  • Robust database: implementation issue
  • Query language: ease of use
  • Query language: expressiveness
  • Limited by formalism and ontology expressiveness

30
Pathway reconstruction
  • Construction/prediction of pathways in a given
    biological environment (organism, tissue,
    condition, location) from a combination of:
  • experimental data
  • fully instantiated pathway information
  • partially instantiated (or incomplete) pathway
    data, such as interaction data
  • Special cases: reverse engineering, pathway
    inference
  • Evaluation items / sub-categories:
  • Input data types
  • Pathway generation algorithm
  • Pathway selection algorithm
  • Pathway fitness function
  • Pathway similarity/homology measure
  • Interactive validation?
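
The generation, selection and fitness items above can be sketched together on interaction data. This is a deliberately simple stand-in, not a published reconstruction algorithm: candidate pathways are simple paths in a hypothetical interaction graph, and the fitness function (product of edge confidence scores) is an assumption chosen for illustration.

```python
def generate_paths(graph, start, end, max_length=4):
    """Pathway generation: enumerate simple paths from start to end."""
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == end:
            paths.append(path)
            continue
        if len(path) > max_length:
            continue  # do not extend paths beyond the length limit
        for neighbor in graph.get(node, {}):
            if neighbor not in path:  # keep paths simple (no repeated node)
                stack.append(path + [neighbor])
    return paths


def fitness(graph, path):
    """Pathway fitness: product of the confidence scores along the path."""
    score = 1.0
    for a, b in zip(path, path[1:]):
        score *= graph[a][b]
    return score


def select_pathway(graph, start, end):
    """Pathway selection: return the candidate with the highest fitness."""
    candidates = generate_paths(graph, start, end)
    return max(candidates, key=lambda p: fitness(graph, p), default=None)


# Hypothetical interaction data: node -> {neighbor: confidence}
graph = {
    "receptor": {"kinase1": 0.9, "kinase2": 0.5},
    "kinase1": {"tf": 0.8},
    "kinase2": {"tf": 0.9},
    "tf": {},
}
print(select_pathway(graph, "receptor", "tf"))
```

Separating generation, fitness and selection mirrors the evaluation items: each can be replaced independently, for example by swapping in a fitness function that also uses expression data.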

31
Dynamics
  • Study of network dynamics (regulatory networks,
    ST, MP)
  • Simulation runs
  • Analysis of dynamic behavior
  • Evaluation items / sub-categories:
  • States: nature, expressiveness, level of detail
    vs. available data
  • Evolution rules / reaction model: rule,
    implementation
  • Time: continuous/discrete,
    synchronous/asynchronous updates
  • Space: continuous/discrete, topology, resolution
  • Analysis:
  • Scope: state reachability, liveness of
    transitions, substance flow...
  • Formal methods available
  • Comparative power
  • Limited to steady state?
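
State reachability, one of the analysis items above, can be illustrated with a basic Petri net. The enzyme-substrate net below is hypothetical. The exhaustive search terminates only because this net is bounded; unbounded nets need other techniques.

```python
from collections import deque

# Hypothetical net: each transition consumes and produces tokens.
# transition -> (tokens consumed per place, tokens produced per place)
TRANSITIONS = {
    "bind": ({"enzyme": 1, "substrate": 1}, {"complex": 1}),
    "catalyze": ({"complex": 1}, {"enzyme": 1, "product": 1}),
}
PLACES = ("enzyme", "substrate", "complex", "product")


def enabled(marking, consumed):
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking[PLACES.index(p)] >= n for p, n in consumed.items())


def fire(marking, consumed, produced):
    """Return the marking obtained by firing one transition."""
    new = list(marking)
    for p, n in consumed.items():
        new[PLACES.index(p)] -= n
    for p, n in produced.items():
        new[PLACES.index(p)] += n
    return tuple(new)


def reachable(initial):
    """Breadth-first exploration of all markings reachable from initial."""
    seen = {initial}
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        for consumed, produced in TRANSITIONS.values():
            if enabled(marking, consumed):
                nxt = fire(marking, consumed, produced)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


initial = (1, 2, 0, 0)  # 1 enzyme, 2 substrate molecules
markings = reachable(initial)
print(sorted(markings))
print((1, 0, 0, 2) in markings)  # can both substrates be converted?
```

Markings are tuples ordered as in `PLACES`, so the final check asks whether the state "free enzyme, no substrate, two products" can be reached from the initial state.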

32
Methodology: what do we evaluate?
Queries
Reconstruction
Simulation
Supports
Formalism
Evaluation targets
Describes
Data-model
Translation into common ontology description
language?
Ontology