InnateDB - PowerPoint PPT Presentation

About This Presentation
Title:

InnateDB

Description:

Grey nodes do not have an annotated subcellular localization (from Gene Ontology ... David Lynn, Chris Fjell, Jennifer Gardy, Karsten Hokamp, Nicolas Richard, ... – PowerPoint PPT presentation

Slides: 79

Transcript and Presenter's Notes

Title: InnateDB


1
InnateDB: Facilitating Systems-Level Analyses
of the Mammalian Innate Immune Response
David Lynn M.Sc., Ph.D., Research Associate,
Brinkman Lab., Simon Fraser University, and
Hancock Lab., University of British
Columbia. InnateDB Data Analysis Workshop,
UBC, Vancouver. April 2nd-3rd, 2008. Updated
Sept. 2009.
2
Systems Biology Approaches to Investigating the
Innate Immune Response
  • Progress has been made in understanding the
    innate immune response, including the detailed
    dissection of some of the critical signaling
    pathways involved.
  • It is now becoming clear that the innate immune
    response does not involve simple linear pathways
    but rather complex networks of pathways and
    interactions, negative feedback loops and
    multifaceted transcriptional responses.
  • To better understand the complexities of the
    innate immune response and the cross-talk between
    its components, complementary systems-level
    analyses and more focused follow-up experimental
    approaches are now needed.

3
InnateDB Developed in the Context of Two Large
International Systems Biology Projects
Mouse Model Datasets
  • Cerebral malaria mouse model (IMR, Australia)
  • Tuberculosis mouse model (AECM)
  • Shigella xenograft model (Pasteur)
Human Clinical Datasets
  • Typhoid and malaria, Vietnam
    (OUCRU/Stanford/Sanger)
  • Non-typhoidal Salmonella, Malawi (Sanger)
  • Chronic/acute helminth, Ecuador
    (USF de Quito/Sanger)

Modulating innate immune response via host
defense peptides (Hancock lab, UBC). Mouse KOs
(Sanger).
Novel insight into host response and mechanism of
peptides. Common pathways, networks and
transcriptional regulation?
4
Why Systems Approaches are Needed
  • Many layers of complexity
  • Layers of regulation
  • 100s to 1000s of DE genes
  • Not simple pathways but networks of molecular
    interactions.
  • Gardy, Lynn, Brinkman, Hancock. Enabling a
    systems biology approach to immunology: focus on
    innate immunity. Trends in Immunology, June 2009.

5
The Need for InnateDB and the Manual Curation of
Innate Immunity-Relevant Molecular Interactions
and Pathways.
  • Quickly apparent that available resources
    provided poor coverage and detail of the
    molecular interactions and pathways relevant to
    innate immunity.
  • This information is essential for the
    systems-orientated interpretation of large scale
    genomics data.
  • TLR4, one of the most important molecules in the
    innate immune response, has relatively few
    molecular interactions annotated in the major
    publicly available interaction DBs.
  • 5 of these DBs combined contained annotated
    molecular interactions between TLR4 and just 11
    other proteins.
  • Through a review of the literature we have
    curated, in detail, a further 16 unique
    interactions, and provided annotation of nearly
    60 different lines of evidence supporting these
    interactions.
  • Relatively new pathways (NLR, RLR pathways) not
    annotated at all in major pathway databases.
  • Few resources available for analysis of data in a
    pathway/network context that were accessible to a
    biologist. No resources for innate immunity.

6
Overview of InnateDB Project (www.innatedb.ca)
  • InnateDB (www.innatedb.ca) is a database of all
    human and mouse experimentally-verified
    interactions and pathways (and their component
    molecules: genes/proteins/RNAs).
  • Particular emphasis on the contextual manual
    curation of interactions involved in innate
    immunity (10,000 interactions).
  • InnateDB facilitates systems-level analyses of
    mammalian signaling through integrated
    bioinformatics and visualization tools: pathway
    and ontology analysis, network construction and
    analysis, orthologs, Cerebral, Cytoscape, CyOOg,
    etc.
  • The manual curation project, plus integration of
    publicly available databases into InnateDB,
    greatly increases coverage of innate
    immunity-relevant molecular interaction networks
    and pathways.
  • Enable biologists without a computational
    background to explore their data in a more
    systems-oriented, yet user-friendly, manner.

7
Contextually Curating Innate Immunity-Relevant
Interactions
  • Manual curation of >10,000 innate immune-relevant
    interactions (human and mouse).
  • Involving 2,700 genes from review of 2,600
    unique publications.
  • We can often double the number of interactions
    for a given gene.
  • Pathways and interactions are curated with
    contextual annotations (supporting publication;
    participant molecules; the species; the
    interaction detection method; the host system;
    the interaction type; the cell, cell-line and
    tissue types, etc.).
  • Developed InnateDB submission system software to
    allow submission of interaction annotation in an
    ontology-controlled and MIMIx and PSI-MI 2.5
    compliant manner.
  • Developed curator tool software to allow curators
    to modify existing annotations.

8
Going Beyond Innate Immunity: A Centralized
Resource for Interactions and Pathways
  • Aside from the well known signalling pathways, a
    range of other disparate processes, including
    apoptosis, ubiquitination, endocytosis, cell
    activation and recruitment, are all required to
    mount an effective innate immune response.
  • Adding to this complexity, borders between the
    innate and adaptive immune responses are becoming
    increasingly blurred.
  • Furthermore, if we hope to identify new networks
    or pathways involved in innate immunity, analyses
    must include genes and proteins that are, as yet,
    not known to play specific roles in the innate
    immune response.
  • To address these issues, InnateDB also
    incorporates data on the entire human and mouse
    interactomes.

9
Going Beyond Innate Immunity: An Integrative
Biology Resource
  • 115,000 human and mouse interactions extracted
    and loaded from the BIND, INTACT, DIP, BIOGRID
    and MINT DBs.
  • Cross-referenced genes to >3,000 pathways from
    the KEGG, PID, BIOCARTA, INOH, NetPath and
    Reactome DBs.
  • Allows one to visualize/analyze interactions
    associated with a specific pathway.
  • Pathway ORA.
  • Annotation from Ensembl provides details of human
    and mouse genes, transcripts and proteins.
  • UniProt, Entrez, Gene Ontology: rich protein and
    gene annotation.

10
Through manual curation and integration of existing
data from publicly available databases we can
greatly increase innate immunity-relevant networks
TLR4 direct and secondary interactions annotated
by InnateDB
TLR4 direct and secondary interactions annotated
by MINT Database
11
Direct and Secondary Interactions of TLR4 in
InnateDB (20 of these interactions unique to
InnateDB)
12
www.innatedb.ca
13
InnateDB: Advanced Yet User-Friendly Searching
Find and Analyze Relevant Interactions, Pathways
and Genes/Proteins.
14
InnateDB: Facilitating Systems-Level Analyses of
Gene Expression Data
Upload Your Own Gene Expression Data: Up to 10
conditions/timepoints at 1 time.
Overlay Gene Expression Data from Multiple
Conditions on Networks/Pathways
Pathway, Gene Ontology and TF ORA tools: Find DE
Pathways/Functionally Related Genes/TFs
Go Beyond Pathway Analysis: Differentially
Expressed Sub-networks. New Pathways? How Are DE
Genes Actually Inter-connected? Central
Regulators (Network Hubs)
15
Pathway Analysis: Any Type of Quantitative Data.
Orthologous Pathways
GWA Candidate Associated Genes
  • InnateDB pathway analysis can:
  • identify over-represented (OR) pathways.
  • highlight potentially unknown relationships
    between markers on different chromosomes.

16
Constructing and Analyzing Networks Using InnateDB
  • Pathway analysis can be very powerful in
    determining which annotated pathways are most
    significantly associated with DE genes.
  • Network analysis: move from a simple view of the
    signaling response to a more comprehensive
    analysis of the molecular interactions between DE
    genes and their encoded proteins and RNAs.
  • Potentially uncover as yet unknown signaling
    cascades or pathways, functionally relevant
    sub-networks and the central molecules, or hubs,
    of these networks.
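
As a rough illustration of the hub idea above, a node's degree (its number of distinct interaction partners) can be counted directly from a list of interactions. This is only a sketch, not InnateDB or Cerebral code: the function name `rank_hubs` and the example edges are assumptions made for illustration.

```python
from collections import defaultdict


def rank_hubs(interactions, top_n=3):
    """Rank nodes by degree, i.e. the number of distinct interaction partners."""
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # self-interactions do not add a partner
        partners[a].add(b)
        partners[b].add(a)
    # Highest degree first; ties are broken alphabetically for a stable order.
    ranked = sorted(partners.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(node, len(found)) for node, found in ranked[:top_n]]


# Illustrative edges only, not an export from InnateDB.
edges = [
    ("TLR4", "MYD88"), ("TLR4", "TIRAP"), ("TLR4", "LY96"),
    ("MYD88", "IRAK4"), ("MYD88", "IRAK1"), ("IRAK1", "TRAF6"),
]
print(rank_hubs(edges, top_n=2))
```

Degree is only one of several possible centrality measures; it is used here because it is the simplest one to read off an interaction list.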

17
Results: Visualize Gene Expression Data in an
Interaction Network Context
18
Multi-experiment View in Cerebral
19
Robust Orthology and Gene Order Predictions:
Facilitating Comparative Analysis
  • Majority of mammalian interaction data available
    in InnateDB and other interaction databases
    primarily refers to human genes and proteins.
  • To facilitate comparative network-based analysis
    of the human, mouse and bovine interactomes,
    detailed orthology predictions have been
    integrated into InnateDB.
  • Orthology predictions generated using an in-house
    method, Ortholuge, which provides accurate
    predictions of orthology using a phylogenetic
    distance-based approach.
  • Orthology predictions are further supported
    through the development of a human and mouse gene
    order and synteny browser.

20
A Guide to Using InnateDB
21
InnateDB User Friendly Interface www.innatedb.ca
22
(No Transcript)
23
Not sure what you want to search for? Browse
InnateDB by Interaction Type, Pathway or Various
Immune Gene Lists
24
All InnateDB Interactions Can be Downloaded in
Proteomics Standards Initiative (PSI) 2.5 XML
Format
25
Resources Page: Details of Relevant Software,
Databases, and Immune Gene Lists
26
Statistics on Curated Interactions and Interactions
from Other Databases
27
Use contact form or send email to
innatedb-mail_at_sfu.ca to report bugs, errors or to
get involved in curation.
28
Documentation, Tutorials and Help
29
Searching InnateDB
30
Do a simple search for genes, proteins or
interactions of interest on the InnateDB homepage,
e.g. IRAK genes.
31
Advanced Search for Genes and Proteins
32
Advanced Search for Interactions
InnateDB contains detailed information for more
than 115,000 human and mouse molecular
interactions integrated from several of the major
public interaction databases along with 10,000
manually-curated innate immunity relevant
interactions.
To reduce redundancy, interactions in InnateDB
that have the same participants and interaction
type are grouped together by default. Choose 'No'
to return all redundant interactions separately.
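
The default grouping described above can be pictured as keying each interaction record on its set of participants plus its interaction type. The following is a minimal sketch under assumed field names (`participants`, `interaction_type`, `source`); these are hypothetical and do not reflect InnateDB's actual schema.

```python
from collections import defaultdict


def group_interactions(records):
    """Group records that share the same participants and interaction type."""
    groups = defaultdict(list)
    for rec in records:
        # frozenset makes A-B and B-A map to the same key.
        key = (frozenset(rec["participants"]), rec["interaction_type"])
        groups[key].append(rec)
    return dict(groups)


# Hypothetical records for illustration only.
records = [
    {"participants": ("TLR4", "MYD88"),
     "interaction_type": "physical association", "source": "source A"},
    {"participants": ("MYD88", "TLR4"),
     "interaction_type": "physical association", "source": "source B"},
    {"participants": ("TLR4", "LY96"),
     "interaction_type": "physical association", "source": "source A"},
]
grouped = group_interactions(records)
for (participants, interaction_type), members in grouped.items():
    print(sorted(participants), interaction_type, len(members))
```

Choosing 'No' in the search form would correspond to skipping this step and returning the records as they are.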
33
Search for Particular Interactions or Genes that
are in a Specific Pathway
34
Search Results: searching for genes of interest,
e.g. IRAK
35
Search Results: searching for genes of interest,
e.g. IRAK
36
Interaction Results Page.
37
(No Transcript)
38
Visualize Interactions in a subcellular
localization-based layout using the Cerebral
plugin for Cytoscape.
39
How a biologist thinks of a pathway.
40
Pathway Visualization in Cytoscape
41
Pathway Visualization using Cerebral
www.pathogenomics.ca/cerebral (Bioinformatics
2007)
42
A Quick Guide to Using Cerebral in InnateDB
  • Cerebral can be used to visualize interaction
    networks from a set of interactions from
    InnateDB.
  • Cerebral uses subcellular localization
    annotations to provide more biologically
    intuitive pathway-like lay-outs of interaction
    networks.
  • Note: the subcellular localizations in Cerebral
    should only be used as a guide. There are many
    proteins with no annotated subcellular
    localizations and many others that have multiple
    possible localizations (only 1 will be shown;
    nuclear, extracellular and membrane localizations
    take precedence over cytoplasm if there are
    multiple).
  • InnateDB batch searching allows users to upload a
    list of genes along with associated gene
    expression data from up to 4 different
    conditions.
  • Gene expression data can be overlaid on network
    data and you can visualize this in Cerebral.
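
The localization rule noted in this list (one localization shown per protein, with nuclear, extracellular and membrane taking precedence over cytoplasm) could be sketched as below. The slide does not say how ties among the preferred localizations are resolved, so picking the first one listed is an assumption, as are the function name and the label strings.

```python
# Localizations that take precedence over cytoplasm, per the note above.
PREFERRED = {"nucleus", "extracellular", "membrane"}


def display_localization(annotations):
    """Pick the single localization to display for a protein.

    Returns None when there is no annotation at all.
    """
    if not annotations:
        return None
    for loc in annotations:
        if loc in PREFERRED:
            return loc  # assumption: the first preferred annotation wins
    return annotations[0]


print(display_localization(["cytoplasm", "nucleus"]))
print(display_localization(["cytoplasm"]))
print(display_localization([]))
```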

43
Opening Interaction Data in Cerebral from an
Interaction Results page in InnateDB.
  • You will be prompted to open a .jnlp file.
  • You are recommended to save this file to your
    computer and then open it; this will allow you
    to save a copy of this dataset.
  • Opening the .jnlp file directly without saving
    sometimes causes Cerebral to hang when loading
    large datasets.
  • Note: to use Cerebral you need to install Java
    version 6 or greater.
  • You can get this from
    http://java.com/en/download/index.jsp

44
Opening Cerebral
  • Cerebral is a Java plugin for the Cytoscape
    Visualization software.
  • When you open the .jnlp file Cytoscape will begin
    downloading.
  • You will then be prompted "Do you want to run
    the application?"; click Run.

45
Cerebral is Now Open and Displays Interactions
Based on Protein Subcellular Localizations
46
Re-size the Network
Click here to re-size the network display to
full-screen.
47
Navigating in Cerebral
  • Right click and push your mouse forward or back
    to zoom.
  • Hold middle button of your mouse and drag to
    navigate around the network.
  • Grey nodes do not have an annotated subcellular
    localization (from Gene Ontology data in
    InnateDB).
  • Lines connecting nodes represent interactions.
    Dashed lines have only 1 supporting publication
    in InnateDB. The thicker the line the more
    publications support the interaction.
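
The edge convention described above (dashed for a single supporting publication, thicker lines for more publications) might be expressed as a small mapping from publication count to line style. The exact widths Cerebral uses are not given in the slides, so the linear scaling and the cap below are assumptions chosen for illustration.

```python
def edge_style(n_publications, max_width=6.0):
    """Map a supporting-publication count to an illustrative line style."""
    if n_publications < 1:
        raise ValueError("an interaction needs at least one supporting publication")
    line = "dashed" if n_publications == 1 else "solid"
    # Assumed scaling: +0.5 width per extra publication, capped at max_width.
    width = min(1.0 + 0.5 * (n_publications - 1), max_width)
    return {"line": line, "width": width}


for count in (1, 3, 50):
    print(count, edge_style(count))
```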

48
Interactively Link back to InnateDB to Look up
Information on Particular Genes/Interactions of
Interest.
  • Right-click on a node (protein/gene) or edge
    (interaction line) to link to the relevant gene
    or interaction details page in InnateDB.

49
Nodes Can be Dragged to Other Layers as Desired.
50
Do a simple search for genes, proteins or
interactions of interest on the InnateDB homepage,
e.g. IRAK genes.
51
View Detailed Gene Annotation.
52
Gene Details Page.
53
Gene Details Page: Molecular Interactions and Gene
Ontology Annotation.
54
Integrated Orthology and Gene Order Information
55
Human/Mouse Conserved Gene Order and Synteny Browser
56
Gene Details Page: Associated Pathways
57
Gene Details Page: Cross-references to Other
Databases
58
Integrating Gene Expression Data in a Molecular
Interaction Network and Pathway Context
59
InnateDB: Integrating Gene Expression Data in a
Molecular Interaction Network and Pathway Context
Integrated Gene Expression Data with Molecular
Interaction Data, Pathway Associations and Rich
Gene Annotation
Batch Search of InnateDB
Microarray Data → Differentially Expressed Genes
60
Orthologous Interaction Networks
  • Detailed protein/gene interaction data are mainly
    available for human.
  • InnateDB ortholog predictions in mouse and cow
    can be used to:
  • Build the hypothetical orthologous interaction
    network for genes of interest in these species.
  • Find associations to pathways for orthologous
    genes, e.g. map pathways to mouse genes based on
    human orthology.
  • Predict potential differences between species,
    e.g. a missing orthologous gene in one species
    may indicate how reliable that species is as a
    model organism for the network of interest.
  • Compare orthologous predicted networks to
    experimental data, e.g. in mouse.
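
The idea of building a hypothetical orthologous network can be sketched as projecting human interactions through a human-to-mouse ortholog map and recording genes with no predicted ortholog. This is an illustrative sketch only: `project_network`, the small ortholog map and the placeholder gene `GENE_X` are assumptions, and real ortholog predictions would come from InnateDB.

```python
def project_network(human_edges, human_to_mouse):
    """Project human interactions onto mouse genes via an ortholog map.

    Returns (projected_edges, human_genes_without_ortholog).
    """
    projected = set()
    missing = set()
    for a, b in human_edges:
        missing.update(g for g in (a, b) if g not in human_to_mouse)
        if a in human_to_mouse and b in human_to_mouse:
            # Sort so that each undirected edge has a single representation.
            projected.add(tuple(sorted((human_to_mouse[a], human_to_mouse[b]))))
    return projected, missing


# Illustrative inputs; GENE_X is a placeholder with no mouse ortholog.
human_edges = [("TLR4", "MYD88"), ("MYD88", "IRAK1"), ("IRAK1", "GENE_X")]
human_to_mouse = {"TLR4": "Tlr4", "MYD88": "Myd88", "IRAK1": "Irak1"}
projected, missing = project_network(human_edges, human_to_mouse)
print(sorted(projected))
print(sorted(missing))
```

The `missing` set is the part relevant to the model-organism question: an edge that cannot be projected flags a possible difference between the species.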

61
Example Tab-delimited File
62
Upload Gene/Protein List to InnateDB Along with
Any Associated Quantitative Data
Select a file to upload by clicking on the
"Upload File" button - upload a tab-delimited
file of protein/gene identifiers or accession
numbers and obtain a list of all genes, proteins,
pathways, interactors or interactions that they
are associated with. Alternatively, click on the
"Web Form" button and paste your tab-delimited
data in the text box (max. 1000 lines)
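
A tab-delimited upload of this kind can be checked locally before submission. The sketch below assumes a hypothetical three-column layout (identifier, fold change, P value) with no header row; the actual columns InnateDB accepts are chosen in its upload form, so treat the layout and the function name `parse_upload` as assumptions. The 1000-line limit is the one stated above for the web form.

```python
import csv


def parse_upload(text, max_lines=1000):
    """Parse tab-delimited text into rows of id, fold change and P value."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) > max_lines:
        raise ValueError(f"too many lines: {len(lines)} > {max_lines}")
    rows = []
    for fields in csv.reader(lines, delimiter="\t"):
        rows.append({
            "id": fields[0].strip(),
            "fold_change": float(fields[1]),
            "p_value": float(fields[2]),
        })
    return rows


# Placeholder identifiers, not real accession numbers.
example = "GENE_A\t2.4\t0.001\nGENE_B\t-1.8\t0.03\n"
rows = parse_upload(example)
print(rows)
```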
63
(No Transcript)
64
(No Transcript)
65
(No Transcript)
66
(No Transcript)
67
Results: Visualize Gene Expression Data in an
Interaction Network Context
68
Multi-experiment View in Cerebral
Click on one of the mini-windows to view data for
that condition in the large window.
69
Cerebral
Multi-Array Viewer
70
Cerebral Multi-Array Viewer
71
Interactively Link back to InnateDB to Look up
Information on Particular Genes/Interactions of
Interest.
72
Pathway Over-representation Analysis
73
Return Pathways Associated with Uploaded Gene
List
  • To do pathway over-representation analysis (ORA)
    you first need to upload a list of gene
    identifiers and associated fold-change in gene
    expression values (and P values) as described
    above.
  • InnateDB recommends that you upload all genes
    from your array dataset, not just differentially
    expressed (DE) genes (probes mapping to multiple
    different genes should be removed). The pathway
    ORA tool uses the proportion of DE genes on the
    whole array to determine if a particular pathway
    is significant.
  • As the above method can be very conservative due
    to the large number of tests performed, InnateDB
    also provides users with the option of uploading
    a subset of genes and performing the pathway ORA.
    This subset analysis uses a slightly different
    algorithm that does not take gene expression
    values into account. This is necessary as the
    algorithm does not know the proportion of DE
    genes on the array. Therefore, this analysis
    cannot handle data from multiple conditions.
  • If you have multiple probes for the same gene,
    these values will be averaged for the purposes of
    the pathway ORA.
  • Because InnateDB sources its pathway data from
    multiple databases, each with its own
    interpretations of the components of a given
    pathway, you will observe some degree of
    duplication in the results; however, this is
    outweighed by the extra annotation that can be
    obtained from different data sources.
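
The probe-averaging step mentioned in this list can be sketched as collapsing probe-level values to one value per gene. The slides do not state which kind of average is used, so the plain arithmetic mean below is an assumption, as is the function name `average_probes`.

```python
from collections import defaultdict
from statistics import mean


def average_probes(probe_rows):
    """Collapse (gene, value) probe rows to one mean value per gene."""
    by_gene = defaultdict(list)
    for gene, value in probe_rows:
        by_gene[gene].append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}


# Illustrative probe-level fold changes.
probe_rows = [("IRAK1", 2.0), ("IRAK1", 4.0), ("TLR4", -1.5)]
per_gene = average_probes(probe_rows)
print(per_gene)
```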

74
Pathways Associated with Uploaded List
75
Choose Parameters for Pathway ORA
Choose fold-change in gene expression threshold
(determines which genes are considered
differentially expressed). Default: +/- 1.5.
Choose P value threshold associated with each
fold-change in gene expression value (determines
which genes are considered differentially
expressed). Default: P < 0.05.
Several different statistical methods are
available to determine if pathways are
significantly associated with DE genes:
Hypergeometric, Fisher and Chi-square.
Two options to correct for multiple testing are
included: the Benjamini-Hochberg correction
for the FDR and the more conservative Bonferroni
correction.
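
To make these parameters concrete, here is a hedged sketch of how the thresholds, a hypergeometric test and the two corrections fit together. It uses the textbook formulas, not InnateDB's code; all function names are assumptions, and treating a fold change exactly at the threshold as DE is also an assumption.

```python
from math import comb


def is_de(fold_change, p_value, fc_threshold=1.5, p_threshold=0.05):
    """DE if |fold change| meets the threshold and P is below the cutoff."""
    return abs(fold_change) >= fc_threshold and p_value < p_threshold


def hypergeom_pvalue(n_array, n_de, n_pathway, n_de_in_pathway):
    """Upper tail P(X >= n_de_in_pathway) of the hypergeometric distribution."""
    upper = min(n_de, n_pathway)
    favourable = sum(
        comb(n_de, i) * comb(n_array - n_de, n_pathway - i)
        for i in range(n_de_in_pathway, upper + 1)
    )
    return favourable / comb(n_array, n_pathway)


def bonferroni(pvalues):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.5]  # illustrative per-pathway P values
print(hypergeom_pvalue(10, 5, 2, 2))
print(bonferroni(raw))
print(benjamini_hochberg(raw))
```

The hypergeometric call takes the array-wide counts (genes on the array, DE genes on the array) as its background, which is why uploading the whole array matters for this kind of test.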
76
(No Transcript)
77
Pathway Summary Page
KEGG pathway diagrams can be dynamically linked
to overlaying gene expression data
78
Acknowledgements: The Bioinformatics Team
  • Overall Project Management: Bob Hancock, Brett
    Finlay, Lorne Babiuk, Bernadette Mah
  • Bioinformatics and InnateDB Management: Fiona
    Brinkman, David Lynn
  • InnateDB Database Development/Data Loading:
    Matthew Laird, Nicolas Richard, Fiona Roche,
    Timothy Chan, Michael Acab
  • InnateDB Search Engine and User Interface: Geoff
    Winsor
  • InnateDB Submission System and Curator Tool:
    Calvin Chan, Naisha Shah
  • Cerebral Pathway Visualization Software: Jennifer
    Gardy, Aaron Barsky, Tamara Munzner
  • Orthologs and Gene Order: Dan Tulpan, Matthew
    Whiteside, Mark Sun, Matthew Laird
  • Systems Administration: Matthew Laird (SFU),
    Timothy Chan (UBC)