GS2PATH: Linking Gene Ontology and Pathways

Description:

KRIBB Korea Research Institute of Bioscience and Biotechnology ... in each Gene Ontology (GO) terms and map the part of gene set on GO term ...

Slides: 32
Provided by: jinok

1
GS2PATH: Linking Gene Ontology and Pathways
6th InCoB 2007
  • Jin Ok Yang
  • Korean BioInformation Center

2
KOBIC (Korean BioInformation Center)
  • The national bioinformatics center of Korea
  • Integration of diverse biological information
  • Genome information
  • Biodiversity information
  • Bioresource information
  • Bioinformatics training
  • International exchange program
  • Collaborative development of bioinformatics tools
  • Bioportal (Biowiki)
  • Biopipeline (Bioworkflow engine)

3
BioWiki
  • Wiki
  • a web technology that enables anyone to create
    and update website contents
  • suited for developing online knowledge bases
    (e.g., Wikipedia)
  • BioWiki
  • To adopt the wiki paradigm in biology
  • Collaborative development of biological
    knowledge bases
  • BioWiki Contest (http://biowiki.net)

4
BioPipe (http://www.biopipe.net)
  • BioWorkFlow Engine
  • No installation required
  • Drag and drop, then connect
  • BioPipe Contest !!
  • Aug 15th to Sep 20th
  • Open, free, Web 2.0

(Screenshot of the BioPipe interface with labeled areas: Toolbar, Ontology View, Design View, Monitoring View. Drag the module from the list and drop it into the design view.)
6
Background
GO and Pathways
  • Efforts on analyzing functional relationships
    among gene sets with GO term and pathways
  • Gene Ontology (GO) term-based analysis →
    Analysis focused on function
  • GO term-related pathways → More useful information

How do you interpret the gene set?
7
Gene set enrichment
  • Enrichment Test
  • A test to determine which GO terms are
    over-represented in the given gene set
  • The P-value for each GO term is calculated using
    the hypergeometric probability
  • Gene set enrichment
  • Derives its power by focusing on gene sets, that
    is, groups of genes that share common biological
    function, chromosomal location, or regulation
  • Evaluates microarray data at the level of gene
    sets which are defined based on prior biological
    knowledge
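
As a rough illustration of the enrichment test described on this slide, the upper-tail hypergeometric probability can be sketched as below. The function name and parameter names are illustrative assumptions; the slides do not show GS2PATH's own code or state which background gene population it uses.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Upper-tail hypergeometric probability P(X >= hits).

    population: number of genes in the background (assumed: all annotated
                genes of the organism)
    annotated:  background genes annotated to the GO term
    sample:     number of genes in the submitted gene set
    hits:       gene-set members annotated to the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of observing `hits` or more annotated genes.
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(hits, upper + 1)
    )
    return tail / total
```

A small P-value means the gene set contains more genes annotated to the term than would be expected from a random draw of the same size.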

8
Introduction: GO
  • GO databases and tools
  • GO terms have mostly been used to analyze data
    sets and identify significant biological changes
  • Pathways can also be exploited to find functional
    relationships among genes

9
Introduction: Pathways
10
GS2PATH
  • A system that finds gene set enrichment in each
    Gene Ontology (GO) term and maps the gene-set
    members annotated to a GO term onto biological
    pathways (KEGG and BioCarta)
  • An integrated search tool for analyzing the
    functional relationships in gene sets and for
    providing comprehensive results

11
Features
  • Functional relationships between GO term and
    pathways
  • Hypergeometric test for gene set enrichment
  • Dual search for up- and down-regulated gene sets
  • Various filtering options for GO terms
  • Number of descendant nodes, evidence codes of GO
    terms, and statistical values for the gene set
    mapped to each GO term
  • User-specified coloring of genes on pathways

12
Implementation (1/3)
  • GS2PATH consists of
  • one internal database (mapping database)
  • four components
  • Query Processor, GO Accessor, KEGG Accessor, and
    BioCarta Accessor

13
Schema of internal mapping DB
14
Architecture
15
Implementation (2/3)
  • Query Processor
  • Receives a user query
  • Converts the query into gene-related information
  • Distributes it to the other components and waits
    for their results
  • GO Accessor
  • Retrieves statistical values for mapping the gene
    set in each GO term to KEGG and BioCarta pathways
  • Calculates the P-value using the cumulative
    hypergeometric distribution
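
The Query Processor flow on this slide (receive, convert, distribute, wait) might be sketched as follows. Everything here is a hypothetical stand-in: the accessor functions, the `normalize` helper, and the result shapes are assumptions for illustration, and the slides do not say whether GS2PATH dispatches to its components concurrently.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical accessors: each receives the cleaned gene set and returns
# its own result. The real components query GO, KEGG, and BioCarta data.
def go_accessor(genes):
    return {"source": "GO", "genes": sorted(genes)}


def kegg_accessor(genes):
    return {"source": "KEGG", "genes": sorted(genes)}


def biocarta_accessor(genes):
    return {"source": "BioCarta", "genes": sorted(genes)}


ACCESSORS = {
    "GO": go_accessor,
    "KEGG": kegg_accessor,
    "BioCarta": biocarta_accessor,
}


def normalize(raw_query: str) -> set:
    """Turn a pasted gene ID list into a set of unique identifiers."""
    return {token for token in raw_query.replace(",", " ").split()}


def process_query(raw_query: str, accessors=ACCESSORS) -> dict:
    """Distribute the gene set to every accessor and wait for all results."""
    genes = normalize(raw_query)
    with ThreadPoolExecutor(max_workers=len(accessors)) as pool:
        futures = {name: pool.submit(fn, genes) for name, fn in accessors.items()}
        # result() blocks until the corresponding accessor has finished.
        return {name: future.result() for name, future in futures.items()}
```

Calling `process_query("TP53, BRCA1")` returns one entry per accessor, keyed by component name.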

16
Implementation (3/3)
  • BioCarta and KEGG Accessors
  • Retrieve results from the BioCarta and KEGG
    databases, respectively
  • To support user-specified coloring:
  • For KEGG, we use the KEGG web service API
    (SOAP/WSDL)
  • BioCarta provides no API for user-defined
    coloring. Thus, after retrieving the image of a
    pathway from the BioCarta database, we color genes
    in the image on the fly.

17
GO Term based Pathways Analysis
18
Search
  • Gene set enrichment test against the organism's
    total profile: GO, KEGG, and BioCarta
  • Single or two-part analysis (up- and
    down-regulation)
  • Pathway viewer for KEGG and BioCarta

19
Input
  • Database
  • GO category
  • Biological Process
  • Molecular Function
  • Cellular Component
  • Pathways: KEGG and BioCarta
  • Organism
  • Human, Mouse, Rat, and Yeast
  • Gene ID list

20
Test
  • Enrichment test
  • P-value: hypergeometric probability
  • FDR (False Discovery Rate)
  • Adjustment of p-value
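
The slide does not name the FDR procedure used. As one common choice, assumed here purely for illustration, the Benjamini-Hochberg adjustment of a list of P-values can be sketched as:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that each adjusted value is
    # capped by the adjusted values of all larger P-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each adjusted value can then be compared with the chosen FDR threshold in place of the raw P-value.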

21
Filtering
  • GO Term
  • Evidence
  • Slim
  • Number of genes in term
  • P-value
  • Pathways: KEGG and BioCarta
  • Number of genes in term
  • P-value
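
A minimal sketch of how these filters could be combined is shown below. The `TermResult` record, its field names, and the default P-value cutoff of 0.05 are assumptions for illustration; the 5-gene minimum mirrors the example used later in the deck.

```python
from dataclasses import dataclass


@dataclass
class TermResult:
    term_id: str      # GO term or pathway identifier (placeholder values below)
    evidence: str     # GO evidence code, e.g. "IDA" or "IEA"
    gene_count: int   # number of gene-set members mapped to the term
    p_value: float    # enrichment P-value for the term


def filter_terms(results, min_genes=5, max_p=0.05, allowed_evidence=None):
    """Keep terms passing every filter, sorted by ascending P-value."""
    kept = [
        r for r in results
        if r.gene_count >= min_genes
        and r.p_value <= max_p
        and (allowed_evidence is None or r.evidence in allowed_evidence)
    ]
    return sorted(kept, key=lambda r: r.p_value)


# Usage with made-up records:
example = [
    TermResult("TERM_A", "IDA", 12, 0.001),
    TermResult("TERM_B", "IEA", 3, 0.0005),   # too few genes
    TermResult("TERM_C", "IEA", 8, 0.20),     # P-value too high
    TermResult("TERM_D", "TAS", 6, 0.0001),
]
significant = filter_terms(example)
```

Passing `allowed_evidence={"IDA"}` would additionally restrict the output to terms supported by that evidence code.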

22
Example: microarray clustering data
Part A
Part B
23
Interface
Select GO category or Pathways
Select Organism
Enter the gene set
24
Click
25
Retaining only GO terms having at least 5 genes
26
(No Transcript)
27
Select customized colors
28
(No Transcript)
29
(No Transcript)
30
Genes colored in KEGG and BioCarta
31
Conclusion
  • Using GS2PATH, users can
  • Get integrated Gene Ontology term and pathway
    information together
  • Filter the results with various conditions
  • Capture relationships between Gene Ontology terms
    and pathways
  • Available at
    http://array.kobic.re.kr:8080/arrayport/gs2path/