Provided by: matt79
Title: Gene Ontology


1
Gene Ontology
2
GO Terms Data
  • Full tree available from www.geneontology.org in
    text format
  • OBO file format is current, contains all terms in
    1 file
  • GO format deprecated, 3 separate files
  • Gene to GO category mapping available in
    Affymetrix annotations
  • Example annotation field: 6118 // electron
    transport // inferred from electronic annotation
    /// 6810 // transport // inferred from electronic
    annotation
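
The annotation field shown above appears to separate entries with "///" and the columns inside each entry (id, name, evidence) with "//". A minimal sketch of splitting it under that assumption follows; the class and method names are hypothetical and not taken from the application, and the accession() helper assumes the usual zero-padded 7-digit GO accession form.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the application described in the slides. */
final class GoAnnotationParser {

    /** One "id // name // evidence" triple from the annotation field. */
    static final class Entry {
        final int goId;        // numeric id as written in the annotation, e.g. 6118
        final String name;     // e.g. "electron transport"
        final String evidence; // e.g. "inferred from electronic annotation"

        Entry(int goId, String name, String evidence) {
            this.goId = goId;
            this.name = name;
            this.evidence = evidence;
        }

        /** Zero-pads the numeric id to a 7-digit accession such as GO:0006118. */
        String accession() {
            return String.format("GO:%07d", goId);
        }
    }

    /** Splits one annotation field into entries; malformed entries are skipped. */
    static List<Entry> parse(String field) {
        List<Entry> entries = new ArrayList<>();
        if (field == null) {
            return entries;
        }
        // Split on "///" first: it separates entries and contains the column separator "//".
        for (String part : field.trim().split("\\s*///\\s*")) {
            String[] cols = part.split("\\s*//\\s*");
            if (cols.length < 3) {
                continue; // empty or malformed entry: skip rather than guess
            }
            try {
                entries.add(new Entry(Integer.parseInt(cols[0].trim()),
                                      cols[1].trim(), cols[2].trim()));
            } catch (NumberFormatException e) {
                // Non-numeric id: skip this entry.
            }
        }
        return entries;
    }
}
```

Calling GoAnnotationParser.parse(...) on the example field yields two entries, which can then be keyed by accession() when looking terms up in the GO tree.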

3
WorkBench Model
  • Select a set of genes (by hand, or by a process,
    e.g. a T-Test)
  • GO panel automatically identifies the chip type by
    the presence of specific markers (or override via
    drop-down)
  • Check for existence of a serialized tree model; if
    it doesn't exist, build it (from the deprecated GO
    format) and serialize
  • Map selected genes' GO category membership onto
    GO tree
  • Calculate PValue of selected genes' GO category
    membership
  • Compare selected genes' expression levels in a
    given category to the reference genes' expression
    levels; this determines whether a gene is
    considered enriched or not
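
The slides do not say which statistical test produces the PValue. A one-sided hypergeometric test is a common choice for category enrichment, so the sketch below assumes it; the class, method, and parameter names are all hypothetical. It computes the probability of seeing at least `hits` category members among the selected genes when drawing from the reference list.

```java
/** Hypothetical sketch: upper-tail hypergeometric p-value for one GO category. */
final class GoEnrichment {

    /** Returns log(n!) for n = 0..max, built as a running sum of logs. */
    private static double[] logFactorials(int max) {
        double[] lf = new double[max + 1];
        for (int i = 1; i <= max; i++) {
            lf[i] = lf[i - 1] + Math.log(i);
        }
        return lf;
    }

    /** log of the binomial coefficient C(n, k). */
    private static double logChoose(double[] lf, int n, int k) {
        return lf[n] - lf[k] - lf[n - k];
    }

    /**
     * @param populationSize number of genes in the reference list
     * @param categorySize   reference genes annotated to the category
     * @param sampleSize     number of selected genes
     * @param hits           selected genes annotated to the category
     * @return P(X >= hits) under the hypergeometric distribution
     */
    static double upperTailPValue(int populationSize, int categorySize,
                                  int sampleSize, int hits) {
        if (populationSize < 0 || categorySize < 0 || categorySize > populationSize
                || sampleSize < 0 || sampleSize > populationSize || hits < 0) {
            throw new IllegalArgumentException("inconsistent counts");
        }
        double[] lf = logFactorials(populationSize);
        int outside = populationSize - categorySize;
        int minPossible = Math.max(0, sampleSize - outside);
        int maxPossible = Math.min(categorySize, sampleSize);
        double logDenominator = logChoose(lf, populationSize, sampleSize);

        double p = 0.0;
        for (int i = Math.max(hits, minPossible); i <= maxPossible; i++) {
            // Work in log space so large binomial coefficients do not overflow.
            p += Math.exp(logChoose(lf, categorySize, i)
                    + logChoose(lf, outside, sampleSize - i)
                    - logDenominator);
        }
        return Math.min(1.0, p);
    }
}
```

If the application uses a different test, only upperTailPValue would change; the four counts it takes are the ones the mapping step already has to produce.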

4
Issues - Structural
  • Use of Swing DefaultTreeModel as data structure
    to represent tree limits use by other
    components (1624)
  • GO tree not accessible to other components;
    it is contained only within the GO Panel

5
Issues - Programmatic
  • Of the 2.8 seconds it takes to build an in-memory
    representation of the GO tree, 2 seconds is spent
    serializing the tree. It takes 3.9 seconds to
    deserialize the tree (vs. 390ms to parse the
    original file)
  • ComputeCumulative() is recursive and can overflow
    the stack for deep trees (1502)
  • Linear search caused by Vector.contains() at
    GoTerm190 approached 25% of runtime for large
    gene lists. Replace with something like TreeSet.
  • PValue trends rendering causes OutOfMemory for
    larger sets of genes, crashing the interface.
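
One way to remove the stack-overflow risk is to replace the recursion with an explicit stack. The slides do not show what ComputeCumulative() computes, so the sketch below assumes it adds each term's own gene count to the counts of all its descendants; the Node class and its field names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of a non-recursive cumulative count over a tree. */
final class CumulativeCounts {

    static final class Node {
        final String id;
        final int directCount;   // genes annotated directly to this term
        int cumulativeCount;     // direct count plus all descendants' counts
        final List<Node> children = new ArrayList<>();

        Node(String id, int directCount) {
            this.id = id;
            this.directCount = directCount;
        }
    }

    static void computeCumulative(Node root) {
        // Pass 1: list the nodes so that every parent precedes its descendants.
        // The explicit stack lives on the heap, so tree depth is not limited
        // by the thread's call stack.
        List<Node> order = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            order.add(node);
            for (Node child : node.children) {
                stack.push(child);
            }
        }
        // Pass 2: walk the list backwards, so each child is final before its parent.
        for (int i = order.size() - 1; i >= 0; i--) {
            Node node = order.get(i);
            int total = node.directCount;
            for (Node child : node.children) {
                total += child.cumulativeCount;
            }
            node.cumulativeCount = total;
        }
    }
}
```

This sketch treats the structure as a strict tree, as the slides do; if a term can be reached through more than one parent, it would be listed and counted once per path.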

6
Issues - GUI
7
Issues - GUI
  • Confusing layout of components, e.g. when the
    checkbox beside Reference List is checked, that
    means ignore the reference list
  • List of genes for a given category has no
    functionality (should be able to add to
    selections from that list too at least)
  • No way to see position in GO tree for a gene in
    the table view
  • Multiple progress bars, sometimes with big pauses
    in between, confuse the user
  • Is the PValue trends graph useful?

8
Proposed fixes
  • Switch to using OBO gene tree file format
  • Define an interface for GO terms usage across the
    entire application and back that with standard
    data structures (not Swing support classes)
  • Move GO terms into the global annotations service
  • Do not serialize the GO terms tree; parse the file
    every time: faster, less code, less chance of
    error (e.g. if the OBO source file was updated)
  • Change GUI to split pane with table view on the
    left and a dynamic tree view on the right which
    changes to display the tree for the selected gene
  • Make it clear when you're using all the genes
    from the specified annotation as the reference
    list, or when you've specified a list of genes to
    override that list
  • Unless P-Value trends graph provides important
    functionality, remove it or replace it by
    exposing the results of the analysis and
    displaying in a more general graphing component
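
As a rough illustration of parsing the OBO file on every start and holding the terms in standard collections rather than Swing classes, the sketch below reads [Term] stanzas and keeps only the id, name, and is_a tags. The class and method names are hypothetical, and other tags (obsolete flags, other relationships, trailing modifiers) are ignored for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a minimal OBO [Term] reader backed by plain collections. */
final class OboTermReader {

    static final class Term {
        String id;                                        // e.g. "GO:0006810"
        String name;
        final List<String> parentIds = new ArrayList<>(); // targets of is_a tags
    }

    /** Parses the lines of an OBO file into a map from term id to term. */
    static Map<String, Term> parse(Iterable<String> lines) {
        Map<String, Term> terms = new LinkedHashMap<>();
        Term current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("[")) {
                // A new stanza begins: store the previous term, if any.
                store(terms, current);
                current = line.equals("[Term]") ? new Term() : null;
            } else if (current != null) {
                if (line.startsWith("id:")) {
                    current.id = line.substring(3).trim();
                } else if (line.startsWith("name:")) {
                    current.name = line.substring(5).trim();
                } else if (line.startsWith("is_a:")) {
                    String value = line.substring(5);
                    int bang = value.indexOf('!'); // "!" starts a trailing comment
                    if (bang >= 0) {
                        value = value.substring(0, bang);
                    }
                    current.parentIds.add(value.trim());
                }
            }
            // Header lines before the first stanza and non-Term stanzas are skipped.
        }
        store(terms, current);
        return terms;
    }

    private static void store(Map<String, Term> terms, Term term) {
        if (term != null && term.id != null) {
            terms.put(term.id, term);
        }
    }
}
```

A caller could pass in java.nio.file.Files.readAllLines(path) for the downloaded OBO file. Because the result is an ordinary Map, components other than the GO panel can share it without depending on Swing's tree model.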