Title: Bioin401 Project: Microarray Pipeline
1 Bioin401 Project: Microarray Pipeline
- Chunyan Meng
- Yifeng Liu
- Chelsea Ju
- March 27, 2006
2 Microarray Pipeline: GANTT Chart
3 Microarray Pipeline: Programming Progress
- BioPerl installation and testing
- Email notification
4 Hierarchical Clustering: Gene Function Prediction
- Checked with Greg about BioSpider
- Checked with Savita about BASys (Bacterial Annotation System)
- Literature review on gene function prediction
  - Browsed the currently available methods for gene/protein function prediction
  - Trying to find software or papers that use human microarray data
5 Microarray Pipeline: Gene Function Prediction Software
6 Microarray Pipeline: Gene Function Prediction
7 Microarray Pipeline: Gene Function Prediction (STRING)
- STRING: Search Tool for the Retrieval of Interacting Genes/Proteins
- STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations; they are derived from four sources:
  - Genomic context analysis: conserved genomic neighborhood, gene fusion events, and co-occurrence of genes across genomes
  - High-throughput experiments: microarray experiments
  - (Conserved) coexpression
  - Previous knowledge: text mining of PubMed and MIPS
8 Microarray Pipeline: Gene Function Prediction (STRING)
- STRING quantitatively integrates interaction data from these sources for a large number of organisms, and transfers information between these organisms where applicable.
- The database currently contains 736429 proteins in 179 species.
- Input: gene/protein name or amino-acid sequence
- Scope: 179 species
- Access: publicly accessible through a web interface; the academic license is free, but with a delay. Selected data items are available for free as flat files.
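To make "quantitatively integrates" concrete, here is a minimal sketch of one common way to merge per-source confidence scores into a single value: a noisy-OR rule that assumes the evidence sources are independent. This is an illustrative assumption, not necessarily the exact scheme STRING uses; the function name `combine_scores` and the example scores are hypothetical.

```python
from math import prod


def combine_scores(scores):
    """Combine per-source confidence scores (each in [0, 1]) with a noisy-OR rule.

    Assumption: the sources are independent, so the combined score is the
    probability that at least one source supports the interaction.
    """
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range [0, 1]: {s}")
    return 1.0 - prod(1.0 - s for s in scores)


# Hypothetical scores for one protein pair from three evidence sources
# (e.g. genomic context, coexpression, text mining).
example = combine_scores([0.6, 0.4, 0.3])
```

Under this rule, several weak but independent lines of evidence can add up to a stronger combined score than any single source provides.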
9 Microarray Pipeline: Gene Function Prediction (Phydbac)
- Phylogenomic Display of Bacterial Genes (Phydbac)
- The non-homology-based methods identify proteins potentially linked to the pasted sequence. Functional predictions are then made from the annotations of the proteins associated with the query.
- The comparative genomic methods used here are:
  - Phylogenetic profiling: identifies proteins that have evolved in a similar manner to the sequence of interest.
  - Genomic proximity: identifies proteins that are found nearby on one or more genomes.
  - Domain fusion events: determines the Pfam domains that are found fused with domains matching the given sequence.
- Input: a gene name or a sequence in FASTA format
- Scope: bacteria (E. coli K12)
- Access: publicly available through a web interface
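The phylogenetic profiling idea above can be sketched as follows. Each gene is represented by a presence/absence vector across a set of genomes, and genes whose vectors resemble the query's are candidate functional partners. The slides do not describe how Phydbac scores profiles, so the Jaccard similarity used here is an assumption; the gene names, profiles, and function names are hypothetical toy data.

```python
def jaccard(a, b):
    """Jaccard similarity between two presence/absence profiles (sequences of 0/1)."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of genomes")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def rank_by_profile(query, profiles):
    """Return (gene, similarity) pairs sorted from most to least similar to the query."""
    scored = [(name, jaccard(query, profile)) for name, profile in profiles.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical profiles: one column per genome, 1 = a homolog is present.
profiles = {
    "geneA": (1, 1, 0, 1, 0),
    "geneB": (0, 0, 1, 0, 1),
    "geneC": (1, 1, 0, 0, 0),
}
query = (1, 1, 0, 1, 0)

ranking = rank_by_profile(query, profiles)
for name, score in ranking:
    print(f"{name}\t{score:.2f}")
```

Genes at the top of the ranking share the query's pattern of gain and loss across genomes, which is the signal that profiling uses to suggest a shared function.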
10 Microarray Pipeline: Gene Function Prediction
- Get a better understanding of what function prediction is all about: what kinds of methods are available and what kinds of challenges exist.
11 Microarray Pipeline: What to Do Next
- Paper review on gene function prediction
- Integrating different parts together
- Project report writing
12 Microarray Pipeline: GANTT Chart