Title: A distributed and collaborative software system for bioinformatics analysis with applications to gene regulation and variation analysis BCCRC Thursday Seminar Series, Dec 2004 Stephen Montgomery, Genome Sciences Centre
1 A distributed and collaborative software system for bioinformatics analysis with applications to gene regulation and variation analysis
BCCRC Thursday Seminar Series, Dec 2004
Stephen Montgomery, Genome Sciences Centre
2 SCIENTIFIC GOAL: To understand more about the role of mutation in gene regulation.
ENGINEERING GOAL: Improve global access to bioinformatics data, tools, and resources.
3 Sockeye: Integrating bioinformatics data
p53 alignment
SNP density
4 An interaction map of biologists and bioinformaticians
Bioinformaticians
Biologists
5 Things get more complicated
- Each individual has
  - Access to different resources
    - Computational / Monetary / Personnel
  - Finite time available
  - A different social network
  - Professional obligations
- Each group has
  - Organizational boundaries
  - Toolkit (suites and scripts)
  - Method of providing tools (OS, Internet, Interfaces)
6 An improved interaction map of biologists and bioinformaticians
Bioinformaticians
Biologists
Improved Communication
Access to communities
Access to resources
Retain sub-organization
7 A community-based approach to bioinformatics analysis
- Use the principles of peer-to-peer technology
- Allow biologists to easily discover and run bioinformatics tools
- Create a dynamic, reliable network for analysis
- Reduce overlapping integration efforts
- Improve communication within/outside organizations
- Address problems relevant to bioinformatics
  - Attribution
  - Resource distribution
  - Specialized data
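The discover-and-run idea in the list above can be illustrated with a toy, in-memory sketch. Nothing here is Chinook's actual API or protocol: `Peer`, `Network`, `advertise`, `discover`, `run`, and the service names are all hypothetical, and no real networking takes place. The sketch only shows peers advertising services by name, a client finding a provider, and the result being returned together with the provider's name so that the work stays attributed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Peer:
    """Hypothetical node that advertises analysis services by name."""
    name: str
    services: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def advertise(self, service_name: str, func: Callable[[str], str]) -> None:
        self.services[service_name] = func


class Network:
    """Toy in-memory stand-in for a peer-to-peer network (assumption: no real I/O)."""

    def __init__(self) -> None:
        self.peers: List[Peer] = []

    def join(self, peer: Peer) -> None:
        self.peers.append(peer)

    def discover(self, service_name: str) -> List[Peer]:
        # Return every peer currently advertising the requested service.
        return [p for p in self.peers if service_name in p.services]

    def run(self, service_name: str, payload: str) -> Tuple[str, str]:
        providers = self.discover(service_name)
        if not providers:
            raise LookupError(f"no peer advertises {service_name!r}")
        provider = providers[0]
        # Return the provider's name with the result so the work is attributed.
        return provider.name, provider.services[service_name](payload)


def reverse_complement(seq: str) -> str:
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def build_demo_network() -> Network:
    """Two made-up peers, each advertising one made-up service."""
    network = Network()
    lab_a = Peer("lab_a")
    lab_a.advertise("revcomp", reverse_complement)
    lab_b = Peer("lab_b")
    lab_b.advertise("length", lambda seq: str(len(seq)))
    network.join(lab_a)
    network.join(lab_b)
    return network


if __name__ == "__main__":
    net = build_demo_network()
    print(net.run("revcomp", "ATGC"))
```

A real system would also have to handle peers joining and leaving, choosing among several providers, and remote execution; the sketch leaves all of that out.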
8 Discover and run jobs here or through Bioperl
10 Algorithms integrated into Chinook
ClustalW         Genscan
Conreal          Sim4
DIALIGN          MSCAN
LAGAN            ANN-Spec
Mauve            Recursive Gibbs Motif Sampler
ORCA             MEME
Shuffle-LAGAN    Motifsampler
T-Coffee         RSAT oligo analysis
Promoterwise     STUBB
Primer3          Teiresias
Eponine          wConsensus
ELPH             ContigMerger
11 Adding services (GUI-based)
13 Use cases of Chinook
- Grid/Cluster computing.
- Internally connect teams / individuals.
- Collaborate with remote individuals.
- Provide an API layer to your algorithms.
- Insert bioinformatics analysis into applications.
- Show off your tools.
14 Searching for Regulatory Variation
15 How will/does Sockeye/Chinook help?
- We can perform these analyses on a gene-by-gene basis for variant sequences.
- We can load and visualize results against diverse annotation such as
  - known regulatory binding sites,
  - ChIP sites,
  - DNase I hypersensitive sites,
  - ENCODE consortium regions,
  - cisRED information,
  - chimpanzee-derived regions of variance,
  - and various other SNP and mutation resources.
- We are now developing high-throughput approaches.
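Comparing variants against annotation, as described above, amounts to an interval-overlap check at its simplest. The following is a minimal sketch under stated assumptions: it is not how Sockeye or Chinook does it, the `Region` type and `snps_in_regions` function are hypothetical names, and the coordinates in the example are made up for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple


class Region(NamedTuple):
    """Hypothetical annotated region, using 1-based inclusive coordinates."""
    name: str
    start: int
    end: int


def snps_in_regions(
    snp_positions: Iterable[int], regions: List[Region]
) -> Dict[str, List[int]]:
    """Map each region name to the sorted SNP positions that fall inside it."""
    hits: Dict[str, List[int]] = {r.name: [] for r in regions}
    for pos in sorted(snp_positions):
        for region in regions:
            if region.start <= pos <= region.end:
                hits[region.name].append(pos)
    return hits


if __name__ == "__main__":
    # Made-up coordinates for illustration only.
    regions = [Region("site_A", 100, 110), Region("site_B", 105, 120)]
    print(snps_in_regions([121, 107, 100, 99], regions))
```

The nested loop is fine for a single gene with a handful of annotations; a high-throughput version would more likely sort both inputs or use an interval index.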
16 Future Plans
17 Projects involving Chinook
- OrthoSEQ plans to provide analysis through the Chinook/Bioperl Perl API.
- Sockeye uses Chinook to deliver state-of-the-art alignment, PCR prediction, and regulatory analysis.
- Pegasys plans to provide pipeline management to a subset of services advertised by Chinook.
- Bio-Linux is planning to integrate a subset of their algorithms.
- Z-Lab at Boston U. is integrating some of their module prediction algorithms.
18 Acknowledgements
- GENOME SCIENCES CENTRE
- Steven Jones
- Tony Fu
- Jun Guan
- Keven Lin
- Asim Siddiqui
- Genereg team @ GSC
- Mark Mayo
- Bernard Li
Funding: MSFHR, Genome Canada
VISIT: http://smweb.bcgsc.bc.ca OR http://www.bcgsc.bc.ca/chinook