Title: CAMERA
1. CAMERA
- e-Genomics Conference
- Sep 11th - 13th 2006
- Paul Gilna, Calit2, UCSD
2. The CAMERA Partnership
Community Cyberinfrastructure for Advanced Marine
Microbial Ecology Research and Analysis
3. Genomic Data Is Growing Rapidly, But
Metagenomics Will Vastly Increase The Scale
- GenBank: 100 Billion Bases! (www.ncbi.nlm.nih.gov/Genbank)
- Protein Data Bank: 35,000 Structures (www.rcsb.org/pdb/holdings.html)
- Total Data < 1 TB
4. Marine Genome Sequencing Project: Measuring the
Genetic Diversity of Ocean Microbes
5. Metagenomics Will Couple to Earth Observations
Which Add Several TBs/Day
Source: Glenn Iona, EOSDIS Element Evolution
Technical Working Group, January 6-7, 2005
6. Driven by User Needs
- CAMERA serves as one representation of a specific
research community's need for a system to:
- Collect and reference increasing metadata
relevant to environmental metagenome datasets
- Exploit the power of querying on metadata across
multiple geospatial locations
- Have access to a diverse and customizable set of
easy-to-use tools to analyze their data in the
context of collected metagenomic and whole
genomic datasets
- Have ability to update and propagate improvements
to annotations
- Have a pre-publication, pre-submission
collaborative workspace
- Serve a diverse informatics-literate community
7. Services Provided
- Data and Application Services
- Tools and Workflows
- Computational Data, Visualization and
Collaborative environment
- Outreach and Training in Environmental Genomics
8. Data and Application Services
- Primary Data
- Sargasso Sea and Sorcerer II expedition data
- JGI marine and terrestrial environmental datasets
- Moore Microbial Genomes
- JGI and other relevant whole genomes
- Research community submitted datasets
- Submitted 454-based metagenomic datasets
- Publicly available NR protein and DNA sequence
datasets
- Derived Data
- Annotations of datasets
- Assemblies
- Alignments
- Pre-computed clusters
9. Moore Microbial Genome Sequencing Project
Cyanobacteria Being Sequenced by Venter Institute
10. Moore Microbial Genome Sequencing Project
Cyanobacteria Being Sequenced by Venter Institute
11. Sample Metadata from GOS
- Site Metadata
- Location (lat/long, water depth)
- Site characterization (finite list of types plus
other)
- Site description (free text)
- Country
- Sampling Metadata
- Sample collection date/time
- Sampling depth
- Conditions at time of sampling (e.g., stormy,
surface temperature)
- Sample physical/chemical measurements (T (°C), S
(ppt), chl a (mg m^-3), etc.)
- Author
- Experimental Parameters
- Filter size
- Insert size
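The metadata fields listed on this slide could be captured in a simple record type, which also makes the geospatial querying mentioned earlier concrete. The following is a minimal sketch only: the class name, field names, units (micrometres for filter size, base pairs for insert size), and the `in_bounding_box` helper are all assumptions made for illustration, not CAMERA's actual schema or API, and the two sample records are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class SampleMetadata:
    """One sample record. Field names are hypothetical, not CAMERA's schema."""
    # Site metadata
    latitude: float              # decimal degrees, north positive
    longitude: float             # decimal degrees, east positive
    water_depth_m: float
    site_type: str               # one of a finite list of types, or "other"
    site_description: str        # free text
    country: str
    # Sampling metadata
    collected_at: datetime
    sampling_depth_m: float
    conditions: str              # e.g. "stormy"
    measurements: Dict[str, float] = field(default_factory=dict)  # T, S, chl a, ...
    author: Optional[str] = None
    # Experimental parameters (units are assumed)
    filter_size_um: Optional[float] = None
    insert_size_bp: Optional[int] = None


def in_bounding_box(samples: List[SampleMetadata], south: float, north: float,
                    west: float, east: float) -> List[SampleMetadata]:
    """Return samples inside a lat/long box (box must not cross the antimeridian)."""
    return [s for s in samples
            if south <= s.latitude <= north and west <= s.longitude <= east]


# Two invented records, for illustration only (not real GOS data).
samples = [
    SampleMetadata(30.0, -65.0, 4000.0, "open ocean", "invented site A", "n/a",
                   datetime(2006, 1, 1, 12, 0), 5.0, "calm",
                   {"T_degC": 20.0, "S_ppt": 36.0}),
    SampleMetadata(-1.0, -90.0, 20.0, "coastal", "invented site B", "n/a",
                   datetime(2006, 2, 1, 9, 30), 2.0, "stormy"),
]

selected = in_bounding_box(samples, south=20.0, north=40.0, west=-80.0, east=-50.0)
print([s.site_description for s in selected])
```

Keeping measurements in a dictionary rather than fixed fields is one way to accommodate the open-ended "etc." in the slide's list; a production system would more likely use a database with spatial indexing.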
12. Tools and Workflows
- Initial set
- BLAST Server
- Clustering
- HMM/Profile
- Neighborhood analysis
- Multiple sequence alignments
- Assembly
- Proposed New Tools
- Multiple Auto Annotation pipelines
- Fast Sequence lookup
- Customized Assembly
- Phylogenetic Analysis
- Clustering Tools
13. Guiding Philosophy for Development
- Sprint Q4 2006
- Propagate JCVI toolkit and data ASAP
- Mechanism for publication of Sorcerer II data
- Enabler for community
- Defined deliverables, project management approach
- Marathon Q4 2006 onward
- Additional Datasets
- Additional tools
- Community drives prioritization for ongoing
releases
- Advisory Board, Community Outreach
- Keys to success
- Tight integration of science, bioinformatics,
software, and IT
- Matched to Community Needs
14. The Future Home of the Moore Foundation Funded
Marine Microbial Ecology Metagenomics Complex
First Implementation of the CAMERA Complex
Major Buildout of Calit2 Server Room Underway
http://calit2-1101-1.ucsd.edu/
Photo Courtesy Joe Keefe, Calit2
15. Moore CAMERA Production Environment
- Creation of Initial Production Environment, September 2006
- Hardware
- Compute Nodes
- 200 4-CPU Nodes (800 Processing Cores)
- Storage Servers
- 10 systems, ¼ Petabyte raw storage
- Database Servers
- Larger: 20-40 TB; Smaller: 5-10 TB
- Network Management
- Force10 E1200 Router w/12 10GigE Interfaces to Each System Ports
- User Access to Compute Cycles
- Bulk of free cycles available to external users
- Proposal mechanism
Source: Greg Hidley, Calit2; Phil Papadopoulos, SDSC, Calit2
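The hardware figures on this slide can be sanity-checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: one processing core per CPU (which is what the slide's 200-node, 800-core pairing implies), a decimal petabyte (1 PB = 1000 TB), and raw storage spread evenly across the ten storage systems. The variable names are illustrative.

```python
# Compute nodes: 200 nodes with 4 CPUs each, assuming one core per CPU.
nodes = 200
cpus_per_node = 4
total_cores = nodes * cpus_per_node

# Storage: 1/4 petabyte raw across 10 systems, assuming 1 PB = 1000 TB
# and an even split between systems.
storage_systems = 10
raw_storage_tb = 1000 / 4
tb_per_system = raw_storage_tb / storage_systems

# Raw storage available per processing core, in terabytes.
tb_per_core = raw_storage_tb / total_cores

print(total_cores)    # 800
print(tb_per_system)  # 25.0
print(tb_per_core)    # 0.3125
```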
16. Countries are Aggressively Creating Gigabit
Services: Interactive Access to CAMERA and
LOOKING Systems
www.glif.is (Created in Reykjavik, Iceland, 2003)
17. Scale
18. Calit2's Direct Access Core Architecture Will
Create Next Generation Metagenomics Server
- Sargasso Sea Data
- Sorcerer II Expedition (GOS)
- JGI Community Sequencing Project
- Moore Marine Microbial Project
- NASA Goddard Satellite Data
- Community Microbial Metagenomics Data
Traditional User
Request
Response
Web Services
Source: Phil Papadopoulos, SDSC, Calit2
19. OptIPuter Scalable Adaptive Graphics Environment
(SAGE) Allows Integration of HD Streams
OptIPortal: Termination Device for the
OptIPuter Global Backplane
20. Calit2 and the Venter Institute Will Combine
Telepresence with Remote Interactive Analysis
Live Demonstration of 21st Century
National-Scale Team Science
21. Calit2 and the Venter Institute Test CineGrid
with HDTV Movie by John Carter
StarLight Chicago
Sony HDTV JH-3
JCVI
Calit2 Auditorium
JC Venter Institute Rockville, MD
Live Demonstration of 21st Century Entertainment
Delivery June 14, 2006
22. OptIPortal: Termination Device for the OptIPuter
Global Backplane
- 20 Dual CPU Nodes, 20 24" Monitors, ~$50,000
- 1/4 Teraflop, 5 Terabyte Storage, 45 Mega
Pixels--Nice PC!
- Scalable Adaptive Graphics Environment (SAGE)
Jason Leigh, EVL-UIC
Source: Phil Papadopoulos, SDSC, Calit2
23. UIC/UCSD 10GE CAVEWave on the National
LambdaRail: Emerging OptIPortal Sites
OptIPortals (map labels, two marked NEW!): UW, UIC EVL, MIT, JCVI, UCI, UCSD, SIO, SunLight, SDSU, CICESE
CAVEWave Connects Chicago to Seattle to San
Diego and Washington D.C. as of 4/1/06 and JCVI
as of 5/15/06
24. First Remote Interactive High Definition Video
Exploration of Deep Sea Vents
Canadian-U.S. Collaboration
Source: John Delaney and Deborah Kelley, UWash
25. High Definition Still Frame of Hydrothermal Vent
Ecology 2.3 Km Deep
Source: John Delaney and Research Channel, U
Washington
White Filamentous Bacteria on 'Pill Bug' Outer
Carapace
26. A Near Future Metagenomics Fiber Optic-Enabled
Data Generator
Source: John Delaney, UWash