Title: The California Institute for Telecommunications and Information Technology
1 Cyber Metagenomics Challenge to See the Unseen Majority in the Ocean
Kayo Arima
California Institute for Telecommunications and Information Technology (Calit2), University of California, San Diego Division
2 Looking Back Nearly 4 Billion Years in the Evolution of Microbe Genomics
Eukaryotes have nuclei. Prokaryotes have genes but no nuclear membrane.
Source: Falkowski and Vargas, Science 304 (5667): 58
3 Evolution is the Principle of Biological Systems: Most of Evolutionary Time Was in the Microbial World
Source: Carl Woese, et al.
4 Two Completely Different Approaches to Getting Microbial Genomic Information
Microbial whole genomics: Environmental sample -> Culture (grow) in lab -> Isolate the colony -> Culture the isolated colony -> DNA extraction -> Enzymatic digestion -> Shotgun sequencing -> Gene assembly
Metagenomics: Environmental sample -> DNA extraction -> Enzymatic digestion -> Shotgun sequencing -> Scaffold assembly
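Both workflows end with shotgun sequencing followed by assembly. The toy sketch below illustrates that idea only: it cuts a short made-up sequence into overlapping reads and stitches them back together by greedily merging the pair with the longest suffix/prefix overlap. The function names, the fixed read stride, and the example sequence are all assumptions for illustration; real reads start at random positions and real assemblers are far more elaborate.

```python
def shotgun_reads(genome, read_len, step):
    """Cut overlapping reads at a fixed stride (a simplification of random shotgun reads)."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the result stays as separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Hypothetical 25-base sequence, used only to exercise the sketch.
toy_genome = "ATGGCGTACGTTAGCCTAGGATCCA"
toy_reads = shotgun_reads(toy_genome, read_len=10, step=5)
contigs = greedy_assemble(toy_reads)
```

With a single cultured isolate, all reads come from one genome, so a merge like this can in principle converge on one sequence. In a metagenomic sample the reads come from many organisms, which is one reason the output is a set of scaffolds rather than a finished genome.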
5 Downsides of Metagenomics
- Often fragmentary
- Often highly divergent
- Rarely any known activity
- No chromosomal placement
- No organism of origin
- Ab initio ORF predictions
- Huge data
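Because metagenomic fragments usually have no known organism of origin, genes are often predicted ab initio, directly from the sequence. The sketch below is a deliberately minimal version of that idea, assuming the standard start codon ATG and stop codons TAA, TAG, and TGA, and scanning only the three forward reading frames. The function name, the `min_codons` threshold, and the example sequence are hypothetical; real gene finders also handle the reverse strand, alternative start codons, and ORFs truncated at fragment ends.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs running from ATG to a stop codon.

    `end` is exclusive and includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next start
    return orfs


# Hypothetical fragment: ATG AAA TTT GGG TAA embedded in flanking bases.
example_orfs = find_orfs("CCATGAAATTTGGGTAACC")
```

An ORF that is still open when the fragment ends is silently dropped here, which mirrors the "often fragmentary" problem listed above.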
6 Genomic Data Is Growing Rapidly, But Metagenomics Will Vastly Increase the Scale
- GenBank: 100 billion bases (www.ncbi.nlm.nih.gov/Genbank)
- Protein Data Bank: 35,000 structures (www.rcsb.org/pdb/holdings.html)
- Total data: < 1 TB
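As a rough sanity check on the scale quoted on this slide, the arithmetic below estimates the raw storage needed for 100 billion bases. The two encodings (2 bits per base for a packed A/C/G/T alphabet, 8 bits per base for plain text) are assumptions for illustration, and the estimate ignores annotations, headers, and quality data.

```python
GENBANK_BASES = 100_000_000_000  # 100 billion bases, the figure quoted on the slide
ONE_TB = 10**12                  # decimal terabyte, in bytes


def sequence_bytes(n_bases, bits_per_base):
    """Raw storage in bytes for n_bases at a given encoding density."""
    return n_bases * bits_per_base // 8


packed_bytes = sequence_bytes(GENBANK_BASES, 2)  # assumed 2-bit packed encoding
ascii_bytes = sequence_bytes(GENBANK_BASES, 8)   # assumed one byte per base

print(f"packed: {packed_bytes / 1e9:.0f} GB, plain text: {ascii_bytes / 1e9:.0f} GB")
```

Under either assumption the raw sequence fits well below 1 TB, which is consistent with the total quoted above.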
7 Full Genome Sequencing is Exploding: Most Sequenced Genomes are Bacterial
Figure labels: First Genome 1995; 6 Genomes/ Year 2000; Ongoing Genomes; Completed Genomes; 90 Metagenomes; Total 422; Total 1665
www.genomesonline.org
8 Marine Metagenomics
- Microbes account for more than 90% of ocean biomass, mediate all biochemical cycles in the oceans, and are responsible for 98% of primary production in the sea.
- Metagenomics is a breakthrough sequencing approach to examine open-space microbial species without the need for isolation and lab cultivation of individual species.
9 PI: Larry Smarr
Paul Gilna, Ex. Dir.
10 Marine Genome Sequencing Project: Measuring the Genetic Diversity of Ocean Microbes
Sorcerer II data from this area has already reached 10% of GenBank. The entire data set will double the number of proteins in GenBank!
11 Sample Metadata from GOS
- Site Metadata
  - Location (lat/long, water depth)
  - Site characterization (finite list of types plus other)
  - Site description (free text)
  - Country
- Sampling Metadata
  - Sample collection date/time
  - Sampling depth
  - Conditions at time of sampling (e.g., stormy, surface temperature)
  - Sample physical/chemical measurements (T (°C), S (ppt), chl a (mg m^-3), etc.)
  - Author
- Experimental Parameters
  - Filter size
  - Insert size
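The metadata fields listed on this slide map naturally onto a small record type. The sketch below is one possible layout, not the schema actually used for GOS: the class names, field names, and the units assumed for filter size (micrometres) and insert size (base pairs) are hypothetical, and every value in the example record is made up.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Optional


@dataclass
class SiteMetadata:
    latitude: float
    longitude: float
    water_depth_m: float
    site_type: str          # one of a finite list of types, or "other"
    description: str = ""   # free text
    country: str = ""


@dataclass
class SamplingMetadata:
    collected_at: datetime
    sampling_depth_m: float
    conditions: str = ""                          # e.g. "stormy"
    temperature_c: Optional[float] = None         # T (deg C)
    salinity_ppt: Optional[float] = None          # S (ppt)
    chlorophyll_a_mg_m3: Optional[float] = None   # chl a (mg m^-3)
    author: str = ""


@dataclass
class ExperimentalParameters:
    filter_size_um: float   # unit is an assumption
    insert_size_bp: int     # unit is an assumption


@dataclass
class SampleRecord:
    site: SiteMetadata
    sampling: SamplingMetadata
    experiment: ExperimentalParameters


# Entirely made-up values, used only to show how a record is built.
record = SampleRecord(
    site=SiteMetadata(0.0, 0.0, 1000.0, "open ocean", country="Hypothetical"),
    sampling=SamplingMetadata(datetime(2000, 1, 1, 12, 0), 5.0, conditions="calm"),
    experiment=ExperimentalParameters(filter_size_um=0.1, insert_size_bp=2000),
)
flat = asdict(record)  # nested dict, convenient for export or search
```

Keeping the three groups as separate types mirrors the grouping on the slide and lets optional measurements stay empty when they were not recorded.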
13 Marine Metagenomics
Metabolic pathway discovery
Drug discovery
Microbial genetic survey
Environmental survey
Symbiosis
Who is there?
Evolution study
Endosymbiosis
Organism discovery
Bioenergy discovery
Microbial genomic survey
Biogeochemistry mapping
Marine conservation