Title: JGI Timeline
1JGI Timeline
Joint Genome Institute (JGI)
Non-Traditional User Facility
2- The JGI Post Human Genome Project
- Community Sequencing Program (CSP)
- Microbial Community Genomics
4Overview
The Community Sequencing Program (CSP): to
provide the scientific community, through a
peer-reviewed process, access to high-throughput
sequencing at the JGI.
5User Guide > How to Propose a Project
What types of projects will the JGI/CSP
accept? A wide range of projects. Ultimately,
the most important factor in determining if a
project will be accepted is its scientific merit.
6Proposals Peer Review Process General
Scientific Users Proposals
Designated Lab Director
Proposal Study Panel
Scientific Advisory Committee
Users
JGI Director
Sequence Allocation
7What can researchers get from the CSP
program? The deliverables can range from raw
sequence traces to well-annotated assembled
genomes depending on the request in the proposal.
8Scientific Support for Approved Projects
Interactions of the JGI and Scientific Users
with Approved Sequencing Proposals
Production Sequencing
Users
Scientific Support Group (SSG)
Informatic Analysis of Sequence
9Scientific Support for Approved Projects
Interactions of the JGI and Scientific Users
with Approved Sequencing Proposals
Production Sequencing
DOE (GTL, Microbe)
CSP
Gov Agencies (EPA, USDA, NSF)
Scientific Support Group (SSG)
Informatic Analysis of Sequence
10DOE
Production Sequencing
Informatics
JGI Science Programs
11DOE, CSP, Gov A
Scientific Support Group
Informatics
Production Sequencing
JGI Science Programs
13- <1% of microbes are culturable
- Many unculturables live in interdependent consortia of considerable diversity
- Aim to recover genome-scale sequences and reveal metabolic capabilities
- What is the structure of natural microbial populations? What is a microbial species? Can we harness their metabolic capabilities?
14What Environments to Study?
- Ones with minimal microbial complexity
15Iron Mountain
Jill Banfield et al. UC Berkeley
16Iron Mountain
Superfund site discharging >1 ton of toxic
metals/day (pH <1); FeS2
17whole metagenome shotgun dataset
18Environmental Sample
Purify High Molecular Weight DNA
Fosmid Library Construction
Shotgun Library Construction
Fosmid Insert End Sequencing
DNA Sequencing
Assembly Annotation
19Environmental Sample
Purify High Molecular Weight DNA
When possible culture isolates
Fosmid Library Construction
Fosmid Insert End Sequencing
DNA Sequencing
Assembly Annotation
?
20Iron Mtn whole metagenome shotgun GC content separates into two components
[Figure: forward read average GC plotted against reverse read average GC; two clusters labeled bacteria and archaea]
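The kind of per-read GC calculation behind such a plot can be sketched as follows. This is a minimal illustration, not the JGI pipeline: the function names and the toy read pairs are hypothetical, and ambiguous bases (for example N) are simply ignored.

```python
def gc_fraction(read: str) -> float:
    """Fraction of called bases (A, C, G, T) in `read` that are G or C."""
    seq = read.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    called = gc + at
    # Reads with no called bases get 0.0 rather than raising.
    return gc / called if called else 0.0


def pair_gc(pairs):
    """Turn (forward, reverse) read pairs into (forward GC, reverse GC) points."""
    return [(gc_fraction(fwd), gc_fraction(rev)) for fwd, rev in pairs]


# Hypothetical clone end pairs, NOT taken from the Iron Mountain dataset.
toy_pairs = [
    ("GCGCGGCCATGC", "GGCCGCGATCGC"),  # both ends GC-rich
    ("ATATTAGCATTA", "TTAATAGCTATA"),  # both ends AT-rich
]
points = pair_gc(toy_pairs)
```

Because both ends of a clone come from the same genome, pairs from one organism tend to land near the diagonal, which is why organisms with different base composition can show up as separate clusters.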
21Iron Mountain whole metagenome shotgun GC and depth distributions
[Figure: read depth distribution (axis ticks at 3 and 10). Bacterial: Lepto III, Lepto II]
22[Figure: read depth distributions (axis ticks at 3 and 10). Bacterial: Lepto III, Lepto II. Archaeal: Fer 1 (cultured and sequenced), G-plasma, Fer 2]
23Stoichiometry
[Figure: read depth distributions (axis ticks at 3 and 10). Bacterial: Lepto III (1X), Lepto II (3X). Archaeal: Fer 1 (1X), G-plasma (1X), Fer 2 (3X)]
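A rough way to turn read depth into the relative stoichiometry shown on this slide is sketched below. It assumes depth is estimated as (number of reads x mean read length) / genome length; the function names and the numbers in the usage lines are hypothetical, not values from the dataset.

```python
def read_depth(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Average depth of coverage: total sequenced bases divided by genome length."""
    return n_reads * mean_read_len / genome_len


def relative_abundance(depths: dict) -> dict:
    """Express each organism's depth as a multiple of the lowest depth observed."""
    lowest = min(depths.values())
    return {name: depth / lowest for name, depth in depths.items()}


# Hypothetical depths for two organisms, chosen only to illustrate a 1X : 3X ratio.
example = relative_abundance({"organism_A": 3.0, "organism_B": 9.0})
```

Genomes sampled at similar depth are present in similar proportions, so a depth ratio near 3 corresponds to the 3X labels above.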
24[Figure: read depth distributions (axis ticks at 3 and 10). Bacterial: Lepto III, Lepto II. Archaeal: Fer 1, G-plasma, Fer 2]
Other sampled genomes at low depth (including eukaryotes): 15% of reads
25Similarity of Fer1 (isolate) to Sequence in Community
[Figure: histogram of mixed community reads. x-axis: identity to cultivated Fer1 isolate (0.50 to 1.0); y-axis: number of reads. Labels: 98-100%, Fer2, Fer1, G-plasma]
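A histogram like this one can be built from per-read identities. The sketch below assumes each read has already been aligned to the Fer1 isolate and that gaps are written as "-"; the function names and the 10-point bin width are illustrative choices, not the method used for the slide.

```python
from collections import Counter


def identity(aligned_read: str, aligned_ref: str) -> float:
    """Fraction of aligned (non-gap) columns where read and reference agree."""
    if len(aligned_read) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for a, b in zip(aligned_read.upper(), aligned_ref.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        matches += a == b
    return matches / compared if compared else 0.0


def identity_histogram(identities, bin_width_pct: int = 10) -> Counter:
    """Count reads per identity bin; keys are the lower bin edge in percent."""
    bins = Counter()
    for value in identities:
        pct = round(value * 100)  # integer percent avoids float binning errors
        bins[(pct // bin_width_pct) * bin_width_pct] += 1
    return bins
```

Reads from the same strain as the isolate pile up in the top bin, while reads from related organisms form separate peaks at lower identity.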
26Conclusions So Far
- The stoichiometry of organisms is encouraging for the assembly of individual genomes
- Assemblies support 16S studies suggesting limited diversity
- Isolated Fer1 genome sequence matches genome in environmental sample
27How do we know that our assembly is correct?
28How do you know you've done it right? Check paired ends against scaffold
How do we know that our assembly is correct?
- At the gross level, check pairs (expect a few inconsistent pairs due to failing/chimeric clones)
- Align all reads back against assembled scaffolds; scaffolds end where there is no clone coverage in 3 kb plasmids
- Identifies potentially repetitive areas and/or rearrangements
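The paired-end check can be sketched as a small classifier over read placements. Everything here is an assumption made for illustration: the `Placement` record, the 700 bp read length, and the 2,000 to 4,000 bp window around the 3 kb plasmid inserts are hypothetical, and `start` is taken to be the leftmost scaffold coordinate of each read.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    scaffold: str   # scaffold the read aligned to
    start: int      # leftmost coordinate of the read on that scaffold
    forward: bool   # True if the read aligns to the forward strand


def classify_pair(a: Placement, b: Placement, read_len: int = 700,
                  min_insert: int = 2000, max_insert: int = 4000) -> str:
    """Label a mate pair as consistent with the assembly or not."""
    if a.scaffold != b.scaffold:
        return "different_scaffolds"
    if a.forward == b.forward:
        return "same_orientation"
    left, right = (a, b) if a.start <= b.start else (b, a)
    if not left.forward:
        return "outward_facing"  # mates point away from each other
    span = right.start + read_len - left.start
    return "ok" if min_insert <= span <= max_insert else "bad_insert_size"


good = classify_pair(Placement("s1", 100, True), Placement("s1", 2500, False))
far = classify_pair(Placement("s1", 100, True), Placement("s1", 20000, False))
```

A handful of failing pairs is expected from chimeric clones; clusters of failing pairs in one place are the signal that points to repeats or rearrangements.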
29Fer2 vs. Fer1 shows local synteny
- Fer1 and Fer2 have avg. nt identity of 78%
Fer2 gene on contig
Fer1 gene on contig
30What does it mean to assemble a community genome?
- Sample derived from millions of genomes.
- ?
- What is a species in the environment?
- Members of the same species:
  - significantly different (many lineages survive and diverge)
  - highly similar (selective sweeps)
31What does it mean to assemble a community genome?
- Lepto II: 1 nucleotide variation / 3,000 bp
- Fer II: 2.2 nucleotide variations / 100 bp
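To put the two rates on the same scale, the arithmetic can be written out as below. The helper name is hypothetical; the inputs are the two figures quoted on the slide.

```python
def variants_per_kb(n_variants: float, span_bp: int) -> float:
    """Normalize a variant count over `span_bp` bases to variants per 1,000 bp."""
    return 1000.0 * n_variants / span_bp


lepto2_rate = variants_per_kb(1, 3000)   # roughly 0.33 per kb
fer2_rate = variants_per_kb(2.2, 100)    # roughly 22 per kb
fold_difference = fer2_rate / lepto2_rate
```

On these numbers the Fer II population carries roughly 66 times more nucleotide variation per base than Lepto II.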
32Five Reads of the Same Sequence from 5 Different Members of the Same Species (FerII)
CONSENSUS   130953 gtttatattaaatccattgatttctaagcttccggttcttcttccgtataatggagattt 131012
XYG46314.b1    162 A.......C........................A...........A.............. 103
XYG44123.b1    673 A.......C........................A...........A.............. 732
XYG44918.b1     48 A.......C........................A...........                4
XYG13291.g3      2                                                   .......... 11
XYG40116.g1    192 ......G..................................................... 133
XYG3051.b2     396 ......G..................................................... 455

CONSENSUS   131013 atagcttaataattcatcctccatcatacttatgcttgaacctgataatattatgtatag 131072
XYG46314.b1    102 ............................................................ 43
XYG44123.b1    733 ............................................................ 792
XYG13291.g3     12 ............................................................ 71
XYG40116.g1    132 ...A........................................................ 73
XYG3051.b2     456 ...A........................................................ 515

CONSENSUS   131073 ccttgtagtatccattaattcatcaaatattttctgcattatagatataataccatggtt 131132
XYG46314.b1     42 ..........................................                   1
XYG44123.b1    793 ........................                                     816
XYG13291.g3     72 ............................................................ 131
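One simple way to see haplotypes in a dot-notation alignment like this is to group reads by their pattern of mismatches against the consensus. The sketch below uses the first 31 alignment columns of the five reads that span them, copied from the slide; the function names are hypothetical, and no allowance is made for sequencing errors.

```python
from collections import defaultdict

# First 31 columns of the alignment; '.' means "same base as the consensus".
rows = {
    "XYG46314.b1": "A.......C......................",
    "XYG44123.b1": "A.......C......................",
    "XYG44918.b1": "A.......C......................",
    "XYG40116.g1": "......G........................",
    "XYG3051.b2":  "......G........................",
}


def mismatch_signature(row: str) -> tuple:
    """Columns and bases where a read differs from the consensus."""
    return tuple((col, base) for col, base in enumerate(row) if base != ".")


def group_by_signature(alignment: dict) -> list:
    """Group read names that share exactly the same mismatch pattern."""
    groups = defaultdict(list)
    for name, row in alignment.items():
        groups[mismatch_signature(row)].append(name)
    return list(groups.values())


haplotypes = group_by_signature(rows)
```

On this window the five reads fall into two groups, one of three reads and one of two, which is the pattern the haplotype slides draw attention to.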
33Two Haplotypes Among the
5 Different Members of the Same
Species (FerII)
35Polymorphisms occur in blocks
[Figure tracks: polymorphic sites; local depth; ORFs]
- Long quiet regions separate highly variable segments
- Variation is found in blocks of 5-10 genes
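Blocks of variation separated by quiet regions can be located with a fixed-window count of polymorphic sites. This is only a sketch: the window size, the threshold, and the site positions in the usage lines are hypothetical, and positions are assumed to be 0-based and smaller than the contig length.

```python
def variable_blocks(sites, length: int, window: int = 1000, min_sites: int = 3):
    """Return (start, end) intervals where consecutive windows are highly variable.

    `sites` holds 0-based positions of polymorphic sites on a contig of
    `length` bases; a window counts as variable if it has >= `min_sites` sites.
    """
    n_windows = (length + window - 1) // window
    counts = [0] * n_windows
    for pos in sites:
        counts[pos // window] += 1

    blocks = []
    start = None
    for i, count in enumerate(counts):
        if count >= min_sites:
            if start is None:
                start = i          # open a new block
        elif start is not None:
            blocks.append((start * window, i * window))
            start = None           # close the block at a quiet window
    if start is not None:
        blocks.append((start * window, length))
    return blocks


# Hypothetical polymorphic site positions on a 10 kb contig.
toy_sites = [100, 200, 300, 1100, 1200, 1300, 5000, 8100, 8200, 8900]
toy_blocks = variable_blocks(toy_sites, length=10000)
```

Intersecting the resulting intervals with ORF coordinates would then give the number of genes each variable block spans.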
36Summary of Iron Mountain Biofilm
- Limited number of predominant species present in biofilm; the majority have never been cultured
- Several lines of evidence suggest that we can assemble genomes of these organisms
- Simplicity of community suggests removal of most variants by natural selection
- Now studying the metabolic capabilities of microbes