Sequencing the Maize Genome - PowerPoint PPT Presentation

Description:

Draft sequence of the maize genome. All BACs: shotgun & pre-finishing ... (Hind III, EcoR I, MboI ; 27X genome coverage (~150kb inserts) BAC End. Sequencing ~800,000 ... – PowerPoint PPT presentation

Slides: 41
Provided by: mariom2
Transcript and Presenter's Notes

1
Sequencing the Maize Genome
Maize Genome Sequencing Consortium
rwilson_at_watson.wustl.edu
2
Sequencing Progress
3
A 22 Mb sequence contig on Maize chromosome 4
Maize Chr4
Genetic
Physical
Synteny
4
Plans & Milestones
  • 22 Mb contig on chromosome 4
  • Analysis publication
  • Draft sequence of the maize genome
  • All BACs: shotgun & pre-finishing (?)
  • End of the calendar year
  • Announce at the Maize Meeting in D.C.
  • Completion of the maize genome sequence
  • Version 1.0
  • Analysis Publication
  • Future Work
  • Secondary Annotation
  • Clean-up sequencing, maintenance

5
Maize Genome Sequencing at Arizona
Rod A. Wing
Arizona Genomics Institute, BIO5, Department of
Plant Sciences, University of Arizona
6
BAC by BAC Strategy to Sequence the Maize Genome
Maize B73 Genome (2300 Mb)
BAC library construction (Hind III, EcoR I, MboI;
27X genome coverage, ~150 kb inserts)
Genetic Anchoring: in silico, overgo hybridization
(19,292)
Framework
BAC End Sequencing: ~800,000
Fingerprinting: 460,000 BACs
BAC physical maps (HICF Agarose)
FPC databases (Agarose and HICF)
STC database
Choose a seed BAC (800 Kb spacing)
Shotgun sequencing and finishing
STC database search, FP comparison
Determine minimum overlap BACs
Complete maize genome sequence
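The library and seed-spacing figures on this slide can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the slide's round numbers (2,300 Mb genome, 150 kb average insert, 800 kb seed spacing) rather than measured values.

```python
import math

GENOME_MB = 2300        # B73 genome size quoted on the slide
INSERT_KB = 150         # approximate average BAC insert size
SEED_SPACING_KB = 800   # target spacing between seed BACs


def clones_for_coverage(coverage, genome_mb=GENOME_MB, insert_kb=INSERT_KB):
    """Number of clones whose combined insert length equals `coverage` genomes."""
    return round(coverage * genome_mb * 1000 / insert_kb)


def seed_bacs_needed(genome_mb=GENOME_MB, spacing_kb=SEED_SPACING_KB):
    """Number of evenly spaced seed BACs at the given spacing."""
    return math.ceil(genome_mb * 1000 / spacing_kb)


if __name__ == "__main__":
    print(clones_for_coverage(27))  # 414000
    print(seed_bacs_needed())       # 2875
```

Under these assumptions, 27X coverage corresponds to about 414,000 clones, and an 800 kb spacing implies roughly 2,875 seed BACs, the same order of magnitude as the 3,400 seed BACs reported on the Clone Picking Progress slide.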
7
Estimated Chromosomal Coverage
[Chart: Percentage (y-axis) by Chromosomes 1-10 (x-axis); series: Physical, Genetic]
The chromosomal coverage is based on maize cv. Seneca 60.
8
Minimum Tiling Path Pipeline (CSHL/AGI)
  • BAC end sequences (BES) of potential BACs are
    BLASTed against the seed BACs.
  • Results are classified based on location on the
    physical map.
  • A table of filtered BLAST results is created
    for each BAC, with links to CMap and GBrowse.
  • BLAST results are imported into CMap and
    GBrowse with additional information such as
    trace files and FPCs.
  • A table of alignments between the seed BAC and
    the BAC end sequences contains links to CMap
    and GBrowse.
  • CMap displays the FPC data for the seed BAC
    and the candidate BACs to pick.
  • GBrowse provides an alignment of the BES with
    the seed sequence and displays the trace data.
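The selection step at the end of this pipeline can be sketched as a filter over BES-to-seed BLAST hits. This is a minimal illustration, not the CSHL/AGI implementation: the `BesHit` record, the identity and length thresholds, and the contig names are all assumptions, and read orientation is ignored for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BesHit:
    clone: str         # candidate BAC name (hypothetical)
    fpc_contig: str    # FPC contig the candidate is assigned to
    seed_start: int    # hit start on the seed BAC sequence (1-based)
    seed_end: int      # hit end on the seed BAC sequence
    identity: float    # percent identity of the BLAST alignment


def pick_min_overlap(hits: List[BesHit], seed_len: int, seed_contig: str,
                     min_identity: float = 98.0, min_len: int = 300,
                     side: str = "right") -> Optional[BesHit]:
    """Return the candidate whose BES hit implies the smallest overlap
    with the seed BAC on the requested side, or None if none qualifies."""
    usable = [
        h for h in hits
        if h.fpc_contig == seed_contig               # agrees with the physical map
        and h.identity >= min_identity               # near-identical alignment
        and h.seed_end - h.seed_start + 1 >= min_len # long enough to trust
    ]
    if not usable:
        return None
    if side == "right":
        # Overlap runs from the hit start to the right end of the seed.
        return min(usable, key=lambda h: seed_len - h.seed_start + 1)
    # Overlap runs from the left end of the seed to the hit end.
    return min(usable, key=lambda h: h.seed_end)
```

A hit that places a candidate's end close to the seed's edge gives a short overlap, so less sequence is read twice; the FPC contig check stands in for the fingerprint comparison that the pipeline performs through CMap.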

9
Clone Picking Progress
  • Seed BACs: 3,400, complete
  • Clone walking from seed BACs: 12,824, complete
  • Total clones picked: 16,224 (169 96-well
    plates)
  • 15,400 successful: 7,800 in Year 1 and 7,600
    in Year 2
  • Gap-filling: 600 in Year 3, in progress

10
Clone Picking
  • Clone Walking
  • By sequence if seed BAC sequence was available
  • By fingerprints when no sequence was available
  • Clone verification
  • BAC end sequence
  • Seed BAC sequence

11
Library Picking
  • 60 cycles to look through 1,221 384-well
    plates for 16,320 clones

12
BAC End Sequencing (for Clone Verification)
170 96-well plates for 16,320 clones,
generating 48,960 BES (two forward, one reverse)
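The plate and read counts on the clone-picking slides are internally consistent, which a few lines of arithmetic can confirm. The helper names below are hypothetical; the numbers are taken from the slides.

```python
import math

WELLS_PER_PLATE = 96


def plates_needed(n_clones, wells=WELLS_PER_PLATE):
    """Smallest number of plates that holds n_clones, one clone per well."""
    return math.ceil(n_clones / wells)


def bes_reads(n_clones, reads_per_clone=3):
    """Total BAC end reads at two forward reads plus one reverse read per clone."""
    return n_clones * reads_per_clone


if __name__ == "__main__":
    picked = 3400 + 12824                   # seed BACs + clone-walking BACs
    print(picked, plates_needed(picked))    # 16224 169
    clones = 170 * WELLS_PER_PLATE          # 170 full plates for verification
    print(clones, bes_reads(clones))        # 16320 48960
```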
13
DNA Preparation and Shearing
170 96-well plates for 16,320 clones
10 plates each month
2.5 plates per person
14
MegaContig 182 in Maize Genome and Its Synteny to Rice
Maize Chr4
All ordered and orientated
26 Mb
Genetic
Physical
Synteny
15
Maize Pseudomolecules for Rice Syntenic Chr3S
6.9 Mb (1.5 gap/BAC)
7.2 Mb (1.7 gap/BAC)
Maize Chr9L
Rice Chr3S
Maize Chr1S
16
Maize Production Sequencing
lfulton_at_watson.wustl.edu
17
Maize Production Goals
  • BAC End Sequencing of 220,000 Clones
  • Fosmid End Sequencing of 500,000 Clones
  • Shotgun of 16,000 BAC Clones

18
Maize BAC End Sequences
  • 580,000 reads processed
  • 567 bp average read length
  • 60% success

19
Maize Fosmid End Sequences
  • 850,000 reads processed
  • 79% success
  • 543 bp average read length
  • Completed today

20
Library Construction Pipeline
  • Receipt of sheared DNA from AGI
  • Size selection of insert DNA
  • Ligation into pSMART vector

21
  • Constructed 17,034 Libraries as of August 31st

22
Average Fail Rate for Library Construction was
less than 5%
23
Shotgun Criteria
  • 3.5X coverage
  • Clone size verification
  • 50% paired ends
  • BES agreement
  • 25% of clones failed
  • 22% need more data
  • 3% BES disagreement
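The criteria above amount to a per-clone pass/fail check. The sketch below shows one way to express it; the `ShotgunStats` record and the mapping of each failed criterion to an outcome are assumptions for illustration, not the production logic.

```python
from dataclasses import dataclass


@dataclass
class ShotgunStats:
    coverage: float          # fold sequence coverage of the clone
    paired_fraction: float   # fraction of reads with a mate pair (0.0-1.0)
    size_verified: bool      # assembled size agrees with the expected insert
    bes_agree: bool          # assembly agrees with the clone's BAC end reads


def classify(stats, min_coverage=3.5, min_paired=0.50):
    """Return 'pass', 'needs more data', or 'BES disagreement'."""
    if not stats.bes_agree:
        # Assumed to mean a clone mix-up, which more reads cannot fix.
        return "BES disagreement"
    if (stats.coverage < min_coverage
            or stats.paired_fraction < min_paired
            or not stats.size_verified):
        # Assumption: every other shortfall is handled by adding data.
        return "needs more data"
    return "pass"
```

For example, `classify(ShotgunStats(3.0, 0.6, True, True))` returns `'needs more data'` because the coverage is below 3.5X.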

24
Shotgun Complete for 12,211 Clones as of August 31st
25
Final Production Work
  • 660 Clones Need Library Construction
  • 2,100 Clones In Production Pipeline
  • Expected Completion Date: December 2007

26
Sequence Improvement
Bob Fulton, Dick McCombie, Rod Wing
27
Sequence Improvement Pipeline
  • Shotgun_done triggers the prefinishing pipeline
  • Initial identification of "Do Finish" regions
  • Manual sorting and use of autoedit (Gordon) to
    break apart misassemblies
  • Autofinish (Gordon) used to choose directed
    reactions for all gaps and low-quality regions
    within "Do Finish" regions
  • Reassembly and 2nd iteration of prefinishing
    pipeline
  • Final identification of "Do Finish" regions and
    handoff to finishing pipeline

28
Clone Improvement through the Prefinishing Pipeline
30
Assembly View-Entire Clone
Coverage (green)
Spanning Plasmids
End
31
Assembly View-Do Finish Region
EST sequence
GSS sequence
Do Finish
Repeat Tags
32
Alignment with cDNA read pairs
Alignment with End Sequences
35
Maize GenBank Submissions
Joanne Nelson
36
Submission Landmarks
HTGS_FULLTOP
HTGS_PREFIN
HTGS_ACTIVEFIN
HTGS_IMPROVED
37
Improved Sequence

Non-repetitive portions of the sequence have had
sequence improvement (directed attempts) and
have been labeled as improved. Improved
regions are double stranded, sequenced with an
alternate chemistry or covered by high quality
data (i.e. phred quality greater than or equal
to 30 or approval by an experienced finisher),
unless otherwise noted. Regions of low sequence
complexity (such as dinucleotide repeats and
small unit tandem repeats) in the improved
regions have not been resolved to previously
established finishing standards. BAC end
sequence, cot and methyl filtered genome survey
sequence and data from overlapping projects of
strain B73 may have been included in this
project. Where possible, contigs have been
ordered and oriented based on read pairing.
These regions are designated as scaffolds.
Additional order and orientation will be provided
upon completion of detailed analysis of the
complete finished tiling path.
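For readers unfamiliar with the phred threshold cited above: a phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q30 means about one error in 1,000 bases. The sketch below illustrates the conversion; the function names are hypothetical and the quality list in the usage note is made up.

```python
import math


def phred_to_error_prob(q):
    """Estimated probability that a base call with phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an estimated error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def fraction_at_or_above(quals, threshold=30):
    """Fraction of bases whose phred score meets the threshold."""
    if not quals:
        raise ValueError("quals must not be empty")
    return sum(q >= threshold for q in quals) / len(quals)
```

For instance, `fraction_at_or_above([40, 30, 29, 10])` is `0.5`, because two of the four bases reach Q30.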
38
Improved Sequence
FEATURES             Location/Qualifiers
     source          1..184604
                     /organism="Zea mays"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:4577"
                     /chromosome="1"
                     /clone="CH201-132J17 ZMMBBc0132J17"
     misc_feature    1..69252
                     /note="scaffold_name:Scaffold1"
     misc_feature    1..34245
                     /note="assembly_name:Contig28
                     vector_side:SP6"
     misc_feature    32401..34245
                     /note="Improved sequence."
     unsure          34230..34245
                     /note="Non-repetitive but unresolved region"
     gap             34246..34345
                     /estimated_length=unknown
     misc_feature    34346..68071
                     /note="assembly_name:Contig27"
     misc_feature    34346..36695
                     /note="Improved sequence."
     unsure          34346..34356
                     /note="Non-repetitive but unresolved region"
     misc_feature    38146..46795
                     /note="Improved sequence."
     gap             68072..68171
                     /estimated_length=unknown
     misc_feature    68172..69252
                     /note="assembly_name:Contig14"
     gap             69253..69352
                     /estimated_length=unknown
     misc_feature    69353..132243
                     /note="scaffold_name:Scaffold2
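Feature locations in the table above are inclusive, 1-based ranges, so a span's length is end - start + 1. The sketch below applies that to the gap and "Improved sequence" locations copied from this slide; the helper name is hypothetical, and only simple `start..end` locations are handled.

```python
import re

LOCATION = re.compile(r"^(\d+)\.\.(\d+)$")


def span_length(location):
    """Length in bases of a simple 'start..end' feature location."""
    match = LOCATION.match(location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1


gaps = ["34246..34345", "68072..68171", "69253..69352"]
improved = ["32401..34245", "34346..36695", "38146..46795"]

if __name__ == "__main__":
    print([span_length(loc) for loc in gaps])         # [100, 100, 100]
    print(sum(span_length(loc) for loc in improved))  # 12845
```

Every gap in this excerpt spans exactly 100 positions while carrying an unknown estimated length, which suggests the 100 bases are a placeholder rather than a measured gap size.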

39
Submission Totals
HTGS_FULLTOP      3,342
HTGS_PREFIN       2,014
HTGS_ACTIVEFIN    4,151
HTGS_IMPROVED     2,660
TOTAL            12,167