Title: Stuff to Do
1Stuff to Do
2Midterm I questions due 1/31
- Email me your questions (with answers),
- if you have the capability, mail complete questions, figures and all,
- if not, write questions with instructions, i.e. "in Figure 2 of x paper, blah, blah, blah",
- Friday afternoon, I'll post the questions on the WEB page,
- on Monday, you'll have time to work on them together, in class.
3Cycle Sequencing: Chain Termination, a DNA polymerase application.
ddNTPs
dNTPs
4(No Transcript)
5Linked on Course WEB Page.
Cycle Sequence Tutor
and an animation, http://www.dnalc.org/shockwave/cycseq.html
6Disclaimer: this review is heavily biased toward the public sequencing consortium.
7Map first, then sequence
Sequence first, then map
8Genome Sequencing Strategy 1
- Clone-by-Clone Approach
- Order clones along the genome, then sequence,
- not dependent on acceleration of sequencing capacity,
- not dependent on advanced computer analysis,
- not dependent on as-yet-undeveloped sequencing technologies,
- repeats not as big a problem?
- heavy up-front demand for human labor.
9Clone-by-Clone Ordered Approach
Online Primer mapping
10Genomic Libraries
how many clones to cover a genome?
11Vectors (carry insert DNA)
Vector    Host      Inserts
Plasmid   E. coli   up to 15 kb
Phage     E. coli   up to 25 kb
Cosmid    E. coli   up to 45 kb (plasmid/phage hybrid)
BAC       E. coli   100-500 kb
YAC       Yeast     250-1000 kb
12Genomic Sequences and Coverage
- N = ln(1 - 0.9999) / ln(1 - v/2,900,000,000)
- v = average vector insert size
Vector (insert)    Clones (N)
plasmid (5 kb)     5.3 x 10^6
phage (20 kb)      1.3 x 10^6
BAC (125 kb)       2.2 x 10^5
YAC (500 kb)       27,000
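The formula on this slide can be evaluated directly. The sketch below is only an illustration: the function name and its parameters are made up here, and it assumes the 0.9999 in the formula is the desired probability that any given sequence is present in the library. Run as written, it gives about 53,000 clones for a 500 kb insert rather than the 27,000 on the slide, a figure that would match an insert closer to 1,000 kb, so that row is worth checking by hand.

```python
import math

def clones_needed(insert_bp, genome_bp=2_900_000_000, probability=0.9999):
    """Clones N needed so a given sequence is in the library with `probability`.

    Hypothetical helper; the defaults mirror the numbers in the slide's formula.
    """
    fraction = insert_bp / genome_bp  # share of the genome carried by one clone
    return math.log(1 - probability) / math.log(1 - fraction)

vectors = [("plasmid", 5_000), ("phage", 20_000), ("BAC", 125_000), ("YAC", 500_000)]
for name, insert_bp in vectors:
    print(f"{name:8s} {insert_bp // 1000:4d} kb  N ~ {clones_needed(insert_bp):,.0f}")
```

Larger inserts cut the clone count roughly in proportion, which is why BACs and YACs are attractive for the mapping step.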
13Clone-by-Clone Ordered Approach
14Contigs (Contiguous Sequences)
Find overlapping ends
Clone 1
Sequence,
Restriction Fragment Length Polymorphisms
(RFLPs).
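To make "find overlapping ends" concrete for the sequence-based case, here is a minimal sketch that looks for the longest suffix of one read matching the prefix of another and joins them. The function names and the `min_overlap` threshold are assumptions for illustration; real assemblers also tolerate sequencing errors and check both strands, which this does not.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def merge_reads(left, right, min_overlap=3):
    """Join two reads into one contig if their ends overlap, else return None."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]

# Two toy reads that share the 9-base end "attacatac".
read_a = "atatgtatattgaattacatac"
read_b = "attacatacatattattaatg"
contig = merge_reads(read_a, read_b)
print(overlap_length(read_a, read_b), contig)
```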
15Sequence Contig
16RFLP
Restriction enzymes cut DNA at specific sequences,
Fragment lengths provide clone identification data.
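A rough sketch of how fragment lengths can identify overlapping clones: digest each clone sequence in silico and compare the resulting sets of lengths. The choice of EcoRI (site GAATTC, cut after the G), the toy clone sequences, and both function names are assumptions for illustration. In practice the lengths come from gel measurements and are matched within a tolerance, not exactly.

```python
def fragment_lengths(sequence, site="gaattc", cut_offset=1):
    """Cut `sequence` at every occurrence of `site`; return sorted fragment lengths."""
    cuts = []
    position = sequence.find(site)
    while position != -1:
        cuts.append(position + cut_offset)
        position = sequence.find(site, position + 1)
    edges = [0] + cuts + [len(sequence)]
    return sorted(end - start for start, end in zip(edges, edges[1:]))

def shared_fragments(fingerprint_a, fingerprint_b):
    """Count fragment lengths present in both fingerprints (multiset intersection)."""
    remaining = list(fingerprint_b)
    shared = 0
    for length in fingerprint_a:
        if length in remaining:
            remaining.remove(length)
            shared += 1
    return shared

site = "gaattc"
# Toy clones built so that the stretch site-ttttt-site lies in both of them.
clone_1 = "cc" + site + "aaaa" + site + "ttttt" + site + "g"
clone_2 = "aaa" + site + "ttttt" + site + "ggggggg" + site + "cc"
print(fragment_lengths(clone_1), fragment_lengths(clone_2))
print(shared_fragments(fragment_lengths(clone_1), fragment_lengths(clone_2)))
```

Only fragments bounded by two cut sites are comparable between clones; the end fragments depend on where each insert happens to start and stop.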
17(No Transcript)
18Contigs (Contiguous Sequences)
Find overlapping ends
Merge good pairs of reads into longer contigs
- Find the minimal Tiling Path,
- - minimum set of overlapping clones that cover the genome.
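One way to picture the minimal tiling path is as an interval-cover problem: each clone is a (start, end) position on the map, and the goal is the fewest clones that span the region. The sketch below uses the standard greedy approach, which is an assumption about method and not necessarily how any genome center did it; the clone names and positions are invented.

```python
def minimal_tiling_path(clones, region_start, region_end):
    """Greedy interval cover: return the fewest clones spanning the region.

    `clones` maps a clone name to its (start, end) map position.
    Raises ValueError if the clones leave a gap.
    """
    ordered = sorted(clones.items(), key=lambda item: item[1][0])
    path = []
    covered_to = region_start
    index = 0
    while covered_to < region_end:
        best_name, best_end = None, covered_to
        # Among clones starting at or before the covered position,
        # keep the one that reaches furthest to the right.
        while index < len(ordered) and ordered[index][1][0] <= covered_to:
            name, (start, end) = ordered[index]
            if end > best_end:
                best_name, best_end = name, end
            index += 1
        if best_name is None:
            raise ValueError(f"gap in clone coverage at position {covered_to}")
        path.append(best_name)
        covered_to = best_end
    return path

# Hypothetical clone positions in kb along a 400 kb region.
clones = {"A": (0, 120), "B": (40, 150), "C": (100, 260),
          "D": (140, 230), "E": (220, 400), "F": (250, 380)}
print(minimal_tiling_path(clones, 0, 400))
```

Clones B, D and F are redundant here because A, C and E already overlap end to end. A real pipeline would also demand a minimum overlap between neighbours so the joins can be confirmed.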
19Minimal Tiling Path
Shotgun Sequence Each Clone
20Bacterial Artificial Chromosomes (BACs)
- Universal Priming Sites,
- On the vector, flanking the genomic insert.
21Shotgun (self-quiz)
8x - 10x coverage: to shotgun sequence 10,000 bp, you'd need 80k - 100k bp of sequence, or 160 - 200 sequencing reactions (at 500 bp each).
But 10,000 bp, at 500 bp per sequencing reaction, could be done in as few as 20 sequencing reactions.
Why Shotgun?
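The arithmetic behind the self-quiz can be checked with a few lines. This is only a sketch; the function name is made up, and it assumes every reaction yields a full 500 bp read.

```python
import math

def reads_for_coverage(target_bp, coverage, read_bp=500):
    """Sequencing reactions needed to reach a given fold coverage."""
    return math.ceil(target_bp * coverage / read_bp)

target = 10_000
for fold in (1, 8, 10):
    reads = reads_for_coverage(target, fold)
    print(f"{fold:2d}x coverage: {reads * 500:,} bp of sequence, {reads} reactions")
```

The 1x case is the perfectly ordered, end-to-end minimum; the 8x and 10x cases are what random sampling is budgeted at.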
22Contigs
QC
23(No Transcript)
24Structural Genomic Strategies 2
- Whole Genome Assembly Approach
- Sequence first, then order,
- dependent on advances in computer analysis and sequencing technologies,
- dependent on automated labor.
25WGA
26Read Pairs (Mate End Pairs)
- Paired End Sequencing,
- sequence both ends of the vector insert, using vector-derived primers,
- Maintain mate pair data.
(Figure labels: 5' and 3' ends of the two insert strands.)
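"Maintain mate pair data" amounts to keeping the two end reads of a clone linked, together with the approximate insert size. Below is a minimal sketch of such a record. The `MatePair` class, its field names, and the helper functions are all hypothetical, and the second read is modelled as the reverse complement of the far end, on the assumption that it is primed from the opposite strand.

```python
from dataclasses import dataclass

COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(sequence):
    """Return the opposite strand, read 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]

@dataclass
class MatePair:
    clone_id: str
    forward_read: str   # read from one end of the insert
    reverse_read: str   # read from the other end, on the opposite strand
    insert_size: int    # approximate distance spanned by the two reads

def paired_end_reads(clone_id, insert, read_length):
    """Simulate sequencing both ends of an insert from vector-derived primers."""
    return MatePair(
        clone_id=clone_id,
        forward_read=insert[:read_length],
        reverse_read=reverse_complement(insert)[:read_length],
        insert_size=len(insert),
    )

# Toy 30 bp insert; real inserts are kilobases long with reads of about 500 bp.
insert = "atggcgactgaattctcgtgggatgaaatc"
pair = paired_end_reads("clone_001", insert, read_length=8)
print(pair)
```

Keeping the pair together lets an assembler use the known insert size as a distance constraint between two reads that do not overlap.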
27Example Sequence Output (example 5 kb insert)
5' read (543 bp) - atatgtatattgaattacatacatattattaatg
cacatttttatccggagttgtggaccatagaaagacatattgactcctca
aagtaaattctgcatgttacattgaaatcataggctaaatttgagatgca
ctatttttagaaagtgtagagaaaaggacaggaagaaataagcgaaagct
ttggtaagccaccaaacctgattactggaagaaaagaaaaaagttccgag
aatagagttagatcgctggtgagggttttaaatggaacacaacaatggtt
gttttagagtgtgttattcttttgtatttataccttctcataggtttctt
gtaatacacgcttcttcctctctctccctctctcttatggcctcgtcttg
aaagcgtcttgcatgctaagagaaggctttagagcaaggagagaagggag
aagttgatttatacgtccatcggatatatcttctttttatatctgtctct
cttttaaggaagaaaaatggcgactgaattctcgtgggatgaaatcaaga
aagaaaatg...
- rest of insert (unsequenced, 3.9 kb) -
...ggcttgaaatatttggggcaaacaagcttgaagagaaatcagagaac
aagtttttgaaattcttggggttcatgtggaatcctctctcatgggttat
ggagtctgctgcaatcatggctattgttttagctaatggaggaggaaagg
cgccggattggcaagattttatcggtattatggtgttgcttatcatcaac
tccaccataagtttcatcgaggagaacaatgctggcaatgccgctgctgc
tctcatggcaaatcttgcaccaaagactaaggtatgcaaatttctcaata
catatatataggtatgtattttctaaaaaggagagttatataacctatgt
gtgaatgtaggtgttgagagatggtaaatggggggagcaagaggcttcaa
tcttggttccgggtgatttgataagcatcaaattgggtgacattgttcct
gctgatgctcgtctcctcgaaggagatcctttaaaaattgaccaatctgc
tcttactggtgaatcccttccaaccaccaaacacccaggagat - 3' read (540 bp)
plus trace data files associated with these
sequence runs.
28WGA
29Structural Genomic Strategies 3 (Hybrid)
30Project Comparisons (NYT 10/3/2002)
- Decoding the genome of Plasmodium falciparum, the most dangerous of the four single-cell parasites that cause malaria, took six years and cost about $20 million, paid for by the Wellcome Trust of London, the National Institutes of Health in Bethesda, Md., and other sources. Dr. Malcolm J. Gardner of the Institute for Genomic Research in Rockville, Md., led a large team of scientists there and at the Sanger Centre near Cambridge in England. Completion of the falciparum genome was first announced at a conference in Las Vegas in February.
- The genome of Anopheles gambiae, the primary carrier of the parasite, was begun more recently and took a mere 15 months even though its genome is far larger: some 278 million units of DNA encoding 14,000 genes, compared with the parasite's 23 million units of DNA and 5,268 genes. The mosquito team was led by Dr. Robert A. Holt of Celera Genomics in Rockville. The $14 million cost was borne by the National Institutes of Health, by Genoscope in France and other sources.
Hybrid
WGA
31Wednesday
- WGA,
- Shotgun Sequencing,
- Hybrid Approach.
- Compartmentalized Shotgun Approach
- Please read: Science 291: 1304-1315