Molecular biology story: DNA "the Queen molecule"

1
Molecular biology story: DNA "the Queen molecule"
Bioinformatics and Comparative Genome Analysis
Monday, March 19th, 2007, Tunis
Odile Ozier-Kalogeropoulos
Institut Pasteur, Université Pierre et Marie Curie
E-mail: odozier_at_pasteur.fr
2
Introduction
3
Genomes: two views
4
View of genomes for biologists
http://www.pasteur.fr/externe
http://genetique.snv.jussieu.fr
5
View of genomes for computer scientists
Pasteur Genopole Île-de-France, Plate-forme
technologique 4
6
DNA molecule: two views
7
View 1
James Watson and Francis Crick (1953)
8
View 2
5'
3'
3'
5'
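The antiparallel arrangement of the two strands (5' to 3' on one strand, 3' to 5' on the other) can be illustrated with a short sketch. It assumes standard Watson-Crick pairing (A with T, G with C); the example sequence and the function name are hypothetical and chosen only for illustration.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the partner strand, also written in the 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


top = "ATGCGT"  # hypothetical example sequence, written 5'->3'
bottom = reverse_complement(top)

# Print the duplex with the second strand shown 3'->5' under the first.
print("5'-" + top + "-3'")
print("3'-" + bottom[::-1] + "-5'")
```

Reading the second strand from its own 5' end gives the reverse complement of the first, which is why a single sequence is enough to describe the whole double helix.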
9
DNA sequence: one view
10
DNA sequence: one view
11
Sequencing DNA, "the Queen molecule"
12
Sequencing DNA, "the Queen molecule"
Most sequencing methods are based on the
natural mechanisms that living systems use to copy
and repair their own genomes
13
Reminder!
Cell DNA synthesis
14
Reminder!
Cell DNA synthesis
The main role of DNA polymerase
15
Cell DNA synthesis
3'
http://www.snv.jussieu.fr/vie/dossiers/sequencage/sequence.htm
16
Cell DNA synthesis
17
Cell DNA synthesis
18
Cell DNA synthesis
19
1 Foundation of the current state-of-the-art
production genome sequencing
20
1 Foundation of the current state-of-the-art
production genome sequencing
21
1 Foundation of the current state-of-the-art
production genome sequencing
The Sanger method
22
1 Foundation of the current state-of-the-art
production genome sequencing
The Sanger method
1977
23
1 Foundation of the current state-of-the-art
production genome sequencing
The Sanger method
1977
30th anniversary celebration!
24
DNA isolation
Sample preparation
The Sanger method
Sequence production
Assembly and analysis
25
DNA isolation
Sample preparation
The Sanger method
Sequence production
Assembly and analysis
26
The Sanger method
Focus on
Sequence production
27
The Sanger method
http://www.snv.jussieu.fr/vie/dossiers/sequencage/sequence.htm
28
The Sanger method
DNA polymerase
DNA polymerase
http://www.snv.jussieu.fr/vie/dossiers/sequencage/sequence.htm
29
The Sanger method
http://www.snv.jussieu.fr/vie/dossiers/sequencage/sequence.htm
30
The Sanger method
Fragment separation by electrophoresis on
acrylamide gel (resolution: 1 base)
31
The Sanger method
Reading progression
Fragment separation by electrophoresis on
acrylamide gel (resolution: 1 base)
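The reading principle can be sketched in a few lines, as a simplified illustration rather than a model of the real chemistry. It assumes four separate termination reactions (one per dideoxynucleotide), that a fragment ending at every possible position is produced, and that fragment lengths are counted from the end of the primer. The function names and the example sequence are hypothetical.

```python
def sanger_fragments(new_strand):
    """For each terminator (A, C, G, T), list the lengths of the fragments
    that end with that base in the newly synthesized strand."""
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the gel from the shortest fragment to the longest.

    With a resolution of 1 base, each length appears in exactly one lane,
    and that lane gives the base at that position."""
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[length] for length in sorted(base_at_length))


lanes = sanger_fragments("GATTACA")  # hypothetical newly synthesized strand
print(lanes)
print(read_gel(lanes))
```

The one-base resolution of the gel is what makes the second function possible: if two fragment lengths could not be told apart, the order of the bases would be lost.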
32
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing
centers
33
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing centers
Sanger production-scale genome sequencing
requires 4 successive steps
1. DNA isolation (laboratory)
2. Sample preparation (laboratory)
Chan E.Y. (2005), Mutation Res., 573, 13-40
34
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing centers
Sanger production-scale genome sequencing
requires 4 successive steps
1. DNA isolation (laboratory)
2. Sample preparation (laboratory)
3. Sequence production (robots)
Chan E.Y. (2005), Mutation Res., 573, 13-40
35
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing centers
Sanger production-scale genome sequencing
requires 4 successive steps
1. DNA isolation (laboratory)
2. Sample preparation (laboratory)
3. Sequence production (robots)
4. Assembly and analysis (computers)
Chan E.Y. (2005), Mutation Res., 573, 13-40
36
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing centers
Sanger production-scale genome sequencing
requires 4 successive steps
1. DNA isolation (laboratory)
2. Sample preparation (laboratory)
3. Sequence production (robots)
4. Assembly and analysis (computers)
Humans
Chan E.Y. (2005), Mutation Res., 573, 13-40
37
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing centers
Sequence production
Sequencing robots
Lab technician working with sequencing
machines Courtesy of Celera Genomics
DNA isolation
Sample preparation
Room filled with sequencing machines Courtesy of
Celera Genomics
Laboratory
38
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing centers
Sequencing robots
Assembly and analysis
Close up of capillaries from a capillary
sequencing machine Courtesy of Celera Genomics
Computers
Lab with sequencing machines Courtesy of Celera
genomics
39
2 Current state-of-the-art production genome
sequencing in high-throughput sequencing centers
Assembly and analysis
Computers
Plate-forme Génomique, Institut Pasteur
40
3 Sequencing statistics
41
http://www.genomesonline.org
42
Bacteria, Archaea
Eukarya
Metagenomes
http://www.genomesonline.org
43

High-throughput sequencing centers by country (chart labels: USA, UK, F, others)
http://www.genomesonline.org
44
4 Why continue sequencing?
45
  • 4 Why continue sequencing?
  • Comparative genomics
  • Impact on biomedical research
  • The personal genome project

46
  • 4 Why continue sequencing?
  • Comparative genomics
  • Impact on biomedical research
  • The personal genome project

47
Figure 1. Evolutionary relationship between
metazoans that are sequenced or due for
sequencing. The simplified phylogenetic
relationships between the metazoans for which the
complete, or nearly complete, genome sequences
are available or will be available soon.
Evolutionary distances (in millions of years)
Abel Ureta-Vidal, Laurence Ettwiller and
Ewan Birney (2003), Nature Rev. Genet., 4,
pp. 251-262
48
- International sequence databases: sequence
fragments of 100 000 species
- Estimated number of species: at least 14 million...
Number of sequences in GenBank (log scale)
Shendure, 2004 and Wikipedia
The phylogenetic sequence deficit for the Metazoa
Mark Blaxter, 2002
49
- International sequence databases: sequence
fragments of 100 000 species
- Estimated number of species: at least 14 million...
Vertebrates
Arthropods
Nematodes
Number of sequences in GenBank (log scale)
Shendure, 2004 and Wikipedia
The phylogenetic sequence deficit for the Metazoa
Mark Blaxter, 2002
50
- International sequence databases: sequence
fragments of 100 000 species
- Estimated number of species: at least 14 million...
Vertebrates
Arthropods
Nematodes
Number of sequences in GenBank (log scale)
Shendure, 2004 and Wikipedia
Molluscs, worms...
The phylogenetic sequence deficit for the Metazoa
Mark Blaxter, 2002
51
  • 4 Why continue sequencing?
  • Comparative genomics
  • Impact on biomedical research
  • The personal genome project

52
- Single Nucleotide Polymorphism (SNP)
53
HapMap Project
A freely available public resource to increase
the power and efficiency of genetic association
studies of medical traits
  • High-density SNP genotyping across the genome
    provides information about:
  • SNP validation, frequency, assay conditions
  • correlation structure of alleles in the genome

Mark J. Daly, PhD
54
Associated alleles reported
Kirov 2004
Straub 2002 Van den Oord 2003
Williams 2004 Bray 2005
Van den Bogaert 2003 Funke 2004
Mark J. Daly, PhD
Schwab 2003
55
  • 4 Why continue sequencing?
  • Comparative genomics
  • Impact on biomedical research
  • The personal genome project

56
Sequencing of individual human genomes as a
component of preventative medicine
The National Human Genome Research Institute
(NHGRI) solicits grant applications to develop
novel technologies that will enable extremely
low-cost genomic DNA sequencing (2005-2006).
Revolutionary Genome Sequencing Technologies:
the $1000 genome for 2015
57
5 Improvements of the Sanger method during
these 30 years
58
5 Improvements of the Sanger method during these
30 years
DNA isolation
Sample preparation
Sequence production
Assembly and analysis
59
5 Improvements of the Sanger method during these
30 years
  • Production of template DNA
  • Labelling: radioactivity / fluorescent dyes
  • Analysis of the DNA fragments produced:
    radioactivity detection / laser within an automated DNA sequencing machine
  • Electrophoresis: acrylamide gel / capillaries

DNA isolation
Sample preparation
Sequence production
Assembly and analysis
60
5 Improvements of the Sanger method during these
30 years
  • Production of template DNA
  • Labelling: radioactivity / fluorescent dyes
  • Analysis of the DNA fragments produced:
    radioactivity detection / laser within an automated DNA sequencing machine
  • Electrophoresis: acrylamide gel / capillaries

DNA isolation
Sample preparation
Sequence production
Assembly and analysis
61
5 Improvements of the Sanger method during these
30 years
  • Production of template DNA
  • Labelling: radioactivity / fluorescent dyes
  • Analysis of the DNA fragments produced:
    radioactivity detection / laser within an automated DNA sequencing machine

DNA isolation
Sample preparation
Sequence production
Assembly and analysis
62
5 Improvements of the Sanger method during these
30 years
  • Production of template DNA
  • Labelling: radioactivity / fluorescent dyes
  • Analysis of the DNA fragments produced:
    radioactivity detection / laser within an automated DNA sequencing machine
  • Electrophoresis: acrylamide gel / capillaries

DNA isolation
Sample preparation
Sequence production
Assembly and analysis
63
  • Production of template DNA
  • around 1985

DNA isolation
Single-stranded DNA is needed for sequencing
65
  • Sequencing of pure single-stranded DNA from
    recombinant M13 particles

66
  • Production of template DNA
  • around 1990

DNA isolation
  • Double-stranded DNA from recombinant plasmids or
    PCR products
  • denatured by heat or alkali for sequencing

67
DNA isolation
  • Recent improvement of template DNA production

Multiple displacement amplification
Phi29 DNA polymerase is the replicative
polymerase of the Bacillus subtilis phage
phi29.
DNA templates can be amplified 10 000-fold in a
few hours.
(Blanco, L. and Salas, M. (1984) Proc. Natl. Acad.
Sci. USA, 81, 5325-5329)
71
Recent improvement of template DNA production
Principle
Blanco, PNAS, 1989
72
DNA isolation
Applications of the multiple displacement
amplification
73
DNA isolation
Applications of the multiple displacement
amplification
1. Whole human genome amplification using this
method
2. Sequencing the genome of a single cell
74
DNA isolation
Applications of the multiple displacement
amplification
1. Whole human genome amplification using this
method Phi29 DNA polymerase is able to amplify
linear DNA
(Dean et al, PNAS, 2002)
75
DNA isolation
Applications of the multiple displacement
amplification
1. Whole human genome amplification using this
method Phi29 DNA polymerase is able to amplify
linear DNA
Cascading strand displacement
Linear DNA
Circular DNA
(Dean et al, PNAS, 2002)
76
DNA isolation
Applications of the multiple displacement
amplification
1. Whole human genome amplification using this
method Phi29 DNA polymerase is able to amplify
linear DNA
1-10 copies of human genomic DNA yield 20-30 µg
of product
18 hours at 30 °C
DNA amplification yield after MDA
(Dean et al, PNAS, 2002)
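To give a feeling for what this yield means, the fold amplification can be estimated with a back-of-the-envelope calculation. The sketch below rests on assumptions that are not on the slide: the product mass is taken in micrograms (20-30 µg), one "copy" is taken as a haploid human genome of about 3.2 billion base pairs, and an average base pair is given a molar mass of about 650 g/mol. All names are illustrative.

```python
AVOGADRO = 6.022e23           # molecules per mole
BP_MASS_G_PER_MOL = 650.0     # assumed average molar mass of one base pair
HUMAN_GENOME_BP = 3.2e9       # assumed haploid human genome size, in bp


def genome_copy_mass_g(genome_bp=HUMAN_GENOME_BP):
    """Approximate mass in grams of one copy of the genome."""
    return genome_bp * BP_MASS_G_PER_MOL / AVOGADRO


def fold_amplification(copies, product_g):
    """Ratio between the mass of product and the mass of starting DNA."""
    return product_g / (copies * genome_copy_mass_g())


print("One genome copy weighs about %.2f pg" % (genome_copy_mass_g() * 1e12))
print("10 copies -> 20 ug: %.1e-fold" % fold_amplification(10, 20e-6))
print("1 copy -> 30 ug: %.1e-fold" % fold_amplification(1, 30e-6))
```

Under these assumptions one genome copy weighs a few picograms, so going from a handful of copies to tens of micrograms corresponds to an amplification of roughly a million-fold.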
77
DNA isolation
Applications of the multiple displacement
amplification
1. Whole human genome amplification using this
method Phi29 DNA polymerase is able to amplify
linear DNA
  • For
  • Genome sequencing
  • Genetic analysis on blood, microdissected
    tissues...
  • Prenatal diagnosis,
  • Anthropological samples...

(Dean et al, PNAS, 2002)
78
DNA isolation
Applications of the multiple displacement
amplification
2. Sequencing the genome of a single cell
(Zhang et al, Nature Biotech, 2006)
79
Nature Biotechnology 24, 657-658 (2006),
doi:10.1038/nbt0606-657. Single-cell
genomics, Clyde A Hutchison III and J Craig Venter
Phi29 DNA polymerase is the replicative
polymerase of the Bacillus subtilis phage
phi29. This polymerase has exceptional strand
displacement and processive synthesis properties.
The polymerase has an inherent 3'→5' proofreading
exonuclease activity (Blanco, L. and Salas,
M. (1984) Proc. Natl. Acad. Sci.
USA, 81, 5325-5329)
Figure 1. Sequencing the genome of a single
cell. A single cell is isolated by dilution or by
cell sorting. The cell is lysed and the
chromosome is denatured by alkaline treatment.
The cellular DNA is amplified >10^9-fold by
multiple displacement amplification (MDA) using
random primers. The hyperbranched DNA product is
resolved by shearing and enzymatic treatments,
then cloned and shotgun sequenced. Ideally, a
complete genome sequence could be assembled from
the data and then annotated.
80
DNA isolation
Applications of the multiple displacement
amplification
2. Sequencing the genome of a single cell
A pioneering work and a new world
Polymerase cloning: "ploning"
The authors refer to the DNA populations
amplified from a single cell as polymerase clones,
or "plones"
  • Two limitations in these first experiments:
  • Bias in "plonable" amplification
  • Chimeric plones (about 6)

(Zhang et al, Nature Biotech, 2006)
81
DNA isolation
Applications of the multiple displacement
amplification
2. Sequencing the genome of a single cell
Most of the diversity of the biosphere remains
unsampled.
(Zhang et al, Nature Biotech, 2006)
82
DNA isolation
Applications of the multiple displacement
amplification
2. Sequencing the genome of a single cell
Most of the diversity of the biosphere remains
unsampled. The ability to sequence an entire
genome from a single uncultured cell should
make it possible to reveal this enormous biodiversity.
(Zhang et al, Nature Biotech, 2006)
83
DNA isolation
Applications of the multiple displacement
amplification
2. Sequencing the genome of a single cell
Most of the diversity of the biosphere remains
unsampled. The ability to sequence an entire
genome from a single uncultured cell should
make it possible to reveal this enormous biodiversity.
Metagenomics
(Zhang et al, Nature Biotech, 2006)
84
6 Alternatives to the Sanger method
Sequencing single molecules of DNA
85
Reminder!
The Sanger method is based on the analysis of
populations of DNA molecules
- Analysis of the DNA fragments produced
Radioactivity detection/ Laser within an
automated DNA sequencing machine
Sequence production
86
6 Alternatives to the Sanger method: Sequencing
single molecules of DNA
Cycle extension method on single molecules
1- Template DNA is arrayed on a surface or in wells.
2- Sequencing reaction steps, including
nucleotide incorporation and washes, are
performed to identify each base pair.
3- The extended base pair is detected by fluorescence
or luminescence.
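The three steps above can be sketched as a small simulation. This is a deliberately idealized model, assuming that nucleotides are flowed one type at a time in a fixed repeating order, that every complementary base in a run is incorporated during a flow, and that the light signal equals the number of bases incorporated. The flow order "TACG", the function names and the example template are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def simulate_flows(template, flow_order="TACG"):
    """Return a list of (nucleotide, signal) pairs, one per flow.

    `template` is given in the order in which the polymerase reads it.
    The signal is the number of bases incorporated during that flow,
    so a run of identical bases gives one stronger signal."""
    target = [COMPLEMENT[base] for base in template]  # strand being built
    position = 0
    signals = []
    while position < len(target):
        for nucleotide in flow_order:
            count = 0
            while position < len(target) and target[position] == nucleotide:
                count += 1
                position += 1
            signals.append((nucleotide, count))
            if position == len(target):
                break
    return signals


def decode(signals):
    """Rebuild the synthesized sequence from the recorded signals."""
    return "".join(nucleotide * count for nucleotide, count in signals)


flows = simulate_flows("TTACGG")  # hypothetical template
print(flows)
print(decode(flows))
```

In this model a run of identical bases is read as a single stronger signal; in a real instrument the signal is measured with noise, which this sketch does not attempt to represent.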
87
Sequential base incorporation steps
Template
Primer
Surface
Chan E.Y. (2005), Mutation res, 573, 13-40
88
Main features of cycle extension methods
compared to Sanger
  • Massive parallelism
  • Short read lengths
  • Potential for cost reduction

89
Pyrosequencing is the best-known cycle
extension method
90
From Biotage, http://www.pyrosequencing.com
91
Pyrosequencing
From Biotage, http://www.pyrosequencing.com
92
From Biotage, http://www.pyrosequencing.com
93
a, Read length distribution for the 306,178
high-quality reads of the M. genitalium
sequencing run. This distribution reflects the
base composition of individual sequencing
templates. b, Average read accuracy, at the
single read level, as a function of base position
for the 238,066 mapped reads of the same run
From Biotage, http://www.pyrosequencing.com
94
The two main problems of pyrosequencing
a, Read length distribution for the 306,178
high-quality reads of the M. genitalium
sequencing run. This distribution reflects the
base composition of individual sequencing
templates. b, Average read accuracy, at the
single read level, as a function of base position
for the 238,066 mapped reads of the same run
From Biotage, http://www.pyrosequencing.com
95
Pyrosequencing: massive parallelism
Genome sequencing in microfabricated
high-density picolitre reactors
Margulies et al, 2005
96
Genomic DNA is fragmented, ligated to adapters
and separated into single strands.
Fragments are bound to beads under conditions
that favour one fragment per bead. The beads are
captured in droplets of a PCR-reaction-mixture-in-oil
emulsion. PCR amplification occurs within each
droplet. At the end of the PCR reaction, each bead
carries 10 million copies of a unique DNA
template.
Margulies et al, 2005, Nature, 437, pp. 376-380
97
The emulsion is broken, the DNA strands
denatured and the beads carrying single stranded
DNA clones are deposited into wells of a
fibre-optic slide.
Smaller beads carrying immobilized enzymes
required for pyrosequencing are deposited into
each well.
Margulies et al, 2005
98
Sequencing instrument
  • a) Fluidic assembly
  • b) The well-containing fibre-optic slide
  • c) Computer providing the user interface and
    the instrument control

Margulies et al, 2005
99
De novo assembly of bacterial genomes
Test on Mycoplasma genitalium (580 000 bp)
14 hours!
Density of wells: 480/mm2
Total number of wells on a slide: 1.6 million!
Margulies et al, 2005
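A quick way to judge such a run is the average sequencing depth (coverage), that is, the total number of bases read divided by the genome size. The sketch below uses the genome size given above and the 306,178 high-quality reads mentioned for the M. genitalium run; the mean read length of 100 bp is only an assumed round value for illustration, not a figure taken from the slides.

```python
def coverage(number_of_reads, mean_read_length_bp, genome_size_bp):
    """Average number of times each base of the genome is read."""
    return number_of_reads * mean_read_length_bp / genome_size_bp


READS = 306_178          # high-quality reads of the M. genitalium run
GENOME_BP = 580_000      # genome size given on the slide
MEAN_READ_BP = 100       # assumed mean read length (illustrative value)

depth = coverage(READS, MEAN_READ_BP, GENOME_BP)
print("Average coverage: about %.0f-fold" % depth)
```

With short reads, a high coverage is what compensates for the small amount of information carried by each individual read during de novo assembly.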
100
7 Sequencing or resequencing?
101
7 Sequencing or resequencing?
  • Sequencing: for studies of genomes of unknown
    species, needing long read lengths
  • Resequencing: for individual studies using a
    known genome as guide
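Resequencing with a known genome as guide can be sketched as follows: each short read is placed on the reference at the position where it matches best, and the remaining differences are reported as candidate variants (for example SNPs). This is a toy illustration, assuming reads without insertions or deletions and a brute-force search; real read-mapping tools work very differently. The sequences and function names are hypothetical.

```python
def best_alignment(reference, read):
    """Slide the read along the reference and return (start, mismatches)
    for the position with the fewest mismatches (first one if tied)."""
    best_start, best_mismatches = 0, len(read) + 1
    for start in range(len(reference) - len(read) + 1):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(window, read))
        if mismatches < best_mismatches:
            best_start, best_mismatches = start, mismatches
    return best_start, best_mismatches


def candidate_variants(reference, read):
    """List (position, reference base, read base) for each difference.
    Positions are 0-based coordinates on the reference."""
    start, _ = best_alignment(reference, read)
    return [
        (start + offset, reference[start + offset], base)
        for offset, base in enumerate(read)
        if reference[start + offset] != base
    ]


reference = "ACGTTGCAAGCTTAGG"   # hypothetical known genome
read = "GCAAGGTTA"               # hypothetical read from an individual
print(best_alignment(reference, read))
print(candidate_variants(reference, read))
```

Because the reference provides the position of each read, short reads are sufficient here, whereas sequencing an unknown genome has to rely on overlaps between reads, hence the need for longer reads.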

102
Comparison of sequencing methods
Sanger method
ABI 3730xl
Adapted from Chan E.Y. (2005), Mutation res, 573,
13-40
103
Comparison of sequencing methods
Sanger method
ABI 3730xl
Adapted from Chan E.Y. (2005), Mutation res, 573,
13-40
454 technology
104
Comparison of sequencing methods
Sanger method
ABI 3730xl
Adapted from Chan E.Y. (2005), Mutation res, 573,
13-40
454 technology
105
Comparison of sequencing methods
Sanger method
ABI 3730xl
Adapted from Chan E.Y. (2005), Mutation res, 573,
13-40
454 technology
106
Choice of sequencing method
Example of Neanderthal DNA
DNA from a fragment of a 38 000-year-old
Neanderthal fossil found in 1980 in Vindija cave
(Croatia)
Neanderthal DNA constraints
  • Rare, short DNA fragments
  • Many contaminations

Advantages of pyrosequencing
  • No bacterial cloning
  • No template competition for amplification
  • Read length about 200 bp
  • Each sequenced product stems from just one
    original single-stranded template molecule of
    known orientation (difference with PCR)

Green R.E. et al, 2006
107
Principle
Lambert and Millar (2006), Green et al, (2006)
http://www.454.com/
108
Results
Analysis of one million base pairs of Neanderthal
DNA
Location on the human karyotype of Neanderthal
DNA
Schematic tree illustrating the number of
nucleotide changes inferred to have occurred on
hominoid lineages
Green et al, (2006)
109
Conclusions
110
Conclusions
- Sequencing today is performed in big centers
111
Conclusions
- Sequencing today is performed in big centers
- The number of sequences is growing
exponentially...
112
Conclusions
- Sequencing today is performed in big centers
- The number of sequences is growing
exponentially...
But the bottleneck remains sequence analysis...
113
Conclusions
- Sequencing today is performed in big centers
- The number of sequences is growing
exponentially...
But the bottleneck remains the analysis of
sequences...
The goal of the present course,
"Bioinformatics and Comparative Genome Analysis",
is precisely to give you tools to contribute to
progress in this field...
114
So... Good work on the Queen molecule!
Thanks to the organizers!
And thanks for your attention!
115
Plan of the course
1
2
116
Plan of the course (continued)
3