Title: 14. DNA Sequence Analysis
a). Analysis of a DNA sequence
  i). Southern blotting analysis to detect mutated DNA
  ii). DNA sequencing analysis
b). Molecular diagnosis of a mutation
  i). Allele-specific oligonucleotide (ASO) analysis
  ii). PCR analysis
    - gene deletion
    - trinucleotide repeat expansion
- What further can be done with a gene isolated by positional cloning?
  - Map the gene to a chromosomal region to determine if it correlates with a site containing a known cytogenetic abnormality
    - Cytogenetic analysis
    - Fluorescence in situ hybridization (FISH)
  - Look for abnormalities in the DNA sequence to determine if the gene is mutated, to identify the specific mutation, and to screen for mutations in other individuals
    - Southern blotting
    - DNA sequencing
    - Allele-specific oligonucleotide (ASO) analysis
    - PCR analysis
- Southern blotting analysis to detect mutated DNA
  - complete gene deletion
  - partial gene deletion
[Figure: 32P-labeled probe hybridizing to A) normal DNA, B) a complete deletion, C) a partial deletion]
Southern blotting analysis
A. Normal DNA
B. Complete deletion
C. Partial deletion
[Figure: Southern blot with lanes A, B, and C]
Resolution of restriction fragments for the partial deletion
[Figure: restriction fragments for the partial deletion (C)]
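The three Southern blot outcomes (normal, complete deletion, partial deletion) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the region length, the restriction site positions, the probe coordinates, and the function name `detected_fragments` are hypothetical and do not come from the slides. The sketch only models which restriction fragments still carry probe target sequence and how long they are.

```python
def detected_fragments(length, sites, probe, deletion=None):
    """Sizes of restriction fragments that the probe would detect.

    length   -- length of the genomic region in bp (hypothetical)
    sites    -- restriction site positions in original coordinates
    probe    -- (start, end) of the sequence the probe hybridizes to
    deletion -- (start, end) of deleted DNA, or None for normal DNA
    """
    d0, d1 = deletion if deletion else (0, 0)
    # Restriction sites that lie inside the deletion are lost.
    cuts = sorted(s for s in sites if not (d0 <= s < d1))
    bounds = [0] + cuts + [length]
    sizes = []
    for start, end in zip(bounds, bounds[1:]):
        # Fragment size shrinks by whatever part of it was deleted.
        deleted = max(0, min(end, d1) - max(start, d0))
        size = (end - start) - deleted
        # Probe target remaining on this fragment after the deletion.
        p0, p1 = max(start, probe[0]), min(end, probe[1])
        target = max(0, p1 - p0) - max(0, min(p1, d1) - max(p0, d0))
        if size > 0 and target > 0:
            sizes.append(size)
    return sizes


# Hypothetical 20 kb region with three restriction sites; probe covers the gene.
SITES = [2000, 9000, 14000]
PROBE = (3000, 13000)

normal = detected_fragments(20000, SITES, PROBE)
complete = detected_fragments(20000, SITES, PROBE, deletion=(2500, 13500))
partial = detected_fragments(20000, SITES, PROBE, deletion=(10000, 12000))
print(normal, complete, partial)
```

In this toy model the complete deletion leaves no band, while the partial deletion leaves one band unchanged and shortens the other, which is the kind of pattern lanes B and C are meant to show.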
DNA Sequencing Analysis
- Sanger dideoxynucleotide chain termination analysis
- Rationale of the technique
  - DNA to be sequenced is synthesized under conditions that give rise to all possible length fragments starting at one end of the DNA sequence
  - knowing what nucleotide lies at the end of each fragment provides the order of the nucleotides in the DNA and hence the nucleotide sequence
- Procedure for the Sanger dideoxy chain termination technique
  - DNA to be sequenced is cloned into a vector, adjacent to a primer site
  - Four reactions are set up, each containing one of four ddNTPs
  - DNA is synthesized in the presence of the ddNTPs, giving rise to sets of DNA products representing all of the possible size fragments for the unknown sequence
  - The fragments are resolved by gel electrophoresis and the sequence is read up from the bottom of the gel by identifying the lane giving the next larger size fragment
- DNA to be sequenced is cloned into the EcoRI site immediately adjacent to the primer binding site
- For sequencing
  - the DNA is denatured into single strands
  - the primer is hybridized to the template strand
  - DNA is synthesized using DNA polymerase
[Figure: primer and template strand with 5' and 3' ends labeled]
[Figure: structures of two nucleoside triphosphates, each labeled BASE (A, T, G, C); the second structure shows one fewer hydroxyl group than the first]
Structure of AZT
Four reactions
- ddATP, DNA, dNTPs
- ddTTP, DNA, dNTPs
- ddGTP, DNA, dNTPs
- ddCTP, DNA, dNTPs
Template (5' and 3' ends labeled in the figure)
A T C A T G T C A T C A A G T C T A G C A C
All possible products of the reaction containing ddATP (each fragment is terminated by a ddA)
T A
T A G T A
T A G T A C A
T A G T A C A G T A
T A G T A C A G T A G T T C A
T A G T A C A G T A G T T C A G A
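Template (5' and 3' ends labeled in the figure)
A T C A T G T C A T C A A G T C T A G C A C
[Figure: sequencing gel with lanes A, T, G, C; shorter fragments at the bottom, longer fragments at the top]
Sequence of the strand that was synthesized
TAGTACAGTAGTTCAGATCGTG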
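The chain termination logic can be sketched in a few lines of Python using the template shown on the slides. This is an illustrative model, not a laboratory protocol: the function names are hypothetical, the new strand is written aligned base-for-base under the template as in the figure (strand orientation is not modeled), and every possible termination product is assumed to appear in its lane.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def synthesize(template):
    """Complement of the template, aligned position-for-position with it."""
    return "".join(COMPLEMENT[base] for base in template)


def terminated_fragments(new_strand, dd_base):
    """All products of one reaction: every prefix ending in the ddNTP's base."""
    return [new_strand[: i + 1] for i, b in enumerate(new_strand) if b == dd_base]


def read_gel(lanes):
    """Read the gel from the shortest fragment up, noting each band's lane."""
    bands = sorted((len(frag), base) for base, frags in lanes.items() for frag in frags)
    return "".join(base for _, base in bands)


template = "ATCATGTCATCAAGTCTAGCAC"
new_strand = synthesize(template)

# One reaction (lane) per dideoxynucleotide.
lanes = {base: terminated_fragments(new_strand, base) for base in "ATGC"}

print(lanes["A"])       # products of the ddATP reaction
print(read_gel(lanes))  # sequence of the synthesized strand
```

Because each fragment length occurs in exactly one lane, sorting the bands by size and recording the lane of each one reproduces the synthesized strand.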
Initial points derived from the first draft of
the human genome sequence, based on the published
report of the International Human Genome
Sequencing Consortium, Nature (2001) vol. 409,
pp. 860-921
1). The genomic landscape shows marked variation
in the distribution of a number of features,
including genes, transposable elements, GC
content, CpG islands and recombination rate. This
gives us important clues about function. For
example, the developmentally important HOX gene
clusters are the most repeat-poor regions of the
human genome, probably reflecting the very
complex coordinate regulation of the genes in the
clusters.
2). There appear to be about 30,000-40,000
protein-coding genes in the human genome, only
about twice as many as in worm or fly. However,
the genes are more complex, with more alternative
splicing generating a larger number of protein
products.
3). The full set of proteins (the 'proteome')
encoded by the human genome is more complex than
those of invertebrates. This is due in part to
the presence of vertebrate-specific protein
domains and motifs (an estimated 7% of the
total), but more to the fact that vertebrates
appear to have arranged pre-existing components
into a richer collection of domain architectures.
4). Hundreds of human genes appear likely to have
resulted from horizontal transfer from bacteria
at some point in the vertebrate lineage. Dozens
of genes appear to have been derived from
transposable elements.
5). Although about half of the human genome
derives from transposable elements, there has
been a marked decline in the overall activity of
such elements in the hominid lineage. DNA
transposons appear to have become completely
inactive and long-terminal repeat (LTR)
retroposons may also have done so.
6). The pericentromeric and subtelomeric regions
of chromosomes are filled with large recent
segmental duplications of sequence from elsewhere
in the genome. Segmental duplication is much more
frequent in humans than in yeast, fly or worm.
7). Analysis of the organization of Alu elements
explains the long-standing mystery of their
surprising genomic distribution, and suggests
that there may be strong selection in favour of
preferential retention of Alu elements in GC-rich
regions and that these 'selfish' elements may
benefit their human hosts.
8). The mutation rate is about twice as high in
male as in female meiosis, showing that most
mutation occurs in males.
9). Cytogenetic analysis of the sequenced clones
confirms suggestions that large GC-poor regions
are strongly correlated with 'dark G-bands' in
karyotypes.
10). Recombination rates tend to be much higher
in distal regions (around 20 megabases) of
chromosomes and on shorter chromosome arms in
general, in a pattern that promotes the
occurrence of at least one crossover per
chromosome arm in each meiosis.
11). More than 1.4 million single nucleotide
polymorphisms (SNPs) in the human genome have
been identified. This collection should allow the
initiation of genome-wide linkage disequilibrium
mapping of the genes in the human population.
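Allele-specific oligonucleotide (ASO) analysis
Normal ASO
[Figure: normal ASO and target sequence; labeled bases A, A, T, C]
Genotype: +/+  +/-  -/-
Hybridization: [dot-blot figure]
Mutant ASO
[Figure: mutant ASO and target sequence; labeled bases G, G, C, T]
Genotype: +/+  +/-  -/-
Hybridization: [dot-blot figure]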
Detection of ΔF508 in the cystic fibrosis gene by ASO
[Figure: ASO hybridization results for the normal allele and the CF allele]
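The idea behind ASO screening can be sketched in Python. The sequences below are hypothetical and chosen only to differ by a single base; they are not the real cystic fibrosis sequences. Stringent hybridization is modeled, as a simplifying assumption, by requiring a perfect match between the probe and its target; the function names are also made up for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hybridizes(aso, target):
    """Assumption: under stringent conditions only a perfect match binds."""
    return revcomp(aso) in target


def dot_blot(alleles, normal_aso, mutant_aso):
    """Return (signal with normal ASO, signal with mutant ASO) for one person."""
    return (
        any(hybridizes(normal_aso, a) for a in alleles),
        any(hybridizes(mutant_aso, a) for a in alleles),
    )


# Hypothetical alleles differing at one position (A -> T).
NORMAL = "TTGACCTGAGGAGAAGTCT"
MUTANT = "TTGACCTGTGGAGAAGTCT"

# Each 15-base probe is complementary to its own allele across the variable site.
normal_aso = revcomp(NORMAL[2:17])
mutant_aso = revcomp(MUTANT[2:17])

for label, alleles in [("+/+", (NORMAL, NORMAL)),
                       ("+/-", (NORMAL, MUTANT)),
                       ("-/-", (MUTANT, MUTANT))]:
    print(label, dot_blot(alleles, normal_aso, mutant_aso))
```

In this model a heterozygote gives a signal with both probes, while each homozygote gives a signal with only one.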
Polymerase chain reaction (PCR) analysis
1). primers are designed to flank the region to be amplified in target DNA
2). primers are annealed to denatured DNA
3). DNA is synthesized using Taq polymerase (from Thermus aquaticus)
4). primers are annealed again and the process is repeated through 20-30 cycles, geometrically amplifying the target sequence
5). DNA is analyzed by gel electrophoresis
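Two points from the steps above lend themselves to a short Python sketch: the primers define the ends of the amplified product, and the copy number grows geometrically with the number of cycles. The template and primer sequences are hypothetical, the function names are made up for this example, and perfect doubling per cycle (efficiency 1.0) is an idealizing assumption.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Region of the template strand flanked by the two primers, or None."""
    start = template.find(forward)
    # The reverse primer anneals to the template strand, so its
    # reverse complement is what appears in the template sequence.
    site = template.find(revcomp(reverse))
    if start == -1 or site == -1 or site < start:
        return None
    return template[start:site + len(reverse)]


def copies(initial, cycles, efficiency=1.0):
    """Idealized geometric amplification: each cycle multiplies by 1 + efficiency."""
    return initial * (1 + efficiency) ** cycles


# Hypothetical target and primers.
TEMPLATE = "AAGGCTTACCGATTGCAGTCCATGGTACGTTAGC"
product = amplicon(TEMPLATE, "GGCTTACC", "AACGTACC")
print(product, len(product))

for n in (20, 30):
    print(n, "cycles:", int(copies(1, n)), "copies from one template")
```

With perfect doubling, 20 cycles give about a million copies of each starting molecule and 30 cycles about a billion; real reactions fall short of this ideal.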
Detection of Lesch-Nyhan deletions by multiplex PCR
- exons in the HGPRT (HPRT) gene: 1 2 3 4 5 6 7 8 9
- PCR products resulting from the use of multiplex primers; each product is flanked by its own set of leftward and rightward primers (for example, for exons 1 and 2)
[Figure: PCR products for exons 1 and 2]
Detection of Lesch-Nyhan deletions by multiplex PCR
1) deletion of exon 2
2) total gene deletion
3) deletion of exons 6-9
4) deletion of exon 9
5) normal
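The five cases listed above can be reproduced with a small Python sketch. It assumes, for illustration only, one primer pair and one PCR product per exon for all nine exons, and that a product is absent exactly when its exon is deleted; the function names are hypothetical.

```python
EXONS = tuple(range(1, 10))  # exons 1-9 of the HPRT gene


def multiplex_products(deleted_exons):
    """Exons whose PCR product is expected when the given exons are deleted."""
    deleted = set(deleted_exons)
    return [e for e in EXONS if e not in deleted]


def missing_exons(observed_products):
    """Infer deleted exons from the products seen on the gel."""
    seen = set(observed_products)
    return [e for e in EXONS if e not in seen]


cases = {
    "deletion of exon 2": [2],
    "total gene deletion": list(EXONS),
    "deletion of exons 6-9": [6, 7, 8, 9],
    "deletion of exon 9": [9],
    "normal": [],
}

for label, deleted in cases.items():
    products = multiplex_products(deleted)
    print(f"{label}: products for exons {products}")
```

Reading a gel runs the same logic in reverse: `missing_exons` lists the exons whose products are absent from a sample.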
- Repeat expansion
  - trinucleotide repeats
  - VNTRs (variable number of tandem repeats)
[Figure: PCR products spanning a repeat region]
CAG CAG CAG CAG
CAG CAG CAG CAG CAG CAG CAG
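A short Python sketch shows why PCR across a repeat region reveals an expansion: the product grows by three base pairs per added CAG unit. The flanking length and the example sequences are hypothetical, and the function names are made up for this illustration.

```python
import re


def longest_repeat(seq, unit="CAG"):
    """Number of units in the longest uninterrupted run of `unit` in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def product_size(flank_bp, repeats, unit_len=3):
    """PCR product length: primer-flanked sequence plus the repeat tract."""
    return flank_bp + unit_len * repeats


# Hypothetical alleles with 4 and 7 CAG repeats between the same flanks.
short_allele = "TTGA" + "CAG" * 4 + "ACCT"
long_allele = "TTGA" + "CAG" * 7 + "ACCT"

for allele in (short_allele, long_allele):
    n = longest_repeat(allele)
    print(n, "repeats ->", product_size(flank_bp=8, repeats=n), "bp product")
```

The two alleles differ by three repeats, so their products differ by 9 bp, a difference that gel electrophoresis can resolve.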
- Learning objectives
  - understand how Southern blotting analysis can be used to detect both partial and complete deletions of a gene
  - understand the principle of Sanger dideoxynucleotide sequencing
  - know the steps involved and the reagents used in sequencing DNA using the Sanger technique
  - understand how a sequencing gel is interpreted
  - understand how to screen for point mutations using ASO analysis
  - understand the principle of PCR
  - understand how PCR can be used to screen for mutations in genes
  - know the major points learned from the human genome sequencing project