Title: Analyzing DNA Sequences and DNA Barcoding
Slide 1: LESSON 9 Analyzing DNA Sequences and DNA Barcoding
PowerPoint slides to accompany Using Bioinformatics: Genetic Research
Slide 2: How DNA Sequence Data Is Obtained for Genetic Research
1. Obtain Samples: Blood, Saliva, Hair Follicles, Feathers, Scales
2. Extract DNA from Cells
3. Sequence DNA
4. Genetic Data:
   TTCAACAACAGGCCCAC
   TTCACCAACAGGCCCAC
   TTCATCAACAGGCCCAC
5. Compare DNA Sequences to One Another
GOALS
- Identify the organism from which the DNA was obtained.
- Compare DNA sequences to each other.
Image Source: Wikimedia Commons
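As a rough illustration of comparing DNA sequences to one another, the sketch below reports the positions at which the three example sequences on this slide differ. It assumes the sequences are already aligned and of equal length; the function name `variable_positions` is made up for this example.

```python
def variable_positions(seqs):
    """Return (1-based position, bases seen) for each column where the sequences differ."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must all be the same length")
    diffs = []
    for position, column in enumerate(zip(*seqs), start=1):
        if len(set(column)) > 1:  # more than one distinct base in this column
            diffs.append((position, "".join(column)))
    return diffs


# The three example sequences shown on the slide.
seqs = [
    "TTCAACAACAGGCCCAC",
    "TTCACCAACAGGCCCAC",
    "TTCATCAACAGGCCCAC",
]
print(variable_positions(seqs))  # [(5, 'ACT')]
```

The three sequences are identical except at position 5, where they carry A, C, and T respectively.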
Slide 3: Overview of DNA Sequencing
DNA Sample -> Mix with primers -> Perform sequencing reaction
Slide 4: Sequence Both Strands of DNA
Sequence 1: Top Strand
Sequence 2: Bottom Strand
Image Source: Wikimedia Commons
Slide 5: Compare the Two Sequences
Sequence 1: Top (F)
Sequence 2: Bottom (R)
Bioinformatics tools like BLAST can be used to compare the sequences from the two strands.
Image Source: Wikimedia Commons
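To see what comparing the two strands involves, here is a minimal sketch that reverse-complements the bottom-strand (R) read and lists where it disagrees with the top-strand (F) read. This is not how BLAST works internally. It assumes the two reads cover exactly the same region with no gaps, and the reverse read below is a hypothetical example.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def mismatches(forward, reverse):
    """Compare a forward read with the reverse complement of a reverse read.

    Returns (1-based position, forward base, reverse-derived base) for each disagreement.
    """
    reverse_as_top = reverse_complement(reverse)
    if len(forward) != len(reverse_as_top):
        raise ValueError("this sketch assumes reads of equal length")
    return [
        (pos, f, r)
        for pos, (f, r) in enumerate(zip(forward.upper(), reverse_as_top), start=1)
        if f != r
    ]


forward = "TTCAACAACAGGCCCAC"   # top strand (F)
reverse = "GTGGGCCTGTTGGTGAA"   # bottom strand (R), hypothetical read
print(reverse_complement(reverse))   # TTCACCAACAGGCCCAC
print(mismatches(forward, reverse))  # [(5, 'A', 'C')]
```

Any position reported here is a candidate for review in the chromatograms.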
Slide 6: Analyzing DNA Sequences
Day One
1. Obtain two chromatograms for each sample.
2. Align the sequences with BLAST.
Day Two
3. Visualize the chromatograms using FinchTV. Compare BLAST alignments against base calls in the chromatogram.
4. Review any differences and determine which base is most likely correct.
5. Edit and trim the DNA sequence using quality data.
Day Three
6. Translate the sequence to check for stop codons.
   ATG CCG TAA = M P STOP
7. Use BLAST to identify the origin of the sequence.
8. Use BOLD to confirm identity and make a phylogenetic tree.
Image Source: NCBI, FinchTV, BOLD.
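Step 6 can be sketched in a few lines of code. The example below translates a DNA sequence codon by codon and reports any stop codon that appears before the final codon. It assumes the standard genetic code, which matches the ATGCCGTAA example on this slide; mitochondrial barcode genes use a different code table. The helper names are made up for this illustration.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in its first reading frame; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def internal_stops(dna):
    """Return 1-based codon numbers of stop codons that occur before the last codon."""
    protein = translate(dna)
    return [n for n, aa in enumerate(protein[:-1], start=1) if aa == "*"]


print(translate("ATGCCGTAA"))       # MP*  (M, P, STOP)
print(internal_stops("ATGCCGTAA"))  # []
print(internal_stops("ATGTAACCG"))  # [2]
```

An unexpected stop codon in the middle of a protein-coding sequence often points to an editing or base-calling error.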
Slide 7: Viewing DNA Sequences with FinchTV
Image Source: FinchTV
Slide 8: DNA Peaks Can Vary in Height and Width
Image Source: FinchTV
Slide 9: Quality Values Represent the Accuracy of Each Base Call
Quality values represent the ability of the DNA sequencing software to identify the base at a given position.
Quality Value (Q) = -10 × log10(error probability)
- Q10: the base has a 1 in 10 chance (probability) of being misidentified.
- Q20: probability of 1 in 100 of being misidentified.
- Q30: probability of 1 in 1,000 of being misidentified.
- Q40: probability of 1 in 10,000 of being misidentified.
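The relationship between a quality value and its error probability can be checked with a short calculation. This sketch converts in both directions; the function names are chosen for this example.

```python
import math


def error_probability(q):
    """Convert a quality value Q into the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def quality_value(p):
    """Convert an error probability into a quality value Q."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    p = error_probability(q)
    print(f"Q{q}: error probability {p:g} (1 in {round(1 / p):,})")
```

Each increase of 10 in Q corresponds to a tenfold drop in the chance of a wrong base call.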
Slide 10: Quality Values Are Used When Comparing Sequences
Quality values represent the ability of the DNA sequencing software to identify the base at a given position.
Image Source: FinchTV
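One simple way quality values can inform a comparison is to keep the base with the higher quality value wherever two reads disagree. The sketch below does this. It is a simplification and does not replace inspecting the chromatogram. It assumes both reads are already aligned and in the same orientation, and all bases and quality values shown are hypothetical.

```python
def consensus(read_a, quals_a, read_b, quals_b):
    """Build a consensus from two aligned reads, keeping the higher-quality base on conflicts."""
    if not (len(read_a) == len(quals_a) == len(read_b) == len(quals_b)):
        raise ValueError("reads and quality lists must all be the same length")
    bases = []
    for a, qa, b, qb in zip(read_a, quals_a, read_b, quals_b):
        # Agreeing positions are kept as-is; otherwise prefer the more confident call.
        bases.append(a if (a == b or qa >= qb) else b)
    return "".join(bases)


# Hypothetical reads: they disagree at position 4 (T at Q12 versus A at Q45).
forward = "ACGTT"
forward_quals = [40, 38, 41, 12, 39]
reverse = "ACGAT"   # already reverse-complemented to match the forward read
reverse_quals = [37, 40, 39, 45, 36]

print(consensus(forward, forward_quals, reverse, reverse_quals))  # ACGAT
```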
Slide 11: Background Noise May Be Present
Image Source: FinchTV
Slide 12: The Beginnings and Ends of Sequences Are Likely to Be Poor Quality
Image Source: FinchTV
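Because the ends of a read tend to be poor quality, trimming usually means removing low-quality bases from both ends. Here is a minimal sketch of that idea. The Q20 cutoff is an assumption for illustration and is not taken from these slides, and the read and quality values are hypothetical. Real trimming tools often use more forgiving windowed methods.

```python
def trim_low_quality_ends(seq, quals, min_q=20):
    """Remove bases from both ends of a read until a base with quality >= min_q is reached."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must be the same length")
    start = 0
    while start < len(quals) and quals[start] < min_q:
        start += 1
    end = len(quals)
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


# Hypothetical read with poor-quality bases at both ends.
seq = "NNACGTACGTNN"
quals = [5, 8, 30, 35, 40, 42, 41, 38, 36, 33, 9, 4]

trimmed_seq, trimmed_quals = trim_low_quality_ends(seq, quals)
print(trimmed_seq)  # ACGTACGT
```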
Slide 13: Examples of Chromatogram Data
Circle 1: Example of a series of the same nucleotide (many Ts in a row). Notice that the highest peaks are visible at each position.
Circle 2: Example of an ambiguous base call. Notice that the T (red) at position 57 (highlighted in blue) is just below a green peak (A) at the same position. Look at the poor quality score at the bottom left of the screen (Q12). An A may be the actual nucleotide at this position.
Circle 3: Example of two As together. The peaks look different, but they are the highest peaks at these positions.
Image Source: FinchTV
Slide 15: Transcription and Translation Begin at the Start Codon
Slide 16: There Are Six Potential Reading Frames in DNA
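The six reading frames come from starting at each of the first three bases on the given strand and at each of the first three bases on its reverse complement. The sketch below lists all six translations. It assumes the standard genetic code, and the frame labels (+1 to +3, -1 to -3) are a naming choice made for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return translations of the three forward and three reverse reading frames."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


for name, protein in sorted(six_frames("ATGCCGTAA").items()):
    print(name, protein)
```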
Slide 17: Frame-Shifts, Amino Acid Changes, and Stop Codons
Reading Frame 2: M D G STOP
5'- A T G G A C G G A T G A G -3'
Accidental insertion of an extra G when editing
Reading Frame 1: M T D E
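The effect of an accidental insertion can be reproduced by translating the sequence with and without the extra base. This sketch uses the sequence shown on the slide and assumes the inserted G is the fourth base, which the slide text does not state explicitly. It also assumes the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons from the first base; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


edited = "ATGGACGGATGAG"             # sequence from the slide, with the extra G
original = edited[:3] + edited[4:]   # assumption: the 4th base is the inserted G

print("before insertion:", original, "->", translate(original))
print("after insertion: ", edited, "->", translate(edited))
```

A single inserted base shifts every codon downstream of it, which changes the amino acids and can introduce a premature stop codon.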