Title: Interpreting DNA profiles
1Interpreting DNA profiles
2Main problems with forensic DNA profiling
Although DNA profiling methods are highly
sensitive and accurate, limitations exist. Most
forensic cases do not involve biological
evidence, and in those that do, the evidence may
not be informative. Evidence may be degraded and
yield no profile or may give partial profiles
that are not informative. Contamination can be
a serious complication in the interpretation of a
DNA profile. It might be necessary to obtain
profiles from family members or crime scene
officers in order to eliminate contaminating
bands in a DNA profile. One of the most
serious complications is human error, either
innocent or intentional. An example of human
error is the Sotolusson case.
3Sotolusson case
Lazaro Sotolusson was housed in the North Las
Vegas jail because of a prior criminal record
that included a conviction for aggravated
stalking. Although he had served out his sentence
for this crime, such a serious criminal record
gave the Immigration and Naturalization Service
the right to take him into custody while they
determined whether he should be allowed to remain
in the United States.
Sotolusson is a native of Cuba. Joseph Coppola,
his cellmate at the North Las Vegas jail, then
accused Sotolusson of sexual assault, and an
investigation into the allegations led to the
collection of DNA samples from both men. The
samples were taken to the police crime lab, where
Flynn said the labels on the two men's DNA were
accidentally switched. Police then ran the two
DNA profiles through a lab computer, and the
computer matched Sotolusson's mislabeled DNA with
DNA evidence gathered from two previously
unsolved rapes in the valley. Unaware of the
mistake, prosecutors proceeded to charge
Sotolusson with both of the unsolved rapes and
also with the sexual assault on Coppola. But in
April all charges against Sotolusson were dropped
after a DNA expert hired by Hoffman and Deputy
Public Defender Darin Imlay discovered the
clerical mistake. After the error was confirmed,
it was learned that forensics lab safeguards
aimed at catching such mistakes had failed: two
police lab employees reviewed the findings and
did not detect the error.
4Although
human error is impossible to eliminate,
consideration of all the evidence in a case may
pinpoint those instances where accidental or
intentional errors have occurred.
5Interpretation of genotypes
The data collection process leaves the analyst
with only a series of peaks in an
electropherogram or bands on a gel. The peak
information (DNA size and quantity) must be
converted into a common language that will allow
data to be compared between laboratories. This
common language is the sample genotype. A locus
genotype is the allele (in a homozygote) or the
alleles (in a heterozygote) present in a sample
for a particular locus, and is normally reported
as the number of repeats present in each allele.
A sample genotype, or STR profile, is produced by
combining all of the locus genotypes into a
single series of numbers. This profile is what is
entered into a case report or a DNA database for
comparison purposes to other samples.
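As a minimal sketch of what such a "common language" can look like in software, the Python below stores each locus genotype as a pair of repeat numbers and joins them into one profile string. The function names and the text layout of the profile are illustrative assumptions, not a format prescribed by any database or kit.

```python
from typing import Dict, Tuple

# One locus genotype: two repeat numbers (a homozygote repeats the same allele).
LocusGenotype = Tuple[int, int]


def locus_genotype(*alleles: int) -> LocusGenotype:
    """Build a sorted locus genotype from one (homozygote) or two alleles."""
    if len(alleles) == 1:
        return (alleles[0], alleles[0])
    if len(alleles) == 2:
        low, high = sorted(alleles)
        return (low, high)
    raise ValueError("a single-source locus genotype has one or two alleles")


def profile_string(profile: Dict[str, LocusGenotype]) -> str:
    """Combine all locus genotypes into a single series of numbers."""
    return "; ".join(
        f"{locus} {a},{b}" for locus, (a, b) in sorted(profile.items())
    )


# Hypothetical two-locus profile (layout of the string is an assumption).
profile = {"TPOX": locus_genotype(8), "CSF1PO": locus_genotype(11, 10)}
print(profile_string(profile))  # CSF1PO 10,11; TPOX 8,8
```

Sorting the alleles and the loci means two laboratories that observed the same alleles in a different order still produce the same string.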
6STR alleles from the same sample that are
amplified with different primer sets or analyzed
by different detection platforms will differ in
size. However, by using locus-specific allelic
ladders, allele peak sizes may be accurately
converted into genotypes. These genotypes then
provide the universal language for comparing STR
profiles.
7Genotyping process
The multiplex STR kits in use today take
advantage of multiple fluorescent dyes that can
be spectrally resolved. The various dye colors
are separated and the peaks representing DNA
fragments are identified and associated with the
appropriate color. The DNA fragments are then
sized by comparison to an internal sizing
standard. Finally, the PCR product sizes for the
questioned sample are correlated to an allelic
ladder that has been sized in a similar fashion
with internal standards. The allelic ladder
contains alleles of known repeat content and is
used much like a measuring ruler to correlate the
PCR product sizes to the number of repeat units
present for a particular STR locus. From this
comparison of the unknown sample with the known
allelic ladder, the genotype of the unknown
sample is determined.
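The ruler analogy can be sketched in a few lines of Python. The ladder sizes and the 0.5 bp matching window below are hypothetical values chosen for illustration; a laboratory would take both from its own ladder run and validation data.

```python
from typing import Dict, Optional


def call_allele(size_bp: float, ladder: Dict[int, float],
                window_bp: float = 0.5) -> Optional[int]:
    """Return the repeat number of the nearest ladder allele.

    Returns None when no ladder allele lies within window_bp of the
    measured size, which this sketch treats as an off-ladder peak.
    """
    repeats, ladder_size = min(
        ladder.items(), key=lambda item: abs(item[1] - size_bp)
    )
    if abs(ladder_size - size_bp) <= window_bp:
        return repeats
    return None


# Hypothetical ladder: repeat number -> size measured in the same run.
ladder = {9: 300.1, 10: 304.2, 11: 308.3, 12: 312.4}

print(call_allele(304.35, ladder))  # 10
print(call_allele(306.2, ladder))   # None
```

Because the ladder is sized against the same internal standard as the sample, the comparison works even though the absolute sizes differ between primer sets and platforms.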
8Sizing DNA fragments
DNA fragments represented by peaks in capillary
electropherograms or bands on a gel can be sized
relative to an internal size standard that is
mixed with the DNA samples. The internal size
standard is typically labeled with a different
colored dye so that it can be spectrally
distinguished from the DNA fragment of unknown
size. The GS500-ROX size standard contains 16
DNA fragments, ranging in size from 35 bp to 500
bp, that have been labeled with the red
fluorescent dye ROX.
9DNA fragment analysis and Genotyping software
Fairly sophisticated software has been developed
to take sample electrophoretic data rapidly
through the genotyping process just described.
For ABI users, this is done in two steps by two
different software programs. GeneScan software
is used to spectrally resolve the dye colors for
each peak and to size the DNA fragments in each
sample. The resulting electropherograms are then
imported into the second software
program, Genotyper, which determines each sample's
genotype by comparing the sizes of alleles
observed in a standard allelic ladder sample to
those obtained at each locus tested in the DNA
sample.
10Manual intervention in STR Genotype
determinations
While STR allele calls may be made in an
automated fashion with either Genotyper or STaR
Call™, the resulting genotype information needs
to be examined manually by experienced analysts.
Data analysis and review are essential for
confirming STR results prior to making
reports. Software algorithms follow set
parameters and criteria and hence can never be as
effective at making difficult calls as a trained
examiner. Strict guidelines for data
interpretation should be in place to avoid
problems with individual bias when the data are
reviewed. However, there is always enough
variation between data sets that not every
situation can be covered by a predetermined rule.
11Laboratories typically have two independent reads
of the data by different operators. The
genotypes must agree with each other before
results will be reported or passed on for
uploading to a DNA database. Likewise, a match
between two samples is only reported if the two
DNA profiles display the same patterns.
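A concordance check between two independent reads could look like the following sketch. The profile layout (locus name mapped to a pair of repeat numbers) and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, Tuple[int, int]]


def discordant_loci(read_a: Profile, read_b: Profile) -> List[str]:
    """List loci where the two reads differ or where only one read has a call."""
    loci = sorted(set(read_a) | set(read_b))
    return [locus for locus in loci if read_a.get(locus) != read_b.get(locus)]


def ready_to_report(read_a: Profile, read_b: Profile) -> bool:
    """Genotypes are passed on only when both reads agree at every locus."""
    return not discordant_loci(read_a, read_b)


# Hypothetical reads from two operators.
first_read = {"CSF1PO": (10, 11), "TPOX": (8, 8)}
second_read = {"CSF1PO": (10, 11), "TPOX": (8, 9)}

print(discordant_loci(first_read, second_read))  # ['TPOX']
print(ready_to_report(first_read, second_read))  # False
```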
12Factors affecting genotyping results
There are a number of issues that are important
to obtaining accurate genotype results. Some
issues are biology related and some are
technology related. For example, the amount of
stutter or incomplete 3' nucleotide addition
present are biology issues related to the amount
of DNA template used in the PCR amplification. On
the other hand, pull-up artifacts and threshold
issues result from the fluorescent technology and
software used for genotyping the samples. Three
parts of the genotyping process are crucial to
the success of genotyping samples. These include
the matrix file, the internal size standard, and
the allelic ladder sample.
13The matrix file
The matrix file is critical for proper color
separation in an electropherogram. If the
observed peaks are not associated with the proper
dye label, then the sample genotype cannot be
correctly determined. Matrix files are
established by running samples that contain each
of the dyes individually. The results of the
individual dye runs are combined to form a
mathematical matrix that is used to subtract the
contribution of other colors in the overlapping
spectra. A matrix is most accurate under
consistent environmental conditions. Thus, if the
electrophoresis buffer is changed, a new matrix
should be established in order to obtain the most
accurate color deconvolution between the
different dyes.
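The subtraction of spectral overlap can be illustrated with a deliberately small two-dye example. Real kits use four or five dyes and the instrument software handles the full matrix; the overlap fractions below are hypothetical.

```python
from typing import Sequence, Tuple


def deconvolve_two_dyes(observed: Tuple[float, float],
                        matrix: Sequence[Sequence[float]]) -> Tuple[float, float]:
    """Recover the signal of each dye from two detection channels.

    matrix[i][j] is the fraction of dye j's signal seen in channel i, so
    observed = matrix * true_signal. The 2x2 system is solved directly.
    """
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix cannot separate the two dyes")
    ch1, ch2 = observed
    dye1 = (ch1 * d - b * ch2) / det
    dye2 = (a * ch2 - c * ch1) / det
    return dye1, dye2


# Hypothetical matrix: 10% of dye 1 bleeds into channel 2,
# 20% of dye 2 bleeds into channel 1.
matrix = [[1.0, 0.2],
          [0.1, 1.0]]

# A pure dye-1 peak of 1000 units also shows 100 units in channel 2.
dye1, dye2 = deconvolve_two_dyes((1000.0, 100.0), matrix)
```

With the correct matrix, the 100 units in the second channel are assigned back to dye 1 and the dye-2 signal comes out as essentially zero. With a stale matrix they would remain as a spurious peak in the wrong color.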
14The internal size standard
The internal size standard is necessary for the
proper sizing of DNA fragment peaks detected in
an electropherogram. If any of the peaks in the
size standard are below the peak detection
threshold established in the data collection and
analysis software, then the sizing algorithms
will not work properly and STR alleles may be
sized incorrectly. An analyst should check to
make sure that the internal size standard peaks
were all detected properly before proceeding to
genotype the STR alleles in a sample.
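The check described here is simple enough to automate before any allele is called. In this sketch the expected fragment sizes, the detected peak heights, and the 50 RFU threshold are all hypothetical inputs.

```python
from typing import Dict, List, Sequence


def undetected_standard_peaks(expected_sizes_bp: Sequence[int],
                              detected_heights_rfu: Dict[int, float],
                              threshold_rfu: float) -> List[int]:
    """Return size-standard fragments that are missing or below threshold."""
    return [
        size for size in expected_sizes_bp
        if detected_heights_rfu.get(size, 0.0) < threshold_rfu
    ]


# Hypothetical run: the 139 bp peak is weak and the 150 bp peak is absent.
expected = [100, 139, 150, 160]
detected = {100: 400.0, 139: 30.0, 160: 380.0}

print(undetected_standard_peaks(expected, detected, threshold_rfu=50.0))
# [139, 150]
```

An empty list is the signal that sizing can proceed; anything else suggests re-injecting the sample or adjusting the analysis settings first.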
15The allelic ladder
The allelic ladder is the standard to which STR
alleles are compared to obtain the sample
genotype. The alleles in an allelic ladder need
to be resolved from one another and above the
peak detection threshold of the data collection
and analysis software in order to correctly call
STR alleles in unknown samples. The sizes
obtained for each allele in the allelic ladder
are used to make the final genotype
determination in the unknown samples. Therefore,
they must be determined correctly.
16Sizing algorithm issues
The most common algorithm used for determining
the DNA fragment size is known as the local
Southern method. This method uses the size of
two peaks on either side of the unknown one being
measured in order to make the calculations. For
example, a 165.05 bp peak size is determined
with local Southern sizing from the positions of
the 150 bp and 160 bp peaks on the lower side and
the 200 bp and 250 bp peaks on the
upper side. The local Southern method works
very well for accurate sizing of DNA fragments
over the 100-450 bp size range necessary for STR
alleles.
Butler, 2002
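The local Southern calculation can be sketched as follows. The sketch assumes the usual reciprocal relationship between fragment size L and migration position m, L = c / (m - m0) + L0, fitted once through the two standard peaks below and one above the unknown, once through one below and two above, with the two size estimates averaged. The function names are illustrative, and vendor software may differ in detail.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (migration position, known size in bp)


def fit_reciprocal_curve(points: Sequence[Point]) -> Tuple[float, float, float]:
    """Fit L = c / (m - m0) + L0 exactly through three (m, L) points."""
    (m1, l1), (m2, l2), (m3, l3) = points
    # Subtracting pairs of (L - L0)(m - m0) = c leaves two linear
    # equations in m0 and L0, solved here with Cramer's rule.
    a1, b1, r1 = l1 - l2, m1 - m2, l1 * m1 - l2 * m2
    a2, b2, r2 = l2 - l3, m2 - m3, l2 * m2 - l3 * m3
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("standard peaks do not define a reciprocal curve")
    m0 = (r1 * b2 - r2 * b1) / det
    l0 = (a1 * r2 - a2 * r1) / det
    c = (l1 - l0) * (m1 - m0)
    return c, m0, l0


def local_southern_size(standard: List[Point], position: float) -> float:
    """Size an unknown peak from the two standard peaks on each side of it."""
    standard = sorted(standard)
    i = bisect_right([m for m, _ in standard], position)
    if i < 2 or i > len(standard) - 2:
        raise ValueError("need two size-standard peaks on each side")
    estimates = []
    for window in (standard[i - 2:i + 1], standard[i - 1:i + 2]):
        c, m0, l0 = fit_reciprocal_curve(window)
        estimates.append(c / (position - m0) + l0)
    return sum(estimates) / 2
```

For the example in the text, `standard` would hold the migration positions of the 150, 160, 200, and 250 bp peaks, and `position` would be where the unknown peak was detected. The `ValueError` corresponds to the case where the unknown lies too close to the edge of the size standard.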
17However, there are some caveats that should be
kept in mind that depend upon the internal size
standard used. DNA fragment peaks that are
larger than the internal sizing standard cannot
be accurately determined. Nor can peaks that
fall near the edge of the region defined by the
internal sizing standard, because two
peaks from the size standard are needed on either
side of the unknown peak. For the GS500-ROX
internal standard commonly used with the AmpFlSTR
kits, any unknown peaks falling above 490 bp or
below 50 bp will not be sized with the local
Southern method. Likewise, if the signal
intensity for any of the calibration peaks in the
internal sizing standards is too weak, then
unknown peaks in that region will not be sized
accurately.
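The 50 bp and 490 bp limits follow directly from the "two peaks on either side" requirement: the usable range runs from the second-smallest to the second-largest fragment in the standard. The sketch below makes that explicit. The fragment list is an assumption that should be checked against the manufacturer's documentation, although it is consistent with the 16 fragments from 35 bp to 500 bp described earlier.

```python
from typing import Sequence, Tuple

# Assumed GS500 fragment sizes in bp (verify against the product insert).
GS500_SIZES_BP = [35, 50, 75, 100, 139, 150, 160, 200,
                  250, 300, 340, 350, 400, 450, 490, 500]


def sizable_range(standard_sizes_bp: Sequence[int]) -> Tuple[int, int]:
    """Return the size range with two standard peaks available on each side."""
    ordered = sorted(standard_sizes_bp)
    if len(ordered) < 4:
        raise ValueError("local Southern sizing needs at least four peaks")
    return ordered[1], ordered[-2]


def can_be_sized(size_bp: float, standard_sizes_bp: Sequence[int]) -> bool:
    """True when a fragment of this size falls inside the usable range."""
    low, high = sizable_range(standard_sizes_bp)
    return low <= size_bp <= high


print(sizable_range(GS500_SIZES_BP))  # (50, 490)
```

In practice the software decides this from migration position rather than from the (still unknown) size, so the function is only a convenient way to reason about which alleles a given standard can cover.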
18Partial STR profiles
If the genomic DNA in a sample is severely
degraded or PCR inhibitors are present, only a
partial STR profile may be obtained. Usually
the larger STR loci in a multiplex reaction, such
as D18S51 and FGA, will be the first to fail on a
degraded DNA sample. When only a partial
profile is obtained, the significance of a match
will be lower because there are fewer loci to
compare.
19Mixture interpretation
Mixtures of DNA from two or more individuals are
common in some forensic cases and must be dealt
with in the interpretation of the DNA profiles.
In evaluating the evidence, an analyst must
decide whether the source of the DNA in the
questioned sample is from a single individual or
more than one person. This may be accomplished
by examination of the number of alleles detected
at each locus as well as peak height ratios and/or
band intensities on a gel. Occasionally extra
peaks occur in the data that should not be
confused with true alleles.
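The two indicators mentioned here, allele count and peak height ratio, can be screened for with a short sketch like the one below. It assumes artifact peaks have already been removed, and the 0.6 ratio threshold is a hypothetical value; each laboratory would take its own from validation data.

```python
from typing import Dict


def mixture_indicators(peaks_by_locus: Dict[str, Dict[float, float]],
                       min_height_ratio: float = 0.6) -> Dict[str, str]:
    """Flag loci that suggest more than one contributor.

    peaks_by_locus maps locus -> {allele: peak height}. A locus is flagged
    when it shows more than two alleles, or when two alleles have a
    smaller/larger height ratio below min_height_ratio.
    """
    flags = {}
    for locus, peaks in peaks_by_locus.items():
        heights = sorted(peaks.values())
        if len(heights) > 2:
            flags[locus] = "more than two alleles"
        elif len(heights) == 2 and heights[0] / heights[1] < min_height_ratio:
            flags[locus] = "peak height imbalance"
    return flags


# Hypothetical sample.
sample = {
    "D18S51": {14: 900.0, 15: 850.0, 17: 300.0},
    "TH01": {6: 1000.0, 9.3: 400.0},
    "TPOX": {8: 1200.0},
}
flags = mixture_indicators(sample)
```

A flag is a prompt for the analyst to look more closely, not a conclusion: imbalance at a single locus can also come from degradation or a primer-site mutation.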
20Extra peaks observed
- Stutter peaks
- Adenylation
- Pull-up peaks
- Dye blobs
- Spikes
- Sample contaminants
21Developing an interpretation strategy
- A forensic DNA laboratory should develop its own
STR interpretation guidelines based upon its
own validation studies and results reported in
the literature.
- Practical experience with instrumentation and
results from performing casework are also
important factors in developing an interpretation
strategy.
- Conduct necessary validation studies and gain
experience in your laboratory.
- Utilize analysts' experience.
- Use literature references as a resource in
understanding if an off-ladder allele has been
observed before.
- Validation studies will define observed stutter
ratios for each locus, establish minimum peak
heights, and define heterozygous peak ratios
within a locus.
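Once validation has produced a minimum peak height and a stutter ratio for a locus, applying them can be sketched as below. The 150 RFU and 0.15 values are placeholders, not recommended thresholds, and the sketch only considers stutter one repeat shorter than the parent peak.

```python
from typing import Dict


def filter_peaks(peaks: Dict[float, float],
                 min_height_rfu: float = 150.0,
                 max_stutter_ratio: float = 0.15) -> Dict[float, float]:
    """Drop peaks below the height threshold or consistent with stutter.

    peaks maps allele (repeat number) -> height for one locus. A peak is
    treated as stutter when a peak one repeat longer exists and this
    peak's height is at most max_stutter_ratio times that parent height.
    """
    kept = {}
    for allele, height in peaks.items():
        if height < min_height_rfu:
            continue
        parent_height = peaks.get(allele + 1)
        if parent_height is not None and height <= max_stutter_ratio * parent_height:
            continue
        kept[allele] = height
    return kept


# Hypothetical locus: 10 is stutter of 11, 12 is below threshold.
raw_peaks = {10: 160.0, 11: 2000.0, 12: 100.0, 13: 1800.0}
print(filter_peaks(raw_peaks))  # {11: 2000.0, 13: 1800.0}
```

Fixed rules like these are exactly what the earlier slide warns cannot cover every case, so the output is a starting point for the analyst's review.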
22When in doubt about a sample's correct result,
the sample should be re-tested. This may be as
simple as re-injecting it on the ABI or putting
another aliquot of the sample on the next gel.
Even if sample re-testing involves
re-extracting and/or re-amplifying the
problem sample, it is worthwhile in order to
obtain an accurate result.
23A match or not a match? That is the question
24Generally, the process of comparing two or more
samples is limited to one of three possible
outcomes that are submitted in a case
report:
- Match: peaks between the compared STR
profiles have the same genotypes and no
unexplainable differences exist between the
samples. Statistical evaluation of the
significance of the match is usually reported
with the match report.
- Exclusion: the genotype
comparison shows profile differences that can
only be explained by the two samples originating
from different sources.
- Inconclusive: the data
do not support a conclusion as to whether the
profiles match. This finding might be reported if
two analysts remain in disagreement after review
and discussion of the data, and it is felt that
insufficient information exists to support any
conclusion.
25In forensic DNA typing, if any one STR locus fails
to match when comparing the genotypes of two
or more samples, then the
questioned and reference samples will be declared
a non-match, regardless of how many other loci
match. Paternity testing is an exception to
this because of the possibility of mutational
events. When analyzing and reporting the results
of parentage cases, an allowance for one possible
mutation is often made. In other words, if 13 loci
are used and the questioned parentage is included
for all but one locus, the data from the
non-inclusive allele will be attributed to a
possible mutation. Interpretation of results
in forensic casework is a matter of professional
judgment and expertise. Interpretation of
results within the context of a case is the
responsibility of the case analyst with
supervisors or technical leaders conducting a
follow-up verification of the analyst's
interpretation of the data as part of the
technical review process.
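The difference between the forensic rule and the parentage allowance can be sketched as two small functions. Both compare only loci typed in both profiles; returning "inconclusive" when no loci are shared is an assumption of this sketch, as are the function names.

```python
from typing import Dict, List, Tuple

Profile = Dict[str, Tuple[int, int]]


def forensic_comparison(questioned: Profile,
                        reference: Profile) -> Tuple[str, List[str]]:
    """Any single non-matching locus makes the comparison an exclusion."""
    shared = sorted(set(questioned) & set(reference))
    if not shared:
        return "inconclusive", []
    mismatches = [
        locus for locus in shared
        if sorted(questioned[locus]) != sorted(reference[locus])
    ]
    return ("exclusion" if mismatches else "match"), mismatches


def parentage_comparison(child: Profile, alleged_parent: Profile,
                         allowed_mutations: int = 1) -> Tuple[bool, List[str]]:
    """Include the alleged parent if at most allowed_mutations loci share no allele."""
    shared = sorted(set(child) & set(alleged_parent))
    non_inclusive = [
        locus for locus in shared
        if not set(child[locus]) & set(alleged_parent[locus])
    ]
    return len(non_inclusive) <= allowed_mutations, non_inclusive


# Hypothetical profiles.
evidence = {"CSF1PO": (10, 11), "TPOX": (8, 8)}
suspect = {"CSF1PO": (10, 12), "TPOX": (8, 8)}
print(forensic_comparison(evidence, suspect))  # ('exclusion', ['CSF1PO'])
```

The functions only report what the genotypes show. Deciding what that means within a case remains the analyst's judgment, as described above.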
26When coming to a final conclusion regarding a
match or an exclusion between two or more DNA
profiles, laboratory interpretation guidelines
should be adhered to by both the case analyst and
the supervisor.
However, as experience using various analytical
procedures grows, interpretation guidelines may
evolve and improve. These guidelines should
always be based on the use of proper controls and
validated methods. A typical DNA forensics case
involves comparing the DNA profile from an
evidence sample to a profile derived from a
suspect. There are three possible outcomes of
this comparison: the profiles match, the profiles
do not match, or the data are inconclusive. If
the suspect and evidence samples do not match, it
can be concluded that the suspect was not the
source of the crime scene sample; the result is
therefore an exclusion.
27 If the DNA profiles from the suspect and
evidence samples are indistinguishable, the
profiles are said to match. In this case, the
samples either came from the same person or came
from two different people who simply share the
same DNA profiles by chance. In order to
present the DNA evidence in such a way as to
convey its significance, it is necessary to
estimate the probability that the two profiles
are a random match.
28DNA profile probabilities
There are several different ways in which match
probabilities are calculated. However, the
simplest is termed the profile probability
method. The profile probability is the
probability that a person chosen at random from a
population would have the same DNA profile as the
evidence or suspect samples. The following is
an example of how to calculate a profile
probability. This example involves calculating
the profile probability at five STR loci as set
out in the following table.
29Profile Frequency 0.00002 (1 in 50,000)
30The CSF1PO locus
In this profile of five STR loci, the person
exhibited two different-sized alleles at the
CSF1PO locus (alleles of 10 and 11 repeats). In
the Caucasian American database of 430 alleles
(that is, two alleles from each of 215 people
sampled), these alleles were observed 108 and 133
times, respectively. Therefore, the frequency
of observing these alleles at random in the
Caucasian American population would be 0.25 (the
p frequency) and 0.31 (the q frequency),
respectively. The person who contributed this
DNA profile can be assumed to have received each
of his or her CSF1PO alleles at random from each
parent. In other words, the probability of
receiving allele 10 from the mother and allele 11
from the father is pq.
31Similarly, the probability of receiving allele 11
from the mother and allele 10 from the father is
also pq. Therefore, the total probability of
receiving a 10,11 genotype by chance is 2pq. In
this case, 2pq is about 16%. One can see from
this example that DNA profiling at one locus is
not very discriminating, as about 16% of the
population would share this 10,11 DNA profile
just by chance. The strength of a profile match,
however, increases as one adds more loci to the
analysis.
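The CSF1PO arithmetic can be reproduced in a few lines of Python, using only the counts quoted above (108 and 133 observations among 430 alleles). The function names are illustrative.

```python
def allele_frequency(times_observed: int, alleles_in_database: int) -> float:
    """Frequency of one allele in the population database."""
    return times_observed / alleles_in_database


def heterozygote_frequency(p: float, q: float) -> float:
    """Expected frequency of a genotype with two different alleles (2pq)."""
    return 2 * p * q


p = allele_frequency(108, 430)  # allele 10
q = allele_frequency(133, 430)  # allele 11
genotype_10_11 = heterozygote_frequency(p, q)

print(f"p = {p:.2f}, q = {q:.2f}, 2pq = {genotype_10_11:.3f}")
# p = 0.25, q = 0.31, 2pq = 0.155
```

The result, roughly 0.155, is the "about 16%" quoted in the text.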
32Profile Frequency 0.00002 (1 in 50,000)
33The TPOX locus
This person exhibited two identical TPOX alleles
(of 8 repeats) and is therefore homozygous at the
TPOX locus. The combined probability of
inheriting the 8 allele from each parent is
p × p = p², and the frequency with which one
would observe this genotype in a Caucasian
American population would be about 28%. The
probability that a person would have a combined
TPOX 8,8/CSF1PO 10,11 genotype would be 28% of
16%, or about 4%.
34The TH01, vWA, and D5S818 loci
The probability calculations for the remaining
loci follow the same approach. By multiplying
all the genotype probabilities at the five loci,
one obtains an overall profile probability of
0.00002 or a one in 50,000 chance that a person
chosen at random from that population would show
the same DNA profile. The method of multiplying
all the frequencies of genotypes at each locus is
sometimes called the product rule. It is the most
frequently used method of DNA profile
interpretation, and is widely accepted in U.S.
courts.
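A minimal sketch of the product rule follows. It uses only the two genotype frequencies stated in the text (about 16% for CSF1PO 10,11 and about 28% for TPOX 8,8); the frequencies for TH01, vWA, and D5S818 are not given here, so they are not filled in. `math.prod` requires Python 3.8 or later.

```python
import math
from typing import Iterable


def homozygote_frequency(p: float) -> float:
    """Expected frequency of a genotype with two identical alleles (p squared)."""
    return p * p


def profile_probability(genotype_frequencies: Iterable[float]) -> float:
    """Product rule: multiply the genotype frequencies across loci."""
    return math.prod(genotype_frequencies)


def one_in(probability: float) -> int:
    """Express a probability as 'one in N'."""
    return round(1 / probability)


# Two-locus combination from the text: CSF1PO 10,11 and TPOX 8,8.
two_locus = profile_probability([0.16, 0.28])
print(f"{two_locus:.4f}")  # 0.0448
print(one_in(0.00002))     # 50000
```

Each additional locus multiplies in another factor smaller than one, which is why the five-locus profile reaches one in 50,000 while a single locus is shared by a sizeable fraction of the population.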
35Is a person's DNA profile unique?
36At present, the FBI uses 13 core STR loci in its
profiles. The expected genotype frequency of
the most common 13-locus profile would be less
than one in 10 billion, and depends slightly upon
allele frequencies in different populations.
Although these numbers would strongly suggest
that two matching profiles came from the same
person, they cannot rule out the possibility of a
random match. As one increases the number of
loci analyzed in a DNA profile, the probability
of a random match in the population becomes
smaller. If enough loci were analyzed, one might
be certain that the DNA profile is unique. The
FBI's policy is that if the match probability is
much lower than one in 290 million (US
population) then it can be said with reasonable
certainty that the DNA profile is unique to one
individual.
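One way to see why the population size is the natural yardstick is to multiply the match probability by the number of people. The sketch below does this for the figures quoted above. It is a simple illustration that assumes unrelated individuals, not a description of how the FBI applies its policy.

```python
def expected_matches(match_probability: float, population_size: int) -> float:
    """Expected number of unrelated people who would share the profile."""
    return match_probability * population_size


US_POPULATION = 290_000_000             # figure quoted in the text
one_in_ten_billion = 1 / 10_000_000_000

expected = expected_matches(one_in_ten_billion, US_POPULATION)
print(f"{expected:.3f}")  # 0.029
```

An expected count far below one means a second, unrelated person with the same profile is unlikely to exist in that population, although, as noted above, it does not make a random match impossible.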
37Several situations exist that modify the profile
probability calculations and the interpretation
of matching profiles. Because they developed
from a single fertilized egg, identical twins
have identical DNA. Therefore, their DNA profiles
will be identical. The frequency of identical
twins is about one in every 250 births. Because
they share parents, siblings often share alleles
at any locus. About a quarter of the time,
siblings will share both alleles at a particular
locus. About half the time, they will share one
allele at a locus. Using a DNA profile of 13 core
STR loci, the profile probability is about
100,000 times greater if the DNA samples come
from siblings than if they come from two
unrelated persons. For an example of siblings
that matched at a large number of loci, see
"Polish Dragnet Apprehends Serial Rapist".
38A parent and a child will always share one allele
at a locus, but they will not usually share two
alleles. Other relatives may share a single
allele at a locus, but rarely will they share
both alleles at a locus. The allele frequencies
and probability calculations described above are
based on the assumption that the population in
question is large, with little interrelatedness
or inbreeding. For populations that do not meet
these assumptions, profile probabilities must be
adjusted to reflect certain degrees of
interrelatedness.
39If a defendant's profile matches that of the
crime scene sample, does that prove the
defendant's guilt?
40It is important to remember that a match between
a crime scene DNA profile and a suspect's profile
does not necessarily prove guilt, in the absence
of other evidence. Human error or contamination
may contribute to a match between a profile from
a crime scene sample and a profile from an
innocent person. In addition, a suspect's DNA
may be introduced to a crime scene before, during,
or after the crime for reasons unrelated to the
suspect's involvement in the crime. Also, DNA
may be introduced to a crime scene by inadvertent
or deliberate tampering. Conversely, a DNA
profile exclusion does not necessarily mean
innocence. In a rape case, for example, a suspect
may not contribute the semen sample, but may have
been involved in the crime by restraining the
victim. Once again, DNA profiles must always be
interpreted in the context of all available
evidence.