Title: Molecular Epidemiology
1Molecular Epidemiology
2- This is the principal technique of scientific
inquiry: by changing the scale of description, we
move from unpredictable, unrepeatable individual
cases to collections of cases whose behavior is
regular enough to allow generalizations to be
made. (S. Levin, 1947)
3Epidemiology
- Originally: study (-ology) upon (epi-) populations of people (demes).
- Now much broader.
- Inquiry into events that take place over very different temporal scales: from identification of organisms that diverged millions of years ago to the tracing of contacts.
4On a Large Scale
Identity of an infectious agent in an outbreak
5Ribosomal RNA
- Coding regions: highly conserved across widely divergent species.
- Transcribed spacer regions: less conserved; differ between closely related species.
- Non-transcribed spacer regions: vary between and within species.
7Where is it?
8What is it?
- Microscopy
Figure 1. Oocysts of a Cyclospora species (Panel A), Cryptosporidium muris (Panel B), and C. parvum (Panel C) (modified acid-fast stain).
- Molecular methods
DNA sequencing: use moderately variable regions, such as the transcribed spacer.
9Cyclospora ITS-1
- 1. Use conserved primers from the flanking (coding) regions to amplify and sequence the ITS.
- 2. Design primers common to all Cyclospora isolates.
- 3. Test sensitivity and specificity of primers.
Results: amplified 36 C. cayetanensis isolates from around the world. Did NOT amplify 20 species with similar pathology, among them Cryptosporidia. Faint band from Babesia gibsoni.
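The sensitivity and specificity test in step 3 can be expressed as simple proportions. The sketch below is illustrative only. It assumes the 36 C. cayetanensis isolates were all of the target isolates tested, and it reports specificity both with and without counting the faint Babesia gibsoni band as a false positive (assuming B. gibsoni is one of the 20 non-target species, which the slide does not state). The function names are hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of target samples that the primers amplified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-target samples that the primers did not amplify."""
    return true_neg / (true_neg + false_pos)


# Assumption: all 36 C. cayetanensis isolates tested gave a product.
sens = sensitivity(true_pos=36, false_neg=0)

# Assumption: B. gibsoni is one of the 20 non-target species.
spec_strict = specificity(true_neg=19, false_pos=1)   # faint band counted as a positive
spec_lenient = specificity(true_neg=20, false_pos=0)  # faint band ignored

print(f"sensitivity = {sens:.2f}")
print(f"specificity = {spec_strict:.2f} (strict) or {spec_lenient:.2f} (lenient)")
```

How a faint band is scored is itself a stringency decision, which is why both figures are shown.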
10Assumptions
- False positives: less stringent PCR conditions.
- False negatives: overly stringent conditions, combined with unforeseen mutations in primer regions.
11Zooming in
- The way to study events on a large scale may not be the way to study events on a small scale (think physics).
- What is TRUE on one scale may not be true on another scale.
12On a Smaller Scale
Strains. Transmission cycles.
13GIARDIA
Giardia has Two Heads!
14Mycobacterium tuberculosis
- According to the WHO:
- 2 billion infected
- 1/10 will become sick
- 2.7 million die each year
- TB is the largest single-agent killer of women and of the young.
21Mycobacterium tuberculosis
- What is the frequency of exogenous re-infection? With MDR-TB?
- What are the transmission dynamics in endemic countries?
22Methods to differentiate strains
- Isoenzymes/allozymes: older methods.
- RFLP
- RAPD/AP-PCR
- AFLPs
- Sequence surrogates: report nucleotide changes indirectly.
23Isoenzymes
- Isoenzymes/allozymes: electrophoresis to determine differences in enzymes. Allozymes detect differences between alleles of a given enzyme. Very weak.
- Detect 60% of change, and only at enzyme loci.
- Giardia divided into 2 clades: evidence for zoonosis.
24RFLP
- Restriction fragment length polymorphism.
- Usually a true sequence surrogate: a difference in RFLP pattern is ideally due to a change in the nucleotide sequence at one or many restriction sites.
- RFLPs are highly dependent on experimental conditions.
25GIARDIA RFLP of Intergenic rRNA Spacer (IGS)
RFLP of the IGS locus differentiates four strains, compared to two identified by isoenzyme analysis.
26TB-RFLP with Insertion Sequences
- IS6110 fingerprinting: use alu to digest the genome. Little variation in RFLP. The question is, in which fragments is the insertion element present?
- IS6110 is a transposon that jumps around the genome.
- IS6110 is not purely a sequence surrogate; it is also a transposon surrogate.
27IS6110
- The ruler is ALIVE
- It is dynamic, and reaches equilibrium more slowly than TB in an outbreak.
28IS6110
- The number of IS6110 copies in TB genomes varies from 0 to 25.
- When copy number is low (k < 5), there is less change in fingerprints, so contact investigation is very hard.
29RAPD or AP-PCR
- RAPD/AP-PCR: amplify with random primers.
- Sequence surrogate: tests whether there is a change in the template regions only. Analysis is the same as that for RFLP.
- Low-stringency cycles lead to amplification of contaminants.
- Highly dependent on reaction conditions.
- Groupings correspond to isozymes.
30AFLPs
- AFLPs: digest DNA, ligate to adaptors, PCR.
- Don't need low-stringency steps; less non-specific amplification.
- Same analysis as RFLPs; need 0.2 to 1 µg of DNA.
- No good for Giardia and other parasites: need too much DNA.
31Smaller Still
Identifying clonal lineages. Tracking transmission.
32Methods
- Minisatellites
- Microsatellites
- IGS: rDNA intergenic spacer
33Microsatellites
- Simple Sequence Repeats
- Repeating motifs of 2-5 bp
- Scattered throughout the genome
- Amenable to PCR and cloning due to small allele
size.
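To make the definition concrete, here is a minimal sketch that scans a sequence for tandem repeats of 2-5 bp motifs. The function name, the minimum of four repeats, and the example sequence are all assumptions made for illustration; they do not come from the slides, and real microsatellite discovery tools apply further filters.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 4):
    """Return (start, motif, repeat_count) for tandem repeats of 2-5 bp motifs."""
    # The lazy {2,5}? prefers the shortest motif, so ACACACAC is reported as (AC)n.
    pattern = re.compile(r"([ACGT]{2,5}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


example = "TTG" + "AC" * 6 + "AGG"   # made-up sequence containing an (AC)6 repeat
print(find_ssrs(example))
```

Note that this simple pattern also reports long single-base runs as a two-base motif (for example AAAAAAAA as AA repeated), so a stricter tool would screen those out.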
34Minisatellites
- Repeating motifs of 10-100 bp
- Analysed with DNA probes specific for a single locus.
35TB Spoligotyping
- Spacer oligotyping.
- Direct repeat (DR) locus: 36 bp repeats, frequency varies.
- Use primers somewhere in the DR to amplify the non-repetitive spacer sequences (34-41 bp).
- Identify the spacers by hybridization to oligonucleotides of known sequence.
- Need sequence to generate the oligos.
36Depends on
- Dynamics of DR regions.
- Change in sequence in non-repetitive regions.
- DR regions: are they at equilibrium?
- How often do they repeat?
- Not yet known.
37Spoligotypes vs. IS6110
IS6110 copies   IS6110 types   Spoligotypes
1               1              10
2-5             7              8
>5              80             52
- Spoligotyping can identify M. bovis (BCG vaccine).
- Detection and strain differentiation can be done simultaneously, without culture.
38Crossing scales
- DNA sequence of small subunit (SSU) ribosomal RNA (highly conserved) suggests four groups of Giardia. Groups 3 and 4 are only in dogs. 293 bp:
1-------GCG------_G---------T-------C-------------------
2-------ATC-------AC---------G------G-------------------
3-------ATC-------AC---------A------G---------T--------
4-------ATC-------AC---------A------A----------T----A-
- 1 and 2 are mainly in humans, though some dogs have 3. Groups 2, 3 and 4 are nearly identical.
- Is this good evidence against zoonosis?
40Models of Nucleotide Substitution
- On a large scale, we can calculate the rate of substitution, then estimate the likelihood of any given substitution and control for confounders (transition-transversion bias, codon bias, etc.).
- On a small scale, we do not know the rate, the process is nearly random, and confounders may be irrelevant.
41Distributions
BINOMIAL: $\Pr(Y = y) = \frac{n!}{y!\,(n-y)!}\, P^{y} (1-P)^{n-y}$. Mean: $nP$. Variance: $nP(1-P)$.
POISSON: $\Pr(Y = y) = \frac{\mu^{y} e^{-\mu}}{y!}$. Mean and variance: $\mu$.
Central Limit Theorem: large number of events? Normal distribution.
Binomial: coin toss. Poisson: rare events (tossing a 100,000-sided die).
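The die example can be checked numerically. The sketch below compares the binomial probability of seeing one particular face y times with the Poisson probability for the same mean. The choice of 100,000 tosses is an assumption made here so that the mean is 1; the slide gives only the number of sides.

```python
from math import comb, exp, factorial


def binomial_pmf(y: int, n: int, p: float) -> float:
    """Pr(Y = y) for n independent trials with success probability p."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


def poisson_pmf(y: int, mu: float) -> float:
    """Pr(Y = y) for a Poisson variable with mean mu."""
    return mu**y * exp(-mu) / factorial(y)


n = 100_000        # number of tosses (assumed)
p = 1 / 100_000    # chance of one particular face on a 100,000-sided die
mu = n * p         # expected number of times that face appears

for y in range(4):
    print(y, binomial_pmf(y, n, p), poisson_pmf(y, mu))
```

The two columns agree to several decimal places, which is why rare events such as point mutations are usually treated as Poisson rather than binomial.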
42Kimura's 2 parameter
- For instance, as the rates of transition and transversion become small, Kimura's 2-parameter model reduces to a one-parameter model:
- $K = -\frac{1}{2} \ln\left[(1 - 2P - Q)\sqrt{1 - 2Q}\right]$
- $K \approx P + Q$
- where $K$ is the distance per site
- and $P$ and $Q$ are the fractions of sites with transition vs. transversion changes.
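A short numerical check shows how quickly the two-parameter distance approaches P + Q as the fractions shrink. This is a sketch; the function name and the P and Q values are made up for illustration.

```python
from math import log, sqrt


def k2p_distance(P: float, Q: float) -> float:
    """Kimura 2-parameter distance per site.

    P: fraction of sites with transition differences.
    Q: fraction of sites with transversion differences.
    """
    return -0.5 * log((1 - 2 * P - Q) * sqrt(1 - 2 * Q))


for P, Q in [(0.01, 0.005), (0.10, 0.05), (0.20, 0.10)]:   # made-up values
    print(P, Q, round(k2p_distance(P, Q), 4), "vs P + Q =", round(P + Q, 4))
```

For the smallest pair the correction is negligible; for the largest pair the corrected distance is clearly greater than the raw P + Q.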
43How to Analyze RFLP and other sequence surrogates
- Two sources of information: number of bands, and size of each fragment.
- In practice, it can be difficult to score changes in fragment size. Most studies look only at the presence or absence of a certain pattern.
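One common way to score presence or absence is a band-sharing index. The sketch below uses the Nei and Li similarity $F = 2 n_{xy} / (n_x + n_y)$, also known as the Dice coefficient. The slide does not say which index is meant, so treat this as one possible choice; the fragment sizes are invented.

```python
def band_sharing(bands_x: set, bands_y: set) -> float:
    """Nei-Li (Dice) similarity: shared bands counted twice over total bands."""
    shared = len(bands_x & bands_y)
    return 2 * shared / (len(bands_x) + len(bands_y))


isolate_a = {1200, 2300, 3100, 4500}   # hypothetical fragment sizes (bp)
isolate_b = {1200, 2300, 3900, 4500}

print(band_sharing(isolate_a, isolate_b))   # 0.75: three of four bands shared
```

In practice bands are matched within a size tolerance rather than by exact equality, which is exactly the scoring difficulty mentioned above.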
44Nei and Li's model for RFLP
- The expected frequency of restriction sites with r nucleotide pairs depends on the GC content of the genome and the GC content of the restriction site sequence:
- $a = (g/2)^{r_1}\,[(1-g)/2]^{r_2}$
- $g$ = GC content of the genome
- $r_1$, $r_2$ are the numbers of GC and AT pairs in the restriction site; $r_1 + r_2 = r$
45- $m_T$ = number of nucleotide pairs in the genome
- $m_T a = n$, the expected number of restriction sites
- What is the probability that n changes over time t?
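As a worked example of the site-frequency formula, the sketch below computes a and the expected number of sites for a six-base recognition sequence with two GC and four AT positions. The genome size and GC contents are hypothetical round numbers, and the function names are not from the slides.

```python
def site_frequency(g: float, r1: int, r2: int) -> float:
    """a = (g/2)^r1 * ((1-g)/2)^r2 for a site with r1 GC and r2 AT positions."""
    return (g / 2) ** r1 * ((1 - g) / 2) ** r2


def expected_sites(m_t: float, a: float) -> float:
    """n = m_T * a, the expected number of restriction sites in the genome."""
    return m_t * a


m_t = 4_000_000                               # hypothetical genome size (bp)
a = site_frequency(g=0.50, r1=2, r2=4)        # balanced base composition
a_gc = site_frequency(g=0.65, r1=2, r2=4)     # GC-rich genome (assumed value)

print("g = 0.50:", a, expected_sites(m_t, a))
print("g = 0.65:", a_gc, expected_sites(m_t, a_gc))
```

An AT-rich recognition site becomes rarer as genome GC content rises, so the same enzyme yields fewer, larger fragments in a GC-rich genome.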
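46- Mutations are a Poisson process.
- $P = e^{-r \lambda t}$, the probability that a restriction site remains unchanged over time t
- $\lambda$ = mutation rate per nucleotide
- $r$ = length of the restriction sequence
- $t$ = time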
47Nei and Li continued
- $n(t)$ = number of bands at time t = $n_1(t) + n_2(t)$
- $n_1(t)$ = number of sites that do not change
- $n_2(t)$ = number of new sites
- $E(n) = n_0 P + m_T a (1-P)$, or $E(n_1) + E(n_2)$
- Variance: $n_1(t)$ and $n_2(t)$ are independent, so
- $\mathrm{Var}\, n(t) = \mathrm{Var}\, n_1(t) + \mathrm{Var}\, n_2(t)$
- $n_1(t)$ is binomial, $n_2(t)$ is Poisson
- $\mathrm{Var}\, n(t) = n_0 P(1-P) + m_T a (1-P)$
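The mean and variance above can be evaluated directly. This is a minimal sketch with hypothetical parameter values (site length, mutation rate, time, and starting band count are all invented); it only restates the slide's formulas in code.

```python
from math import exp


def unchanged_prob(r: int, lam: float, t: float) -> float:
    """P = exp(-r * lambda * t): probability a site of length r is unchanged."""
    return exp(-r * lam * t)


def band_count_moments(n0: float, m_t: float, a: float, P: float):
    """Return (mean, variance) of the number of bands n(t)."""
    surviving_mean = n0 * P                 # binomial part: original sites retained
    surviving_var = n0 * P * (1 - P)
    new_mean = m_t * a * (1 - P)            # Poisson part: mean equals variance
    return surviving_mean + new_mean, surviving_var + new_mean


# Hypothetical values for illustration only.
P = unchanged_prob(r=6, lam=1e-8, t=1e6)
mean, var = band_count_moments(n0=20, m_t=4_000_000, a=0.25 ** 6, P=P)
print(P, mean, var)
```

Two sanity checks follow from the formulas: at t = 0 (P = 1) the mean is n0 with zero variance, and if n0 already equals m_T a the expected band count does not drift over time.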
48IS6110 is modelled similarly
- Transposition is rare: modeled as a Poisson process.
- Prob. of at least 1 change = $1 - e^{-kqt}$
- where $k$ = number of copies of the transposon in the genome
- and $q$ is the rate of transposition when $k = 1$
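The effect of copy number on fingerprint stability can be seen by evaluating the formula for a few values of k. The rate q and time t below are hypothetical placeholders, not estimates from the slides or the literature.

```python
from math import exp


def prob_fingerprint_change(k: int, q: float, t: float) -> float:
    """Probability of at least one transposition event: 1 - exp(-k*q*t)."""
    return 1 - exp(-k * q * t)


q = 0.01   # hypothetical transposition rate per copy per year
t = 2.0    # hypothetical time span in years

for k in (1, 5, 15, 25):
    print(k, round(prob_fingerprint_change(k, q, t), 3))
```

Under these assumptions a low-copy strain is far less likely to show a changed pattern over the same interval, which is consistent with low-copy fingerprints being less informative for contact tracing.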
49Really Small-New Technology
- Genetic marking of drug resistance or virulence:
- Representational Difference Analysis (RDA)
- High-throughput genotyping
- Microarrays
50Representational Difference Analysis
- "Cloning the Differences Between Two Complex Genomes" (Lisitsyn, Science, Feb 1993).
- Uses subtractive and kinetic enrichment to purify fragments present in one population but absent in another.
- Basically differential amplification of polymorphic fragments.
51High-Throughput Genotyping
- Fluorescent labels incorporated into RAPDs, microsatellites and AFLPs.
- Can run in ONE electrophoresis lane.
- Result: complicated fingerprints that take into account variation at different levels.
52Conclusions
- 1. The strongest analyses will be those that consider variation on multiple temporal levels.
- 2. Everyone says their technique is economically feasible for use in endemic countries; no one says how much their technique costs.
- 3. Stay away from Guatemalan raspberries.