Title: Mapping
1Mapping
- April 7, 2005
- Todd Scheetz
2Introduction
What is mapping? Determining the location of elements within a genome with respect to identifiable landmarks.
Resolution ranges from lowest (present in the genome) to highest (base-pair resolution).
3Introduction
- Earliest examples: noticing that certain traits co-segregate across generations, i.e., they are linked. (Mendel's peas)
- Types of mapping
- genetic mapping
- physical mapping
- restriction mapping
- cytogenetic mapping
- somatic cell mapping
- radiation hybrid mapping
- comparative mapping
4Introduction
Genetic mapping: utilize recombination events to estimate the distance between genetic markers.
Look at a population and estimate the recombination fraction: θ = recombinants / total
Best if measured between close markers.
Unit of distance in genetic maps: centimorgans, cM
1 cM = 1% chance of recombination between markers
5Genetic mapping example I
- Genes on two different chromosomes
- Independent assortment during meiosis
- No linkage
F1 cross; offspring phenotype ratio 9:3:3:1
6Genetic mapping example II
- Genes very close together on same chromosome
- Will usually end up together after meiosis
- Tightly linked
F1 cross; offspring ratio 1:2:1
7Genetic mapping example III
- Genes on same chromosome, but not very close together
- Recombination will occur
- Frequency of recombination proportional to distance between genes
- Measured in centiMorgans
[Figure label: recombinants]
8Genetic Mapping
- Requires informative markers (polymorphic) and a population with known relationships
- RFLP: restriction fragment length polymorphism
- STRP: short tandem repeat polymorphism
- SNP: single nucleotide polymorphism
- Greater polymorphism is desirable.
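[Figure: marker genotypes with alleles A1 and A2]
PIC = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2
where p_i is the frequency of allele i and n is the number of alleles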
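A minimal sketch of the polymorphism information content (PIC) calculation, assuming the standard definition: one minus the sum of squared allele frequencies, minus the sum over allele pairs of 2 p_i^2 p_j^2. The function name `pic` and the example frequencies are illustrative, not from the slides.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared allele frequencies.
    homozygosity = sum(p * p for p in freqs)
    # Sum over unordered allele pairs (i < j) of 2 * p_i^2 * p_j^2.
    pair_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - pair_term


# Hypothetical markers: two equally frequent alleles vs. four equally frequent alleles.
two_allele = pic([0.5, 0.5])
four_allele = pic([0.25, 0.25, 0.25, 0.25])
print(two_allele, four_allele)
```

The four-allele marker scores higher, which is the sense in which greater polymorphism is desirable.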
9RFLP
- Restriction-fragment length polymorphism
- Cut genomic DNA from two individuals with restriction enzyme
- Run Southern blot
- Probe with different pieces of DNA
- Sequence difference creates different band pattern
[Figure: individuals 1 and 2 compared across three GGATCC sites; a sequence change at the middle site in one individual removes that cut, so the 200 and 400 fragments are replaced by a single 600 fragment on the blot.]
10SSLP
- Simple-sequence length polymorphism
- Most genomes contain repeats of three or four nucleotides
- Length of repeat varies
- Use PCR with primers external to the repeat region
- On gel, see difference in length of amplified fragment
1: ATCCTACGACGACGACGATTGATGCT (repeat region of 12 nucleotides)
2: ATCCTACGACGACGACGACGACGATTGATGCT (repeat region of 18 nucleotides)
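A small sketch of why the two alleles separate on a gel, assuming primers that amplify the whole region shown on the slide (flanks ATCCT and ATTGATGCT around an ACG repeat). The helper `amplicon` and the repeat counts 4 and 6 are read off the slide sequences; the real primer positions are not given.

```python
LEFT_FLANK = "ATCCT"       # sequence before the repeat, as on the slide
RIGHT_FLANK = "ATTGATGCT"  # sequence after the repeat, as on the slide
REPEAT_UNIT = "ACG"


def amplicon(repeat_count):
    """Amplified fragment for an allele with the given number of repeat units."""
    return LEFT_FLANK + REPEAT_UNIT * repeat_count + RIGHT_FLANK


allele_1 = amplicon(4)  # 12-nucleotide repeat region
allele_2 = amplicon(6)  # 18-nucleotide repeat region
length_difference = len(allele_2) - len(allele_1)
print(len(allele_1), len(allele_2), length_difference)
```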
11SNP
- Single-nucleotide polymorphism
- One-nucleotide difference in sequence of two organisms
- Found by sequencing
- Example: between any two humans, on average one SNP every 1,000 base pairs
1: ATCGATTGCCATGAC
2: ATCGATGGCCATGAC
SNP at position 7 (T in 1, G in 2)
12Genetic Mapping
[Figure: pedigree with two-locus haplotypes (parental A1 B1 and A2 B2; recombinant A1 B2 and A2 B1). Of seven offspring, five are non-recombinant (NR) and two are recombinant (R).]
θ = recombinants / total = 2/7 = 0.286
13Genetic Mapping
Theta calculation with inbred populations/strains
[Figure: cross between banded (bn) and detached (det) strains]
Offspring phenotype counts:
banded             483
detached           512
banded, detached   2
wild-type          3
θ = recombinants / total = 5/1000 = 0.005
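The two theta calculations above can be sketched as a one-line ratio. The function name is illustrative; the counts are the ones on the slides (2 recombinants among 7 offspring, and 2 + 3 recombinants among 1000 offspring of the banded/detached cross).

```python
def recombination_fraction(recombinants, total):
    """Estimate theta as recombinants / total offspring scored."""
    if total <= 0 or not 0 <= recombinants <= total:
        raise ValueError("need 0 <= recombinants <= total and total > 0")
    return recombinants / total


# Pedigree example: five non-recombinant (NR) and two recombinant (R) offspring.
offspring = ["NR"] * 5 + ["R"] * 2
theta_pedigree = recombination_fraction(offspring.count("R"), len(offspring))

# Inbred cross: the parental classes are banded (483) and detached (512);
# the recombinant classes are banded+detached (2) and wild-type (3).
theta_inbred = recombination_fraction(2 + 3, 483 + 512 + 2 + 3)

print(round(theta_pedigree, 3), theta_inbred)
```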
14Genetic Mapping
- θ has a theoretical maximum of 50%
- independent assortment
- Most accurate when measured between close markers.
- Unit of distance in genetic maps: centimorgans, cM
- d = -0.5 ln(1 - 2θ) (Haldane mapping function)
- d = 0.25 ln[(1 + 2θ)/(1 - 2θ)] (Kosambi mapping function)
- 1 cM = 1% chance of recombination between markers
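A short sketch of the two mapping functions on this slide (commonly attributed to Haldane and Kosambi), assuming d is in Morgans so that multiplying by 100 gives cM. Function names are illustrative; the two theta values are the ones computed in the earlier examples.

```python
import math


def haldane(theta):
    """Map distance in Morgans, d = -0.5 ln(1 - 2*theta)."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must be in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * theta)


def kosambi(theta):
    """Map distance in Morgans, d = 0.25 ln((1 + 2*theta) / (1 - 2*theta))."""
    if not 0.0 <= theta < 0.5:
        raise ValueError("theta must be in [0, 0.5)")
    return 0.25 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))


for theta in (0.005, 0.286):
    print(theta, round(100 * haldane(theta), 2), round(100 * kosambi(theta), 2))
```

For close markers both functions return roughly 100 x theta cM; they diverge as theta approaches 0.5, which is one reason distances are best measured between close markers.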
15Genetic Mapping
16Fundamental Genetics (Background for Linkage Analysis)
- Rule of Segregation
- offspring receive ONE allele (genetic material) from the pair of alleles possessed by EACH parent
- Rule of Independent Assortment
- alleles of one gene can segregate independently of alleles of other genes
- (Linkage Analysis relies on the violation of the Independent Assortment Rule)
17Examples
18Examples
19Example
20BBS4 Pedigree
21Hardy-Weinberg Equilibrium
- Rule that relates allelic and genotypic frequencies in a population of diploid, sexually reproducing individuals if that population has random mating, large size, no mutation or migration, and no selection
- Assumptions
- allelic frequencies will not change in a population from one generation to the next
- genotypic frequencies are determined in a predictable way by allelic frequencies
- the equilibrium is neutral: if perturbed, it will reestablish within one generation of random mating at the new allelic frequency
22(No Transcript)
23H-W
- f(AA) = p^2
- f(Aa) = 2pq
- f(aa) = q^2
- (p + q)^2
- (p^2 + q^2 + r^2 + 2pq + 2pr + 2qr) = (p + q + r)^2
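A minimal sketch of the Hardy-Weinberg expansion for any number of alleles: homozygotes get p_i^2 and heterozygotes get 2 p_i p_j. The function name and the example allele frequencies are hypothetical.

```python
from itertools import combinations_with_replacement


def hardy_weinberg(allele_freqs):
    """Map each unordered genotype to its expected frequency under H-W."""
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        pa, pb = allele_freqs[a], allele_freqs[b]
        # Homozygote: p^2. Heterozygote: 2pq.
        genotypes[(a, b)] = pa * pb if a == b else 2 * pa * pb
    return genotypes


two_alleles = hardy_weinberg({"A": 0.6, "a": 0.4})
three_alleles = hardy_weinberg({"A": 0.5, "B": 0.3, "C": 0.2})
print(two_alleles)
print(sum(three_alleles.values()))
```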
24Dominant and Recessive Penetrance
Modeled penetrance: P(pt | gt), the probability of the phenotype given the genotype
- DD Dd dd
- 1 1 0
- DD Dd dd
- 0.9 0.9 0.0
- DD Dd dd
- 0 0 1
- DD Dd dd
- 0 0 0.8
25
- Given the following observations: family structure, affection status, genotypes, and disease allele frequencies, and assuming a model for the disease, can we calculate the probability that these observations fit the assumed model?
26Linkage Analysis
- Goal: find a marker linked to a disease gene.
- LOD score: log of the likelihood ratio
- LR(θ | data) = k P(data | θ)
- theta: estimate of genetic distance (recombination fraction) between marker and disease
- proportion of recombinant gametes / total gametes
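A simplified sketch of a LOD score, assuming phase-known, fully informative meioses so that the likelihood is binomial in theta and the comparison is against theta = 0.5 (no linkage). Real pedigree likelihoods are far more involved; the function name, the grid search, and the counts (2 recombinants in 7 meioses) are illustrative.

```python
import math


def lod_score(theta, recombinants, meioses):
    """log10 of L(theta) / L(0.5) for phase-known meioses."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


# Maximize over a grid of theta values (a crude form of parameter maximization).
grid = [i / 1000 for i in range(1, 501)]
best_theta = max(grid, key=lambda t: lod_score(t, 2, 7))
best_lod = lod_score(best_theta, 2, 7)
print(best_theta, round(best_lod, 3))
```

The maximizing theta sits at the observed recombinant proportion, and the LOD at theta = 0.5 is zero by construction.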
27Linkage Analysis
- Linkage analysis calculates the likelihood that the inheritance pattern of the phenotype (disease) is supported by the observed inheritance patterns (genotypes) in a pedigree.
- few monogenic models, easy to test
- more difficult to find models explaining inheritance for polygenic traits
- parameter maximization
28Linkage Analysis Programs
- FASTLINK: 2 point
- O(n^2), where n = number of markers
- GeneHunter: multipoint, 2 point
- O(n^2), where n = number of people
29Allele Sharing
- tries to show that affected family members
inherit the same chromosomal regions more often
than expected by chance
30Allele Sharing Example
Needs at least sibs.
31Association Studies
- "Allelic association studies provide the most powerful method for locating genes of small effect contributing to complex diseases and traits." Daniels, Am J Hum Genet 62:1189-1197, 1998.
- Linkage analysis
- genome-wide screen: 400 markers at 10 cM (10 Mb) spacing; association needs 4000 polymorphic markers
- generally need nuclear family or larger
- Association finds linkage disequilibrium
32Association Studies
- "Association is simply a statistical statement about the co-occurrence of alleles or phenotypes. Allele A is associated with disease D if people who have D also have A more (or maybe less) often than would be predicted from the individual frequencies of D and A in the population." Pg. 286, Human Molecular Genetics 2, Tom Strachan
33Examples
- HLA-DR4 (antigen marker)
- 36% in UK
- 78% with rheumatoid arthritis
- CF (RFLP markers XV2.c (X1, X2), KM19 (K1, K2))
Marker alleles   CF (case)   Normal (control)
X1, K1           3           49
X1, K2           147         19
X2, K1           8           70
X2, K2           8           25
- CF associated with X1, K2 in 89% (Strachan)
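A quick check of the CF figures, using the haplotype counts from the table on this slide. The odds ratio is an added illustration of association strength, not something stated on the slide; variable names are hypothetical.

```python
# (CF case count, normal control count) per marker haplotype, from the slide.
counts = {
    ("X1", "K1"): (3, 49),
    ("X1", "K2"): (147, 19),
    ("X2", "K1"): (8, 70),
    ("X2", "K2"): (8, 25),
}

total_cf = sum(case for case, _ in counts.values())
total_normal = sum(control for _, control in counts.values())

case_x1k2, control_x1k2 = counts[("X1", "K2")]
share_cf = case_x1k2 / total_cf             # fraction of CF chromosomes with X1, K2
share_normal = control_x1k2 / total_normal  # same fraction among normal chromosomes

# Odds ratio for carrying X1, K2 in cases vs. controls.
odds_ratio = (case_x1k2 * (total_normal - control_x1k2)) / (
    (total_cf - case_x1k2) * control_x1k2
)
print(round(100 * share_cf), round(100 * share_normal), round(odds_ratio, 1))
```

The case share rounds to the 89% quoted from Strachan, against roughly 12% of control chromosomes.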
34Linkage Disequilibrium
- linkage equilibrium (aka Hardy-Weinberg) is true if
- P(gt1, gt2) = P(gt1) P(gt2), where P(gt1, gt2) = P(haplotype)
- case vs controls
- TDT (heterozygous marker transmitted), HRR (untransmitted alleles as control)
- allelic associations (outbred populations) maintained at only less than 1 cM
35Homozygosity Mapping
36BBS2 genetic mapping
[Figure labels: C16; 1 through 12]
37BBS2 genetic mapping
[Figure labels: unaffected, affected; C16; 1 through 12]
38Introduction
- Physical mapping
- Relies upon observable experimental outcomes
- hybridization
- amplification
- May or may not have a distance measure.
39Restriction Mapping
Background on restriction enzymes:
cut DNA at specific sites. Ex. EcoRI cuts at GAATTC
sites are often palindromic:
5'- GAATTC -3'
3'- CTTAAG -5'
may leave blunt ends or overlaps:
blunt     GGCC  ->  GG  CC
          CCGG      CC  GG
overlap   GAATTC  ->  G      AATTC
          CTTAAG      CTTAA      G
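A minimal sketch of site recognition and cutting on one strand of a linear molecule, assuming EcoRI cuts between G and AATTC as drawn on the slide. The test sequence and function names are made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site is palindromic if it equals its own reverse complement."""
    return site == reverse_complement(site)


def digest(seq, site="GAATTC", cut_offset=1):
    """Split a linear sequence at every occurrence of site (top strand only)."""
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)  # EcoRI cuts after the leading G
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


fragments = digest("AAGAATTCTTGAATTCA")  # hypothetical sequence with two sites
print(fragments, [len(f) for f in fragments])
print(is_palindromic("GAATTC"), is_palindromic("GGCC"))
```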
40Restriction Mapping
Restriction maps show the relative location of a
selection of restriction sites along linear or
circular DNA.
[Figure: example restriction maps with sites labeled EcoRI, HindIII, BamHI, and PstII]
41Restriction Mapping
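Fragment sizes by digest:
BglII           4.2, 1.7, 0.3
BamHI           5.2, 1.0
PstI            3.6, 1.4, 1.2
BglII + BamHI   3.5, 1.7, 0.7, 0.3
BglII + PstI    3.3, 1.2, 0.9, 0.5, 0.3
BamHI + PstI    2.6, 1.4, 1.2, 1.0
Resulting map (interval sizes between sites):
0.3 | BglII | 0.7 | BamHI | 2.6 | PstI | 0.9 | BglII | 0.5 | PstI | 1.2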
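A sketch that checks a candidate map against the gel, assuming a linear molecule of total length 6.2 with sites in the order BglII, BamHI, PstI, BglII, PstI and the interval sizes shown on the slide (0.3, 0.7, 2.6, 0.9, 0.5, 1.2). The site positions below are derived from those intervals; names are illustrative.

```python
# Site positions obtained by summing the slide's interval sizes from the left end.
SITES = [(0.3, "BglII"), (1.0, "BamHI"), (3.6, "PstI"), (4.5, "BglII"), (5.0, "PstI")]
TOTAL_LENGTH = 6.2


def digest_sizes(enzymes):
    """Fragment sizes (largest first) expected when cutting with the given enzymes."""
    cuts = sorted(pos for pos, enzyme in SITES if enzyme in enzymes)
    bounds = [0.0] + cuts + [TOTAL_LENGTH]
    # Round to one decimal to hide floating-point noise in the subtractions.
    sizes = (round(b - a, 1) for a, b in zip(bounds, bounds[1:]))
    return sorted(sizes, reverse=True)


for enzymes in ({"BglII"}, {"BamHI"}, {"PstI"},
                {"BglII", "BamHI"}, {"BglII", "PstI"}, {"BamHI", "PstI"}):
    print(sorted(enzymes), digest_sizes(enzymes))
```

Going from a map to its fragments is easy; the hard direction is recovering the map from the fragment sizes alone.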
42Restriction Mapping
Creating a restriction map from a double digest experiment is NP-complete. No polynomial-time solution is known.
As the number of fragments increases, the complexity increases as A!B!. If the two single-enzyme reactions generate 6 and 8 fragments respectively, there are 6! x 8! = 29,030,400 potential permutations to evaluate.
A   A!
1   1
2   2
3   6
4   24
5   120
6   720
7   5040
8   40,320
43Restriction Mapping
- Multiple valid solutions possible.
- Reflections
- Equivalence
- A = {1, 3, 3, 12}, B = {1, 2, 3, 3, 4, 6}
- AB = {1, 1, 1, 1, 2, 2, 2, 3, 6}
4320 map configurations, but only 208 distinct solutions.
[Figure: two alternative maps, each showing an ordering of the A, B, and AB fragments]
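A brute-force sketch of the double digest problem for the fragment sets on this slide (A = 1,3,3,12; B = 1,2,3,3,4,6; AB = 1,1,1,1,2,2,2,3,6). It tries every distinct ordering of the A and B fragments and keeps the pairs whose combined cut sites reproduce AB. This is an illustrative enumeration, not an efficient algorithm, and how the slide groups orderings into "distinct solutions" is not reproduced here.

```python
from itertools import accumulate, permutations

A = (1, 3, 3, 12)
B = (1, 2, 3, 3, 4, 6)
AB = (1, 1, 1, 1, 2, 2, 2, 3, 6)


def merged_fragments(order_a, order_b):
    """Sorted fragment sizes when both orderings cut the same linear molecule."""
    total = sum(order_a)
    cuts = set(accumulate(order_a)) | set(accumulate(order_b))
    cuts.discard(total)  # the end of the molecule is not an internal cut
    bounds = [0] + sorted(cuts) + [total]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))


orders_a = set(permutations(A))  # set() drops duplicates from the repeated 3s
orders_b = set(permutations(B))
configurations = len(orders_a) * len(orders_b)

solutions = [
    (a, b) for a in orders_a for b in orders_b
    if merged_fragments(a, b) == sorted(AB)
]
print(configurations, len(solutions))
```

The number of distinct ordering pairs examined is 12 x 360 = 4320, and every solution appears together with its mirror image (a reflection).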
44Cytogenetic Mapping
Cytogenetic mapping refers to observing a map
location in reference to a chromosomal banding
pattern.
45Cytogenetic Mapping
These methods allow a rough determination of location, but do not yield a direct measure of distance.
46Cytogenetic Mapping
47Somatic Hybrid Mapping
Somatic cell mapping can be used to map an element to a portion of a genome, typically with chromosome resolution.
Exploits the ability of rodent (hamster) cells to stably integrate genetic material from other species.
Cells from the target genome are fused with hamster cells. The resulting cells are then screened for hybrids that have retained one or more of the chromosomes from the target genome.
Ideally, a complete set of hybrids can be constructed such that each has retained a single chromosome from the target genome.
48Somatic Hybrid Mapping
Chromosome   1   2   3   4   5
Probe1       0   1   0   0   0
Probe2       0   0   1   1   0
Probe3       1   1   1   1   1
Probe1 -- maps to chromosome 2
Probe2 -- maps to chromosomes 3 and 4 -- possible paralogs, pseudogene, or low-copy repeat
Probe3 -- maps to all chromosomes -- possible high-copy repeat or ribosomal gene
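A small sketch of reading chromosome assignments off a monochromosomal hybrid panel, assuming hybrid i retains only chromosome i and a 1 means the probe gave a signal. The Probe2 row is written as 0 0 1 1 0 here, an assumption made so that it agrees with the stated interpretation (chromosomes 3 and 4).

```python
# Signal for each probe in hybrids carrying chromosome 1..5 (1 = positive).
panel = {
    "Probe1": [0, 1, 0, 0, 0],
    "Probe2": [0, 0, 1, 1, 0],  # assumed ordering, see lead-in
    "Probe3": [1, 1, 1, 1, 1],
}


def chromosomes_for(pattern):
    """Chromosome numbers (1-based) whose hybrid is positive for the probe."""
    return [chrom for chrom, signal in enumerate(pattern, start=1) if signal]


assignments = {probe: chromosomes_for(pattern) for probe, pattern in panel.items()}
for probe, chroms in assignments.items():
    note = "single location" if len(chroms) == 1 else "multiple locations"
    print(probe, chroms, note)
```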
49Somatic Hybrid Mapping
A subset of the data used to map the Blood
Coagulating Factor III to human chromosome 1.
50Somatic Hybrid Mapping
Finer mapping (higher resolution) can be obtained
if hybrids are present in the panel that contain
partial chromosomes. (E.g., translocations) Such
a strategy is expensive, because numerous hybrids
have to be screened to identify hybrids
containing the partially retained chromosomes. A
more cost-effective and high-resolution
alternative is Radiation Hybrid Mapping.
51Radiation Hybrid Mapping
Radiation hybrid mapping is a method for high-resolution mapping.
Exploits the ability of rodent cells (hamster cells) to stably incorporate genetic material from fused cells.
Pro: resolution is tunable, relatively cheap
Con: difficult to compare results from different groups
52Radiation Hybrid Mapping
53Radiation Hybrid Mapping
The data obtained from a radiation hybrid experiment are similar to those from a somatic cell hybrid: the retention data for the given locus in each hybrid. These data are generally displayed as a vector of numbers or letters:
1 or + for retention
0 or - for non-retention
2 or ? for ambiguous or unknown
Ex.
RN_ALB 0100110102010001100100100000102210010..
RN_HEM 0101110102000100101100200010100110010..
54Radiation Hybrid Mapping
Analytical methods: many, ranging from minimizing the number of obligate breaks to sophisticated methods relying on maximum likelihood or maximum posterior probability.
θ = [(A+B-) + (A-B+)] / [T_H (R_A + R_B - 2 R_A R_B)]
d = -ln(1 - θ)
NOTE: θ ∈ [0, 1]
(A+B-) = number of hybrids retaining A but not B; (A-B+) = number retaining B but not A
R_A = retention of A
R_B = retention of B
T_H = total number of hybrids
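A sketch of the two-point radiation hybrid estimate: theta is the number of hybrids retaining exactly one of the two markers, divided by T_H (R_A + R_B - 2 R_A R_B), and d = -ln(1 - theta). It assumes retention vectors coded 1/0/2 as described earlier, and it simply skips hybrids where either marker is ambiguous (one possible convention, not taken from the slides). The vectors below are hypothetical.

```python
import math


def rh_distance(vec_a, vec_b):
    """Return (theta, d) for two retention vectors coded '1', '0', '2'."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must cover the same hybrids")
    # Keep only hybrids typed unambiguously for both markers.
    pairs = [(a, b) for a, b in zip(vec_a, vec_b) if a in "01" and b in "01"]
    total = len(pairs)
    if total == 0:
        raise ValueError("no hybrids typed for both markers")
    retention_a = sum(a == "1" for a, _ in pairs) / total
    retention_b = sum(b == "1" for _, b in pairs) / total
    discordant = sum(a != b for a, b in pairs)  # A+B- plus A-B+
    denom = total * (retention_a + retention_b - 2 * retention_a * retention_b)
    if denom == 0:
        raise ValueError("retention gives no information (all 0 or all 1)")
    theta = discordant / denom
    d = -math.log(1.0 - theta) if theta < 1.0 else math.inf
    return theta, d


# Hypothetical markers; the last hybrid is ambiguous for marker A and is skipped.
theta, d = rh_distance("111100002", "111000010")
print(theta, round(d, 4))
```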
55Summary of Mapping Strategies
56Comparative Mapping
Can be very useful in utilizing animal models of
human disease, and also in exploring the causes
of complex diseases. Comparing gene content,
localization and ordering among multiple species.
57Comparative Mapping: Sources of Information
[Flowchart: sequence vs. sequence via BLAST gives potential orthologs; mapping vs. mapping gives colocalization; together they give putative orthologs and syntenic segments]
58Comparative Mapping: Sources of Information
- GeneMap 99 (human)
- 42,000 ESTs
- 12,500 genes
- Mouse RH consortium (mouse)
- 14,000 ESTs
- UIowa EST placements (rat)
- 13,793 ESTs
59Current Status
Initial comparative map (Wellcome Trust and Otsuka Lab): about 500 previously identified orthologs, human-mouse-rat
University of Iowa comparative maps:
13,973 placed ESTs
3057 significant mouse hits
9109 significant human EST hits
10,148 significant hits to GenBank's nt database
2479 rat ESTs in preliminary human-rat comparative map
1671 rat ESTs in preliminary mouse-rat comparative map
60Comparative Mapping: Examples
[Figure: comparison of RNO18 and MMU18; axis ticks 0 to 400 and 300 to 1200]
61Comparative Mapping: Examples
[Figure: comparison involving HSA7, HSA7p, HSA4, and HSA11 with RNO4, RNO5, and RNO12; axis ticks 0 to 700]
62Resources
Genome browsers
http://genome.ucsc.edu
http://www.ensembl.org
http://www.ncbi.nih.gov/