Title: Lecture 16: Introduction to Linkage Disequilibrium
1. Lecture 16: Introduction to Linkage Disequilibrium
October 19, 2015
2. Exam 2
- Wednesday, October 28 at 6:30 in lab
- Genetic Drift, Population Structure, Population
Assignment, Individual Identity, Paternity
Analysis, and Linkage Disequilibrium
- Sample exam posted on website
- Review on Monday, October 26
3. Last Time
- Population structure and gene flow
- Introduction to paternity analysis
4. Today
- Multiple loci and independent segregation
- Estimating linkage disequilibrium
- Causes of linkage disequilibrium
5. Extending to Multiple Loci
- So far, only considering dynamics of alleles at
single loci
- Loci occur on chromosomes, linked to other loci!
- "The fitness of a single locus ripped from its
interactive context is about as relevant to real
problems of evolutionary genetics as the study of
the psychology of individuals isolated from their
social context is to an understanding of man's
sociopolitical evolution" - Richard Lewontin (quoted in Hedrick 2005)
- Size of region that must be considered depends on
Linkage Disequilibrium
6. Gametic (Linkage) Disequilibrium (LD)
- Nonrandom association of alleles at different
loci into gametes
- Haplotype: genotype of a group of loci in LD
- LD is a major factor in evolution
- LD itself provides insights into population
history
- Estimation of LD is critical for ALL population
genetic data
7. Nomenclature and concepts
- Two loci, two alleles
- Frequency of allele i at locus 1 is p_i
- Frequency of allele i at locus 2 is q_i
8. Nomenclature and concepts
[Figure: pair of chromosomes, one carrying A1 and B1, the other carrying A2 and B2]
A1 and B1 are in coupling phase; A1 and B2 are in
repulsion phase
9. Gametic Disequilibrium
- Easiest to think about for physically linked loci,
but physical linkage is not required
What are expected frequencies of gametes in a
population under independent assortment?
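Under independent assortment with no association between loci, the expected frequency of each gamete is the product of the frequencies of the two alleles it carries. A minimal Python sketch, using the p and q notation from the nomenclature slide; the function name and the example frequencies are illustrative assumptions, not values from the lecture.

```python
def expected_gamete_freqs(p1, q1):
    """Expected gamete frequencies for two biallelic loci with no association.

    p1: frequency of allele A1 at locus 1 (so p2 = 1 - p1)
    q1: frequency of allele B1 at locus 2 (so q2 = 1 - q1)
    """
    p2, q2 = 1.0 - p1, 1.0 - q1
    return {
        "A1B1": p1 * q1,
        "A1B2": p1 * q2,
        "A2B1": p2 * q1,
        "A2B2": p2 * q2,
    }


# Hypothetical allele frequencies, chosen only for illustration
freqs = expected_gamete_freqs(p1=0.6, q1=0.3)
for gamete, freq in freqs.items():
    print(gamete, round(freq, 4))
```

The four expected frequencies always sum to 1, which is a quick sanity check on any set of inputs.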
10. What are expected frequencies of gametes with
complete linkage?
The frequencies of the gametes in the current
population, which are expected to stay stable in the
absence of other departures from Hardy-Weinberg (H-W)
11. Linkage disequilibrium measure, D
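One standard way to write D, assumed here, is the product of the coupling-phase gamete frequencies minus the product of the repulsion-phase gamete frequencies, which equals the observed frequency of A1B1 minus its expected frequency p1q1. The sketch below uses hypothetical gamete frequencies x11 (A1B1), x12 (A1B2), x21 (A2B1), and x22 (A2B2); these symbols and the function name are assumptions introduced for illustration.

```python
def ld_coefficient(x11, x12, x21, x22):
    """D = x11*x22 - x12*x21 for gamete frequencies that sum to 1."""
    total = x11 + x12 + x21 + x22
    if abs(total - 1.0) > 1e-9:
        raise ValueError("gamete frequencies must sum to 1")
    return x11 * x22 - x12 * x21


# Hypothetical gamete frequencies with an excess of coupling gametes
x11, x12, x21, x22 = 0.4, 0.1, 0.1, 0.4
D = ld_coefficient(x11, x12, x21, x22)

# Equivalent form: observed minus expected frequency of A1B1
p1 = x11 + x12  # frequency of A1
q1 = x11 + x21  # frequency of B1
D_alt = x11 - p1 * q1
print(round(D, 4), round(D_alt, 4))
```

D is zero when the gamete frequencies match the products of the allele frequencies, positive when coupling gametes are in excess, and negative when repulsion gametes are in excess.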
12. Problem: D is sensitive to allele frequencies
- Gamete frequencies must be between 0 and 1
- Maximum D set by allele frequencies
Solution: D' = D/Dmax, which ranges from -1 to 1
Example, if D is positive: p1 = 0.5, q2 = 0.5 gives
Dmax = 0.25, but p1 = 0.1, q2 = 0.9 gives Dmax = 0.09
Dmax calculation: if D is positive, Dmax is the
lesser of p1q2 or p2q1; if D is negative, Dmax is the
lesser of p1q1 or p2q2
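The Dmax rule and the D' standardization can be sketched in Python as follows. The function names are illustrative assumptions; the two Dmax checks reuse the allele frequencies from the example on this slide, and the D value passed to `d_prime` is hypothetical.

```python
def d_max(p1, q1, d_positive):
    """Largest possible magnitude of D given the allele frequencies."""
    p2, q2 = 1.0 - p1, 1.0 - q1
    if d_positive:
        return min(p1 * q2, p2 * q1)
    return min(p1 * q1, p2 * q2)


def d_prime(d, p1, q1):
    """D' = D / Dmax, using the Dmax branch that matches the sign of D."""
    limit = d_max(p1, q1, d_positive=(d >= 0))
    if limit == 0:
        raise ValueError("D' is undefined when a locus is fixed")
    return d / limit


# Slide example: p1 = 0.5, q2 = 0.5 (so q1 = 0.5)
print(d_max(0.5, 0.5, d_positive=True))
# Slide example: p1 = 0.1, q2 = 0.9 (so q1 = 0.1)
print(d_max(0.1, 0.1, d_positive=True))
# Hypothetical D of 0.15 with both allele frequencies at 0.5
print(d_prime(0.15, 0.5, 0.5))
```

Because Dmax is taken as a positive magnitude, D' keeps the sign of D.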
13. LD can also be estimated as correlation between
alleles
- r can also be standardized to a -1 to 1 scale
- It is equivalent to D' in this case
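A common form of the correlation between alleles at two loci, assumed here, is r = D / sqrt(p1 p2 q1 q2). The sketch below computes r and r^2 from D and the allele frequencies; the function name and input values are hypothetical.

```python
import math


def ld_correlation(d, p1, q1):
    """r = D / sqrt(p1 * p2 * q1 * q2) for two biallelic loci."""
    p2, q2 = 1.0 - p1, 1.0 - q1
    denom = math.sqrt(p1 * p2 * q1 * q2)
    if denom == 0:
        raise ValueError("r is undefined when a locus is fixed")
    return d / denom


# Hypothetical values: D = 0.15 with both allele frequencies at 0.5
r = ld_correlation(0.15, 0.5, 0.5)
r_squared = r ** 2
print(round(r, 4), round(r_squared, 4))
```

Dividing r by its maximum possible value for the given allele frequencies cancels the square-root term and leaves D/Dmax, which is why the standardized r matches D'.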
14. Recombination
- Shuffling of parental alleles during meiosis
- Occurs for unlinked loci and linked loci
- Rate of recombination for linked markers is
partially a function of physical distance
15. Recombination Rate
What is the expected recombination rate for
unlinked loci?
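16. Expected Gamete Frequencies: Double Homozygote
[Figure: chromosome diagram; surviving labels A1, B1]
17. Expected Gamete Frequencies: Double Heterozygote
[Figure: chromosome diagram; surviving labels A2, B2]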
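A double homozygote such as A1B1/A1B1 can only produce A1B1 gametes, whatever the recombination rate. For a double heterozygote, the gamete proportions depend on the recombination rate c. The sketch below assumes a coupling-phase heterozygote A1B1/A2B2, which is an assumption about the figure rather than something stated in the text; the function name is also illustrative.

```python
def heterozygote_gametes(c):
    """Gamete proportions from an A1B1/A2B2 double heterozygote.

    c is the recombination rate between the loci (0 to 0.5).
    Parental gametes each occur at (1 - c) / 2 and recombinant
    gametes each occur at c / 2.
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination rate must be between 0 and 0.5")
    return {
        "A1B1": (1.0 - c) / 2.0,  # parental
        "A2B2": (1.0 - c) / 2.0,  # parental
        "A1B2": c / 2.0,          # recombinant
        "A2B1": c / 2.0,          # recombinant
    }


print(heterozygote_gametes(0.0))  # complete linkage: parental gametes only
print(heterozygote_gametes(0.5))  # unlinked loci: all four gametes equal
```

With c = 0.5 every gamete type has proportion 0.25, which corresponds to independent assortment.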
18. LD is partially a function of recombination rate
- Expected proportions of haplotypes produced in a
population after 1 generation of mating
Offspring Genotypes
Where c is the recombination rate
and D0 is the initial amount of LD
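Under random mating, a standard result (assumed here) is that one generation of recombination moves each haplotype frequency toward its linkage-equilibrium value by c times D0: coupling haplotypes lose cD0 and repulsion haplotypes gain cD0, so the new LD is (1 - c)D0. A minimal sketch with hypothetical starting frequencies; x11, x12, x21, and x22 denote the frequencies of A1B1, A1B2, A2B1, and A2B2.

```python
def next_generation(x11, x12, x21, x22, c):
    """Haplotype frequencies after one generation of random mating."""
    d0 = x11 * x22 - x12 * x21  # initial LD
    return (
        x11 - c * d0,  # A1B1
        x12 + c * d0,  # A1B2
        x21 + c * d0,  # A2B1
        x22 - c * d0,  # A2B2
    )


# Hypothetical starting frequencies and recombination rate
start = (0.4, 0.1, 0.1, 0.4)
c = 0.1
d0 = start[0] * start[3] - start[1] * start[2]
y11, y12, y21, y22 = next_generation(*start, c)
d1 = y11 * y22 - y12 * y21
print(round(d0, 4), round(d1, 4))
```

Allele frequencies are unchanged by this step; only the association between loci shrinks.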
19. Recombination degrades LD over time
Where t is time (in generations) and e is base of
natural log (2.718)
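Applying the one-generation decay repeatedly gives D_t = D0 (1 - c)^t, which is approximately D0 e^(-ct) when c is small. This is assumed to be the relationship the slide describes, since it uses the same t, c, D0, and e. The function names and example values below are illustrative.

```python
import math


def ld_after(d0, c, t):
    """Exact decay: D_t = D0 * (1 - c)**t."""
    return d0 * (1.0 - c) ** t


def ld_after_approx(d0, c, t):
    """Exponential approximation: D_t ~ D0 * e**(-c * t), best for small c."""
    return d0 * math.exp(-c * t)


# Hypothetical starting LD and a low recombination rate
exact = ld_after(0.25, 0.01, 100)
approx = ld_after_approx(0.25, 0.01, 100)
print(round(exact, 4), round(approx, 4))
```

The two forms agree closely for tightly linked loci and diverge as c approaches 0.5, so the exact form is the safer choice for unlinked loci.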
20. Effects of recombination rate on LD
- Decline in LD over time with different
theoretical recombination rates (c)
- Even with independent segregation (c = 0.5),
multiple generations required to break up allelic
associations
Where t is time (in generations) and e is base of
natural log (2.718)
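To make the point about independent segregation concrete, the sketch below counts how many generations are needed before LD falls to a chosen fraction of its starting value, assuming the decay D_t = D0 (1 - c)^t. The function name, the 1% threshold, and the recombination rates other than 0.5 are illustrative assumptions.

```python
def generations_until(fraction_left, c):
    """First generation at which D_t / D0 is at or below fraction_left."""
    if not 0.0 < c <= 0.5:
        raise ValueError("c must be greater than 0 and at most 0.5")
    if not 0.0 < fraction_left < 1.0:
        raise ValueError("fraction_left must be between 0 and 1")
    remaining = 1.0
    t = 0
    while remaining > fraction_left:
        remaining *= 1.0 - c  # one generation of recombination
        t += 1
    return t


# Generations needed to reduce LD to 1% of its starting value
for c in (0.5, 0.1, 0.01):
    print(c, generations_until(0.01, c))
```

Even for unlinked loci, LD only halves each generation, so several generations pass before the initial association is effectively gone.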
21. LD varies substantially across human genome
Nature, Vol 437, 27 October 2005
Average r^2 for pairs of SNPs separated by 30 kb in
1 Mb windows
- LD affected by location relative to telomeres and
centromeres, chromosome length, GC content,
sequence polymorphism, and repeat composition
- Highest and lowest levels of LD found in
gene-rich regions
22. Human HapMap Project and Whole Genome Scans
Nature, Vol 437, 27 October 2005
- LD structure of human Chromosome 19
(www.hapmap.org)
- 1 common SNP genotyped every 700 bp for 270
individuals (3.4 million SNPs)
- 9.2 million SNPs in total
23. LD in the Poplar Genome
- LD declines rapidly with distance
- LD higher in genes than in genome as a whole
- Loci separated by kilobases still in LD!
Slavov et al. 2012, New Phytol. 196: 713-725
24. Recombination Across Poplar Chromosomes
- Substantial variation in recombination rate
- Related to repeat composition, methylation, and
distance from centromere
25. Recombination rate varies among individuals
- Rate is often higher in females than males
- Rate varies among individuals within males and
females
Variation in recombination rate in the MHC region
(3.3 Mb) in human sperm donors
26. Genetic Drift and LD
- Begin with highly diverse haplotype pool
- Drift leads to chance increase of certain
haplotypes
- Generates nonrandom association between alleles
at different loci (LD)
27. Genetic Drift and LD
- Why doesn't recombination reduce LD in this
situation?
28. LD is partially a function of recombination rate
- Expected proportions of gametes produced by
various genotypes over two generations
- Effective LD increases with homozygosity
29. Effect of Drift on LD
- Drift and recombination will have opposing
effects on LD
Where r^2 is the squared correlation coefficient
for alleles at two loci, Ne is effective
population size, and c is recombination rate
- 4Nec is the population recombination rate
- Expression approaches 0 for large populations or
high recombination rates
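A commonly used approximation for the expected LD at the balance between drift and recombination is E[r^2] = 1 / (1 + 4Nec). It is assumed here to be the expression the slide refers to, since it uses the same symbols and approaches 0 as Ne or c grows. The function name and parameter values below are hypothetical.

```python
def expected_r2(ne, c):
    """Approximate expected r^2 under drift and recombination.

    ne: effective population size
    c:  recombination rate between the two loci
    """
    if ne <= 0 or c < 0:
        raise ValueError("ne must be positive and c must be non-negative")
    return 1.0 / (1.0 + 4.0 * ne * c)


# Hypothetical comparison of small and large populations at c = 0.01
for ne in (10, 100, 1000, 10000):
    print(ne, round(expected_r2(ne, 0.01), 4))
```

With no recombination (c = 0) the expression equals 1, and it shrinks toward 0 as the population recombination rate 4Nec increases.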
30. Combined effects of Drift and Recombination
- LD declines as a function of population
recombination rate
- Effects of chance fluctuation of gamete
frequencies
[Figure: x-axis labeled Nec]
31. How should inbreeding affect linkage
disequilibrium?