Title: Population Genetics
1Population Genetics
2Evolution by Natural Selection
- Unlike Mendel, Charles Darwin made a big splash when his defining work, "On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life" (which we refer to as The Origin of Species), was published in 1859.
- Darwin set forth a scientific theory that described how one species could give rise to another species, given sufficient time. It was heavily attacked at the time (and continues to be attacked to this day) by people who thought that it contradicted their religious beliefs. Nevertheless, the basic theory has survived and flourished, and today it is one of the main pillars of biological theory.
3Fitness
- A fundamental concept in evolutionary theory is fitness, which can be defined as the ability to survive and reproduce. Reproduction is key: to be evolutionarily fit, an organism must pass its genes on to future generations.
- Basic idea behind evolution by natural selection: the more fit individuals contribute more to future generations than less fit individuals. Thus, the genes found in more fit individuals ultimately take over the population.
- Natural selection requires 3 basic conditions:
- 1. There must be inherited traits.
- 2. There must be variation in these traits among members of the species.
- 3. Some inherited traits must affect fitness.
4Genetics of Populations
- Darwin didn't understand how inheritance worked; Mendel's work was still in the future. It wasn't until the 1930s that Mendelian genetics was incorporated into evolutionary theory, in what is called the Neo-Darwinian synthesis.
- Translated into Mendelian terms, the basis for natural selection is that alleles that increase fitness will increase in frequency in a population.
- Thus, the main object of study in evolutionary genetics is the frequency of alleles within a population.
- A population is a group of organisms of the same species that reproduce with each other. There is only one human population: we all interbreed.
- The gene pool is the collection of all the alleles present within a population.
- We are mostly going to look at frequencies of a single gene, but population geneticists generally examine many different genes simultaneously.
5Allele and Genotype Frequencies
- Each diploid individual in the population has 2 copies of each gene. The allele frequency is the proportion of all copies of the gene in the population that are a particular allele.
- The genotype frequency is the proportion of a population that is a particular genotype.
- For example, consider the MN blood group. In a certain population there are 60 MM individuals, 120 MN individuals, and 20 NN individuals, a total of 200 people.
- The genotype frequency of MM is 60/200 = 0.3.
- The genotype frequency of MN is 120/200 = 0.6.
- The genotype frequency of NN is 20/200 = 0.1.
- The allele frequencies can be determined by adding the frequency of the homozygote to 1/2 the frequency of the heterozygote.
- The allele frequency of M is 0.3 (freq of MM) + 1/2 x 0.6 (freq of MN) = 0.6.
- The allele frequency of N is 0.1 + 1/2 x 0.6 = 0.4.
- Note that since there are only 2 alleles here, the frequency of N is 1 - freq(M).
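The MN blood group arithmetic can be checked with a short script. This is a minimal sketch; the function name and dictionary layout are illustrative choices, not part of the original slides.

```python
def genotype_and_allele_frequencies(n_mm, n_mn, n_nn):
    """Return (genotype_freqs, allele_freqs) for a gene with two alleles, M and N."""
    total = n_mm + n_mn + n_nn
    genotype_freqs = {
        "MM": n_mm / total,
        "MN": n_mn / total,
        "NN": n_nn / total,
    }
    # Allele frequency = homozygote frequency + 1/2 the heterozygote frequency
    freq_m = genotype_freqs["MM"] + 0.5 * genotype_freqs["MN"]
    allele_freqs = {"M": freq_m, "N": 1.0 - freq_m}
    return genotype_freqs, allele_freqs


# Counts from the example population of 200 people
genotypes, alleles = genotype_and_allele_frequencies(60, 120, 20)
print(genotypes)
print(alleles)
```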
6Heterozygosity and Polymorphism
- A gene is called polymorphic if there is more than 1 allele present in at least 1% of the population. Genes with only 1 allele in the population are called monomorphic. Some genes have 2 alleles: they are dimorphic.
- In a study of white people from New England, 122 human genes that produced enzymes were examined. Of these, 51 were monomorphic and 71 were polymorphic. On the DNA level, a higher percentage of genes are polymorphic.
- Heterozygosity is the proportion of heterozygotes in a population. Averaged over the 71 polymorphic genes mentioned above, the heterozygosity of this population of humans was 0.067.
7Hardy-Weinberg Equilibrium
- Early in the 20th century G.H. Hardy and Wilhelm Weinberg independently pointed out that under ideal conditions you could easily predict genotype frequencies from allele frequencies, at least for a diploid sexually reproducing species such as humans.
- For a dimorphic gene (two alleles, which we will call A and a), the Hardy-Weinberg equation is based on the binomial distribution:
- p^2 + 2pq + q^2 = 1
- where p = frequency of A and q = frequency of a, with p + q = 1.
- p^2 is the frequency of AA homozygotes
- 2pq is the frequency of Aa heterozygotes
- q^2 is the frequency of aa homozygotes
- H-W can be viewed as an extension of the Punnett square, using frequencies other than 0.5 for the gamete (allele) frequencies.
8Hardy-Weinberg Example
- Take our previous example population, where the frequency of M was 0.6 and the frequency of N was 0.4.
- p^2 = freq of MM = (0.6)^2 = 0.36
- 2pq = freq of MN = 2 x 0.6 x 0.4 = 0.48
- q^2 = freq of NN = (0.4)^2 = 0.16
- These H-W expected frequencies don't match the observed frequencies (0.3, 0.6, and 0.1). We will examine the reasons for this soon.
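The comparison between expected and observed genotype frequencies for the MN example can be sketched as follows. The function name is an illustrative choice; the observed values are the ones from the earlier MN blood group slide.

```python
def hardy_weinberg_expected(p):
    """Expected genotype frequencies for allele frequencies p (M) and q = 1 - p (N)."""
    q = 1.0 - p
    return {"MM": p * p, "MN": 2 * p * q, "NN": q * q}


expected = hardy_weinberg_expected(0.6)
observed = {"MM": 0.3, "MN": 0.6, "NN": 0.1}

# Print the two sets of frequencies side by side
for genotype in ("MM", "MN", "NN"):
    print(genotype, round(expected[genotype], 2), observed[genotype])
```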
9Rare Alleles and Eugenics
- A popular idea early in the 20th century was eugenics, improving the human population through selective breeding. The idea has been widely discredited, largely due to the evils of forced eugenics practiced in certain countries before and during World War 2. We no longer force genetically defective people to be sterilized.
- However, note that positive eugenics, encouraging people to breed with superior partners, is still practiced in places.
- The problem with sterilizing defectives is that most genes that produce notable genetic diseases are recessive: only expressed in homozygotes. If you only sterilize the homozygotes, you are missing the vast majority of people who carry the allele.
- For example, assume that the frequency of a gene for a recessive genetic disease is 0.001, a very typical figure. Thus p = 0.999 and q = 0.001. Thus p^2 = 0.998, 2pq = 0.002, and q^2 = 0.000001. The ratio of heterozygotes (undetected carriers) to homozygotes (people with the disease) is 2000 to 1: you are sterilizing only 1/2000 of the people who carry the defective allele. This is simply not a workable strategy for improving the gene pool.
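The carrier-to-affected ratio quoted above follows directly from the H-W terms, since 2pq / q^2 = 2p / q. A minimal sketch (the function name is hypothetical):

```python
def carrier_to_affected_ratio(q):
    """Ratio of heterozygous carriers (2pq) to recessive homozygotes (q^2) under H-W."""
    p = 1.0 - q
    carriers = 2 * p * q
    affected = q * q
    return carriers / affected


ratio = carrier_to_affected_ratio(0.001)
print(round(ratio))  # close to the "2000 to 1" figure in the text
```

The ratio grows as the allele gets rarer, which is why selection against rare recessive homozygotes removes so little of the allele.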
10Nazi Eugenics
"The Threat of the Underman. It looks like this
Male criminals had an average of 4.9 children; criminal marriage, 4.4 children; parents of slow learners, 3.5 children; a German family, 2.2 children; and a marriage from the educated circles, 1.9 children."
11Estimating Allele Frequencies from Recessive
Homozygote Frequency
- If Hardy-Weinberg equilibrium is assumed (an assumption we will examine shortly), it is possible to estimate the allele frequencies for a gene that shows complete dominance even though heterozygotes can't be distinguished from the dominant homozygotes.
- The frequency of recessive homozygotes is q^2. Thus, the frequency of the recessive allele is the square root of this. Very simple.
- For example, the recessive genetic disease PKU has a frequency in the population of about 1 in 10,000. q^2 thus equals 0.0001 (10^-4). The square root of this is 0.01 (10^-2), which implies that the frequency of the PKU allele is 0.01 and the frequency of the normal allele is 0.99. Thus the frequency of the heterozygous genotype is 2 x 0.99 x 0.01 = 0.0198. About 2% of the population is a carrier of the PKU allele.
- Note again: this ASSUMES H-W equilibrium, and this assumption is not always true.
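The PKU estimate can be reproduced in a few lines. This sketch assumes H-W equilibrium, exactly as the slide does; the function name is illustrative.

```python
import math


def estimate_from_recessive_frequency(freq_recessive_homozygotes):
    """Estimate (p, q, carrier_frequency) from the frequency of recessive homozygotes.

    Assumes Hardy-Weinberg equilibrium, so freq(aa) = q^2.
    """
    q = math.sqrt(freq_recessive_homozygotes)
    p = 1.0 - q
    carrier_frequency = 2 * p * q
    return p, q, carrier_frequency


# PKU: about 1 affected person in 10,000
p, q, carriers = estimate_from_recessive_frequency(1 / 10_000)
print(p, q, carriers)
```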
12Necessary Conditions for Hardy-Weinberg
Equilibrium
- The relationship between allele frequencies and genotype frequencies expressed by the H-W equation only holds if these 5 conditions are met. None of them is completely realistic, but all are met approximately in many populations.
- If a population is not in equilibrium, it takes only 1 generation of meeting these conditions to bring it into equilibrium. Once in equilibrium, a population will stay there as long as these conditions continue to be met.
- 1. no new mutations
- 2. no migration in or out of the population
- 3. no selection (all genotypes have equal fitness)
- 4. random mating
- 5. very large population
13Testing for H-W Equilibrium
- If we have a population where we can distinguish all three genotypes, we can use the chi-square test once again to see if the population is in H-W equilibrium. The basic steps:
- 1. Count the numbers of each genotype to get the observed genotype numbers, then calculate the observed genotype frequencies.
- 2. Calculate the allele frequencies from the observed genotype frequencies.
- 3. Calculate the expected genotype frequencies based on the H-W equation, then multiply by the total number of individuals to get expected genotype numbers.
- 4. Calculate the chi-square value using the observed and expected genotype numbers.
- 5. Use 1 degree of freedom (because there are only 2 alleles).
14Example
- Data: 26 MM, 68 MN, 106 NN, with a total population of 200 individuals.
- 1. Observed genotype frequencies
- MM = 26/200 = 0.13
- MN = 68/200 = 0.34
- NN = 106/200 = 0.53
- 2. Allele frequencies
- M = 0.13 + 1/2 x 0.34 = 0.30
- N = 0.53 + 1/2 x 0.34 = 0.70
- 3. Expected genotype frequencies and numbers
- MM = p^2 = (0.30)^2 = 0.09 (freq) x 200 = 18
- MN = 2pq = 2 x 0.3 x 0.7 = 0.42 (freq) x 200 = 84
- NN = q^2 = (0.70)^2 = 0.49 (freq) x 200 = 98
- 4. Chi-square value
- (26 - 18)^2 / 18 + (68 - 84)^2 / 84 + (106 - 98)^2 / 98
- = 3.56 + 3.05 + 0.65
- = 7.26
- 5. Conclusion: The critical chi-square value for 1 degree of freedom is 3.841. Since 7.26 is greater than this, we reject the null hypothesis that the population is in Hardy-Weinberg equilibrium.
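The five steps of the test can be sketched in plain Python. The function name is illustrative, and the critical value 3.841 (1 degree of freedom) is hard-coded from the slide rather than looked up from a statistics library.

```python
def hw_chi_square(n_hom1, n_het, n_hom2):
    """Chi-square statistic comparing observed genotype counts to H-W expectations."""
    total = n_hom1 + n_het + n_hom2
    # Step 2: allele frequencies from the observed counts
    p = (2 * n_hom1 + n_het) / (2 * total)
    q = 1.0 - p
    # Step 3: expected genotype numbers under H-W
    expected = (p * p * total, 2 * p * q * total, q * q * total)
    observed = (n_hom1, n_het, n_hom2)
    # Step 4: sum of (observed - expected)^2 / expected
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


CRITICAL_VALUE_1DF = 3.841  # value quoted in the slide for 1 degree of freedom

chi2, expected = hw_chi_square(26, 68, 106)
print(round(chi2, 2), [round(e, 1) for e in expected])
print("reject H-W" if chi2 > CRITICAL_VALUE_1DF else "consistent with H-W")
```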
15Relaxing the H-W Conditions Random Mating
- The fullest meaning of random mating implies that any gamete has an equal probability of fertilizing any other gamete, including itself. In a sexual population, this is impossible because male gametes can only fertilize female gametes.
- More or less random mating in a sexual population is achieved in some species of sea urchin, which gather in one place and squirt all of their gametes, male and female, out into the open sea. The gametes then find each other and fuse together to become zygotes.
- In animal species, mate selection is far more common than random fertilization. A very general rule is assortative mating: like tends to mate with like (tall people with tall people, short people with short people, etc.). This rule is true for externally detectable phenotypes such as appearance, but invisible traits like blood groups are usually close to H-W equilibrium in the population.
- Assortative mating is most easily analyzed as a tendency for inbreeding. You are more like your relatives than you are like random strangers. Thus you are somewhat more likely to mate with a distant relative than would be expected by chance alone.
16My Boyfriend is Type B
Japanese Blood Type Personality Chart
Type A
Best traits: Conservative, reserved, patient, punctual, perfectionist and good with plants
Worst traits: Introverted, obsessive, stubborn, self-conscious and uptight
Type B
Best traits: Creative and passionate. Animal loving. Optimistic and flexible
Worst traits: Forgetful, irresponsible, individualist
Type AB
Best traits: Cool, controlled, rational. Sociable and popular. Empathic
Worst traits: Aloof, critical, indecisive and unforgiving
Type O
Best traits: Ambitious, athletic, robust and self-confident. Natural leaders
Worst traits: Arrogant, vain and insensitive. Ruthless
in Korean, written and directed by Choi Seok-Won
17Measuring Inbreeding
- Recall that inbreeding decreases the number of heterozygotes in the population: each generation of selfing decreases the number of heterozygotes by 1/2.
- By comparing the number of heterozygotes observed to the number expected for a population in H-W equilibrium, we can estimate the degree of inbreeding.
- A measure of inbreeding is the inbreeding coefficient, F.
- F = 1 - (obs hets) / (exp hets).
- If F = 0, the observed number of heterozygotes is equal to the expected number, meaning that the population is in H-W equilibrium.
- If F = 1, there are no heterozygotes, implying a completely inbred population.
- Thus, the higher F is, the more inbred the population is.
18Example
- Wild oats is a common plant in California, the cause of the golden-brown hillsides all summer out there.
- Wild oats can pollinate itself, but the pollen also blows in the wind so it can cross fertilize. The task is to estimate the relative proportions of these two types of mating.
- Data for the phosphoglucomutase (Pgm) gene:
- 104 AA, 9 AB, 42 BB = 155 total individuals
- H-W calculations:
- freq of A = (104 + 1/2 x 9) / 155 = 108.5 / 155 = 0.7
- freq of B = 1 - freq(A) = 0.3
- exp heterozygotes = 2pq = 2 x 0.7 x 0.3 = 0.42 (freq) x 155 = 65.1
- F = 1 - (obs hets) / (exp hets) = 1 - 9 / 65.1 = 1 - 0.14
- F = 0.86
- This is a very inbred population: most matings are self-pollination.
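The wild oats calculation can be sketched as a small function that goes from genotype counts to F. The function name is an illustrative choice; the formula is the F = 1 - (obs hets) / (exp hets) definition given earlier.

```python
def inbreeding_coefficient(n_aa, n_ab, n_bb):
    """Estimate F = 1 - (observed heterozygotes) / (H-W expected heterozygotes)."""
    total = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * total)   # frequency of allele A
    q = 1.0 - p                           # frequency of allele B
    expected_hets = 2 * p * q * total
    return 1.0 - n_ab / expected_hets


# Pgm genotype counts for wild oats: 104 AA, 9 AB, 42 BB
f_oats = inbreeding_coefficient(104, 9, 42)
print(round(f_oats, 2))
```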
19Inbreeding Depression and Genetic Load
- For most species, including humans, too much inbreeding leads to weak and sickly individuals, as seen in this example of mice inbred by brother-sister matings.
- Inbreeding depression is caused by homozygosity of genes that have slight deleterious effects. It has been estimated that on the average, each human carries 3 recessive lethal alleles. These are not expressed because they are covered up by dominant wild type alleles. This concept is called the genetic load.
- However, it has been argued that some amount of inbreeding is good, because it allows the expression of recessive genes with positive effects. The level of inbreeding in the US has been estimated (from Roman Catholic parish records) at about F = 0.0001, which is approximately equivalent to each person mating with a fifth cousin.
gen    litter size    % dead by 4 weeks
0      7.50           3.9
6      7.14           4.4
12     7.71           5.0
18     6.58           8.7
24     4.58           36.4
30     3.20           45.5
20Mutation
- Mutation is unavoidable. It happens as a result of radiation in the environment (cosmic rays, radioactive elements in rocks and soil, etc.), as well as mutagenic chemical compounds, both natural and artificially made, and just as a chance event inherent in the process of DNA replication.
- However, the rate of mutation is quite low: for any given gene, about 1 copy in 10^4 to 10^6 is a new mutation.
- Mutations provide the necessary raw material for evolutionary change, but by themselves new mutations do not have a measurable effect on allele or genotype frequencies.
21Migration
- Migration is the movement of individuals in or out of a population. Migration is necessary to keep a species from fragmenting into several different species. Even as low a level as one individual per generation moving between populations is enough to keep a species unified.
- Migration can be thought of as combining two populations with different allele frequencies and different numbers together into a single population. After one generation of random mating, the combined population will once again be in H-W equilibrium.
22Migration Examples
- Population X has 20 individuals with frequency of the A allele = 0.8. Population Y has 10 individuals with frequency of the A allele = 0.2. The two populations mix. What is the frequency of A in the final population?
- There are 20 + 10 = 30 individuals in the final population, for a total of 60 copies of the gene.
- For population X, 40 x 0.8 = 32 copies are A, and 8 are a.
- For population Y, 20 x 0.2 = 4 copies are A, and 16 are a.
- Adding these together, the final population has 32 + 4 = 36 A alleles and 8 + 16 = 24 a alleles. Out of 60 alleles, the frequency of A is 36/60 = 0.6.
- A real example: African Americans have a large proportion of African ancestry, but also some European ancestry. The Duffy blood group has an allele with a frequency of 0 among West African populations, and an average frequency of 0.43 among European populations. Other blood groups can also be used in this technique, since very little assortative mating occurs on the basis of blood group.
- In Oakland CA, African-Americans are reported to have about 22% European ancestry.
- In Charleston South Carolina, the proportion is about 3.7%.
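The Population X / Population Y mixing calculation can be sketched as a weighted average over gene copies. The second function is an assumption that goes beyond the slides: it inverts the same weighted average to estimate what fraction of a mixed population came from one source, which is the general idea behind ancestry estimates of the kind quoted for the Duffy blood group. Both function names are hypothetical, and only the X and Y numbers from the example are used.

```python
def mixed_allele_frequency(populations):
    """Allele frequency after pooling populations given as (individuals, frequency) pairs."""
    total_copies = sum(2 * n for n, _ in populations)          # diploid: 2 copies each
    allele_copies = sum(2 * n * freq for n, freq in populations)
    return allele_copies / total_copies


def admixture_fraction(freq_mixed, freq_source_1, freq_source_2):
    """Fraction of the mixed population contributed by source 2.

    Assumes the mixed frequency is a simple weighted average of the two sources.
    """
    return (freq_mixed - freq_source_1) / (freq_source_2 - freq_source_1)


freq_final = mixed_allele_frequency([(20, 0.8), (10, 0.2)])
fraction_from_y = admixture_fraction(freq_final, 0.8, 0.2)
print(round(freq_final, 2), round(fraction_from_y, 3))
```

Run on the example, the second function recovers the 10-out-of-30 share contributed by Population Y.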
23Selection
- Selection is the primary factor driving evolution. Genes that confer increased fitness tend to take over a population. Note that random events also play a big role: sometimes a good gene is lost due to chance events. Also, a gene that confers increased fitness in one environment may confer decreased fitness in another environment.
- Selection can occur at many places in the life cycle: the embryo might be defective, the fetus might not survive to birth, the immature offspring might be killed, the individual might not be able to find a mate or might be sterile.
- We will simplify all of this by assuming that the gametes are produced at random and combine at random, to produce a population of zygotes in H-W equilibrium. Then, we will apply selection to the zygotes, killing off different proportions of the different genotypes.
- Fitness is a function of the genotype. We will define the relative fitness of the best genotype as equal to 1.0, and the fitnesses of the two other genotypes as equal to or less than 1.
24Selection Against Recessive Homozygote
- This situation is what happens with a recessive genetic disease. Heterozygotes and dominant homozygotes are indistinguishable and have the same relative fitness of 1.0. The recessive homozygote has the genetic disease and a fitness less than 1. The exact fitness depends on the nature of the disease.
- Start with a population where p = 0.6 and q = 0.4, and assume that the aa homozygote has a relative fitness of 0.1 (i.e. 90% of the aa offspring die without reproducing).
- The zygotes produced (in H-W equilibrium) are 0.36 AA, 0.48 Aa, and 0.16 aa.
- Selection on the zygotes reduces the aa's by 90%, to 0.016.
- However, proportions must add to 1.0, so we divide each proportion by a correction factor. The correction factor is the sum of the remaining proportions: 0.36 + 0.48 + 0.016 = 0.856.
- So, after selection, the frequency of AA is 0.36 / 0.856 = 0.42. The frequency of Aa is 0.48 / 0.856 = 0.56. The frequency of aa is 0.016 / 0.856 = 0.019.
- Final allele frequencies: A = 0.42 + 1/2 x 0.56 = 0.70. a = 1 - freq(A) = 0.3.
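One generation of this selection scheme can be sketched as follows: form zygotes in H-W proportions, weight each genotype by its relative fitness, then divide by the correction factor. The function and parameter names are illustrative.

```python
def select_one_generation(p, w_dom_hom, w_het, w_rec_hom):
    """Apply selection to H-W zygotes; return (genotype_freqs_after, new_p)."""
    q = 1.0 - p
    # Zygote frequencies weighted by relative fitness
    survivors = {
        "AA": p * p * w_dom_hom,
        "Aa": 2 * p * q * w_het,
        "aa": q * q * w_rec_hom,
    }
    correction = sum(survivors.values())   # 0.856 in the worked example
    after = {g: v / correction for g, v in survivors.items()}
    new_p = after["AA"] + 0.5 * after["Aa"]
    return after, new_p


after, new_p = select_one_generation(0.6, 1.0, 1.0, 0.1)
print({g: round(f, 3) for g, f in after.items()}, round(new_p, 2))
```

Calling the function repeatedly with each new value of p would show the a allele declining more and more slowly as it becomes rare, since most copies are then hidden in heterozygotes.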
25Selection Favoring the Heterozygote
- Some genes maintain 2 alleles in the population by having the heterozygote more fit than either homozygote.
- An example is HbS, the sickle cell hemoglobin allele. In rural West Africa, where malaria is endemic and medical support is rudimentary, the relative fitness of the HbA homozygote is estimated at 0.85, due to susceptibility to malaria. The relative fitness of the HbS homozygote is estimated at approximately 0, with almost none reaching reproductive age due to sickle cell disease. The heterozygote is the most fit, so it is given a relative fitness of 1.0. Under these conditions, it is possible to predict an equilibrium frequency of the HbS allele of about 0.13. This is approximately what is seen in various West African countries.
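The slide does not show how the equilibrium value is obtained. One standard way, assumed here, is the heterozygote-advantage result q = s / (s + t), where s and t are the fitness losses of the two homozygotes relative to the heterozygote. The sketch below computes that value and also checks it by iterating selection on H-W zygotes; all names are illustrative.

```python
def equilibrium_frequency(w_hom_a, w_hom_s):
    """Predicted equilibrium frequency of the S allele when the heterozygote has fitness 1."""
    s = 1.0 - w_hom_a   # fitness loss of the HbA homozygote
    t = 1.0 - w_hom_s   # fitness loss of the HbS homozygote
    return s / (s + t)


def iterate_selection(q, w_hom_a, w_hom_s, generations):
    """Repeatedly apply selection to H-W zygotes and return the final S allele frequency."""
    for _ in range(generations):
        p = 1.0 - q
        mean_fitness = p * p * w_hom_a + 2 * p * q * 1.0 + q * q * w_hom_s
        # S copies come from heterozygotes (half their alleles) and surviving SS homozygotes
        q = (p * q * 1.0 + q * q * w_hom_s) / mean_fitness
    return q


predicted = equilibrium_frequency(0.85, 0.0)
simulated = iterate_selection(0.5, 0.85, 0.0, 200)
print(round(predicted, 3), round(simulated, 3))
```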
26Genetic Drift
- Genetic drift is random change in allele frequencies. Genetic drift occurs in all populations, but it has a major effect on small populations.
- For Darwin and the neo-Darwinians, selection was the only force that had a significant effect on evolution. More recently it has been recognized that random changes, genetic drift, can also significantly influence evolutionary change. It is thought that most major events occur in small isolated populations.
- Simple example: A population of 1 female and 2 males, where the female chooses only 1 male to mate with. Assume that the female has the Aa genotype, male 1 is AA, and male 2 is aa.
- Initially the allele frequencies are 0.5 A and 0.5 a.
- If male 1 gets to mate, the offspring will have a 0.75 A, 0.25 a frequency.
- If male 2 mates, the offspring will be 0.25 A and 0.75 a.
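The three-individual example can be checked by noting that each parent contributes half of the offspring's gene copies, so the offspring allele frequency is the average of the two parents' frequencies. The lookup table and function name below are illustrative.

```python
# Frequency of the A allele within each genotype
A_FREQUENCY = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}


def offspring_a_frequency(mother, father):
    """Expected frequency of A among the offspring of one mating pair."""
    return 0.5 * (A_FREQUENCY[mother] + A_FREQUENCY[father])


# Starting population: female Aa, male 1 AA, male 2 aa
initial = (A_FREQUENCY["Aa"] + A_FREQUENCY["AA"] + A_FREQUENCY["aa"]) / 3
with_male_1 = offspring_a_frequency("Aa", "AA")
with_male_2 = offspring_a_frequency("Aa", "aa")
print(initial, with_male_1, with_male_2)
```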
27Fixation of Alleles
- Genetic drift causes allele frequencies to fluctuate randomly each generation. However, if the frequency of an allele ever reaches zero, it is permanently eliminated from the population. The other allele, whose frequency is now 1.0, is fixed, which means that all individuals in the population will be homozygous for that allele. This continues for all future generations (in the absence of mutation).
- The average rate at which alleles become fixed is a function of the population size. The larger the population, the longer it takes for fixation to occur.
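Drift to fixation can be illustrated with a small simulation. This is a sketch under assumptions the slides do not state: a constant population size, non-overlapping generations, and each gene copy in the next generation drawn at random from the current gene pool (often called a Wright-Fisher model). The function name and the `max_generations` safety cap are illustrative.

```python
import random


def drift_until_fixation(pop_size, p_start, rng, max_generations=100_000):
    """Follow one allele under drift alone until it is lost (0.0) or fixed (1.0).

    Returns (final_frequency, generations_elapsed).
    """
    n_copies = 2 * pop_size   # diploid: two gene copies per individual
    p = p_start
    generations = 0
    while 0.0 < p < 1.0 and generations < max_generations:
        # Draw every copy of the next generation at random from the current pool
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        generations += 1
    return p, generations


rng = random.Random(0)   # fixed seed so repeated runs give the same numbers
for size in (5, 50):
    times = [drift_until_fixation(size, 0.5, rng)[1] for _ in range(20)]
    print(size, sum(times) / len(times))
```

Averaged over many runs, the larger population should take noticeably longer to reach fixation, in line with the statement above; individual runs vary widely because the process is random.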
28Population Bottlenecks and Founder Effect
- Bottlenecks and the founder effect are closely related phenomena.
- Founder effect: If a small group of individuals leaves a larger population and develops into a separate, isolated population, the allele frequencies in the new population are determined by the allele frequencies in the founders. Since these frequencies are probably different from those found in the general population, the new population will have a different set of frequencies.
- This is especially true for rare alleles, which can suddenly become prominent if one of the founders has the rare allele.
29Founder Effect Example
- Founder effect example: the Amish, a group who renounced technological progress, are descended from 30 Swiss founders. Most Amish mate within the group. One of the founders had Ellis-van Creveld syndrome, which causes short stature, extra fingers and toes, and heart defects. Today about 1 in 200 Amish are homozygous for this syndrome, which is very rare in the larger US population.
- Note the effect inbreeding has here: the problem comes from this recessive condition becoming homozygous due to the mating of closely related people.
30Bottlenecks
- A population bottleneck is essentially the same phenomenon as the founder effect, except that in a bottleneck, the entire population is wiped out except for a small group of survivors. The allele frequencies in the survivors determine the allele frequencies in the population after it grows large once again.
- Example: Pingelap atoll is an island in the South Pacific. A typhoon in 1780 killed all but 30 people. One of the survivors was a man who was heterozygous for the recessive genetic disease achromatopsia. This condition causes complete color blindness. Today the island has about 2000 people on it, nearly all descended from these 30 survivors. About 10% of the population is homozygous for achromatopsia. Under H-W assumptions, this implies an allele frequency of about 0.32 (the square root of 0.10).
31Human Bottleneck
- The human population is thought to have gone through a population bottleneck about 100,000 years ago. There is more genetic variation among chimpanzees living within 30 miles of each other in central Africa than there is in the entire human species.
- The tree represents mutational differences in mitochondrial DNA for various members of the Great Apes (including humans).