Title: Genomic Conflict and DNA Sequence Variation
1Genomic Conflict and DNA Sequence Variation
Marcy K. Uyenoyama, Department of Biology, Duke University
2Overview
- Population genetics
- Historically model-rich
- Present need: model-based interpretation of observed patterns of genomic variation
- What are hallmarks of each model?
- Self-incompatibility systems in plants
- Recognizing genomic conflict due to sexual antagonism
3Canonical models
- Neutral evolution
- Pure neutrality: distribution of offspring number is independent of any trait in parent
- Demographic history: deme founding, gene flow
- Purifying selection: maintain functioning state against random deleterious mutations
- Selection
- Balancing selection: maintenance of different forms
- Selective sweeps: substitution of most fit for less fit
4Hallmarks of evolution
- How do we know it when we see it?
- Patterns evident in genome variation
- Model selection
- Choosing among a small number of canonical models
for any particular system
5A random sample of genes
[Figure labels: Observed sample; Ancestral sequence]
6Allele and mutation spectra
[Figure: site frequency spectrum; axis labels: Number of mutations, Multiplicity]
Allele spectrum a: a_1 = 6, a_3 = 1, a_5 = 1, a_6 = 1, for a_i the number of alleles with multiplicity i
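As a minimal sketch of how an allele spectrum is tabulated, the snippet below counts copies of each allele and then counts how many alleles share each multiplicity. The sample of 20 lettered genes is hypothetical, chosen only so that it reproduces the spectrum quoted on this slide.

```python
from collections import Counter

def allele_spectrum(sample):
    """Return {i: a_i}, where a_i is the number of distinct alleles seen exactly i times."""
    multiplicities = Counter(sample)                  # allele -> copies in the sample
    spectrum = Counter(multiplicities.values())       # multiplicity i -> a_i
    return dict(sorted(spectrum.items()))

# Hypothetical sample of n = 20 genes: six singletons plus alleles seen 3, 5 and 6 times
sample = list("ABCDEF") + ["G"] * 3 + ["H"] * 5 + ["I"] * 6
spectrum = allele_spectrum(sample)
print(spectrum)  # {1: 6, 3: 1, 5: 1, 6: 1}
```

The sample size is recovered as the sum of i * a_i over all multiplicities, here 20.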
7The neutral coalescent
Sample root from stationary distribution of P, the mutation transition matrix, and bifurcate
- After an interval, choose a lineage at random
- Replace it by two identical copies with probability [expression missing in source]
- Mutate it according to P with probability [expression missing in source]
8Evolutionary rates
- Events on level k
- Bifurcation at rate [expression missing in source]; mutation at rate [expression missing in source]
- Population parameters: ratios of rates
- Next event is a bifurcation/coalescence with probability [expression missing in source]
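The forward construction described on the last two slides can be sketched as follows. The rate expressions did not survive in the slide text, so this sketch assumes the standard scaling: on level k, bifurcation at rate k(k-1)/2 and mutation at rate k*theta/2, which makes the next event a bifurcation with probability (k-1)/(k-1+theta). It also assumes an infinite-alleles mutation model (every mutation creates a novel allele) in place of a general transition matrix P. The function name and arguments are illustrative.

```python
import random
from collections import Counter

def sample_infinite_alleles(n, theta, rng=None):
    """Allele counts in a sample of n genes, built forward from the root.

    Assumes: bifurcation probability (k-1)/(k-1+theta) on level k, and
    infinite-alleles mutation (each mutation yields a new allele label).
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    rng = rng or random.Random()
    lineages = [0, 0]        # the root has already bifurcated into two copies
    next_allele = 1
    while True:
        k = len(lineages)
        target = rng.randrange(k)                     # choose a lineage at random
        if rng.random() < (k - 1) / (k - 1 + theta):
            if k == n:
                break                                 # next bifurcation would exceed n: stop
            lineages.append(lineages[target])         # replace by two identical copies
        else:
            lineages[target] = next_allele            # mutate to a novel allele
            next_allele += 1
    return sorted(Counter(lineages).values(), reverse=True)

print(sample_infinite_alleles(20, 1.5, random.Random(1)))
```

Stopping just before the bifurcation that would create lineage n + 1 leaves exactly n lineages, including any mutations that fall on the terminal level.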
10Infinite-alleles model
- Mutation
- Novel allelic types formed at rate u per gene per generation
- Reproduction
- Frequency of allele i in the parental population: p_i
- Multinomial sampling of N genes to form the offspring
- To find: probability of the sample of n genes, (n_1, n_2, ..., n_k) or (a_1, a_2, ..., a_n)
- for k the number of distinct haplotypes (alleles), n_i the number of replicates of allele i, a_i the number of alleles with i replicates
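The mutation and reproduction steps above can be illustrated with a small forward simulation. This is a sketch under simple assumptions: a haploid population of N genes that starts monomorphic, with mutation applied after sampling in each generation. The function names and parameter values are illustrative only.

```python
import random
from collections import Counter

def wright_fisher_infinite_alleles(N, u, generations, rng=None):
    """Simulate N genes forward in time; returns the final list of allele labels."""
    rng = rng or random.Random()
    population = [0] * N                  # assumption: start with a single allele
    next_allele = 1
    for _ in range(generations):
        # Multinomial sampling: each offspring gene copies a uniformly chosen parent,
        # so allele i is drawn with probability p_i
        population = rng.choices(population, k=N)
        for g in range(N):
            if rng.random() < u:          # mutation to a novel allelic type
                population[g] = next_allele
                next_allele += 1
    return population

def sample_counts(population, n, rng=None):
    """Draw n genes without replacement; return (n_1, ..., n_k) in decreasing order."""
    rng = rng or random.Random()
    return sorted(Counter(rng.sample(population, n)).values(), reverse=True)

pop = wright_fisher_infinite_alleles(N=200, u=0.005, generations=2000, rng=random.Random(3))
print(sample_counts(pop, 20, random.Random(4)))
```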
11Ewens sampling formula
a = (a_1, a_2, ..., a_n), for a_i the number of alleles represented by i replicates in a sample of size n
θ = 2Nu, for N the effective number of genes and u the per-locus, per-generation rate of mutation
Ewens (1972, Theoretical Population Biology)
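The formula itself is not reproduced in the slide text. In its standard form, the probability of the spectrum a is n! / [θ(θ+1)...(θ+n-1)] multiplied by the product over i of (θ/i)^(a_i) / a_i!. The sketch below evaluates that standard form; the function name and the value of theta in the usage line are illustrative assumptions.

```python
from math import factorial

def esf_probability(a, theta):
    """Ewens sampling formula for a spectrum a = {i: a_i} (standard form)."""
    n = sum(i * a_i for i, a_i in a.items())      # sample size
    rising = 1.0
    for j in range(n):                            # theta (theta+1) ... (theta+n-1)
        rising *= theta + j
    prob = factorial(n) / rising
    for i, a_i in a.items():
        prob *= (theta / i) ** a_i / factorial(a_i)
    return prob

# Probability of the spectrum a_1 = 6, a_3 = 1, a_5 = 1, a_6 = 1 for a hypothetical theta
print(esf_probability({1: 6, 3: 1, 5: 1, 6: 1}, theta=2.0))
```

A quick consistency check is that the probabilities of all spectra for a given n sum to one; for n = 3 the three spectra contribute θ², 3θ, and 2, each divided by (θ+1)(θ+2).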
13Population genomics
About 750 accessions isolated from natural populations worldwide
Summary statistics for sample of 19 entire genomes
http://www.arabidopsis.org
14Arabidopsis SNP spectra
[Figure: SNP site frequency spectra; axis label: Minor allele counts]
Site frequency spectra differ among functional classes
Kim et al. (2008 Nature Genetics 39: 1151)
15ESF conditioned on two alleles
- Biallelic sample of size m
- Multiplicities i and (m - i)
- Independent of θ!
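Why θ drops out can be seen from the standard Ewens sampling formula: every two-allele spectrum carries the same factor θ², so conditioning on exactly two alleles cancels it. The weight of an unordered split {i, m - i} is 1/(i(m - i)), halved when i = m - i because a_i = 2 contributes a factor 1/2!. The sketch below, with an illustrative function name, normalizes those weights exactly.

```python
from fractions import Fraction

def biallelic_split_distribution(m):
    """P(minor allele count = i | exactly two alleles in a sample of size m)."""
    weights = {}
    for i in range(1, m // 2 + 1):
        w = Fraction(1, i * (m - i))      # theta**2 factor omitted: it cancels
        if 2 * i == m:
            w /= 2                        # a_i = 2 contributes 1/2! in the formula
        weights[i] = w
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}

print(biallelic_split_distribution(4))    # {1: Fraction(8, 11), 2: Fraction(3, 11)}
```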
17Actual site frequency spectra
Excess of rare and common types, deficiency of intermediate types
Data from NIEHS Environmental Genome Project
Direct resequencing of loci considered environmentally sensitive
Global representation of ethnicities
Hernandez, Williamson, and Bustamante (2007)
18Spectrum shape
Signature of expansion? Expansions maintain more rare mutations
Signature of selective sweep? Neutral variants experience selection as a population bottleneck
Black: constant population size
Grey: recent expansion from small population size
20Modelling a SNP data set
Nordborg (2001 Handbook of Statistical Genetics)
- Single segregating mutation in the sample genealogy
- Conditional on exactly one segregating site, determine the distribution of the size (number of descendants) of the branch on which the mutation occurs
- Exactly two alleles in the sample
- Conditional on two haplotypes, bearing any number of segregating sites, determine the distribution of numbers of the two alleles
21Conditioning
- Two alleles
- One segregating site
22Multiplicity conditioned on a SNP
- Single segregating site in a sample of size m
- Multiplicity i
- Dependent on θ!
Ganapathy and Uyenoyama (2009 Theoretical
Population Biology)
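The dependence on θ can be explored numerically. The following is a Monte Carlo sketch, not the analytical result of the cited paper: it simulates neutral coalescent genealogies (pairwise coalescence rate 1, mutation rate θ/2 per lineage, both assumed scalings) and weights each branch by the probability that the genealogy carries exactly one mutation and that it falls on that branch. All names and parameter values are illustrative.

```python
import math
import random

def snp_multiplicity_distribution(m, theta, reps=20000, seed=0):
    """Estimate P(mutant has i copies | exactly one segregating site), i = 1..m-1."""
    rng = random.Random(seed)
    weights = [0.0] * m
    for _ in range(reps):
        sizes = [1] * m                   # sampled descendants of each lineage
        length_by_size = [0.0] * m        # total branch length subtending i genes
        while len(sizes) > 1:
            k = len(sizes)
            t = rng.expovariate(k * (k - 1) / 2)      # time until next coalescence
            for s in sizes:
                length_by_size[s] += t
            i, j = rng.sample(range(k), 2)            # the pair that coalesces
            merged = sizes[i] + sizes[j]
            sizes = [s for idx, s in enumerate(sizes) if idx not in (i, j)] + [merged]
        total = sum(length_by_size)
        # P(one mutation on a branch of length l, none elsewhere) = (theta/2) l exp(-theta L/2)
        scale = (theta / 2) * math.exp(-theta * total / 2)
        for i in range(1, m):
            weights[i] += scale * length_by_size[i]
    z = sum(weights)
    return {i: weights[i] / z for i in range(1, m)}

for theta in (0.01, 5.0):
    print(theta, snp_multiplicity_distribution(5, theta))
```

For very small θ the estimate approaches the familiar 1/i shape of the unfolded spectrum; comparing runs at different θ shows how the conditioning reshapes it.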
24Overview
- Population genetics
- Historically model-rich
- Present need: model-based interpretation of observed patterns of genomic variation
- What are hallmarks of each model?
- Self-incompatibility systems in plants
- Recognizing genomic conflict due to sexual antagonism
25Genomic conflict
- Phenotypes
- Multiple genes generally influence a given phenotype
- Conflict
- Target trait value differs among genes that control phenotype
- Sexual antagonism
- Male and female function collaborate in reproduction
- Genes influencing each function may come into conflict
26Conflict and genomic variation
- Mating type regions as a battleground
- S-locus controls self-incompatibility in flowering plants
- How does sexual antagonism affect the pattern of molecular-level variation within the S-locus?
- What are hallmarks of conflict?
- Develop a basis for inference
- Model-based approach to the analysis of genetic variation
27
- Flower development
- Basic perfect flower includes both male and female components
- Fertilization
- Pollen grains deposited on stigma germinate and pollen tubes grow down style to the ovary
Mariana Ruiz, http://commons.wikimedia.org/wiki/File:Mature_flower_diagram.svg
28
- Gametophytic SI (GSI)
- Specificity expressed by individual pollen grain or tube determined by own S-allele
- Pollen rejection
- Growth of pollen tube arrested in style
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Gametophytic_self-incompatibility.png
29
- Sporophytic SI (SSI)
- Specificity expressed by individual pollen grain or tube determined by the S-locus genotype of its parent
- Pollen rejection
- Germination of pollen grain may be arrested at stigma surface
Norbert Holstein, http://commons.wikimedia.org/wiki/File:Sporophytic_self-incompatibility.png
30
Pistil (A) component: rejection of recognized specificities
Pollen (B) component: declaration of specificity
31Mating type regions
Uyenoyama (2005)
32Human Y chromosome
Skaletsky et al. (2003 Nature 423: 825)
- Non-recombining male-specific Y (MSY)
- Euchromatic region: 23 Mb
- Differences between two random Ys every 3-4 kb
- Mammalian sex determinant SRY
- Y-linked regulator of transcription of many male-specific Y-linked genes
33Mating type regions
- Linkage between pistil (A) and pollen (B) components is essential to SI function
- Pollen: declaration of specificity
- Pistil: rejection of recognized specificities
Uyenoyama (2005)
34Brassica S-locus
Natural populations often contain 30-50 S-alleles
Nasrallah (2000 Curr. Opin. Plant Biol.)
35Ubiquitin tags proteins for degradation
- Style S-RNase disrupts pollen tube growth
- Upon entering a pollen tube, S-RNases initially sequestered in a vacuole
- In incompatible crosses, vacuole breaks down, releasing S-RNases into cytoplasm of pollen tube
- Pollen SLF (S-locus F-box)
- Mediator of ubiquitinylation (attachment of ubiquitin)
- Disables all S-RNases except those of the same specificity
Vierstra (2009, Nature Reviews Molecular Cell Biology)
36Sexual antagonism
- Pistil: why reject fertilization?
- Screening of potential mates may improve offspring quality
- Cost under incomplete reproductive compensation: ovules may go unfertilized
- Pollen: why provoke rejection?
- Self-rejection may improve quality of own ovules
- Rejection by other plants reduces siring success
- Hide behind another S-specificity in sporophytic SI?
- Decline to declare S-specificity altogether?
37GSI model
- Basic discrete time recursion
- Symmetries in genotype and allele frequencies
- Model change in frequency of focal allele i,
assuming all other alleles in equal frequency
Wright (1937, Genetics)
38Diffusion approximation
- Change in allele frequency
- Diffusion equation coefficients
- Holds for large population size (N) and rate of mutation to new S-alleles (u) of order 1/N
Wright (1937, Genetics)
39Wright's diffusion model
- Diffusion with jumps
- Turnover rate
40Expansion of time scale under balancing selection
- High rate of invasion of rare alleles
- Promotes invasion of new and retention of rare types
- Maintains high numbers of alleles
- Genealogical relationships
- Tree shape similar under symmetric balancing selection and neutrality
- Greatly expanded time scale
Takahata (1993, Mechanisms of Molecular Evolution)
41S-allele turnover
- Quasi-equilibrium of S-alleles
- Invasion of new, rare S-alleles balanced by extinction of common S-alleles
- Expansion of time scale
- Rate of divergence among S-allele classes similar to rate among neutral lineages, but in a population of size fN
42Gametophytic SI models
- Basic discrete time recursion
- Diffusion approximation
- Parameters
- Effective population size (N)
- Rate of mutation to new S-specificities (u)
43Simulation results
- Stationary distribution of allele frequency
- Most time spent close to deterministic equilibrium (1/n) or in boundary layer close to extinction
- Number of S-alleles
- Analytical expectation for number of common S-alleles
Vallejo-Marín and Uyenoyama (2008)
45Pollen specificity in GSI
- Each pollen expresses its own specificity
- Rarer specificities are incompatible with fewer plants
- Incompatible matings
- For n S-alleles in equal frequencies, a pollen type is incompatible with a proportion 2/n of all plants
46Sexual antagonism
- Pistil: why reject fertilization?
- Screening of potential mates may improve offspring quality
- Cost under incomplete reproductive compensation: ovules may go unfertilized
- Pollen: why provoke rejection?
- Self-rejection may improve quality of own ovules
- Rejection by other plants reduces siring success
- Hide behind another S-specificity in sporophytic SI?
- Decline to declare S-specificity altogether?
47Fate of style-part mutant
[Figure: outcome regions labeled Full SC, Polymorphism, Full SI]
48Fate of pollen-part mutant
[Figure: outcome regions labeled Full SC, Disruption, Polymorphism, Full SI; axes: relative viability of inbred offspring (σ), self-pollen fraction (s)]
Uyenoyama, Zhang, and Newbigin (2001)
49
[Figure labels: An, Bn, Sn, Sb, Sa, Sn1]
Uyenoyama, Zhang, and Newbigin (2001)
50
[Figure labels: An, Bn, Sn, Sb, Sa, Sn1]
TURN OFF: partial breakdown of SI by pollen disablement
Evolutionarily unlikely
TURN ON: restoration of SI by stylar recognition
Evolutionarily unlikely
Uyenoyama, Zhang, and Newbigin (2001)
51Joint genealogies
Solanaceae and Plantaginaceae
Rosaceae
- Unlike S-RNase genes, SLF genes show
- Low divergence between allelic types
- No trans-specific sharing of lineages
Newbigin, Paape, and Kohn (2008)
52Cycles of loss/restoration of SI?
- Family-specific genealogies
- Rosaceae: do highly diverged, ancient SFB lineages reflect continuous operation or restoration of same F-box genes?
- Solanaceae, Plantaginaceae: recruitment of new F-box genes?
- Turnover of pollen-specificity loci
- Expression and recognition of a paralogue of the former pollen specificity gene?
- Can homologues be distinguished from paralogues with new function?
54An appeal for inference methods
- Sexual antagonism in mating type regions
- Neutral variation in linked regions
- Rates of substitution at determinants of mating type
- Inference
- Goal: use the pattern of variation in population samples of genomic regions as a basis for inference about the evolutionary process
- Detection
- Genomic conflict and other forms of selection
- Mating systems and population structure
55Pollen specificity in SSI
- Codominance
- Both specificities expressed
- Almost twice as many incompatible styles under SSI as under GSI for the same number of S-alleles
- Complete dominance
- One specificity expressed
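The "almost twice" comparison can be checked by direct enumeration. The sketch below assumes n S-alleles with all heterozygous style genotypes equally frequent (a simplification introduced here, not stated on the slide) and counts the styles that reject a focal pollen type: under GSI the pollen declares one allele, and under codominant SSI it declares both alleles of its parent. The function name is illustrative.

```python
from fractions import Fraction
from itertools import combinations

def incompatible_fractions(n):
    """Fraction of style genotypes rejecting a focal pollen type under GSI and codominant SSI."""
    styles = list(combinations(range(n), 2))      # all heterozygous style genotypes
    # GSI: pollen carrying allele 0 is rejected by every style containing allele 0
    gsi = Fraction(sum(0 in s for s in styles), len(styles))
    # SSI, codominance: pollen from parent (0, 1) is rejected by styles containing 0 or 1
    ssi = Fraction(sum(0 in s or 1 in s for s in styles), len(styles))
    return gsi, ssi

for n in (5, 10, 40):
    gsi, ssi = incompatible_fractions(n)
    print(n, gsi, ssi, float(ssi / gsi))
```

Under these assumptions the GSI fraction is 2/n and the SSI-to-GSI ratio is 2 - 1/(n - 1), which approaches two as the number of S-alleles grows.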
56SRK genealogies
- Sporophytic SI
- Diploid genotype of pollen parent determines S-specificity of each pollen grain
- Class I is dominant over Class II, with codominance within class
- Class II: pollen-recessive
- Lower number of segregating alleles, each with relatively higher frequency in population
- Greater genealogical relationship within class?
Edh, Widén, and Ceplitis (2009)
57Is class II younger than class I?
- MRCA ages
- Class I: 25.5 ± 8.1 MY
- Class II: 3.1 ± 0.9 MY
- I/II: 41.4 ± 12.7 MY
- Origin of SLG/SRK system
- 42.1 ± 9.0 MY
Uyenoyama (1995)