A Bayesian approach to inferring recent selective sweeps in West African Anopheles gambiae populations

1
A Bayesian approach to inferring recent selective
sweeps in West African Anopheles gambiae
populations
  • John Marshall¹, Professor Robert Weiss²

¹Department of Biomathematics, UCLA School of
Medicine, Los Angeles CA 90095-1766 USA
²Department of Biostatistics, UCLA School of
Public Health, Los Angeles CA 90095-1772 USA
2
Using microsatellite alleles to detect recent
selective sweeps
  • Microsatellites
      • Tandem repeats of short DNA segments,
        typically 1-5 bp in length
      • Alleles defined by the number of repeats at a
        particular locus
      • Multiallelic → highly informative markers
  • Factors affecting variance in microsatellite
    allele size
      • Locus specific
          • Microsatellite mutation rate (mainly due
            to slippage during DNA replication)
      • Population specific
          • Effective population size
          • Population-level events (migration,
            bottlenecks)
      • Population and locus specific
          • Hitchhiking of a microsatellite allele
            with a linked gene under selection

3
The lnRV statistic
  • From population genetics, the variance in
    microsatellite allele size at a given locus (j)
    in a given population (i) is a function of the
    effective population size (N_e,i) and the
    microsatellite mutation rate (μ_j)
  • Taking the ratio of expected variances in
    microsatellite allele sizes for a pair of
    populations (i1 and i2) thus removes the
    locus dependence
  • For a pair of populations (i1 and i2), the log
    of this ratio of variances (lnRV) can be
    calculated for each of a set of loci
    (j = 1, 2, ..., T)
  • Coalescent simulations have shown empirically
    that the lnRV values follow a normal
    distribution
  • A microsatellite near a selected locus is
    expected to have reduced variance, and hence an
    lnRV value that is an outlier from the
    otherwise normal distribution of lnRV values
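
The lnRV calculation described on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenters' code: the function names (`ln_rv`, `flag_outliers`), the locus labels, the toy repeat counts and the 1.96 cutoff are all invented for the example.

```python
import math
from statistics import mean, stdev, variance

def ln_rv(sizes_pop1, sizes_pop2):
    """Natural log of the ratio of sample variances in allele size at one locus."""
    return math.log(variance(sizes_pop1) / variance(sizes_pop2))

def flag_outliers(values, cutoff=1.96):
    """Standardise values across loci and flag those with |z| > cutoff (assumed threshold)."""
    centre, spread = mean(values), stdev(values)
    return [abs((v - centre) / spread) > cutoff for v in values]

# Toy repeat counts for population 1 at eight hypothetical loci (made-up data).
pop1 = {
    "L1": [10, 12, 14, 16, 18],
    "L2": [9, 12, 14, 16, 19],
    "L3": [11, 12, 14, 16, 17],
    "L4": [10, 13, 14, 15, 18],
    "L5": [10, 11, 14, 17, 18],
    "L6": [10, 12, 14, 16, 19],
    "L7": [9, 12, 14, 16, 18],
    "L8": [13, 14, 14, 14, 15],  # reduced variance, as expected near a sweep
}
# For brevity, population 2 has the same toy sample at every locus.
pop2_sizes = [10, 12, 14, 16, 18]

loci = list(pop1)
lnrv_values = [ln_rv(pop1[locus], pop2_sizes) for locus in loci]
flags = flag_outliers(lnrv_values)
for locus, value, flagged in zip(loci, lnrv_values, flags):
    print(f"{locus}: lnRV = {value:+.2f}{'  <- outlier' if flagged else ''}")
```

With these toy numbers only the low-variance locus `L8` is flagged. Real data would use many more individuals per locus and a cutoff chosen for the study.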

4
Pros and cons of the lnRV statistic
  • CONS
      • Much information is lost when the allele
        size data at a particular locus for all
        individuals in a population are reduced to a
        single value
      • Only makes pairwise comparisons
          • Difficult to extend the methodology to
            >2 populations
          • Inferences from one pair of populations
            are not carried over to other populations
      • Masking can occur when multiple outliers
        widen the confidence interval, so that none
        or only a subset of the outliers is detected
  • PROS
      • Easy and fast to calculate
      • Intuitive to understand
      • Can cope with a very large number of loci
      • Not sensitive to genetic drift, migration or
        inbreeding, since these processes affect all
        loci to the same extent and so cancel in the
        ratio calculation
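
The masking problem listed under CONS can be shown numerically. The sketch below is an illustration only: the lnRV values are made up, the helper names (`z_scores`, `count_flagged`) are hypothetical, and a simple |z| > 1.96 rule is assumed as the outlier test.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardise values using the sample mean and sample standard deviation."""
    centre, spread = mean(values), stdev(values)
    return [(v - centre) / spread for v in values]

def count_flagged(values, cutoff=1.96):
    """Number of values whose |z| exceeds the (assumed) cutoff."""
    return sum(abs(z) > cutoff for z in z_scores(values))

# Nine made-up "neutral" lnRV values from -0.4 to 0.4.
neutral = [x / 10 for x in range(-4, 5)]

one_sweep = neutral + [-3.0]        # a single strongly reduced locus
four_sweeps = neutral + [-3.0] * 4  # four equally reduced loci

print(count_flagged(one_sweep))    # 1: the lone outlier is detected
print(count_flagged(four_sweeps))  # 0: the outliers inflate the SD and hide each other
```

Adding more swept loci shifts the mean and inflates the standard deviation, so the same lnRV value of -3.0 no longer exceeds the cutoff.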

5
The Bayesian model
Distribution of microsatellite allele sizes
Mean components
Variance components
(i indexes population, j indexes locus, k indexes
individual)
6
Consistency between lnRV statistic and Bayesian
ANOVA
Bayesian ANOVA
lnRV statistic
Relative selection
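
The equations for this slide are not preserved in the transcript. As a hedged sketch of why the two approaches can agree, assume a log-linear decomposition of the allele-size variance; the symbols α, a_i, b_j and c_ij are assumptions introduced for illustration and are not necessarily those on the original slide:

```latex
\begin{align}
y_{ijk} &\sim \mathcal{N}\!\left(\mu_{ij},\, \sigma^2_{ij}\right), \\
\log \sigma^2_{ij} &= \alpha + a_i + b_j + c_{ij}, \\
\ln RV_j(i_1, i_2) &\approx \log \sigma^2_{i_1 j} - \log \sigma^2_{i_2 j}
  = \left(a_{i_1} - a_{i_2}\right) + \left(c_{i_1 j} - c_{i_2 j}\right).
\end{align}
```

Under this assumed form the locus effect b_j (which would absorb the mutation rate) cancels in the ratio, the population difference a_{i1} - a_{i2} is the same at every locus, and only the interaction terms c_ij can make one locus stand out, which is the role "relative selection" plays on this slide.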
7
Bayesian statistics for detecting selective sweeps
For a given locus j, the population with the
smallest fractional reduction in allele size
variance is denoted i_max and has this
corresponding variance component.
Selection at locus j in each other population can
then be measured relative to population i_max, e.g.
  • Here BnM (Mopti form, Banambani) has the largest
    variance-component value, so it is the least
    selected
  • BnB and SeB (Bamako form, Banambani and
    Selinkenyi) have the smallest values, so they
    are the most selected
  • The extent of selection can be measured by
  • And
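
The formulas for these measures are not preserved in the transcript. The following is one possible sketch of how relative selection could be summarised from MCMC output, not the presenters' method: it assumes posterior draws of a per-population variance component at a single locus are available, picks i_max by posterior mean, and reports a mean difference and a posterior probability for each other population. The draws are made-up numbers, `relative_selection` is a hypothetical helper, and the label `SeM` is assumed by analogy with the labels on the slide.

```python
from statistics import mean

# Made-up posterior draws of the variance component at one locus,
# one list per population (same MCMC iterations in each list).
draws = {
    "BnM": [0.5, 0.7, 0.6, 0.4, 0.8],
    "BnB": [-1.0, -1.2, -0.8, -1.1, -0.9],
    "SeB": [-0.9, -1.0, -0.7, -1.3, -1.1],
    "SeM": [0.3, 0.9, 0.5, 0.6, 0.2],  # hypothetical label
}

def relative_selection(draws):
    """Compare every population with the one having the largest posterior mean."""
    post_mean = {pop: mean(vals) for pop, vals in draws.items()}
    i_max = max(post_mean, key=post_mean.get)  # least-selected population
    reference = draws[i_max]
    summary = {}
    for pop, vals in draws.items():
        if pop == i_max:
            continue
        diffs = [v - r for v, r in zip(vals, reference)]
        summary[pop] = {
            # Posterior mean of the difference from i_max (more negative = more selected).
            "mean_diff": mean(diffs),
            # Fraction of draws in which this population lies below i_max.
            "prob_below": sum(d < 0 for d in diffs) / len(diffs),
        }
    return i_max, summary

i_max, summary = relative_selection(draws)
print("reference population:", i_max)
for pop, stats in summary.items():
    print(pop, round(stats["mean_diff"], 2), stats["prob_below"])
```

Comparing draws iteration by iteration keeps the posterior dependence between populations; in practice thousands of draws would be used rather than five.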

8
Pros and cons of Bayesian approach
  • CONS
      • Can take a long time to converge
      • Sometimes requires a lot of computing power
      • Bayesian methods are more difficult to
        implement
          • Require well-specified prior
            distributions
          • Require programming and the use of
            complicated software
      • Inferences are slightly influenced by the
        subjective choice of prior distributions
  • PROS
      • Doesn't reduce the data to summary
        statistics before analysis
      • Can be used to compare >2 populations at
        once
      • Inferences from one population are carried
        over to all others
      • Can cope with any number of selected loci
        without masking occurring
      • Supplies quantitative measures of the
        probability that selection has occurred
      • Can cope well with very small sample sizes

9
Microsatellite data for West African Anopheles
gambiae populations
  • 1998 data set
      • Allele size data collected at 21
        microsatellite loci dispersed throughout the
        Anopheles gambiae genome
      • 5 subpopulations
          • Bamako chromosomal form in the villages
            of Banambani and Selinkenyi
          • Mopti chromosomal form in the villages of
            Banambani and Selinkenyi
          • Savannah chromosomal form in the village
            of Banambani
  • 2003 data set
      • Allele size data collected at 12
        microsatellite loci dispersed throughout
        Anopheles gambiae chromosome 3
      • 12 subpopulations
          • Mopti chromosomal form in the villages of
            Oure, Dire, Kondi, Nampala, Torkya and
            Banikane
          • Savannah chromosomal form in the villages
            of Oure, Gono, Kokouna, Pimperena,
            Soulouba and Madina Diasra

10
Loci likely targeted by recent selective sweeps
(1998 data set)
Applying the Bayesian ANOVA model to the 1998
data set, there is evidence of selection at the
following loci (in order of magnitude):

Rank  Locus  Chromosome  Chromosomal form  Location
1     637    2L          Bamako            Banambani, Selinkenyi
2     038    X           Savannah          Banambani
3     135    2           Mopti             Banambani, Selinkenyi
4     079    2           Mopti             Banambani, Selinkenyi
5     175    2           Mopti             Banambani, Selinkenyi
6     095    2R          Mopti             Banambani, Selinkenyi
7     025    X           Savannah          Banambani

[Figures: locus 025; locus 637]
11
Loci likely targeted by recent selective sweeps
(2003 data set)
Applying the Bayesian ANOVA model to the 2003
data set, there is evidence of selection at the
following loci (in order of magnitude):

Rank  Locus  Chromosome  Chromosomal form  Location
1     119    3R          Mopti             Oure
2     127    3R          Savannah          Oure
3     093    3L          Mopti             Kondi, Banikane
3     093    3L          Savannah          Gono, Kokouna
4     812    3           Mopti             Nampala, Dire
5     817    3L          Savannah          Soulouba
6     555    3           Savannah          Madina Daisra

[Figure: locus 119]
12
Implications for recent selection in Anopheles
gambiae genome
  • 1998 data set
      • Strongest evidence for selection is for
          • locus 637 (chromosome 2) in Bamako form
          • locus 038 (X chromosome) in Savannah form
      • Most of the selected loci are on
        chromosome 2
      • For a given chromosomal form collected at
        Banambani and Selenkenyi, selection seems to
        be evident in both locations
      • The same does not apply for a given location
        where multiple chromosomal forms are
        collected
      • This suggests there is more gene flow
        between these two villages than there is
        between chromosomal forms
  • 2003 data set
      • Strongest evidence for selection is for
          • locus 119 (chromosome 3R) in Mopti form
            in Oure
          • locus 127 (chromosome 3R) in Savannah
            form in Oure
      • Selected loci are dispersed throughout
        chromosome 3 (only chromosome 3 loci were
        analyzed in this data set)
      • This time there is very little correlation
        in selection for a given chromosomal form
        collected at neighbouring locations
      • Possibly selection on chromosome 3 is weaker
        (the 1998 data set showed no selection on
        chromosome 3)
