Bayesian Hierarchical Model for QTLs - PowerPoint PPT Presentation

Slides: 31
Provided by: Susan834
Learn more at: http://people.uncw.edu
Transcript and Presenter's Notes

1
www.geocities.com/ResearchTriangle/Forum/4463/anigenetics.gif
2
Bayesian Hierarchical Model for QTLs
  • Susan Simmons
  • University of North Carolina Wilmington

3
Collaborators
  • Dr. Edward Boone
  • Dr. Ann Stapleton
  • Mr. Haikun Bao

4
DNA
5
Chromosome
6
Genes
7
Genetic Map
8
Chromosome 1 of the protozoan Cryptosporidium parvum
9
Chromosome 1 of Homo sapiens
10
Alleles
11
Genetic Maps
  • Many more maps are available at www.ncbi.nih.gov
  • Knowing information about genes now allows us to
    find associations between genes and outcomes
    (phenotypes)

12
Some examples
  • In 1989, a breakthrough was made for cystic
    fibrosis: the responsible gene was located.
  • Its location (or locus) is 7q31.2: the CFTR gene
    lies in region q31.2 on the long (q) arm of
    human chromosome 7 (a single gene is responsible
    for this disease).
  • The disease arises when an individual has two
    recessive copies at this location.
  • An individual with one dominant and one recessive
    copy is said to be a carrier of the disease.
  • Genetic screening can determine disease or
    carrier status.

13
Green revolution
  • The Green Revolution is the increase in food
    production stemming from the improved strains of
    wheat, rice, maize and other cereals developed in
    the 1960s by Dr. Norman Borlaug in Mexico and by
    others, under the sponsorship of the Rockefeller
    Foundation
  • It created new varieties of wheat and rice that
    produced higher yields.

14
QTL
  • Better medical treatments and increased
    agricultural production are only two examples in
    which identifying a location on the genome can
    have an impact.
  • A region on the genome (or on the chromosome)
    responsible for a quantitative trait (as opposed
    to a qualitative trait such as disease status) is
    known as a Quantitative Trait Locus (QTL).

15
Existing software
  • Zhao-Bang Zeng's group at NC State has QTL
    Cartographer
  • Karl Broman (Johns Hopkins) has an R package
    (R/qtl) that implements a number of algorithms
    for QTLs
  • These algorithms (and a number of other
    published algorithms) accept only one
    observation per genotype

16
World of plants
17
Why plants?
  • Increase yield to feed our increasing population
  • Make plants resistant to UV-B exposure

18
Plants, continued
  • Control
  • Design and Environment
  • Reproduction
  • Design (a recombinant inbred line, or RIL,
    population is one of the best designs for
    detecting QTLs); alleles are homozygous
  • Cost
  • Time

19
Plant QTL experiments
  • In most experiments, a number of replicates or
    clones are observed within each line
  • A number of plant biologists reduce the
    replicates to a single summary measure per line
    so that conventional methods can be used
  • Information is lost (and the result can be
    misleading; example in Conte et al. (unpublished))
  • A hierarchical model incorporates the replicates
    within each line

20
Data
  • Trait or phenotype: $y_{ij}$, $i = 1, \dots, L$,
    where $L$ is the number of lines, and
    $j = 1, \dots, n_i$, where $n_i$ is the number of
    replicates within line $i$
  • Design matrix: $X$ is $L \times M$, where $M$ is
    the number of markers on the genetic map

21
Hierarchical Model
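  • Hierarchical Model
  • $y_{ij} \sim N(\lambda_i, \sigma_i^2)$
  • $\lambda_i \sim N(X_i^T \beta, \tau^2)$
  • Priors
  • $\tau^2 \sim \text{Inverse-}\chi^2(1)$
  • $\beta_k \sim N(0, 100)$
  • $\sigma_i^2 \sim \text{Inverse-}\chi^2(1)$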
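
To make the two levels of the model concrete, here is a minimal Python sketch that simulates replicate data from it. The toy design matrix, the 0/1 marker coding, the effect sizes, and the fixed variances are all illustrative assumptions, not values from the presentation; the variances are fixed rather than drawn from the inverse chi-square priors so the example stays stable.

```python
import random

def simulate_lines(X, beta, tau2, sigma2, n_reps, seed=0):
    """Simulate y_ij ~ N(lambda_i, sigma_i^2) with lambda_i ~ N(X_i^T beta, tau^2)."""
    rng = random.Random(seed)
    data = []
    for x_i, s2_i, n_i in zip(X, sigma2, n_reps):
        mean_i = sum(x * b for x, b in zip(x_i, beta))   # X_i^T beta
        lam_i = rng.gauss(mean_i, tau2 ** 0.5)           # line-level mean
        # Replicates scatter around the line-level mean with the line's own variance.
        data.append([rng.gauss(lam_i, s2_i ** 0.5) for _ in range(n_i)])
    return data

# Hypothetical toy setup: L = 4 lines, M = 3 markers coded 0/1 (assumed coding).
X = [[0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1]]
beta = [0.0, 2.0, 0.0]   # only the second marker has an effect (assumed)
y = simulate_lines(X, beta, tau2=0.25, sigma2=[1.0] * 4, n_reps=[3, 5, 2, 4])
```

Each entry of `y` holds the replicates for one line, so unequal numbers of replicates per line are kept as they are instead of being collapsed into a summary measure.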

22
Posterior Model Probability
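  • Let $\mathcal{M}$ denote the set of all possible
    models. Given data $D$, the posterior probability
    of model $k_i$ is given by Bayes' Rule:
    $P(k_i \mid D) = \frac{P(D \mid k_i) P(k_i)}{\sum_{k_j \in \mathcal{M}} P(D \mid k_j) P(k_j)}$
  • (These probabilities are implicitly conditioned
    on the set $\mathcal{M}$)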
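
Bayes' rule for models normalizes the product of each model's marginal likelihood and prior over all candidate models. A minimal Python sketch of that normalization follows; the function name, the uniform default prior, and the log marginal likelihood values are made up for illustration.

```python
import math

def posterior_model_probs(log_marginals, priors=None):
    """Return P(k_i | D) from log P(D | k_i) and priors P(k_i) via Bayes' rule."""
    n = len(log_marginals)
    if priors is None:
        priors = [1.0 / n] * n               # uniform prior over models (assumption)
    log_post = [lm + math.log(p) for lm, p in zip(log_marginals, priors)]
    shift = max(log_post)                    # subtract the max to avoid underflow
    weights = [math.exp(lp - shift) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# Made-up log marginal likelihoods for three candidate models.
probs = posterior_model_probs([-104.2, -101.7, -110.0])
```

Working on the log scale matters in practice because marginal likelihoods of real data sets are usually far too small to represent directly.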

23
Posterior Model continued
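  • To compute the probability of the model given the
    data from the previous slide, $P(k_i \mid D)$, we
    need to compute $P(D \mid k_i)$, where
    $P(D \mid k_i) = \int P(D \mid \theta_i, k_i) P(\theta_i \mid k_i) \, d\theta_i$
  • $\theta_i$ is the vector of unknown parameters
    for model $k_i$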

24
Integration
  • This integration can become difficult since the
    length of the vector of unknown parameters is
    $2L + M + 2$. Use a Monte Carlo estimate of the
    integral
  • where $\theta_i^{(j)}$, $j = 1, \dots, t$, are
    samples of the unknown parameters from the
    posterior distribution
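
The transcript does not preserve which Monte Carlo estimator was used. One well-known estimator built from posterior samples is the harmonic mean of the likelihoods, which is simple but known to be unstable; the Python sketch below shows it purely as an illustration, and the function name and inputs are assumptions.

```python
import math

def log_harmonic_mean_estimate(log_liks):
    """Estimate log P(D | k) from log-likelihoods evaluated at t posterior draws.

    Harmonic mean identity: 1 / P(D | k) is approximated by the average of
    1 / likelihood over the posterior draws.
    """
    t = len(log_liks)
    neg = [-ll for ll in log_liks]
    shift = max(neg)                                   # log-sum-exp for stability
    log_sum = shift + math.log(sum(math.exp(v - shift) for v in neg))
    return math.log(t) - log_sum

# Made-up log-likelihood values at two posterior draws.
estimate = log_harmonic_mean_estimate([-1.0, -5.0])
```

The estimate is dominated by the draws with the smallest likelihood, which is the source of its instability and a reason to compare it against other estimators before relying on it.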

25
Search strategy
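  • The activation probability, $P(\beta_j \neq 0 \mid D)$,
    is defined as the total posterior probability of
    the models in which marker $j$ is active:
    $P(\beta_j \neq 0 \mid D) = \sum_{k} P(k \mid D) \, 1\{\beta_j \neq 0 \text{ in } k\}$
  • There are $2^M$ potential models, which can make
    the calculation of $P(\beta_j \neq 0 \mid D)$
    computationally intensive
  • Instead, we define a conditional probability
    search approach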
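
For a small number of markers the activation probability can be computed by brute force, which also shows why full enumeration does not scale. The Python sketch below assumes each model is represented as a tuple of 0/1 inclusion indicators and uses a made-up posterior over the four models for two markers.

```python
def activation_probabilities(model_probs):
    """Sum P(k | D) over the models k in which each marker is included.

    model_probs maps inclusion tuples (one 0/1 entry per marker) to P(k | D).
    """
    n_markers = len(next(iter(model_probs)))
    return [
        sum(p for model, p in model_probs.items() if model[j] == 1)
        for j in range(n_markers)
    ]

# Hypothetical posterior over all 2^2 models for M = 2 markers.
model_probs = {(0, 0): 0.1, (1, 0): 0.6, (0, 1): 0.1, (1, 1): 0.2}
act = activation_probabilities(model_probs)   # marker 1: 0.6 + 0.2, marker 2: 0.1 + 0.2

# The number of models doubles with every added marker.
n_models_for_20_markers = 2 ** 20
```

With even a few dozen markers the dictionary above could not be stored, which motivates searching the model space instead of enumerating it.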

26
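[Diagram: search tree over chromosomes C1 to C5. C2 splits into C21 and C22, and C21 into C211 and C212. C4 splits into C41 and C42, C42 into C421 and C422, and C421 into C4211 and C4212.]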
27
Simulated data
  • Using the line information from the Bay x Sha RIL
    population, a single QTL was simulated on the
    fourth marker of the first chromosome.
  • The Bay x Sha population has 5 chromosomes.

28
[Diagram: search tree with the value reported at each node]
  • C1 1, C2 0.4, C3 0.6, C4 0.4, C5 0.0029
  • C11 1, C12 0.9362, C31 0.063, C32 0.063
  • C111 0.818, C112 0.927, C121 0.114, C122 0.108
  • C1111 0.041 (M1), C1112 0.014 (M2),
    C1121 0.083 (M3), C1122 1 (M4)
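
The node values on this slide can be read as a tree. The Python sketch below encodes them and follows the highest-valued node at each level. Two assumptions are made: that node names encode the path (C11 and C12 are the two halves of C1), and that a greedy descent is a fair illustration; the presentation does not state the exact rule used to decide which branches to expand.

```python
# Values copied from the slide; names are assumed to encode parent and child.
tree = {
    "C1": 1.0, "C2": 0.4, "C3": 0.6, "C4": 0.4, "C5": 0.0029,
    "C11": 1.0, "C12": 0.9362, "C31": 0.063, "C32": 0.063,
    "C111": 0.818, "C112": 0.927, "C121": 0.114, "C122": 0.108,
    "C1111": 0.041, "C1112": 0.014, "C1121": 0.083, "C1122": 1.0,
}

def children(node, tree):
    """Return the nodes whose names extend `node` by exactly one digit."""
    return [n for n in tree if len(n) == len(node) + 1 and n.startswith(node)]

def greedy_descent(tree):
    """Follow the highest-valued node at each level until no children remain."""
    roots = [n for n in tree if len(n) == 2]     # chromosomes C1..C5
    node = max(roots, key=tree.get)
    path = [node]
    while children(node, tree):
        node = max(children(node, tree), key=tree.get)
        path.append(node)
    return path

path = greedy_descent(tree)
```

On these values the descent ends at C1122, the node labelled M4 on the slide, which matches the simulated QTL on the fourth marker of the first chromosome.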
29
Comments
  • Need to run the model on more simulations
  • Would like to compare this search strategy to a
    stochastic search
  • Would like to include epistasis in the model

30
Thank you