Zoology 2005 Part 2 - PowerPoint PPT Presentation

1
Zoology 2005 Part 2
  • Richard Mott

2
Inbred Mouse Strain Haplotype Structure
  • When the genomes of a pair of inbred strains are
    compared,
  • we find a mosaic of segments of identity and
    difference (Wade et al, Nature 2002).
  • A QTL segregating between the strains must lie in
    a region of sequence difference.
  • What happens when we compare more than two
    strains simultaneously?

3
No Simple Haplotype Block Mosaic
Yalcin et al 2004 PNAS
4
But a Tree Mosaic
5
In-silico Mapping
  • Simple idea:
  • Collect phenotypes across a set of inbred strains
  • Genotype the strains (ONCE)
  • Look for phenotype-genotype correlation
  • Works well for simple Mendelian traits (eg coat
    colour)
  • Suggested as a panacea for QTL mapping
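
A minimal sketch of the phenotype-genotype correlation step, assuming one phenotype value per strain and one allele call per strain at each marker. The strain names, marker names and values are invented for illustration; the score used here (fraction of between-strain variance explained by the allele grouping) is one simple choice, not necessarily the one used in the talk.

```python
from statistics import mean

def variance_explained(genotypes, phenotypes):
    """Fraction of between-strain phenotypic variance explained by one marker.

    genotypes:  dict strain -> allele at this marker
    phenotypes: dict strain -> phenotype value
    """
    strains = list(genotypes)
    grand_mean = mean(phenotypes[s] for s in strains)
    # Group the strains by allele and take the mean phenotype of each group.
    groups = {}
    for s in strains:
        groups.setdefault(genotypes[s], []).append(phenotypes[s])
    group_mean = {allele: mean(vals) for allele, vals in groups.items()}
    ss_total = sum((phenotypes[s] - grand_mean) ** 2 for s in strains)
    ss_between = sum((group_mean[genotypes[s]] - grand_mean) ** 2 for s in strains)
    return ss_between / ss_total if ss_total > 0 else 0.0

# Hypothetical data: six strains, a Mendelian-like 0/1 trait, two markers.
phenotypes = {"S1": 1, "S2": 1, "S3": 1, "S4": 0, "S5": 0, "S6": 0}
markers = {
    "m1": {"S1": "A", "S2": "A", "S3": "A", "S4": "B", "S5": "B", "S6": "B"},
    "m2": {"S1": "A", "S2": "A", "S3": "B", "S4": "A", "S5": "B", "S6": "B"},
}
scores = {name: variance_explained(g, phenotypes) for name, g in markers.items()}
best_marker = max(scores, key=scores.get)
```

In this toy data marker m1 separates the strains perfectly and explains all of the variance, while m2 explains only one ninth of it. The strains are genotyped once and the same genotype table can be reused for every new phenotype.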

6
In-silico Mapping Problems
  • Less well-suited for complex traits
  • Number of strains required grows quickly with the
    complexity of the trait. Suggested at least 100
    strains required, possibly more if epistasis is
    present
  • Require high-density genotype/sequence data to
    ensure identity-by-state = identity-by-descent
  • May be very useful for the dissection of a QTL
    previously identified in an F2 cross (look for
    patterns of sequence difference)

7
Recombinant Inbred Lines
  • Panels of inbred lines descended from pairs of
    inbred strains
  • Genomes are inbred mosaics of the founders
  • Lines only need be genotyped once
  • Similar to in-silico mapping except
  • identity-by-descent = identity-by-state
  • Coarser recombination structure
  • lower resolution mapping?

8
BXD chromosome 4
9
Testing if a variant is functional without
genotyping it (Yalcin et al, Genetics 2005)
  • Requirements
  • A Heterogeneous Stock, genotyped at a skeleton of
    markers
  • The genome sequences of the progenitor strains
  • A statistical test

10
Merge Analysis
  • Each polymorphism groups together the founders
    according to their alleles
  • If the polymorphism is functional, then a model
    in which the phenotypic strain effects are
    estimated after merging the strains together
    should be as good as a model where each strain
    can have an independent effect.
  • Compare the fit of merged and unmerged
    genetic models to test if the variant is
    functional.
  • If the fit of the merged model is poor then that
    variant can be eliminated.
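
The comparison of merged and unmerged models can be sketched as a nested-model F statistic. This is a simplified illustration, not the published implementation: it assumes each animal's founder strain at the locus is known with certainty (in a real Heterogeneous Stock the descent is only known probabilistically), and the founder names, allele groupings and phenotype values are invented.

```python
from statistics import mean

def rss_by_group(values, labels):
    """Residual sum of squares when each group is fitted by its own mean."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    group_mean = {g: mean(vs) for g, vs in groups.items()}
    rss = sum((v - group_mean[g]) ** 2 for v, g in zip(values, labels))
    return rss, len(groups)

def merge_f_statistic(phenotypes, founders, allele_of):
    """F statistic comparing the merged model to the unmerged (per-strain) model.

    founders:  founder strain carried by each animal at the locus (assumed known)
    allele_of: dict founder strain -> allele at the polymorphism being tested
    """
    rss_full, k_full = rss_by_group(phenotypes, founders)
    merged_labels = [allele_of[f] for f in founders]
    rss_merged, k_merged = rss_by_group(phenotypes, merged_labels)
    n = len(phenotypes)
    if k_full == k_merged or n <= k_full or rss_full == 0:
        raise ValueError("models are not distinguishable on this data")
    numerator = (rss_merged - rss_full) / (k_full - k_merged)
    denominator = rss_full / (n - k_full)
    return numerator / denominator

# Hypothetical data: four founders, two animals per founder.
founders = ["F1", "F1", "F2", "F2", "F3", "F3", "F4", "F4"]
phenotypes = [10.0, 11.0, 10.5, 11.5, 20.0, 21.0, 20.5, 21.5]
variant_a = {"F1": 0, "F2": 0, "F3": 1, "F4": 1}  # grouping matches the trait
variant_b = {"F1": 0, "F2": 1, "F3": 0, "F4": 1}  # grouping cuts across the trait
f_a = merge_f_statistic(phenotypes, founders, variant_a)
f_b = merge_f_statistic(phenotypes, founders, variant_b)
```

With this toy data the first variant gives F = 0.5 (merging costs almost nothing, so the variant stays a candidate) and the second gives F = 200 (the merged model fits poorly, so the variant can be eliminated).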

11
Merge Analysis
12
Merge Analysis
13
How can we show a gene under a QTL peak affects
the trait?
  • Genetic Mapping identifies Functional Variants,
    not Genes
  • Could be a control element affecting some other
    gene

14
Quantitative Complementation
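[Figure, build 1: KO]
15
Quantitative Complementation
[Figure, build 2: KO and wt alleles against High and Low strains]
16
Quantitative Complementation
[Figure, build 3: as build 2, with one difference d marked]
17
Quantitative Complementation
[Figure, build 4: as build 2, with two differences d marked and the contrast D between them]
18
Quantitative Complementation
[Figure, build 5: same labels as build 4]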
19
Using Functional Information to Confirm Genes
  • Further experiments
  • further bioinformatics, eg networks, functional
    annotation (GO, KEGG)
  • candidate gene sequencing
  • gene expression analyses (eQTL) of
  • founder strains
  • HS

20
Mouse/human sequence comparison
21
Enhancer reporter assays
[Figure: reporter constructs combining enhancer, promoter and luciferase reporter]
22
Enhancer elements affect promoter expression
23
Large-Scale Genetic Mapping
  • Using a Heterogeneous Stock
  • Multiple Phenotypes collected in parallel

24
Predictions (from simulation of an HS population)
  • In a population of 1,000 HS animals
  • Genome-wide power to detect a 5% QTL: 0.92
  • Resolution < 2 Mb

25
Study design
  • 2,000 mice
  • 15,000 diallelic markers
  • More than 100 phenotypes
  • each mouse subject to a battery of tests spread
    over weeks 5-9 of the animal's life
  • more (post-mortem) phenotypes being added

26
Phenotypes
27
Covariates
  • For each phenotype, we recorded covariates, eg,
  • experimenter
  • time of day
  • apparatus (eg, Shock Chamber 3)

28
Data collection
  • All animals microchipped
  • Automated data checking, processing and uploading
  • All data uploaded into the Integrated Genotyping
    System (IGS) database

29
Genotypes from Illumina
  • Genotyped and phenotyped 2,000 offspring
  • Genotyped 300 parents
  • Pedigree analysis shows genotyping was 99.99%
    accurate
  • 11,558 markers polymorphic in HS

30
QTL mapping
  • Models
  • HAPPY and single marker association
  • Fitting framework
  • Linear regression of (transformed) phenotypes
  • Survival analysis for latency data
  • Logit-based models for categorical data
  • Significant covariates incorporated into the null
    model, eg

Null:     Startle ~ TestChamber + BodyWeight + Year + Age + Hour + Gender

Additive: Null + additive genetic info for locus

Full:     Null + full genetic info for locus

31
QTL mapping
  • Significance tests
  • partial F-test (linear models), Chi-square / LRT
    (others)
  • Significance thresholds
  • different for each phenotype
  • have to take into account LD
  • fit distribution to scores of permuted data
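
A rough sketch of how a permutation-based genome-wide threshold can be obtained. It makes several simplifying assumptions: the score is the squared correlation between a marker and the phenotype (a stand-in for the test statistics listed above), the threshold is read off as an empirical quantile of the permuted maxima rather than from a fitted distribution, and the genotypes and phenotype are simulated. Shuffling only the phenotype leaves the correlation between markers (LD) intact, which is why the resulting threshold accounts for it.

```python
import random

def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)

def permutation_threshold(genotypes, phenotype, n_perm=200, quantile=0.95, seed=0):
    """Empirical quantile of the genome-wide maximum score under permutation.

    genotypes: list of markers, each a list of allele counts per animal
    phenotype: list of phenotype values, one per animal
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the phenotype-genotype link only
        maxima.append(max(squared_correlation(m, shuffled) for m in genotypes))
    maxima.sort()
    index = min(int(quantile * n_perm), n_perm - 1)
    return maxima[index]

# Simulated data (hypothetical): 40 animals, 25 markers, no true QTL.
demo_rng = random.Random(1)
n_animals, n_markers = 40, 25
genotypes = [[demo_rng.choice((0, 1, 2)) for _ in range(n_animals)]
             for _ in range(n_markers)]
phenotype = [demo_rng.gauss(0.0, 1.0) for _ in range(n_animals)]
threshold_95 = permutation_threshold(genotypes, phenotype, quantile=0.95)
threshold_50 = permutation_threshold(genotypes, phenotype, quantile=0.50)
```

Because the permuted scores depend on the phenotype's own distribution and covariates, the threshold has to be recomputed for each phenotype.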

32
E-values
  • We set score thresholds using ideas from sequence
    databank search programs such as BLAST
  • The E-value of a threshold is the number of times
    you would expect to see a false positive exceed
    the threshold in a genome scan
  • Applying the Bonferroni correction to the number
    of marker intervals is too severe because LD
    makes neighbouring scores correlated.
  • Permutation analyses indicate that the most
    significant score expected by chance amongst
    all 12,000 marker intervals behaves as if it were
    drawn from M = 4000 independent tests.
  • Hence a nominal P-value of p corresponds to an
    E-value of pM
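
As a worked illustration of the last two points, the conversion between a nominal P-value and an E-value is a single multiplication. The function names are hypothetical; the values 4000 and 12,000 are the ones quoted on the slide.

```python
EFFECTIVE_TESTS = 4000     # M, effective number of independent tests (from the slide)
MARKER_INTERVALS = 12000   # number of marker intervals actually scanned

def e_value(p, effective_tests=EFFECTIVE_TESTS):
    """Expected number of false positives at least this extreme in one genome scan."""
    return p * effective_tests

def nominal_p_for_e(e, effective_tests=EFFECTIVE_TESTS):
    """Nominal P-value needed to reach a target E-value."""
    return e / effective_tests

p = 2.5e-6
e = e_value(p)                    # about 0.01
bonferroni = p * MARKER_INTERVALS  # about 0.03, three times more conservative
```

The factor of three between the two corrections is the ratio of marker intervals to effective tests, which is the cost of ignoring LD between neighbouring intervals.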

37
Problems
  • Our population includes both siblings and
    unrelateds
  • We have ignored this distinction
  • And therefore
  • Confounding environmental family effects with
    genetic family effects
  • Allowing ghost peaks due to linkage
    disequilibrium between markers within a sibship
  • Our solution so far
  • (1) Investigating the effect of environmental
    factors and building covariates into the model
  • (2) Identify peaks by a multiple conditional fit

38
Multiple Peak Fitting: Forward Selection
  • For each phenotype's genome scan
  • Make list of all peaks > genome-wide threshold T
  • Fit most significant peak, P1
  • Go through list of peaks, refitting each one
    conditional upon the most significant peak.
  • Add the most significant remaining peak, P2
  • Continue refitting remaining peaks P3, P4 and
    adding them into model until the most significant
    remaining peak < T
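
The forward selection steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: each peak is represented by a single genotype vector, the score is the partial squared correlation with the phenotype after removing the peaks already in the model (standing in for the scan score compared against T), and the data are invented so that peak B is a "ghost" of peak A.

```python
def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(v, basis):
    """Remove from v its projection onto each vector of an orthogonal basis."""
    for b in basis:
        bb = dot(b, b)
        if bb > 1e-12:
            coef = dot(v, b) / bb
            v = [x - coef * y for x, y in zip(v, b)]
    return v

def partial_r2(y_res, x_res):
    sxx, syy = dot(x_res, x_res), dot(y_res, y_res)
    if sxx < 1e-12 or syy < 1e-12:
        return 0.0
    return dot(x_res, y_res) ** 2 / (sxx * syy)

def forward_select(peaks, phenotype, threshold):
    """Return peak names in the order they enter the model."""
    y = center(phenotype)
    centered = {name: center(g) for name, g in peaks.items()}
    # Start from the list of peaks whose marginal score exceeds the threshold.
    remaining = {n: g for n, g in centered.items() if partial_r2(y, g) > threshold}
    basis, selected = [], []
    while remaining:
        # Refit every remaining peak conditional on the peaks already selected.
        scores = {n: partial_r2(y, residualize(g, basis)) for n, g in remaining.items()}
        best = max(scores, key=scores.get)
        if scores[best] < threshold:
            break
        new_direction = residualize(remaining.pop(best), basis)
        basis.append(new_direction)
        y = residualize(y, [new_direction])
        selected.append(best)
    return selected

# Hypothetical data: A and C have real effects, B is correlated with A only.
peaks = {
    "A": [0, 0, 0, 0, 1, 1, 1, 1],
    "B": [0, 0, 0, 1, 1, 1, 1, 1],
    "C": [0, 1, 0, 1, 0, 1, 0, 1],
}
phenotype = [0.1, 1.4, -0.1, 1.6, 2.1, 3.4, 1.9, 3.6]
selected = forward_select(peaks, phenotype, threshold=0.3)
```

In this toy example all three peaks exceed the threshold on their own, but once A is in the model the score of B falls below the threshold while C remains, so only A and C are kept.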

39
Peaks found by multiple conditional fit
[Figure: multiple conditional fit (using additive model only); axis label: number of phenotypes]
40
Database for scans
41
Database for scans
Additive model
Full model
  • E-value thresholds
  • additive only
  • E < 0.01 is about the same as genome-wide corrected
    p < 0.01.

42
Database for scans
zoom in
43
Covariates
44
QTL Mapping Validation
  • Coat colour
  • Detection of known QTLs

45
Coat colour genes
46
A known QTL: HDL
Wang et al, 2003
HS mapping
47
High Resolution QTLs
Phenotype          Chrom  Mb       Method              Ref              HS position
Cue freezing       3      70-83    Genome tagged mice  Liu 2003         71-73
Obesity            2      142-168  Congenic            Demant 2004      150-153
10 week body mass  1      156-160  Progeny testing     Christians 2004  154.5-156
Emotionality       1      143-148  HS                  Mott 2000        143-144.5
Emotionality       10     123-127  HS                  Mott 2000        121.5-122.7
Emotionality       12     54-57    HS                  Mott 2000        55.5-56.5
Emotionality       15     64-77    HS                  Turri 1999       63.5-66
48
New QTLs: two examples
  • Freeze.During.Tone (from Cue Conditioning
    behavioural experiment): 1 peak
  • % of CD4 in CD3 cells (immunology assay):
    10 peaks

49
Cue Conditioning
  • Freezing in response to a conditioned stimulus

50
Cue Conditioning
  • Freeze.During.Tone: huge effect, small number of
    genes

chr15
cntn1 Contactin precursor (Neural cell
surface protein)
51
CD4 cells in CD3 cells
  • huge effect but lots of genes

52
CD4 in CD3 (under peak)
53
All QTLs
  • 608 peaks
  • Median interval is 938,936 bp
  • or about 9 genes per peak

54
Summary
  • The HS project so far has
  • phenotyped 2,500 HS mice
  • genotyped 2,300 mice
  • mapped over 140 phenotypes
  • identified more than 600 potential QTLs

55
Confirming gene candidates
  • Increased mapping resolution through
  • include epistasis
  • multivariate
  • G x E
  • pleiotropy
  • sex effects
  • Further experiments
  • further bioinformatics, eg networks, functional
    annotation (GO, KEGG)
  • candidate gene sequencing
  • gene expression analyses (eQTL) of
  • founder strains
  • HS

56
Confirming gene candidates: epistasis
Single marker association of pairwise epistasis
57
Work of many hands
  • Carmen Arboleda-Hitas
  • Amarjit Bhomra
  • Peter Burns
  • Richard Copley
  • Stuart Davidson
  • Simon Fiddy
  • Jonathan Flint
  • Polinka Hernandez
  • Sue Miller
  • Richard Mott
  • Chela Nunez
  • Gemma Peachey
  • Sagiv Shifman
  • Leah Solberg
  • Amy Taylor
  • Martin Taylor
  • Jordana Tzenova-Bell
  • William Valdar
  • Binnaz Yalcin
  • Dave Bannerman
  • Shoumo Bhattacharya
  • Bill Cookson
  • Rob Deacon
  • Dominique Gauguier
  • Doug Higgs
  • Tertius Hough
  • Paul Klenerman
  • Nick Rawlins
  • Project funded by
  • The Wellcome Trust, UK