Association analysis - PowerPoint PPT Presentation


Title: Association analysis


1
Association analysis
  • Shaun Purcell
  • Boulder Twin Workshop 2004

2
Overview
  • Candidate gene association
  • Haplotypes and linkage disequilibrium
  • Linkage and association
  • Family-based association

3
What is association?
  • Categorical traits
  • disease susceptibility genes
  • Continuous traits
  • quantitative trait loci, QTL

4
Disease traits
Is there a difference in allele/genotype
frequency between cases and controls?
  • Case Control
  • AA n1 n2
  • Aa n3 n4
  • aa n5 n6

5
Disease traits
Is there a difference in allele/genotype
frequency between cases and controls?
  • Genotype Case Control Frequency under HWE
  • AA 30 25 p^2
  • Aa 50 50 2p(1-p)
  • aa 20 25 (1-p)^2

Test for independence: chi-square statistic, p-value
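The test for independence on this genotype table can be sketched in a few lines. This is only an illustration using the counts on the slide: the function name is hypothetical, and the p-value uses the closed form of the chi-square upper tail that holds for 2 df.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: AA, Aa, aa; columns: cases, controls (counts from the slide).
counts = [[30, 25], [50, 50], [20, 25]]
stat, df = chi_square_independence(counts)

# For exactly 2 df the chi-square upper tail is exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi2 = {stat:.3f}, df = {df}, p = {p_value:.3f}")
```

With these counts the statistic is small, so the genotype frequencies do not differ significantly between cases and controls.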
6
Disease traits
  • General model: 2 df
  • Additive model: 1 df
  • Dominant model for A: 1 df
  • Effect sizes calculated as odds ratios
7
Relative risk
  • D+ D-
  • E+ a b
  • E- c d
  • Risk in E+ = a / (a + b)
  • Risk in E- = c / (c + d)
  • Relative risk of exposure = (a / (a + b)) / (c / (c + d))

8
Odds ratio
  • D+ D-
  • E+ a b
  • E- c d
  • Odds of E+ in D+ = a / c
  • Odds of E+ in D- = b / d
  • Odds ratio = (a / c) / (b / d)
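As a quick check on the two effect-size definitions, here is a minimal sketch that computes both from the same 2 x 2 table (a, b = exposed with and without disease; c, d = unexposed with and without disease). The counts are made up for illustration and the function names are hypothetical.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """(a / c) / (b / d), which is algebraically the cross-product ad / bc."""
    return (a / c) / (b / d)

# Hypothetical counts: a common disease, then a rare one.
common = (30, 70, 10, 90)
rare = (3, 997, 1, 999)

for label, cells in (("common", common), ("rare", rare)):
    rr = relative_risk(*cells)
    oddsr = odds_ratio(*cells)
    print(f"{label}: RR = {rr:.3f}, OR = {oddsr:.3f}")
```

The odds ratio is further from 1 than the relative risk when the disease is common, and the two nearly coincide when it is rare.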

9
Quantitative traits
  • ID Y G A D
  • 001 0.34 aa -1 0
  • 002 1.23 Aa 0 1
  • 003 1.66 Aa 0 1
  • 004 2.74 AA 1 0
  • 005 1.33 AA 1 0

Y = aA + dD + e
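A small sketch of how a and d could be estimated from the five rows on this slide. It assumes an intercept m (the midpoint between the two homozygote means), which the slide equation does not show. With the A and D coding above, least squares with an intercept reproduces the three genotype means exactly, so the estimates can be read off those means.

```python
from statistics import mean

# (trait value Y, genotype G) pairs from the slide.
data = [(0.34, "aa"), (1.23, "Aa"), (1.66, "Aa"), (2.74, "AA"), (1.33, "AA")]

def genotype_means(rows):
    """Mean trait value within each genotype class."""
    groups = {}
    for y, g in rows:
        groups.setdefault(g, []).append(y)
    return {g: mean(ys) for g, ys in groups.items()}

means = genotype_means(data)
m = (means["AA"] + means["aa"]) / 2   # midpoint of the homozygotes (assumed intercept)
a = (means["AA"] - means["aa"]) / 2   # additive effect
d = means["Aa"] - m                   # dominance deviation of the heterozygote

print(f"m = {m:.4f}, a = {a:.4f}, d = {d:.4f}")
```

Five observations are far too few for inference; the point is only how the coding maps onto genotype means.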
10
Some web resources
  • BGIM
  • http://statgen.iop.kcl.ac.uk/bgim/
  • Introductory tutorials on twin analysis, primer
    on maximum likelihood, Mx language.
  • GxE moderator models
  • http://statgen.iop.kcl.ac.uk/gxe/
  • Power calculation
  • http://statgen.iop.kcl.ac.uk/gpc/
  • Case/control association tools
  • http://statgen.iop.kcl.ac.uk/gpc/model/

12
Relative risk
  • P(D|AA) / P(D|aa), labelled RR(AA)
  • P(D|Aa) / P(D|aa), labelled RR(Aa)
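To see how the genotype relative risks shape the data, the sketch below derives the expected genotype frequencies among cases. It assumes Hardy-Weinberg proportions in the population and that p is the frequency of allele A. By Bayes' rule, P(G | D) is proportional to P(G) x RR(G). The parameter values and function names are illustrative only.

```python
def genotype_freqs_hwe(p):
    """Population genotype frequencies under HWE; p is the frequency of A."""
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

def case_genotype_freqs(p, rr_het, rr_hom):
    """Genotype frequencies in cases, given RR(Aa) and RR(AA) relative to aa."""
    pop = genotype_freqs_hwe(p)
    weights = {
        "AA": pop["AA"] * rr_hom,
        "Aa": pop["Aa"] * rr_het,
        "aa": pop["aa"] * 1.0,  # aa is the reference genotype
    }
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

# Hypothetical multiplicative model: RR(AA) = RR(Aa) ** 2.
cases = case_genotype_freqs(p=0.2, rr_het=2.0, rr_hom=4.0)
for genotype, freq in cases.items():
    print(genotype, round(freq, 4))
```

Setting rr_hom equal to rr_het gives a dominant model for A, and rr_het = 1 gives a recessive one.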
13
Genetic models
14
Tests
15
Multiple samples
  • Constrain frequencies across samples
  • Constrain effects across samples
  • Can test genetic models with effects and/or
    frequencies constrained to be equal
  • Can perform tests of homogeneity of effects
    and/or frequencies across samples

16
An example: 2 case/control samples
  • Population frequency 5

18
  • Homogeneous effects across samples
  • Homogeneous allele frequencies across samples
  • Model  Sample  p      RR(Aa)  RR(AA)  -2LL
  • Gen    1       0.367  1.979   3.663
  • Gen    2       0.367  1.979   3.663   793.143
  • Mult   1       0.367  1.911   3.651
  • Mult   2       0.367  1.911   3.651   793.199
  • Dom    1       0.401  1.990   1.990
  • Dom    2       0.401  1.990   1.990   802.927
  • Rec    1       0.405  1.000   1.921
  • Rec    2       0.405  1.000   1.921   805.064
  • None   1       0.442  1.000   1.000
  • None   2       0.442  1.000   1.000   815.628

19
  • Heterogeneous effects across samples
  • Homogeneous allele frequencies across samples
  • Model  Sample  p      RR(Aa)  RR(AA)  -2LL
  • Gen    1       0.367  1.235   2.136
  • Gen    2       0.367  2.890   5.547   786.498
  • Mult   1       0.367  1.440   2.073
  • Mult   2       0.367  2.282   5.208   788.262
  • Dom    1       0.401  1.216   1.216
  • Dom    2       0.401  2.936   2.936   796.422
  • Rec    1       0.405  1.000   1.519
  • Rec    2       0.405  1.000   2.195   803.849
  • None   1       0.443  1.000   1.000
  • None   2       0.443  1.000   1.000   815.628

20
  • TESTS OF GENETIC MODELS -- ASSUMING EQ EFFECTS, EQ FREQS
  • Gen vs None (2 df) 22.485, p = 0.000
  • Mult vs None (1 df) 22.429, p = 0.000
  • Dom vs None (1 df) 12.701, p = 0.000
  • Rec vs None (1 df) 10.564, p = 0.001
  • Gen vs Mult (1 df) 0.056, p = 0.813
  • Gen vs Dom (1 df) 9.784, p = 0.002
  • Gen vs Rec (1 df) 11.921, p = 0.001

  • TESTS OF GENETIC MODELS -- ASSUMING UNEQ EFFECTS, EQ FREQS
  • Gen vs None (4 df) 29.130, p = 0.000
  • Mult vs None (2 df) 27.366, p = 0.000
  • Dom vs None (2 df) 19.205, p = 0.000
  • Rec vs None (2 df) 11.779, p = 0.003
  • Gen vs Mult (2 df) 1.764, p = 0.414
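The tests above are likelihood-ratio tests: the statistic is the difference in -2LL between two nested models, referred to a chi-square distribution with df equal to the number of constrained parameters. The sketch below recomputes the equal-effects tests from the -2LL values reported for the homogeneous models. The helper names are hypothetical, and the tail probability is written out only for the df that occur here.

```python
import math

def chi2_sf(x, df):
    """Upper-tail chi-square probability (closed forms for df = 1, 2, 4)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    if df == 4:
        return math.exp(-x / 2) * (1 + x / 2)
    raise ValueError("this sketch only handles df = 1, 2 or 4")

def lrt(minus2ll_null, minus2ll_alt, df):
    """Likelihood-ratio statistic and p-value for two nested models."""
    stat = minus2ll_null - minus2ll_alt
    return stat, chi2_sf(stat, df)

# -2LL under homogeneous effects and frequencies (from the slides).
minus2ll = {"Gen": 793.143, "Mult": 793.199, "Dom": 802.927,
            "Rec": 805.064, "None": 815.628}
# Number of free relative-risk parameters in each model.
n_free = {"Gen": 2, "Mult": 1, "Dom": 1, "Rec": 1, "None": 0}

comparisons = [("Gen", "None"), ("Mult", "None"), ("Dom", "None"),
               ("Rec", "None"), ("Gen", "Mult"), ("Gen", "Dom"), ("Gen", "Rec")]
results = {}
for alt, null in comparisons:
    df = n_free[alt] - n_free[null]
    results[(alt, null)] = lrt(minus2ll[null], minus2ll[alt], df)
    stat, p = results[(alt, null)]
    print(f"{alt} vs {null} ({df} df) {stat:.3f} p = {p:.3f}")
```

The multiplicative model is not rejected against the general model, whereas the dominant and recessive models are.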

21
Indirect association
Genotyped markers
QTL
Ungenotyped markers
22
Recombination
Homologous chromosomes in one parent
Paternal chromosome
Maternal chromosome
Recombination event during meiosis
Recombinant gamete transmitted, harboring mutation
23
Recombination
Homologous chromosomes in one parent
Paternal chromosome
Maternal chromosome
No recombination event during meiosis
Nonrecombinant gamete transmitted, not harboring
mutation
24
Linkage: affected sib pairs
Paternal chromosome
Maternal chromosome
First affected offspring, no recombination
Second affected offspring, recombinant gamete
IBD sharing from this one parent (0 or 1)
1
0
25
Association analysis
  • Mutation occurs on a red chromosome

26
Association analysis
  • Mutation occurs on a red chromosome

27
Association analysis
  • Association due to linkage disequilibrium

28
Haplotypes
  • A a
  • M AM aM
  • m Am am
  • This individual has aa and Mm genotypes
  • and am and aM haplotypes

[Diagram: the two chromosomes, carrying haplotypes am and aM]
29
Haplotypes
  • A a
  • M AM aM
  • m Am am
  • This individual has Aa and Mm genotypes
  • and AM and am haplotypes
  • but given only genotype data,
  • consistent with Am/aM as well as AM/am

[Diagram: the two chromosomes, carrying haplotypes am and AM]
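The phase ambiguity in the double heterozygote can be made concrete by enumerating every haplotype pair consistent with a set of unphased genotypes. This is a minimal sketch: the function name is hypothetical and genotypes are written as two-character strings, one per locus.

```python
from itertools import product

def compatible_haplotype_pairs(genotypes):
    """All unordered haplotype pairs consistent with unphased genotypes.

    genotypes: list of two-character strings, e.g. ["Aa", "Mm"].
    """
    pairs = set()
    # At each locus, choose which of the two alleles sits on chromosome 1;
    # the other allele then sits on chromosome 2.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "".join(g[c] for g, c in zip(genotypes, choice))
        hap2 = "".join(g[1 - c] for g, c in zip(genotypes, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)

for genotypes in (["aa", "Mm"], ["Aa", "Mm"], ["AA", "Mm"]):
    print(genotypes, compatible_haplotype_pairs(genotypes))
```

Only the individual heterozygous at both loci has more than one compatible pair, which is why haplotypes generally have to be estimated from genotype data.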
30
Haplotypes
  • A a
  • M AM aM
  • m Am am
  • This individual has AA and Mm genotypes
  • and AM and Am haplotypes

[Diagram: the two chromosomes, carrying haplotypes Am and AM]
31
Equilibrium haplotype frequencies
  • A a
  • M pr ps p
  • m qr qs q
  • r s

32
Linkage disequilibrium
  • A a
  • M pr + D ps - D p
  • m qr - D qs + D q
  • r s
  • DMAX = min(ps, qr) if D > 0, min(pr, qs) if D < 0
  • D' = D / DMAX
  • r^2 = D^2 / (pqrs)
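A short sketch of these measures computed from the four haplotype frequencies. It follows the table layout (p, q are the M and m frequencies; r, s the A and a frequencies). DMAX is taken as the largest |D| that keeps every cell non-negative: min(ps, qr) for positive D and min(pr, qs) for negative D. The example frequencies are invented.

```python
def ld_measures(f_AM, f_aM, f_Am, f_am):
    """Return (D, D', r^2) from the four haplotype frequencies."""
    p = f_AM + f_aM   # frequency of M
    q = f_Am + f_am   # frequency of m
    r = f_AM + f_Am   # frequency of A
    s = f_aM + f_am   # frequency of a
    D = f_AM - p * r  # excess of the AM haplotype over its equilibrium value
    if D >= 0:
        d_max = min(p * s, q * r)
    else:
        d_max = min(p * r, q * s)
    d_prime = D / d_max if d_max > 0 else 0.0
    r2 = D ** 2 / (p * q * r * s)
    return D, d_prime, r2

# Hypothetical haplotype frequencies (they sum to 1).
D, d_prime, r2 = ld_measures(f_AM=0.5, f_aM=0.1, f_Am=0.1, f_am=0.3)
print(f"D = {D:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

At equilibrium (for example frequencies 0.36, 0.24, 0.24, 0.16) all three measures are zero.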

33
Haplotype analysis
  • Estimate haplotypes from genotypes
  • Associate haplotypes with trait
  • Haplotype Freq. Odds Ratio
  • AAGG 40 1.00 (baseline, fixed to 1.00)
  • AAGT 30 2.21
  • CGCG 25 1.07
  • AGCT 5 0.92

35
Linkage Association
[Plot: trait value against QTL genotype (aa, Aa, AA)]
36
Variance Components
  • Means
  • M1 M2
  • Variance-covariance matrix
  • V1 C21
  • C12 V2

ASSOCIATION
LINKAGE
37
Variance Components
  • Means
  • M1 + bG1   M2 + bG2
  • Variance-covariance matrix
  • V1   C21 + q(π - ½)
  • C12 + q(π - ½)   V2

ASSOCIATION: b = regression coef., G = individual's genotype
LINKAGE: q = regression coef., π = IBD sharing (0, ½, 1)
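The sketch below assembles the expected mean vector and covariance matrix for one sib pair under this combined model, to show where association (b, acting on the means) and linkage (q, acting on the sib covariance) enter. All numeric values are hypothetical and the function name is an assumption.

```python
def sib_pair_moments(means, b, genotypes, variances, cov, q, pi_hat):
    """Expected means and 2 x 2 covariance matrix for one sib pair.

    means, variances: (sib 1, sib 2) baseline values
    b: association regression coefficient on genotype score G
    cov: baseline sib covariance; q: linkage regression coefficient
    pi_hat: proportion of alleles shared IBD at the locus (0, 0.5 or 1)
    """
    m1, m2 = means
    g1, g2 = genotypes
    v1, v2 = variances
    expected_means = [m1 + b * g1, m2 + b * g2]
    c12 = cov + q * (pi_hat - 0.5)
    return expected_means, [[v1, c12], [c12, v2]]

# Same pair of genotypes, three possible IBD states.
for pi_hat in (0.0, 0.5, 1.0):
    mu, sigma = sib_pair_moments((0.0, 0.0), 0.3, (1, 0), (1.0, 1.0), 0.4, 0.2, pi_hat)
    print(pi_hat, mu, sigma)
```

Genotypes shift only the means, and IBD sharing shifts only the covariance, so the two signals can be modelled in the same likelihood.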
38
Components of a Genetic Theory
  • POPULATION MODEL
  • Allele and genotype frequencies
  • Demographics and population history
  • Linkage disequilibrium, haplotype structure
  • TRANSMISSION MODEL
  • Mendelian segregation
  • Identity by descent and genetic relatedness
  • PHENOTYPE MODEL
  • Biometrical model of quantitative traits
  • Additive and dominance components

39
Linkage without association
[Pedigree diagrams, marker genotypes: 3/5, 2/6, 3/5, 2/6, 3/2, 3/6, 5/2, 5/6]
Both families are linked with the marker, but
a different allele is involved in each.
40
Linkage and association
[Pedigree diagrams, marker genotypes: 3/6, 2/4, 3/5, 2/6, 4/6, 2/6, 3/2, 3/6, 6/2, 5/6, 6/6, 6/6]
All families are linked with the marker, and
allele 6 is associated with disease.
Linkage is just association within families.
41
Association without linkage
[Diagram: cases and controls with marker genotypes 6/6, 6/2, 3/5, 3/4, 3/6, 5/6, 2/4, 3/2, 3/6, 2/2, 4/6, 2/6, 2/5, 5/2]
Allele 6 is more common in the GREEN population.
The disease is more common in the GREEN population.
Result: a spurious association.
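A numerical sketch of how stratification alone can produce an association. The populations, frequencies and prevalences below are invented (the second population is simply called OTHER). Within each population, carrying allele 6 is independent of disease, yet pooling the two yields an odds ratio well above 1.

```python
def population_counts(n, carrier_freq, prevalence):
    """Expected counts when carrier status and disease are independent.

    Returns (case carriers, case non-carriers,
             control carriers, control non-carriers).
    """
    carriers = n * carrier_freq
    non_carriers = n - carriers
    return (carriers * prevalence, non_carriers * prevalence,
            carriers * (1 - prevalence), non_carriers * (1 - prevalence))

def odds_ratio(case_carr, case_non, ctrl_carr, ctrl_non):
    """Cross-product odds ratio for carrier status versus disease."""
    return (case_carr * ctrl_non) / (case_non * ctrl_carr)

# Allele 6 and the disease are both more common in GREEN.
green = population_counts(1000, carrier_freq=0.8, prevalence=0.20)
other = population_counts(1000, carrier_freq=0.2, prevalence=0.05)
pooled = tuple(g + o for g, o in zip(green, other))

or_green = odds_ratio(*green)
or_other = odds_ratio(*other)
or_pooled = odds_ratio(*pooled)
print(f"GREEN OR = {or_green:.2f}, OTHER OR = {or_other:.2f}, pooled OR = {or_pooled:.2f}")
```

This is the motivation for family-based tests, which compare relatives drawn from the same population.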
42
TDT
  • Transmission disequilibrium test
  • test for linkage and association

[Pedigree diagrams, genotypes: aa, Aa, AA, AA, AA, Aa, Aa, AA, Aa, Aa, Aa, AA]
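In its basic form the TDT counts, among heterozygous (Aa) parents of affected offspring, how often A versus a is transmitted and compares the two counts with a McNemar-type statistic, (b - c)^2 / (b + c), on 1 df. The sketch below uses made-up transmission counts, and the function name is hypothetical.

```python
import math

def tdt(transmitted_A, transmitted_a):
    """TDT statistic and p-value from transmissions by Aa parents."""
    b, c = transmitted_A, transmitted_a
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a 1-df chi-square distribution.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical data: 100 heterozygous parents, A transmitted 60 times.
stat, p_value = tdt(60, 40)
print(f"TDT chi2 = {stat:.2f}, p = {p_value:.4f}")
```

Homozygous parents contribute nothing, because their transmissions are uninformative.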
43
TDT: A = disease allele
  • Parental mating: AA x Aa, AA x Aa, aa x Aa, aa x Aa
  • Offspring genotype: AA, Aa, Aa, aa
  • -
    -
  • 0.5 0.5
    -
  • -
    0.5 0.5

Additive
Dominant
Recessive
44
Between and within components
Sib1
Sib2
45
Between and within components
  • Fulker et al (1999)

Note: W = S1 - B
46
Parental genotypes
  • Use parental genotypes to generate B
  • Examples
  • AA from AA x AA: W = 0
  • Aa from AA x Aa: W = -0.5
  • Aa from Aa x Aa: W = 0
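A minimal sketch of the between/within decomposition. It assumes genotype scores AA = 1, Aa = 0, aa = -1. It also assumes B is the mean of the two sib scores when parents are not typed, and the mean of the two parental scores when they are. That reading reproduces the three examples above but is not spelled out on the slides. Function names are hypothetical.

```python
SCORE = {"AA": 1, "Aa": 0, "aa": -1}

def between_within_from_sibs(g1, g2):
    """B and the two W components from a pair of sib genotype scores."""
    b = (g1 + g2) / 2
    return b, g1 - b, g2 - b

def between_within_from_parents(offspring, parent1, parent2):
    """B from parental genotypes; W is the offspring's deviation from B."""
    b = (SCORE[parent1] + SCORE[parent2]) / 2
    return b, SCORE[offspring] - b

# The three examples from the slide.
for child, p1, p2 in (("AA", "AA", "AA"), ("Aa", "AA", "Aa"), ("Aa", "Aa", "Aa")):
    b, w = between_within_from_parents(child, p1, p2)
    print(f"{child} from {p1} x {p2}: B = {b}, W = {w}")

# A sib pair with scores -1 and 0.
print(between_within_from_sibs(-1, 0))
```

W is zero whenever the parents (or the sib pair) are uninformative, so only within-family contrasts contribute to the robust test.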

47
assoc.mx
  • Sibling pair sample
  • B and W components precalculated in input file
  • Single SNP genotype
  • Quantitative trait

48
assoc.dat
  • s1 s2 g1 g2 b w1 w2
  • -0.007 -0.972 -1 0 -0.5 -0.5 0.5
  • -0.829 -0.196 1 1 1 0 0
  • 0.369 0.645 1 1 1 0 0
  • 0.318 1.55 0 1 0.5 -0.5 0.5
  • 1.52 0.910 0 0 0 0 0
  • -0.948 -1.55 1 1 1 0 0
  • 0.596 -0.394 1 0 0.5 0.5 -0.5
  • -1.91 -0.905 0 1 0.5 -0.5 0.5
  • 0.499 0.940 1 0 0.5 0.5 -0.5
  • -1.17 -1.29 1 0 0.5 0.5 -0.5
  • -0.16 -1.81 1 1 1 0 0

49
  • ! Mx script for QTL association: sib pairs, univariate
  • Group 1
  • Calc NG=2
  • Begin Matrices
  • ! Parameters
  • B Full 1 1 free ! association: between component
  • W Full 1 1 free ! association: within component
  • M Full 1 1 free ! mean
  • S Full 1 1 free ! shared residual variance
  • N Full 1 1 free ! nonshared residual variance
  • ! Definition variables
  • C Full 1 1 ! association between
  • X Full 1 1 ! association within, sib 1
  • Y Full 1 1 ! association within, sib 2
  • End Matrices

50
  • Group 2: Data Group
  • Data NI=7 NO=0
  • RE file=assoc.dat
  • Labels Sib1 Sib2 g1 g2 b w1 w2
  • Select Sib1 Sib2 b w1 w2 /
  • Definition b w1 w2 /
  • Matrices = Group 1
  • Means M + B*C + W*X | M + B*C + W*Y /
  • Covariance
  • S + N | S _
  • S | S + N /
  • Specify C b /
  • Specify X w1 /
  • Specify Y w2 /

51
Models
  • B ≠ W (both free, not equated)
  • B Full 1 1 free
  • W Full 1 1 free
  • !Equate W 1 1 1 B 1 1 1
  • B = W (both free, equated)
  • B Full 1 1 free
  • W Full 1 1 free
  • Equate W 1 1 1 B 1 1 1
  • B (W fixed to 0)
  • B Full 1 1 free
  • W Full 1 1
  • !Equate W 1 1 1 B 1 1 1
  • B = W = 0 (both fixed to 0)
  • B Full 1 1
  • W Full 1 1
  • !Equate W 1 1 1 B 1 1 1

52
Tests
  • Test: HA vs H0
  • Standard association test: HA B = W, H0 B = W = 0
  • Test of stratification: HA B ≠ W, H0 B = W
  • Robust association test: HA B ≠ W, H0 B (W = 0)

53
assoc.mx
  • Model B W -2LL df
  • B ≠ W -0.478 -0.365 2103.96 795
  • B = W -0.420 -0.420 2105.05 796
  • B (W = 0) -0.4778 2127.01 796
  • B = W = 0 2163.34 797

Test of total association: HA (B = W) 2105.05,
H0 (B = W = 0) 2163.34; Δ-2LL = 58.29, df = 1,
p < 1e-14
54
assoc.mx
  • Model B W -2LL df
  • B ≠ W -0.478 -0.365 2103.96 795
  • B = W -0.420 -0.420 2105.05 796
  • B (W = 0) -0.4778 2127.01 796
  • B = W = 0 2163.34 797

Test of stratification: HA (B ≠ W) 2103.96,
H0 (B = W) 2105.05; Δ-2LL = 1.09, df = 1,
p = 0.29
55
assoc.mx
  • Model B W -2LL df
  • B ≠ W -0.478 -0.365 2103.96 795
  • B = W -0.420 -0.420 2105.05 796
  • B (W = 0) -0.4778 2127.01 796
  • B = W = 0 2163.34 797

Test of within association: HA (B ≠ W) 2103.96,
H0 (B, W = 0) 2127.01; Δ-2LL = 23.06,
df = 1, p < 1e-6
56
Implementation
  • QTDT
  • Abecasis et al (2001) AJHG
  • extends between/within model to general pedigrees
  • multiple alleles
  • covariates
  • combined test of linkage and association
  • discrete as well as quantitative traits

57
Linkage vs. association
Linkage:
  • families
  • detectable over large distances (> 10 cM)
  • large effects (OR > 3, variance > 10%)
Association:
  • unrelateds or families
  • detectable over small distances (< 1 cM)
  • small effects (OR < 2, variance < 1%)