Title: Power calculation for QTL association (discrete and quantitative traits)
1Power calculation for QTL association (discrete and quantitative traits)
- Shaun Purcell, Pak Sham
- Advanced Workshop
- Boulder, CO, 2003
2[Diagram]
- Discrete: case-control, TDT
- Threshold: case-control, TDT
- Variance components
3Discrete trait calculation
- $p$: frequency of high-risk allele
- $K$: prevalence of disease
- $R_{AA}$: genotypic relative risk for AA genotype
- $R_{Aa}$: genotypic relative risk for Aa genotype
- $N$, $\alpha$, $\beta$: sample size, Type I and Type II error rates
4Risk is P(D|G)
- $g_{AA} = R_{AA}\, g_{aa}$, $g_{Aa} = R_{Aa}\, g_{aa}$
- $K = p^2 g_{AA} + 2pq\, g_{Aa} + q^2 g_{aa}$
- $g_{aa} = K / (p^2 R_{AA} + 2pq\, R_{Aa} + q^2)$
- Odds ratios (e.g. for AA genotype): $\dfrac{g_{AA} / (1 - g_{AA})}{g_{aa} / (1 - g_{aa})}$
5Need to calculate P(G|D)
- Expected proportion $d$ of genotypes in cases
- $d_{AA} = g_{AA}\, p^2 / (g_{AA}\, p^2 + g_{Aa}\, 2pq + g_{aa}\, q^2)$
- $d_{Aa} = g_{Aa}\, 2pq / (g_{AA}\, p^2 + g_{Aa}\, 2pq + g_{aa}\, q^2)$
- $d_{aa} = g_{aa}\, q^2 / (g_{AA}\, p^2 + g_{Aa}\, 2pq + g_{aa}\, q^2)$
- Expected number of A alleles for cases
- $2 N_{Case}\, (d_{AA} + d_{Aa} / 2)$
- Expected proportion $c$ of genotypes in controls
- $c_{AA} = (1 - g_{AA})\, p^2 / ((1 - g_{AA})\, p^2 + (1 - g_{Aa})\, 2pq + (1 - g_{aa})\, q^2)$
6Full contingency table
-           A allele                               a allele
- Case      $2 N_{Case}\, (d_{AA} + d_{Aa}/2)$      $2 N_{Case}\, (d_{aa} + d_{Aa}/2)$
- Control   $2 N_{Control}\, (c_{AA} + c_{Aa}/2)$   $2 N_{Control}\, (c_{aa} + c_{Aa}/2)$
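The calculation on slides 4 to 6 can be sketched in a few lines of Python. This is a minimal illustration under Hardy-Weinberg equilibrium, not the Genetic Power Calculator's own code; the function name `case_control_allele_table` and its argument names are hypothetical.

```python
def case_control_allele_table(p, K, r_AA, r_Aa, n_case, n_control):
    """Expected A/a allele counts in cases and controls (hypothetical helper)."""
    q = 1.0 - p
    # Genotype frequencies under Hardy-Weinberg equilibrium
    f_AA, f_Aa, f_aa = p * p, 2 * p * q, q * q
    # Penetrances P(D|G): baseline risk g_aa scaled by the genotypic relative risks
    g_aa = K / (f_AA * r_AA + f_Aa * r_Aa + f_aa)
    g_AA, g_Aa = r_AA * g_aa, r_Aa * g_aa
    # Genotype proportions in cases, P(G|D)
    d_AA, d_Aa, d_aa = f_AA * g_AA / K, f_Aa * g_Aa / K, f_aa * g_aa / K
    # Genotype proportions in controls, P(G|not D)
    c_AA = f_AA * (1 - g_AA) / (1 - K)
    c_Aa = f_Aa * (1 - g_Aa) / (1 - K)
    c_aa = f_aa * (1 - g_aa) / (1 - K)
    return {
        "case": (2 * n_case * (d_AA + d_Aa / 2), 2 * n_case * (d_aa + d_Aa / 2)),
        "control": (2 * n_control * (c_AA + c_Aa / 2), 2 * n_control * (c_aa + c_Aa / 2)),
    }


# Example: multiplicative model with K = 0.02 and 100 cases, 100 controls
table = case_control_allele_table(p=0.5, K=0.02, r_AA=4, r_Aa=2, n_case=100, n_control=100)
print(table)
```

Each row sums to twice the number of individuals in that group, because every person contributes two alleles.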
7Threshold selection
- Genotype         AA           Aa           aa
- Frequency        $q^2$        $2pq$        $p^2$
- Trait mean       $-a$         $d$          $a$
- Trait variance   $\sigma^2$   $\sigma^2$   $\sigma^2$
8$P(X) = \sum_G P(X|G)\, P(G)$
[Figure: density P(X) of trait X as a mixture of the AA, Aa and aa genotype distributions]
9$P(G|X<T) = P(X<T|G)\, P(G) / P(X<T)$
N.B. the cumulative standard normal distribution gives the area under the curve, $P(X < T)$
[Figure: trait density P(X) against X with threshold T; curves labelled AA and Aa]
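As a rough sketch of the threshold-selection step, the genotype probabilities among individuals scoring below T follow from Bayes' rule with a normal trait distribution within each genotype. The genotype frequencies and means follow the table on slide 7; the function name `genotype_probs_below_threshold` and the parameter `sigma` (within-genotype standard deviation) are assumptions made for illustration.

```python
from statistics import NormalDist


def genotype_probs_below_threshold(p, a, d, sigma, T):
    """P(G | X < T) for G in AA, Aa, aa (hypothetical helper)."""
    q = 1.0 - p
    # Frequencies and trait means as tabulated on slide 7
    freqs = {"AA": q * q, "Aa": 2 * p * q, "aa": p * p}
    means = {"AA": -a, "Aa": d, "aa": a}
    # P(X < T | G): area under each genotype's normal curve below T
    below = {g: NormalDist(means[g], sigma).cdf(T) for g in freqs}
    # P(X < T) = sum over G of P(X < T | G) P(G)
    p_below = sum(below[g] * freqs[g] for g in freqs)
    return {g: below[g] * freqs[g] / p_below for g in freqs}


print(genotype_probs_below_threshold(p=0.5, a=0.3, d=0.0, sigma=1.0, T=-1.0))
```

Selecting on X > T instead only requires replacing each `cdf(T)` with `1 - cdf(T)`.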
10Incomplete LD
- Effect of incomplete LD between QTL and marker
Note that linkage disequilibrium will depend on both D' and the QTL and marker allele frequencies
11Incomplete LD
- Consider genotypic risks at marker
- $P(D|MM) = [\,(p m_1 + d)^2\, P(D|AA) + 2 (p m_1 + d)(q m_1 - d)\, P(D|Aa) + (q m_1 - d)^2\, P(D|aa)\,] / m_1^2$
- Calculation proceeds as before, but at the marker
Haplo.             Geno.
AM/AM              AAMM
AM/aM or aM/AM     AaMM
aM/aM              aaMM
Marker genotype for all three rows: MM
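The marker-risk formula can be illustrated with a short Python sketch. It assumes positive LD and converts D' to the LD coefficient as d = D' × d_max with d_max = min(p(1 − m1), q·m1); that conversion is the standard definition of D' rather than something stated on the slide, and the function name `marker_risk_MM` is hypothetical.

```python
def marker_risk_MM(p, m1, d_prime, g_AA, g_Aa, g_aa):
    """P(D | MM) at a marker in incomplete LD with the QTL (hypothetical helper)."""
    q = 1.0 - p
    # Assumption: positive LD, so the largest attainable d is min(p(1 - m1), q m1)
    d = d_prime * min(p * (1 - m1), q * m1)
    # Frequencies of the two haplotypes that carry marker allele M
    h_AM = p * m1 + d
    h_aM = q * m1 - d
    # Average the QTL genotype risks over the QTL genotypes found on MM individuals
    return (h_AM ** 2 * g_AA + 2 * h_AM * h_aM * g_Aa + h_aM ** 2 * g_aa) / m1 ** 2


# Penetrances chosen for illustration only
for d_prime in (1.0, 0.8, 0.6, 0.4, 0.2, 0.0):
    print(d_prime, marker_risk_MM(0.05, 0.25, d_prime, g_AA=0.2, g_Aa=0.2, g_aa=0.1))
```

At D' = 0 the marker carries no information and the risk for MM equals the population prevalence; at D' = 1 with equal QTL and marker allele frequencies it equals the risk for AA.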
12Discrete TDT calculation
- Calculate probability of parental mating type given affected offspring
- Calculate probability of offspring genotype given parental mating type and affected
- Calculate overall probability of heterozygous parents transmitting allele A as opposed to a
- Calculate TDT test statistic, power
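The first three steps can be sketched by enumerating parental mating types and transmissions under random mating. This is an illustrative enumeration, not the calculator's implementation; the name `tdt_expectations` is hypothetical. Only relative risks are needed, because the prevalence cancels once we condition on an affected offspring.

```python
from itertools import product


def tdt_expectations(p, r_AA, r_Aa):
    """Return (P(het parent transmits A | affected child), het parents per trio)."""
    q = 1.0 - p
    # Parental genotypes coded as the number of A alleles, Hardy-Weinberg frequencies
    geno_freq = {0: q * q, 1: 2 * p * q, 2: p * p}
    rel_risk = {0: 1.0, 1: r_Aa, 2: r_AA}
    affected = trans_A = trans_a = 0.0
    for father, mother in product(geno_freq, repeat=2):
        mating = geno_freq[father] * geno_freq[mother]
        # t_f, t_m = 1 if that parent transmits A, 0 if it transmits a
        for t_f, t_m in product((0, 1), repeat=2):
            pr_f = father / 2 if t_f else 1 - father / 2
            pr_m = mother / 2 if t_m else 1 - mother / 2
            # Weight proportional to P(mating type, transmissions, offspring affected)
            w = mating * pr_f * pr_m * rel_risk[t_f + t_m]
            affected += w
            for parent, t in ((father, t_f), (mother, t_m)):
                if parent == 1:  # only heterozygous parents are informative
                    if t:
                        trans_A += w
                    else:
                        trans_a += w
    return trans_A / (trans_A + trans_a), (trans_A + trans_a) / affected


print(tdt_expectations(p=0.25, r_AA=4, r_Aa=2))
```

With observed counts b and c of A and a transmissions from heterozygous parents, the TDT statistic is (b − c)² / (b + c); the two quantities returned here give the expected split and the expected number of informative parents.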
13Fulker association model
The genotypic score (1, 0, -1) for sibling i is decomposed into between and within components
- between component: sibship genotypic mean
- within component: deviation from sibship genotypic mean
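A small sketch of the decomposition for a single sibship, with the hypothetical helper name `between_within`:

```python
def between_within(scores):
    """Split sibling genotypic scores into between and within components."""
    # Between component: the sibship genotypic mean, shared by all siblings
    between = sum(scores) / len(scores)
    # Within component: each sibling's deviation from that mean
    within = [s - between for s in scores]
    return between, within


print(between_within([1, 0, 0]))
print(between_within([1]))
```

For a sibship of size one the within component is always zero, so singletons contribute nothing to the within test.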
14NCPs of B and W tests
Approximation for between test
Approximation for within test
Sham et al (2000) AJHG 66
15Practical Exercise
- Calculation of power for simple case-control study.
- DATA: frequency of risk factor in 30 cases and 30 controls
- TEST: 2-by-2 contingency table chi-squared (1 degree of freedom)
16Step 1 determine expected chi-squared
- Hypothetical risk factor frequencies
-                    Case   Control
- A allele present   20     10
- A allele absent    10     20
Chi-squared statistic = 6.666
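The statistic for this table can be checked with the usual Pearson formula for a 2-by-2 table (no continuity correction). The function name `chi_squared_2x2` is a hypothetical helper.

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared for the table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: A allele present / absent; columns: case / control
print(chi_squared_2x2(20, 10, 10, 20))
```

The result is 6.67 to two decimals, which the slide gives truncated as 6.666 and then uses as the non-centrality parameter.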
17Step 2. Determine the critical value for a given type I error rate, α
- inverse central chi-squared distribution
[Figure: chi-squared density P(T) against T, with the critical value marked]
18Step 3. Determine the power for a given critical value and non-centrality parameter
- non-central chi-squared distribution
[Figure: non-central chi-squared density P(T) against T, with the critical value marked]
19Calculating Power
1. Calculate critical value (inverse central χ²)
Alpha
NCP = 0 (under the null)
20- http://workshop.colorado.edu/pshaun/gpc/pdf.html
- df = 1, NCP = 0
- α       X
- 0.05
- 0.01
- 0.001
21Determining power
- df = 1, NCP = 6.666
- α       X         Power
- 0.05 3.84146
- 0.01 6.6349
- 0.001 10.827
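For 1 degree of freedom, both steps can be done with the standard normal distribution alone, since a non-central chi-squared variable with 1 df is the square of a normal variable with mean √NCP and unit variance. The sketch below uses only the Python standard library; the function names `critical_value` and `power` are hypothetical, and the approach applies to df = 1 only.

```python
from math import sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal


def critical_value(alpha):
    """Inverse central chi-squared (1 df) at upper-tail probability alpha."""
    return _Z.inv_cdf(1 - alpha / 2) ** 2


def power(alpha, ncp):
    """P(non-central chi-squared with 1 df and given NCP exceeds the critical value)."""
    z_crit = sqrt(critical_value(alpha))
    shift = sqrt(ncp)
    # (Z + shift)^2 > z_crit^2  <=>  Z > z_crit - shift  or  Z < -z_crit - shift
    return (1 - _Z.cdf(z_crit - shift)) + _Z.cdf(-z_crit - shift)


for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(critical_value(alpha), 5), round(power(alpha, 6.666), 3))
```

Setting the NCP to 0 returns alpha itself, which is a quick sanity check on the construction.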
221. Planning a study
- Candidate gene study
- A disease occurs in 2% of the population
- Assume multiplicative model
- genotype risk ratio Aa = 2
- genotype risk ratio AA = 4
- 100 cases, 100 controls
- What if the risk allele is rare vs common?
232. Interpreting a negative result
- Negative candidate gene TDT study,
- 82 affected offspring trios
- affection scoring > 2 SD above mean
- candidate gene SNP allele frequency 0.25
- Desired 80% power, 5% type I error rate
- What is the minimum detectable QTL variance
(assume additivity)?
24Planning a study
- p N cases (N controls)
- 0.01 1144
- 0.05 247
- 0.2 83
- 0.5 66
- 0.8 126
- 0.95 465
- 0.99 2286
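A sample-size table of this kind can be approximated with an allele-count chi-squared test. The sketch below uses the planning scenario (K = 0.02, risk ratios 2 and 4) and assumes α = 0.05, 80% power and equal numbers of cases and controls; the slide does not state the error rates for this table, so those are assumptions, and `cases_needed` is a hypothetical name. It uses the common normal approximation for the required NCP.

```python
from statistics import NormalDist


def cases_needed(p, K, r_AA, r_Aa, alpha=0.05, power=0.80):
    """Approximate number of cases (= controls) for a 1-df allelic test."""
    q = 1.0 - p
    f_AA, f_Aa, f_aa = p * p, 2 * p * q, q * q
    g_aa = K / (f_AA * r_AA + f_Aa * r_Aa + f_aa)
    g_AA, g_Aa = r_AA * g_aa, r_Aa * g_aa
    # Frequency of allele A among cases and among controls
    p_case = (f_AA * g_AA + 0.5 * f_Aa * g_Aa) / K
    p_ctrl = (f_AA * (1 - g_AA) + 0.5 * f_Aa * (1 - g_Aa)) / (1 - K)
    p_bar = (p_case + p_ctrl) / 2
    # NCP contributed per case when each group supplies 2N alleles
    ncp_per_case = (p_case - p_ctrl) ** 2 / (p_bar * (1 - p_bar))
    # NCP required for the target power (normal approximation, ignores the far tail)
    z = NormalDist()
    ncp_needed = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2
    return ncp_needed / ncp_per_case


for p in (0.01, 0.05, 0.2, 0.5, 0.8, 0.95, 0.99):
    print(p, round(cases_needed(p, K=0.02, r_AA=4, r_Aa=2)))
```

Under these assumptions the sketch appears to land close to the tabulated values for the rows checked by hand (p = 0.01, 0.2 and 0.5), but it is not guaranteed to match the calculator exactly.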
25Interpreting a negative result
- QTL variance   Power
- 0.00 0.05
- 0.01 0.34
- 0.02 0.60
- 0.03 0.78
- 0.04 0.88
- 0.05 0.94
26Exploring power of association using GPC
- Linkage versus association
- difference in required sample sizes for specific QTL size
- TDT versus case-control
- difference in efficiency?
- Quantitative versus binary traits
- loss of power from artificial dichotomisation?
27Linkage versus association
QTL linkage: 500 sib pairs, r = 0.5
QTL association: 1000 individuals
28Case-control versus TDT
$p = 0.1$, $R_{AA} = R_{Aa} = 2$
29Quantitative versus discrete
[Figure: curves for K = 0.5, K = 0.2 and K = 0.05]
To investigate: use threshold-based association
- Fixed QTL effect (additive, 5%, p = 0.5)
- 500 individuals
- For prevalence K: Group 1 has N and T; Group 2 has N and T
30Quantitative versus discrete
- K T (SD)
- 0.01 2.326
- 0.05 1.645
- 0.10 1.282
- 0.20 0.842
- 0.25 0.674
- 0.50 0.000
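The thresholds in this table are the points on a standard normal trait above which a proportion K of the population lies. A one-line check with the Python standard library, using the hypothetical helper name `threshold_for_prevalence`:

```python
from statistics import NormalDist


def threshold_for_prevalence(K):
    """Threshold T (in SD units) such that P(X > T) = K for a standard normal trait."""
    return NormalDist().inv_cdf(1 - K)


for K in (0.01, 0.05, 0.10, 0.20, 0.25, 0.50):
    print(K, round(threshold_for_prevalence(K), 3))
```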
31Quantitative versus discrete
32- Incomplete LD
- what is the impact of D' values less than 1?
- does allele frequency affect the power of the test?
- (using discrete case-control calculator)
- Family-based VC association: between and within tests
- what is the impact of sibship size? sibling correlation?
- (using QTL VC association calculator)
33Incomplete LD
- Case-control for discrete traits
- Disease: K = 0.1
- QTL: $R_{AA} = R_{Aa} = 2$, p = 0.05
- Marker1: m = 0.05, D' = 1, 0.8, 0.6, 0.4, 0.2, 0
- Marker2: m = 0.25, D' = 1, 0.8, 0.6, 0.4, 0.2, 0
- Sample: 250 cases, 250 controls
34Incomplete LD
- Genotypic risk at marker1 (left) and marker2 (right) as a function of D'
35Incomplete LD
- Expected likelihood ratio test as a function of D'
36Family-based association
- Sibship type
- 1200 individuals, 600 pairs, 400 trios, 300 quads
- Sibling correlation
- r = 0.2, 0.5, 0.8
- QTL (diallelic, equal allele frequency)
- 2%, 10% of trait variance
37Family-based association
- NCP proportional to variance explained
- Between test
- ? with ? sibship size and ? sibling correlation
- Within test
- 0 for s = 1, ? with ? sibship size and ? sibling correlation
38Between-sibship association
39Within-sibship association
40Total association
41GPC
- Usual URL for GPC
- http://statgen.iop.kcl.ac.uk/gpc/
- Purcell S, Cherny SS, Sham PC. (2003) Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics, 19(1):149-50