Title: Statistical methods for genetic association studies
1Statistical methods for genetic association studies
http://www.stats.gla.ac.uk/paulj/assoc_study_stats.ppt
2- A tutorial on statistical methods for population association studies
- David Balding
- Nature Reviews Genetics (2006) 7: 781-791
4Recombination
- X/x: unobserved causative mutation
- A/a: distant marker
- B/b: linked marker
5Approaches to finding disease genes
- Population-based association study
  - unrelated subjects
- Family-based association study
  - nuclear families
- Admixture mapping
  - recently admixed population
- Linkage mapping
  - large pedigrees
6Types of population association study
- Candidate causative polymorphism
  - SNP (single nucleotide polymorphism), deletion, duplication
- Candidate causative gene (5-50 marker SNPs)
  - evidence from linkage study or function
- Candidate causative region (100s of marker SNPs)
  - evidence from linkage study
- Genome-wide (>300,000 marker SNPs)
  - no prior evidence required
7Common disease, common variant (CDCV) hypothesis
8Preliminary analysis: data quality
- Assuming mating is random and the population is large, Hardy-Weinberg equilibrium (HWE) genotype frequencies will apply
- Allele frequencies
  - P(X) = p
  - P(x) = q
- HWE genotype frequencies
  - P(XX) = p^2
  - P(Xx) = 2pq
  - P(xx) = q^2
- Useful data quality check
  - chi-squared or exact test
  - log QQ plot
- But discarding SNPs that deviate from HWE can discard causative mutations

        p      q
  p    p^2    pq
  q    pq     q^2
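The chi-squared HWE check above can be sketched as follows. This is a minimal illustration, not the method of any particular package: the function name `hwe_chi_squared` and the genotype counts are hypothetical, and the p-value uses the closed form for a chi-squared distribution with 1 degree of freedom.

```python
import math

def hwe_chi_squared(n_XX, n_Xx, n_xx):
    """Chi-squared test (1 df) of Hardy-Weinberg equilibrium from genotype counts."""
    n = n_XX + n_Xx + n_xx
    p = (2 * n_XX + n_Xx) / (2 * n)  # estimated frequency of allele X
    q = 1.0 - p                      # estimated frequency of allele x
    # Expected genotype counts under HWE: n*p^2, n*2pq, n*q^2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_XX, n_Xx, n_xx)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-squared variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts for one SNP (XX, Xx, xx)
chi2, p_value = hwe_chi_squared(360, 480, 160)
print(chi2, p_value)
```

For small samples or rare alleles an exact test is usually preferred to this approximation.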
9Log QQ plot
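A log QQ plot compares observed p-values with those expected if every test were null. The sketch below only computes the plot coordinates; it assumes the i-th smallest of n null p-values is expected near i/(n+1), and the function name `log_qq_points` and the input p-values are hypothetical.

```python
import math

def log_qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values for a log QQ plot.

    Assumes every p-value is strictly positive.
    """
    ordered = sorted(p_values)
    n = len(ordered)
    points = []
    for i, p in enumerate(ordered, start=1):
        expected_p = i / (n + 1)  # expected i-th smallest uniform p-value
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points

# Hypothetical HWE test p-values, one per SNP
for expected, observed in log_qq_points([0.75, 0.25, 0.5]):
    print(expected, observed)
```

Points lying well above the line y = x indicate SNPs with smaller p-values than expected by chance.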
10Preliminary analysis: dealing with missing data
- Imputation
  - various methods: maximum likelihood, probabilistic, hot-deck, regression modelling
- Test for independence of missingness and case-control status
11Choice of inheritance model
14Tests of association: single SNP
- Case-control
  - Treat genotype as factor with 3 levels, perform 2x3 goodness-of-fit test. Loses power if effect is additive
  - Count alleles rather than individuals, perform 2x2 goodness-of-fit test. Out of favour because
    - sensitive to deviation from HWE
    - risk estimates not interpretable

           Major allele homozygote (0)   Heterozygote (1)   Minor allele homozygote (2)
Case
Control
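The two tests above can be sketched as Pearson chi-squared tests on the 2x3 genotype table and the 2x2 allele table. This is an illustration under assumptions: the function names and the counts are hypothetical, all row and column totals are assumed non-zero, and p-values are only given for 1 or 2 degrees of freedom, where simple closed forms exist.

```python
import math

def chi_squared_table(table):
    """Pearson chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

def chi_squared_p_value(chi2, df):
    """Upper-tail p-value; this sketch supports df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(chi2 / 2.0))
    if df == 2:
        return math.exp(-chi2 / 2.0)
    raise ValueError("only df = 1 or 2 supported in this sketch")

def allele_counts(genotype_counts):
    """Convert (n0, n1, n2) genotype counts to [major, minor] allele counts."""
    n0, n1, n2 = genotype_counts
    return [2 * n0 + n1, n1 + 2 * n2]

# Hypothetical genotype counts, ordered as genotypes 0, 1, 2
cases = [50, 30, 20]
controls = [70, 25, 5]

genotype_chi2, genotype_df = chi_squared_table([cases, controls])
allele_chi2, allele_df = chi_squared_table(
    [allele_counts(cases), allele_counts(controls)]
)
print(genotype_chi2, chi_squared_p_value(genotype_chi2, genotype_df))
print(allele_chi2, chi_squared_p_value(allele_chi2, allele_df))
```

The allele-based version counts each individual twice, which is one way to see why it is sensitive to deviation from HWE.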
15Tests of association: single SNP
- Case-control
  - Cochran-Armitage test
    - loses power if additivity assumption wrong
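A minimal sketch of the Cochran-Armitage trend test follows, assuming genotype scores 0, 1, 2 (an additive model) and the usual 1 degree of freedom chi-squared form of the statistic. The function name and the counts are hypothetical.

```python
import math

def cochran_armitage(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test; returns (chi-squared statistic, p-value)."""
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)        # all individuals
    r_total = sum(cases)   # all cases
    sum_rx = sum(x * r for x, r in zip(scores, cases))
    sum_nx = sum(x * t for x, t in zip(scores, totals))
    sum_nxx = sum(x * x * t for x, t in zip(scores, totals))
    numerator = n * (n * sum_rx - r_total * sum_nx) ** 2
    denominator = r_total * (n - r_total) * (n * sum_nxx - sum_nx ** 2)
    chi2 = numerator / denominator
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-squared, 1 df
    return chi2, p_value

# Hypothetical genotype counts, ordered as genotypes 0, 1, 2
chi2, p_value = cochran_armitage([50, 30, 20], [70, 25, 5])
print(chi2, p_value)
```

Changing `scores` to (0, 1, 1) or (0, 0, 1) would target a dominant or recessive effect instead of an additive one.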
16Tests of association: single SNP
- Case-control
  - Armitage or goodness-of-fit? Depends on
    - Prior knowledge of inheritance (additive, dominant, etc.)
    - Genotype frequencies, e.g. use Armitage test when minor allele is rare, goodness-of-fit test otherwise
17Tests of association: single SNP
- Case-control
  - Logistic regression
    - Easily incorporates inheritance model (additive, dominant, etc.)
    - But assumes phenotype, not genotype, is the outcome variable, so easier to justify for prospective studies
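To illustrate how an inheritance model enters a logistic regression, the sketch below recodes the minor-allele count as a covariate and fits a one-covariate model by Newton-Raphson. Everything here is an assumption for illustration: the function names, the data, and the fixed iteration count. A real analysis would use an established statistics package.

```python
import math

def code_genotype(g, model):
    """Map minor-allele count g (0, 1 or 2) to a covariate under an inheritance model."""
    if model == "additive":
        return float(g)
    if model == "dominant":
        return 1.0 if g >= 1 else 0.0
    if model == "recessive":
        return 1.0 if g == 2 else 0.0
    raise ValueError("unknown model: " + model)

def fit_logistic(x, y, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson; returns (b0, b1).

    Assumes x takes at least two distinct values and the data are not
    perfectly separated.
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu          # score for b0
            g1 += (yi - mu) * xi   # score for b1
            h00 += w               # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: (minor-allele count, case status) pairs
data = ([(0, 1)] * 10 + [(0, 0)] * 30 + [(1, 1)] * 12 + [(1, 0)] * 15
        + [(2, 1)] * 8 + [(2, 0)] * 5)
x = [code_genotype(g, "dominant") for g, _ in data]
y = [status for _, status in data]
b0, b1 = fit_logistic(x, y)
odds_ratio = math.exp(b1)  # odds ratio for carrying at least one minor allele
print(b0, b1, odds_ratio)
```

Switching the second argument of `code_genotype` to "additive" gives a per-allele odds ratio instead.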
18Tests of association: single SNP
- Continuous outcome
  - Linear regression
- Ordered categorical outcomes
  - Multinomial regression
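For a continuous outcome, a simple least-squares fit of the trait on the minor-allele count gives the estimated effect per allele. The sketch assumes an additive model; the function name and the trait values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                  # change in trait per minor allele
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical minor-allele counts and trait values
genotypes = [0, 1, 2, 0, 1, 2]
trait = [1.0, 1.5, 2.0, 1.0, 1.5, 2.0]
intercept, slope = fit_line(genotypes, trait)
print(intercept, slope)
```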
19Problems: population stratification
20Correcting for population stratification
- Genomic control
  - Genotype null SNPs and use to calculate background inflation in test statistic due to population stratification
  - Limited to simple single-SNP analyses
  - Can over- or under-correct
- Other approaches using null SNPs
  - Regression, principal components analysis, model underlying demography
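Genomic control can be sketched as follows, assuming 1 degree of freedom chi-squared statistics (for example from the Armitage test). The inflation factor is estimated as the median null-SNP statistic divided by the median of the chi-squared distribution with 1 df (about 0.455). Flooring the factor at 1 is a common convention adopted here as an assumption; the function name and the statistics are hypothetical.

```python
import statistics

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-squared distribution with 1 df

def genomic_control(null_stats, test_stats):
    """Estimate the inflation factor from null SNPs and deflate test statistics."""
    inflation = statistics.median(null_stats) / CHI2_1DF_MEDIAN
    inflation = max(inflation, 1.0)  # assumed convention: never inflate statistics
    corrected = [s / inflation for s in test_stats]
    return inflation, corrected

# Hypothetical 1-df chi-squared statistics
null_stats = [0.2, 0.9098728, 3.1]  # from null SNPs
test_stats = [10.0, 4.0]            # from candidate SNPs
inflation, corrected = genomic_control(null_stats, test_stats)
print(inflation, corrected)
```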
21Problems: multiple testing
- Bonferroni correction
  - conservative when SNPs are linked
- Permutation
  - computationally demanding
- False discovery rate
- Bayesian approaches
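Two of the corrections above can be sketched briefly. The false discovery rate procedure shown is Benjamini-Hochberg, which is one common choice and an assumption here, since the slide does not name a specific method. The function names and p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject when p <= alpha / m, where m is the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that the k-th smallest p-value <= k * alpha / m
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected

# Hypothetical single-SNP p-values
p_values = [0.001, 0.02, 0.03, 0.5]
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```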
22Tests of association: multiple SNPs
- Advantages
  - Many SNPs may be linked to a gene, but individually may not have a significant effect
  - Interactions between SNPs can be modelled
  - Tag SNPs can reduce testing of redundant linked SNPs
- Methods
  - Linear regression, logistic regression
  - Armitage test
  - Haplotype-based methods
    - Natural interpretation
    - But power reduced due to multiple alleles
23Haplotypes
Nature Genetics 37, 915 - 916 (2005)
29Inferring haplotype phase
- Methods and software
  - PHASE, FASTPHASE
  - EH
  - FBAT
  - HAPLOTYPER
  - EM-DECODER
  - PLEM
  - HAP
  - HAPLORE
  - Haplo.stat
  - SNPEM
  - PEDPHASE
  - SNPHAP
  - TDTHAP
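To show what phase inference involves, here is a minimal expectation-maximisation (EM) sketch for two biallelic SNPs in unrelated individuals. It is a generic textbook-style illustration and is not claimed to be the algorithm of any program listed above. Only double heterozygotes are phase-ambiguous; the function name and the genotype data are hypothetical.

```python
def em_haplotype_frequencies(genotypes, iterations=50):
    """Estimate frequencies of haplotypes [00, 01, 10, 11] at two SNPs by EM.

    genotypes: list of (g1, g2) minor-allele counts (0, 1 or 2) per individual.
    """
    freqs = [0.25] * 4
    n_chromosomes = 2 * len(genotypes)
    for _ in range(iterations):
        counts = [0.0] * 4
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: double heterozygote is either 00/11 or 01/10
                cis = freqs[0] * freqs[3]
                trans = freqs[1] * freqs[2]
                w = cis / (cis + trans) if cis + trans > 0 else 0.5
                counts[0] += w
                counts[3] += w
                counts[1] += 1.0 - w
                counts[2] += 1.0 - w
            else:
                # Phase is unambiguous when at most one SNP is heterozygous
                a = (int(g1 == 2), int(g1 >= 1))  # alleles at SNP 1
                b = (int(g2 == 2), int(g2 >= 1))  # alleles at SNP 2
                for ai, bi in zip(a, b):
                    counts[2 * ai + bi] += 1.0
        # M-step: update frequencies from expected haplotype counts
        freqs = [c / n_chromosomes for c in counts]
    return freqs

# Hypothetical genotypes at two SNPs
genotypes = [(0, 0)] * 3 + [(2, 2)] + [(1, 1)] * 2
freqs = em_haplotype_frequencies(genotypes)
print(freqs)
```

In this toy data set no individual carries an unambiguous 01 or 10 haplotype, so the estimate assigns the double heterozygotes to the 00/11 phase.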
30Inferring haplotype phase
- Phase cases and controls separately or pooled?
  - Separating can give inflated type I error
  - Pooling can reduce power