Title: Stratification
1. Stratification
Lon Cardon, University of Oxford
F\lon\2001\stratification\stratification.ppt
2. Population Stratification
- Consider the trait distribution only
- Mean differences in population substrata
- These differences alone do not influence genetic association under H0
3. Single sample, unequal marker allele frequencies
Allele 1, Allele 2
f(2) > f(1)
4. Single sample, unequal marker allele frequencies
Allele 1, Allele 2
f(1) > f(2)
5. Stratified sample, equal marker allele frequencies
f(1) = f(2)
More allele 1 in the high end, more allele 2 in the low end.
This yields evidence of association.
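A small worked calculation, offered as an illustrative sketch rather than the analysis behind the slide, shows how a stratified sample can produce association even when the pooled allele frequencies are equal. It assumes two equal-sized strata with allele 1 frequencies of 0.52 and 0.48 and stratum trait means of 1.0 and 0.0. These numbers and the function name `allele_trait_means` are hypothetical. The marker has no effect on the trait within either stratum.

```python
def allele_trait_means(strata):
    """Pooled allele 1 frequency and mean trait value per allele.

    strata: list of (weight, freq_allele_1, trait_mean) tuples.
    Assumption: within a stratum the trait mean is the same for
    both alleles, i.e. the marker has no real effect.
    """
    mass_1 = sum(w * p for w, p, _ in strata)          # share of allele 1
    mass_2 = sum(w * (1.0 - p) for w, p, _ in strata)  # share of allele 2
    mean_1 = sum(w * p * m for w, p, m in strata) / mass_1
    mean_2 = sum(w * (1.0 - p) * m for w, p, m in strata) / mass_2
    return mass_1 / (mass_1 + mass_2), mean_1, mean_2


# Hypothetical strata: (weight, allele 1 frequency, trait mean).
strata = [(0.5, 0.52, 1.0), (0.5, 0.48, 0.0)]
freq_1, mean_1, mean_2 = allele_trait_means(strata)
print(f"pooled f(1) = {freq_1:.2f}")
print(f"mean trait, allele 1 = {mean_1:.2f}; allele 2 = {mean_2:.2f}")
```

Under these assumptions the pooled frequency of allele 1 is 0.5, yet allele 1 carriers have the higher mean trait value purely because allele 1 is more common in the high-mean stratum.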
6. A Simple Model of Stratification
- Consider
  - Two subsamples of equal sample size, but opposite allele frequencies (e.g., sample 1, 52/48; sample 2, 48/52)
  - Within-sample variance of the usual form, Va = Vq(w) + Ve, which is the same for both subsamples
  - Mean effects arising from a true QTL and stratification: m = mq + ms
- Then
  - Total variance in the combined sample has additional between-strata effects due to the QTL and stratification, Vq(b) and Vs, so Vq = Vq(b) + Vq(w) and the total variance is Va = Vq + Vs + Ve
  - Let Vq = Vs = 0.05 and let p1, p2 vary from .1 to .9.
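The split of the QTL variance into within-strata and between-strata parts can be sketched numerically. This is a minimal illustration under stated assumptions, not the calculation used for the slides: two equal-sized strata, an additive QTL allele coded 0/1 with one allele per observation, and opposite allele frequencies (p2 = 1 - p1). The function name `qtl_variance_split` is hypothetical.

```python
def qtl_variance_split(vq_total, p1, p2):
    """Split the total QTL variance Vq into (Vq(w), Vq(b)).

    Assumptions (not from the slides): two equal-sized strata,
    additive allele effect, allele coded 0/1. Vq(w) is the average
    within-stratum QTL variance; Vq(b) is the variance of the
    stratum means caused by the allele frequency difference.
    """
    p_bar = (p1 + p2) / 2.0
    # Squared allele effect implied by the total QTL variance.
    effect_sq = vq_total / (p_bar * (1.0 - p_bar))
    vq_between = effect_sq * (p1 - p2) ** 2 / 4.0
    vq_within = effect_sq * (p1 * (1.0 - p1) + p2 * (1.0 - p2)) / 2.0
    return vq_within, vq_between


VQ = 0.05  # value of Vq used in the model above
for p1 in (0.5, 0.52, 0.7, 0.9):
    p2 = 1.0 - p1  # opposite allele frequencies
    within, between = qtl_variance_split(VQ, p1, p2)
    print(f"p1={p1:.2f} p2={p2:.2f}  Vq(w)={within:.5f}  Vq(b)={between:.5f}")
```

In this sketch Vq(w) + Vq(b) always equals Vq, and the between-strata share grows with the squared frequency difference: negligible at 52/48, but most of Vq at 90/10.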
7. Stratification only
8. QTL effect only
9. QTL and stratification effects
10. Stratification Summary
- Stratification does not only increase Type I error
- It can also mask real effects
- Could see true case/control results but no TDT
- Difficult area of research
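Both the false positive and the masking case can be illustrated with the expected regression slope of trait on allele in a pooled sample. This is a hedged sketch under assumptions that are not in the slides: two equal-sized strata, allele coded 0/1, a true within-stratum allele effect `a`, and stratification shifts of +s and -s in the two stratum means. The function name `pooled_slope` and all numbers are hypothetical.

```python
def pooled_slope(a, p1, p2, s):
    """Expected slope of trait on allele (0/1) in the pooled sample.

    a      : true within-stratum allele effect
    p1, p2 : allele 1 frequency in stratum 1 and stratum 2
    s      : stratum 1 mean is shifted by +s, stratum 2 by -s
    Assumes two equal-sized strata (illustrative model only).
    """
    p_bar = (p1 + p2) / 2.0
    var_x = p_bar * (1.0 - p_bar)       # variance of the pooled allele indicator
    cov_strat = (p1 - p2) * s / 2.0     # covariance of allele with stratum shift
    return a + cov_strat / var_x        # true effect plus stratification bias


false_positive = pooled_slope(0.0, 0.7, 0.3, 0.25)  # no true effect
masked = pooled_slope(0.2, 0.3, 0.7, 0.25)          # true effect cancelled
unbiased = pooled_slope(0.2, 0.5, 0.5, 0.25)        # equal frequencies
print(false_positive, masked, unbiased)
```

When allele 1 is more common in the high-mean stratum, the bias adds to the slope and can create an apparent effect where none exists. When it is more common in the low-mean stratum, the bias subtracts and can cancel a real effect. With equal allele frequencies the stratum shift introduces no bias.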
11. Stratification detection/correction using the Genome
Idea: Don't necessarily need to use family-based controls to detect/control for stratification; can use other markers in cases. Pritchard & Rosenberg (1999) Am J Hum Genet.
Procedure: Interested in a candidate marker, C1. Genotype 40 other anonymous (unlinked) markers, M1 .. M40. Calculate the association χ² for M1 .. M40. The test is on the sum of the χ² values. If there is evidence in the background, worry about stratification; else, do not.
Extensions: Use the same idea to gain an estimate of the background inflation factor of the test statistic. Use this factor to correct the candidate gene test. Pritchard et al. (2000) Am J Hum Genet; Devlin & Roeder (1999) Biometrics (Genomic Control); Bacanu, Devlin & Roeder (2000) Am J Hum Genet.
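The procedure and its extension can be sketched in a few lines. This is an illustrative outline only and not necessarily the exact estimators of the cited papers. It assumes a 1-df allele-count χ² per marker (a 2x2 table of cases/controls by allele), an even number of background markers so that the χ² tail probability has a closed form, and a median-based inflation factor floored at 1. All counts are made up and all function names are hypothetical.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549  # approximate median of a 1-df chi-square


def allele_chi2(case_counts, control_counts):
    """1-df chi-square for a 2x2 table of allele counts."""
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def background_test(chi2_values):
    """Sum the background chi-squares; df is the number of markers."""
    total = sum(chi2_values)
    return total, chi2_sf_even_df(total, len(chi2_values))


def inflation_factor(chi2_values):
    """Median-based inflation estimate, never below 1 (an assumption)."""
    return max(1.0, median(chi2_values) / CHI2_1DF_MEDIAN)


# Made-up allele counts (allele 1, allele 2) in cases and controls.
candidate_chi2 = allele_chi2((60, 40), (40, 60))
background = [allele_chi2((55, 45), (45, 55)) for _ in range(40)]

total, p_background = background_test(background)
lam = inflation_factor(background)
corrected = candidate_chi2 / lam
print(f"background sum = {total:.1f}, p = {p_background:.2e}")
print(f"inflation = {lam:.2f}, candidate {candidate_chi2:.2f} -> {corrected:.2f}")
```

In this made-up example the background markers themselves show strong association, so the candidate statistic is scaled down substantially before being interpreted.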
12. How bad is the stratification problem?
13. When is Stratification Tricky?
14. Stratification Detection Using the Genome
- Promising idea to allow large studies of population cohorts
- Appears to detect large stratification differences easily
- Small frequency differences are much more difficult to detect. Can still obtain large (> 2-fold) increases in Type I error rate.
- Unfortunately, these differences may be precisely what we seek in complex traits
- Tough cases: many sub-strata, uninformative markers, effects of linked background markers.
Watch this area: very active.