Title: SNP Selection
1SNP Selection
University of Louisville Center for Genetics and Molecular Medicine
January 10, 2008
Dana Crawford, PhD
Vanderbilt University Center for Human Genetics Research
2Outline of Tutorial
- Concepts of tagSNPs
- LD and haplotype definitions
- Haplotype blocks and definitions
- Tools to identify tagSNPs
3Why Do We Need tagSNPs?
Too Many SNPs to Genotype!
- Whole Genome
  - 15,000,000 SNPs
  - 6,000,000 SNPs with MAF > 5%
4SNP Genotypes Are Correlated (aka linkage disequilibrium)
"the nonindependence of alleles at different sites" (Pritchard and Przeworski 2001)
5Measuring Pair-wise SNP Correlations
- SNP genotype correlation is described by linkage disequilibrium (LD)
- Pair-wise measures of LD: D' and r^2
  - $D = p_{AB} - p_A p_B$
  - $D' = D / D_{\max}$ (related to recombination)
  - $r^2 = D^2 / [f(A_1) f(A_2) f(B_1) f(B_2)]$ (related to power)
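The pair-wise measures above can be computed directly from haplotype and allele frequencies. The following is a minimal sketch, not code from the tutorial: the function name `ld_stats` and the example frequencies are hypothetical, the two alleles at each SNP are assumed to have frequencies p and 1 - p, and D' is reported as |D|/Dmax, which is the common convention.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A (SNP 1) and allele B (SNP 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D_max is the largest |D| possible given the allele frequencies;
    # which bound applies depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # Denominator of r^2: product of the four allele frequencies.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical example: allele A at 20%, allele B at 50%, AB haplotype at 20%.
d, d_prime, r2 = ld_stats(0.2, 0.2, 0.5)
print(d, d_prime, r2)
```

In this example D' is 1 while r^2 is only about 0.25, which illustrates why the two statistics answer different questions.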
6LD Statistics: Practical Uses
- The sample size needed to keep power constant is inversely related to r^2: it scales as 1/r^2
  - r^2 = 1.0: 1,000 cases and 1,000 controls
  - r^2 = 0.80: 1,250 cases and 1,250 controls
- D' is related to recombination history
  - D' = 1: no recombination
  - D' < 1: historical recombination
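The 1/r^2 scaling on this slide can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `required_sample_size` is hypothetical and the calculation covers only the scaling rule stated above, not a full power analysis.

```python
def required_sample_size(n_direct, r2):
    """Samples needed when genotyping a tagSNP instead of the variant itself.

    n_direct: sample size that would suffice if the variant were genotyped directly
    r2: pair-wise r^2 between the tagSNP and the variant (0 < r2 <= 1)
    """
    if not 0 < r2 <= 1:
        raise ValueError("r2 must be in (0, 1]")
    return n_direct / r2


# Slide example: 1,000 cases at r^2 = 1.0 versus r^2 = 0.80.
for r2 in (1.0, 0.8):
    print(r2, round(required_sample_size(1000, r2)))
```

With r^2 = 0.80 the 1,000 cases grow to 1,250, matching the figures on the slide; the same factor applies to controls.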
7Where to Find Population LD Statistics
For your gene or region of interest, search:
- HapMap: www.hapmap.org
- Perlegen: genome.perlegen.com
- SeattleSNPs PGA: pga.gs.washington.edu
- NIEHS SNPs: egp.gs.washington.edu
9Visualizing Pair-wise LD
12Where to Find Population LD Statistics
For your gene or region of interest, search:
- HapMap: www.hapmap.org
- Perlegen: genome.perlegen.com
- SeattleSNPs PGA: pga.gs.washington.edu
- NIEHS SNPs: egp.gs.washington.edu
Genome Variation Server
13Visualizing Pair-wise LD
22Multi-SNP Genotype Correlations (aka Haplotypes)
"a unique combination of genetic markers present in a chromosome" (p. 57 in Hartl and Clark, 1997)
23Constructing Haplotypes
24Constructing Haplotypes
Examples of Haplotype Inference Software
- EM Algorithm
- Haploview: http://www.broad.mit.edu/mpg/haploview/index.php
- Arlequin: http://lgb.unige.ch/arlequin/
- PHASE v2.1: http://www.stat.washington.edu/stephens/software.html
- HAPLOTYPER: http://www.people.fas.harvard.edu/junliu/Haplo/docMain.htm
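To illustrate what an EM-based haplotype inference does, here is a minimal sketch for the simplest case of two biallelic SNPs. It is not the algorithm of any package listed above: the function name `em_two_snp_haplotypes`, the 0/1/2 genotype coding, and the fixed iteration count are assumptions made for illustration. Only double heterozygotes are phase-ambiguous, so the E-step splits them between the two possible haplotype pairs in proportion to the current frequency estimates.

```python
from collections import Counter


def em_two_snp_haplotypes(genotypes, n_iter=100):
    """Estimate haplotype frequencies for two biallelic SNPs by EM.

    genotypes: list of (g1, g2); each value is the number of copies (0, 1, 2)
               of the allele coded 1 at that SNP.
    Returns a dict {(allele_snp1, allele_snp2): frequency}.
    """
    if not genotypes:
        raise ValueError("need at least one genotype")
    alleles = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    haps = [(0, 0), (0, 1), (1, 0), (1, 1)]
    base = Counter()      # haplotype counts from unambiguous individuals
    n_double_het = 0      # individuals heterozygous at both SNPs
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1
        else:
            a, b = alleles[g1], alleles[g2]
            base[(a[0], b[0])] += 1
            base[(a[1], b[1])] += 1
    total = 2 * len(genotypes)
    freq = {h: 0.25 for h in haps}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11
        # rather than 01/10, given the current frequencies.
        cis = freq[(0, 0)] * freq[(1, 1)]
        trans = freq[(0, 1)] * freq[(1, 0)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        counts = {h: float(base[h]) for h in haps}
        counts[(0, 0)] += n_double_het * w
        counts[(1, 1)] += n_double_het * w
        counts[(0, 1)] += n_double_het * (1 - w)
        counts[(1, 0)] += n_double_het * (1 - w)
        # M-step: re-estimate frequencies from the expected counts.
        freq = {h: counts[h] / total for h in haps}
    return freq


# Hypothetical sample: 4 + 4 double homozygotes and 2 double heterozygotes.
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
print(em_two_snp_haplotypes(sample))
```

Real packages handle many SNPs, missing data, and convergence checks; this sketch only shows why the output of such software is an estimate rather than an observation.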
25Haplotypes in NIEHS SNPs
- >625 genes re-sequenced
  - Cell cycle, DNA repair/replication, apoptosis
- 2 DNA panels
  - Panel 1: Polymorphism Discovery Resource (PDR90)
  - Panel 2: Europeans, Africans, Hispanics, and Asians
- PHASE v2.0 results posted on website
- Interactive tool (VH1) to visualize and sort haplotypes
http://egp.gs.washington.edu
38Using LD and Haplotypes to Pick tagSNPs
- The sample size needed to keep power constant is inversely related to r^2: it scales as 1/r^2
  - r^2 = 1.0: 1,000 cases and 1,000 controls
  - r^2 = 0.80: 1,250 cases and 1,250 controls
  - Example: Tagger and LDSelect
- D' is related to recombination history
  - D' = 1: no recombination
  - D' < 1: historical recombination
  - Example: Haplotype blocks
39Using LD and Haplotypes to Pick tagSNPs
- The sample size needed to keep power constant is inversely related to r^2: it scales as 1/r^2
  - r^2 = 1.0: 1,000 cases and 1,000 controls
  - r^2 = 0.80: 1,250 cases and 1,250 controls
  - Example: Tagger and LDSelect
40LDSelect: Using LD to Pick tagSNPs
- LDSelect
  - Uses SNP discovery data (not haplotypes)
  - Finds all correlated SNP genotypes to minimize the total number of SNPs to genotype
  - Maintains genetic diversity of locus
Carlson et al. AJHG (2004)
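The idea of grouping correlated SNPs so that one tagSNP stands in for each group can be sketched as a greedy binning over pair-wise r^2 values. This is a simplified illustration in the spirit of the approach described above, not the published LDSelect implementation: the function name `greedy_ld_bins`, the input format, the 0.8 threshold default, and the SNP names are all hypothetical, and it keeps a single tag per bin rather than listing every SNP that could serve as a tag.

```python
def greedy_ld_bins(snps, r2, threshold=0.8):
    """Greedily group SNPs into bins and report one tagSNP per bin.

    snps: list of SNP identifiers
    r2: dict mapping frozenset({snp_i, snp_j}) -> pair-wise r^2
        (missing pairs are treated as r^2 = 0)
    Returns a list of (tag_snp, bin_members) tuples.
    """
    def linked(a, b):
        return a == b or r2.get(frozenset((a, b)), 0.0) >= threshold

    remaining = list(snps)
    bins = []
    while remaining:
        # Pick the SNP that exceeds the threshold with the most untagged SNPs.
        best = max(remaining, key=lambda s: sum(linked(s, t) for t in remaining))
        members = [t for t in remaining if linked(best, t)]
        bins.append((best, members))
        remaining = [t for t in remaining if t not in members]
    return bins


# Hypothetical locus: snp1-snp3 are strongly correlated, snp4 and snp5 are not.
pairwise = {
    frozenset(("snp1", "snp2")): 0.90,
    frozenset(("snp1", "snp3")): 0.85,
    frozenset(("snp2", "snp3")): 0.95,
    frozenset(("snp4", "snp5")): 0.30,
}
for tag, members in greedy_ld_bins(["snp1", "snp2", "snp3", "snp4", "snp5"], pairwise):
    print(tag, members)
```

In this example five SNPs reduce to three tagSNPs: one for the correlated bin and one each for the two SNPs that no other SNP captures, so every SNP at the locus is still represented.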
41TagSNPs Are Population Specific
European-descent (BLM)
African-descent (BLM)
42SNP Selection: tagSNP Data
BLM
43Side Note: Categorizing tagSNPs
- SNP context
  - Nonrepetitive > repetitive
- Location of SNP
  - Coding > noncoding
- Function
  - Nonsynonymous > synonymous
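When several SNPs in the same LD bin could serve as the tag, the preferences above can be applied as a sort order. The sketch below is one possible reading, not a procedure given in the tutorial: it assumes the three criteria are applied in the order listed (context, then location, then function), and the function names, dictionary keys, and SNP names are hypothetical.

```python
def tag_priority(snp):
    """Sort key: lower tuples are preferred (False sorts before True)."""
    return (
        snp["repetitive"],          # nonrepetitive > repetitive
        not snp["coding"],          # coding > noncoding
        not snp["nonsynonymous"],   # nonsynonymous > synonymous
    )


def rank_candidates(candidates):
    """Return candidate tagSNPs ordered from most to least preferred."""
    return sorted(candidates, key=tag_priority)


# Hypothetical candidates from a single LD bin.
candidates = [
    {"name": "snpA", "repetitive": True, "coding": True, "nonsynonymous": True},
    {"name": "snpB", "repetitive": False, "coding": False, "nonsynonymous": False},
    {"name": "snpC", "repetitive": False, "coding": True, "nonsynonymous": False},
    {"name": "snpD", "repetitive": False, "coding": True, "nonsynonymous": True},
]
print([c["name"] for c in rank_candidates(candidates)])
```

Because all SNPs in a bin are close proxies for one another, this ranking changes which SNP is genotyped without changing how much variation is captured.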
44Categorizing tagSNPs
LPO
46Haplotypes in Genetic Association Studies
Two main approaches with haplotypes:
- Haplotypes -> Pick tagSNPs -> Genotype samples
- Pick tagSNPs -> Infer haplotypes -> Test for association
47Haplotype Blocks
Daly et al. Nat. Genet. (2001)
48Block Definitions
Daly et al. Nat. Genet. (2001)
D': Gabriel et al. Science (2002)
49Block Definitions
50Haplotype Blocks and tagSNPs
- Identifying blocks and tagSNPs
  - Manually
    - Visual haplotype
  - Algorithms
    - HapMap and Haploview
51Haplotype Blocks and tagSNPs
LTA: 16 SNPs (MAF > 10%), 6 common haplotypes
53HapMap Data and Haploview
www.hapmap.org
56HapMap Data and Haploview
59http://www.broad.mit.edu/mpg/haploview/
60Import HapMap Data into Haploview
68Note: HapMap does not contain complete variation data
69Variation data, LD, and tagSNPs for ANAPC10 in European-Americans
HapMap: 5 tagSNPs
70tagSNPs and Genome Variation Server
72Note: Tagger is essentially the same as LDSelect
74Haplotypes, TagSNPs, and Caveats
- Haplotypes are inferred
- Block-like structure assumed for some software
- Different block definitions
- Block boundaries sensitive to marker density
- Genotype savings may not be great (recombination)
tagSNPs based on LD are more popular than haplotype-tagging SNPs (htSNPs)
75SNP Selection Summary
- Resources available for pair-wise LD and haplotypes
- Software for tagSNP selection available
- Be aware of the limitations of the approach you choose
- Be aware that some SNP datasets may not represent all common variation of a gene or gene region
- Be aware that a fraction of tagSNPs do not convert into a successful genotyping assay