Title: SNP selection for candidate gene association studies
1. SNP selection for candidate gene association studies
- Peter Kraft (pkraft_at_hsph.harvard.edu)
- Rulla Tamimi (nhrmt_at_channing.harvard.edu)
2. Background
- Ataxia telangiectasia (AT) is an autosomal recessive disorder
  - Characterized by neurodegeneration, cerebellar ataxia, sensitivity to radiation
  - 100-fold increased risk of developing cancer
- Family studies suggest that women heterozygous for mutations in the ataxia telangiectasia mutated (ATM) gene have an estimated 4-5 fold increased risk of breast cancer
3. Background
- ATM is a tumor suppressor gene
  - Functions in cell cycle arrest, apoptosis and repair of double strand breaks
  - In vitro evidence indicates that cells from AT heterozygotes are intermediate in their sensitivity to X-rays
- >200 disease-causing mutations, mostly truncation mutations
- ATM mutations observed in breast cancer patients are mostly missense mutations
4. Epidemiologic Studies
- Studies examining polymorphic variation in ATM and breast cancer have been inconclusive
  - Increased risk among women with family history and/or early onset
  - Not all studies confirm these findings
- 2 hospital-based studies reported positive associations with missense mutations
- Population-based study provided little support for a role of 20 ATM missense mutations in breast cancer
5. ATM
- Large gene
  - 66 exons, over 180kb
- Two independent groups identified variation in this gene
  - Thorstenson et al., American Journal of Human Genetics, 2001
  - Bonnen et al., American Journal of Human Genetics, 2000
6. Thorstenson et al.
- SNP discovery
  - Used DHPLC on 93 cell lines of diverse ethnicity
  - 62 out of 66 exons, splice sites, 5' and 3' regions
  - 24kb covered
- 88 variants found
  - 53 found 1 time only
  - 18 found 2-3 times
  - 17 found >3 times
- Genotyped same population for these 17 SNPs
  - 10 in complete LD
  - 7 total haplotypes (3 in European Caucasians)
7. Thorstenson et al., 2001
8. Bonnen et al.
- SNP discovery
  - Sequenced 500bp regions evenly distributed throughout the gene in 5 unrelated European Caucasians
  - Primarily noncoding regions
  - 15kb sequence covered
- 17 variants found
- Sequenced population for 14 SNPs
  - 71 African Americans, 77 European White Americans, 73 Hispanics, 39 Asian Americans
  - 22 total haplotypes (only 7 in European Caucasians, 5 at >5%)
9. ATM gene
Bonnen et al., Am J Hum Genet, 2000
10. Bonnen et al., Am J Hum Genet, 2000
11. Objective
- Determine if common variation in ATM is associated with breast cancer
- Because of the size of ATM and the apparently small number of haplotypes
  - Small number of htSNPs
  - A haplotype approach may be a useful method to examine common genetic variation
- Based on the Bonnen paper, 5 htSNPs are required to capture 99% of variation
12. Haplotype Tagging SNPs
- Using BEST
  - http://genomethods.org/best/
  - Uses an exact method to identify the minimum number of tagging SNPs necessary to capture variation
- BEST identified 5 htSNPs necessary to capture all of the haplotypes occurring at >1%
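The idea of an exact minimum tagging set can be illustrated with a small brute-force search. This is only a sketch, not BEST's algorithm: it assumes haplotypes are given as strings of 0/1 alleles and tries subsets of SNP columns of increasing size until every haplotype is still distinguishable. The function name `minimal_tag_sets` and the toy haplotypes are hypothetical and are not ATM data.

```python
from itertools import combinations

def minimal_tag_sets(haplotypes):
    """Return all smallest sets of SNP column indices that still
    distinguish every distinct haplotype (brute-force search)."""
    n_snps = len(haplotypes[0])
    n_distinct = len(set(haplotypes))
    for size in range(n_snps + 1):
        found = []
        for cols in combinations(range(n_snps), size):
            # Project each haplotype onto the chosen columns only.
            projected = {tuple(h[c] for c in cols) for h in haplotypes}
            if len(projected) == n_distinct:
                found.append(cols)
        if found:
            # Stop at the first (smallest) size that works.
            return found
    return []

# Hypothetical toy haplotypes over 4 SNPs (0/1 = the two alleles).
toy_haplotypes = ["0000", "0011", "1100", "1111"]
tag_sets = minimal_tag_sets(toy_haplotypes)
print(tag_sets)
```

In the toy data, SNPs 0 and 1 carry the same information, as do SNPs 2 and 3, so several equally small tagging sets exist. Brute force is exponential in the number of SNPs and is only practical for small panels such as the 14 SNPs discussed here.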
13. If the space below a column is blank, the SNP belongs to the minimum set of htSNPs tagging the haplotypes; if a column is labeled with an X, it can be derived from the selected htSNPs; when a column is labeled by a number, the binary representations of the labeled and the labeling columns are identical, and therefore any solution including one will be equivalent to any solution including the other.
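The "identical binary representation" case in the legend above can be checked mechanically. The following sketch assumes the same 0/1 string coding of haplotypes and groups SNP columns whose allele pattern across haplotypes is identical; the function name `equivalent_columns` and the toy data are hypothetical, and this is not the BEST output format.

```python
from collections import defaultdict

def equivalent_columns(haplotypes):
    """Group SNP column indices whose 0/1 pattern across haplotypes is identical."""
    groups = defaultdict(list)
    for col in range(len(haplotypes[0])):
        # The column's binary representation: one allele per haplotype.
        pattern = "".join(h[col] for h in haplotypes)
        groups[pattern].append(col)
    # Only groups with more than one column are interchangeable SNPs.
    return [cols for cols in groups.values() if len(cols) > 1]

# Hypothetical toy haplotypes (not ATM data).
toy = ["0000", "0011", "1100", "1111"]
groups = equivalent_columns(toy)
print(groups)
```

Any tagging solution that uses one SNP from a group can swap in another SNP from the same group without losing information.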
14. Methods
- Conducted nested case-control study in the NHS
  - 1309 breast cancer cases, 1761 controls
- Genotyped 5 htSNPs using TaqMan
- Used PROC HAPLOTYPE to estimate haplotypes in cases and controls separately
  - Excluded haplotypes at <1%
- Conducted secondary analysis among those with no missing genotype data (1199 cases, 1535 controls)
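Haplotype frequencies have to be estimated because genotyping does not reveal phase. The sketch below shows the general EM idea on the simplest possible case, two biallelic SNPs, where only double heterozygotes are ambiguous. It does not reproduce PROC HAPLOTYPE; the genotype coding (count of allele 1 at each SNP), the function name `em_two_snp`, and the toy genotypes are all assumptions made for illustration.

```python
# Alleles carried on the two chromosomes for a genotype coded 0, 1, or 2.
ALLELES = {0: "00", 1: "01", 2: "11"}

def em_two_snp(genotypes, n_iter=50):
    """Estimate frequencies of haplotypes 00, 01, 10, 11 from unphased
    two-SNP genotypes given as (g1, g2) allele counts."""
    freqs = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
    for _ in range(n_iter):
        counts = dict.fromkeys(freqs, 0.0)
        for g1, g2 in genotypes:
            if g1 == 1 and g2 == 1:
                # E-step: a double heterozygote is either 00/11 or 01/10.
                w_cis = freqs["00"] * freqs["11"]
                w_trans = freqs["01"] * freqs["10"]
                total = w_cis + w_trans
                p_cis = w_cis / total if total > 0 else 0.5
                counts["00"] += p_cis
                counts["11"] += p_cis
                counts["01"] += 1 - p_cis
                counts["10"] += 1 - p_cis
            else:
                # Phase is known when at least one SNP is homozygous.
                a, b = ALLELES[g1], ALLELES[g2]
                counts[a[0] + b[0]] += 1
                counts[a[1] + b[1]] += 1
        # M-step: each person contributes two haplotypes.
        freqs = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
    return freqs

# Hypothetical genotypes: 3 people (0,0), 1 person (2,2), 2 double heterozygotes.
toy_genotypes = [(0, 0)] * 3 + [(2, 2)] + [(1, 1)] * 2
estimated = em_two_snp(toy_genotypes)
print(estimated)
```

With five htSNPs the same logic applies, but the set of compatible haplotype pairs per person is larger, which is why dedicated software is used in practice.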
15. Results
- Six unique haplotypes
- Haplotypes 1, 4, 5 represent >80% of haplotypes
- Global p-value = 0.63
16. PROC HAPLOTYPE vs. HAPPY
17. Additional Results
- Used expectation substitution approach to examine haplotype interactions (HAPPY)
- No interaction between haplotypes and
  - Family history (p=0.51)
  - Menopausal status at diagnosis (p=0.29)
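In general terms, expectation substitution replaces each person's unknown haplotype count with its expected value given the observed genotypes, and that expected count is then used as a covariate in a regression model. The sketch below shows only the expectation step and does not reproduce HAPPY; the function name, the haplotype labels, and the posterior probabilities are hypothetical.

```python
def expected_haplotype_dosage(diplotype_posteriors, haplotype):
    """Expected number of copies (0 to 2) of `haplotype` carried by one person,
    given posterior probabilities over that person's possible haplotype pairs."""
    return sum(
        prob * ((h1 == haplotype) + (h2 == haplotype))
        for (h1, h2), prob in diplotype_posteriors.items()
    )

# Hypothetical posterior probabilities for one person's haplotype pair.
person = {("H1", "H4"): 0.7, ("H1", "H5"): 0.2, ("H4", "H5"): 0.1}
dosages = {h: expected_haplotype_dosage(person, h) for h in ("H1", "H4", "H5")}
print(dosages)
```

An interaction analysis would then include product terms between these expected dosages and a covariate such as family history.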
18. Limitations
- Accuracy of the estimated haplotypes relies on the precision of the 5 htSNPs
  - No one has resequenced ATM entirely
- Efficiency depends on the density of markers used to choose tagging SNPs
  - Bonnen markers: average density of 1 SNP per 10kb
- Does not exclude possibility of rare variants associated with breast cancer
19.
- S49C: Among carriers, 50% had haplotype 2, not significantly different from noncarriers (38%, p=0.6)
- D1853N: haplotype 15 only occurred among carriers
- P1054R: Among carriers, 46% had haplotype 17, significantly more than noncarriers (9%, p=0.003)
Letraro et al, Am J Hum Genet, 2003
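Comparisons like those above contrast the proportion of carriers and noncarriers who have a given haplotype. The slide does not say which test produced its p-values; one common choice for a small 2x2 table is Fisher's exact test, sketched here using only the standard library. The counts in the example are hypothetical and are not taken from the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )

# Hypothetical counts: rows = variant carriers / noncarriers,
# columns = has the haplotype / does not have it.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 4))
```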