Title: HapMap:


1
HapMap application in the design and
interpretation of association studies
Mark J. Daly, PhD, on behalf of The International
HapMap Consortium
2
Goals of this segment
  • Briefly summarize HapMap design and current
    status
  • Discuss the application of HapMap to all aspects
    of association study design, analysis and
    interpretation

3
HapMap Project
A freely available public resource to increase
the power and efficiency of genetic association
studies of medical traits
  • High-density SNP genotyping across the genome
    provides information about
  • SNP validation, frequency, assay conditions
  • correlation structure of alleles in the genome

All data is freely available on the web for
application in study design and analyses as
researchers see fit
4
HapMap Samples
  • 90 Yoruba individuals (30 parent-parent-offspring
    trios) from Ibadan, Nigeria (YRI)
  • 90 individuals (30 trios) of European descent
    from Utah (CEU)
  • 45 Han Chinese individuals from Beijing (CHB)
  • 45 Japanese individuals from Tokyo (JPT)

5
HapMap progress
  • PHASE I completed, described in Nature paper
  • 1,000,000 SNPs successfully typed in all 270
    HapMap samples
  • ENCODE variation reference resource available
  • PHASE II data generation complete, data released
    this past Monday
  • >3,500,000 SNPs typed in total!
6
ENCODE-HAPMAP variation project
  • Ten typical 500kb regions
  • 48 samples sequenced
  • All discovered SNPs (and any others in dbSNP)
    typed in all 270 HapMap samples
  • Current data set: 1 SNP every 279 bp

A much more complete variation resource by
which the genome-wide map can be evaluated
7
Completeness of dbSNP
Vast majority of common SNPs are contained in or
highly correlated with a SNP in dbSNP
8
Recombination hotspots are widespread and account
for LD structure
7q21
9
Utility of LD in association study
  • If I'm a causal variant, what is relevant to my
    detection in association studies is how well
    correlated I am with one of the SNPs or
    haplotypes examined in the study.
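
The correlation referred to here is usually measured as r2 between two SNPs. A minimal sketch of that calculation from phased haplotypes, assuming alleles are coded 0/1. The function name and the toy haplotypes are illustrative and are not HapMap data:

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs, given one 0/1 allele per phased haplotype."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("need two equal-length, non-empty allele lists")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                   # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                                   # frequency of allele 1 at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                   # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0                                         # a monomorphic SNP carries no LD information
    return d * d / denom


# Hypothetical phased haplotypes (8 chromosomes)
causal = [1, 1, 1, 0, 0, 0, 0, 0]
typed_good = [1, 1, 1, 0, 0, 0, 0, 0]   # identical pattern to the causal variant
typed_poor = [1, 0, 1, 0, 1, 0, 0, 1]   # only loosely related pattern

r2_good = r_squared(causal, typed_good)
r2_poor = r_squared(causal, typed_poor)
```

In this toy data the first typed SNP is a perfect proxy for the causal variant, while the second is only weakly correlated with it and would detect it poorly.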

10
Coverage of Phase II HapMap (estimated from
ENCODE data)
Panel      r2 > 0.8   max r2
YRI        81%        0.90
CEU        94%        0.97
CHB+JPT    94%        0.97
From Table 6, A Haplotype Map of the Human
Genome, Nature
11
Coverage of Phase II HapMap (estimated from
ENCODE data)
Panel      r2 > 0.8   max r2
YRI        81%        0.90
CEU        94%        0.97
CHB+JPT    94%        0.97
r2 > 0.8: percentage of deeply ascertained common
variants highly correlated with a HapMap SNP
From Table 6, A Haplotype Map of the Human
Genome, Nature
12
Coverage of Phase II HapMap (estimated from
ENCODE data)
Panel      r2 > 0.8   max r2
YRI        81%        0.90
CEU        94%        0.97
CHB+JPT    94%        0.97
max r2: average maximum correlation between a
deeply ascertained variant and a neighboring
HapMap SNP
From Table 6, A Haplotype Map of the Human
Genome, Nature
13
Coverage of Phase II HapMap (estimated from
ENCODE data)
Panel      r2 > 0.8   max r2
YRI        81%        0.90
CEU        94%        0.97
CHB+JPT    94%        0.97

Vast majority of common variation (MAF > 0.05)
captured by Phase II HapMap
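
The two coverage columns can be computed from pairwise r2 values between each deeply ascertained variant and the map SNPs near it. A minimal sketch, assuming those r2 values are already available. The function name, the variant labels and the numbers are hypothetical:

```python
def coverage_stats(r2_to_map_snps, threshold=0.8):
    """Summarise how well a map covers a set of reference variants.

    r2_to_map_snps: dict mapping each reference variant to the list of r^2
    values between it and nearby map SNPs.
    Returns (fraction of variants whose best r^2 exceeds threshold,
             mean of the best r^2 per variant).
    """
    if not r2_to_map_snps:
        raise ValueError("no reference variants supplied")
    # Best proxy per variant; a variant with no neighbouring map SNP scores 0.
    best = [max(values, default=0.0) for values in r2_to_map_snps.values()]
    captured = sum(1 for r2 in best if r2 > threshold)
    return captured / len(best), sum(best) / len(best)


# Hypothetical r^2 values for five deeply ascertained variants
example = {
    "v1": [0.2, 1.0],
    "v2": [0.95],
    "v3": [0.85, 0.4],
    "v4": [0.6, 0.1],
    "v5": [0.9],
}
fraction_captured, mean_max_r2 = coverage_stats(example)
```

For this toy input, four of the five variants have a proxy with r2 > 0.8, and the mean of the best r2 values is 0.86.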
14
Applying the HapMap
  • Study design - tagging
  • Study coverage evaluation
  • Study analysis - improving association testing
  • Study interpretation
  • Comparison of multiple studies
  • Connection to genes/genomic features
  • Integration with expression and other functional
    data
  • Other uses of HapMap data
  • Admixture, LOH, selection

15
Tagging from HapMap
  • Since HapMap describes the majority of common
    variation in the genome, choosing non-redundant
    sets of SNPs from HapMap offers considerable
    efficiency without power loss in association
    studies

16
(No Transcript)
17
Pairwise tagging
Tags: SNP 1, SNP 3, SNP 6 (3 in total)
Test for association: SNP 1, SNP 3, SNP 6
After Carlson et al. (2004) AJHG 74:106
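
One way to pick pairwise tags is a greedy pass over the r2 values. This is a sketch in the spirit of the pairwise approach, not the exact procedure of Carlson et al. or of any particular tool. The function name, SNP labels and r2 values are hypothetical:

```python
def greedy_pairwise_tags(snps, r2, threshold=0.8):
    """Greedily choose tags so every SNP is a tag or has r^2 >= threshold with one.

    snps: list of SNP names (list order breaks ties).
    r2:   dict mapping frozenset({snp_a, snp_b}) to their r^2; missing pairs count as 0.
    """
    def captured_by(tag):
        return {s for s in snps
                if s == tag or r2.get(frozenset((tag, s)), 0.0) >= threshold}

    uncaptured = set(snps)
    tags = []
    while uncaptured:
        # Pick the SNP that captures the most SNPs that are still uncaptured.
        best = max(snps, key=lambda s: len(captured_by(s) & uncaptured))
        tags.append(best)
        uncaptured -= captured_by(best)
    return tags


# Toy region with three groups of perfectly correlated SNPs
snps = ["SNP1", "SNP2", "SNP3", "SNP4", "SNP5", "SNP6"]
r2 = {
    frozenset(("SNP1", "SNP2")): 1.0,
    frozenset(("SNP3", "SNP5")): 1.0,
    frozenset(("SNP4", "SNP6")): 1.0,
    frozenset(("SNP1", "SNP3")): 0.3,
}
tags = greedy_pairwise_tags(snps, r2)
```

With these toy values three tags cover all six SNPs. SNP4 and SNP6 are interchangeable here, so which of the two is chosen depends only on list order.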
18
Pairwise Tagging Efficiency
Table 7: Number of selected tag SNPs to capture all observed common SNPs in the Phase I HapMap for the three analysis panels using pairwise tagging at different r2 thresholds
                    YRI       CEU       CHB+JPT
Pairwise r2 ≥ 0.5   324,865   178,501   159,029
Pairwise r2 ≥ 0.8   474,409   293,835   259,779
Pairwise r2 = 1     604,886   447,579   434,476
Tag SNPs were picked to capture common SNPs in
release 16c.1 for every 7,000 SNP bin using
Haploview.
Tagging Phase I HapMap offers 2-5x gains in
efficiency
19
Use of haplotypes can improve genotyping
efficiency
Tags: SNP 1, SNP 3 (2 in total)
Test for association: SNP 1 (captures SNPs 1+2),
SNP 3 (captures SNPs 3+5), AG haplotype (captures
SNPs 4+6)
Tags: SNP 1, SNP 3, SNP 6 (3 in total)
Test for association: SNP 1, SNP 3, SNP 6
Tags in multi-marker test should be conditional
on significance of LD in order to avoid
overfitting
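
A haplotype of two tags can act as a proxy for a SNP that neither tag captures alone. A minimal sketch with made-up phased haplotypes, assuming 0/1 allele coding where 1 stands for the A allele at SNP 1 and the G allele at SNP 3. The helper and the data are illustrative only:

```python
def r_squared(x, y):
    """r^2 between two 0/1 indicator vectors of equal length."""
    n = len(x)
    px, py = sum(x) / n, sum(y) / n
    pxy = sum(a & b for a, b in zip(x, y)) / n
    denom = px * (1 - px) * py * (1 - py)
    return 0.0 if denom == 0 else (pxy - px * py) ** 2 / denom


# Hypothetical phased haplotypes: (SNP 1, SNP 3, untyped SNP 4)
haplotypes = [
    (1, 1, 1), (1, 1, 1),
    (1, 0, 0), (1, 0, 0),
    (0, 1, 0), (0, 1, 0),
    (0, 0, 0), (0, 0, 0),
]
snp1 = [h[0] for h in haplotypes]
snp3 = [h[1] for h in haplotypes]
snp4 = [h[2] for h in haplotypes]

# Indicator for carrying the AG haplotype across SNP 1 and SNP 3
ag_haplotype = [a & g for a, g in zip(snp1, snp3)]

r2_single_1 = r_squared(snp1, snp4)          # each tag alone is a weak proxy
r2_single_3 = r_squared(snp3, snp4)
r2_haplotype = r_squared(ag_haplotype, snp4)  # the haplotype is a perfect proxy
```

In this toy data each tag alone reaches only r2 = 1/3 with SNP 4, while the AG haplotype reaches r2 = 1. With real data such a multi-marker proxy should only be accepted when the underlying LD is significant, as the slide cautions.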
20
Efficiency and power
(Figure: relative power (%) against average marker
density (per kb), comparing tag SNPs with random
SNPs)
300,000 tag SNPs needed to cover
common variation in whole genome in CEU
P.I.W. de Bakker et al. (2005) Nat Genet, Advance
Online Publication, 23 Oct 2005
21
How to pick tag SNPs?
  • What is the genetic hypothesis? Which variants do
    you want to test for a role in disease?
  • functional annotation (coding SNPs)
  • allele frequency (HapMap ascertainment)
  • previously implicated associations
  • Go to http://www.hapmap.org for DCC-supported
    interactive tagging
  • Export HapMap data into tools such as Tagger,
    Haploview (www.broad.mit.edu/mpg)

22
Will tag SNPs picked from HapMap apply to other
population samples?
(Figure: three panels, each labelled CEU, for
samples of Whites from Los Angeles, CA; Botnia,
Finland; and Utah residents with European
ancestry (CEPH))
Population differences add very little
inefficiency
Platform presentation: Paul de Bakker (223, Sat
9.30)
23
Applying the HapMap
  • Study design - tagging
  • Study coverage evaluation
  • Study analysis - improving association testing
  • Study interpretation
  • Comparison of multiple studies
  • Connection to genes/genomic features
  • Integration with expression and other functional
    data
  • Other uses of HapMap data
  • Admixture, LOH, selection

24
Genome-wide association coverage
  • If genome-wide products are typed on the HapMap
    sample panel, the SNPs on HapMap not included in
    the panel provide an evaluation for the coverage
    of the product
  • ENCODE (deep ascertainment)
  • Phase II (dense, genome-wide)

25
Association tests with fixed markers
Tests of association: SNP 1, SNP 3
SNP on whole-genome product (1-5% of common
variation directly assayed)
26
Association tests with fixed markers
Tests of association: SNP 1, SNP 3
27
Association tests with fixed markers
Tests of association: SNP 1, SNP 3
SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5
28
Genome-wide products can capture most common
variation
Example: 500K data generated by Affymetrix and
recently submitted to HapMap DCC
29
More on this topic
  • Platform presentations tomorrow morning 8 AM
    sharp
  • Peer
  • Jorgenson
  • Lazarus
  • As well as several detailed posters!

30
Applying the HapMap
  • Study design - tagging
  • Study coverage evaluation
  • Study analysis - improving association testing
  • Study interpretation
  • Comparison of multiple studies
  • Connection to genes/genomic features
  • Integration with expression and other functional
    data
  • Other uses of HapMap data
  • Admixture, LOH, selection

31
Can incorporating tests of haplotypes of SNPs on
the genome-wide product improve this coverage?
32
Improving association power using data from
HapMap
Tests of association: SNP 1, SNP 3
SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5
33
Improving association power using data from
HapMap
Tests of association: SNP 1, SNP 3
SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5
34
Improving association power using data from
HapMap
Tests of association: SNP 1, SNP 3, AG haplotype
SNPs actually tested: SNP 1, SNP 3, SNP 2, SNP 5,
SNP 4, SNP 6
35
Haplotypes increase coverage
36
Applying the HapMap
  • Study design - tagging
  • Study coverage evaluation
  • Study analysis - improving association testing
  • Study interpretation
  • Connection to genes/genomic features
  • Comparison of multiple association studies
  • Integration with expression and other functional
    data
  • Other uses of HapMap data
  • Admixture, LOH, selection

37
Integration with genomic features
  • Positive association to a SNP on HapMap enables
    detailed interpretation
  • How many other SNPs are in LD with this SNP?
  • What genes are in LD with this SNP?
  • What coding variants and putative functional
    variants are in LD with this SNP?
  • Potential to improve power by modifying Bayesian
    priors of each association test based on this
    information

38
Example: Complement Factor H - AMD
  • Original SNP hit in Affy 100K experiment:
    rs380390
  • Extent and structure of LD from HapMap aids in
    the fine mapping phase of project

Klein et al., Science 2005
39
Example: Complement Factor H - AMD
rs380390
40
Example: Complement Factor H - AMD
rs380390
41
Meta-analysis of association studies
  • When different marker sets are used to study
    association (candidate gene or genome-wide),
    results can be readily integrated when all
    markers are typed on HapMap samples
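
Relating two studies that typed different markers comes down to asking, on the reference haplotypes, how strongly the markers are correlated and which alleles travel together. A minimal sketch, assuming phased reference haplotypes for the two markers are at hand. The function name and the haplotype counts are hypothetical:

```python
def relate_markers(haps, risk_allele_a):
    """Relate marker B to the reported risk allele of marker A.

    haps: list of (allele_at_A, allele_at_B) pairs, one per phased reference
          haplotype; marker B must be biallelic in the sample.
    Returns (r^2 between the markers, allele of B coupled with risk_allele_a),
    with None for the allele when the markers are uncorrelated.
    """
    alleles_b = sorted({b for _, b in haps})
    if len(alleles_b) != 2:
        raise ValueError("marker B must show exactly two alleles")
    x = [1 if a == risk_allele_a else 0 for a, _ in haps]
    y = [1 if b == alleles_b[0] else 0 for _, b in haps]
    n = len(haps)
    px, py = sum(x) / n, sum(y) / n
    pxy = sum(i & j for i, j in zip(x, y)) / n
    d = pxy - px * py                      # sign of D gives the coupling phase
    denom = px * (1 - px) * py * (1 - py)
    if denom == 0 or d == 0:
        return 0.0, None
    coupled = alleles_b[0] if d > 0 else alleles_b[1]
    return d * d / denom, coupled


# Hypothetical reference haplotypes: study 1 typed marker A, study 2 typed marker B
reference = [("A", "T")] * 3 + [("G", "C")] * 4 + [("A", "C")]
r2_ab, coupled_allele = relate_markers(reference, risk_allele_a="A")
```

Here allele T at marker B travels with risk allele A (r2 = 0.6), so a second study reporting T as the associated allele would agree in direction with the first, while one reporting C would not.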

42
(No Transcript)
43
Example: DTNBP1 and schizophrenia
  • Multiple studies have described modest
    association to schizophrenia
  • Most studies have examined small numbers of
    non-overlapping sets of SNPs
  • HapMap data can be used to determine whether
    these association findings are consistent
    across studies

Derek Morris, Mousumi Mutsuddi (WCPG meeting)
44
Extensive LD across DTNBP1
Phase II HapMap: 186 SNPs, 180 kb
45
Phylogeny of DTNBP1 tag SNPs
(Figure: haplotype tree rooted at the ancestral
haplotype; labels 6, 33, 42, 8, 11)
46
Associated alleles reported
Straub 2002, Van den Oord 2003
47
Associated alleles reported
Straub 2002, Van den Oord 2003
Schwab 2003
48
Associated alleles reported
Straub 2002, Van den Oord 2003
Van den Bogaert 2003, Funke 2004
Schwab 2003
49
Associated alleles reported
Straub 2002, Van den Oord 2003
Williams 2004, Bray 2005
Van den Bogaert 2003, Funke 2004
Schwab 2003
50
Associated alleles reported
Kirov 2004
Straub 2002, Van den Oord 2003
Williams 2004, Bray 2005
Van den Bogaert 2003, Funke 2004
Schwab 2003
51
Inconsistent findings
  • No consistently associated SNP/haplotype pattern
    across studies
  • All studies (European-derived populations) had
    allele/haplotype frequencies compatible with
    HapMap-CEU sample
  • HapMap can successfully relate associations from
    diverse marker sets

52
Other Applications: Structural Variation
  • 3 papers coming out in the next month describe
    use of HapMap data to identify large, common
    deletion polymorphisms
  • LD around these polymorphisms permits their
    assessment with tag SNPs/haplotypes in
    genome-wide association studies

53
Other Applications: Admixture Scanning
  • HapMap data provides a rich source of highly
    differentiated SNPs for design of admixture
    panels
  • Fine mapping of admixture signals can be focused
    on the full set of highly differentiated alleles
    in any region of the genome
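
Picking highly differentiated SNPs can be sketched as ranking SNPs by the difference in allele frequency between two panels. This is one simple criterion among several; the function name, SNP labels and frequencies are hypothetical:

```python
def most_differentiated(freq_pop1, freq_pop2, top_n=2):
    """Rank SNPs shared by two panels by absolute allele-frequency difference.

    freq_pop1, freq_pop2: dicts mapping SNP name to the frequency of the same
    reference allele in each panel.
    Returns the names of the top_n most differentiated SNPs.
    """
    shared = freq_pop1.keys() & freq_pop2.keys()
    # Sort by decreasing difference; break ties by name for a stable result.
    ranked = sorted(shared, key=lambda s: (-abs(freq_pop1[s] - freq_pop2[s]), s))
    return ranked[:top_n]


# Hypothetical allele frequencies in two panels
panel_1 = {"snpA": 0.9, "snpB": 0.5, "snpC": 0.2}
panel_2 = {"snpA": 0.1, "snpB": 0.45, "snpC": 0.6, "snpD": 0.3}
candidates = most_differentiated(panel_1, panel_2)
```

SNPs present in only one panel (snpD here) are ignored, since no frequency difference can be computed for them.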

54
Other Applications: LOH
  • HapMap identifies
  • Regions of extended LD that may manifest
    themselves as unusually long stretches of
    homozygosity in individual samples
  • The catalog of large deletion variants on the
    HapMap will differentiate between LOH that is
    potentially de novo and causal, and that which is
    simply commonly segregating in the population

LOH analysis cognizant of HapMap patterns under
development
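
The first step of such an analysis, finding long stretches of homozygosity in one sample, can be sketched as follows. It does not include the comparison against HapMap LD patterns or known deletions that the slide describes. The function name, positions and genotypes are hypothetical:

```python
def longest_homozygous_run(positions, genotypes):
    """Find the longest run of consecutive homozygous genotype calls.

    positions: SNP coordinates in ascending order.
    genotypes: two-character genotype strings such as "AA" or "AG".
    Returns (start, end) coordinates of the longest run, or None if every
    call is heterozygous.
    """
    best = None
    run_start = None
    for pos, genotype in zip(positions, genotypes):
        if genotype[0] == genotype[1]:
            if run_start is None:
                run_start = pos            # a new homozygous run begins here
            if best is None or pos - run_start > best[1] - best[0]:
                best = (run_start, pos)
        else:
            run_start = None               # a heterozygous call ends the run
    return best


# Hypothetical genotype calls for one sample
positions = [100, 200, 300, 400, 500, 600]
genotypes = ["AA", "AG", "CC", "TT", "GG", "AC"]
run = longest_homozygous_run(positions, genotypes)
```

A run found this way is only a candidate: if it falls in a region of extended LD or overlaps a common deletion, it may simply be segregating in the population rather than being de novo.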
55
Early results encouraging
  • At this meeting
  • Arking and colleagues describe identification of
    variant altering QT-interval
  • Herbert and colleagues describe a novel gene for
    obesity
  • Wijmenga and colleagues describe a novel gene for
    celiac disease