Title: The Genome Access Course Sequence Variation
1 The Genome Access Course: Sequence Variation
2 Any two copies of the human genome differ at about 1 in every 1000 nt
3 Variation Types
- Cytological level
  - Chromosome numbers
  - Segmental duplications, rearrangements, and deletions
- Molecular level
  - Transposable Elements
  - Short Deletions, Sequence and Tandem Repeats
- Sequence level
  - Single Nucleotide Polymorphisms (SNPs)
  - Small Nucleotide Insertions and Deletions (Indels)
4 Variation is useful
- Identify genetic basis for
  - Disease risk
  - Reactions to environmental triggers
  - Responsiveness to drug treatments
- Forensics
- Genetic and physical mapping
- Evolution
5 Most common diseases are caused by a combination of genes and environment
Stroke
Manic-depression
Myocardial Infarction
Breast cancer
Hypertension
Diabetes
High Cholesterol
Obesity
Schizophrenia
Inflammatory Bowel Disease
6 RFLP, SSR
[Figure: RFLP and SSR genotyping. Labels: a, x, y, b; genotypes a,a, a,b, and b,b; radioactive probes; gel electrophoresis]
Human Molecular Genetics (Strachan & Read 2004)
7
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
- SSRs are not frequent enough for complex disease association studies
- SNPs are the most abundant type of polymorphism (1 SNP has been discovered every 2000 nt)
- 1 every 300 bases on average in the world's human population
- To be considered a polymorphism, the minor allele must have a frequency > 1%
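The two sequences on this slide differ at a single base. A minimal sketch of locating that position, assuming the sequences are already aligned and equal in length (the function name is illustrative, not from the course):

```python
def mismatch_positions(seq_a, seq_b):
    """Return the 1-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        # Indels are out of scope for this sketch.
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


allele_1 = "GATTTAGATCGCGATAGAG"
allele_2 = "GATTTAGATCTCGATAGAG"

positions = mismatch_positions(allele_1, allele_2)
print(positions)  # [11]: G in the first sequence, T in the second
```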
9 The SNP Consortium
- Construct a high-density human SNP map for medical and population genetics studies
- Identify 1 M candidate SNPs by shotgun sequencing of genomic fragments from 24 individuals
- Project finished in 9/2003
10 SNPs in Shotgun Genomic Sequences
[Figure: shotgun sequences aligned to the HGP reference sequence, with two SNPs marked]
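The idea on this slide, comparing shotgun reads against the reference to find candidate SNPs, can be sketched as follows. This is a toy illustration only: it assumes ungapped reads with known 0-based start positions and ignores base quality and sequencing error, and the reads are invented for the example.

```python
def candidate_snps(reference, aligned_reads):
    """Map each 0-based reference position to the non-reference bases seen there.

    aligned_reads is a list of (start, sequence) pairs, ungapped.
    """
    candidates = {}
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            if pos >= len(reference):
                break  # read runs past the end of the reference
            if base != reference[pos]:
                candidates.setdefault(pos, set()).add(base)
    return candidates


reference = "GATTTAGATCGCGATAGAG"
reads = [  # hypothetical reads
    (0, "GATTTAGATCTCG"),
    (5, "AGATCTCGATAG"),
    (8, "TCGCGATAGAG"),
]

print(candidate_snps(reference, reads))  # {10: {'T'}}
```

Two reads carry T where the reference has G, and one read matches the reference, so position 10 (0-based) is reported as a candidate.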
11 The SNP Consortium
- An additional 1 M SNPs were identified by the Human Genome Project in the overlaps between BAC clones
12 SNPs in Overlapping Genomic Sequences
[Figure: overlapping BACs from the library, with a SNP marked]
50% of overlaps contain polymorphisms
13 SNP Map
- 1.42 M non-redundant SNPs
- 95% estimated to be polymorphic in at least one population (1500 SNPs genotyped)
- 82% of the SNPs have a minor allele frequency > 10%
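Minor allele frequency underlies both thresholds used in this deck (> 1% to count as a polymorphism, > 10% for the common SNPs on the map). A minimal sketch of computing it from diploid genotype calls, assuming a biallelic site and genotypes written as two-letter strings; the sample counts are made up:

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the frequency of the rarer allele at a biallelic site."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    if len(counts) > 2:
        raise ValueError("more than two alleles observed")
    if len(counts) < 2:
        return 0.0  # only one allele seen, so the site is monomorphic here
    return min(counts.values()) / sum(counts.values())


# Hypothetical sample of 100 individuals (200 chromosomes).
genotypes = ["AA"] * 80 + ["AC"] * 18 + ["CC"] * 2

maf = minor_allele_frequency(genotypes)
print(round(maf, 2))   # 0.11, since 22 of 200 chromosomes carry C
print(maf > 0.01)      # True: meets the polymorphism threshold
print(maf > 0.10)      # True: also a common SNP
```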
14 Types of SNPs
- Genic, coding SNPs
  - Non-synonymous
  - Synonymous
- Genic, non-coding SNPs
  - Regulatory SNPs
  - Intronic SNPs
- Intergenic
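The synonymous / non-synonymous distinction for coding SNPs can be illustrated with the standard genetic code. The sketch below assumes the codon and the position of the SNP within it are already known; the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" is a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(codon, position, alt_base):
    """Classify a single-base change at position (0-2) within a codon."""
    mutated = codon[:position] + alt_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "non-synonymous"


print(classify_coding_snp("GCG", 2, "T"))  # synonymous: GCG and GCT both encode Ala
print(classify_coding_snp("GAG", 1, "T"))  # non-synonymous: Glu (GAG) becomes Val (GTG)
```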
15 The Challenge: Genotyping
- Sequence comparison
  - Genomic sequences
  - ESTs
  - BACs
- PCR (TaqMan)
- Microarrays
- SSCP
  - single-strand conformation polymorphism
- DHPLC
  - heteroduplex DNAs w/ HPLC
16 TaqMan
[Figure: TaqMan assay; labels R, C, A]
AA genotype: Red (shown); CC genotype: Green; AC genotype: R + G
Sayers et al. In SNP and Microsatellite Genotyping, Biotechniques Pub., MA 2000
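The colour-to-genotype mapping on this slide (red for AA, green for CC, both for AC) amounts to a simple decision rule. A rough sketch, assuming fluorescence signals normalised to the range 0-1 and an arbitrary cutoff of 0.5; both assumptions are illustrative and do not come from the slide or from any instrument's software:

```python
def call_taqman_genotype(red, green, threshold=0.5):
    """Call a genotype from normalised red (A allele) and green (C allele) signals."""
    red_on = red >= threshold
    green_on = green >= threshold
    if red_on and green_on:
        return "AC"  # both probes report: heterozygote
    if red_on:
        return "AA"
    if green_on:
        return "CC"
    return "no call"  # neither signal is strong enough


print(call_taqman_genotype(0.9, 0.1))  # AA
print(call_taqman_genotype(0.1, 0.8))  # CC
print(call_taqman_genotype(0.7, 0.6))  # AC
```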
17 Microarrays
Oligo matching the A allele
Oligo matching the C allele
Wang et al. Science 1998
18 Haplotypes
(4 haplotypes in the population for a 6000 bp region)
The International HapMap Consortium, Nature 2003
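A short sketch of what "4 haplotypes in the population" means in practice: n biallelic SNPs could in principle form 2^n combinations, but far fewer are typically observed. The chromosomes below are invented for illustration, with each string listing the alleles at four hypothetical SNPs.

```python
from collections import Counter


def haplotype_frequencies(chromosomes):
    """Return the frequency of each distinct haplotype among phased chromosomes."""
    counts = Counter(chromosomes)
    total = len(chromosomes)
    return {haplotype: n / total for haplotype, n in counts.items()}


# Hypothetical phased chromosomes (alleles at 4 SNPs each).
chromosomes = ["ACGT"] * 4 + ["ATGT"] * 3 + ["GCAT"] * 2 + ["GCAC"] * 1

freqs = haplotype_frequencies(chromosomes)
print(len(freqs))           # 4 observed haplotypes
print(2 ** 4)               # 16 possible combinations of 4 biallelic SNPs
print(freqs["ACGT"])        # 0.4
```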
19
- Genotyped 600,000 SNPs in 270 DNA samples from several populations from different ancestral geographic locations
  - 30 trios, Yoruba people (Ibadan, Nigeria), code YRI
  - 45 Japanese (unrelated, Tokyo), code JPT
  - 45 Chinese (unrelated, Beijing), code CHB
  - 30 trios, U.S. (northern/western European), code CEU
- Genotyped additional SNPs in regions where associations are weak
- Complete analysis of ten 500 kb regions (ENCODE project)
- Major paper published in 10/2005 (Nature)
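The 270-sample total follows from the panel sizes listed above, counting each trio as three individuals. A quick check:

```python
# Samples per population code; a trio contributes three individuals.
samples = {
    "YRI": 30 * 3,  # 30 trios
    "JPT": 45,      # unrelated individuals
    "CHB": 45,      # unrelated individuals
    "CEU": 30 * 3,  # 30 trios
}

total = sum(samples.values())
print(total)  # 270
```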
20 ENCODE Project
21 926 SNPs w/ Extreme Allele Frequencies Like Duffy (FY) Locus
- 32 are non-synonymous coding SNPs
These SNPs show very strong population differentiation
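One simple way to flag a SNP with extreme allele-frequency differences is to take the largest pairwise difference across populations. This is only an illustrative measure, not necessarily the statistic used in the HapMap analysis, and the frequencies below are invented.

```python
from itertools import combinations


def max_frequency_difference(allele_freqs):
    """Return the largest absolute allele-frequency difference between any two populations."""
    return max(abs(a - b) for a, b in combinations(allele_freqs.values(), 2))


# Hypothetical frequencies of one allele in the four HapMap panels.
allele_freqs = {"YRI": 0.99, "CEU": 0.01, "CHB": 0.02, "JPT": 0.02}

print(round(max_frequency_difference(allele_freqs), 2))  # 0.98
```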
22 Candidate Loci for Natural Selection
- High-frequency haplotypes that are large
HLA region: multiple haplotypes of 500 SNPs that extend > 1 cM w/ frequency > 1%
25 Tag SNP
29 Sites for Viewing SNPs
- UCSC Browser
  - http://genome.ucsc.edu
- The International HapMap Project
  - http://www.hapmap.org
- NCBI dbSNP
  - http://www.ncbi.nlm.nih.gov/SNP
- Ensembl
  - http://www.ensembl.org
- Perlegen
  - http://genome.perlegen.com/browser/index.html
30 Haplotype Analysis
Reduce the QTL interval by comparing haplotypes of the parental strains used in a cross. Exclude regions where haplotypes match.
High blood pressure QTL from an intercross of (C3H/HeJ x SWR/J) F2: 18 cM, spanning 68-86 cM on Chr 1
Compared haplotypes of mice with high and normal blood pressure
High blood pressure: SWR/J and C57BL/6J
Normal blood pressure: C3H/HeJ and A/J
31 Haplotype Analysis
[Figure: SSLP marker lengths]
32 Haplotype Analysis
Repeat example using dbSNP (Chr. 1, 154-165 Mb, for SWR, B6, C3H, A)
http://www.ncbi.nlm.nih.gov/SNP/MouseSNP.cgi
33 Haplotype Analysis
Repeat example using MGI (Chr. 1, 154-165 Mb, for SWR, B6, C3H, A)