Title: The Genome Access Course Sequence Variation: SNPs, etc.
1The Genome Access Course
Sequence Variation: SNPs, etc.
Marilyn Monroe, Andy Warhol
2Human Genetic Identity
- 99.9% identical
- 3,200,000 nucleotides different
- Single base differences in genomes between any two individuals: ca. 3 million
- Amino acid differences in proteomes between any two individuals: ca. 100,000
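The first two figures on this slide agree with each other. A quick check, assuming a haploid genome of about 3.2 billion bp (a size implied by the slide's numbers, not stated on it):

```python
# Assumption: haploid human genome size of ~3.2 billion bp (not stated on the slide).
GENOME_SIZE_BP = 3_200_000_000

# 99.9% identical means 0.1% differs, i.e. 1 base in every 1,000.
DIFFERING_PER_THOUSAND = 1

differing_bases = GENOME_SIZE_BP * DIFFERING_PER_THOUSAND // 1000
print(f"{differing_bases:,} nucleotides differ")  # 3,200,000 nucleotides differ
```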
3Variation Types
- Macro
- Chromosome numbers
- Segmental duplications, rearrangements, and deletions
- Medium
- Sequence Repeats
- Transposable Elements
- Short Deletions, Sequence and Tandem Repeats
- Micro
- Single Nucleotide Polymorphisms (SNPs)
- Single Nucleotide Insertions and Deletions (Indels)
4Relevance: Important applications
- Identify inherited contribution to
- Disease risk
- Reactions to environmental triggers
- Reactions to treatments
- Cognitive abilities
- Requirements
- Forensics (incl. WTC)
- Ontology & Phylogeny
- Evolution
5Most common diseases are caused by a combination of genes and environment
Stroke
Manic-depression
Myocardial Infarction
Breast cancer
Hypertension
Diabetes
High Cholesterol
Obesity
Schizophrenia
Inflammatory Bowel Disease
6Roles for variation in genome analysis
- Physical Mapping
- enriched marker set
- Population Structure
- haplotype analysis
- evolutionary studies
- Association Studies
- population stratification
- metabolic pathways
- Functional Analysis
- Pharmacogenomics
- protein structure
7Raw Genome Data
Biological variation vs. sequence variation
8Most abundant type: SNPs (Single Nucleotide Polymorphisms)
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
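The two sequences on this slide differ at a single base. A minimal sketch of how such a difference can be located, assuming the sequences are already aligned and of equal length (the function name is illustrative, not from the course):

```python
def single_base_differences(seq_a, seq_b):
    """Return (position, base_a, base_b) for each mismatch between two aligned sequences.

    Positions are 0-based.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


allele_1 = "GATTTAGATCGCGATAGAG"
allele_2 = "GATTTAGATCTCGATAGAG"
print(single_base_differences(allele_1, allele_2))  # [(10, 'G', 'T')]
```

A single mismatch found this way is only a candidate SNP. Whether it counts as one depends on how often each allele occurs in the population.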
9SNPs in Overlapping Genomic Sequences
Overlapping BACs from library
50% of overlaps contain polymorphisms
10SNPs: What they are (and aren't)
- Single base pair variations among allelic sequences
- Least abundant allele has frequency of at least 1%
- Not all single base pair differences are SNPs
- Nor are all point mutations (indels are not)
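The frequency criterion can be sketched as a simple check on allele counts. The counts below are hypothetical, and the 1% cutoff is the conventional threshold for the least abundant allele. The check covers only the frequency rule; it does not test whether the variant is a substitution rather than an indel.

```python
def meets_snp_frequency(allele_counts, min_percent=1):
    """True if the least abundant allele reaches min_percent of sampled chromosomes.

    allele_counts maps each observed allele to the number of chromosomes carrying it.
    Integer arithmetic avoids floating-point rounding at the threshold.
    """
    if len(allele_counts) < 2:
        return False  # only one allele observed: not polymorphic in this sample
    total = sum(allele_counts.values())
    minor = min(allele_counts.values())
    return minor * 100 >= total * min_percent


# Hypothetical counts from 1,000 sampled chromosomes.
print(meets_snp_frequency({"G": 990, "T": 10}))  # True: minor allele at 1%
print(meets_snp_frequency({"G": 999, "T": 1}))   # False: minor allele at 0.1%
```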
11SNP - Identification
- Sequence comparison
- Genomic sequences
- ESTs
- BACs
- Rare alleles difficult to spot (below error rate)
- PCR
- Microarrays
- Databases (in silico)
12Occurrence
- Ca. 1/1300 bp between two alleles
- Ca. 1/300 bp between populations
- In intergenic regions, introns, and exons
- Functionally constrained DNA less diverse
- 50% of CDS SNPs synonymous
- Frequency varies by variation type
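The densities above translate into a rough expected count for a region of a given length. This sketch assumes differences are spread uniformly, which is a simplification: the slide itself notes that functionally constrained DNA is less diverse. The region length is hypothetical.

```python
# Densities from the slide, expressed as base pairs per single-base difference.
BP_PER_DIFFERENCE_TWO_ALLELES = 1300
BP_PER_DIFFERENCE_POPULATIONS = 300


def expected_differences(region_length_bp, bp_per_difference):
    """Expected number of single-base differences, assuming uniform density."""
    return region_length_bp / bp_per_difference


region = 130_000  # hypothetical region length in bp
print(expected_differences(region, BP_PER_DIFFERENCE_TWO_ALLELES))  # 100.0
```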
13Types of SNPs
- Genic, coding SNPs
- non-synonymous
- Maintaining vs. altering protein structure/function
- synonymous
- Maintaining vs. altering splicing
- Genic, non-coding SNPs
- Regulatory SNPs
- Maintaining vs. altering gene expression
- Intronic SNPs
- Maintaining vs. altering gene expression/splicing
- Linked SNPs
- usually intergenic
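The synonymous versus non-synonymous distinction for coding SNPs can be illustrated by translating the codon before and after the change. This is a minimal sketch using the standard genetic code; the function name and example codons are illustrative. It looks only at the encoded amino acid, so it says nothing about the splicing or structural effects listed above.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-base codon change as synonymous or non-synonymous."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    mismatches = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if len(ref_codon) != 3 or len(alt_codon) != 3 or mismatches != 1:
        raise ValueError("codons must be 3 bases long and differ at exactly one base")
    same_amino_acid = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same_amino_acid else "non-synonymous"


print(classify_coding_snp("GAT", "GAC"))  # synonymous (Asp -> Asp)
print(classify_coding_snp("GAT", "GAG"))  # non-synonymous (Asp -> Glu)
```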
14GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
- Rare Alleles
- ---o--------------------
- -----o------------------
- -------o----------------
- -----------o------------
- ---------------o--------
- -------------------o----
- Many
- Common Alleles
- ----o-------------------
- ----o-------------------
- ----o-------------------
- --------------------o---
- ---------o----------o---
- --------------------o---
- Few
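The contrast in the diagram can be made explicit by counting, for each site, how many chromosomes carry the variant. This sketch assumes each row is one sampled chromosome and each `o` marks a variant allele at that position; that reading of the diagram is an interpretation, not something the slide states.

```python
from collections import Counter


def carriers_per_site(chromosomes):
    """Map each variant position (0-based) to the number of rows carrying an 'o' there."""
    counts = Counter()
    for row in chromosomes:
        for position, symbol in enumerate(row):
            if symbol == "o":
                counts[position] += 1
    return dict(sorted(counts.items()))


rare = [
    "---o--------------------",
    "-----o------------------",
    "-------o----------------",
    "-----------o------------",
    "---------------o--------",
    "-------------------o----",
]
common = [
    "----o-------------------",
    "----o-------------------",
    "----o-------------------",
    "--------------------o---",
    "---------o----------o---",
    "--------------------o---",
]

print(carriers_per_site(rare))    # {3: 1, 5: 1, 7: 1, 11: 1, 15: 1, 19: 1}
print(carriers_per_site(common))  # {4: 3, 9: 1, 20: 3}
```

Rare alleles give many sites with one carrier each, while common alleles give few sites shared by several chromosomes.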
15Sites for Viewing SNPs
- UCSC Browser (http://genome.ucsc.edu)
- SNP Consortium (http://snp.cshl.org/)
- NCBI dbSNP (http://www.ncbi.nlm.nih.gov/SNP)
18Aaron the Lemba
21NCBI's dbSNP - variation and polymorphism
Variant position A/G
22Variations mapped onto protein structures
3,868 variations (6% of coding region variations) have been mapped to a 3D structure
23SNPs vs. Haplotypes
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
CGGGTATCGATTTAGATCGCGATAGAGTTGCCTACA
CGAGTATCGATTTAGATCTCGATAGAGTTGTCTACA
Many polymorphisms make a type
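A haplotype can be read off as the combination of alleles at all variant sites along one chromosome. A minimal sketch using the two longer sequences from this slide, assuming they are aligned (function names are illustrative):

```python
def variant_sites(sequences):
    """Return 0-based positions where the aligned sequences do not all agree."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


def haplotype(sequence, sites):
    """Concatenate the alleles a sequence carries at the given sites."""
    return "".join(sequence[i] for i in sites)


chromosome_1 = "CGGGTATCGATTTAGATCGCGATAGAGTTGCCTACA"
chromosome_2 = "CGAGTATCGATTTAGATCTCGATAGAGTTGTCTACA"

sites = variant_sites([chromosome_1, chromosome_2])
print(sites)                           # [2, 18, 30]
print(haplotype(chromosome_1, sites))  # GGC
print(haplotype(chromosome_2, sites))  # ATT
```

The short sequence pair shows one SNP; the longer pair shows three polymorphisms that together define two distinct types.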
24
- Responders
- tcgaggaacagggctcttaaaaatgctttatccgcttag
- tcgaggaacagggctcttaaaaatgctttctccgcttgg
- tagagcaacagggctctaaaaaatgctttctccgcttag
- Non-responders
- tagtgaaacagggctctgaaaactgctttatccgattcg
- tagtggaatagggctctgaaaactgctttatccgattgg
- tcgtggaacagggctctgaaaactgctttgtccgattgg
- Responders
- -c-a-g--c--------t----a------a----c--a-
- -c-a-g--c--------t----a------c----c--g-
- -a-a-c--c--------a----a------c----c--a-
- Non-responders
- -a-t-a--c--------g----c------a----a--c-
- -a-t-g--t--------g----c------a----a--g-
- -c-t-g--c--------g----c------g----a--g-
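The six sequences on this slide can be scanned for sites whose alleles separate the two groups completely. This is an illustrative sketch, not a method given in the course: it flags sites where no allele is shared between responders and non-responders. With only three sequences per group it shows the idea and is not a statistical test.

```python
def variable_sites(sequences):
    """Return 0-based positions where the aligned sequences do not all agree."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


def discriminating_sites(group_a, group_b):
    """Return 0-based positions where the two groups share no allele."""
    length = len(group_a[0])
    return [
        i for i in range(length)
        if not {seq[i] for seq in group_a} & {seq[i] for seq in group_b}
    ]


responders = [
    "tcgaggaacagggctcttaaaaatgctttatccgcttag",
    "tcgaggaacagggctcttaaaaatgctttctccgcttgg",
    "tagagcaacagggctctaaaaaatgctttctccgcttag",
]
non_responders = [
    "tagtgaaacagggctctgaaaactgctttatccgattcg",
    "tagtggaatagggctctgaaaactgctttatccgattgg",
    "tcgtggaacagggctctgaaaactgctttgtccgattgg",
]

print(variable_sites(responders + non_responders))
# [1, 3, 5, 8, 17, 22, 29, 34, 37]
print(discriminating_sites(responders, non_responders))
# [3, 17, 22, 34]
```

Of the nine variable sites, four separate the groups cleanly in this sample; the other five vary within at least one group.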
25Haplotypes in NCBI's MapViewer