Title: Human Genetic Variation
1. Human Genetic Variation
SNPs
2. Human Genetic Identity
- 99.9% identical
- 3,196,800,000 nucleotides identical
- 3,200,000 nucleotides different
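The two nucleotide counts follow from the identity percentage by simple arithmetic. A minimal sketch, assuming a genome size of 3.2 billion nucleotides (a value inferred from the slide's own counts, not stated on it):

```python
# Assumption: genome size of 3.2 billion nucleotides, inferred from the
# slide's counts (3,196,800,000 + 3,200,000); it is not stated explicitly.
GENOME_SIZE = 3_200_000_000

# 99.9% identity, written per thousand so the arithmetic stays in integers.
IDENTITY_PER_THOUSAND = 999

identical = GENOME_SIZE * IDENTITY_PER_THOUSAND // 1000
different = GENOME_SIZE - identical

print(f"{identical:,} nucleotides identical")   # 3,196,800,000
print(f"{different:,} nucleotides different")   # 3,200,000
```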
3. Human Genetic Variation
- Single base differences in genomes between any two individuals: 2-5 million
- Amino acid differences in proteomes between any two individuals: about 100,000
- Inheritance vs. Alterations
4. Relevance: Variation Bears Difference
- Physiological and anatomical differences based on molecular differences
- Exception: Trauma, environmental impacts
- Genetics: Inherited contribution to phenotypic variation
6. Variation
- Macro
  - Chromosome numbers
  - Segmental duplications, rearrangements, and deletions
- Medium
  - Sequence Repeats
  - Transposable Elements
  - Short Deletions, Sequence and Tandem Repeats
- Micro
  - Single Nucleotide Polymorphisms (SNPs)
  - Single Nucleotide Insertions and Deletions (Indels)
7. Relevance: Important applications
- Identify inherited contribution to
- Disease risk
- Reactions to environmental triggers
- Reactions to treatments
- Cognitive abilities
- Requirements
- Forensics (incl. WTC)
- Ontology & Phylogeny
- Evolution
8. SNPs in Biomedicine
- Medical Conditions
  - Diagnosis and treatment
  - Gene therapy
- Pharmacogenomics
  - Individualized drugs
  - Fewer drug side effects
  - Faster clinical trials
9. Most common diseases are caused by a combination of genes and environment
Stroke
Manic-depression
Myocardial Infarction
Breast cancer
Hypertension
Diabetes
High Cholesterol
Obesity
Schizophrenia
Inflammatory Bowel Disease
10. Traditional Approach
- Linkage or recombinatorial mapping
- Successful for single gene disorders
- Little success for common complex traits, such as
heart disease, diabetes, asthma, mental disorders
11. Mendelian disease genetics
[Diagram labels: genotype, disease state]
Linkage analysis is powerful because genetic risk factors are highly penetrant
12. Complex Disease Genetics
[Diagram labels: genotype, other genes, environment, phenotype1 to phenotype5, disease state]
13. CHALLENGE
Find genetic susceptibility factors for common disease, drug and pathogen response
- To understand the fundamental basis of disease
- To identify at-risk populations
- To identify targets for chemical intervention/drug development
- To predict outcomes and treatment efficacy
14. Complex Disease Genetics
- Despite a significant genetic component, traditional approaches (e.g., linkage analysis) have little or no power because the genetic risk factors are individually too modest/incompletely penetrant
- If traditional approaches aren't succeeding, where do we turn?
15. Raw Genome Data
Biological variation vs. sequence variation
16. Most abundant type: SNPs (Single Nucleotide Polymorphisms)
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
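The two example sequences on this slide differ at a single base. A minimal sketch of how that position could be located, assuming the sequences are already aligned and of equal length (the function name `find_mismatches` is illustrative, not from the slides):

```python
def find_mismatches(seq_a, seq_b):
    """Return (0-based position, base in seq_a, base in seq_b) for each difference."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


allele_1 = "GATTTAGATCGCGATAGAG"
allele_2 = "GATTTAGATCTCGATAGAG"

mismatches = find_mismatches(allele_1, allele_2)
print(mismatches)  # [(10, 'G', 'T')]
```

A single mismatch between two sequences is only a candidate; whether it counts as a SNP depends on its frequency in the population.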
17. SNPs in Overlapping Genomic Sequences
Overlapping BACs from library
50% of overlaps contain polymorphisms
18. SNPs: What they are (and aren't)
- Single base pair variations among allelic sequences
- Least abundant allele has frequency > 1%
- Not all single base pair differences are SNPs
- Nor are all point mutations
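The frequency criterion can be written as a small check. This is a sketch under the assumption that allele counts at one site have already been tallied for a population sample; the function name `is_snp` and the example counts are hypothetical:

```python
def is_snp(allele_counts, threshold=0.01):
    """Apply the slide's criterion: least abundant allele frequency > 1%.

    allele_counts maps each observed allele at one site to its count
    in the sampled chromosomes (hypothetical input format).
    """
    observed = {allele: n for allele, n in allele_counts.items() if n > 0}
    if len(observed) < 2:
        return False  # only one allele seen: no variation at this site
    total = sum(observed.values())
    minor_frequency = min(observed.values()) / total
    return minor_frequency > threshold


print(is_snp({"G": 950, "T": 50}))   # True: minor allele at 5%
print(is_snp({"G": 995, "T": 5}))    # False: minor allele at 0.5%
```

The second call illustrates why not every single base pair difference is a SNP: a variant seen in 0.5% of sampled chromosomes falls below the threshold.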
19. Types of SNPs
- Causative SNPs
  - coding SNPs (synonymous/non-synonymous)
  - non-coding SNPs
  - (Read: Making Sense of Nonsense, NRG 5/02)
- Linked SNPs
  - usually intra- and intergenic SNPs
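Whether a coding SNP is synonymous can be decided by translating the codon before and after the substitution. A minimal sketch using the standard genetic code; the function name `classify_coding_snp` and the example codon are illustrative assumptions, not taken from the slides:

```python
# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_coding_snp(codon, offset, alt_base):
    """Classify a substitution at codon[offset] (0, 1 or 2).

    Returns (kind, amino acid before, amino acid after), where kind is
    'synonymous' or 'non-synonymous'. '*' denotes a stop codon.
    """
    codon, alt_base = codon.upper(), alt_base.upper()
    if codon[offset] == alt_base:
        raise ValueError("alternate base equals the reference base")
    variant = codon[:offset] + alt_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[variant]
    kind = "synonymous" if before == after else "non-synonymous"
    return kind, before, after


print(classify_coding_snp("GAT", 2, "C"))  # ('synonymous', 'D', 'D')
print(classify_coding_snp("GAT", 2, "A"))  # ('non-synonymous', 'D', 'E')
```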
20. Occurrence
- Ca. 1/1300 bp in genomic DNA from two equivalent chromosomes
- Ca. 1/300 bp in whole populations
- In intergenic regions, introns & exons
- Functionally constrained DNA less diverse
- 50% of CDS SNPs synonymous
- Frequency varies by variation type
23. TSC SNPdb
- A/G, C/T: 33%
- A/C, A/T, C/G, G/T: 8%
- A/C/G, A/C/T, A/G/T, C/G/T: 0.006%
- A/C/G/T: 0.002%
24. Transversions: c→a, g→t, c→g, g→c, t→a, a→t
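The distinction between transitions and transversions rests on base chemistry: a transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T), while a transversion crosses between the two classes. A minimal sketch (the function name `classify_substitution` is illustrative):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition
print(classify_substitution("c", "a"))  # transversion
```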
25. SNPs - Identification
- Sequence comparison
- PCR
- Microarrays
- Databases
26. In Silico SNP Identification
- Mine existing sequence resources
  - Genomic sequences
  - ESTs
  - BAC-end sequences
- Cost-effective genome-wide SNP discovery by examining regions of redundant sequence coverage
- Rare alleles difficult to spot (below error rate)
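The idea of mining redundant coverage can be sketched as a column-by-column vote over overlapping reads. This is an illustration under stated assumptions, not the method of any particular tool: the reads are assumed to be already aligned and trimmed to the same region, and the function name `candidate_snps`, the `min_minor_reads` cutoff, and the example reads are all hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_reads=2):
    """Report positions where a second base is seen in at least min_minor_reads reads.

    A base seen only once is treated as a probable sequencing error, which is
    also why rare alleles are hard to detect this way.
    Returns (0-based position, most common base, second most common base).
    """
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    candidates = []
    for pos in range(length):
        counts = Counter(read[pos] for read in aligned_reads).most_common()
        if len(counts) >= 2 and counts[1][1] >= min_minor_reads:
            candidates.append((pos, counts[0][0], counts[1][0]))
    return candidates


reads = [
    "GATTTAGATCGCGATAGAG",
    "GATTTAGATCGCGATAGAG",
    "GATTTAGATCTCGATAGAG",
    "GATTTAGATCTCGATAGAG",
    "GATTTAGATCGCGATACAG",  # lone mismatch at position 16: likely an error
]
snps = candidate_snps(reads)
print(snps)  # [(10, 'G', 'T')]
```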
28. De Novo SNP Identification
- Select candidates
- Establish cell cultures
- Isolate DNA, digest, gel fractionate, clone
- Sequence, kill if repeats > 50%
- BLAST against each other, select > 99% identity
- Run SNP-finding program
30. GATTTAGATCGCGATAGAG / GATTTAGATCTCGATAGAG
- Rare Alleles
- ---o--------------------
- -----o------------------
- -------o----------------
- -----------o------------
- ---------------o--------
- -------------------o----
- Many
- Common Alleles
- ----o-------------------
- ----o-------------------
- ----o-------------------
- --------------------o---
- --------------------o---
- --------------------o---
- Few
32. Pharmacogenomics
- The use of DNA sequence information to measure and predict the reaction of individuals to drugs.
- Personalized drugs
- Faster clinical trials through selected trial populations
- Fewer drug side effects
33. Drug Responses
- Absorption
- Distribution
- Activation
- Metabolism
- Excretion
34. Patients
- Responders
- Non-responders
- Toxic responders
35. Goals
- Right Drug
- Right Dose
- Right Patient
36. SNPs vs. Haplotypes
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
CGGGTATCGATTTAGATCGCGATAGAGTTGCCTACA
CGAGTATCGATTTAGATCTCGATAGAGTTGTCTACA
Many polymorphisms make a type
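The longer pair of sequences on this slide differs at three positions, and the combination of alleles at those positions is what defines each haplotype. A minimal sketch, assuming aligned sequences of equal length (the function names `variant_sites` and `haplotype` are illustrative):

```python
def variant_sites(sequences):
    """Return 0-based positions at which the aligned sequences are not all identical."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


def haplotype(sequence, sites):
    """Summarize a sequence by its alleles at the given variant sites."""
    return "".join(sequence[i] for i in sites)


sequences = [
    "CGGGTATCGATTTAGATCGCGATAGAGTTGCCTACA",
    "CGAGTATCGATTTAGATCTCGATAGAGTTGTCTACA",
]

sites = variant_sites(sequences)
haplotypes = [haplotype(seq, sites) for seq in sequences]
print(sites)        # [2, 18, 30]
print(haplotypes)   # ['GGC', 'ATT']
```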
37.
- Responders
- tcgaggaacagggctcttaaaaatgctttatccgcttag
- tcgaggaacagggctcttaaaaatgctttctccgcttgg
- tagagcaacagggctctaaaaaatgctttctccgcttag
- Non-responders
- tagtgaaacagggctctgaaaactgctttatccgattcg
- tagtggaatagggctctgaaaactgctttatccgattgg
- tcgtggaacagggctctgaaaactgctttgtccgattgg
38.
- Responders
- -c-a-g--c--------t----a------a----c--a-
- -c-a-g--c--------t----a------c----c--g-
- -a-a-c--c--------a----a------c----c--a-
- Non-responders
- -a-t-a--c--------g----c------a----a--c-
- -a-t-g--t--------g----c------a----a--g-
- -c-t-g--c--------g----c------g----a--g-
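The masked view above keeps only the variable columns of the six sequences. A short sketch of how those columns could be found, and how to pick out the ones whose alleles separate the two groups completely. The function names are illustrative, and with three individuals per group the result is a teaching example, not a statistically meaningful association:

```python
responders = [
    "tcgaggaacagggctcttaaaaatgctttatccgcttag",
    "tcgaggaacagggctcttaaaaatgctttctccgcttgg",
    "tagagcaacagggctctaaaaaatgctttctccgcttag",
]
non_responders = [
    "tagtgaaacagggctctgaaaactgctttatccgattcg",
    "tagtggaatagggctctgaaaactgctttatccgattgg",
    "tcgtggaacagggctctgaaaactgctttgtccgattgg",
]


def variable_positions(sequences):
    """0-based positions where the aligned sequences are not all identical."""
    return [i for i, col in enumerate(zip(*sequences)) if len(set(col)) > 1]


def discriminating_positions(group_a, group_b):
    """Positions where no allele is shared between the two groups."""
    columns = zip(zip(*group_a), zip(*group_b))
    return [i for i, (col_a, col_b) in enumerate(columns)
            if set(col_a).isdisjoint(col_b)]


variable = variable_positions(responders + non_responders)
discriminating = discriminating_positions(responders, non_responders)
print(variable)        # [1, 3, 5, 8, 17, 22, 29, 34, 37]
print(discriminating)  # [3, 17, 22, 34]
```

Of the nine variable positions, only four carry alleles that never occur in both groups, which illustrates why a combination of polymorphisms can be more informative than any single SNP.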
39. SNPFinder
- SNPFinder at the Cancer Genome Anatomy Project lets you search for SNPs (single-nucleotide polymorphisms). You may upload your ABI or SCF-format chromatograms, which are basecalled, assembled and searched for SNPs. Sequences may optionally be assembled with UniGene data of your choosing. An account is required to use SNPFinder; it is free for academic users.
40. Other Sites for Viewing SNPs
- UCSC Browser (http://genome.ucsc.edu)
- SNP Consortium (http://snp.cshl.org/)
- NCBI's dbSNP (http://www.ncbi.nlm.nih.gov/SNP/)
41. Variation Data at NCBI
- Bioinformatics infrastructure
- Databases of sequence variation
- Haplotypes and variations as genome annotation
- Functional variants in genetic disease / pharmacogenomics
- Evolutionary Biology
- Forensic Biology
  - mass casualty analysis
42. dbSNP - variation and polymorphism
Variant position: A/G
43. Roles for variations in genome analysis
- Physical Mapping: enriched marker set
- Population Structure: haplotype analysis, evolutionary studies
- Association Studies: population stratification, metabolic pathways
- Functional Analysis: pharmacogenomics, protein structure
44. Variations mapped onto protein structures
3,868 variations (6% of coding region variations) have been mapped to a 3D structure
47. Haplotypes in MapViewer
48. Frequency data by dbSNP population classes