Title: Genome Diversity
1 Genome Diversity
2 Overview
- Mutation and Alleles
- linkage
- genetic variation in populations
- SNPs as genetic markers
- classical genetic diseases
- multi-factorial diseases and risk factors
- Genome scans (genotyping)
- Pharmacogenomics
3 A review of some basic genetics
4 Alleles
- An allele is a particular DNA sequence for a gene
- Some gene alleles are responsible for ordinary phenotypes like blue/brown eyes.
- Others lead to classic genetic diseases like cystic fibrosis or Huntington's disease.
5 Changes occur in DNA sequences: mutations
6 Many Causes of Mutations
- Somatic vs. reproductive cells
- Radiation and/or chemical damage to DNA
- Random errors of the replication machinery
- Normal biological processes (e.g. methylation)
7 Mutations create Alleles
- Mutations occur randomly throughout the DNA
- Most have no phenotypic effect (non-coding regions, equivalent codons, similar amino acids)
- Some damage the function of a protein or regulatory element
- A very few provide an evolutionary advantage
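The "equivalent codons" point can be illustrated with a small sketch that classifies a single-codon change using the standard genetic code. The function name and the example codons are illustrative choices, not part of the original slides.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Label a codon substitution as synonymous, nonsense, or missense.

    Simplification: a stop codon mutating to an amino acid is also
    reported as "missense" here.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"  # equivalent codons: no change in the protein
    if alt_aa == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # different amino acid


print(classify_codon_change("GAA", "GAG"))  # both encode glutamate
print(classify_codon_change("GAG", "GTG"))  # glutamate to valine
print(classify_codon_change("TGG", "TGA"))  # tryptophan to stop
```

Whether a missense change is harmless ("similar AAs") depends on the chemistry of the two amino acids, which this sketch does not attempt to score.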
8 Population Genetics
- Chromosome pairs segregate and recombine in every generation.
- Every allele of every gene has its own independent evolutionary history (and future!)
- Frequencies of various alleles differ in different sub-populations of people.
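As a rough illustration of allele frequencies differing between sub-populations, the sketch below counts alleles from genotype tallies. The genotype counts for the two populations are invented for the example.

```python
def allele_frequencies(genotype_counts: dict) -> dict:
    """Compute allele frequencies from diploid genotype counts.

    Keys are two-letter genotypes such as "AA", "AG", "GG";
    values are the number of people carrying that genotype.
    """
    allele_counts = {}
    for genotype, n_people in genotype_counts.items():
        for allele in genotype:  # each person contributes two alleles
            allele_counts[allele] = allele_counts.get(allele, 0) + n_people
    total = sum(allele_counts.values())
    return {allele: count / total for allele, count in allele_counts.items()}


# Hypothetical genotype counts for the same SNP in two sub-populations.
population_1 = {"AA": 49, "AG": 42, "GG": 9}
population_2 = {"AA": 9, "AG": 42, "GG": 49}

freq_1 = allele_frequencies(population_1)
freq_2 = allele_frequencies(population_2)
print(freq_1)
print(freq_2)
```

In this made-up data the A allele is common in the first group and rare in the second, which is the kind of difference the slide refers to.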
9 Human Alleles
- The OMIM (Online Mendelian Inheritance in Man) database at the NCBI tracks all human mutations with known phenotypes.
- It contains a total of about 2,000 genetic diseases and another 11,000 genetic loci with known phenotypes, but not necessarily known gene sequences
- It is designed for use by physicians
- can search by disease name
- contains summaries from clinical studies
11 Variation Makes Life Interesting
- The Human Genome has been sequenced, what's next?
- Much of what makes us unique individuals is represented by the differences in our DNA sequence from other people.
- There are rare and common forms (alleles) of every gene
- Probably only 3-4 alleles are present in 95% of the population for most genes, but there are lots of rare mutations
12 SNPs are Mutations
13 SNPs
- A mutation that causes a single base change is known as a Single Nucleotide Polymorphism (SNP)
- Other kinds of mutations include insertions and deletions
- Large breaks and rearrangement of chromosomes also occur (translocations)
GATTTAGATCGCGATAGAG
GATTTAGATCTCGATAGAG
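The two example sequences on this slide differ at a single base. A minimal sketch of how such a difference could be located is shown below; the function name is an illustrative choice, and the comparison assumes the two sequences are already aligned and of equal length.

```python
def find_single_base_differences(seq_a: str, seq_b: str) -> list:
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (index + 1, base_a, base_b)
        for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b))
        if base_a != base_b
    ]


# The two sequences shown on the slide.
differences = find_single_base_differences(
    "GATTTAGATCGCGATAGAG",
    "GATTTAGATCTCGATAGAG",
)
print(differences)  # [(11, 'G', 'T')]
```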
14 SNPs are Very Common
- SNPs are very common in the human population.
- Between any two people, there is an average of one SNP every 1250 bases.
- Most of these have no phenotypic effect
- Only <1% of all human SNPs impact protein function (most lie in non-coding regions)
- Selection acts against mis-sense mutations (think about what would happen to dominant lethal mutations)
- Some are alleles of genes.
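To get a feel for these numbers, here is a back-of-the-envelope calculation. The genome size of about 3.2 billion bases is an assumption brought in for the example; it is not stated on the slide.

```python
# Assumption (not from the slides): haploid human genome of ~3.2 billion bases.
GENOME_SIZE_BP = 3_200_000_000
BASES_PER_SNP = 1250          # from the slide: one SNP every 1250 bases
MAX_FUNCTIONAL_FRACTION = 0.01  # from the slide: fewer than 1% affect protein function

expected_differences = GENOME_SIZE_BP // BASES_PER_SNP
functional_upper_bound = int(expected_differences * MAX_FUNCTIONAL_FRACTION)

print(f"Expected single-base differences between two people: {expected_differences:,}")
print(f"Of these, fewer than about {functional_upper_bound:,} would affect protein function")
```

Under that assumption, two people differ at roughly 2.56 million positions, and only a small minority of those would change a protein.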
15 Genome Sequencing finds SNPs
- The Human Genome Project involves sequencing DNA cloned from a number of different people.
- The Celera sequence comes from 5 people
- Even in a library made from one person's DNA, the homologous chromosomes have SNPs
- This inevitably leads to the discovery of SNPs (any single base sequence difference)
- These SNPs can be valuable as the basis for diagnostic tests
17 The SNP Consortium is an unlikely alliance of
pharmaceutical and computer companies managed by
Lincoln Stein at Cold Spring Harbor Lab.
The SNP Consortium Ltd. is a non-profit
foundation organized for the purpose of
providing public genomic data. Its mission is to
develop up to 300,000 SNPs distributed evenly
throughout the human genome and to make the
information related to these SNPs available to
the public without intellectual property
restrictions. The project started in April 1999
and is anticipated to continue until the end of
2002.
The current release (Oct 2002) consists of 1.8
million SNPs, all of which have been anchored to
the human genome by "in silico" mapping to the
genomic working draft (UCSC Golden Path).
18 We describe a map of 1.42 million single
nucleotide polymorphisms (SNPs) distributed
throughout the human genome, providing an average
density on available sequence of one SNP every
1.9 kilobases. These SNPs were primarily
discovered by two projects: The SNP Consortium
and the analysis of clone overlaps by the
International Human Genome Sequencing Consortium.
The map integrates all publicly available SNPs
with described genes and other genomic features.
We estimate that 60,000 SNPs fall within exon
(coding and untranslated regions), and 85% of
exons are within 5 kb of the nearest SNP.
Nucleotide diversity varies greatly across the
genome, in a manner broadly consistent with a
standard population genetic model of human
history. This high-density SNP map provides a
public resource for defining haplotype variation
across the genome, and should help to identify
biomedically important genes for diagnosis and
therapy.
19 Search for SNPs in your gene
- An average density of one SNP every 1.9 kilobases
- But that does not guarantee a SNP in your favorite gene!
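One way to see why an average density does not guarantee a SNP in any particular gene is a simple Poisson estimate. This sketch assumes mapped SNPs are scattered uniformly at random, which is a simplification (SNP density is known to vary across the genome), and the region lengths are arbitrary examples.

```python
import math


def prob_no_snp(region_length_bp: float, mean_spacing_bp: float = 1900.0) -> float:
    """Probability that a region contains no mapped SNP.

    Assumes SNPs fall as a Poisson process with one SNP per
    `mean_spacing_bp` bases on average (a simplifying assumption).
    """
    expected_snps = region_length_bp / mean_spacing_bp
    return math.exp(-expected_snps)


# Arbitrary example region lengths, in bases.
for length in (1_900, 5_000, 20_000):
    print(f"{length:>6} bp region: P(no SNP) = {prob_no_snp(length):.3f}")
```

Under this model a region exactly 1.9 kb long still has about a 37% chance (e^-1) of containing no mapped SNP at all.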
20 GenBank has a dbSNP
- As of Apr 19, 2001, dbSNP has submissions for 2,842,021 SNPs
- It is possible to search dbSNP by BLAST comparisons to a target sequence
21
>gnl|dbSNP|rs1042574_allelePos=51 total len = 101 |taxid = 9606|snpClass = 1
Length = 101
Score = 149 bits (75), Expect = 3e-33
Identities = 79/81 (97%), Strand = Plus / Plus
Query 1489 ccctcttccctgacctcccaactctaaagccaagcactttatatttttctcttagatatt 1548
Sbjct 1    ccctcttccctgacctcccaactctaaagccaagcactttatattttcctyttagatatt 60
Query 1549 cactaaggacttaaaataaaa 1569
Sbjct 61   cactaaggacttaaaataaaa 81
If a matching SNP is found, then it can be directly located on the Genome map
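The alignment above can be inspected programmatically to see where the query and the dbSNP entry disagree. The sketch below is a hand-rolled comparison, not part of BLAST; the sequences are copied from the alignment, and the table of IUPAC ambiguity codes (for example "y" meaning C or T) is standard nucleotide notation.

```python
# IUPAC nucleotide codes: each symbol maps to the bases it can stand for.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def compare_alignment(query: str, subject: str, query_start: int = 1) -> list:
    """List columns where query and subject differ.

    Each entry is (subject position, query position, query base,
    subject symbol, kind), where kind is "ambiguity code" when the
    subject symbol is an IUPAC code that includes the query base.
    """
    report = []
    for offset, (q, s) in enumerate(zip(query, subject)):
        if q == s:
            continue
        is_ambiguous = len(IUPAC[s]) > 1 and q in IUPAC[s]
        kind = "ambiguity code" if is_ambiguous else "mismatch"
        report.append((offset + 1, query_start + offset, q, s, kind))
    return report


# Sequences copied from the BLAST alignment on this slide.
query = (
    "ccctcttccctgacctcccaactctaaagccaagcactttatatttttctcttagatatt"
    "cactaaggacttaaaataaaa"
)
subject = (
    "ccctcttccctgacctcccaactctaaagccaagcactttatattttcctyttagatatt"
    "cactaaggacttaaaataaaa"
)

for row in compare_alignment(query, subject, query_start=1489):
    print(row)
```

The two differing columns account for the 79/81 identities. The "y" sits at subject position 51, matching the allelePos=51 in the header, and corresponds to query position 1539.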
23 Uses for SNPs
- Diagnostic tests for disease alleles
- Markers to aid in cloning of interesting genes (disease genes)
- Pharmacogenomics: genetics of response to drugs (effectiveness and side effects)
24 DNA Diagnostic Testing
- hereditary diseases: potential parents, pre-natal, late onset diseases
- genes that predispose to disease (risk factors)
- genotyping of infectious agents (bacterial, viral)
- forensics: using DNA testing to establish identity
25 Clinical Manifestations of Genetic Variation
- (All disease has a genetic component)
- Susceptibility vs. resistance
- Variations in disease severity or symptoms
- Reaction to drugs (pharmacogenetics)
- Variable disease course and prognosis
- SNPs can be found that are linked to all of these
traits!
26 Finding Disease Genes
- Virtually all diseases have a genetic component
- Start with DNA samples from families that show inheritance of the disease
- Use STS markers to map the gene or genes involved (linkage analysis)
- Find SNPs in the genetic region(s) that are likely candidates for involvement in that disease
- Get the gene from a genomic sub-clone
27 Some Diseases Involve Many Genes
- There are a number of classic genetic diseases caused by mutations of a single gene
- Huntington's, Cystic Fibrosis, Tay-Sachs, PKU, etc.
- There are also many diseases that are the result of the interactions of many genes
- asthma, heart disease, cancer
- Each of these genes may be considered to be a risk factor for the disease.
- Groups of genetic markers (SNPs) may be associated with a disease without determining a mechanism.
28 Multiple Causes
- Some diseases may actually be caused by any of a group of different genes (multiple causes), but all show the same symptoms.
- SNP linkage analysis can identify these sub-populations more efficiently than classical molecular genetic approaches.
- machine learning, genetic algorithms, SVMs
29 Pharmacogenomics
- The use of DNA sequence information to measure and predict the reaction of individuals to drugs.
- Personalized drugs
- Faster clinical trials
- Fewer drug side effects
30 Some Gene Products Interact with Drugs
- There are proteins that chemically activate or inactivate drugs.
- Other proteins can directly enhance or block a drug's activity.
- There are also genes that control side effects
31 Collect Drug Response Data
- These drug response phenotypes are associated with a set of specific gene alleles.
- Identify populations of people who show specific responses to a drug.
- In early clinical trials, it is possible to identify people who react well and react poorly.
32 Make Genetic Profiles
- Scan these populations with a large number of SNP markers.
- Find markers linked to drug response phenotypes.
- It is interesting, but not necessary, to identify the exact genes involved.
- Can work with associated populations; does not require detailed information on disease in family history (pedigree).
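A marker scan of this kind might be sketched as a simple association test per SNP: compare allele counts in good responders against poor responders. The example below uses a Pearson chi-square test on a 2x2 table as one possible statistic; the marker names and counts are invented, and a real scan over many markers would also need a multiple-testing correction.

```python
import math


def allele_association(resp_alt: int, resp_ref: int,
                       nonresp_alt: int, nonresp_ref: int) -> tuple:
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Rows: responders / non-responders. Columns: alternate / reference allele.
    Returns (chi-square statistic, p-value).
    """
    a, b, c, d = resp_alt, resp_ref, nonresp_alt, nonresp_ref
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts:
# (responder alt, responder ref, non-responder alt, non-responder ref)
markers = {
    "marker_1": (30, 10, 10, 30),
    "marker_2": (20, 20, 20, 20),
}

results = {name: allele_association(*counts) for name, counts in markers.items()}
linked = [name for name, (chi2, p) in results.items() if p < 0.001]

for name, (chi2, p) in results.items():
    print(f"{name}: chi2 = {chi2:.2f}, p = {p:.2g}")
print("Markers associated with response:", linked)
```

Note that nothing in this test needs to know which gene the marker sits in, or any pedigree information, which matches the last two bullets.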
33 Huge Database Problem
- Physicians collect tons of data
- patient age, sex, weight, blood pressure, family disease history, date of symptom onset
- Cancer data: tumor size, location, stage, etc.
- Data specific to each type of disease
- Now integrate thousands (or hundreds of thousands) of SNPs that are correlated with some of these clinical factors in complex relationships
34 Use the Profiles
- Genetic profiles of new patients can then be used to prescribe drugs more effectively and avoid adverse reactions.
- Can also speed clinical trials by testing on those who are likely to respond well.
- Can "rescue" drugs that don't work well on everybody, or that have bad side effects on a few.
35 Microarrays
- Screening large numbers of SNP markers on a sample of genomic DNA is one highly promising application for microarray technology.
- Many other high-throughput SNP genotyping technologies are under development.
- Affymetrix 10K SNP product on sale now!
- Working on 120K SNPs to be released soon
36 Preliminary data from Affy 10K SNP
37 The GeneChip Mapping 10K Array and Assay Set offer the ability to generate over 10,000 genotypes on a single array using an innovative assay that eliminates the need for locus-specific PCR. This assay requires only 250 ng of DNA for each sample, saving precious resources. The major benefits of the Mapping 10K are:
- More genetic power and resolution, with an average of one SNP every 210 kb on the genome
- Innovative assay that uses a single PCR primer to genotype more than 10,000 SNPs
- Requires only 250 ng of genomic DNA per sample
- Automated genotype calling with more than 99.6% accuracy on a proven platform
- Extensive SNP annotation in the NetAffx Analysis Center
38 DATABASE!!!
- Thousands of scientists are going to start screening these 10K SNPs against various populations of patients
- If we can capture the data in a sensible structure
- think of the possible complex correlations
- an endless mine of medical/genetic information
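What a "sensible structure" could look like is open; one possibility is a relational layout with one table of patients and one long table of genotypes (one row per patient per SNP). The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table names, columns, SNP identifier, and patient rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: clinical data in one table, genotypes in a long table.
conn.executescript("""
CREATE TABLE patient (
    patient_id    INTEGER PRIMARY KEY,
    age           INTEGER,
    sex           TEXT,
    drug_response TEXT
);
CREATE TABLE genotype (
    patient_id INTEGER REFERENCES patient(patient_id),
    snp_id     TEXT,
    genotype   TEXT,
    PRIMARY KEY (patient_id, snp_id)
);
""")

# Invented example rows.
conn.executemany(
    "INSERT INTO patient VALUES (?, ?, ?, ?)",
    [(1, 54, "F", "good"), (2, 61, "M", "poor"), (3, 47, "F", "good")],
)
conn.executemany(
    "INSERT INTO genotype VALUES (?, ?, ?)",
    [(1, "snp_0001", "CC"), (2, "snp_0001", "CT"), (3, "snp_0001", "CC")],
)

# Cross-tabulate genotype at one SNP against drug response.
rows = conn.execute(
    """
    SELECT g.genotype, p.drug_response, COUNT(*)
    FROM genotype AS g
    JOIN patient AS p USING (patient_id)
    WHERE g.snp_id = ?
    GROUP BY g.genotype, p.drug_response
    ORDER BY g.genotype, p.drug_response
    """,
    ("snp_0001",),
).fetchall()

for genotype, response, count in rows:
    print(genotype, response, count)
```

The long genotype table scales to 10K or more SNPs per patient without changing the schema, and the same join pattern works for any clinical column.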
39 Real World Applications
- Most of the major pharmaceutical companies are currently collecting pharmacogenomic data in their clinical trials.
- Data is yet to be published.
- Genetic indications for drug use are still a couple of years away.
- Plan to sell the drug with the gene test
40 Multi-locus SNP Profiles
- There will be a few hundred to a few thousand SNPs linked to medically important alleles in the next 10 years.
- Haplotypes will reduce the number that need to be screened (one SNP gives information about a group of linked genes)
- Some genes will turn out to be involved in many important pathways
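The idea that one SNP can stand in for its linked neighbours is usually quantified with linkage disequilibrium. As a hedged illustration, the sketch below computes the common r-squared statistic between two SNPs from phased haplotypes; the haplotype data are invented, and r-squared is one standard measure rather than anything prescribed by the slides.

```python
def r_squared(haplotypes: list, allele_a: str, allele_b: str) -> float:
    """Linkage disequilibrium (r^2) between two SNPs.

    `haplotypes` is a list of two-character strings: the allele at the
    first SNP followed by the allele at the second SNP on one chromosome.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b  # deviation from independence
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denominator


# Hypothetical phased haplotypes.
perfectly_linked = ["AG"] * 6 + ["CT"] * 4   # A always travels with G
independent = ["AG", "AT", "CG", "CT"]       # all combinations equally common

print(r_squared(perfectly_linked, "A", "G"))
print(r_squared(independent, "A", "G"))
```

When r-squared is close to 1, genotyping either SNP tells you the other, so only one of the pair needs to be screened; when it is close to 0, both carry separate information.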
41 Will People Want This Information??
- Genetic determinism and possible discrimination.
- Even a simple test to see what drug you should
take could reveal information about your risk of
cancer or heart disease.