Polymorphism - PowerPoint PPT Presentation

Slides: 31
Provided by: hat89
1
Polymorphism
  • Haixu Tang
  • School of Informatics

2
Genome variations
underlie phenotypic differences
3
Restriction fragment length polymorphism (RFLP)
4
RFLP
Haplotype
5
Microsatellite (short tandem repeat)
polymorphism
7 repeats
8 repeats
the repeat region is variable between samples
while the flanking regions where PCR primers bind
are constant
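A minimal sketch of how a repeat count could be read off a sequenced PCR product, assuming the repeat region lies exactly between two constant flanks. The flank sequences, the repeat unit "CA", and the function name `count_repeats` are hypothetical and chosen only for illustration.

```python
def count_repeats(read, left_flank, right_flank, unit):
    """Count tandem copies of `unit` between two constant flanking sequences.

    Returns None if a flank is missing or the region between the flanks
    is not a perfect tandem array of `unit`.
    """
    start = read.find(left_flank)
    if start == -1 or not unit:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    region = read[start:end]
    count, remainder = divmod(len(region), len(unit))
    if remainder != 0 or region != unit * count:
        return None
    return count


# Hypothetical flanks and repeat unit (not taken from the slides).
LEFT, RIGHT, UNIT = "GGATCC", "TTCGAA", "CA"
allele_7 = LEFT + UNIT * 7 + RIGHT
allele_8 = LEFT + UNIT * 8 + RIGHT

print(count_repeats(allele_7, LEFT, RIGHT, UNIT))  # 7
print(count_repeats(allele_8, LEFT, RIGHT, UNIT))  # 8
```

Two samples with the same flanks but different counts, as in the 7-repeat and 8-repeat alleles above, would yield PCR products of different lengths.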
6
Which suspect, A or B, cannot be excluded
as a potential perpetrator of this assault?
7
Single nucleotide polymorphism
  • The densest class of polymorphism
  • A SNP is defined as a single base change in a DNA
    sequence that occurs in a significant proportion
    (more than 1 percent) of a large population.

8
Some Facts
  • In humans, 99.9 percent of bases are the same.
  • The remaining 0.1 percent makes a person unique.
  • Different attributes / characteristics / traits
    • how a person looks,
    • diseases he or she develops.
  • These variations can be
    • Harmless (change in phenotype)
    • Harmful (diabetes, cancer, heart disease,
      Huntington's disease, and hemophilia)
    • Latent (variations in coding and regulatory
      regions that are not harmful on their own; the
      change in each gene only becomes apparent under
      certain conditions, e.g. susceptibility to lung
      cancer)

9
SNP facts
  • SNPs are found in
    • coding and (mostly) noncoding regions.
  • They occur with a very high frequency
    • estimates range from about 1 in 1,000 bases to
      1 in 100-300 bases.
  • The abundance of SNPs and the ease with which
    they can be measured make these genetic
    variations significant.
  • SNPs close to a particular gene can act as a
    marker for that gene.

10
SNP maps
  • Sequence genomes of a large number of people
  • Compare the base sequences to discover SNPs.
  • Generate a single map of the human genome
    containing all possible SNPs -> SNP maps
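The comparison step can be sketched as follows, assuming the sequences are already aligned and of equal length. The function name `find_snps` and the toy sequences are hypothetical. The 1 percent cutoff follows the SNP definition given earlier in the deck.

```python
from collections import Counter


def find_snps(aligned, min_maf=0.01):
    """Return (position, minor_allele_frequency) for each variable column.

    `aligned` is a list of equal-length sequences. A column is reported
    when its second most common base exceeds `min_maf`.
    """
    n = len(aligned)
    snps = []
    for pos, column in enumerate(zip(*aligned)):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # every sample carries the same base here
        (_, _major), (_, minor) = counts.most_common(2)
        maf = minor / n
        if maf > min_maf:
            snps.append((pos, maf))
    return snps


# Toy alignment (hypothetical): one sample differs at position 2.
seqs = ["ACGTAC", "ACGTAC", "ACTTAC", "ACGTAC"]
print(find_snps(seqs))  # [(2, 0.25)]
```

With only four sequences any single difference clears the 1 percent threshold, which is why real SNP discovery needs a large sample and some way to separate sequencing errors from true variation.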

11
How do we find sequence variations?
12
Automated polymorphism discovery
Marth et al. Nature Genetics 1999
13
Large SNP mining projects
14
How to use markers to find disease?
genome-wide, dense SNP marker map
  • genotyping using millions of markers
    simultaneously for an association study
  • depends on the patterns of allelic association
    in the human genome

15
Allelic association
  • allelic association is the non-random assortment
    of alleles, i.e. it measures how well knowledge
    of the allele state at one site permits
    prediction at another

[Figure labels: functional site, marker site]
  • significant allelic association between a marker
    and a functional site permits localization
    (mapping) even without having the functional site
    in our collection
  • by necessity, the strength of allelic
    association is measured between markers

16
Linkage disequilibrium
  • LD measures the deviation from random assortment
    of the alleles at a pair of polymorphic sites;
    the basic measure is D = p(AB) - p(A) p(B)
  • other measures of LD are derived from D, e.g. by
    normalizing according to allele frequencies (r²)
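A small sketch of the two measures, assuming phased two-site haplotypes are available. It uses the standard definitions D = p(AB) - p(A) p(B) and r² = D² / (p(A)(1 - p(A)) p(B)(1 - p(B))). The function name `ld_stats` and the sample data are hypothetical.

```python
def ld_stats(haplotypes):
    """Return (D, r2) for a list of two-character haplotypes.

    Each string holds the allele at site 1 followed by the allele at
    site 2. The alleles of the first haplotype serve as reference "A"
    and "B". r2 is NaN if either site is monomorphic.
    """
    n = len(haplotypes)
    a, b = haplotypes[0][0], haplotypes[0][1]
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h == a + b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Perfect association: the allele at site 1 predicts site 2 exactly.
linked = ["AG", "AG", "CT", "CT"]
# Random assortment: all four combinations equally frequent.
unlinked = ["AG", "AT", "CG", "CT"]

print(ld_stats(linked))    # (0.25, 1.0)
print(ld_stats(unlinked))  # (0.0, 0.0)
```

The sign of D depends on which alleles are taken as reference, while r² does not, which is one reason r² is convenient for comparing marker pairs.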

17
Haplotype diversity
  • the most useful multi-marker measures of
    associations are related to haplotype diversity

n biallelic markers: 2^n possible haplotypes under
random assortment of alleles at different sites
18
Haplotype blocks
Daly et al. Nature Genetics 2001
  • experimental evidence for reduced haplotype
    diversity (mainly in European samples)

19
The promise for medical genetics
  • within blocks a small number of SNPs are
    sufficient to distinguish the few common
    haplotypes -> significant marker reduction is
    possible

CACTACCGA CACGACTAT TTGGCGTAT
20
The HapMap initiative
  • goal: to map out human allele and association
    structure at the kilobase scale
  • deliverables: a set of physical and
    informational reagents

21
Haplotyping
  • the problem: the substrate for genotyping is
    diploid genomic DNA, so phasing of alleles at
    multiple loci is in general not possible with
    certainty
  • experimental methods of haplotype determination
    (single-chromosome isolation followed by
    whole-genome PCR amplification, radiation
    hybrids, somatic cell hybrids) are expensive and
    laborious

22
An example of haplotyping
  • Mother GG AT CA TT
  • Father CC AA AC CT
  • Child GC AA CC CT
  • Child GC AT AA TT
  • Child GC AA AC CT

23
Haplotypes
              a          b
Mother  I     G A C T    G T A T
        II    G T C T    G A A T
Father  I     C A A C    C A C T
        II    C A A T    C A C C

24
An example of haplotyping
  • Mother GG AT CA TT
  • Father CC AA AC CT
  • Child GC AA CC CT (M-Ia F-IIb)
  • Child GC AT AA TT (M-Ib F-IIa)
  • Child GC AA AC CT (M-Ia F-Ia or M-IIb F-IIb) ?
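The assignments above can be reproduced by brute force. The sketch below assumes four biallelic loci, no recombination and no genotyping error. It enumerates every phasing of each parent and reports which maternal and paternal haplotypes could combine into each child's genotype. The function names are hypothetical.

```python
from itertools import product


def phasings(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype."""
    pairs = set()
    for choice in product(*[(g, g[::-1]) for g in genotype]):
        hap1 = "".join(c[0] for c in choice)
        hap2 = "".join(c[1] for c in choice)
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


def matches(child, mat_hap, pat_hap):
    """True if the two haplotypes together give the child's genotype."""
    return all(sorted(g) == sorted(m + p)
               for g, m, p in zip(child, mat_hap, pat_hap))


def explaining_pairs(child, mother, father):
    """All (maternal, paternal) haplotype pairs that explain the child."""
    found = set()
    for m_pair, f_pair in product(phasings(mother), phasings(father)):
        for m_hap, f_hap in product(m_pair, f_pair):
            if matches(child, m_hap, f_hap):
                found.add((m_hap, f_hap))
    return sorted(found)


mother = ["GG", "AT", "CA", "TT"]
father = ["CC", "AA", "AC", "CT"]
children = [["GC", "AA", "CC", "CT"],
            ["GC", "AT", "AA", "TT"],
            ["GC", "AA", "AC", "CT"]]

for child in children:
    print(" ".join(child), explaining_pairs(child, mother, father))
```

For the first two children exactly one pair is found (GACT with CACC, and GTAT with CAAT), matching M-Ia F-IIb and M-Ib F-IIa. The third child has two candidate pairs, and each needs one parent to carry a different phasing from the one the first two children imply, which appears to be what the question mark flags.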

25
HapMap Project
A freely available public resource to increase
the power and efficiency of genetic association
studies of medical traits
  • High-density SNP genotyping across the genome
    provides information about
  • SNP validation, frequency, assay conditions
  • correlation structure of alleles in the genome

All data is freely available on the web for
application in study design and analyses as
researchers see fit
26
HapMap Samples
  • 90 Yoruba individuals (30 parent-parent-offspring
    trios) from Ibadan, Nigeria (YRI)
  • 90 individuals (30 trios) of European descent
    from Utah (CEU)
  • 45 Han Chinese individuals from Beijing (CHB)
  • 45 Japanese individuals from Tokyo (JPT)

27
HapMap progress
  • PHASE I: completed, described in Nature paper;
    1,000,000 SNPs successfully typed in all
    270 HapMap samples
  • PHASE II: data generation complete, data
    released; >3,500,000 SNPs typed in total
28
ENCODE-HAPMAP variation project
  • Ten typical 500kb regions
  • 48 samples sequenced
  • All discovered SNPs (and any others in dbSNP)
    typed in all 270 HapMap samples
  • Current data set: 1 SNP every 279 bp

A much more complete variation resource by
which the genome-wide map can be evaluated
29
Tagging from HapMap
  • Since HapMap describes the majority of common
    variation in the genome, choosing non-redundant
    sets of SNPs from HapMap offers considerable
    efficiency without power loss in association
    studies

30
Pairwise tagging
Tags: SNP 1, SNP 3, SNP 6 (3 in total)
Test for association: SNP 1, SNP 3, SNP 6
After Carlson et al. (2004) AJHG 74:106
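One way to pick such tags is a greedy pass over a pairwise r² matrix. This is a simplified sketch in the spirit of pairwise tagging, not a reproduction of the published algorithm. The 0.8 threshold, the function name `greedy_tags`, and the six-SNP grouping are assumptions made for illustration.

```python
def greedy_tags(r2, threshold=0.8):
    """Choose tag SNPs so every SNP has r2 >= threshold with some tag.

    `r2` is a symmetric matrix (list of lists) with 1.0 on the diagonal.
    At each step the untagged SNP covering the most untagged SNPs is
    chosen, and ties go to the lowest index.
    """
    untagged = set(range(len(r2)))
    tags = []
    while untagged:
        def covered_by(i):
            return {j for j in untagged if r2[i][j] >= threshold}

        best = max(sorted(untagged), key=lambda i: len(covered_by(i)))
        tags.append(best)
        untagged -= covered_by(best)
    return sorted(tags)


# Hypothetical LD groups for six SNPs: {1, 2}, {3, 4, 5}, {6}.
groups = [0, 0, 1, 1, 1, 2]
r2 = [[1.0 if gi == gj else 0.1 for gj in groups] for gi in groups]

tags = greedy_tags(r2)
print(["SNP %d" % (i + 1) for i in tags])  # ['SNP 1', 'SNP 3', 'SNP 6']
```

Only the three tags would then be genotyped and tested for association, with the remaining SNPs represented through their correlation with a tag.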