Single Nucleotide Polymorphisms - PowerPoint PPT Presentation

About This Presentation
Title:

Single Nucleotide Polymorphisms

Slides: 24
Provided by: lyo91
Transcript and Presenter's Notes

Title: Single Nucleotide Polymorphisms


1
Single Nucleotide Polymorphisms
  • Jennifer Lyon
  • Eskind Biomedical Library
  • May 1, 2009
  • CRC Workshop Series

2
Types of Genetic Variations
  • Single Nucleotide Polymorphisms (SNP)
  • Single base pair changes
  • GTCATTCGATT
  • GTCAGTCGATT
  • Indels
  • Small insertion/deletions
  • CTT---GATC
  • CTTACGGATC
  • Small variable repeats (microsatellites)
  • ACGACGACGACGACGACG (6 copies)
  • ACGACGACGACGACGACGACG (7 copies)
  • Variable long tandem repeats (can be dozens to
    hundreds to thousands)
  • Chromosomal aberrations: translocations,
    inversions, etc.
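
The two simplest variation types on this slide can be checked by direct string comparison. The following is a minimal sketch, not a variant caller: the function names are made up for illustration, and it assumes the two sequences are already aligned and equal in length (indels would need an alignment step first).

```python
def find_substitutions(ref: str, alt: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, ref base, alt base) for every mismatch."""
    if len(ref) != len(alt):
        raise ValueError("sequences must be the same length; indels need alignment")
    return [(i, r, a) for i, (r, a) in enumerate(zip(ref, alt), start=1) if r != a]


def count_repeat_units(seq: str, unit: str) -> int:
    """Count how many times `unit` repeats back-to-back from the start of `seq`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    count = 0
    while seq.startswith(unit, count * len(unit)):
        count += 1
    return count


# The SNP example from the slide: one base differs at position 5.
print(find_substitutions("GTCATTCGATT", "GTCAGTCGATT"))  # [(5, 'T', 'G')]

# The microsatellite example: the same ACG unit, 6 copies versus 7.
print(count_repeat_units("ACGACGACGACGACGACG", "ACG"))     # 6
print(count_repeat_units("ACGACGACGACGACGACGACG", "ACG"))  # 7
```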

3
Focusing on SNPs
  • Types of SNPs
  • SNP nomenclature
  • Resources for SNPs
  • Examples and Challenges in Finding SNPs
  • http://learn.genetics.utah.edu/content/health/pharma/snips/

4
SNP Types
  • SNPs can be categorized in a number of ways; the
    most common are by location and by function
    (relative to a gene)
  • Intragenic SNPs are often categorized by function:
    are they in a coding region, in an intron, in part of
    the mRNA, or outside the mRNA but still in the gene
    locus (e.g., in the promoter)?
  • Extragenic SNPs may be considered simply
    genomic or might be labeled relative to the
    nearest gene, i.e., 5' or 3' to a gene
  • An extragenic SNP may affect regulatory regions
    important in gene expression or other DNA
    functions such as DNA replication.

5
SNP Functional Categories
  • coding nonsynonymous
  • Missense, nonsense, frame shift
  • coding synonymous
  • Intronic
  • splice site
  • mRNA UTR
  • 5' UTR or 3' UTR
  • (gene) locus region (5' or 3' to the gene)
  • near gene usually means within 2000 bp of the gene
  • genomic/extragenic (distant from any gene)

6
Coding Nonsynonymous SNPs
  • Missense: changes an amino acid (aa)

http://www.ncbi.nlm.nih.gov/Class/NAWBIS/Modules/Variation/powerpoint/variation_files/frame.html

7
Coding Non-Synonymous SNPs
  • Nonsense
  • Changes an amino acid codon to a stop codon
  • Results in a shortened protein
  • Frame shift
  • Are really single-base indels
  • Dropping or adding one base throws the triplet
    reading frame out of register, altering all
    downstream amino acids and usually resulting in an
    earlier stop codon
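
The coding categories above can be illustrated by translating codons before and after a change. This is a sketch using the standard genetic code; the function names and the short test sequence are invented for illustration, and stop-loss changes are not treated as a separate category.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a single-codon change as synonymous, nonsense, or missense."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_substitution("CTT", "CTC"))  # synonymous (Leu -> Leu)
print(classify_substitution("ACC", "ATC"))  # missense   (Thr -> Ile)
print(classify_substitution("TGG", "TGA"))  # nonsense   (Trp -> stop)

# Frame shift: deleting one base changes every downstream codon.
original = "ATGGCCTGGAAATAA"            # hypothetical sequence
shifted = original[:3] + original[4:]   # drop the base after the ATG
print(translate(original))  # MAWK*
print(translate(shifted))   # MPGN
```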

8
SNP Nomenclature
  • The Human Genome Variation Society
    (http://www.hgvs.org/mutnomen/recs.html) has
    proposed some guidelines for SNP nomenclature,
    but at the moment, there is minimal consistency.
  • Different sources will refer to the same SNP in
    different ways
  • While dbSNP identifiers (rs12345678) are
    becoming common, they are not required of
    publishing authors and not used in all cases.

9
SNPs at Base-Pair Level
  • The base-pair change is given in various forms
  • A/C, T→G, C>T, 432G>C, T73C
  • The HGVS nomenclature recommendations
  • "c." for a coding DNA sequence (like c.76A>T)
  • "g." for a genomic sequence (like g.476A>T)
  • "m." for a mitochondrial sequence (like m.8993T>C)
  • "r." for an RNA sequence (like r.76a>u)
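
HGVS-style substitutions, written with ">" between the reference and variant base, follow a regular enough pattern to pull apart with a regular expression. The sketch below covers only the simple form shown on this slide (prefix, position, base>base, with an optional reference sequence before a colon). It is not a full HGVS parser: intronic offsets, ranges, and indels are out of scope, and the function name is an assumption for illustration.

```python
import re

# Optional "reference:" part, then prefix, numeric position, and base>base.
HGVS_SUBSTITUTION = re.compile(
    r"^(?:(?P<reference>[^:]+):)?"
    r"(?P<kind>[cgmr])\."
    r"(?P<position>\d+)"
    r"(?P<ref>[ACGTUacgtu])>(?P<alt>[ACGTUacgtu])$"
)

KIND_NAMES = {"c": "coding DNA", "g": "genomic", "m": "mitochondrial", "r": "RNA"}


def parse_substitution(text: str) -> dict:
    """Split a simple HGVS substitution into its named parts."""
    match = HGVS_SUBSTITUTION.match(text.strip())
    if match is None:
        raise ValueError(f"not a simple HGVS substitution: {text!r}")
    fields = match.groupdict()
    fields["position"] = int(fields["position"])
    fields["kind"] = KIND_NAMES[fields["kind"]]
    return fields


for example in ("c.76A>T", "g.476A>T", "m.8993T>C", "r.76a>u"):
    print(parse_substitution(example))
```

Older forms such as "T73C" or "A/C" carry no prefix and would be rejected here, which reflects the ambiguity the slide describes.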

10
Position, position, position!
  • The big issue with SNPs is identifying their
    location (numerically).
  • Position can be specified
  • Number location within a specific sequence
  • Relative to another genetic landmark
  • Start site for a coding region of a gene
  • Start or end of an exon or intron
  • Relative to a marker
  • Published articles are not always clear on
    this!!!
  • Different resources may use different
    landmarks/numbering
  • Numbering is always relative to the chosen
    sequence
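
To make the point about relative numbering concrete, here is a small sketch that converts a position counted from the start of the coding region into a position on a longer genomic sequence. The segment coordinates are hypothetical, and the sketch assumes a forward-strand gene whose coding segments are given in order with inclusive, 1-based coordinates.

```python
def coding_to_genomic(c_pos: int, coding_segments: list[tuple[int, int]]) -> int:
    """Map a 1-based coding position onto genomic coordinates (forward strand)."""
    if c_pos < 1:
        raise ValueError("coding positions start at 1")
    remaining = c_pos
    for start, end in coding_segments:
        length = end - start + 1
        if remaining <= length:
            return start + remaining - 1
        remaining -= length  # skip this segment and the intron after it
    raise ValueError("position lies beyond the end of the coding sequence")


# Hypothetical gene: two coding segments separated by an intron.
SEGMENTS = [(1001, 1100), (2001, 2150)]

print(coding_to_genomic(76, SEGMENTS))   # 1076: still in the first segment
print(coding_to_genomic(101, SEGMENTS))  # 2001: first base of the second segment
```

The same SNP is "76" in one numbering and "1076" in the other, so a position is only meaningful together with the sequence it was counted on.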

11
Coding SNPs
  • These are easier because they can be identified
    by the amino acid position rather than the
    base-pair position
  • The most common nomenclature uses either 3-letter or
    single-letter amino acid codes
  • Asn332Asp or A95V
  • The HGVS recommendation is similar
  • "p." for a protein sequence (like p.Lys76Asn)
  • Amino acid (protein) coding sequence positions are
    becoming more consistent, but are not yet
    fully consistent
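
Because the same amino acid change may appear in 3-letter or single-letter form, it helps to normalize both to one style before searching. This sketch handles only simple substitutions of the form shown on this slide; the function names are illustrative.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(variant: str) -> str:
    """Convert e.g. 'p.Lys76Asn' or 'THR164ILE' to 'K76N' or 'T164I'."""
    match = re.fullmatch(r"(?:p\.)?([A-Za-z]{3})(\d+)([A-Za-z]{3})", variant)
    if match is None:
        raise ValueError(f"not a 3-letter substitution: {variant!r}")
    ref, pos, alt = match.groups()
    return f"{THREE_TO_ONE[ref.capitalize()]}{pos}{THREE_TO_ONE[alt.capitalize()]}"


def to_three_letter(variant: str) -> str:
    """Convert e.g. 'A95V' or 'p.T164I' to 'Ala95Val' or 'Thr164Ile'."""
    match = re.fullmatch(r"(?:p\.)?([A-Z])(\d+)([A-Z])", variant)
    if match is None:
        raise ValueError(f"not a single-letter substitution: {variant!r}")
    ref, pos, alt = match.groups()
    return f"{ONE_TO_THREE[ref]}{pos}{ONE_TO_THREE[alt]}"


print(to_one_letter("Asn332Asp"))   # N332D
print(to_one_letter("p.Lys76Asn"))  # K76N
print(to_three_letter("A95V"))      # Ala95Val
```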

12
Database of SNPs (dbSNP)
  • dbSNP
  • is the international central repository for both
    single base nucleotide substitutions and short
    deletion and insertion polymorphisms
  • accepts data submissions from scientists
  • is integrated with NCBI's Entrez system

13
dbSNP Content
  • The SNP database has two major classes of
    content
  • Submitted data, i.e., original observations of
    sequence variation: Submitted SNPs (SS), identified by
    ss numbers (ss5586300)
  • Computed/curated data: Reference SNP Clusters
    (RefSNP), identified by rs numbers (rs4986582)

14
Reference SNP Clusters
  • Ref SNP clusters are computer-generated and
    curated by NCBI staff
  • Ref SNP Clusters define a non-redundant set of
    SNPs
  • All individual SNPs submitted by a researcher are
    given a submitter SNP number (ss number), and then
    redundant (repetitive) submitter SNPs are
    combined into a RefSNP cluster record with a
    unique rs number
  • Ref SNP clusters may contain multiple submitted
    SNPs

15
Searching dbSNP
  • dbSNP is searched like any other Entrez database
  • Specialized fields include:

Field                 Tag        Notes
Allele                Allele     Uses IUPAC codes for bases
Chromosomal Location  CHRPOS     Uses chromosomal base-pair locations
Contig Position       ctpos      Uses contig base-pair locations
Function Class        Func       Includes coding synonymous, missense, nonsense, intron, utr, etc.
SNP Class             SNP_Class  Includes snp, indel, mixed

16
SNP Limits Page

17
Creating a Complex Search
  • Retrieve all synonymous coding reference SNPs for
    the human norepinephrine transporter gene
    (Slc6a2) from dbSNP
  • Search strategy
  • human[orgn] AND Slc6a2[gene] AND "coding synonymous"[FUNC]
  • Note: To use the [gene] (gene name) field, it is
    necessary to have the official gene name or gene
    symbol as per the HUGO Gene Nomenclature
    Committee (HGNC). Entrez Gene can be used to find these.
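
A search strategy like the one above can also be assembled programmatically. The sketch below builds the query string from (value, field tag) pairs, using the Entrez convention of a tag in square brackets after each term, and wraps it in an NCBI E-utilities ESearch URL. It only constructs the URL and makes no network request. The field tags and the "coding synonymous" value are taken from the slides; treat them as assumptions, since dbSNP's field names and function-class vocabulary may have changed since then.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def field_term(value: str, tag: str) -> str:
    """Format one term as value[tag], quoting multi-word values."""
    text = f'"{value}"' if " " in value else value
    return f"{text}[{tag}]"


def build_query(*terms: str) -> str:
    """Join already-formatted terms with AND."""
    return " AND ".join(terms)


query = build_query(
    field_term("human", "orgn"),
    field_term("Slc6a2", "gene"),
    field_term("coding synonymous", "FUNC"),
)
url = f"{ESEARCH_URL}?{urlencode({'db': 'snp', 'term': query})}"

print(query)
print(url)
```

Keeping the terms as separate pairs makes it easy to swap the gene symbol or the function class without retyping the whole query.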

18
dbSNP Output: Graphical Display

19
dbSNP - Live
  • Let's look at a dbSNP reference SNP page
  • http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=3743788

20
Finding SNPs - Challenges
  • If an rs number is available, start with it
  • Not all rs numbers have information in all databases
  • Another database of interest is the Online
    Mendelian Inheritance in Man (OMIM)
  • OMIM doesn't always provide rs numbers, even when
    there is one
  • dbSNP records may or may not link to OMIM, even
    if the SNP is in an OMIM record

21
Example 1
  • rs1800888
  • (C>T) → Thr164Ile in the ADRB2 gene
  • HGVS nomenclature
  • NP_000015.1:p.T164I
  • To find in OMIM
  • Search with rs1800888 yields nothing
  • Search with ADRB2[gene] finds the record
  • Look at allelic variants: .0003
    BETA-2-ADRENORECEPTOR AGONIST, REDUCED RESPONSE
    TO; ADRB2, THR164ILE
  • It is a match

22
Example 2
  • rs2740574
  • A/G SNP located 5' to CYP3A4
  • HGVS nomenclature
  • NT_007933.14:g.24616372C>T
  • To find in OMIM
  • Search with rs2740574 yields nothing
  • Search with gene name CYP3A4 finds the record
  • Find list of allelic variants: .0001 CYP3A4
    PROMOTER POLYMORPHISM; CYP3A4, A-G, PROMOTER
  • Compare info in dbSNP to info in OMIM (look at
    sequence)
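
In this example the SNP is described as A/G in one place and as a C to T change on the contig sequence in another. One plausible reason, which is an assumption here rather than something the slides state, is that the two descriptions read opposite DNA strands. The sketch below checks whether two allele pairs agree once strand is taken into account; the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the sequence as read on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]


def same_alleles_either_strand(alleles_a, alleles_b) -> bool:
    """True if the two allele sets match directly or after complementing one."""
    a = {allele.upper() for allele in alleles_a}
    b = {allele.upper() for allele in alleles_b}
    return a == b or a == {reverse_complement(allele) for allele in b}


print(same_alleles_either_strand(("A", "G"), ("C", "T")))  # True: opposite strands
print(same_alleles_either_strand(("A", "G"), ("A", "C")))  # False: different change
```

A match on alleles alone does not prove two records describe the same SNP; the flanking sequence still needs to be compared, as the slide recommends.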

23
Other Databases
  • OMIM (NCBI)
  • HapMap - International HapMap Project
  • ALFRED - Allele Frequency Database
  • HGVbaseG2P - Human Genome Variation database of
    Genotype-to-Phenotype information
  • PharmGKB - Pharmacogenomics Knowledgebase
  • F-SNP - Functional SNPs