Title: Predicting effect of SNPs and de novo mutations on splicing
1Predicting effect of SNPs and de novo mutations
on splicing
- presented by
- Alexander Tchourbanov
- Biology Department
- New Mexico State University
2Motivation
- Recently, high-throughput genotyping methods became available
- High-density 500K chips are available for genotyping (Illumina Hap550, Affymetrix 5.0)
- Genome resequencing (Applied Biosystems SOLiD, Solexa/Illumina Genome Analyzer, Roche 454 FLX)
- Researchers interested in understanding the genetic risk factors contributing to a disorder routinely genotype patients
3Motivation
- Many SNPs have been associated with predisposition to various diseases (breast cancer, Alzheimer's, multiple sclerosis, etc.)
- Only a fraction of actual SNPs are genotyped with chips
- Some SNPs with significantly low P-values have been associated through LD with affected haplotypes
- Only a fraction of associated SNPs are causal variants
- There is growing evidence that Autism Spectrum Disorder (ASD) could be triggered by de novo mutations absent in both parents
4Types of SNPs
- Several classes of variants to consider
- Single Nucleotide Polymorphisms (SNPs)
- Deletion/Insertion polymorphisms (DIPs)
- Simple Tandem Repeat polymorphisms (STRs)
- Named polymorphisms (e.g., Alu/ dimorphisms)
- Multinucleotide polymorphisms (MNPs)
5SNPs distribution
- 6 million SNPs are located in human gene loci (dbSNP build 129)
- 63% intronic
- 11% untranslated region
- 1% nonsynonymous
- 1% synonymous
- 24% within 2 kbp of a gene
- <1% splice site
- <1% unknown coding variant
6What are the common disease causing variants?
- SNPs are defined as former mutations with >1% population penetrance
- According to the Human Gene Mutation Database, HGMD (http://www.hgmd.cf.ac.uk)
- 49,806 mutations are missense/nonsense
- 8,548 mutations have consequences for mRNA splicing
- Many missense/nonsense mutations are eliminated by purifying selection and never become SNPs
7Splicing components
Image credit: "Understanding alternative splicing: towards a cellular code", Arianne J. Matlin, Francis Clark and Christopher W. J. Smith, Nature Reviews Molecular Cell Biology 6, 386-398 (May 2005)
8Existing elements
9Orthologous blocks from UCSC GB
- 2,333,379 extended exons from 23 Tetrapoda organisms were obtained
- A number of experimental reports showed that genes from distantly related Tetrapoda organisms were correctly expressed and post-transcriptionally modified in transgenic animals (Capetanaki Y et al., Proc Natl Acad Sci USA 1989; Jacobs GH et al., Science 2007)
- The genes encoding well-known RNA-binding proteins involved in splicing regulation are enriched with ultraconserved elements (Bejerano G et al., Science 2004)
10Counting oligos
11Comparing oligo counts
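Slides 10 and 11 carry only figures in the original deck. As a rough sketch of what counting oligos and comparing their counts could look like, the Python below tallies overlapping k-mers in two sequence sets and scores each k-mer by a log2 frequency ratio. The function names, the pseudocount, the toy sequences and the choice of a plain log-odds ratio are all assumptions made for illustration; the slides do not state which statistic was actually used.

```python
import math
from collections import Counter


def count_oligos(sequences, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def log_odds(signal_counts, background_counts, pseudocount=1.0):
    """Log2 ratio of smoothed k-mer frequencies (hypothetical statistic)."""
    kmers = set(signal_counts) | set(background_counts)
    # The pseudocount keeps k-mers absent from one set from giving log(0).
    n_sig = sum(signal_counts.values()) + pseudocount * len(kmers)
    n_bg = sum(background_counts.values()) + pseudocount * len(kmers)
    scores = {}
    for kmer in kmers:
        p_sig = (signal_counts[kmer] + pseudocount) / n_sig
        p_bg = (background_counts[kmer] + pseudocount) / n_bg
        scores[kmer] = math.log2(p_sig / p_bg)
    return scores


# Toy sequences, invented purely to exercise the functions.
near_splice_site = ["GTAAGTGGGAGG", "GTGAGTGGGTTT"]
background = ["ACACACACACAC", "TTTTACACGGGA"]

sig = count_oligos(near_splice_site, 3)
bg = count_oligos(background, 3)
scores = log_odds(sig, bg)

# List the k-mers most enriched in the first set relative to the second.
for kmer, score in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(kmer, round(score, 2))
```

A positive score marks an oligo that is over-represented in the first set, a negative score one that is depleted. A real screen would run over pentamers to octamers and apply a significance threshold, which this sketch leaves out.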
12Elements found
- Using the orthologous exons available for 23 Tetrapoda organisms, we have identified 2,546 unique splicing regulatory elements.
- Among these elements, 203 (7.97%) 3'SS and 177 (6.95%) 5'SS supporting motifs are novel and have not been previously reported in systematic screens detecting such elements.
- Among our predicted elements, 41.08% of sequences were heptamers, 51.81% were octamers, and only 6.76% were hexamers and 0.35% pentamers.
13Example of 5'SS ISEs found
14Example of LOD profiles (5'SS ISE)
15Optimal exon length
- Depends on flanking 5'SS and 3'SS strengths
165' GC SS Bayesian sensor
17Exon scoring method
- LOD scores associated with the 5'SS, 3'SS, exonic length, competing SSs and enhancer/silencer signals are combined into an exon strength
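The slide does not say how the individual LOD scores are combined. One common way, shown here only as a minimal sketch, is to treat the evidence sources as conditionally independent, add their log-odds, and map the total to a value between 0 and 1 with a logistic function. The class, field and function names are hypothetical, and the additive rule, the natural-log scale and the zero prior are assumptions rather than the tool's documented method.

```python
import math
from dataclasses import dataclass


@dataclass
class ExonEvidence:
    """Per-exon log-odds contributions (hypothetical field names)."""
    lod_5ss: float
    lod_3ss: float
    lod_length: float
    lod_competing_ss: float
    lod_enhancer_silencer: float


def exon_strength(e: ExonEvidence) -> float:
    """Sum the LOD terms, assuming the evidence sources are independent."""
    return (e.lod_5ss + e.lod_3ss + e.lod_length
            + e.lod_competing_ss + e.lod_enhancer_silencer)


def to_probability(total_lod: float, prior_lod: float = 0.0) -> float:
    """Logistic transform of natural-log odds into a 0..1 score (assumed)."""
    return 1.0 / (1.0 + math.exp(-(total_lod + prior_lod)))


# Invented numbers: strong splice sites, a mild penalty for exon length and
# for a competing splice site, and a small boost from enhancer signals.
candidate = ExonEvidence(1.0, 2.0, -0.5, -1.0, 0.5)
total = exon_strength(candidate)
print(total, round(to_probability(total), 3))
```

Under this scheme a variant that weakens one sensor, for example by destroying an enhancer motif, lowers the summed score and therefore the exon strength.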
18IVS22delC mutation
- >IVS22delC
- ttcggataagacaaagattttatataatattttgaaaacattaaat
aatt tgtcattcctttatttcctttattttagCTTCGCAGAATCAAGAA
CGGCTATGTGCGTTTAAAGATCCGTATCAGCAAGACCTTGGGATAG/GTG
AGAGTAGAATCTCTCATGAAAATGGGACAATATTATGCTCGAAAG/GTAG
CACCTGCTATGGCCTTTGGGAGAAATCAAAAGGGGACATAAATCTTGTAA
AACAAGg(c)aagtgatactttccttacctgaaatgactgtgttttatac
aattgatatttatctaaaaaggacatgggagtatgttaaaatcctgttca
gaaaaacagtgaatttaaaagtgtatatataaagccaggtgtggtggctc
atgcctgtaattccagcacttttcgaggctgaggtgggcggatcacttga
ggccaggagtttgagaccagcctgggtaataacatggtgaaaccccgt
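To compare the wild-type and mutant alleles, both sequences have to be recovered from the annotated string. The sketch below assumes that the base in parentheses, "(c)", marks the nucleotide removed by IVS22delC and that upper case denotes exonic and lower case intronic sequence; neither convention is stated on the slide. The function name is hypothetical, and only a short excerpt of the sequence is used.

```python
import re


def apply_marked_deletion(annotated):
    """Return (wild_type, mutant) for a sequence with one '(x)' deletion mark.

    The parenthesised bases are kept in the wild type and dropped in the
    mutant. This reading of the notation is an assumption.
    """
    match = re.search(r"\(([A-Za-z]+)\)", annotated)
    if match is None:
        raise ValueError("no parenthesised deletion found")
    prefix = annotated[:match.start()]
    suffix = annotated[match.end():]
    wild_type = prefix + match.group(1) + suffix
    mutant = prefix + suffix
    return wild_type, mutant


# Short excerpt copied from the slide, spanning the marked position.
excerpt = "AACAAGg(c)aagtgatac"
wild_type, mutant = apply_marked_deletion(excerpt)
print(wild_type)
print(mutant)
```

Both strings can then be scored separately, so that a difference in predicted exon strength reflects the effect of the deletion.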
19Example of SpliceScan II predicting effects of
mutations
- An example of a successfully predicted effect: the IVS22delC mutation, which causes familial pulmonary arterial hypertension (Cogan JD et al., Am J Respir Crit Care Med 2006)
- Another example of SpliceScan II correctly predicting an effect: the IVS10-6del34 microdeletion, which causes gastrointestinal stromal tumors (Chen LL et al., Oncogene 2005)
20SpliceScan II performance on mutations
21Effect of rs849563 (Autism associated SNP)
- There is a change in annotated exon potential here
- rs849563 changes the exon sharing one boundary with annotated exon gi|41872561|ref|NM_201266.1| 2433-2577, where the exon score changes from 0.60 to 0.19
22Effect of rs885747 (Autism associated SNP)
- There is a change in annotated exon potential here
- rs885747 changes the exon sharing one boundary with annotated exon gi|194097340|ref|NM_002616.2| 1627-1735, where the exon score changes from 0.30 to 0.49
23SNPs performance
24SpliceScan II tool
- SpliceScan II tool: http://splicescan2.lumc.edu/
- Is more sensitive than existing splicing simulators (NetUTR, ExonScan)
- Uses a novel 5' GC SS Bayesian sensor
- Method allows predicting aberrant splicing events associated with genomic variants
- ACGMAP companion database: http://www.stritch.luc.edu/node/375