Title: Characterizing Alternative Splicing With Respect To Protein Domains
1Characterizing Alternative Splicing With Respect To Protein Domains
- BME 220 Project
- Charlie Vaske
2Overview
- Background
- Methods and Data
- Preliminary Results
- Preliminary Conclusions
3Big Picture
- Genome: DNA
- Transcriptome: RNA
- Proteome: Protein
4Splicing in Higher Eukaryotes
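[Figure: three stages of splicing, each drawn 5' to 3']
- Genome: a gene flanked by intergenic regions
- Transcription of pre-mRNA: exons separated by introns; each intron begins with GT (donor site) and ends with AG (acceptor site); U1 and U2 are labeled at the splice sites
- Splicing to mRNA (transcriptome): the exons are joined and the introns removed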
5Alternative Splicing
[Figure: pre-mRNA drawn 5' to 3' with GT and AG splice sites marked]
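The splicing rule from the two slides above can be sketched in a few lines of Python. This is only an illustration: the sequence, the exon coordinates, and the function name `splice` are made up, and real splice-site recognition involves far more than the GT/AG dinucleotides.

```python
def splice(pre_mrna, exons):
    """Join exon intervals (0-based, end-exclusive) into an mRNA string.

    Every removed intron must start with GT and end with AG,
    otherwise a ValueError is raised.
    """
    pieces = []
    for i, (start, end) in enumerate(exons):
        pieces.append(pre_mrna[start:end])
        if i + 1 < len(exons):
            # The intron is whatever lies between this exon and the next one.
            intron = pre_mrna[end:exons[i + 1][0]]
            if not (intron.startswith("GT") and intron.endswith("AG")):
                raise ValueError("non-canonical intron: " + intron)
    return "".join(pieces)


# Toy pre-mRNA (made up): three exons separated by two GT...AG introns.
pre = "ATGGCC" + "GTAAAAAG" + "TTTCCC" + "GTCCCCAG" + "GGATAA"
all_exons = [(0, 6), (14, 20), (28, 34)]
skip_middle = [(0, 6), (28, 34)]  # the middle exon is treated as intronic

print(splice(pre, all_exons))    # all three exons joined
print(splice(pre, skip_middle))  # first and last exon only
```

Skipping the middle exon still removes a region that starts with GT and ends with AG, which is why the same pre-mRNA can yield more than one mRNA.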
6Microarrays For Alt. Splicing
- Use short oligonucleotides
- Estimate the expression level of the sequence each oligo targets
[Figure: gene with Exon 1, Exon 2, Exon 3, Exon 4, Exon 5]
7Affymetrix Microarrays For Alt. Splicing
[Figure: gene and two isoforms]
- Gene: Exon 1, Exon 2, Exon 3, Exon 4, Exon 5
- Isoform 1: Exon 1, Exon 2, Exon 4, Exon 5
- Isoform 2: Exon 1, Exon 3, Exon 5
8Ideal Microarray Readings
[Figure: plot of Expression (y-axis) against Probe (x-axis) for probes a, b, c, d, e]
- Isoform 1 (Exon 1, Exon 2, Exon 4, Exon 5): probes a, b, c
- Isoform 2 (Exon 1, Exon 3, Exon 5): probes a, d, e
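A minimal sketch of what "ideal" readings could look like for the probe layout on this slide, assuming each probe reads the summed abundance of every isoform it is drawn on. The abundance values and the function name `ideal_readings` are hypothetical, and the additive model is an assumption rather than something the slides state.

```python
# Probe membership as drawn on the slide: probe a sits on both isoforms.
ISOFORM_PROBES = {
    "Isoform 1": {"a", "b", "c"},
    "Isoform 2": {"a", "d", "e"},
}


def ideal_readings(abundance):
    """Return probe -> expected signal, summing over isoforms that carry the probe."""
    readings = {}
    for isoform, probes in ISOFORM_PROBES.items():
        level = abundance.get(isoform, 0.0)
        for probe in probes:
            readings[probe] = readings.get(probe, 0.0) + level
    return dict(sorted(readings.items()))


# Made-up abundances: Isoform 1 is three times as abundant as Isoform 2.
readings = ideal_readings({"Isoform 1": 3.0, "Isoform 2": 1.0})
print(readings)
```

Under this model the shared probe tracks total gene expression, while the isoform-specific probes rise and fall against each other as the isoform mix changes between tissues.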
9Motivation
- Why alternatively splice?
- How does it affect the resulting proteins?
- Look at domains
- High level summary of protein
- 80% of eukaryotic proteins are multi-domain
- Domains are big relative to an exon
10Some Previous Work
- Signatures of domain shuffling in the human genome. Kaessmann, 2002.
- Intron phase symmetry around domain boundaries
- The Effects of Alternative Splicing On Transmembrane Proteins in the Mouse Genome. Cline, 2004.
- Half of TM proteins studied affected by alt-splicing.
11Method
- Predict Alternative Splicing
- Predict Protein Domains
- Look for effects of alt-splicing on predicted domains
- Swapping
- Knockout
- Clipping
12Microarray Design
- Genes based on mRNA and EST data in mouse
- Mapped to Feb. 2002 mouse genome freeze
- 500,000 probes (66,000 sets)
- 100,000 transcripts
- 13,000 gene models
13Technical work
[Figure: probes, transcripts, and gene models related to one another by overlap in genome space; labels distinguish provided data from generated data]
Probe to transcript mapping (example rows):
E_at_NM_021320 cc-chr10-000017.82.0
G6836022_at_J911445 cc-chr10-000017.91.1
G6807921_at_J911524_RC cc-chr10-000018.4.0
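The overlap step in genome space can be sketched as a plain interval test. This is an assumption about how such a mapping might be computed, not the pipeline's actual code; the feature names, coordinates, and the half-open coordinate convention are all hypothetical.

```python
from collections import namedtuple

# A genomic feature with half-open coordinates [start, end) on one chromosome.
Feature = namedtuple("Feature", "name chrom start end")


def overlaps(a, b):
    """True if the two features share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def map_by_overlap(queries, targets):
    """Map each query name to the names of all targets it overlaps."""
    return {q.name: [t.name for t in targets if overlaps(q, t)] for q in queries}


# Made-up features for illustration only.
probes = [
    Feature("probe_1", "chr10", 120, 145),
    Feature("probe_2", "chr10", 900, 925),
]
transcripts = [
    Feature("tx_A", "chr10", 100, 500),
    Feature("tx_B", "chr10", 130, 700),
    Feature("tx_C", "chr11", 100, 500),
]

probe_to_tx = map_by_overlap(probes, transcripts)
print(probe_to_tx)
```

The same function could be reused to group transcripts into gene models. With 500,000 probes, a sorted sweep or an interval tree would be a better fit than this quadratic scan.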
14Predicting Alternative Splicing
- Using mouse alt-splicing microarrays
- Data from Manny Ares
- 8 tissues
- 3 replicates of each tissue
15Predicting Alternative Splicing
- General approach: clustering, then anti-clustering
[Figure: overview of 107 clusters, with a detail view]
16Predicting Alternative Splicing
- Cluster pairs have both anti-correlation and overlap
17First Attempt at Predictions
- Concerned with prediction quality
- Only took cluster pairs with correlation less than -0.9 (strong anti-correlation)
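A hedged sketch of the pair-selection step: keep cluster pairs whose expression profiles have a Pearson correlation below -0.9 and that share at least one gene model. The choice of Pearson correlation, the cluster representation, and all names and numbers here are assumptions for illustration; the slides do not say how anti-correlation or overlap were computed.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def candidate_pairs(clusters, threshold=-0.9):
    """clusters: name -> (profile across tissues, set of gene models).

    Returns (name1, name2, shared gene models) for pairs that are
    anti-correlated below the threshold and overlap in gene models.
    """
    hits = []
    for (n1, (p1, g1)), (n2, (p2, g2)) in combinations(sorted(clusters.items()), 2):
        shared = g1 & g2
        if shared and pearson(p1, p2) < threshold:
            hits.append((n1, n2, sorted(shared)))
    return hits


# Made-up profiles over 8 tissues.
clusters = {
    "A": ([1, 2, 3, 4, 5, 6, 7, 8], {"gene1", "gene2"}),
    "B": ([8, 7, 6, 5, 4, 3, 2, 1], {"gene2", "gene3"}),
    "C": ([1, 2, 3, 4, 5, 6, 7, 9], {"gene4"}),
}
pairs = candidate_pairs(clusters)
print(pairs)
```

Clusters B and C are also strongly anti-correlated, but they share no gene model, so only the A and B pair is reported.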
18First Attempt at Predictions
- Cluster pairs with correlation below -0.9
- 121 genes
- 60 named genes
- Many of these have documented isoforms
19Predicting Protein Domains
- Used local install of InterPro
- Only used Pfam
- Based on sequence motifs
- Liberal e-value cut-off: 1e-10
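The e-value cut-off can be pictured as a simple filter over domain hits. The record layout, names, and values below are hypothetical and do not reflect InterPro's actual output format; only the 1e-10 threshold comes from the slide.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    """One predicted domain on one transcript (hypothetical record layout)."""
    transcript: str
    domain: str
    start: int      # first residue of the match
    end: int        # last residue of the match
    evalue: float


def filter_hits(hits, max_evalue=1e-10):
    """Keep hits whose e-value is at or below the cut-off."""
    return [h for h in hits if h.evalue <= max_evalue]


# Made-up hits for illustration only.
hits = [
    DomainHit("tx_A", "DomA", 10, 260, 3e-45),
    DomainHit("tx_A", "DomB", 300, 390, 2e-4),
    DomainHit("tx_B", "DomA", 10, 200, 1e-10),
]
kept = filter_hits(hits)
print([(h.transcript, h.domain) for h in kept])
```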
20Technical work
[Figure: genome space, transcriptome space (mRNA transcripts), and proteome space (amino acid sequences), with predicted domains shown in each space]
21Swapping
- I define swapping to be:
- A genome base pair annotated with >1 domain
- 2 isoforms share a domain, and each then has a domain of a different type on the same side
[Figure: exon diagrams illustrating swapping]
22Knockout
- A cassette exon inserts or deletes a predicted domain
[Figure: two alternative exon diagrams illustrating knockout]
23Clipping
- Lengthening/shortening of a domain
[Figure: exon diagrams illustrating clipping]
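The three effect types defined on the Swapping, Knockout, and Clipping slides can be sketched as a comparison of two isoforms' domain lists. This is a simplified illustration under stated assumptions, not the project's pipeline: each isoform is an ordered list of (domain type, length) pairs from N-terminus to C-terminus, each domain type occurs at most once per isoform, and only the second swapping definition (a shared domain with different neighbors on the same side) is covered. All domain names and lengths are made up.

```python
def neighbors(names, i):
    """Return the (left, right) neighbors of position i, or None at an end."""
    left = names[i - 1] if i > 0 else None
    right = names[i + 1] if i + 1 < len(names) else None
    return left, right


def classify(iso1, iso2):
    """Return the set of effects seen between two isoforms' domain lists."""
    names1 = [name for name, _ in iso1]
    names2 = [name for name, _ in iso2]
    len1, len2 = dict(iso1), dict(iso2)
    shared = set(names1) & set(names2)
    unique = set(names1) ^ set(names2)  # domain types found in only one isoform
    effects = set()
    swapped = set()
    for d in shared:
        # Clipping: the same domain is longer in one isoform than the other.
        if len1[d] != len2[d]:
            effects.add("clipping")
        # Swapping: next to a shared domain, on the same side, each isoform
        # carries a domain type that the other isoform lacks.
        sides1 = neighbors(names1, names1.index(d))
        sides2 = neighbors(names2, names2.index(d))
        for n1, n2 in zip(sides1, sides2):
            if n1 in unique and n2 in unique:
                effects.add("swapping")
                swapped.update((n1, n2))
    # Knockout: a domain present in only one isoform and not part of a swap.
    if unique - swapped:
        effects.add("knockout")
    return effects


print(classify([("DomA", 250), ("DomB", 90)], [("DomA", 250), ("DomC", 60)]))
print(classify([("DomA", 250), ("DomB", 90)], [("DomA", 250)]))
print(classify([("DomA", 250)], [("DomA", 180)]))
```

A single pair of isoforms can show more than one effect, which is why the function returns a set rather than a single label.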
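24Results: Selected Prediction
[Figure: selected prediction, with the 5' end labeled N (N-terminus) and the 3' end labeled C (C-terminus)]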
25Preliminary Conclusions
- Only a few genes examined
- Analysis pipeline in infancy
- Not thoroughly tested
- I do have alternative splicing events
- Example and literature suggest that some effects will be found