Title: Alternative splicing: A playground of evolution
1 Alternative splicing: A playground of evolution
- Mikhail Gelfand
- Research and Training Center for Bioinformatics
- Institute for Information Transmission Problems, RAS
- Moscow, Russia
- October 2006
2 Percentage of alternatively spliced human and mouse genes, by year of publication
[Figure legend: Human (genome / random sample); All genes; Human (individual chromosomes); Only multiexon genes; Genes with high EST coverage; Mouse (genome / random sample)]
3 Plan
- Evolution of alternative exon-intron structure
  - mammals: human, mouse, dog
  - dipteran insects: Drosophila melanogaster, D. pseudoobscura, Anopheles gambiae
- Evolutionary rate in constitutive and alternative regions
  - human / mouse
  - D. melanogaster / D. pseudoobscura
  - human-chimpanzee / human SNPs
4 Elementary alternatives
Cassette exon
Alternative donor site
Alternative acceptor site
Retained intron
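The four elementary alternatives can be told apart from exon coordinates alone. The following is a minimal sketch, not the procedure used in the talk: it assumes plus-strand isoforms given as sorted lists of 0-based, half-open (start, end) exons, and the function name `classify_alternatives` is hypothetical. It ignores special cases such as terminal exons.

```python
def classify_alternatives(reference, alternative):
    """Label exons of `alternative` that are absent from `reference`.

    Assumption of this sketch: both isoforms are sorted lists of
    (start, end) exons on the plus strand, 0-based and half-open.
    """
    events = []
    for exon in alternative:
        if exon in reference:
            continue  # shared exon, nothing to report
        start, end = exon
        # Reference exons that overlap this exon.
        hits = [r for r in reference if r[0] < end and start < r[1]]
        if not hits:
            # A new exon counts as a cassette exon only if it is internal.
            upstream = any(r[1] <= start for r in reference)
            downstream = any(r[0] >= end for r in reference)
            label = "cassette exon" if upstream and downstream else "other"
        elif len(hits) == 1:
            r_start, r_end = hits[0]
            if start == r_start:
                label = "alternative donor site"     # 3' end of the exon differs
            elif end == r_end:
                label = "alternative acceptor site"  # 5' end of the exon differs
            else:
                label = "other"
        elif start == hits[0][0] and end == hits[-1][1]:
            # One exon spans several reference exons and the intron(s) between them.
            label = "retained intron"
        else:
            label = "other"
        events.append((exon, label))
    return events


reference = [(0, 100), (200, 300), (400, 500)]
print(classify_alternatives([(0, 100), (400, 500)], reference))
print(classify_alternatives(reference, [(0, 100), (200, 500)]))
```

The comparison is one-directional, so a caller interested in all differences between two isoforms would run it both ways.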
5 Alternative exon-intron structure in the human, mouse and dog genomes
- EDAS, a database of human alternative splicing (human genome GenBank EST data from RefSeq)
  - consider cassette exons and alternative splicing sites
  - functionality: potentially translated vs. NMD-inducing elementary alternatives
- Human-mouse-dog triples of orthologous genes
- We follow the fate of human alternative sites and exons in the mouse and dog genomes
- Each human AS isoform is spliced-aligned to the mouse and dog genomes. Definition of conservation:
  - conservation of the corresponding region (homologous exon is actually present in the considered genome)
  - conservation of splicing sites (GT and AG)
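The second conservation criterion (intact GT and AG dinucleotides) can be sketched as follows. This is an illustration under stated assumptions, not the EDAS pipeline: the aligned exon is taken to lie on the plus strand of the target genome, coordinates are 0-based and half-open, and the name `splice_sites_conserved` is hypothetical.

```python
def splice_sites_conserved(genome, exon_start, exon_end):
    """Return True if the aligned exon is flanked by AG (acceptor) and GT (donor).

    Assumptions of this sketch: `genome` is the plus-strand sequence of the
    target genome and [exon_start, exon_end) is the aligned exon, 0-based.
    """
    if exon_start < 2 or exon_end + 2 > len(genome):
        return False  # not enough flanking sequence to inspect
    acceptor = genome[exon_start - 2:exon_start].upper()  # last two intron bases upstream
    donor = genome[exon_end:exon_end + 2].upper()         # first two intron bases downstream
    return acceptor == "AG" and donor == "GT"


# Lower case marks intronic sequence, upper case the exon.
print(splice_sites_conserved("ttttagATGGCCgtaagt", 6, 12))
print(splice_sites_conserved("ttttagATGGCCgcaagt", 6, 12))
```

A minus-strand exon would need the reverse complement to be checked instead, which this sketch leaves out.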
6 Caveats
- we consider only the possibility of AS in mouse and dog: we do not require the actual existence of corresponding isoforms in known transcriptomes
- we do not consider situations when an alternative human exon (or site) is constitutive in mouse or dog
- of course, functionality assignments (translated / NMD-inducing) are not very reliable
7 Translated cassette exons
[Figure label: constitutive]
8 NMD-inducing cassette exons
9 Observations
- Predominantly included exons are highly conserved irrespective of function
- Predominantly skipped translated exons are more conserved than NMD-inducing ones
- Numerous lineage-specific losses
  - more in mouse than in dog
- Still, 40% of skipped (<1% inclusion) exons are conserved in at least one lineage
10 Alternative donor and acceptor sites: same trends
- Higher conservation of uniformly used sites
- Internal sites are more conserved than external ones (as expected)
11 Alternative exon-intron structure in fruit flies and the malarial mosquito
- Same procedure (AS data from FlyBase)
  - cassette exons, splicing sites
  - also mutually exclusive exons, retained introns
- Follow the fate of D. melanogaster exons in the D. pseudoobscura and Anopheles genomes
- Technically more difficult
  - incomplete genomes
  - the quality of alignment with the Anopheles genome is lower
  - frequent intron insertion/loss (4.7 introns per gene in Drosophila vs. 3.5 introns per gene in Anopheles)
12 Conservation of coding segments
                                       constitutive segments   alternative segments
D. melanogaster / D. pseudoobscura     97%                     75-80%
D. melanogaster / Anopheles gambiae    77%                     45%
13 Conservation of D. melanogaster elementary alternatives in D. pseudoobscura genes
- blue: exact
- green: divided exons
- yellow: joined exons
- orange: mixed
- red: non-conserved
- retained introns are the least conserved (are all of them really functional?)
- mutually exclusive exons are as conserved as constitutive exons
14 Conservation of D. melanogaster elementary alternatives in Anopheles gambiae genes
- blue: exact
- green: divided exons
- yellow: joined exons
- orange: mixed
- red: non-conserved
- 30 joined, 10 divided exons (fewer introns in A. gambiae)
- mutually exclusive exons are conserved exactly
- cassette exons are the least conserved
15 CG1517: cassette exon in Drosophila, alternative acceptor site in Anopheles
16 CG31536: cassette exon in Drosophila, shorter cassette exon and alternative donor site in Anopheles
17 Evolutionary rate in constitutive and alternative regions
- Human and mouse orthologous genes
- Estimation of the dN/dS ratio: higher fraction of non-synonymous (amino acid changing) substitutions => weaker stabilizing (or stronger positive) selection
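The distinction between synonymous and non-synonymous substitutions can be made concrete with a crude counting sketch. It is not the estimator used in the talk: a proper dN/dS estimate also normalizes by the numbers of synonymous and non-synonymous sites and corrects for multiple hits, which this example omits. It assumes two aligned, gap-free, in-frame coding sequences and the standard genetic code; the function name `count_substitutions` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_substitutions(seq_a, seq_b):
    """Count codons that differ by one nucleotide, split by effect on the protein.

    Returns (synonymous, non_synonymous). Codons with two or three
    differences, or with ambiguous bases, are skipped in this sketch.
    """
    syn = nonsyn = 0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if sum(x != y for x, y in zip(codon_a, codon_b)) != 1:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


# GCT -> GCC keeps alanine; AAA -> AGA changes lysine to arginine.
print(count_substitutions("ATGGCTAAA", "ATGGCCAGA"))
```

Comparing such counts between constitutive and alternative regions of the same gene gives only a rough first impression of the effect described above.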
18 Concatenates of constitutive and alternative regions in all genes: different evolutionary rates
- Relatively more non-synonymous substitutions in alternative regions (higher dN/dS ratio)
- Lower amino acid identity in alternative regions
- Columns (left to right): (1) constitutive regions; (2-4) alternative regions: N-end, internal, C-end
19 Individual genes: the ratio of non-synonymous to synonymous substitutions (dN/dS) tends to be larger in alternative regions (vertical axis) than in constitutive regions (horizontal axis)
20 Non-symmetrical histogram of dN/dS(const) - dN/dS(alt)
Black: shadow of the left half. In a larger fraction of genes dN/dS(const) < dN/dS(alt), especially for larger values
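The asymmetry of such a histogram can be inspected by folding it around zero and comparing the two halves bin by bin. The sketch below is only an illustration with made-up input values: the function name `mirrored_histogram`, the bin width and the number of bins are all assumptions, not parameters from the talk.

```python
def mirrored_histogram(diffs, bin_width=0.1, n_bins=5):
    """Fold a histogram of per-gene differences around zero.

    `diffs` holds one value of dN/dS(const) - dN/dS(alt) per gene.
    Returns a list of (bin_index, left_count, right_count), where bin i
    covers magnitudes from i * bin_width up to (i + 1) * bin_width and
    the last bin also collects everything larger.
    """
    left = [0] * n_bins   # genes with dN/dS(const) < dN/dS(alt)
    right = [0] * n_bins  # genes with dN/dS(const) > dN/dS(alt)
    for d in diffs:
        if d == 0:
            continue
        index = min(int(abs(d) / bin_width), n_bins - 1)
        if d < 0:
            left[index] += 1
        else:
            right[index] += 1
    return [(i, left[i], right[i]) for i in range(n_bins)]


# Hypothetical differences for six genes, for illustration only.
for row in mirrored_histogram([-0.05, -0.25, -0.45, 0.05, 0.02, -0.31]):
    print(row)
```

A left count that exceeds the right count mainly in the higher bins corresponds to the pattern described on the slide.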
21 The same effect is seen in N-terminal, internal, C-terminal parts
22 Drosophilas: less selection in alternative regions?
[Figure legend: More mutations in alt. regions; Similar level of mutations; More mutations in const. regions]
In a majority of genes, both synonymous and non-synonymous mutation rates are higher in alternative regions than in constitutive regions
23 Different behavior of N-terminal, internal and C-terminal alternatives
- N-terminal alternatives: most genes have a higher synonymous substitution rate in alternative regions; most genes have stronger stabilizing selection in alternative regions
- Internal alternatives: intermediate situation
- C-terminal alternatives: more non-synonymous substitutions and fewer synonymous substitutions => weaker stabilizing selection in alternative regions
24 The McDonald-Kreitman test: evidence for positive selection in (minor isoform) alternative regions
- Human and chimpanzee genome mismatches vs. human SNPs
- Exons conserved in mouse and/or dog
- Genes with at least 60 ESTs (median number)
- Fisher's exact test for significance

         Pn/Ps (SNPs)   Dn/Ds (genomes)   diff.   Signif.
Const.   0.72           0.62              0.10    0
Major    0.78           0.65              0.13    0.5
Minor    1.41           1.89              0.48    0.1

- Minor isoform alternative regions:
  - More non-synonymous SNPs: Pn(alt_minor) = .12% >> Pn(const) = .06%
  - More non-synonymous mismatches: Dn(alt_minor) = .91% >> Dn(const) = .37%
  - Positive selection (as opposed to lower stabilizing selection): α = 1 - (Pn/Ps) / (Dn/Ds) ≈ 25% of positions
- Similar results for all highly covered genes or all conserved exons
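The two calculations behind this slide can be sketched in a few lines: the estimate α = 1 - (Pn/Ps) / (Dn/Ds) and a two-sided Fisher's exact test on a 2x2 table of counts. The ratios 1.41 and 1.89 are the minor-isoform values from the table; the slide does not give the underlying counts, so the counts passed to the test below are hypothetical. The function names are also assumptions of this sketch.

```python
from math import comb


def alpha(pn_ps, dn_ds):
    """Estimated fraction of non-synonymous divergence driven by positive selection."""
    return 1 - pn_ps / dn_ds


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        p for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # tolerance for floating-point ties
    )


# Minor-isoform ratios from the table: about a quarter of positions.
print(round(alpha(1.41, 1.89), 3))

# Hypothetical counts, laid out as [[Pn, Ps], [Dn, Ds]].
print(fisher_exact_two_sided(3, 1, 1, 3))
```

With the constitutive-region ratios (0.72 and 0.62) the same formula gives a negative α, which is how the test separates the minor-isoform regions from the rest.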
25 An attempt at integration
- AS is often genome-specific
- young AS isoforms are often minor and tissue-specific
  - but still functional
  - although unique isoforms may result from aberrant splicing
- AS regions show evidence for decreased negative selection
  - excess non-synonymous codon substitutions
- AS regions show evidence for positive selection
  - excess non-synonymous SNPs
- AS tends to shuffle domains and target functional sites in proteins
- Thus AS may serve as a testing ground for new functions without sacrificing old ones
26 What next?
- Multiple genomes
  - many Drosophila spp.
  - ENCODE data for many mammals
- Estimate not only the rate of loss, but also the rate of gain (as opposed to aberrant splicing)
- Control for
  - functionality: translated / NMD-inducing
  - exon inclusion (or site choice) level: major / minor isoform
  - tissue specificity pattern (?)
  - type of alternative: N-terminal / internal / C-terminal
- Evolution of regulation of AS
- Splicing errors and mutations: retained introns, skipped exons, cryptic sites
27 Acknowledgements
- Discussions
  - Vsevolod Makeev (GosNIIGenetika)
  - Eugene Koonin (NCBI)
  - Igor Rogozin (NCBI)
  - Dmitry Petrov (Stanford)
  - Dmitry Frishman (GSF, TUM)
  - Shamil Sunyaev (Harvard University Medical School)
- Data
  - King Jordan (NCBI)
- Support
  - Howard Hughes Medical Institute
  - INTAS
  - Russian Academy of Sciences (program Molecular and Cellular Biology)
  - Russian Fund of Basic Research
28 Authors
- Andrei Mironov (Moscow State University)
- Ramil Nurtdinov (Moscow State University): human/mouse/dog
- Dmitry Malko (GosNIIGenetika): drosophila/mosquito
- Ekaterina Ermakova (Moscow State University, IITP): Kn/Ks
- Vasily Ramensky (Institute of Molecular Biology): SNPs
- Irena Artamonova (GSF/MIPS): human/mouse, plots
- Alexei Neverov (GosNIIGenetika): functionality of isoforms