Alternative splicing: A playground of evolution

1
Alternative splicing: A playground of evolution
  • Mikhail Gelfand
  • Research and Training Center for Bioinformatics
  • Institute for Information Transmission Problems, RAS
  • Moscow, Russia
  • October 2008

2
% of alternatively spliced human and mouse genes
by year of publication
(Plot legend: Human (genome / random sample); Human (individual chromosomes); Mouse (genome / random sample); All genes; Only multiexon genes; Genes with high EST coverage)
3
Roles of alternative splicing
  • Functional
  • creating protein diversity
  • 30,000 genes, >100,000 proteins
  • maintaining protein identity
  • e.g. membrane (receptor) and secreted isoforms
  • dominant negative isoforms
  • combinatorial (transcription factors, signaling
    domains)
  • regulatory
  • e.g. via channelling to NMD
  • Evolutionary

4
Plan
  • Evolution of alternative exon-intron structure
  • mammals
  • human compared to mouse and dog
  • mouse and rat compared to human and dog
  • paralogs
  • dipteran insects
  • Drosophila melanogaster, D. pseudoobscura,
    Anopheles gambiae
  • many drosophilas
  • Evolutionary rates in constitutive and
    alternative regions
  • human and mouse
  • D. melanogaster and D. pseudoobscura
  • many drosophilas
  • human-chimpanzee vs. human SNPs
  • Alternative splicing and protein domains
  • Regulation of AS via conserved RNA structures

5
Elementary alternatives
Cassette exon
Alternative donor site
Alternative acceptor site
Retained intron
6
EDAS: a database of alternative splicing
  • Sources
  • human and mouse genomes
  • GenBank
  • RefSeq
  • consider cassette exons and alternative splicing
    sites
  • functionality: potentially translated vs.
    NMD-inducing elementary alternatives (in-frame
    stops, length not divisible by 3)

                          human     mouse
genes                     28957     31811
mRNA / cDNA              114624    215212
proteins                  91844    126797
ESTs                    4294590   3817531
all alternatives          51713     44030
elementary alternatives   31746     21329
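
The translated vs. NMD-inducing distinction used in EDAS (in-frame stops, length not divisible by 3) can be illustrated with a minimal sketch. The function name `classify_alternative` and the `frame_offset` parameter are hypothetical, and the rule is a simplification for illustration, not the actual EDAS procedure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def classify_alternative(alt_seq: str, frame_offset: int = 0) -> str:
    """Label an inserted alternative region as 'translated' or 'NMD-inducing'.

    alt_seq: nucleotide sequence of the alternative region (assumed DNA alphabet).
    frame_offset: index (0, 1 or 2) of the first base that starts a complete
                  codon inside the region; this is an illustrative assumption.
    """
    seq = alt_seq.upper()
    # A length not divisible by 3 shifts the downstream reading frame.
    if len(seq) % 3 != 0:
        return "NMD-inducing"
    # Scan complete in-frame codons for a premature stop.
    codons = (seq[i:i + 3] for i in range(frame_offset, len(seq) - 2, 3))
    if any(codon in STOP_CODONS for codon in codons):
        return "NMD-inducing"
    return "translated"
```

For example, `classify_alternative("ATGG")` is labelled NMD-inducing because of the frameshift, while a 6-nt region without in-frame stops is labelled translated.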
8
Alternative exon-intron structure in the human,
mouse and dog genomes
  • Human-mouse-dog triples of orthologous genes
  • We follow the fate of human alternative sites and
    exons in the mouse and dog genomes
  • Each human AS isoform is spliced-aligned to the
    mouse and dog genomes. Definition of conservation:
  • conservation of the corresponding region
    (homologous exon is actually present in the
    considered genome)
  • conservation of splicing sites (GT and AG)
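
The second criterion (conservation of the GT and AG dinucleotides) could be checked roughly as below. This is a sketch under assumptions: the exon coordinates are taken to come from a spliced alignment computed elsewhere, and the function name and the 0-based half-open coordinate convention are hypothetical.

```python
def splice_sites_conserved(genome: str, exon_start: int, exon_end: int) -> bool:
    """Check canonical splice-site dinucleotides around an internal exon.

    The exon occupies genome[exon_start:exon_end] (0-based, half-open).
    The upstream intron must end with AG (acceptor site) and the
    downstream intron must start with GT (donor site).
    """
    if exon_start < 2 or exon_end + 2 > len(genome):
        return False  # not enough flanking sequence to inspect
    acceptor = genome[exon_start - 2:exon_start].upper()
    donor = genome[exon_end:exon_end + 2].upper()
    return acceptor == "AG" and donor == "GT"
```

Only the canonical GT/AG pair named on the slide is tested; non-canonical sites would need extra cases.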

9
Caveats
  • we consider only the possibility of AS in mouse and
    dog; we do not require the actual existence of
    corresponding isoforms in known transcriptomes
  • we do not account for situations when an alternative
    human exon (or site) is constitutive in mouse or
    dog
  • of course, functionality assignments (translated
    / NMD-inducing) are not very reliable

10
Gains/losses: loss in mouse
Common ancestor
11
Gains/losses: gain in human (or noise)
Common ancestor
12
Gains/losses: loss in dog (or possible gain in
human+mouse)
Common ancestor
13
Triple comparison
(Figure categories: human-specific alternatives (noise?); lost in mouse; lost in dog; conserved alternatives)
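
The triple comparison amounts to a small decision table over the presence of a human alternative in mouse and in dog. A minimal sketch follows; the function name and the category strings are illustrative labels chosen to match the slide titles.

```python
from collections import Counter

def classify_triple(in_mouse: bool, in_dog: bool) -> str:
    """Classify a human alternative by its conservation in mouse and dog."""
    if in_mouse and in_dog:
        return "conserved"
    if in_dog:
        return "lost in mouse"
    if in_mouse:
        return "lost in dog (or gained in human+mouse)"
    return "human-specific (gain in human or noise)"

def summarize(observations):
    """Count categories over an iterable of (in_mouse, in_dog) pairs."""
    return Counter(classify_triple(m, d) for m, d in observations)
```

With the dog as the outgroup, presence in dog only is read as a loss in mouse, while presence in mouse only cannot distinguish a loss in dog from a gain in the human+mouse ancestor.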
14
Translated and NMD-inducing cassette exons
  • Mainly included exons are highly conserved
    irrespective of function
  • Mainly skipped translated exons are more
    conserved than NMD-inducing ones
  • Numerous lineage-specific losses
  • more in mouse than in dog
  • more of NMD-inducing than of translated exons
  • 40% of almost always skipped (<1% inclusion)
    exons are conserved in at least one lineage

15
Mouse+rat vs. human and dog: a possibility to
distinguish between exon gain and noise
16
The rate of exon gain decreases with the exon
inclusion rate and increases with the sequence
evolutionary rate
  • Caveat: spurious exons may still seem to be
    conserved in the rodent lineage due to the short
    divergence time
  • Solution: estimate the FDR by analysing
    conservation of pseudoexons
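
One simple way to turn pseudoexon conservation into an FDR estimate is sketched below. The slides do not give the formula, so the ratio used here (conserved fraction among pseudoexons divided by conserved fraction among candidate exons) is an assumption, as are the function and parameter names.

```python
def estimate_fdr(pseudo_conserved: int, pseudo_total: int,
                 cand_conserved: int, cand_total: int) -> float:
    """Estimate the fraction of 'conserved' candidate exons that are spurious.

    Assumption: pseudoexons are conserved only by chance, so their conserved
    fraction approximates the background rate for non-functional sequence.
    """
    if pseudo_total <= 0 or cand_total <= 0:
        raise ValueError("totals must be positive")
    background = pseudo_conserved / pseudo_total   # chance conservation rate
    observed = cand_conserved / cand_total         # rate among candidate exons
    if observed == 0:
        return 0.0  # no conserved candidates, so nothing can be a false discovery
    return min(1.0, background / observed)
```

For instance, if 10 of 100 pseudoexons and 50 of 100 candidate exons look conserved, the estimate is 0.2.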

17
Alternative donor and acceptor sites: same trends
  • Higher conservation of uniformly used sites
  • Internal sites are more conserved than external
    ones (as expected)

18
Source of innovation: model of random site
fixation
  • Plots: fraction of exon-extending alternative
    sites as a function of exon length
  • Main site: defined as the one present in the
    protein or supported by more ESTs
  • Same trends for the acceptor (top) and donor
    (bottom) sites
  • The distribution of alt. region lengths is
    consistent with fixation of random sites
  • Extend short exons
  • Shorten long exons
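
The quantity plotted on this slide, the fraction of exon-extending alternative sites as a function of exon length, could be tabulated as in the sketch below. The input format, the function name and the bin size are hypothetical.

```python
from collections import defaultdict

def extending_fraction_by_length(sites, bin_size=50):
    """Fraction of exon-extending alternative sites per exon-length bin.

    sites: iterable of (exon_length, extends_exon) pairs, where exon_length is
           the exon length defined by the main site and extends_exon is True
           when the alternative site makes the exon longer.
    Returns {bin_start: fraction}, ordered by bin start.
    """
    counts = defaultdict(lambda: [0, 0])  # bin_start -> [extending, total]
    for length, extends in sites:
        bin_start = (length // bin_size) * bin_size
        counts[bin_start][0] += bool(extends)
        counts[bin_start][1] += 1
    return {b: ext / total for b, (ext, total) in sorted(counts.items())}
```

Under random site fixation one would expect this fraction to be high for short exons and low for long ones, which is the trend the slide describes.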

19
Genetic diseases
  • Mutations in splice sites yield exon skips or
    activation of cryptic sites
  • Exon skip or activation of a cryptic site depends
    on
  • Density of exonic splicing enhancers (lower in
    skipped exons)
  • Presence of a strong cryptic site nearby

Av. dist. to a stronger site   Skipped exons   Cryptic site exons   Non-mutated exons
Donor sites                    220             75                   289
Acceptor sites                 185             66                   81
20
One more source of innovation: site creation
  • MAGE-A family of human CT-antigens
  • Retroposition of a spliced mRNA, then duplication
  • Numerous new (alternative) exons in individual
    copies arising from point mutations
  • Creation of donor sites

21
Improvement of an acceptor site
22
Alternative exon-intron structure in fruit flies
and the malarial mosquito
  • Same procedure (AS data from FlyBase)
  • cassette exons, splicing sites
  • also mutually exclusive exons, retained introns
  • Follow the fate of D. melanogaster exons in the
    D. pseudoobscura and Anopheles genomes
  • Technically more difficult
  • incomplete genomes
  • the quality of alignment with the Anopheles
    genome is lower
  • frequent intron insertion/loss (4.7 introns per
    gene in Drosophila vs. 3.5 introns per gene in
    Anopheles)

23
Conservation of coding segments
                                      constitutive segments   alternative segments
D. melanogaster / D. pseudoobscura    97%                     75-80%
D. melanogaster / Anopheles gambiae   77%                     45%
24
Conservation of D. melanogaster elementary
alternatives in D. pseudoobscura genes
  • blue: exact
  • green: divided exons
  • yellow: joined exons
  • orange: mixed
  • red: non-conserved
  • retained introns are the least conserved (are all
    of them really functional?)
  • mutually exclusive exons are as conserved as
    constitutive exons

25
Conservation of D. melanogaster elementary
alternatives in Anopheles gambiae genes
  • blue: exact
  • green: divided exons
  • yellow: joined exons
  • orange: mixed
  • red: non-conserved
  • 30 joined, 10 divided exons (fewer introns in
    Aga)
  • mutually exclusive exons are conserved exactly
  • cassette exons are the least conserved

26
Evolution of (alternative) exon-intron structure
in nine Drosophila spp.
D. melanogaster, D. sechellia, D. yakuba, D. erecta, D. ananassae, D. pseudoobscura, D. mojavensis, D. virilis, D. grimshawi
(Tree: D. Pollard, http://rana.lbl.gov/dan/trees.html)
27
Gain and loss of alternative segments and
constitutive exons
(Figure: species tree of Dmel, Dsec, Dyak, Dere, Dana, Dpse, Dmoj, Dvir and Dgri with gain and loss counts on the branches)
  • Caveat: we cannot observe exon gain outside and
    exon loss within the D. mel. lineage

Notation: patterns with single events / patterns
with multiple events (Dollo parsimony)
Sample size: 397 / 452, 18596 / 18874
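
Dollo parsimony allows a feature to be gained once and lost any number of times. A minimal sketch of counting events for one presence/absence pattern follows; the nested-tuple tree encoding, the function names and the five-species example topology are illustrative assumptions, not the tree or code used for the slide.

```python
def leaves(tree):
    """Return the set of leaf names of a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))

def dollo_losses(tree, present):
    """Count losses in a subtree, assuming the feature exists at its root."""
    if isinstance(tree, str):
        return 0 if tree in present else 1
    if not (leaves(tree) & present):
        return 1  # the whole clade lacks the feature: a single loss
    return sum(dollo_losses(child, present) for child in tree)

def dollo_events(tree, present):
    """Return (gains, losses) for one pattern under Dollo parsimony."""
    present = set(present)
    if not present:
        return (0, 0)
    node = tree
    # Descend to the most recent common ancestor of all carriers:
    # the single gain is placed on the branch leading to it.
    while not isinstance(node, str):
        holders = [c for c in node if leaves(c) & present]
        if len(holders) != 1:
            break
        node = holders[0]
    return (1, dollo_losses(node, present))

# Illustrative topology for five of the species.
TREE = ((("Dmel", "Dsec"), ("Dyak", "Dere")), "Dana")
```

A pattern explained by the gain alone, or by the gain plus one loss, would count as a single-event pattern in the slide's notation, depending on how the gain is treated; patterns needing several losses are multiple-event patterns.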
28
Evolutionary rate in constitutive and alternative
regions
  • Human and mouse orthologous genes
  • D. melanogaster and D. pseudoobscura
  • Estimation of the dN/dS ratio: a higher fraction
    of non-synonymous (amino acid-changing)
    substitutions => weaker stabilizing (or stronger
    positive) selection
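
For reference, the usual reading of the ratio can be written out as below. The symbol \omega is introduced here for convenience and does not appear on the slides; d_N and d_S are the numbers of non-synonymous and synonymous substitutions per non-synonymous and synonymous site, respectively.

```latex
\omega = \frac{d_N}{d_S}, \qquad
\begin{cases}
\omega < 1 & \text{stabilizing (purifying) selection} \\
\omega \approx 1 & \text{neutral evolution} \\
\omega > 1 & \text{positive selection}
\end{cases}
```

A ratio that is higher in alternative than in constitutive regions but still below 1 is compatible with either weaker stabilizing selection or a contribution of positive selection, which is why the two are distinguished later with polymorphism data.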

29
Human/mouse genes: non-symmetrical histogram of
dn/ds(const) - dn/ds(alt)
Black: shadow of the left half. In a larger
fraction of genes dn/ds(alt) > dn/ds(const),
especially for larger values
30
Concatenated regions: alternative regions evolve
faster than constitutive ones
(Plot: dN, dS and dN/dS, axis from 0 to 1)
31
Weaker stabilizing selection (or positive
selection) in alternative regions (insignificant
in Drosophila)
(Plot: dN, dS and dN/dS, axis from 0 to 1)
32
Different behavior of terminal alternatives
Drosophila: synonymous substitutions are prevalent
in terminal alternative regions; non-synonymous
substitutions, in internal alternative regions
Mammals: density of substitutions increases in
the N-to-C direction
(Plot: dN, dS and dN/dS, axis from 0 to 1.5)
33
Many drosophilas: dN in mutually exclusive exons
is the same as in constitutive exons; dS is lower in
almost all alternatives (regulation?)
34
Many drosophilas: relaxed (positive?) selection
in alternative regions
35
The McDonald-Kreitman test: evidence for
positive selection in (minor isoform) alternative
regions
  • Human and chimpanzee genome substitutions vs
    human SNPs
  • Exons conserved in mouse and/or dog
  • Genes with at least 60 ESTs (median number)
  • Fisher's exact test for significance

         Pn/Ps (SNPs)   Kn/Ks (genomes)   diff.   Signif.
Const.   0.72           0.62              -0.10   0
Major    0.78           0.65              -0.13   0.5
Minor    1.41           1.89              +0.48   0.1
  • Minor isoform alternative regions
  • More non-synonymous SNPs: Pn(alt_minor) = .12 >>
    Pn(const) = .06
  • More non-synonymous substitutions:
    Kn(alt_minor) = .91 >> Kn(const) = .37
  • Positive selection (as opposed to lower
    stabilizing selection): α = 1 - (Pa/Ps) /
    (Ka/Ks) ~ 25% of positions
  • Similar results for all highly covered genes or
    all conserved exons
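
The two computations behind this slide, the estimate α = 1 - (Pn/Ps)/(Kn/Ks) and Fisher's exact test on the 2x2 table of counts, can be sketched with the standard library only. The function names are hypothetical, and since the slides report ratios rather than raw counts, any counts passed to the test are the caller's own data.

```python
from math import comb

def mk_alpha(pn: float, ps: float, kn: float, ks: float) -> float:
    """Estimated fraction of substitutions fixed by positive selection."""
    return 1.0 - (pn / ps) / (kn / ks)

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

Plugging in the minor-isoform ratios from the table, `mk_alpha(1.41, 1.0, 1.89, 1.0)` gives about 0.25. For the test, the table would be laid out as [[Pn, Ps], [Kn, Ks]] using counts.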

36
What does alternative splicing do to proteins?
  • SwissProt proteins
  • PFAM domains
  • SwissProt feature tables

37
Alternative splicing avoids disrupting domains
(and non-domain units)
Control: fix the domain structure, randomly place
alternative regions
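
The control described here is a randomization: keep the domains where they are and drop an alternative region of the same length at a random position. A minimal sketch follows. The definition of "disrupts" (overlapping a domain without containing it entirely) is an assumption made for illustration, as are all names and parameters.

```python
import random

def disrupts(region, domain) -> bool:
    """True if region overlaps domain but does not contain it entirely.

    Both arguments are (start, end) half-open intervals in residues.
    """
    r0, r1 = region
    d0, d1 = domain
    overlaps = r0 < d1 and d0 < r1
    contains = r0 <= d0 and d1 <= r1
    return overlaps and not contains

def expected_disruption_rate(protein_len, domains, region_len,
                             trials=10000, seed=0) -> float:
    """Fraction of random placements that disrupt at least one domain."""
    if region_len > protein_len:
        raise ValueError("region longer than protein")
    rng = random.Random(seed)  # seeded for reproducibility
    hits = 0
    for _ in range(trials):
        start = rng.randint(0, protein_len - region_len)  # inclusive bounds
        region = (start, start + region_len)
        hits += any(disrupts(region, d) for d in domains)
    return hits / trials
```

Comparing the observed disruption rate of real alternative regions with this expected rate indicates whether alternative splicing avoids cutting domains more often than chance placement would.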
38
and this is not simply a consequence of the
(disputed) exon-domain correlation
39
Positive selection towards domain shuffling (not
simply avoidance of disrupting domains)
40
Short (<50 aa) alternative splicing events within
domains target protein functional sites
(Figure, panel c: expected vs. observed numbers of FT positions affected / unaffected and Prosite patterns affected / unaffected)
41
An attempt at integration
  • AS is often species-specific
  • young AS isoforms are often minor and
    tissue-specific
  • but still functional
  • although species-specific isoforms may result
    from aberrant splicing
  • AS regions show evidence for decreased negative
    selection
  • excess non-synonymous codon substitutions
  • AS regions show evidence for positive selection
  • excess fixation of non-synonymous substitutions
    (compared to SNPs)
  • AS tends to shuffle domains and target functional
    sites in proteins
  • Thus AS may serve as a testing ground for new
    functions without sacrificing old ones

42
What next?
  • AS in one species, constitutive splicing in
    another (data from microarrays)
  • Changes in inclusion rates
  • Evolution of regulation of AS
  • Control for
  • functionality: translated / NMD-inducing
    (frameshifts, stop codons)
  • exon inclusion (or site choice) level: major /
    minor isoform
  • tissue specificity pattern (?)
  • type of alternative 1: N-terminal / internal /
    C-terminal
  • type of alternative 2: cassette and mutually
    exclusive exon, alternative site

43
Acknowledgements
  • Discussions
  • Eugene Koonin (NCBI)
  • Igor Rogozin (NCBI)
  • Vsevolod Makeev (GosNIIGenetika)
  • Dmitry Petrov (Stanford)
  • Dmitry Frishman (GSF, TUM)
  • Data
  • King Jordan (NCBI)
  • Support
  • Howard Hughes Medical Institute
  • INTAS
  • Russian Academy of Sciences (program Molecular
    and Cellular Biology)
  • Russian Foundation of Basic Research

44
Authors
  • Andrei Mironov (Moscow State University)
  • Ramil Nurtdinov (Moscow State University):
    human/mouse+rat/dog
  • Dmitry Malko (GosNIIGenetika, Moscow):
    drosophila/mosquito
  • Ekaterina Ermakova (Moscow State University,
    IITP): Kn/Ks
  • Vasily Ramensky (Institute of Molecular Biology,
    Moscow): SNPs, McDonald-Kreitman test
  • Evgenia Kriventseva (now at U. of Geneva) and
    Shamil Sunyaev (now at Harvard U. Medical
    School): protein structure
  • Irena Artamonova (Inst. of General Genetics,
    Moscow): human/mouse, plots, MAGE-A
  • Alexei Neverov (GosNIIGenetika, Moscow):
    functionality of isoforms

45
Bonus track: conserved secondary structures
regulating (alternative) splicing in the
Drosophila spp.
  • 50 000 introns
  • 17 alternative, 2 with alt. polyA signals
  • >95% of D. melanogaster introns mapped to at least
    7 of 12 other Drosophila genomes
  • Search for conserved complementary words at
    intron termini (within 150 nt. of intron
    boundaries), then align
  • Restrictive search => 200 candidates
  • 6 tested in experiment (3 const., 3 alt.). All 3
    alt. ones confirmed
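
The first step of the search, finding complementary words near the two ends of an intron, could look like the sketch below. The word length `k` and the function names are hypothetical, only the 150-nt window comes from the slide, and the cross-species conservation check and alignment are not covered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def complementary_words(intron: str, k: int = 8, window: int = 150):
    """Find k-mers near the 5' end that can pair with k-mers near the 3' end.

    Returns (i, j, word) triples: `word` starts at position i within the
    first `window` nt, and its reverse complement starts at position j
    within the last `window` nt, without overlapping it.
    """
    intron = intron.upper()
    head = intron[:window]
    tail = intron[-window:]
    offset = len(intron) - len(tail)
    # Index every k-mer of the 3' window by its position in the intron.
    tail_index = {}
    for j in range(len(tail) - k + 1):
        tail_index.setdefault(tail[j:j + k], []).append(offset + j)
    hits = []
    for i in range(len(head) - k + 1):
        word = head[i:i + k]
        for j in tail_index.get(reverse_complement(word), []):
            if j >= i + k:  # the two words must not overlap
                hits.append((i, j, word))
    return hits
```

Exact matching is the simplest choice; a realistic search would also have to allow G-U pairs and mismatches, which this sketch leaves out.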

46
CG33298 (phospholipid-translocating ATPase):
alternative donor sites
47
Atrophin (histone deacetylase): alternative
acceptor sites
48
Nmnat (nicotinamide mononucleotide
adenylyltransferase): alternative splicing and
polyadenylation
49
Less restrictive search => many more candidates
50
Properties of regulated introns
  • Often alternative
  • Longer than usual
  • Overrepresented in genes linked to development

51
Authors
  • Andrei Mironov (idea)
  • Dmitry Pervouchine (bioinformatics)
  • Veronica Raker, Center for Genome Regulation,
    Barcelona (experiment)
  • Juan Valcarcel, Center for Genome Regulation,
    Barcelona (advice)
  • Mikhail Gelfand (general pessimism)