Title: Whole genome transcriptome variation in Arabidopsis thaliana
1Whole genome transcriptome variation in Arabidopsis thaliana
Xu Zhang, Borevitz Lab
2Arabidopsis thaliana has adapted to highly
variable environments
3Transcription and splicing
[Diagram: chromosomal DNA (Exon 1, Intron 1, Exon 2, Intron 2, Exon 3) is transcribed into nuclear RNA; RNA splicing then produces messenger RNA, shown in two forms: Exon 1, Exon 2, Exon 3 and Exon 1, Exon 3]
4Whole genome tiling array
- High density and resolution: 1.6M unique probes at 35bp spacing
- Without bias toward known transcripts
Genetic hybridization polymorphisms could affect
the estimation of gene expression
5The experiment
Col ♀ x Col ♂
Van ♀ x Van ♂
Col ♀ x Van ♂
Van ♀ x Col ♂
- parental strains and reciprocal F1 hybrids
- mRNA (from total RNA) and genomic DNA
6Double-stranded random labeling
[Diagram: poly(A) RNA (AAAAA), random reverse transcription, double-stranded cDNA, random priming]
7Outlines
- Sequence polymorphisms
- Gene expression variation
- Splicing variation
- A functional network of differentially spliced genes
- HMM for de novo transcription profiling
8Outlines
- Sequence polymorphisms
- Gene expression variation
- Splicing variation
- A functional network of differentially spliced genes
- HMM for de novo transcription profiling
9Single Feature Polymorphisms and indels
[Figure: probe-level examples of SFPs and of a deletion or duplication in Van]
10Sequence polymorphisms
             FDR      Col > Van (c)   Van > Col (c)   Total
SFPs (a)     11.82    135769          14934           150703
             7.66     126443          9479            135922
             5.22     118381          6662            125043
             3.88     110861          4979            115840
             3.15     104115          3820            107935

             Model selection   Deletion   Duplication   Total
Indels (b)   BIC (d)           518        22            540
             AIC (e)           1645       136           1781
SFPs and indels (>200 bp) were removed before gene
expression analysis
11Deletions vs duplications
12Distribution of indels along chromosomes
13Outlines
- Sequence polymorphisms
- Gene expression variation
- Splicing variation
- A functional network of differentially spliced genes
- HMM for de novo transcription profiling
14Additive, dominant and maternal effects of gene
expression
15The linear model
Gene probe intensity = additive + dominant + maternal + ε
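The linear model can be illustrated with an ordinary least-squares fit for a single gene. This is only a minimal sketch: the contrast coding (additive: Col = +1, Van = -1, F1 = 0; dominant: F1 = 1; maternal: +1 or -1 for the two reciprocal F1s), the genotype labels `ColxVan` and `VanxCol` (mother listed first), and the function name `fit_gene_effects` are assumptions for illustration. The slides do not state the coding that was actually used.

```python
import numpy as np

# Assumed contrast coding per genotype: (additive, dominant, maternal).
# This is one common choice, not necessarily the one used in the talk.
CODING = {
    "Col":     (1.0, 0.0, 0.0),
    "Van":     (-1.0, 0.0, 0.0),
    "ColxVan": (0.0, 1.0, 1.0),    # F1 hybrid, Col mother (assumed label)
    "VanxCol": (0.0, 1.0, -1.0),   # F1 hybrid, Van mother (assumed label)
}

def fit_gene_effects(genotypes, intensities):
    """Least-squares fit of intensity = mean + additive + dominant + maternal + error."""
    # Design matrix: intercept column followed by the three effect columns.
    X = np.array([(1.0,) + CODING[g] for g in genotypes])
    y = np.asarray(intensities, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(("intercept", "additive", "dominant", "maternal"), coef))

# Noise-free toy data built from mean 8.0, additive 1.0, dominant 0.5, maternal 0.2.
genotypes = ["Col", "Col", "Van", "Van", "ColxVan", "ColxVan", "VanxCol", "VanxCol"]
intensities = [9.0, 9.0, 7.0, 7.0, 8.7, 8.7, 8.3, 8.3]
effects = fit_gene_effects(genotypes, intensities)
```

With four genotype groups and four parameters the model is saturated, so on noise-free data the fit recovers the simulated effects exactly; on real arrays the replicate probes and samples supply the error term.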
16Gene expression variation between genotypes
Effect     Delta (a)   Sig+ (b)   Sig- (c)   Total   False (d)   FDR (%)
additive   0.5         4911       3967       8878    901         10.15
additive   1           2674       1736       4410    215         4.88
additive   1.5         1626       923        2549    70          2.76
additive   1.8         1249       676        1925    39          2.03
additive   2.5         690        334        1024    13          1.24
dominant   0.5         1511       3190       4701    767         16.31
dominant   1           405        1521       1926    186         9.65
dominant   1.5         157        811        968     67          6.93
dominant   1.8         92         575        667     40          5.99
dominant   2.5         41         270        311     14          4.65
maternal   0.5         5998       95         6093    735         12.06
maternal   1           2046       8          2054    151         7.37
maternal   1.5         480        0          480     49          10.29
maternal   1.8         163        0          163     28          17.33
maternal   2.5         41         0          41      9           22.84
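The FDR column in this table appears consistent with the number of false calls divided by the total number of calls, expressed as a percentage (for example 901 / 8878 is about 10.15). The short sketch below shows that arithmetic; the function name `fdr_percent` is hypothetical, and small deviations in some rows presumably come from the False count being rounded on the slide.

```python
def fdr_percent(false_calls, total_calls):
    """Estimated false discovery rate as a percentage of all calls."""
    if total_calls <= 0:
        raise ValueError("total_calls must be positive")
    return 100.0 * false_calls / total_calls

# Two additive-effect rows from the table: (False, Total).
fdr_delta_05 = fdr_percent(901, 8878)
fdr_delta_1 = fdr_percent(215, 4410)
```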
17The pattern of gene expression inheritance
[Figure: mean gene intensity in Col, Van, F1v and F1c, with genes grouped by inheritance pattern: paternal, maternal, Col dominant, Van dominant, F1c dominant, F1v dominant, over dominant]
18The pattern of gene expression inheritance
19Enrichment in GO functional categories
GO enrichment for additive, dominant and maternal
effect genes
Defense response genes are highly expressed in F1
hybrid lines, while many growth-related pathways
are down-regulated
20Outlines
- Sequence polymorphisms
- Gene expression variation
- Splicing variation
- A functional network of differentially spliced genes
- HMM for de novo transcription profiling
21Default expression status of exon and intron
- Exons: correction for gene expression
  - corrected by gene mean
  - corrected by gene median
  - splicing index (Mean_exon / Mean_gene)
- Introns: direct comparison
Exon/intron probe intensity = additive + dominant + maternal + ε
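The three exon corrections listed above can be sketched as follows. This is an illustration under stated assumptions: intensities are taken to be on a log scale so that subtracting the gene mean or median is meaningful, the splicing index is the ratio of the exon mean to the gene mean as written on the slide, and the function name `exon_scores` is hypothetical.

```python
import numpy as np

def exon_scores(exon_probes, gene_probes):
    """Return the three exon-level corrections for one exon of one gene.

    exon_probes : intensities of probes inside the exon
    gene_probes : intensities of all exon probes of the gene
    """
    exon = np.asarray(exon_probes, dtype=float)
    gene = np.asarray(gene_probes, dtype=float)
    return {
        # Probe intensities with the gene-level mean removed.
        "mean_corrected": exon - gene.mean(),
        # Probe intensities with the gene-level median removed.
        "median_corrected": exon - np.median(gene),
        # Splicing index: mean exon intensity over mean gene intensity.
        "splicing_index": exon.mean() / gene.mean(),
    }

# Toy gene with three exons of two probes each; the last exon is weaker.
gene = [8.0, 8.0, 10.0, 10.0, 6.0, 6.0]
scores = exon_scores([6.0, 6.0], gene)
```

The corrected values (or the index) would then replace raw intensity as the response in the linear model, so that a genotype effect reflects the exon relative to its gene rather than overall expression.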
22Differential exon splicing
Correction                 Delta (a)   Sig+ (b)   Sig- (c)   Total   False (d)   FDR (%)
Corrected by gene mean     0.3         287        190        477     559         117
Corrected by gene mean     0.4         177        129        306     205         67.0
Corrected by gene mean     0.5         127        109        236     97          41.0
Corrected by gene mean     0.6         92         86         178     55          30.8
Corrected by gene mean     0.7         77         69         146     34          23.4
Corrected by gene median   0.3         523        280        803     556         69.2
Corrected by gene median   0.4         328        172        500     203         40.6
Corrected by gene median   0.5         223        120        343     96          28.0
Corrected by gene median   0.6         154        76         230     54          23.5
Corrected by gene median   0.7         123        52         175     34          19.3
Splicing index             0.3         407        235        642     425         66.0
Splicing index             0.4         292        175        467     132         28.0
Splicing index             0.5         230        143        373     50          13.0
Splicing index             0.6         178        104        282     21          7.50
Splicing index             0.7         148        86         234     10          4.30
Exon probe intensity = additive + dominant + maternal + ε
23Differential intron splicing
Delta (a)   Sig+ (b)   Sig- (c)   Total   False (d)   FDR (%)
0.3         561        1034       1595    332         20.8
0.4         405        523        928     85          9.17
0.5         316        352        668     28          4.26
0.6         239        220        459     12          2.61
0.7         202        155        357     7           1.91
0.8         176        120        296     5           1.53
Intron probe intensity = additive + dominant + maternal + ε
24Differential exon splicing is predominantly
additive in F1 hybrids
25Some dominant effect in differential intron
splicing in F1 hybrids
26Comparison for enrichment in known alternatively
spliced exons
Method                              Row               Threshold 1             Threshold 2
                                                      Called   Not called     Called   Not called
Corrected by gene mean              Known             28       991            7        1012
                                    Not known         397      55145          90       55452
                                    Fold enrichment   3.92                    4.26
                                    p-value           5.97E-09                1.90E-03
Corrected by gene median polish     Known             24       995            6        1013
                                    Not known         430      55112          85       55457
                                    Fold enrichment   3.09                    3.86
                                    p-value           3.60E-06                6.14E-03
Splicing index                      Known             24       1093           5        1112
                                    Not known         537      72328          88       72777
                                    Fold enrichment   2.96                    3.72
                                    p-value           6.84E-06                1.36E-02
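The fold enrichment values in this table match the odds ratio of each 2x2 table of known versus not known exons against called versus not called. The sketch below reproduces that arithmetic; the function name `odds_ratio` is hypothetical, and the test behind the p-values is not stated on the slide, so it is not reproduced here.

```python
def odds_ratio(known_called, known_not_called, unknown_called, unknown_not_called):
    """Odds ratio of a 2x2 table: (a * d) / (b * c)."""
    return (known_called * unknown_not_called) / (known_not_called * unknown_called)

# Counts taken from the table above.
mean_t1 = odds_ratio(28, 991, 397, 55145)     # corrected by gene mean, threshold 1
mean_t2 = odds_ratio(7, 1012, 90, 55452)      # corrected by gene mean, threshold 2
index_t1 = odds_ratio(24, 1093, 537, 72328)   # splicing index, threshold 1
```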
27Experimentally determined FDR for differential
splicing
Test                          # of significant calls   Estimated FDR (%)   # tested   # confirmed   Experimental FDR (%)
Exon (corrected by mean)      477                      117                 45         22            51.1
Exon (corrected by mean)      111                      20.8                18         10            44.4
Exon (corrected by median)    500                      40.6                40         21            47.5
Exon (corrected by median)    103                      15.60               17         10            41.2
Exon (splicing index)         642                      66.0                50         23            54.0
Exon (splicing index)         102                      1.00                20         10            50.0
Intron                        459                      2.61                65         38            41.5
Intron                        195                      1.15                58         33            43.1
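The experimental FDR values are consistent with the fraction of tested calls that were not confirmed, as a percentage (for example, 45 tested and 22 confirmed gives 23 / 45, about 51.1). A small sketch of that calculation, with a hypothetical function name:

```python
def experimental_fdr_percent(tested, confirmed):
    """Percentage of experimentally tested calls that failed to confirm."""
    if tested <= 0 or not 0 <= confirmed <= tested:
        raise ValueError("need tested > 0 and 0 <= confirmed <= tested")
    return 100.0 * (tested - confirmed) / tested

exon_mean_fdr = experimental_fdr_percent(45, 22)   # exon, corrected by mean
intron_fdr = experimental_fdr_percent(65, 38)      # intron
```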
28Outlines
- Sequence polymorphisms
- Gene expression variation
- Splicing variation
- A functional network of differentially spliced genes
- HMM for de novo transcription profiling
29Enrichment of differentially spliced genes in
chloroplast thylakoid
[Figure: enrichment of differentially spliced genes]
30Chloroplast thylakoid
31Photosynthesis related genes
Differentially spliced genes that are located
in the chloroplast thylakoid
AT5G38660 APE1 (Acclimation of Photosynthesis to
Environment): mutant has altered acclimation
responses
32Splicing regulators tend to be differentially
spliced
AT1G07350 transformer serine/arginine-rich ribonucleoprotein, putative
AT1G55310 SC35-like splicing factor, 33 kD (SCL33)
AT2G29210 splicing factor PWI domain-containing protein
AT5G04430 KH domain-containing protein NOVA, putative
33Outlines
- Sequence polymorphisms
- Gene expression variation
- Splicing variation
- A functional network of differentially spliced genes
- HMM for de novo transcription profiling
34Generalized tiling array HMM
(by Jake Byrnes)
- 3-state HMM
- Discrete distribution for emission probability
- Transition probability accounts for probe spacing
- Baum-Welch parameter estimation
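As a rough illustration of the components listed above, the sketch below computes posterior state probabilities for a 3-state HMM with discrete emissions using the scaled forward-backward recursions, which form the expectation step of Baum-Welch. Everything specific here is an assumption for illustration: the state meanings (0 = Col > Van, 1 = no difference, 2 = Van > Col), the three emission symbols, and all parameter values. The transition matrix is fixed and does not model probe spacing, and no parameter re-estimation is performed, so this is not the model used in the talk.

```python
import numpy as np

def posterior_states(obs, start, trans, emit):
    """Scaled forward-backward for a discrete-emission HMM.

    obs   : sequence of observed symbol indices, one per probe
    start : (K,) initial state probabilities
    trans : (K, K) transition matrix, rows sum to 1
    emit  : (K, M) emission matrix, rows sum to 1
    Returns a (T, K) array of posterior state probabilities.
    """
    obs = np.asarray(obs)
    T, K = len(obs), len(start)
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    scale = np.zeros(T)

    # Forward pass, rescaling at each probe to avoid underflow.
    alpha[0] = start * emit[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * emit[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Backward pass, using the same scaling factors.
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (emit[:, obs[t + 1]] * beta[t + 1]) / scale[t + 1]

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

# Hypothetical parameters, chosen only to make the example readable.
start = np.array([0.1, 0.8, 0.1])
trans = np.array([[0.90, 0.05, 0.05],
                  [0.05, 0.90, 0.05],
                  [0.05, 0.05, 0.90]])
emit = np.array([[0.8, 0.1, 0.1],    # state 0 mostly emits symbol 0
                 [0.1, 0.8, 0.1],    # state 1 mostly emits symbol 1
                 [0.1, 0.1, 0.8]])   # state 2 mostly emits symbol 2
obs = [1] * 5 + [2] * 6 + [1] * 5 + [0] * 6 + [1] * 5
post = posterior_states(obs, start, trans, emit)
calls = post.argmax(axis=1)   # most probable state at each probe
```

A spacing-aware variant would replace the single `trans` matrix with one matrix per probe gap; how that dependence was parameterized is not given on the slide.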
35An example of HMM detected segments
36A good model also needs a better array
- Array density is not enough to distinguish exon/intron boundaries
- Probe quality
37Differential segments: >3 continuous probes with posterior probability >0.99.
Differentially expressed genes: annotated genes for which ≥33% of their probes reside within the observed differential segments.
Differentially spliced genes: annotated genes for which <33% of probes reside within the differential segment, or annotated genes containing 2 differential segments with different states.
Novel gene boundaries: differential segments with >5 probes extending beyond an annotated gene boundary.
Novel transcripts: differential segments with >5 probes and outside any annotated gene boundary.
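The segment definition on this slide (more than 3 continuous probes with posterior probability above 0.99) can be sketched as a simple run-finding pass over per-probe posteriors for one differential state. The function name `call_segments`, its parameter names, and the use of inclusive index pairs are assumptions for illustration.

```python
def call_segments(posteriors, min_prob=0.99, min_probes=3):
    """Return (start, end) inclusive index pairs for runs of more than
    `min_probes` consecutive probes whose posterior exceeds `min_prob`."""
    segments = []
    start = None
    for i, p in enumerate(posteriors):
        if p > min_prob:
            if start is None:
                start = i            # a new run begins here
        else:
            if start is not None and i - start > min_probes:
                segments.append((start, i - 1))
            start = None             # run ended (or none was open)
    # Close a run that extends to the last probe.
    if start is not None and len(posteriors) - start > min_probes:
        segments.append((start, len(posteriors) - 1))
    return segments

# A run of 4 probes qualifies; a run of exactly 3 does not.
posteriors = [0.5, 0.995, 0.999, 0.999, 0.999, 0.2, 0.999, 0.999, 0.999, 0.1]
segments = call_segments(posteriors)
```

The gene-level categories would then follow by intersecting these segments with annotated gene coordinates and counting the fraction of each gene's probes that fall inside a segment.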
38Length distribution of segments called by HMM
39Comparison of annotation-based analysis and HMM
Method       Category                             Col > Van   Van > Col   Total
Annotation   differential expression (a)          1626        923         2549
Annotation   differential exonic splicing (b)     287         190         477
Annotation   differential intronic splicing (c)   202         155         357
HMM          differential expression (d)          1654        962         2616
HMM          differential splicing (e)            874         530         1404
HMM          un-annotated transcript (f)          34          42          76
HMM          un-annotated 5' (g)                  30          19          49
HMM          un-annotated 3' (g)                  28          8           36
40Comparison of annotation-based analysis and HMM
Columns: HMM calls; rows: annotation-based calls
                                      Expression    Expression    Splicing      Splicing
                         Total        (Col > Van)   (Van > Col)   (Col > Van)   (Van > Col)
HMM total                             1654          962           921           550
Expression (Col > Van)   1626         1270                        225
Expression (Van > Col)   923                        727                         132
Splicing (Col > Van)     441          181                         47
Splicing (Van > Col)     300                        90                          38
41Acknowledgements
Justin Borevitz, Yan Li, Christos Noutsos, Geoff Morris, Andy Cal
Jake Byrnes, Josh Rest