Whole genome transcriptome variation in Arabidopsis thaliana

1
Whole genome transcriptome variation in
Arabidopsis thaliana Xu Zhang Borevitz Lab
2
Arabidopsis thaliana has adapted to highly
variable environments
3
Transcription and splicing
[Figure: chromosomal DNA (exon 1, intron 1, exon 2, intron 2, exon 3) is transcribed into nuclear RNA; RNA splicing yields messenger RNA containing exons 1, 2 and 3, or exons 1 and 3 only]
4
Whole genome tiling array
  • High density and resolution: 1.6M unique probes
    at 35 bp spacing
  • Without bias toward known transcripts

Genetic hybridization polymorphisms could affect
the estimation of gene expression
5
The experiment
Col ♀ x Col ♂
Van ♀ x Van ♂
Col ♀ x Van ♂
Van ♀ x Col ♂
  • Parental strains and reciprocal F1 hybrids
  • mRNA from total RNA, and genomic DNA

6
Double-stranded random labeling
[Figure: poly(A) RNA (AAAAA) → random reverse transcription → double-stranded cDNA → random priming]
7
Outline
  • Sequence polymorphisms
  • Gene expression variation
  • Splicing variation
  • A functional network of differentially spliced
    genes
  • HMM for de novo transcription profiling

9
Single Feature Polymorphisms (SFPs) and indels
[Figure: probe-level examples of SFPs and of a deletion or duplication in Van]
10
Sequence polymorphisms
SFPs
FDR (%)   Col > Van   Van > Col   Total
11.82     135769      14934       150703
7.66      126443      9479        135922
5.22      118381      6662        125043
3.88      110861      4979        115840
3.15      104115      3820        107935

Indels
Model selection   Deletion   Duplication   Total
BIC               518        22            540
AIC               1645       136           1781

SFPs and indels (>200 bp) were removed before gene
expression analysis
11
Deletions vs duplications
12
Distribution of indels along chromosomes
14
Additive, dominant and maternal effects on gene
expression
15
The linear model
Gene probe intensity = additive + dominant + maternal + ε
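
As a rough illustration of how the three effects in the linear model could be separated, the sketch below decomposes the four genotype means of a single gene. The contrast coding (additive as half the parental difference, dominant as the F1 mean minus the mid-parent value, maternal as half the difference between the reciprocal F1s), the function name `decompose_effects`, and the intensity values are all assumptions made for illustration; the slides do not state the coding that was used.

```python
from statistics import mean


def decompose_effects(col, van, f1_col_mother, f1_van_mother):
    """Split genotype means into additive, dominant and maternal effects.

    Each argument is a list of replicate log intensities for one gene.
    The contrast coding is an assumption, not taken from the slides.
    """
    m_col, m_van = mean(col), mean(van)
    m_f1c, m_f1v = mean(f1_col_mother), mean(f1_van_mother)
    mid_parent = (m_col + m_van) / 2
    return {
        "additive": (m_col - m_van) / 2,               # half the parental difference
        "dominant": (m_f1c + m_f1v) / 2 - mid_parent,  # F1 deviation from mid-parent
        "maternal": (m_f1c - m_f1v) / 2,               # reciprocal F1 difference
    }


# Made-up log intensities for one hypothetical gene.
effects = decompose_effects(
    col=[10.0, 10.2],
    van=[8.0, 8.2],
    f1_col_mother=[9.6, 9.8],
    f1_van_mother=[9.2, 9.4],
)
for name, value in effects.items():
    print(f"{name}: {value:.2f}")
```

With replicated data, a per-probe least-squares fit would also yield the residual term ε; this sketch works on group means only.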
16
Gene expression variation between genotypes

Effect     Delta   Sig+   Sig-   Total   False   FDR (%)
additive   0.5     4911   3967   8878    901     10.15
additive   1       2674   1736   4410    215     4.88
additive   1.5     1626   923    2549    70      2.76
additive   1.8     1249   676    1925    39      2.03
additive   2.5     690    334    1024    13      1.24
dominant   0.5     1511   3190   4701    767     16.31
dominant   1       405    1521   1926    186     9.65
dominant   1.5     157    811    968     67      6.93
dominant   1.8     92     575    667     40      5.99
dominant   2.5     41     270    311     14      4.65
maternal   0.5     5998   95     6093    735     12.06
maternal   1       2046   8      2054    151     7.37
maternal   1.5     480    0      480     49      10.29
maternal   1.8     163    0      163     28      17.33
maternal   2.5     41     0      41      9       22.84
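
The FDR column in the table above appears to be the expected number of false calls divided by the total number of calls, expressed as a percentage. The short check below recomputes it for a few rows; the function name `fdr_percent` is hypothetical, and small differences from the reported values are expected because the False column is rounded to whole calls.

```python
# (effect, delta, total calls, expected false calls, reported FDR in %)
rows = [
    ("additive", 0.5, 8878, 901, 10.15),
    ("additive", 1.0, 4410, 215, 4.88),
    ("dominant", 0.5, 4701, 767, 16.31),
    ("maternal", 0.5, 6093, 735, 12.06),
]


def fdr_percent(false_calls, total_calls):
    """False discovery rate as a percentage of all significant calls."""
    return 100.0 * false_calls / total_calls


for effect, delta, total, false_calls, reported in rows:
    computed = fdr_percent(false_calls, total)
    print(f"{effect} delta={delta}: computed {computed:.2f}%, reported {reported}%")
```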
17
The pattern of gene expression inheritance
[Figure: mean gene intensity for Col, Van, F1v and F1c; inheritance patterns shown: paternal, maternal, Col dominant, over dominant, F1v dominant, F1c dominant, Van dominant]
18
The pattern of gene expression inheritance
19
Enrichment in GO functional categories
GO enrichment for additive, dominant and maternal
effect genes
Defense response genes are highly expressed in F1
hybrid lines, while many growth-related pathways
are down-regulated
21
Default expression status of exon and intron
  • Exons: correction for gene expression
      • corrected by gene mean
      • corrected by gene median
      • splicing index (Mean_exon / Mean_gene)
  • Introns: direct comparison

Exon/intron probe intensity = additive + dominant + maternal + ε
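
The three exon-level corrections listed above can be sketched as follows. This is only an illustration: the function name `exon_measures` is hypothetical, subtracting the gene mean or median assumes log-scale intensities, and the probe values are made up.

```python
from statistics import mean, median


def exon_measures(exon_probes, gene_probes):
    """Three ways to express exon signal relative to its gene.

    exon_probes: intensities of the probes inside one exon.
    gene_probes: intensities of all exon probes of the same gene.
    Subtraction assumes log-scale values (an assumption of this sketch).
    """
    exon_mean = mean(exon_probes)
    return {
        "corrected_by_gene_mean": exon_mean - mean(gene_probes),
        "corrected_by_gene_median": exon_mean - median(gene_probes),
        "splicing_index": exon_mean / mean(gene_probes),
    }


# Hypothetical gene with one weakly expressed exon (the two 6.x probes).
gene = [8.0, 8.2, 7.8, 6.0, 6.2, 8.0]
measures = exon_measures(exon_probes=[6.0, 6.2], gene_probes=gene)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Each measure would then be used as the response in the linear model, so that a genotype effect on the corrected exon signal points to differential splicing rather than a change in expression of the whole gene.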
22
Differential exon splicing
Method                     Delta   Sig+   Sig-   Total   False   FDR (%)
Corrected by gene mean     0.3     287    190    477     559     117
Corrected by gene mean     0.4     177    129    306     205     67.0
Corrected by gene mean     0.5     127    109    236     97      41.0
Corrected by gene mean     0.6     92     86     178     55      30.8
Corrected by gene mean     0.7     77     69     146     34      23.4
Corrected by gene median   0.3     523    280    803     556     69.2
Corrected by gene median   0.4     328    172    500     203     40.6
Corrected by gene median   0.5     223    120    343     96      28.0
Corrected by gene median   0.6     154    76     230     54      23.5
Corrected by gene median   0.7     123    52     175     34      19.3
Splicing index             0.3     407    235    642     425     66.0
Splicing index             0.4     292    175    467     132     28.0
Splicing index             0.5     230    143    373     50      13.0
Splicing index             0.6     178    104    282     21      7.50
Splicing index             0.7     148    86     234     10      4.30

Exon probe intensity = additive + dominant + maternal + ε
23
Differential intron splicing
Delta   Sig+   Sig-   Total   False   FDR (%)
0.3     561    1034   1595    332     20.8
0.4     405    523    928     85      9.17
0.5     316    352    668     28      4.26
0.6     239    220    459     12      2.61
0.7     202    155    357     7       1.91
0.8     176    120    296     5       1.53

Intron probe intensity = additive + dominant + maternal + ε
24
Differential exon splicing is predominantly
additive in F1 hybrids
25
Some dominant effect in differential intron
splicing in F1 hybrids
26
Comparison for enrichment in known alternatively
spliced exons
                                                  Threshold 1             Threshold 2
Method                           Row               Called     Not called   Called     Not called
Corrected by gene mean           Known             28         991          7          1012
                                 Not known         397        55145        90         55452
                                 Fold enrichment   3.92                    4.26
                                 p-value           5.97E-09                1.90E-03
Corrected by gene median polish  Known             24         995          6          1013
                                 Not known         430        55112        85         55457
                                 Fold enrichment   3.09                    3.86
                                 p-value           3.60E-06                6.14E-03
Splicing index                   Known             24         1093         5          1112
                                 Not known         537        72328        88         72777
                                 Fold enrichment   2.96                    3.72
                                 p-value           6.84E-06                1.36E-02
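
The reported fold enrichments agree with the odds ratio of each 2x2 table (known vs. not known, called vs. not called). The check below recomputes them from the counts above; the function name `odds_ratio` is hypothetical, and the p-values are not reproduced here because the slides do not name the test that produced them.

```python
def odds_ratio(known_called, known_not_called, unknown_called, unknown_not_called):
    """Odds ratio of a 2x2 table: (a * d) / (b * c)."""
    return (known_called * unknown_not_called) / (known_not_called * unknown_called)


# (known called, known not called, not-known called, not-known not called, reported)
tables = {
    "gene mean, threshold 1": (28, 991, 397, 55145, 3.92),
    "gene mean, threshold 2": (7, 1012, 90, 55452, 4.26),
    "gene median polish, threshold 1": (24, 995, 430, 55112, 3.09),
    "gene median polish, threshold 2": (6, 1013, 85, 55457, 3.86),
    "splicing index, threshold 1": (24, 1093, 537, 72328, 2.96),
    "splicing index, threshold 2": (5, 1112, 88, 72777, 3.72),
}

for name, (a, b, c, d, reported) in tables.items():
    print(f"{name}: computed {odds_ratio(a, b, c, d):.2f}, reported {reported}")
```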
27
Experimentally determined FDR for differential
splicing
Method                       # of significant calls   Estimated FDR (%)   # tested   # confirmed   Experimental FDR (%)
Exon (corrected by mean)     477                      117                 45         22            51.1
Exon (corrected by mean)     111                      20.8                18         10            44.4
Exon (corrected by median)   500                      40.6                40         21            47.5
Exon (corrected by median)   103                      15.60               17         10            41.2
Exon (splicing index)        642                      66.0                50         23            54.0
Exon (splicing index)        102                      1.00                20         10            50.0
Intron                       459                      2.61                65         38            41.5
Intron                       195                      1.15                58         33            43.1
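
The experimental FDR values above are consistent with the fraction of tested calls that were not confirmed. A minimal check, with the hypothetical helper `experimental_fdr`, using the tested and confirmed counts from the table:

```python
def experimental_fdr(tested, confirmed):
    """Percentage of experimentally tested calls that failed to confirm."""
    return 100.0 * (tested - confirmed) / tested


# (method, tested, confirmed, reported experimental FDR in %)
validation = [
    ("Exon (corrected by mean)", 45, 22, 51.1),
    ("Exon (corrected by mean)", 18, 10, 44.4),
    ("Exon (corrected by median)", 40, 21, 47.5),
    ("Exon (corrected by median)", 17, 10, 41.2),
    ("Exon (splicing index)", 50, 23, 54.0),
    ("Exon (splicing index)", 20, 10, 50.0),
    ("Intron", 65, 38, 41.5),
    ("Intron", 58, 33, 43.1),
]

for method, tested, confirmed, reported in validation:
    computed = experimental_fdr(tested, confirmed)
    print(f"{method}: computed {computed:.1f}%, reported {reported}%")
```

Note that the experimental FDR stays between roughly 41% and 54% regardless of the estimated FDR, which ranges from 1% to 117% across the same rows.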
29
Enrichment of differentially spliced genes in
chloroplast thylakoid
[Figure: enrichment of differentially spliced genes]
30
Chloroplast thylakoid
31
Photosynthesis related genes
Differentially spliced genes that are located
in the chloroplast thylakoid
AT5G38660, APE1 (Acclimation of Photosynthesis to
Environment): the mutant has altered acclimation
responses
32
Splicing regulators tend to be differentially
spliced
AT1G07350 transformer serine/arginine-rich ribonucleoprotein, putative
AT1G55310 SC35-like splicing factor, 33 kD (SCL33)
AT2G29210 splicing factor PWI domain-containing protein
AT5G04430 KH domain-containing protein NOVA, putative
34
Generalized tiling array HMM
(by Jake Byrnes)
  • 3-state HMM
  • Discrete distribution for emission probability
  • Transition probability accounts for probe spacing
  • Baum-Welch parameter estimation
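
The ingredients listed above can be sketched in a few lines. This is not the generalized tiling array HMM itself: the state meanings, the emission table `EMIT`, the decay rule in `transition` (state memory fades as the gap between probes grows), and all parameter values are assumptions for illustration, and the parameters are fixed rather than estimated with Baum-Welch.

```python
# States (assumed): 0 = Van > Col, 1 = no difference, 2 = Col > Van.
# Symbols (assumed): 0 = negative, 1 = near zero, 2 = positive probe statistic.
EMIT = [
    [0.70, 0.20, 0.10],
    [0.15, 0.70, 0.15],
    [0.10, 0.20, 0.70],
]
PRIOR = (1 / 3, 1 / 3, 1 / 3)


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def transition(gap_bp, stay=0.9, base_gap=35.0, prior=PRIOR):
    """Row-stochastic matrix; larger gaps pull each row toward the prior."""
    keep = stay ** (gap_bp / base_gap)
    n = len(prior)
    return [[(keep if i == j else 0.0) + (1.0 - keep) * prior[j] for j in range(n)]
            for i in range(n)]


def posteriors(obs, gaps, emit=EMIT, prior=PRIOR):
    """Scaled forward-backward: P(state | all observations) for every probe."""
    n, n_obs = len(prior), len(obs)
    trans = [transition(g, prior=prior) for g in gaps]  # one matrix per gap
    fwd = []
    for t in range(n_obs):
        if t == 0:
            raw = [prior[s] * emit[s][obs[0]] for s in range(n)]
        else:
            prev, tr = fwd[-1], trans[t - 1]
            raw = [emit[s][obs[t]] * sum(prev[r] * tr[r][s] for r in range(n))
                   for s in range(n)]
        fwd.append(normalize(raw))
    bwd = [[1.0] * n for _ in range(n_obs)]
    for t in range(n_obs - 2, -1, -1):
        tr = trans[t]
        bwd[t] = normalize([
            sum(tr[s][r] * emit[r][obs[t + 1]] * bwd[t + 1][r] for r in range(n))
            for s in range(n)
        ])
    return [normalize([f * b for f, b in zip(fr, br)]) for fr, br in zip(fwd, bwd)]


# Hypothetical run of probes: a stretch of positive symbols flanked by neutral ones.
obs = [1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1]
post = posteriors(obs, gaps=[35] * (len(obs) - 1))
best_states = [max(range(3), key=lambda s: row[s]) for row in post]
print(best_states)
```

Making the transition matrix a function of the gap means that two probes far apart are treated as nearly independent, while adjacent probes strongly favor staying in the same state.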

35
An example of HMM detected segments
36
A good model also needs a better array
  • Array density is not enough to distinguish
    exon/intron boundaries
  • Probe quality

37
  • Differential segments: >3 continuous probes
    with posterior probability >0.99.
  • Differentially expressed genes: annotated genes for which
    at least 33% of their probes reside within the observed
    differential segments.
  • Differentially spliced genes: annotated genes for which
    <33% of probes reside within the differential segment, or
    annotated genes containing 2 differential segments with
    different states.
  • Novel gene boundaries: differential segments with >5
    probes extending beyond an annotated gene boundary.
  • Novel transcripts: differential segments with >5 probes
    and outside any annotated gene boundary.
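
The calling rules above could be applied to per-probe HMM output roughly as follows. The function names, the state coding (0 = no difference, +1 = Col > Van, -1 = Van > Col) and the example data are assumptions for illustration; only the thresholds (more than 3 continuous probes, posterior above 0.99, 33% of gene probes) come from the slide, and the novel-boundary and novel-transcript rules are left out.

```python
def differential_segments(states, posteriors, min_probes=4, min_post=0.99):
    """Return (start, end, state) runs of confident non-zero states; end is exclusive."""
    segments, start = [], None
    for i in range(len(states) + 1):
        ok = i < len(states) and states[i] != 0 and posteriors[i] > min_post
        same = ok and start is not None and states[i] == states[start]
        if start is not None and not same:
            if i - start >= min_probes:  # ">3 continuous probes"
                segments.append((start, i, states[start]))
            start = None
        if ok and start is None:
            start = i
    return segments


def classify_gene(probe_idx, segments, min_fraction=0.33):
    """Label a gene from the differential segments overlapping its probes."""
    hits = [(sum(s <= p < e for p in probe_idx), state) for s, e, state in segments]
    hits = [(n, state) for n, state in hits if n > 0]
    if not hits:
        return "unchanged"
    if len({state for _, state in hits}) > 1:
        return "differentially spliced"  # segments with different states
    covered = sum(n for n, _ in hits) / len(probe_idx)
    if covered >= min_fraction:
        return "differentially expressed"
    return "differentially spliced"


# Hypothetical per-probe calls for a short stretch of the array.
states = [0, 1, 1, 1, 1, 1, 0, 0, -1, -1, -1, 0]
post = [0.999] * len(states)
segments = differential_segments(states, post)
print(segments)
print(classify_gene(list(range(0, 8)), segments))
print(classify_gene(list(range(5, 25)), segments))
```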
38
Length distribution of segments called by HMM
39
Comparison of annotation-based analysis and HMM
Method       Category                        Col > Van   Van > Col   Total
Annotation   differential expression         1626        923         2549
Annotation   differential exonic splicing    287         190         477
Annotation   differential intronic splicing  202         155         357
HMM          differential expression         1654        962         2616
HMM          differential splicing           874         530         1404
HMM          un-annotated transcript         34          42          76
HMM          un-annotated 5'                 30          19          49
HMM          un-annotated 3'                 28          8           36
40
Comparison of annotation-based analysis and HMM
                                  HMM
Annotation               Total    Expression (Col > Van)   Expression (Van > Col)   Splicing (Col > Van)   Splicing (Van > Col)
HMM total                         1654                     962                      921                    550
Expression (Col > Van)   1626     1270                                              225
Expression (Van > Col)   923                               727                                             132
Splicing (Col > Van)     441      181                                               47
Splicing (Van > Col)     300                               90                                              38
41
Acknowledgements
Justin Borevitz, Yan Li, Christos Noutsos, Geoff Morris, Andy Cal,
Jake Byrnes, Josh Rest