Title: Identification of differential gene expression from Massively Parallel Signature Sequencing MPSS dat
Identification of differential gene expression
from Massively Parallel Signature Sequencing
(MPSS) data based on bootstrap percentile
confidence intervals
Toni Reverter CRC for Innovative Dairy
Products Bioinformatics Group CSIRO Livestock
Industries Queensland Bioscience Precinct 306
Carmody Rd., St. Lucia QLD 4067, Australia
InCoB 2004 Auckland NZ
Differential gene expression from MPSS data based
on bootstrap percentile confidence intervals
MPSS Technology
- Another (very good) sequencing method
- Identifies (nearly) all the DNA molecules in a
  given sample
- Each analysis involves > 10^6 transcripts (big
  range!)
- High sensitivity (identification of very low
  abundance transcripts)
- More information
  - Brenner et al. (2000)
  - www.lynxgen.com (Lynx Therapeutics, Inc.)
- Statistical analysis of MPSS data?
MPSS Tag Data
SIGNATURE            CANCER (tpm)   NORMAL (tpm)
GATCGTCCTCTCCCCCG              15             17
GATCCCTGCCCCACCCC               6             19
GATCCCAACCTTTTGTA               0              6
GATCGCCCTCGTGCTGA              13             19
GATCTATGGCATCCAAG               6              1
GATCTTGGCCTTCACAT              10             19
GATCCCAGGCTGCTTCT               9              0
GATCTTGGCTTCTCAAC              24              1
GATCTGCACAGATGCCT              17             18
GATCAACGATATCCACA               3             10
GATCGAGGACTGTGTGG             290            156
GATCAAGCGGGAGCAGA              78             91
GATCCCAACAGGCTCAA               4              0
SUM                     1,260,230      1,280,977
MIN                             0              0
MAX                        31,243        101,215
MPSS Tag Distribution
MPSS Paper, Jongeneel et al., PNAS 2003, 100:4702

> tpm (log10)    N Tags        %
1 (0.0)          27,965   100.00
5 (0.7)          15,145    54.16
10 (1.0)         10,519    37.61
50 (1.7)          3,261    11.66
100 (2.0)         1,719     6.15
500 (2.7)           298     1.07
1,000 (3.0)         154     0.55
5,000 (3.7)          26     0.09
10,000 (4.0)          7     0.02
MPSS Statistics Model
          Sample 1   Sample 2
Gene 1    n11        n12        N1.
Others    n21        n22        N2.
          N.1        N.2        N..

- Categorical data: Normal approximation for
  Binomial proportions
- à la SAGE data
  - Man et al., 2000
  - Vencio et al., 2003
No hypothesis testing?
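As an illustration of the normal approximation for binomial proportions named above, here is a minimal Python sketch of a pooled two-proportion z-test on the 2 x 2 table. The function name `two_proportion_z` is hypothetical, and the pooled standard error is one common choice; the slides do not state which variant was used. The example counts come from the MPSS Tag Data slide.

```python
import math

def two_proportion_z(n11, n12, N_1, N_2):
    """Pooled z-test for one tag: n11, n12 are its counts in samples 1 and 2;
    N_1, N_2 are the column totals (N.1 and N.2 in the table)."""
    if n11 + n12 == 0:
        return 0.0, 1.0  # no evidence either way when the tag is absent
    p1 = n11 / N_1
    p2 = n12 / N_2
    pooled = (n11 + n12) / (N_1 + N_2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / N_1 + 1.0 / N_2))
    z = (p1 - p2) / se
    # Two-sided p-value under the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Tag GATCGAGGACTGTGTGG: 290 (cancer) vs 156 (normal)
z, p = two_proportion_z(290, 156, 1_260_230, 1_280_977)
print(f"z = {z:.2f}, p = {p:.2e}")
```

Because the totals exceed 10^6, the standard error is very small, so even moderate differences in counts produce large z values.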
My Concern
Microarray vs MPSS

              Microarray          MPSS
N Genes       10,000              25,000
% DE Genes    2%       10%        15%      25%
N DE Genes    200      1,000      3,750    6,250
My Concern
Sensitivity Adapted from Reverter et al., 2004
MPSS Test Data
- 2 Issues
- Equivalence with M-A plots
- Geometry
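For readers unfamiliar with M-A plots, the following is a minimal sketch assuming the usual microarray definitions, M = log2(x1) - log2(x2) and A = (log2(x1) + log2(x2)) / 2, applied to the two tpm columns. The `offset` added before taking logs is an assumption made here to handle zero counts; the slides do not say how zeros were treated. The name `ma_values` is hypothetical.

```python
import math

def ma_values(sample1, sample2, offset=1.0):
    """Return a list of (M, A) pairs, one per tag.

    offset is an assumed pseudo-count that keeps log2 defined for zero counts.
    """
    pairs = []
    for x1, x2 in zip(sample1, sample2):
        l1 = math.log2(x1 + offset)
        l2 = math.log2(x2 + offset)
        pairs.append((l1 - l2, 0.5 * (l1 + l2)))  # (log ratio, mean log abundance)
    return pairs

# First few tags from the MPSS Tag Data slide (cancer vs normal, tpm)
cancer = [15, 6, 0, 13, 6]
normal = [17, 19, 6, 19, 1]
for m, a in ma_values(cancer, normal):
    print(f"M = {m:+.2f}  A = {a:.2f}")
```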
MPSS Test Data
MPSS Test Data
Binomial: 5,137 DE Genes
Algorithm for Bootstrap
1. Read transcripts for the i-th signature
2. Sort MAi by ai (x-axis)
3. Define b Bins (Same width or Same size)
4. Define r BR (Bootstrap Replicates), enough for
   CI (e.g. r = 200)
   Define α (Significance)
5. For each Bj collect
   5.1. Compute
   5.2. Identify
6. Stop
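The expressions for steps 5.1 and 5.2 did not survive in this transcript, so the following Python sketch is only one plausible reading of the outline, not the author's implementation. It assumes that, within each equal-size bin of tags sorted by A, the α/2 and 1 - α/2 percentiles of M are estimated by averaging over bootstrap replicates, and that tags whose M falls outside those limits are flagged. The names `percentile` and `bootstrap_de` are hypothetical.

```python
import random

def percentile(sorted_vals, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac

def bootstrap_de(ma, n_bins=10, n_boot=200, alpha=0.05, seed=0):
    """ma: list of (M, A) pairs. Returns sorted indices of flagged tags."""
    if not ma:
        return []
    rng = random.Random(seed)
    order = sorted(range(len(ma)), key=lambda i: ma[i][1])  # step 2: sort by A
    size = -(-len(order) // n_bins)                         # step 3: equal-size bins
    flagged = []
    for start in range(0, len(order), size):                # step 5: each bin
        bin_idx = order[start:start + size]
        m_vals = [ma[i][0] for i in bin_idx]
        lows, highs = [], []
        for _ in range(n_boot):                             # assumed step 5.1
            sample = sorted(rng.choices(m_vals, k=len(m_vals)))
            lows.append(percentile(sample, alpha / 2.0))
            highs.append(percentile(sample, 1.0 - alpha / 2.0))
        lower = sum(lows) / n_boot
        upper = sum(highs) / n_boot
        # assumed step 5.2: tags outside the bootstrap percentile limits
        flagged.extend(i for i in bin_idx
                       if ma[i][0] < lower or ma[i][0] > upper)
    return sorted(flagged)
```

The limits are computed separately in each bin, so they can widen or narrow with abundance rather than relying on a single global threshold.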
Bins of equal width
Bins of equal size
- Merits
- Accuracy stabilisation
- Variance stabilisation
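To make the two binning options concrete, here is a small Python sketch of both. The function names are hypothetical. With skewed abundance data, equal-width bins can leave very few tags in the high-abundance bins, whereas equal-size bins give every bin (nearly) the same number of tags, which is presumably what underlies the stabilisation merits listed above.

```python
def equal_width_bins(a_values, n_bins):
    """Label each value with one of n_bins intervals of identical width."""
    lo, hi = min(a_values), max(a_values)
    width = (hi - lo) / n_bins or 1.0  # guard against all-equal input
    return [min(int((a - lo) / width), n_bins - 1) for a in a_values]

def equal_size_bins(a_values, n_bins):
    """Label values so that each bin holds (nearly) the same number of them."""
    order = sorted(range(len(a_values)), key=lambda i: a_values[i])
    labels = [0] * len(a_values)
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // len(a_values)
    return labels

a = [0, 1, 2, 3, 100, 101, 102, 1000]
print(equal_width_bins(a, 2))  # [0, 0, 0, 0, 0, 0, 0, 1]
print(equal_size_bins(a, 2))   # [0, 0, 0, 0, 1, 1, 1, 1]
```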
MPSS Test Data
Bootstrap CI: 497 DE Genes
MPSS Test Data
Conclusions
- Compared to microarray, the analysis of MPSS
  should be trivial
- Standard parametric (binomial) methods are likely
  to generate a large number of differentially
  expressed elements
- Trade-off: biological vs statistical
  significance
- The proposed method possesses a number of
  advantages
  - Very easy to implement
  - Very fast to generate
  - Operates on total transcripts as opposed to
    proportions
  - Accommodates the inherent heteroskedasticity
- More research is needed to assess
  - The impact of MPSS in expression studies
  - The (possible) annotation gap (non-sequenced
    species)
References
Brenner, S., M. Johnson, J. Bridgham, et al.
(2000) Gene expression analysis by massively
parallel signature sequencing (MPSS) on microbead
arrays. Nature Biotechnology, 18:630-634.
Jongeneel, C.V., C. Iseli, B.J. Stevenson, et al.
(2003) Comprehensive sampling of gene expression
in human cell lines with massively parallel
signature sequencing. PNAS, USA, 100:4702-4705.
Man, M.Z., X. Wang, and Y. Wang (2000) POWER_SAGE:
comparing statistical tests for SAGE experiments.
Bioinformatics, 16:953-959.
Reverter, A., S. McWilliam, W. Barris, and B.
Dalrymple (2004) A rapid method
for computationally inferring transcriptome
coverage and microarray sensitivity.
Bioinformatics (in press).
Tu, Y., G. Stolovitzky, and U. Klein (2002)
Quantitative noise analysis for gene expression
microarray experiments. PNAS, USA, 99:14031-14036.
Vencio, R.Z.N., H. Brentani, and C.A.B. Pereira
(2003) Using credibility intervals instead of
hypothesis tests in SAGE analysis.
Acknowledgements
Lynx Therapeutics, Inc.
Christian Haudenschild
Peter Thompson (USYD)
Frank Nicholas (USYD)
Ross Tellam (CSIRO)
Brian Dalrymple (CSIRO)