Title: Variance stabilization for microarray data
1Variance stabilization for microarray data in comparison with various other normalization strategies, part 1
Christine Steinhoff
Microarrays/Hybridization/Northern: Ulrike Nuber, Bettina Lipkowitz, H.-Hilger Ropers
Max Planck Institute for Molecular Genetics, Berlin, Germany
2Types of Arrays
Red/Green experiments
Affymetrix Chips
Radioactive filters
3Data Analysis - Procedure
Starting with image processing
Scanner output: information about spot intensities, local background, pins, PCR plates, localization, standard deviation ...
Quality check: are there any effects due to pins, PCR plates, local effects ...?
4What is Normalization?
Systematic Variation in Microarray Experiments
- Saturation (Scanner, Labeling)
- Nonlinearity of Cy5, Cy3 Labeling
- Efficiencies of Cy5, Cy3 Labeling
- Variation of Low Intensities
- Pins
- PCR Plates
- Local Effects
...
Normalization is the process of describing and
removing such variation
5Some Examples
6Unequal Variance Across Intensity Range
Logratio
Product intensity (logscale)
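The two axes named on this slide can be computed per spot as sketched below. This is a minimal illustration, assuming positive, background-corrected red and green intensities; the function name and the sample values are made up for the example and do not come from the slides.

```python
import math

def log_ratio_and_product(red, green):
    """Return (log2 ratio, log2 product) for one spot's two channel intensities."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive before taking logs")
    return math.log2(red / green), math.log2(red * green)

# Illustrative (made-up) intensities as (red, green) pairs
spots = [(1200.0, 300.0), (50.0, 100.0), (800.0, 800.0)]
points = [log_ratio_and_product(r, g) for r, g in spots]
```

Plotting the first component against the second for all spots gives the kind of scatter in which a wider spread at low product intensity becomes visible.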
7Normalization Methods
User-defined sets: housekeeping (?!), controls, etc.; useful for "most genes changed" settings
Entire dataset: useful for "most genes unchanged" settings
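For the "most genes unchanged" setting, one simple whole-dataset approach is to shift all log ratios so that their overall median is zero. The sketch below assumes log ratios have already been computed; `median_center` and the sample values are illustrative names and numbers, not taken from the slides.

```python
import statistics

def median_center(log_ratios):
    """Subtract the overall median so the typical log ratio becomes zero."""
    med = statistics.median(log_ratios)
    return [x - med for x in log_ratios]

# Made-up log ratios carrying a global shift of about +0.5
log_ratios = [0.4, 0.5, 0.6, 2.5, -1.0]
centered = median_center(log_ratios)
```

The median is used rather than the mean because a few strongly changed genes then have little influence on the shift.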
8Normalization Strategies
Local Regression: determine regression lines locally
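A rough sketch of the idea: around each spot, fit a straight line to the nearest points only (weighted so that closer points count more), then subtract the fitted value from the observed log ratio. The tricube weights, the `frac` parameter and all names below are assumptions chosen for illustration; the slides do not specify which local regression variant was used.

```python
def local_linear_fit(x, y, frac=0.5):
    """Fitted y at each x from a tricube-weighted line through the nearest points."""
    n = len(x)
    k = max(3, int(round(frac * n)))        # number of neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[min(k, n) - 1] or 1e-12   # bandwidth: distance to the k-th nearest point
        w = [max(0.0, 1 - (abs(xi - x0) / h) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Made-up data: the log ratio drifts linearly with the log product intensity
log_product = [8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
log_ratio = [0.5 * a - 4.0 for a in log_product]
fitted = local_linear_fit(log_product, log_ratio)
normalized = [r - f for r, f in zip(log_ratio, fitted)]
```

In this toy case the trend is exactly linear, so the normalized log ratios come out at zero up to rounding; on real data the local fits would follow a curved trend.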
9Normalization Strategies
10Comparison of Normalization Strategies
1% maximal differential genes (red, 138 genes), after discarding the 5% lowest expressed genes (green, 691 genes)
log product vs. log ratio of normalized intensities
No Normalization
Linear Regression
Local Regression
Overall Median
Zscore
ANOVA
Variance Stabilization
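The slides do not spell out the variance-stabilizing transformation. As an assumption, the sketch below uses an arsinh-type transformation, a common choice for this purpose: it behaves like a logarithm at high intensities and roughly linearly near zero, so low-intensity spots are not blown up. The `offset` and `scale` parameters are hypothetical calibration values that would normally be estimated from the data.

```python
import math

def arsinh_transform(intensity, offset=0.0, scale=1.0):
    """h(y) = arsinh((y - offset) / scale); offset and scale are hypothetical parameters."""
    return math.asinh((intensity - offset) / scale)

def stabilized_difference(red, green, offset=0.0, scale=1.0):
    """Difference of transformed channels, used in place of a log ratio."""
    return (arsinh_transform(red, offset, scale)
            - arsinh_transform(green, offset, scale))

# Made-up intensities: a bright spot and a spot near background
bright = stabilized_difference(40000.0, 10000.0, scale=100.0)
faint = stabilized_difference(-5.0, 20.0, scale=100.0)
```

For the bright spot the result is close to the natural log of the intensity ratio, while the faint spot, including its negative background-corrected value, still yields a finite, small difference.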
11Comparison of Normalization Strategies
Goal: Detection of differentially expressed genes
Set of 30 maximal differential genes out of 13824
12Comparison of Normalization Strategies
Goal: Detection of differentially expressed genes
Var Stab, ANOVA, Lin Regr, Least Med, Local Regr, Mean, Median, Shorth, Zscore, Raw
distance d(i,j) = N - (number of shared genes)
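Read as "N minus the number of genes that two normalization methods have in common among their top N", the distance on this slide can be sketched as follows. The dictionaries of log ratios, the gene names and the function names are illustrative assumptions.

```python
def top_genes(log_ratios, n):
    """Names of the n genes with the largest absolute log ratio."""
    ranked = sorted(log_ratios, key=lambda g: abs(log_ratios[g]), reverse=True)
    return set(ranked[:n])

def shared_gene_distance(lr_i, lr_j, n):
    """d(i,j) = n minus the number of genes common to both top-n sets."""
    return n - len(top_genes(lr_i, n) & top_genes(lr_j, n))

# Made-up log ratios of four genes under two normalization methods
method_a = {"g1": 2.0, "g2": -1.5, "g3": 0.1, "g4": 0.3}
method_b = {"g1": 1.8, "g2": 0.2, "g3": -1.1, "g4": 0.1}
d_ab = shared_gene_distance(method_a, method_b, n=2)
```

A distance of 0 means both methods select exactly the same top genes; a distance of N means they share none.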
13Comparison of Normalization Strategies
Goal: Detection of differentially expressed genes
Var Stab, ANOVA, Lin Regr, Least Med, Local Regr, Mean, Median, Shorth, Zscore, Raw
Genes ordered by abs(logratio):
$d(i,j) = 1 - \frac{6}{N(N^2-1)} \sum_{k,l=1}^{N} d(i,j)_{k,l}$
$d(i,j)_{k,l} = rank(gene_k) - rank(gene_l)$ if it exists, $N+1$ otherwise
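The formula on this slide has the form of Spearman's rank correlation coefficient. The sketch below implements the classical Spearman coefficient (squared rank differences, no ties, genes restricted to those present in both results), which is a stand-in and may differ from the slide's exact definition, in particular in how genes missing from one list are penalized. All names and values are illustrative.

```python
def ranks_by_abs(log_ratios):
    """Rank genes 1..N by decreasing absolute log ratio (assumes no ties)."""
    ordered = sorted(log_ratios, key=lambda g: abs(log_ratios[g]), reverse=True)
    return {gene: pos + 1 for pos, gene in enumerate(ordered)}

def spearman(lr_i, lr_j):
    """Classical Spearman coefficient over the genes present in both results."""
    genes = sorted(set(lr_i) & set(lr_j))
    n = len(genes)
    if n < 2:
        raise ValueError("need at least two shared genes")
    ri = ranks_by_abs({g: lr_i[g] for g in genes})
    rj = ranks_by_abs({g: lr_j[g] for g in genes})
    d2 = sum((ri[g] - rj[g]) ** 2 for g in genes)
    return 1 - 6 * d2 / (n * (n * n - 1))

# Made-up log ratios of four genes under two normalization methods
method_a = {"g1": 2.0, "g2": -1.5, "g3": 0.1, "g4": 0.3}
method_b = {"g1": 1.8, "g2": 0.2, "g3": -1.1, "g4": 0.1}
rho = spearman(method_a, method_b)
```

Identical gene orderings give a coefficient of 1, and the value decreases as the two rankings diverge.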
14Comparison of Normalization Strategies
Dataset: three repetitions of one dye swap experiment (6), followed by Northern blot verification
Normalization strategies
Biological evaluation: (a) Northern blotting (b) quantitative RT-PCR (c) SAGE library (d) quantifiable controls
15Goal: Biological Evaluation
Microarray Ratios
quantifiable method different from microarray
16Three Dye Swap experiments
experiment 1
experiment 2
experiment 3
swap 1
genes, empties, housekeeping, plant
swap 2
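A common way to combine one experiment with its dye swap is to average the forward log ratio with the sign-flipped swap log ratio, so that a dye bias affecting both hybridizations equally cancels. Whether the authors combined their arrays this way is not stated on the slides; the function name and intensities below are made up for illustration.

```python
import math

def dye_swap_log_ratio(forward, swapped):
    """Average the forward log ratio with the sign-flipped swap log ratio."""
    fwd_red, fwd_green = forward    # sample A labeled red, sample B labeled green
    swp_red, swp_green = swapped    # labels exchanged: sample B red, sample A green
    m_forward = math.log2(fwd_red / fwd_green)
    m_swap = math.log2(swp_red / swp_green)
    return (m_forward - m_swap) / 2

# Made-up intensities for one gene on the forward and the swapped array
combined = dye_swap_log_ratio((1600.0, 400.0), (300.0, 2400.0))
```

Here the forward array gives a log2 ratio of 2 and the swap gives -3, so the combined estimate for sample A versus sample B is their dye-balanced average.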
17Biological Evaluation
RAW DATA
LogRatio
product of intensity (logScale)
18Comparison of Normalization Strategies
(1) Raw data: 0.845
(2) Median: 0.845
(3) ZScore: 0.859
(4) Overall (linear) Regression: 0.851
(5) Local Regression: 0.851
(6) Variance Stabilization: 0.853
(7) ANOVA: 0.854
19Spoil the data
Plate-specific effect
Random effect
Labeling effect
Scanner effect
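How such effects might be injected into one channel can be sketched as below. The slides do not say how the data were actually spoiled, so every modeling choice here is an assumption: a multiplicative factor per plate, a multiplicative labeling gain, additive Gaussian noise, and a hard scanner ceiling. All parameter names and values are hypothetical.

```python
import random

def spoil(intensities, plates, plate_factor=None, noise_sd=0.0,
          label_gain=1.0, saturation=None, rng=None):
    """Return a copy of one channel's intensities with artificial effects added."""
    rng = rng or random.Random(0)        # fixed seed keeps the sketch reproducible
    plate_factor = plate_factor or {}
    spoiled = []
    for value, plate in zip(intensities, plates):
        v = value * plate_factor.get(plate, 1.0)   # plate-specific effect
        v *= label_gain                            # labeling effect
        v += rng.gauss(0.0, noise_sd)              # random effect
        if saturation is not None:
            v = min(v, saturation)                 # scanner effect (saturation)
        spoiled.append(v)
    return spoiled

# Made-up red-channel intensities and the PCR plate each spot came from
red = [100.0, 200.0, 400.0, 70000.0]
plates = [1, 1, 2, 2]
spoiled_red = spoil(red, plates, plate_factor={2: 1.5},
                    label_gain=1.2, saturation=65535.0)
```

Running the normalization strategies on such deliberately distorted data shows which of them can remove which kind of effect.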
20Normalize again
21Compare to Northern Results
Normalization: no/mean, ZScore, Local R, Lin R, ANOVA, VarStab
Data: raw, random effect on 5 spots, labeling, scatter in low intensity
Measure: correlation coefficient
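A correlation coefficient between normalized microarray log ratios and the corresponding Northern blot measurements can be computed as sketched below. The Pearson coefficient is assumed here, since the slides do not name the variant, and the paired values are made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up paired log ratios for four genes
array_log_ratios = [1.0, 2.0, 3.0, 4.0]
northern_log_ratios = [1.5, 2.5, 3.5, 5.5]
r = pearson(array_log_ratios, northern_log_ratios)
```

Repeating this for each normalization strategy, on the raw and on the spoiled data, gives one coefficient per combination for comparison.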