Max Planck Institute - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Max Planck Institute for Molecular Genetics
Microarray Data Analysis: Comparison of Various Normalization Methods
Christine Steinhoff
Max Planck Institute for Molecular Genetics
2
Outline
  • Types of Arrays
  • Effects in the Data
  • Procedure of Data Analysis
  • Problems and Noise
  • What is Normalization and why?
  • Methods
  • How to use them
  • Does it make any difference which method to use?
  • How to find out which one? One example

3
Types of Arrays
Red/Green experiments
Affymetrix Chips
Radioactive filters
4
Data Analysis Procedure
Information about:
  • spot intensities (pixel, mean/median, etc.)
  • local background
  • pins
  • PCR plates
  • localization
  • standard deviation
  • ...
5
Effects in your Data
Hybridization
6
Data Analysis Procedure
Starting with image processing. Scanner output: information about
  • spot intensities
  • local background
  • pins
  • PCR plates
  • localization
  • standard deviation
  • ...
Quality check: are there any effects due to
  • pins
  • PCR plates
  • local effects
  • ...
7
Problem 1: Background
8
Problem 2: Variability
9
Problem 3: Saturation
10
Problem 4: Linearity
11
Problem 5: Variance
Logratio
Product intensity (logscale)
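The two axes named on this slide can be computed per spot as follows. This is a minimal sketch, assuming base-2 logarithms and background-corrected, strictly positive channel intensities; the function name `log_ratio_and_product` is illustrative and not from the presentation.

```python
import math


def log_ratio_and_product(red, green):
    """Return a list of (log2 ratio, log2 product) pairs, one per spot.

    Assumes both channels are already background-corrected and positive.
    """
    pairs = []
    for r, g in zip(red, green):
        if r <= 0 or g <= 0:
            raise ValueError("intensities must be positive before taking logs")
        pairs.append((math.log2(r / g), math.log2(r * g)))
    return pairs


if __name__ == "__main__":
    # Hypothetical intensities for three spots.
    for ratio, product in log_ratio_and_product([400.0, 50.0, 1200.0],
                                                [100.0, 50.0, 300.0]):
        print(f"log ratio {ratio:+.2f}   log product {product:.2f}")
```

Plotting the log ratio against the log product makes intensity-dependent variance visible: the spread of the log ratios typically widens at low product intensity.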
12
Problem 6 Pin/Plate Effect?
Ratio of intensities of both channels
Huber/von Heydebreck
Product intensity of both channels
13
Problem 7 Pin Effect
Ratio of intensities of both channels
Yang, YH et al, SPIE BiOS, San Jose 2001
Product intensity of both channels
14
What is Normalization?
Systematic Variation in Microarray Experiments
  • Saturation (scanner, labeling)
  • Nonlinearity of Cy5, Cy3 labeling
  • Efficiencies of Cy5, Cy3 labeling
  • Variation of low intensities
  • Pins
  • PCR plates
  • Local effects
  • ...
Normalization is the process of describing and removing such variation.
15
Why ?
Goal: reliable measurement of ratios, patient vs. control
  • Patient(red)/Control(green)
  • Patient(green)/Control(red)
In a self-self hybridization we would expect green/red = 1 for all genes.
Mixture of:
  • unequal labelling
  • noise
  • non-constant variance
  • differential expression (not in this example!)
  • ...
16
Methods
  • User-defined sets (housekeeping genes (?!), controls, etc.): useful for "most genes changed" settings
  • Entire dataset: useful for "most genes unchanged" settings
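The choice between the two reference sets can be illustrated with median centering of log ratios. This is a sketch under assumptions: the slide does not say which statistic is applied to the reference set, so the median is used here as one common choice, and `median_center` and `reference_idx` are hypothetical names.

```python
import statistics


def median_center(log_ratios, reference_idx=None):
    """Shift log ratios so that the reference set has median 0.

    reference_idx=None uses the entire dataset ("most genes unchanged").
    Passing indices of housekeeping or control spots uses a user-defined
    set instead ("most genes changed").
    """
    if reference_idx is None:
        reference = list(log_ratios)
    else:
        reference = [log_ratios[i] for i in reference_idx]
    shift = statistics.median(reference)
    return [m - shift for m in log_ratios]


if __name__ == "__main__":
    m = [0.4, 0.6, 0.5, 2.5, -1.5]              # hypothetical log ratios
    print(median_center(m))                      # entire dataset as reference
    print(median_center(m, reference_idx=[0, 1, 2]))  # hypothetical control spots
```

If most genes really do change, the median of the entire dataset is itself shifted by biology, which is why a trusted control set is the safer reference in that setting.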
17
Methods
Local regression: determine regression lines locally
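One way to read "determine regression lines locally" is a weighted straight-line fit around every spot, with the fitted trend then subtracted from the log ratios. The sketch below is a simplified, non-robust variant written for illustration; the tricube weights, the nearest-neighbour bandwidth and the parameter `frac` are assumptions, and this is not the implementation used in the presentation.

```python
def local_linear_fit(x, y, frac=0.5):
    """Fit a weighted straight line around each x value; return fitted y."""
    n = len(x)
    k = min(n, max(2, int(round(frac * n))))  # neighbours per local fit
    fitted = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest point (guard against 0).
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        # Tricube weights: close to 1 near x0, 0 at or beyond the bandwidth.
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def normalize_local(log_product, log_ratio, frac=0.5):
    """Subtract the locally fitted trend from the log ratios."""
    trend = local_linear_fit(log_product, log_ratio, frac)
    return [m - t for m, t in zip(log_ratio, trend)]


if __name__ == "__main__":
    a = [float(i) for i in range(10)]          # hypothetical log products
    m = [0.3 * ai - 1.0 for ai in a]           # purely intensity-driven bias
    print([round(r, 6) for r in normalize_local(a, m)])
```

A global linear regression corresponds to using every spot with equal weight in a single fit; the local version can follow a curved intensity dependence that a single line cannot.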
18
Methods
19
How to use them?
http://www.bioconductor.org
20
Methods
1% maximal differential genes (red, 138 genes), discarding the 5% lowest expressed genes (green, 691 genes) beforehand
Log product vs. log ratio of normalized intensities
No Normalization
Linear Regression
Local Regression
Overall Median
Zscore
ANOVA
Variance Stabilization
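Of the methods listed, the z-score is the simplest to write down. The sketch below assumes it is applied to the log ratios of one array and uses the sample standard deviation; the slide does not specify either detail, so both are assumptions.

```python
import statistics


def zscore(values):
    """Centre values at mean 0 and scale to sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("z-score is undefined for constant data")
    return [(v - mean) / sd for v in values]


if __name__ == "__main__":
    print(zscore([0.2, -0.1, 1.4, 0.0, -0.5]))  # hypothetical log ratios
```

Unlike median centering, this also rescales the spread, so z-scored ratios from different arrays share a common scale but lose their original units.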
21
Comparison
Goal: detection of differentially expressed genes
Set of 30 maximal differential genes out of 13824
22
Comparison
Goal: detection of differentially expressed genes
Methods: Var Stab, ANOVA, Lin Regr, Least Med, Local Regr, Mean, Median, Shorth, Zscore, Raw
Distance: d(i,j) = N - (number of shared genes)
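Reading the slide's distance as N minus the number of genes that two methods share among their N most differential genes, it can be sketched as below. Ranking by the absolute log ratio is an assumption, and the gene names and values in the usage example are made up.

```python
def top_n(log_ratios, n):
    """Return the n gene names with the largest absolute log ratio."""
    ranked = sorted(log_ratios, key=lambda g: abs(log_ratios[g]), reverse=True)
    return ranked[:n]


def shared_gene_distance(method_i, method_j, n):
    """d(i, j) = n minus the number of genes in both top-n lists."""
    shared = set(top_n(method_i, n)) & set(top_n(method_j, n))
    return n - len(shared)


if __name__ == "__main__":
    # Hypothetical normalized log ratios from two methods.
    a = {"g1": 3.0, "g2": -2.5, "g3": 0.1, "g4": 1.0}
    b = {"g1": 2.0, "g2": 0.2, "g3": -1.5, "g4": 0.3}
    print(shared_gene_distance(a, b, n=2))
```

The distance is 0 when both methods select the same N genes and N when the two lists do not overlap at all.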
23
Comparison
Goal: detection of differentially expressed genes
Methods: Var Stab, ANOVA, Lin Regr, Least Med, Local Regr, Mean, Median, Shorth, Zscore, Raw
$d(i,j) = 1 - \frac{6}{N(N^2-1)} \sum_{k,l=1}^{N} d(i,j)_{k,l}$
with genes ordered by abs(logratio) and
$d(i,j)_{k,l} = \mathrm{rank}(\mathrm{gene}_k) - \mathrm{rank}(\mathrm{gene}_l)$ if it exists, $N+1$ otherwise
24
Which One?
Dataset: three repetitions of one dye-swap experiment (6), followed by Northern blot verification
Normalization strategies
Biological evaluation:
  • (a) Northern blotting
  • (b) quantitative RT-PCR
  • (c) SAGE library
  • (d) quantifiable controls
25
Goal: Biological Evaluation
Microarray Ratios
quantifiable method different from microarray
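Agreement between the microarray ratios and an independent quantification can be summarised with a correlation coefficient. The sketch below uses the Pearson coefficient; the choice of Pearson is an assumption, and the values in the usage example are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    array_log_ratio = [1.8, -0.4, 0.9, 0.1]      # hypothetical microarray values
    northern_log_ratio = [2.1, -0.2, 1.2, 0.0]   # hypothetical Northern blot values
    print(round(pearson(array_log_ratio, northern_log_ratio), 3))
```

Computing this once per normalization method gives one number per method, which is the kind of comparison a biological evaluation needs.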
26
Three Dye Swaps
experiment 1
experiment 2
experiment 3
swap 1
genes, empties, housekeeping, plant
swap 2
27
Biological Evaluation
RAW DATA
LogRatio
product of intensity (logScale)
28
Comparison
(1) Raw data: 0.845
(2) Median: 0.845
(3) ZScore: 0.859
(4) Overall (linear) regression: 0.851
(5) Local regression: 0.851
(6) Variance stabilization: 0.853
(7) ANOVA: 0.854
29
Conclusion
For good data it doesn't really matter. But what about bad data?
30
Spoil the data
  • Plate-specific effect
  • Random effect
  • Labeling effect
  • Scanner effect
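Two of these perturbations, a plate-specific effect and a random effect on a fraction of spots, could be simulated as in the sketch below. All parameter names and default magnitudes are hypothetical; the presentation does not state how the data were actually spoiled.

```python
import random


def spoil(log_ratios, plates, plate_offsets,
          noise_fraction=0.05, noise_sd=1.0, seed=0):
    """Return a perturbed copy of the log ratios.

    plate_offsets maps a plate id to an additive shift (plate effect).
    A random subset of spots additionally receives Gaussian noise.
    """
    rng = random.Random(seed)  # fixed seed keeps the simulation repeatable
    spoiled = [m + plate_offsets.get(p, 0.0)
               for m, p in zip(log_ratios, plates)]
    n_noisy = int(noise_fraction * len(spoiled))
    for i in rng.sample(range(len(spoiled)), n_noisy):
        spoiled[i] += rng.gauss(0.0, noise_sd)
    return spoiled


if __name__ == "__main__":
    clean = [0.0] * 10
    plates = [1] * 5 + [2] * 5
    print(spoil(clean, plates, {2: 0.5}, noise_fraction=0.2, seed=1))
```

Because the unperturbed values are known, the output of each normalization method on the spoiled data can be compared directly with the original.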
31
Normalize again
32
Compare with original data
[Figure: correlation coefficient per normalization method (no/mean, ZScore, Local R, Lin R, ANOVA, VarStab) for each data condition (raw, random effect on 5 spots, labeling, scatter in low intensity)]