Title: cDNA Microarray
Slide 1: cDNA Microarray
- Design and Pre-processing
- By H. Bjørn Nielsen
Slide 2: Why Experimental Design
- To enable statistical hypothesis verification/falsification
- To balance the effects from undesired controllable effects
- To ensure sufficient statistical power
Slide 3: 1. To enable statistical hypothesis verification/falsification
- Typically, we want to identify differentially expressed genes between a set of conditions using t-test or ANOVA-like statistics.
- This implies that we replicate sampling from a set of fixed conditions.
Control vs. Treatment
Treatment 1, Treatment 2, Treatment 3
Multifactorial: Control, Treatment, Mutant, Mutant Treated
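
As a rough illustration of the replicated "Control vs. Treatment" case, the sketch below computes a Welch t statistic per gene. The gene names and log2 expression values are made up for the example, and only the statistic and its approximate degrees of freedom are computed (no p-value), so treat it as a sketch rather than a full analysis.

```python
import math
from statistics import mean, variance


def welch_t(control, treatment):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(control), len(treatment)
    # Squared standard errors of the two group means
    v1, v2 = variance(control) / n1, variance(treatment) / n2
    t = (mean(treatment) - mean(control)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical data: gene -> (control replicates, treatment replicates), log2 scale
log2_expression = {
    "geneA": ([8.1, 7.9, 8.0], [10.0, 10.2, 9.8]),
    "geneB": ([6.0, 6.4, 5.6], [6.1, 5.7, 6.2]),
}

t_stats = {gene: welch_t(c, t) for gene, (c, t) in log2_expression.items()}
for gene, (t, df) in t_stats.items():
    print(f"{gene}: t = {t:.2f}, df = {df:.1f}")
```

Here geneA shows a large statistic because the group means differ by far more than the replicate spread, while geneB does not.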
Slide 4: 1. To enable statistical hypothesis verification/falsification
- But we may also fit to a trend using alternative statistics (Bayesian fit, bootstrapping, ANOVA, etc.)
Series: T0, T1, T2, ..., Tn
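
One way to picture the trend-fitting idea for a series T0 ... Tn is a straight-line fit whose slope is assessed by bootstrapping. The sketch below assumes a linear trend and resamples (time, value) pairs; the time points and expression values are invented for illustration, and other trend models or resampling schemes may suit real data better.

```python
import random
from statistics import mean


def slope(times, values):
    """Least-squares slope of values against times."""
    t_bar, v_bar = mean(times), mean(values)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx


def bootstrap_slope_interval(times, values, n_boot=2000, seed=0):
    """Percentile interval (2.5% to 97.5%) for the slope, resampling pairs."""
    rng = random.Random(seed)
    pairs = list(zip(times, values))
    slopes = []
    while len(slopes) < n_boot:
        sample = [rng.choice(pairs) for _ in pairs]
        ts, vs = zip(*sample)
        if len(set(ts)) < 2:  # slope is undefined if all resampled times coincide
            continue
        slopes.append(slope(ts, vs))
    slopes.sort()
    return slopes[int(0.025 * n_boot)], slopes[int(0.975 * n_boot) - 1]


# Hypothetical log2 expression of one gene at T0..T5
times = [0, 1, 2, 3, 4, 5]
values = [5.0, 5.6, 5.9, 6.7, 7.1, 7.4]

observed = slope(times, values)
low, high = bootstrap_slope_interval(times, values)
print(f"slope = {observed:.3f}, bootstrap interval = ({low:.3f}, {high:.3f})")
```

An interval that excludes zero would suggest a trend over the series for that gene.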
Slide 5: 2. To balance the effects from undesired controllable effects
- Typical controllable effects
  - Labeling dye
  - Microarray slide
  - Sampling time
  - Growth conditions
- Typical uncontrollable effects
  - Random effects
  - Unintended deviations in sample handling, growth conditions, etc.
Minimize and Balance
Slide 6: 3. To ensure sufficient statistical power
- An appropriate number of replicates is required for distinguishing noise from 'effect'
- Gene expression studies typically require 3 replicates
Make sure to replicate over the most important sources of variance.
Typical order of noise contributions: biological variation, sample preparation batch, hybridization/slide effect, dye effect/spot effect
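
To make the link between noise, effect size and replicate number concrete, here is a minimal sketch using the standard normal-approximation formula for a two-group comparison, n = 2 * ((z_alpha + z_beta) * sd / effect)^2 per group. The effect size, standard deviations, significance level and power below are assumptions chosen for illustration; the approximation tends to be optimistic for very small n, where a t-based calculation would ask for somewhat more replicates.

```python
import math
from statistics import NormalDist


def replicates_per_group(effect, sd, alpha=0.05, power=0.8):
    """Approximate replicates per group for a two-sided two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)


# Hypothetical scenario: detect a two-fold change (1 unit on the log2 scale)
for sd in (0.4, 0.5, 0.7):
    n = replicates_per_group(effect=1.0, sd=sd)
    print(f"sd = {sd}: about {n} replicates per group")
```

The required number grows with the square of the noise-to-effect ratio, which is why replicating over the largest source of variance matters most.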
Slide 7: An example
- Aim: Identify differentially expressed genes between ill and healthy patients.
- Samples: 4 ill and 4 healthy patients
- Using a two-channel cDNA array.
- How should we do it?

Slide    Dye  Condition
Slide 1  Cy3  ill
Slide 1  Cy5  ...
...      ...
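
One possible layout for this exercise, not necessarily the intended answer, pairs one ill and one healthy sample on each slide and alternates the dye assignment from slide to slide so that dye and condition are not confounded. The sample identifiers (I1..I4, H1..H4) are hypothetical.

```python
from collections import Counter


def paired_dye_swap_design(ill, healthy):
    """Pair one ill and one healthy sample per slide, alternating the dyes."""
    rows = []
    for i, (ill_id, healthy_id) in enumerate(zip(ill, healthy), start=1):
        # Odd slides: ill on Cy3; even slides: ill on Cy5
        ill_dye, healthy_dye = ("Cy3", "Cy5") if i % 2 else ("Cy5", "Cy3")
        rows.append((f"Slide {i}", ill_dye, "ill", ill_id))
        rows.append((f"Slide {i}", healthy_dye, "healthy", healthy_id))
    return rows


design = paired_dye_swap_design(["I1", "I2", "I3", "I4"], ["H1", "H2", "H3", "H4"])
for row in design:
    print(*row, sep="\t")

# Each dye/condition combination should occur equally often
dye_by_condition = Counter((dye, condition) for _, dye, condition, _ in design)
print(dye_by_condition)
```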
Slide 8: Another example
- Aim: Identify differentially expressed genes between ill and healthy patients.
- Samples: 4 ill (2xM, 2xF) and 4 healthy (2xM, 2xF)
- Using a two-channel cDNA array.
- How should we do it?

Slide    Dye  Sex  Condition
Slide 1  Cy3  M    ill
Slide 1  Cy5  ...  ...
...      ...
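
With sex as an extra factor, a candidate layout can be checked for balance by counting how often each dye/sex/condition combination occurs. The layout below is only one hypothetical option (same-sex pairs on each slide, dyes swapped within each sex), and `is_balanced` is a helper written for this sketch.

```python
from collections import Counter
from itertools import product

# Hypothetical layout: (slide, dye, sex, condition)
design = [
    (1, "Cy3", "M", "ill"), (1, "Cy5", "M", "healthy"),
    (2, "Cy5", "M", "ill"), (2, "Cy3", "M", "healthy"),
    (3, "Cy3", "F", "ill"), (3, "Cy5", "F", "healthy"),
    (4, "Cy5", "F", "ill"), (4, "Cy3", "F", "healthy"),
]


def is_balanced(rows):
    """True if every dye/sex/condition combination occurs equally often."""
    counts = Counter((dye, sex, cond) for _, dye, sex, cond in rows)
    cells = product(("Cy3", "Cy5"), ("M", "F"), ("ill", "healthy"))
    return len({counts[cell] for cell in cells}) == 1


# A confounded alternative: every ill sample labeled with Cy3
confounded = [
    (slide, "Cy3" if cond == "ill" else "Cy5", sex, cond)
    for slide, _, sex, cond in design
]

print("dye-swapped layout balanced:", is_balanced(design))
print("all-ill-on-Cy3 layout balanced:", is_balanced(confounded))
```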
Slide 9: Yet another example
- Aim: Identify genes differentially affected by starving in obese and lean people
- Samples: 4 obese (2x starving, 2x not starving) and 4 lean (2x starving, 2x not starving)
- Using a one-channel GeneChip.
- How should we do it?

Chip  BMI  Food
1     O    S
2     L    N
...   ...
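
For a one-channel platform there is no dye to balance, so a simple sketch is to enumerate the 2x2 factorial groups (BMI x Food) and randomize the order in which chips are processed. The codes O/L and S/N follow the table above; the random seed and the idea of randomizing run order are assumptions of this sketch, not something stated on the slide.

```python
import random
from collections import Counter
from itertools import product


def factorial_chip_plan(replicates=2, seed=1):
    """Assign chips to all BMI x Food groups in a randomized processing order."""
    groups = list(product(("O", "L"), ("S", "N")))  # obese/lean x starving/not
    samples = [group for group in groups for _ in range(replicates)]
    random.Random(seed).shuffle(samples)
    return [(chip, bmi, food) for chip, (bmi, food) in enumerate(samples, start=1)]


plan = factorial_chip_plan()
print("Chip\tBMI\tFood")
for chip, bmi, food in plan:
    print(chip, bmi, food, sep="\t")

group_counts = Counter((bmi, food) for _, bmi, food in plan)
```

In such a design, the genes of interest are those with a BMI x Food interaction, that is, where (O,S - O,N) differs from (L,S - L,N).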
Slide 10: cDNA pre-processing
- Background correction
- Normalization
  - Within slide
  - Between slide
Slide 11: Background correction
- Is it meaningful?
- Methods
  - subtraction
  - movingmin (3x3)
  - normexp
  - none
Ritchie et al. 2007, Bioinformatics
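
The two simplest methods in the list can be sketched as follows: plain subtraction removes each spot's local background estimate, while the moving-minimum variant first replaces each background estimate with the minimum over the spot and its neighbours in a 3x3 window. This is an illustrative re-implementation on a made-up 3x3 grid of spots, not the code of any particular package, and normexp is not shown.

```python
def subtract_background(foreground, background):
    """Spot-wise foreground minus background."""
    return [
        [f - b for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(foreground, background)
    ]


def moving_min(background):
    """Replace each background value by the minimum of its 3x3 neighbourhood."""
    n_rows, n_cols = len(background), len(background[0])
    smoothed = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                background[i][j]
                for i in range(max(0, r - 1), min(n_rows, r + 2))
                for j in range(max(0, c - 1), min(n_cols, c + 2))
            ]
            row.append(min(window))
        smoothed.append(row)
    return smoothed


# Hypothetical intensities for a 3x3 block of spots
foreground = [[500, 520, 480], [510, 900, 495], [505, 515, 2000]]
background = [[100, 120, 90], [110, 300, 95], [105, 115, 100]]

plain = subtract_background(foreground, background)
smoothed = subtract_background(foreground, moving_min(background))
print("subtraction:", plain[1][1], " movingmin:", smoothed[1][1])
```

The centre spot has an unusually high local background; the moving minimum damps its influence. Plain subtraction can also produce zero or negative values for weak spots, which is one reason the question "Is it meaningful?" arises before taking logs.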
Slide 12: Normalization within array
- Correct for any bias that follows an undesired uncontrollable effect
  - Print tip
  - Microtiter plate
  - Printing order
  - Spatial trends (uneven hybridization)
- As well as intensity-dependent biases
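
As a simplified illustration of correcting a print-tip bias, the sketch below centres the log ratio M = log2(R/G) on the median of each print-tip group. The spot data are invented, and median centring only removes a constant offset per group; handling intensity-dependent bias would additionally require fitting M against the average intensity (for example with a loess curve), which is not shown here.

```python
import math
from collections import defaultdict
from statistics import median


def log_ratio(red, green):
    """M value: log2 ratio of the two channels."""
    return math.log2(red / green)


def print_tip_median_normalize(spots):
    """spots: list of (print_tip, red, green). Returns centred M per spot."""
    by_tip = defaultdict(list)
    for tip, red, green in spots:
        by_tip[tip].append(log_ratio(red, green))
    tip_median = {tip: median(ms) for tip, ms in by_tip.items()}
    return [log_ratio(red, green) - tip_median[tip] for tip, red, green in spots]


# Hypothetical spots: (print tip, red intensity, green intensity)
spots = [
    (1, 200, 100), (1, 400, 200), (1, 800, 200),
    (2, 100, 100), (2, 50, 100), (2, 100, 200),
]

normalized = print_tip_median_normalize(spots)
print([round(m, 2) for m in normalized])
```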
Slide 13: Normalization between array
- Correction for intensity-dependent biases
  - Lowess
  - Qspline
  - Quantiles
  - And more
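
Of the listed methods, quantile normalization is the easiest to sketch: each array's values are replaced by the mean, across arrays, of the values sharing the same rank, so that all arrays end up with an identical distribution. The version below is a minimal sketch on made-up numbers and does not average tied values, which fuller implementations usually do.

```python
from statistics import mean


def quantile_normalize(arrays):
    """arrays: list of equal-length lists, one per array. Ties are not averaged."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the i-th smallest value across arrays
    rank_means = [mean(s[i] for s in sorted_arrays) for i in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices from smallest to largest
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical intensities for three genes on two arrays
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
result = quantile_normalize(arrays)
print(result)
```

After normalization both arrays contain the same set of values; only the assignment of values to genes differs.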