Title: John Quackenbush, Ph.D.
1. Presented by John Quackenbush, Ph.D. at the
June 10, 2003 meeting of the Pharmacology
Toxicology Subcommittee of the Advisory
Committee for Pharmaceutical Science
2. The Experimental Design
- The experimental design dictates a good deal of
what you can do with the data
- Good normalization and processing reflect the
experimental design
- The design also facilitates certain comparisons
between samples and provides the statistical
power you need for assigning confidence limits to
individual measurements
- The design must reflect experimental reality
- The most straightforward designs compare
expression in two classes of samples to look for
patterns that distinguish them.
3. Sample Pairing for Co-Hybridization Experiments
Direct Comparison with Dye Swap
[Figure: two sets of pairings between samples A1-A4 and B1-B4, the second with the dyes swapped]
- RNA sample is not limiting (i.e. plenty of
sample)
- Flip dyes account for any gene-dye effects
Balanced Block Design
[Figure: pairings of samples A1-A4 with B1-B4]
- RNA sample is limiting
- Balanced blocking accounts for any gene-dye
effects
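The two pairing schemes on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the alternating dye assignment used for the balanced block is an assumption about how the balance is achieved, not something the slide specifies.

```python
from collections import Counter

def dye_swap_design(pairs):
    """Hybridize each (a, b) pair twice, exchanging the dyes the second time."""
    hybs = []
    for a, b in pairs:
        hybs.append({"Cy3": a, "Cy5": b})
        hybs.append({"Cy3": b, "Cy5": a})
    return hybs

def balanced_block_design(pairs):
    """Hybridize each pair once; alternate the dye assignment between pairs
    (assumed scheme) so each class is labeled with each dye equally often."""
    hybs = []
    for i, (a, b) in enumerate(pairs):
        if i % 2 == 0:
            hybs.append({"Cy3": a, "Cy5": b})
        else:
            hybs.append({"Cy3": b, "Cy5": a})
    return hybs

def dye_counts_by_class(hybs):
    """Count how often each class (first letter of the sample name) gets each dye."""
    counts = Counter()
    for hyb in hybs:
        for dye, sample in hyb.items():
            counts[(sample[0], dye)] += 1
    return dict(counts)

pairs = [(f"A{i}", f"B{i}") for i in range(1, 5)]
swap = dye_swap_design(pairs)
block = balanced_block_design(pairs)

print(len(swap), dye_counts_by_class(swap))
print(len(block), dye_counts_by_class(block))
```

Under these assumptions the dye swap uses twice as many hybridizations (and twice as much RNA per sample) as the balanced block, while both keep the classes balanced across Cy3 and Cy5.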
4. Multiple Sample Pairings
Reference Design (Indirect Comparison)
- More than two samples are compared
(e.g. tumor classification, time course)
- Flip dyes are not necessary but can be done to
increase precision
- Ratio values are inferred (indirect)
- Suited for cluster analysis, which needs a common
reference
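To make "ratio values are inferred" concrete, here is a minimal sketch of an indirect comparison for a single gene. The intensities are made-up illustrative numbers and `log2_ratio` is a hypothetical helper; the only point is that the common reference cancels when two log ratios are subtracted.

```python
import math

def log2_ratio(query_intensity, reference_intensity):
    """log2(query / reference) for one gene on one array."""
    return math.log2(query_intensity / reference_intensity)

# Hypothetical intensities for one gene on two separate arrays,
# each co-hybridized with the same reference RNA.
m_a = log2_ratio(800.0, 200.0)  # sample A vs. reference, array 1
m_b = log2_ratio(300.0, 150.0)  # sample B vs. reference, array 2

# A and B were never hybridized together, so their ratio is inferred:
# log2(A/B) = log2(A/Ref) - log2(B/Ref)
indirect_a_vs_b = m_a - m_b
print(m_a, m_b, indirect_a_vs_b)
```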
Loop Design
5. Loops and Reference Designs
[Figure labels: 23 hybs; 10 hybs; standard flip-dye experiment]
S. Wang, K. Kerr, J. Quackenbush, G. Churchill
6. Loops and Reference Designs
Both approaches can give equivalent results
S. Wang, K. Kerr, J. Quackenbush, G. Churchill
7. Loop vs. Reference Designs
- Loop design
  - Can provide direct measurements
  - Gives more data on each experimental sample with
  the same number of hybs
  - Requires more RNA per sample
  - Can unravel with a bad sample or for a gene
  with bad data
- Reference design
  - Easily extensible
  - Simple interpretation of all results
  - Requires less RNA per sample
  - Less sensitive to bad RNA samples and bad
  array elements
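The trade-off in the number of measurements per sample can be checked with a small sketch. It assumes the simplest versions of each design (a single closed loop, and one hybridization per sample against a reference labeled "Ref"); the function names are hypothetical.

```python
from collections import Counter

def loop_design(samples):
    """Single closed loop: sample i (Cy3) is hybridized with sample i+1 (Cy5)."""
    n = len(samples)
    return [(samples[i], samples[(i + 1) % n]) for i in range(n)]

def reference_design(samples, reference="Ref"):
    """Each sample (Cy5) is hybridized once against a common reference (Cy3)."""
    return [(reference, s) for s in samples]

def measurements_per_sample(hybs, samples):
    """Number of hybridizations in which each experimental sample appears."""
    counts = Counter()
    for cy3, cy5 in hybs:
        counts[cy3] += 1
        counts[cy5] += 1
    return {s: counts[s] for s in samples}

samples = ["S1", "S2", "S3", "S4", "S5"]
loop = loop_design(samples)
ref = reference_design(samples)

print(len(loop), measurements_per_sample(loop, samples))
print(len(ref), measurements_per_sample(ref, samples))
```

With the same five hybridizations, every sample is measured twice in the loop and once in the reference design. In this loop each sample is also labeled once with Cy3 and once with Cy5, at the cost of needing RNA for two labelings per sample.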
8. One Possible Experimental Paradigm
Examining Genotype, Phenotype, and Environment
Parental - stressed
Derived - stressed
Parental - unstressed
Derived - unstressed
9. Basic Design Principles
- Biological replicates are more informative than
correlated replicates (independent RNA, independent
slides)
- More replicates are better: higher statistical
power
- For loops, hybridizations of individual samples
should be balanced (as many Cy3 as Cy5
labelings)
- Self-self hybs add data on reproducibility and
can be used to produce error models
- At a minimum, one should use dye-swap replicates to
compensate for any dye biases in labeling or
detection
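A short worked example shows why a dye swap compensates for dye bias. It assumes a simple additive model in which a gene-specific bias is added to every log2(Cy5/Cy3) value regardless of which sample carries which dye; that model and the numbers are illustrative assumptions, not taken from the slides.

```python
import math

true_log_ratio = 1.0  # hypothetical log2(A/B) for one gene
dye_bias = 0.3        # hypothetical gene-specific bias in log2(Cy5/Cy3)

# Forward hyb: A labeled with Cy5, B with Cy3.
forward = true_log_ratio + dye_bias
# Dye-swapped hyb: B labeled with Cy5, A with Cy3, so the sign of the
# biological signal flips while the dye bias keeps its sign.
swapped = -true_log_ratio + dye_bias

# Half the difference recovers the biological log ratio;
# half the sum recovers the dye bias.
estimated_log_ratio = (forward - swapped) / 2
estimated_bias = (forward + swapped) / 2
print(estimated_log_ratio, estimated_bias)
```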
10. How Many Replicates?
(Simon et al., Genetic Epidemiology 23: 21-36,
2002)
$$ n = \frac{4\,(z_{\alpha/2} + z_{\beta})^2}{\left(\frac{\delta}{1.4\,\sigma}\right)^2} $$
where $z_{\alpha/2}$ and $z_{\beta}$ are normal percentile values at
significance level $\alpha$ and false negative rate $\beta$,
$\delta$ represents the minimum detectable
log2 ratio, and $\sigma$ represents the SD of log ratio
values. For $\alpha = 0.001$ and $\beta = 0.05$, $z_{\alpha/2} = -3.29$
and $z_{\beta} = -1.65$. Assume $\delta = 1.0$ (2-fold
change) and $\sigma = 0.25$. Therefore $n \approx 12$ samples
(6 query and 6 control).
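The arithmetic can be reproduced with the Python standard library. This is a sketch: `replicates_needed` is a hypothetical helper name, and it uses the magnitudes of the normal percentiles, which gives the same result because their sum is squared.

```python
import math
from statistics import NormalDist

def replicates_needed(alpha, beta, delta, sigma):
    """Total sample count n = 4 (z_{alpha/2} + z_beta)^2 / (delta / (1.4 sigma))^2,
    rounded up to a whole number."""
    std_normal = NormalDist()
    z_alpha_half = std_normal.inv_cdf(1 - alpha / 2)  # about 3.29 for alpha = 0.001
    z_beta = std_normal.inv_cdf(1 - beta)             # about 1.645 for beta = 0.05
    n = 4 * (z_alpha_half + z_beta) ** 2 / (delta / (1.4 * sigma)) ** 2
    return math.ceil(n)

n_total = replicates_needed(alpha=0.001, beta=0.05, delta=1.0, sigma=0.25)
print(n_total)  # 12 in total, i.e. 6 query and 6 control
```

Halving the detectable effect to `delta=0.5` roughly quadruples the requirement, since n scales with 1/delta squared.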