Title: Plotting the path from RNA to microarray: the importance of experimental planning and methods
1Plotting the path from RNA to microarray: the
importance of experimental planning and methods
- Glenn Short
- Microarray Core Facility/Lipid Metabolism Unit
- Massachusetts General Hospital
2Talk Outline
- Why perform a microarray experiment?
- Choosing a microarray platform
- Sources of variability that lead to experimental considerations
- Overcoming experimental variability
3Why perform a microarray experiment?
- Genomic vantage point
- Detect gene expression
- Compare gene expression levels
- Over time
- Over treatment course
- Map genes to phenotypes
- Map deleted or duplicated regions
- Identify genes that modulate other genes
- Binary decision-making
4When not to perform a Microarray Experiment
- Interested in a small number of specific genes (QRT-PCR, Northern blots)
- Desire quantitative results
- Low tolerance of variability
- Cannot afford to perform experiment with adequate
replication
5Asking a Specific Question
- The most fundamental step, and the MOST IMPORTANT
- Simplifies experimental design
- Empowers interpretation of data
- "Simplicity, simplicity, simplicity! I say let your affairs be as one, two, three and to a hundred or a thousand. We are happy in proportion to the things we can do without." --Henry David Thoreau
6Considerations of Microarray Experimental Design
- Which microarray platform will be used?
- What is the end goal of the experiment?
- What is the specific question being asked?
- What are the most pertinent comparisons?
- What controls will be applied to the experiments?
- Which statistical methods will be used during data analysis?
- What methods will be used to verify results from the microarrays?
7Choosing a Microarray Platform
- Are genes of interest included on the array?
- Are genes replicated?
- Tiling of genes that undergo splicing
- Controls on array
- Quantity of RNA needed for testing
- Are the arrays adequately QC'd?
- Cost
8Affymetrix Platform
9Affymetrix Platform
10Affymetrix Platform
- Pros
- standardized production
- gene replication
- probe tiling across gene
- Reproducible
- Affymetrix custom database user-friendly
- Cons
- Expensive
- Annotation differences
- single sample per chip
11cDNA Platform
cDNA clones (probes)
- Pros
- Genome sequence independent
- High stringency hybridization
- Little need for signal amplification
- Cons
- Clone handling
- Clone authentication
- cDNA resources difficult to access and often cross-contaminated
1. PCR product amplification
2. Purification
3. Printing
PCR products used as probes
12Spotted oligonucleotide Platform
Synthesized oligonucleotides in 384 well plates
- Pros
- Complete control over oligo sequences
- Absence of contamination
- Additional probes may be added when needed
- Flexibility of design, probe replication, and tiling
- Inexpensive, enabling experimental replication
- Cons
- Sequence data required for probe design
- No consensus set of probe design algorithms
- Must have arraying instrumentation
- Purification
- QC
- Printing
Oligonucleotides used as probes
13Spotted Oligonucleotide vs Affymetrix Arrays
Oligonucleotide Affymetrix
14ParaBioSys Platform
- Long Oligonucleotides, 70mer
- Designed and synthesized in-house
- 5'-amine modified
- Extensively QC'd
- Probes designed to the 5'-ORF
- Set is updated as known ORF list grows
- Currently 20,000 probes
15ParaBioSys probe design and synthesis
- Probe design using OligoPicker
- based on the GenPept database
- Tm values of selected oligos are approximately the same
- improved specificity
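As an illustration of keeping probe Tm values close together, the sketch below filters candidate oligos by an estimated Tm. It uses a basic GC-content formula for oligos longer than about 13 nt, which is an assumption made here for illustration and is not OligoPicker's own method. The function names, target Tm, and tolerance are hypothetical.

```python
def gc_tm(seq: str) -> float:
    """Rough Tm estimate (deg C) from base composition.

    Basic GC-content formula, Tm = 64.9 + 41 * (GC - 16.4) / N.
    Illustrative only; real probe design uses more detailed models.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def within_tm_window(candidates, target_tm, tolerance=2.0):
    """Keep candidates whose estimated Tm is within +/- tolerance of target."""
    return [s for s in candidates if abs(gc_tm(s) - target_tm) <= tolerance]


# Two made-up 70mers: one with 50% GC, one with no GC at all.
candidates = ["G" * 35 + "A" * 35, "A" * 70]
kept = within_tm_window(candidates, target_tm=75.0)
print(len(kept))  # only the 50% GC oligo falls inside the window
```

A narrow Tm window means all probes behave similarly at a single hybridization temperature, which is the motivation given on the slide.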
16Oligonucleotide Quality Control
(Figure panels: pass, fail)
- Use of mass spectral analysis
- Identifies relative abundance
- Ensures probe is of the expected mass based upon
sequence
- Capillary Electrophoresis
- Identifies relative abundance of full-length
product
17Array Quality Control
- Spotted probes are 3'-labeled with dCTP-Cy3 using terminal deoxynucleotidyl transferase
- First and last arrays of the print run are QC'd
18Understanding sources of variability in
microarray experiments
19Sources of Variation
- Differences in identical treatments
- Intrinsic biological variation
- Technical variation in extraction and labeling of RNA samples
- Technical variation in hybridization
- Spot size variation
- Measurement error in scanning
20When graphing expression data, use log
(Figure axes: ratio (T/C), 0 to 20; log2 ratio (T/C), -4 to 4)
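A small calculation shows why the log scale helps. On the raw scale, all down-regulation is squeezed between 0 and 1 while up-regulation is unbounded; on the log2 scale, equal fold changes up and down are the same distance from zero. The intensities below are made-up values for illustration.

```python
import math


def log2_ratio(treated: float, control: float) -> float:
    """log2 of the treated/control intensity ratio."""
    return math.log2(treated / control)


# Hypothetical (treated, control) intensity pairs.
pairs = [(320.0, 10.0), (10.0, 320.0), (100.0, 100.0)]
for t, c in pairs:
    print(f"ratio={t / c:.5f}  log2 ratio={log2_ratio(t, c):+.1f}")
# The 32-fold up and 32-fold down genes map to +5.0 and -5.0.
```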
21Plotting expression data
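(Figure: scatter plot of log2 T vs log2 C, and the corresponding M vs A plot)
- M = log ratio, log2(T/C), vs A = log geometric mean, (log2 T + log2 C)/2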
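The M and A values for a single spot can be computed as follows. This is a minimal sketch using the standard definitions, with T and C taken to be the treated and control channel intensities; the function name is hypothetical.

```python
import math


def ma_values(t: float, c: float) -> tuple[float, float]:
    """Return (M, A) for one spot.

    M is the log2 ratio of the two channels.
    A is the log2 of their geometric mean, i.e. the average log2 intensity.
    """
    m = math.log2(t) - math.log2(c)
    a = 0.5 * (math.log2(t) + math.log2(c))
    return m, a


m, a = ma_values(1024.0, 256.0)
print(m, a)  # 2.0 9.0
```

Plotting M against A rotates the log2 T vs log2 C scatter by 45 degrees, so intensity-dependent bias shows up as curvature away from M = 0.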
22Expression data-cont
(Figure: log2(Ti/Ci) plotted from low expressed to highly expressed genes)
- log2(Ti/Ci) = 5: genes expressed up relative to reference by a factor of 32
- log2(Ti/Ci) = -5: genes expressed down relative to reference, to 1/32 of the reference level
23Differences Due to Treatment
- RNA isolation protocol differences
- Cell-culture media changes
- Expression differences over time
- Cell cycle genes (synchronization)
- Variables need to be minimized!
24Biological Variability
- Self-self hybridizations of four independent biological replicates
- Biological variability of inhibitory PAS domain protein
25Technical Variability
(Figure axis labels: Sample 1, Sample 2, Sample 3)
- Self-self hybridization (cerebellar vs cerebellar)
- Samples 1 and 2 labeled together and hybridized on separate slides
- Sample 3 labeled separately
- Arises from differences in labeling, RT efficiency, hybridization, arrays, etc.
26Dye Effects
Environmental Health Perspectives, March 2004, Vol. 112, No. 4
- Variation in quantum yield of fluorophores
- Variation in the incorporation efficiency
- Differential dye effects on hybridization
27Hybridization Variability
28Printing Variability
29Differences in Probe Performance
(Figure panels: Academic_1, Academic_2, ParaBioSys, Vendor)
- Different probe design algorithms will change the observed expression pattern
- Once a platform is chosen, all future comparisons should be performed on the same platform
- Cross-platform comparisons can serve as a means of validation
30Differences Across Commercial Platforms
P < 0.001
Nucleic Acids Research, 2003, Vol. 31, No. 19,
5676-5684
31Controlling Variability
Experimental Plan
32Increased Quality Control
- Probe QC
- Array QC
- Total RNA QC
- denaturing agarose gel
- Agilent Bioanalyzer
- Labeling QC
33Controlling biological and technical variability
with replication
(Figure panels: Integrin alpha 2b, Pro-platelet basic protein)
- Average across replicates
- Essential to the estimation of variance
- Critical for valid statistical analysis
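A minimal sketch of what replication buys for a single gene: with several replicate log2 ratios, the mean, the variance, and a test statistic can all be estimated. The slides do not name a specific test; the one-sample t statistic against a mean of zero used here is an assumption, and the function name and replicate values are hypothetical.

```python
import math
import statistics


def summarize_replicates(log_ratios):
    """Return (mean, sample variance, t statistic) for one gene.

    The t statistic tests the mean log2 ratio against 0 (no change).
    It is None when the replicates are identical, since the standard
    error is then zero. Needs at least two replicates.
    """
    n = len(log_ratios)
    mean = statistics.mean(log_ratios)
    var = statistics.variance(log_ratios)
    sem = math.sqrt(var / n)
    t_stat = mean / sem if sem > 0 else None
    return mean, var, t_stat


# Three made-up replicate log2 ratios for one gene.
mean, var, t_stat = summarize_replicates([1.0, 2.0, 3.0])
print(mean, var, round(t_stat, 3))
```

With a single array there is no variance estimate at all, which is why replication is a precondition for any statistical analysis.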
34Controlling Dye Effects
(Figure labels: T and C, shown in two arrangements)
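One common way to control dye effects is a dye swap: the same T vs C comparison is hybridized twice with the dye assignments reversed. The sketch below assumes a simple additive model in which each measured log ratio is the true effect plus a gene-specific dye bias; the model, the function name, and the numbers are illustrative assumptions.

```python
def dye_swap_estimate(m_forward: float, m_swapped: float):
    """Combine a dye-swap pair of log2(Cy5/Cy3) ratios for one gene.

    m_forward: T labeled with Cy5, C with Cy3 -> effect + bias
    m_swapped: C labeled with Cy5, T with Cy3 -> -effect + bias
    Returns (estimated log2(T/C), estimated dye bias).
    """
    effect = (m_forward - m_swapped) / 2
    dye_bias = (m_forward + m_swapped) / 2
    return effect, dye_bias


# Made-up gene: true log2(T/C) = 1.0, dye bias = 0.5.
effect, bias = dye_swap_estimate(1.5, -0.5)
print(effect, bias)  # 1.0 0.5
```

Under this model the bias cancels in the difference, so the estimate of the treatment effect no longer depends on which dye each sample received.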
35Controlling Variability through Experimental
Design
- Replication
- Spot
- Multiple arrays per sample comparison (technical)
- Dye swap
- Multiple samples per treatment group (biological)
- Increased precision and quality control
- Estimate measurement error
- Estimate biological variation
- Pooling
- Reduce biological variation
36Controlling Variability through Experimental
Design cont.
- Normalize data to correct for systematic differences (spot intensity, location on array, hybridization, dye, scanner, scanner parameters) on the same slide or between slides that are not a result of biological variation between mRNA samples
- Minimize printing differences by using a contiguous series of slides from the same print run
- If wanting to do historical comparisons, use the same platform
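The simplest form of the normalization mentioned above is to center the log ratios of each slide on their median. This sketch assumes most genes on the array are unchanged between samples; the slides do not specify a normalization method, and intensity-dependent or location-dependent methods are often used instead. The function name and values are hypothetical.

```python
import statistics


def median_center(log_ratios):
    """Subtract the slide-wide median so the typical log2 ratio is 0."""
    med = statistics.median(log_ratios)
    return [m - med for m in log_ratios]


# Made-up log2 ratios from one slide with a systematic shift of +1.5.
normalized = median_center([0.5, 1.5, 2.5])
print(normalized)  # [-1.0, 0.0, 1.0]
```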
37Planning your experiment
- Experimental Aim
- Specific questions and priorities among them
- How will the experiments answer the questions posed?
- Experimental logistics
- Types of total RNA samples
- Reference, control, cell line, tissue sample, treatment A.
- How will the samples be compared?
- Number of arrays needed
- Other Considerations
- Plan of experimental process prior to hybridization
- Sample isolation, RNA extraction, amplification, pooling, labeling
- Limitations: number of arrays, amount of material
- Extensibility (linking)
38Planning your Experiment- cont
- Other Considerations-cont
- Controls: positive, negative, in-spike controls
- Methods of verification
- QRT-PCR, Northern, in situ hybridization
- Performing the experiment
- Reagents (arrays from the same print run), equipment (scanners), order of hybridizations
39Controls
- Positive Controls
- used to ensure that target DNAs are labeled to an acceptable specific activity
- single pool of all probe elements on array
- Negative Controls
- used to assess the degree of non-specific cross-hybridization
- probes derived from organisms with no known homologs/paralogs to the organism of study
- derived in silico (alien sequences)
- In-spike controls
- Known amounts of polyadenylated mRNAs added to each labeling reaction
- Should not cross-hybridize with any probe sequences
- Alien sequences
- Spot-report (Stratagene)
- Lucidea ScoreCard (Amersham Biosciences)
- Can be used to assess dynamic range of the system
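One way to use in-spike controls for assessing dynamic range is to fit measured signal against the known spiked amount on a log-log scale. The sketch below is an illustrative assumption, not a procedure from the slides or from either commercial kit: a slope near 1 suggests the signal tracks input proportionally over the tested range. The amounts and signals are made-up values.

```python
import math


def loglog_fit(amounts, signals):
    """Least-squares fit of log2(signal) against log2(amount).

    Returns (slope, intercept). Requires at least two distinct amounts.
    """
    xs = [math.log2(a) for a in amounts]
    ys = [math.log2(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical spike amounts (arbitrary units) and measured signals.
slope, intercept = loglog_fit([1, 2, 4, 8], [100, 200, 400, 800])
print(round(slope, 3))
```

Spike amounts where the fitted line stops describing the data, at the low or high end, mark the limits of the usable range.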
40Validation
- If you have failed to validate your array data, you have NOT completed your analysis
- ParaBioSys has developed Primer Bank for QRT-PCR primer sequences
- http://pga.mgh.harvard.edu/primerbank/
41Many thanks for your attention
https://dnacore.mgh.harvard.edu
http://pga.mgh.harvard.edu
Glenn Short, Microarray Core, Massachusetts General Hospital