Title: Microarrays for Gene Expression Analysis
1Microarrays for Gene Expression Analysis
Questions:
- What genes are expressed in this tissue under these conditions?
- What genes are expressed in my treated cells versus the control?
- What genes are expressed during the phases of the cell cycle?
- What genes are expressed in diseased tissue versus normal tissue?
2Microarrays other uses
Questions:
- What point mutations exist, and what bases are located at the substitution positions?
- What bases are substituted where there are multiple mutations very close together?
- Which allele of this gene do we have? Is this the mutant or the wildtype?
3Goals
- Finding Co-Regulated Genes
- Understanding Gene Regulatory Networks
4Expressed Genes: mRNA
DNA -> messenger RNA -> protein
5Expressed Genes Currently Transcribed
Extract RNA
Isolate mRNAs
6Affymetrix Oriented
- Fluorescently tagged cRNA
- One chip per sample
- One for control
- One for each experiment
- Other methods include two dyes/one chip
- Red dye
- Green dye
- Control and experiment on same chip
7Creating Targets
PCR Amplification of DNA
In Vitro transcription to create cRNA
8RNA-DNA Hybridization
Targets: RNA
Probe sets: DNA (25-base oligonucleotides of known sequence)
9Non-Hybridized Targets are Washed Away
Targets (fluorescently tagged)
probe sets (oligos)
Non-bound ones are washed away
10Picture of Gene Chip
12Handling Chip
13Argon laser excitation at 488 nm; emission at 570 nm
Scanner based on epifluorescence confocal microscopy
14Custom Chips vs Affy Chips
- Affy chips contain thousands of gene probes
- Genes selected from sources such as GenBank
- Custom chips can be designed for individual investigators
- Few genes, but more copies of each
- Done on microscope slide
15Example Affy Chips
- Rat Toxicology Chip: >850 genes
- CYP450s, heat shock proteins
- Drug transporters
- Stress-activated kinases
- Rat Neurobiology Chip: >1,200 genes
- Synuclein 1, prion protein, Huntington's disease
- Syntaxin, neurexin, neurotransmitters
16Example Affy Chips
- Arabidopsis Genome Chip
- Murine Genome Chip: >36,000 genes
- E. coli Genome Chip: >4,200 ORFs
- Drosophila Genome Chip: >13,500 sequences
- Yeast Genome Chip: >6,400 ORFs
- Human Genome Chip: >>60,000 human genes
17 Definitions
Probe: a single-stranded DNA oligonucleotide complementary to a specific sequence. Each probe cell consists of millions of probe molecules.
Probe Array: a collection of probe sets.
Probe Set: a set of probes designed to detect one transcript, 16-20 probe pairs. A 20-probe-pair set is made up of 20 PM and 20 MM cells, for a total of 40 probe cells.
Probe Pair: two probe cells, a PM and its corresponding MM.
Perfect Match (PM): probes that are designed to be complementary to the reference sequence.
MisMatch (MM): probes that are designed to be complementary to the reference sequence except for 1 base.
Target: sequence from your sample.
18GeneChip Hierarchy
Probe Array (chip)
Probe Set: 16-20 probe pairs (to detect a particular gene)
Probe Pair
Probe Cell (MisMatch): 20
Probe Cell (Perfect Match): 20
Probes: < 25 bases (millions of copies)
Pixels: 24 sq. um
19probe cell
probe pair
Probe Array (chip)
21Probe a single-stranded DNA oligonucleotide
complementary to a specific sequence. Each probe
cell consists of millions of the same probe
molecules. The intensity of each cell is an
average of each of its scanned pixels.
Pixel: 3-24 um
Probe Cell: 20-50 um
22Affymetrix Tiling Strategies
- Standard
- Alternative
- Block
- Expression
23Affymetrix Standard Tiling
- Purpose: detection of mutations and polymorphisms, and determination of which base is at a certain position.
- Probes are arranged in sets of 4.
- Each probe in a set of 4 has one of the 4 bases at the substitution position.
- Compares the four target-to-probe hybrid intensities in each set to identify the base in the substitution position.
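The comparison step in standard tiling can be sketched as picking the probe with the strongest signal in each set of four. This is a minimal illustration, not the Affymetrix base-calling algorithm: the function name `call_base`, the `min_ratio` cutoff, and the intensity values are all assumptions made for the example. The function reports the base carried by the winning probe; whether that is the target base or its complement depends on how the probe sequences are written.

```python
def call_base(intensities, min_ratio=1.2):
    """Return the base whose probe hybridized most strongly, or 'N' if ambiguous.

    intensities: dict mapping 'A', 'C', 'G', 'T' to the measured intensity of
    the probe carrying that base at the substitution position.
    min_ratio: hypothetical cutoff (not from the slides); the best probe must
    beat the runner-up by at least this factor, otherwise no call is made.
    """
    ranked = sorted(intensities.items(), key=lambda kv: kv[1], reverse=True)
    (best_base, best), (_, second) = ranked[0], ranked[1]
    if second > 0 and best / second < min_ratio:
        return "N"  # two probes are too close to distinguish
    return best_base


# Made-up intensities for one set of four probes.
example = {"A": 120.0, "C": 95.0, "G": 1450.0, "T": 110.0}
print(call_base(example))
```

A set where two probes give nearly equal signal returns "N", which is one simple way to flag positions that need a closer look.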
24Affymetrix Standard Tiling
25Affymetrix Alternative Tiling
- Purpose: determination of base where multiple mutations are close together, as opposed to a single point mutation.
- Probes are arranged in sets of 5.
- Includes a single base deletion at the substitution point.
- Compares the target-to-probe hybrid intensities in each set to identify the base in the substitution position.
26Affymetrix Alternative Tiling
27Affymetrix Block Tiling
- Purpose: determination of genotype, wildtype or mutant. Determines which allele is present.
- Probes are arranged in sets of 5.
- Includes a single base deletion at the substitution point.
- Compares the target-to-probe hybrid intensities in each set to identify the base in the substitution position.
28Affymetrix Block Tiling
[Figure: block tiling layout; labels M, S and positions -2, -1, 1, 2]
29Affymetrix Expression Tiling
- Purpose: measure the relative abundance of various mRNAs.
- A set of probe pairs for each mRNA
- PM: perfect match
- MM: mismatch by one base
- Software compares the hybridization intensities of the PM to those of the MM to determine the absolute or difference call for each probe set.
30Affymetrix Expression Tiling
TARGET ACGGATG
PM ACGGATG
MM ACAGATG
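The PM/MM pair above differs at exactly one position. A small sketch that locates the difference, using the sequences from the slide; the function name `mismatch_positions` is made up for this example.

```python
def mismatch_positions(pm, mm):
    """Return the 0-based positions where the PM and MM probes differ."""
    if len(pm) != len(mm):
        raise ValueError("PM and MM probes must have the same length")
    return [i for i, (a, b) in enumerate(zip(pm, mm)) if a != b]


pm = "ACGGATG"
mm = "ACAGATG"
print(mismatch_positions(pm, mm))  # [2]
```

A properly designed probe pair should always give a list of length one.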
31Data Analysis for Gene Expression
.dat file (pixel readings)
.cel file (average intensities, etc. are calculated)
.chp file (parameters are calculated)
data mining, statistical analysis
32Raw Data to Cooked Data
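[Figure: distribution of pixel intensities within a probe cell (number of pixels vs. intensity), summarized as the probe cell average intensity]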
33Raw Data to Cooked Data
- Calculate the Average Intensity of every probe cell
- Calculate the background
- Subtract the background
- Calculate the Noise (pixel-to-pixel variation within a probe cell)
- Determine numbers of Positive and Negative probe pairs for every probe set.
- Positive Probe Pair: PM intensity > MM intensity
- Negative Probe Pair: MM intensity > PM intensity
- Calculate Positive Fraction
- Calculate Pos/Neg Ratio
- Calculate Log Average Ratio and Avg Difference
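The per-probe-set quantities listed above can be sketched as follows. This uses the simplified definitions from the slide (a pair is positive when PM > MM); production software may apply additional thresholds. The name `probe_set_metrics`, the unscaled log10 form of the log average ratio, and the intensity values are assumptions made for the illustration.

```python
import math


def probe_set_metrics(pairs):
    """Summarize one probe set.

    pairs: list of (PM, MM) background-subtracted intensities, all > 0.
    """
    n = len(pairs)
    positive = sum(1 for pm, mm in pairs if pm > mm)
    negative = sum(1 for pm, mm in pairs if mm > pm)
    return {
        "positive": positive,
        "negative": negative,
        "positive_fraction": positive / n,
        # Undefined when no pair is negative; reported here as infinity.
        "pos_neg_ratio": positive / negative if negative else float("inf"),
        # Assumed form: mean of log10(PM / MM); any scaling factor is omitted.
        "log_avg_ratio": sum(math.log10(pm / mm) for pm, mm in pairs) / n,
        "avg_diff": sum(pm - mm for pm, mm in pairs) / n,
    }


# Made-up (PM, MM) intensities for a four-pair probe set.
pairs = [(900, 300), (750, 250), (400, 500), (1200, 400)]
metrics = probe_set_metrics(pairs)
print(metrics["positive"], metrics["negative"], metrics["avg_diff"])
```

With these values three of the four pairs are positive, so the positive fraction is 0.75 and the Pos/Neg ratio is 3.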
34Absolute Analysis Parameters
- Probe Set Name
- Positive: number of pairs scored positive
- Negative: number of pairs scored negative
- Pairs: number of probe pairs for a probe set
- Pairs Used: those not masked for some reason
- PairsInAvg: excludes those with extremely intense or weak scores
35Absolute Analysis Parameters
- PM Excess: PM cells that have exceeded the limit for intensity
- MM Excess: MM cells that have exceeded the limit for intensity
- Avg Diff: average difference of fluorescence intensity between the PM and MM cells.
- Log Avg Ratio: a measure of the hybridization performance
- Higher = better
- Log Avg Ratio = 0 indicates random cross-hybridization
- Pos/Neg: ratio of positive probe pairs to negative probe pairs
- Positive Fraction: positive probe pairs / probe pairs
- Abs Call: Present, Absent or Marginal. Is this gene present in this sample?
36Raw Data to Cooked Data
Positive Fraction, Pos/Neg Ratio, Log Avg Ratio
-> Decision Matrix
-> Absolute Call (Present, Absent, Marginal)
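The slides do not give the contents of the decision matrix, so the following is only a toy stand-in showing the general idea of turning three metrics into one call. Every threshold and the voting rule itself are hypothetical and are not the values used by the Affymetrix software.

```python
# HYPOTHETICAL thresholds, chosen only to make the example concrete.
POSITIVE_FRACTION_MIN = 0.5
POS_NEG_RATIO_MIN = 3.0
LOG_AVG_RATIO_MIN = 0.1


def absolute_call(positive_fraction, pos_neg_ratio, log_avg_ratio):
    """Toy decision rule: count how many metrics exceed their threshold."""
    votes = sum([
        positive_fraction > POSITIVE_FRACTION_MIN,
        pos_neg_ratio > POS_NEG_RATIO_MIN,
        log_avg_ratio > LOG_AVG_RATIO_MIN,
    ])
    if votes == 3:
        return "Present"
    if votes == 2:
        return "Marginal"
    return "Absent"


print(absolute_call(0.75, 4.0, 0.3))
print(absolute_call(0.30, 0.5, 0.0))
```

The point of the sketch is the shape of the computation: several per-probe-set metrics go in, one categorical call comes out.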
37Data Analysis
Absolute Analysis: used to determine whether transcripts represented on the probe array are detected or not within one sample (uses data from one probe array experiment).
Comparison Analysis: used to determine the relative change in abundance for each transcript between a baseline and an experimental sample (uses data from two probe array experiments). Intensities for each experiment are compared to a baseline/control.
38Approaches
- What genes are Present/Absent in my tissue?
- What genes are Present/Absent in the experiment vs control?
- Which genes have increased/decreased expression in experiment vs control?
- Which genes have biological significance based on my knowledge of the biological system under investigation?
39Approaches to Data Analysis
Database Queries
Graphical Analysis Statistical Analysis
Biological Knowledge
40Set Filter Parameters
Adjust filter parameters
Query
Pivot
Scatter/Fold Graph
Select Points
Add probe sets to filter
Bar Graph
Identify interesting relationships
42Comparison Analysis Parameters
- Inc number of probe pairs that increased
- Dec number of probe pairs that decreased
- Inc Ratio
- Dec Ratio
- Max Inc Dec Ratio
- Pos Change
- Neg Change
- Inc/Dec
- DPos-DNeg Ratio
- Log Avg Ratio Change
- Diff Call: did this gene increase or decrease?
- Increase, Marginal Increase, Decrease, Marginal Decrease, No Change
43Comparison Analysis Parameters(continued)
- Avg Diff Change: how much did the difference between PM and MM change from the control to the treated? (Avg Diff Exp - Avg Diff Control)
- B=A: was this gene present in the control?
- Fold Change: how many times more expression did the treated have compared to the control? (positive or negative)
- Sort Score: a ranking based on fold change and avg diff change
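A minimal sketch of the two change measures, using the plain-language definitions from the slide. The signed fold change below (a ratio, negated for decreases) is a simplified assumption; the exact formula used by the analysis software is not given here. Function names and input values are made up for the example.

```python
def avg_diff_change(avg_diff_exp, avg_diff_ctrl):
    """Avg Diff Change = Avg Diff (experiment) - Avg Diff (control)."""
    return avg_diff_exp - avg_diff_ctrl


def fold_change(avg_diff_exp, avg_diff_ctrl):
    """Signed fold change: positive for increases, negative for decreases.

    Simplified assumption: both inputs must be positive, since a ratio of
    non-positive average differences has no meaningful interpretation.
    """
    if avg_diff_exp <= 0 or avg_diff_ctrl <= 0:
        raise ValueError("average differences must be positive")
    if avg_diff_exp >= avg_diff_ctrl:
        return avg_diff_exp / avg_diff_ctrl
    return -(avg_diff_ctrl / avg_diff_exp)


print(avg_diff_change(600.0, 150.0))  # 450.0
print(fold_change(600.0, 150.0))      # 4.0
print(fold_change(150.0, 600.0))      # -4.0
```

Reporting decreases as negative fold changes keeps increases and decreases on a symmetric scale, which makes sorting by magnitude straightforward.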
44Data Analysis
Filter/Query: select those oligos which have shown a real, significant change.
45Filter Sort to Find Real Changes
Avg. Difference Change > 200
and Fold Change > 3 and INC > 70 and DEC = 0
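The filter can be sketched as a simple predicate over comparison results. The last condition is read here as DEC = 0, which is an assumption about the slide; the field names, probe set names, and data rows are hypothetical.

```python
def is_real_change(row):
    """Apply the slide's filter: large, consistent increases only."""
    return (
        row["avg_diff_change"] > 200
        and row["fold_change"] > 3
        and row["inc"] > 70
        and row["dec"] == 0  # assumed reading of "DEC 0"
    )


# Hypothetical comparison results for three probe sets.
rows = [
    {"probe_set": "set_A", "avg_diff_change": 450, "fold_change": 4.0, "inc": 80, "dec": 0},
    {"probe_set": "set_B", "avg_diff_change": 150, "fold_change": 5.0, "inc": 90, "dec": 0},
    {"probe_set": "set_C", "avg_diff_change": 900, "fold_change": 3.5, "inc": 75, "dec": 2},
]

hits = [row["probe_set"] for row in rows if is_real_change(row)]
print(hits)  # ['set_A']
```

Only set_A passes: set_B fails the Avg Diff Change cutoff and set_C has decreasing probe pairs.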
46Query Results
47Query Results
       Probe1  Probe2  Probe3
Exp1   1       4       9
Exp2   3       6       8
48Pivot the Query Results
- Experiments: columns
- Genes: rows
- Shows how genes change across experiments
49Pivot Results
         Exp1  Exp2
Probe1   1     3
Probe2   4     6
Probe3   9     8
50Pivoted Data Can be sorted by any parameter.
Sort in descending order to show greatest
differences.
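Pivoting and sorting as described above can be sketched with plain dictionaries, using the values from the query result. The function name `pivot` and the choice to sort by absolute difference between the two experiments are assumptions made for the example.

```python
# Query result: one row per experiment, one column per probe.
query = {
    "Exp1": {"Probe1": 1, "Probe2": 4, "Probe3": 9},
    "Exp2": {"Probe1": 3, "Probe2": 6, "Probe3": 8},
}


def pivot(by_experiment):
    """Turn experiment -> probe -> value into probe -> experiment -> value."""
    pivoted = {}
    for experiment, probes in by_experiment.items():
        for probe, value in probes.items():
            pivoted.setdefault(probe, {})[experiment] = value
    return pivoted


pivoted = pivot(query)

# Sort probes in descending order of the change between experiments.
ranked = sorted(
    pivoted,
    key=lambda probe: abs(pivoted[probe]["Exp2"] - pivoted[probe]["Exp1"]),
    reverse=True,
)

for probe in ranked:
    print(probe, pivoted[probe]["Exp1"], pivoted[probe]["Exp2"])
```

After pivoting, each probe is a row, so a gene's behavior across experiments can be read left to right.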
51Description of gene
Link to NCBI
53Graphical Analyses
Scatter Plot Graph requires Control vs Experiment
55Fold Change Graphs
How many times did the expression of this gene
change in the treated tissue versus the control?
58Statistical Techniques
59Statistical Techniques
- Self-Organizing Maps
- Correlation Coefficient Clustering
- Analysis Function
- Matrix
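Correlation-coefficient clustering depends on a similarity measure between expression profiles. A minimal sketch of one common choice, the Pearson correlation coefficient, is shown here; the gene names and expression values are hypothetical, and a real clustering tool would go on to group genes based on such pairwise scores.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (norm_x * norm_y)


# Hypothetical expression levels across four conditions.
profiles = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.1, 7.9, 16.0],  # rises with geneA
    "geneC": [8.0, 4.0, 2.0, 1.0],   # falls as geneA rises
}

print(round(pearson(profiles["geneA"], profiles["geneB"]), 3))
print(round(pearson(profiles["geneA"], profiles["geneC"]), 3))
```

Profiles that rise and fall together score close to +1 and are candidates for the same cluster; opposite trends score negative.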
60Self-Organizing Maps
- Clustering
- Automatically discovering classes
61Self-Organizing Maps
62Self-Organizing Maps
Genes whose expression levels rise and fall together under the same conditions cluster together. Co-regulated?
65Statistical Techniques
- Average
- Standard Deviation
- Median
- T-test: parametric comparison of means.
- assumes a normal distribution
- Mann-Whitney: for the comparison of two samples
- non-parametric
- no assumption about underlying distribution
66Is there any difference in the expression pattern for Exp1 versus the expression pattern for Exp2? A Mann-Whitney test (non-parametric comparison) can help answer such questions.
         Exp1  Exp2
Probe1   1     3
Probe2   4     6
Probe3   9     8
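The Mann-Whitney U statistic for the two columns above can be computed by direct pair counting. This sketch produces only the statistic; turning it into a p-value requires a table or a statistics library, and with three values per group the smallest possible two-sided exact p-value is 0.1, so little can be concluded from data this small. The function name `mann_whitney_u` is made up for the example.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) by counting pairs; ties count as half."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


exp1 = [1, 4, 9]  # Probe1..Probe3 in Exp1
exp2 = [3, 6, 8]  # Probe1..Probe3 in Exp2

u1, u2 = mann_whitney_u(exp1, exp2)
print(u1, u2)  # 4.0 5.0
```

The two U values are nearly equal, which is what one would expect when neither experiment tends to have higher values than the other.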
67Uses of Expression Analysis
- hypothesis generation
- hypothesis testing
- need for replication
69Microarray DBs on the Web
http://www.biologie.ens.fr/en/genetiqu/puces/bddeng.html
73Free SOM Software: http://rana.lbl.gov/EisenSoftware.htm
74Contacts
Brad Yoder 934-0994
Li Hong Teng 934-0995
Aubrey Hill 934-4069
www.affymetrix.com
Michael Eisen Lab at Lawrence Berkeley Labs: http://rana.lbl.gov/
Stanford MicroArray Database: http://genome-www4.stanford.edu/MicroArray/SMD/
Review of Currently Available Microarray Software: http://www.the-scientist.com/yr2001/apr/profile1_010430.html