Title: Using Web-Based Tools for Microarray Analysis
1Using Web-Based Tools for Microarray Analysis
Michael Elgart
2Outline
- Introduction to microarrays: why use them and what to expect from their results
  - What are they?
  - Why use them?
  - What types are there?
- Low level analysis
  - Background correction
  - Normalization
  - Quality control
- Significance analysis
- Annotations
- Functional Analysis
  - Gene Ontology
  - Promoter Analysis
4What is a microarray?
- A tool for analyzing gene expression that
consists of a small membrane or glass slide
containing samples of thousands of genes arranged
in a regular pattern.
5The Boom of Microarray Technology
Number of Publications with Affymetrix Chips
6What's the Point?
- Large scale (genome-wide) screening
- Eliminate bias of pre-selecting candidate genes
- Test multiple hypotheses simultaneously
- Generate new hypotheses by identifying novel genes associated with experiment
- Identify novel relationships/patterns among genes
7GEO Public Database Example
9What are DNA microarrays?
- Microarrays are a method of scanning the genome based on a well-known property of nucleic acids (hybridization)
- Complementary strands of DNA/RNA will find each other in solution
10Types of DNA Microarray Experiments
Some types of experiments that can be done:
- Measure changes in gene expression
  - RNA hybridizes to DNA
- Identify genomic gains and losses
  - Genomic DNA hybridizes to DNA
- Identify mutations in DNA
  - PCR product hybridizes to DNA
11Expression Microarray Basics
- Two parts
  - Probes: the single-stranded DNA molecules on the solid surface
  - Targets: the single-stranded, labeled population from your experimental source
12Microarray Overview
13Probe deposition on array
- Contact printing
- Ink jet spraying
- On chip synthesis
14Pin Spotting of DNA Arrays
- Can be automated or manual
- Relatively cheap but may result in QC issues with
spots
10 per 100 probe array
15Under the microscope
16Ink jet spraying
17Ink jet sprayed spots on a chip
18Affymetrix
- We will be dealing mainly with this type today, so here is a little more detail
19On chip synthesis
Lithography
22Set of probes that identifies a transcript
ProbeSet
23Affymetrix
- Gene Expression Arrays (transcripts/genes)
  - Arabidopsis Genome: 24,000
  - C. elegans Genome: 22,500
  - Drosophila Genome: 18,500
  - E. coli Genome: 20,366
  - Human Genome U133 Plus: 47,000
  - Mouse Genome: 39,000
  - Yeast Genome: 5,841 (S. cerevisiae), 5,031 (S. pombe)
  - Rat Genome: 30,000
  - Zebrafish: 14,900
  - Plasmodium/Anopheles: 4,300 (P. falciparum), 14,900 (A. gambiae)
  - Barley (25,500), Soybean (37,500 + 23,300 pathogen), Grape (15,700)
  - Canine (21,700), Bovine (23,000), B. subtilis (5,000), S. aureus (3,300 ORFs), Xenopus (14,400)
24Spots on an Affymetrix chip printed using
photolithography
25DNA Deposition on Array
2um
Taken from Duggan et al., Nature Genetics 21:10
27RNA Quality and Quantity
28S rRNA
18S rRNA
Degraded sample
28Hybridization expression level
- The amount of hybridization of RNA to a fragment of DNA representing any gene can be measured if the RNA is labeled with some dye
- The intensity of hybridization is a surrogate that measures the level of expression of the gene represented by that DNA fragment
29Hybridization and Washing of DNA Microarrays
- Remains one of the most poorly controlled steps in the process
- Long oligonucleotide probes were designed to standardize the Tms across the slide
- However, there will be variable efficiency, variable specificity
30Slide Scanning
- Selectable lasers
- Emission filters with range from 500-700 nm
- 5 micron resolution
- Goal is to generate images of the arrays that are used as input for quantitation algorithms
35Usually the 75th percentile
36Do not use MM (mismatch) data! MAS (3, 4, 5) is NOT GOOD. Use RMA!!!
37Fortunately (?) you don't do this
The result
[INTENSITY]
NumberCells=4691556
X   Y   MEAN     STDV    NPIXELS
0   0   30022.0  4025.9  9
1   0   507.0    48.5    9
2   0   30116.0  4500.7  9
3   0   602.0    97.3    9
4   0   339.0    36.3    9
5   0   491.0    59.1    9
6   0   29208.0  3090.8  9
7   0   877.0    126.0   9
8   0   28683.0  4069.2  9
9   0   645.0    63.6    9
10  0   28536.0  3462.7  9
11  0   473.0    100.5   9
12  0   29509.0  4287.0  9
13  0   667.0    83.2    9
[CEL]
Version=3
[HEADER]
Cols=2166
Rows=2166
TotalX=2166
TotalY=2166
OffsetX=0
OffsetY=0
GridCornerUL=623 408
GridCornerUR=16090 586
GridCornerLR=15932 15984
GridCornerLL=464 15807
. . . .
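For readers who do want to look inside such a file, the intensity rows can be read with a few lines of Python. This is only a minimal sketch, assuming whitespace-separated X, Y, MEAN, STDV, NPIXELS columns as in the dump on this slide; the names `Cell` and `parse_intensity_rows` are hypothetical, and a dedicated reader (for example from Bioconductor) would normally be used instead.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    """One feature (cell) on the chip, as listed in the intensity section."""
    x: int
    y: int
    mean: float
    stdv: float
    npixels: int


def parse_intensity_rows(text: str) -> List[Cell]:
    """Parse 'X Y MEAN STDV NPIXELS' rows; skip headers and malformed lines."""
    cells = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # blank line or something that is not a data row
        try:
            x, y, npix = int(fields[0]), int(fields[1]), int(fields[4])
            mean, stdv = float(fields[2]), float(fields[3])
        except ValueError:
            continue  # e.g. the column-name row
        cells.append(Cell(x, y, mean, stdv, npix))
    return cells


# First four rows of the dump shown on the slide, preceded by the column names.
sample = """
X Y MEAN STDV NPIXELS
0 0 30022.0 4025.9 9
1 0 507.0 48.5 9
2 0 30116.0 4500.7 9
3 0 602.0 97.3 9
"""
cells = parse_intensity_rows(sample)
print(len(cells), cells[1].mean)
```

Each record is one feature on the chip: its position, the mean and standard deviation of its pixel intensities, and the number of pixels used.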
38So can we just use the data now?
39Sources of Microarray Data Variability
- Biological variability in the population: no good solution here
- At an experimental level, there is
  - variability between preparations and labelling of the sample,
  - variability between hybridisations of the same sample to different arrays, and
  - variability between the signal on replicate features on the same array.
Expression values in 2 replicates will be different! Can we handle it?
40Normalization
- Deals with the fact that the results from identical experiments on two identical microarrays will never be exactly the same. In addition to unavoidable random errors there are also systematic differences caused by
  - Different incorporation efficiencies of dyes. For example, green colored markers are stronger than red ones (measured as stronger illumination), creating a bias between experiments done with green and red markers.
  - Different amounts of mRNA in the tested sample, causing different expression levels.
  - Difference in experimenter or protocol.
  - Different scanning parameters.
  - Differences between chips created in different production batches.
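The simplest correction for a global difference between arrays (for example, more mRNA loaded on one chip) is to rescale each array to a common mean intensity. The sketch below is illustrative only: the function name `scale_to_target` and the target value are assumptions, it uses a plain mean, and it does not reproduce any particular vendor algorithm.

```python
def scale_to_target(intensities, target=500.0):
    """Rescale one array so that its mean intensity equals `target`."""
    mean = sum(intensities) / len(intensities)
    factor = target / mean
    return [value * factor for value in intensities]


# Two arrays with the same expression pattern, but array_b has
# four times as much signal overall (a purely technical difference).
array_a = [100.0, 200.0, 300.0]
array_b = [400.0, 800.0, 1200.0]

scaled_a = scale_to_target(array_a)
scaled_b = scale_to_target(array_b)
print(scaled_a)
print(scaled_b)
```

After scaling, both arrays carry the same values, so the remaining differences between real samples are easier to attribute to biology. Scaling only fixes a single multiplicative factor per array; it does not correct intensity-dependent effects.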
41Quantile Normalization
- Intensity distributions are adjusted to be equivalent
- Scaling to a target intensity sets the mean signal intensity to the defined value
[Figure: two histograms of Number of Probes against Probe Intensity, with the value 500 marked]
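Quantile normalization, as described on this slide, forces every array to share the same intensity distribution. A minimal pure-Python sketch follows; the function name `quantile_normalize` is hypothetical, tied values are simply broken by position, and production implementations (such as the one used inside RMA) handle ties and missing values more carefully.

```python
def quantile_normalize(arrays):
    """Give every array the same distribution: the mean of the sorted arrays."""
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Sort each array, then average across arrays at every rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [sum(column) / len(arrays) for column in zip(*sorted_arrays)]

    normalized = []
    for a in arrays:
        # order[r] is the index of the probe that holds rank r in this array.
        order = sorted(range(n), key=lambda i: a[i])
        result = [0.0] * n
        for rank, index in enumerate(order):
            result[index] = rank_means[rank]
        normalized.append(result)
    return normalized


# Three toy arrays, four probes each.
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(arrays)
for row in normalized:
    print(row)
```

Within each array the probes keep their rank order, but the values they take are now identical across arrays.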
42Background Correction
- Different GC content of probes
- Location on Chip Effect
- etc.
- All of this needs to be compensated for. The algorithm to do it is RMA
43Correct Experimental Design
- Tree representation of replicate experiments
- The first level is at the level of biological replicates
- This is followed by two independent mRNA extractions
- In each microarray experiment, each gene (each probe or probe set) is really a separate experiment in its own right
[Diagram: Experiment, then Biological Replicates (Replicate 1, Replicate 2), then mRNA extractions (Extract 1, Extract 2), then Technical Replicates]
We need normalization to be able to look at the biological differences between samples and not technical ones (Elgart M.)
44Reproducibility
- How big is the difference between two hybridizations of the same sample on the same type of array?
- If we look at technical replicates, what do we expect to see?
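One way to put numbers on this question is to compare two technical replicates probe by probe on the log2 scale: compute their correlation and count how many probes differ by more than 2-fold. The sketch below uses made-up intensities and hypothetical helper names (`pearson`, `compare_replicates`); it is an illustration, not the analysis behind the slides.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_replicates(rep1, rep2, fold=2.0):
    """Return (correlation of log2 intensities, fraction of probes beyond `fold`)."""
    log_a = [math.log2(v) for v in rep1]
    log_b = [math.log2(v) for v in rep2]
    threshold = math.log2(fold)
    n_discordant = sum(abs(a - b) > threshold for a, b in zip(log_a, log_b))
    return pearson(log_a, log_b), n_discordant / len(log_a)


# Made-up intensities for five probes on two technical replicates.
rep1 = [100.0, 200.0, 400.0, 800.0, 1600.0]
rep2 = [110.0, 190.0, 420.0, 300.0, 1500.0]

correlation, fraction_discordant = compare_replicates(rep1, rep2)
print(round(correlation, 3), fraction_discordant)
```

In an ideal technical replicate the correlation is close to 1 and almost no probe crosses the 2-fold line, so probes that do cross it give a feel for how much "change" is pure technical noise.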
45Summary Statistics
All using only Top 10,000 brightest probes
Correlation (>2x Diffl Only)
Red In Replicates
Agree on 2x Diffl
46Set of probes that identifies a transcript
ProbeSet
If all 10 probes give a high signal in Treatment and a low signal in Control, then all's well. But what if only 6 of 10 are positive? How do we decide whether this gene is expressed?
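One common answer is to summarize the probe set robustly, so that a few misbehaving probes do not decide the result. RMA does this with a median polish of the log2 probe intensities. The following is a simplified sketch of Tukey's median polish on a toy probes-by-arrays matrix; the function name, the fixed iteration count, and the data are assumptions for illustration.

```python
from statistics import median


def median_polish(matrix, n_iter=10):
    """Fit value = overall + probe effect + array effect by sweeping medians.

    `matrix` holds log2 intensities with one row per probe and one column
    per array. Returns (overall, probe_effects, array_effects).
    """
    residuals = [row[:] for row in matrix]
    n_rows, n_cols = len(matrix), len(matrix[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep out row (probe) medians.
        for i in range(n_rows):
            m = median(residuals[i])
            row_eff[i] += m
            residuals[i] = [v - m for v in residuals[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep out column (array) medians.
        for j in range(n_cols):
            m = median(residuals[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                residuals[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff


# Toy probe set: 5 probes on 3 arrays (control, treatment 1, treatment 2).
# The last probe failed on treatment 2 (6.5 instead of about 9.5).
probe_set = [
    [6.0, 9.0, 9.0],
    [7.0, 10.0, 10.0],
    [8.0, 11.0, 11.0],
    [7.5, 10.5, 10.5],
    [6.5, 9.5, 6.5],
]
overall, probe_effects, array_effects = median_polish(probe_set)
expression = [overall + effect for effect in array_effects]
print(expression)
```

The per-array summary is the overall level plus the array effect. Because medians are used, the single failed probe does not pull down the estimate for treatment 2.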
48- Is this a hands-on thing?
- Yes.
- Example
185Verifications...
186The END!
187Sources
- Gene Expression Omnibus
  - http://www.ncbi.nlm.nih.gov/geo/
- R
  - www.r-project.org
- Bioconductor
  - www.bioconductor.org
- Race
  - http://race.unil.ch
- Microarray Blob Remover (MBR)
  - http://liulab.dfci.harvard.edu/Software/MBR/MBR.htm
- Significance Analysis of Microarrays (SAM)
  - http://www-stat.stanford.edu/~tibs/SAM/
- Affymetrix NetAffx
  - http://www.affymetrix.com/analysis/index.affx
- Onto Tools
  - http://vortex.cs.wayne.edu/projects.htm
- Ensembl
  - http://www.ensembl.org/index.html
- CisGenome