Title: Bioinformatics: Applications
1Bioinformatics Applications
- ZOO 4903
- Fall 2006, MW 10:30-11:45
- Sutton Hall, Room 312
- The Study of Gene Expression
2Lecture overview
- What we've talked about so far
- Finding genes within genomes
- The different forms genes can take (alternative
splicing)
- Overview
- Measuring levels of transcription
- Microarray technology
3Why Measure Gene Expression?
- Gene expression levels correspond to protein
levels
- More abundant genes/transcripts are more
important
- Normal cells have a standard expression
profile/signature
- Changes to expression profiles indicate something
is happening
- Gene expression is a proxy measure for what
sort of environment or control the cell is under.
4Problems and potentials for high-throughput
analysis
(Figure: DNA, RNA, protein, metabolite, and phenotype ordered along
two axes: biological relevance, from less to more, and ease of
measurement, from easy to hard)
5Transcription is the most commonly reported form
of regulation
6mRNA level = Protein level?
- There is a correlation, but
- Gygi et al. (1999) Mol. Cell. Biol. compared
protein levels (MS, 2D gels) and RNA levels
(SAGE) for 156 genes in yeast
- In some genes, mRNA levels were essentially
unchanged, but protein levels varied by up to 20X
- In other genes, protein levels were essentially
unchanged, but mRNA levels varied by up to 30X
- Highly expressed mRNAs correlate well with
protein levels
7mRNA level = Protein level?
R = 0.35
R = 0.95
Gygi et al. (1999) Mol. Cell. Biol
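A minimal sketch of how a correlation coefficient like the R values on this slide could be computed. The abundances below are made-up illustrative numbers, not data from Gygi et al., and the helper name `pearson_r` is only for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (>= 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical abundances in arbitrary units, NOT real measurements.
mrna = [1.0, 2.0, 3.0, 4.0, 5.0]
protein_tracking = [2.1, 3.9, 6.2, 7.8, 10.1]   # protein follows mRNA closely
protein_scattered = [5.0, 1.0, 4.0, 2.0, 6.0]   # protein only weakly follows mRNA

r_high = pearson_r(mrna, protein_tracking)
r_low = pearson_r(mrna, protein_scattered)
print(f"tracking: R = {r_high:.2f}; scattered: R = {r_low:.2f}")
```

The same mRNA values give a high R when protein tracks transcript and a low R when it does not, which is the contrast the two panels illustrate.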
8Measuring Gene Expression
- Northern/Southern Blotting
- Serial Analysis of Gene Expression (SAGE)
- RT-PCR (real-time PCR)
- DNA Microarrays or Gene Chips
- Others
- Differential display
- Ribonuclease protection assays
- In-situ hybridization
9Northern Blots
- Method of measuring RNA abundance
- Name is a play on Southern blots (which measure
DNA abundance)
- mRNA is first separated on an agarose gel, then
transferred to a nitrocellulose filter, then
denatured and finally hybridized with 32P-labelled
complementary DNA
- Intensity of band indicates abundance
10Northern Blotting
11The Blot Block
12Northern Blots
13Advantages of Northerns
- Inexpensive, quantitative method of measuring
transcript abundance
- Well tested and understood technology
- Use of radioactive probes makes it very sensitive
with high dynamic range
14Disadvantages of Northerns
- Relies on radioactive labeling ("dirty"
technology)
- Old-fashioned and low-throughput technology,
now largely replaced by microarrays and other
technologies
15Serial Analysis of Gene Expression (SAGE)
- Convert mRNA to cDNA
- Split each sequence into short (12-15 base),
unique tags with a restriction enzyme (NlaIII)
- After creating the tags, these are concatenated
into a longer sequence or, essentially, a list
of shorter tags
- Each tag is separated by a SAGE tag
- The list can be read using a DNA sequencer and
rapidly compared against known sequence databases
to estimate the frequency of genes
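The last step above (matching sequenced tags to known genes and estimating frequency) can be sketched roughly as follows. The tag sequences, gene names, and the `tag_to_gene` lookup table are hypothetical stand-ins for a real sequence database.

```python
from collections import Counter

def count_tags(tags, tag_to_gene):
    """Count how often each gene's tag appears in a list of sequenced tags.

    Tags missing from the lookup table are tallied under "unassigned".
    """
    counts = Counter()
    for tag in tags:
        counts[tag_to_gene.get(tag.upper(), "unassigned")] += 1
    return counts

def relative_frequency(counts):
    """Convert raw tag counts into fractions of all tags sequenced."""
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}

# Hypothetical 12-base tags and gene names, for illustration only.
tag_to_gene = {"AAACCTGGTCAG": "geneA", "GGTTACCAGTCA": "geneB"}
tags = ["AAACCTGGTCAG", "GGTTACCAGTCA", "AAACCTGGTCAG",
        "TTTTTTTTTTTT", "AAACCTGGTCAG"]

counts = count_tags(tags, tag_to_gene)
freqs = relative_frequency(counts)
print(counts)
print(freqs)
```

Tags that match no known gene are kept rather than dropped, since SAGE is open-ended and can report transcripts that are not yet annotated.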
16SAGE Tools
17SAGE of Yeast Chromosome
18Advantages of SAGE
- Very direct and quantitative method of measuring
transcript abundance
- Open-ended technology
- Sensitive, with high dynamic range
- Built-in quality control
- e.g. spacing of tags and 4-cutter restriction sites
19Disadvantages of SAGE
- Expensive, time-consuming technology: must
sequence >50,000 tags per sample
- Most useful with fully sequenced genomes
(otherwise difficult to associate 15 bp tags with
their genes)
- 3' ends of some genes can be very polymorphic and
throw off the uniqueness assumption
20Real Time PCR
21Principles of PCR
Polymerase Chain Reaction
22RT-PCR
- RT-PCR is a method to quantify mRNA and cDNA in
real time
- Generates quantitative fluorescence data at
earliest phases of PCR cycle when replication
fidelity is highest
- Measures the build up of fluorescence with each
PCR cycle
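A rough sketch of why the cycle-by-cycle build-up is quantitative: under idealized exponential amplification, a sample with more starting template crosses a fixed detection threshold in fewer cycles. The threshold value, the perfect-doubling efficiency, and the function name are assumptions for illustration; the slides do not specify them.

```python
def threshold_cycle(start_copies, threshold, efficiency=1.0, max_cycles=40):
    """Return the first cycle at which the product count reaches `threshold`.

    Assumes idealized exponential growth: each cycle multiplies the product
    by (1 + efficiency), so efficiency=1.0 means perfect doubling.
    Returns None if the threshold is not reached within `max_cycles`.
    """
    copies = float(start_copies)
    for cycle in range(1, max_cycles + 1):
        copies *= 1.0 + efficiency
        if copies >= threshold:
            return cycle
    return None

THRESHOLD = 1e9  # assumed detection threshold, in product copies

ct_low = threshold_cycle(1_000, THRESHOLD)       # low-abundance transcript
ct_high = threshold_cycle(1_000_000, THRESHOLD)  # 1000-fold more template
print(ct_low, ct_high)
```

In this idealized model the difference in crossing cycle depends only on the fold difference in starting template, which is what makes the early cycles informative.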
23RT-PCR
An oligo probe with two fluorophores is used (a
quencher and a reporter)
24RT-PCR vs. Microarray
25Advantages of RT-PCR
- Sensitive assay, highly quantitative, highly
reproducible
- Considered gold standard for mRNA quantitation
- Can detect as few as 5 molecules
- Excellent dynamic range, linear over several
orders of magnitude
26Disadvantages of RT-PCR
- Expensive (instruments are >$150K, materials are
also expensive)
- Not a high-throughput system (10s to 100s of
genes, not 1000s)
- Can pick up RNA carryover or contaminating RNA,
leading to false positives
27Microarrays
28Microarrays
- Basic idea
- Reverse Northern blot on a huge scale
- The clever tricks
- Miniaturize the technique, so that many assays can
be carried out in parallel
- Hybridize control and experimental samples
simultaneously, using distinct fluorescent dyes to
distinguish them
29First, mRNA is made into cDNA clones
30Robots lay down cDNA probes on the microarrays
31Target genes are labeled with Cy3 and Cy5 Dyes
32Samples are labeled and hybridized to the array
33Hybridized arrays are put in a scanner
34Microarrays Spot Color
35Typical cDNA microarray
36Spots are then gridded with software and
normalized vs. background
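A minimal sketch of normalizing spot intensities against local background and expressing the result as a Cy5/Cy3 log ratio. The field names, the intensities, and the `floor` clamp are assumptions for illustration, not the behavior of any particular gridding software.

```python
import math

def log_ratios(spots, floor=1.0):
    """Background-subtract each channel and return log2(Cy5/Cy3) per spot.

    `spots` is a list of dicts holding foreground and background
    intensities for both channels. Signals at or below background are
    clamped to `floor` so the logarithm stays defined.
    """
    ratios = []
    for s in spots:
        cy5 = max(s["cy5_fg"] - s["cy5_bg"], floor)
        cy3 = max(s["cy3_fg"] - s["cy3_bg"], floor)
        ratios.append(math.log2(cy5 / cy3))
    return ratios

# Hypothetical spot intensities in arbitrary scanner units.
spots = [
    {"cy5_fg": 900, "cy5_bg": 100, "cy3_fg": 500, "cy3_bg": 100},
    {"cy5_fg": 300, "cy5_bg": 100, "cy3_fg": 900, "cy3_bg": 100},
    {"cy5_fg": 600, "cy5_bg": 100, "cy3_fg": 600, "cy3_bg": 100},
]

ratios = log_ratios(spots)
print(ratios)
```

A positive value means the Cy5-labeled sample is more abundant at that spot, a negative value means the Cy3-labeled sample is, and zero means no difference.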
37Microarray technical problems
- Anti-probe
- Locally high background
- Spot overlap
- Precipitate
- Locally low signal
- Comet-tails (donut hole)
38Two Types of glass-slide microarrays
- Spotted glass slide cDNA (500-1000 bp) arrays
- Photolithographically prepared short
oligonucleotide (25-70 bp) arrays
39Affymetrix GeneChips
40Maskless photolithography
41Affymetrix SNP chips (resequencing arrays)
- Each probe 25 bp long
- 22-40 probes per gene
- Perfect Match (PM) as well as MisMatch (MM) probes
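One simple way PM and MM probe pairs could be combined into a single signal is to average the PM minus MM differences across a probe set. This is an illustrative assumption; the slides do not say how the two probe types are combined, and the intensities below are made up.

```python
def average_difference(pm, mm):
    """Mean of (PM - MM) over the probe pairs of one probe set."""
    if len(pm) != len(mm) or not pm:
        raise ValueError("need equal-length, non-empty PM and MM lists")
    return sum(p - m for p, m in zip(pm, mm)) / len(pm)

# Hypothetical intensities for four probe pairs of one gene.
pm = [1200.0, 950.0, 1100.0, 1010.0]  # perfect-match probes
mm = [300.0, 250.0, 400.0, 310.0]     # mismatch probes (non-specific signal)

signal = average_difference(pm, mm)
print(signal)
```

The idea is that the mismatch probe estimates non-specific hybridization, so subtracting it leaves the specific signal.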
43Microarray mania
- Expression arrays
- Exon arrays
- Genomic arrays!
- Methylation arrays
- Protein arrays
- Antibody arrays
- Tissue arrays
44Exon microarrays
45Comparative Genomic Hybridization Microarray
- Reference DNA: labeled with Cy5 (detected with red
laser)
- Patient DNA: labeled with Cy3 (detected with green
laser)
- Cot-1 DNA: unlabeled
- Glass slide spotted with DNA from known locations
in the genome
46CGH Microarrays
- Reveals DNA copy number and allele-specific
information
- Shows chromosomal deletions and duplications
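A minimal sketch of how deletions and duplications might be read off patient/reference intensity ratios. The log2-ratio cutoffs, the locus labels, and the intensities are arbitrary illustrative assumptions, not values from the slides.

```python
import math

def call_copy_number(patient, reference, gain=0.3, loss=-0.3):
    """Classify one spot from its patient/reference intensity ratio.

    The log2-ratio cutoffs `gain` and `loss` are arbitrary example values.
    """
    ratio = math.log2(patient / reference)
    if ratio >= gain:
        return "duplication"
    if ratio <= loss:
        return "deletion"
    return "normal"

# Hypothetical spots: location -> (patient intensity, reference intensity)
spots = {
    "locus_1": (1000.0, 1000.0),
    "locus_2": (500.0, 1000.0),
    "locus_3": (1500.0, 1000.0),
}

calls = {loc: call_copy_number(p, r) for loc, (p, r) in spots.items()}
print(calls)
```

Because each spot comes from a known genomic location, the calls map directly onto chromosomal regions.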
47Methylation-Specific Oligonucleotide Microarrays
Reference DNA and test DNA: sites are methylated (mCG) or
unmethylated (CG)
Bisulfite treatment and PCR: mCG -> CG, unmethylated CG -> TG
3' end-labeling with Cy3 or Cy5 and co-hybridized
on the chip
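The bisulfite step can be sketched as a simple string transformation: unmethylated C is read as T after treatment and PCR, while methylated C stays C. The example sequence and methylated positions are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=()):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C is read as T; a C at a methylated position stays C.
    Positions are 0-based indices into `seq`.
    """
    methylated = set(methylated_positions)
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)

seq = "AACGTTCGA"  # hypothetical fragment with CG sites at positions 2 and 6

fully_methylated = bisulfite_convert(seq, methylated_positions=[2, 6])
unmethylated = bisulfite_convert(seq)
print(fully_methylated, unmethylated)
```

The two outputs differ only at the CG sites, which is what lets C-containing and T-containing probes tell methylated from unmethylated DNA.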
48Methylation-specific hybridization
(Figure: targets ranging from 0% to 100% methylation hybridized to
probes carrying C (--C--C--) or T (--T--T--) at the CG positions)
49Protein microarrays
- True protein microarrays are evolving very slowly
and only a few exist.
- Technology is not straightforward due to
inherent characteristics of proteins (e.g.
available ligands, folding, drying)
- Mostly limited to binding interactions (e.g.,
antibodies, protein-protein)
- Some detect protein-protein interaction by
surface plasmon resonance; others use a
fluorescence-based approach
50Protein microarrays
51Antibody microarrays
Antibody or ligand is on the microarray; proteins
are tagged with different dyes
52Chromatin Immunoprecipitation
53ChIP-chip
Chromatin immunoprecipitation on DNA chip
54Tissue microarrays
Tissue samples are spotted for histological or
immunochemical staining, in-situ hybridization,
or just visualization of morphology
55Advantages to Microarrays
- High-throughput, quantitative method of measuring
transcript abundance
- Avoids radioactivity (fluorescence)
- Kit systems and commercial suppliers make
microarrays very easy to use
- Uses many high-tech techniques and devices
(cutting edge)
- Good dynamic range
56Disadvantages to Microarrays
- Relatively expensive (>$1000 per array for Affy
chips, $300 per array for home-made systems)
- Quality and quality control are highly variable
- Quantity of data often overwhelms most users
- Analysis and interpretation are difficult
- Not as sensitive as other methods: low-abundance
transcripts are harder to reliably detect
57Nature prefers a low-energy solution to the
transcriptional response
58Nature prefers a low-energy solution to the
transcriptional response
59Microarrays provide quantity at the cost of some
quality
60Summary
- There are a number of methods to measure
transcriptional levels, but most are being
replaced by microarray technology
- What is lost in sensitivity is made up for by a
more global survey and the ability to do QC via
replicates
- Miniaturization and automation are creating a new
high-throughput biology where data analysis and
management is the biggest challenge
- What all these microarrays have in common is the
gathering of a massive amount of comparative data.
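A minimal sketch of QC via replicates: genes whose replicate measurements disagree too much are flagged. The coefficient-of-variation cutoff, gene names, and intensities are assumptions for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

def flag_noisy_genes(replicates, max_cv=0.2):
    """Return the names of genes whose replicates vary more than `max_cv`."""
    return sorted(
        gene
        for gene, values in replicates.items()
        if coefficient_of_variation(values) > max_cv
    )

# Hypothetical intensities from three replicate arrays.
replicates = {
    "geneA": [100.0, 105.0, 95.0],   # replicates agree
    "geneB": [100.0, 200.0, 50.0],   # replicates disagree
}

noisy = flag_noisy_genes(replicates)
print(noisy)
```

Flagged genes can be re-measured or excluded before any comparison between conditions is made.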
61For next time
- Homework 4 due
- Read Mount chapter 13, pages 628-58
62Different Kinds of Omes
Genome Transcriptome Proteome
63-Omics Mania
biome, cellomics, chronomics, clinomics,
complexome, crystallomics, cytomics,
cytoskeleton, degradomics, diagnomics™,
enzymome, epigenome, expressome, fluxome,
foldome, secretome, functome, functomics,
genomics, glycomics, immunome, transcriptomics,
integromics, interactome, kinome, ligandomics,
lipoproteomics, localizome, phenomics,
metabolome, pharmacometabonomics, methylome,
microbiome, morphome, neurogenomics, nucleome,
secretome, oncogenomics, operome,
transcriptomics, ORFeome, parasitome, pathome,
peptidome, pharmacogenome, pharmacomethylomics,
phenomics, phylome, physiogenomics, postgenomics,
predictome, promoterome, proteomics,
pseudogenome, secretome, regulome, resistome,
ribonome, ribonomics, riboproteomics,
saccharomics, secretome, somatonome, systeome,
toxicomics, transcriptome, translatome,
secretome, unknome, vaccinome, variomics...