Title: Gene Expression
1Gene Expression
2Gene expression
- All cells in one organism have the same DNA. But
different cells have very different functions.
- In each cell at certain times only some genes are
expressed.
- Which genes are expressed at which times?
3Cells
4Double-stranded DNA
5DNA Structure
6DNA matching
- Every A forms two weak hydrogen bonds with T.
- Every T forms two hydrogen bonds with A.
- Every C forms three weak hydrogen bonds with G.
- Every G forms three hydrogen bonds with C.
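The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the names `PAIR`, `H_BONDS`, `complement`, and `hydrogen_bonds` are made up for this example, and the strands are paired position by position without tracking strand direction.

```python
# Watson-Crick partners and the hydrogen-bond counts listed on the slide.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}
H_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def complement(strand: str) -> str:
    """Return the base-by-base pairing partner of a DNA strand."""
    return "".join(PAIR[base] for base in strand)


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding a strand to its complement."""
    return sum(H_BONDS[base] for base in strand)


print(complement("ATGGTG"))      # TACCAC
print(hydrogen_bonds("ATGGTG"))  # 2 + 2 + 3 + 3 + 2 + 3 = 15
```

Because G-C pairs carry three bonds and A-T pairs only two, a stretch rich in G and C is held together more tightly than an A-T rich stretch of the same length.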
7RNA
- RNA is also a sequence of nucleotides.
- RNA means ribonucleic acid.
- DNA means deoxyribonucleic acid.
8Nucleotides
9RNA
10DNA Structure
11DNA vs RNA
- Both are strings of nucleotides.
- DNA is usually double-stranded; RNA is usually
single-stranded.
- RNA is usually much shorter than DNA.
- RNA replaces each T by U (uracil).
- DNA contains deoxyribose while RNA contains
ribose. This makes DNA more stable chemically
than RNA.
12DNA and RNA
- DNA in your cells is in the nucleus; RNA can be
anywhere in the cell.
- Proteins are made directly using RNA, not DNA.
13Central Dogma
- A protein-coding region of DNA is copied to
messenger RNA (mRNA) by transcription.
- The mRNA leaves the nucleus and goes to a
ribosome.
- The ribosome uses the mRNA to make a protein by
translation.
14Central Dogma
15Translating codons
- Ala/A: GCT, GCC, GCA, GCG
- Arg/R: CGT, CGC, CGA, CGG, AGA, AGG
- Asn/N: AAT, AAC
- Asp/D: GAT, GAC
- Cys/C: TGT, TGC
- Gln/Q: CAA, CAG
- Glu/E: GAA, GAG
- Gly/G: GGT, GGC, GGA, GGG
- His/H: CAT, CAC
- Ile/I: ATT, ATC, ATA
- Leu/L: TTA, TTG, CTT, CTC, CTA, CTG
- Lys/K: AAA, AAG
- Met/M: ATG
- Phe/F: TTT, TTC
- Pro/P: CCT, CCC, CCA, CCG
- Ser/S: TCT, TCC, TCA, TCG, AGT, AGC
- Thr/T: ACT, ACC, ACA, ACG
- Trp/W: TGG
- Tyr/Y: TAT, TAC
- Val/V: GTT, GTC, GTA, GTG
- START: ATG
- STOP: TAG, TGA, TAA
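The codon table above can be turned into a lookup for a quick check. The sketch below is illustrative, not a full translation tool: the names `CODONS`, `TABLE`, and `translate` are made up here, and the function assumes the sequence is already in frame, starts at the first base, and contains only A, C, G, T.

```python
# One-letter amino acid code -> codons, copied from the table on the slide.
# "*" stands for STOP.
CODONS = {
    "A": "GCT GCC GCA GCG", "R": "CGT CGC CGA CGG AGA AGG",
    "N": "AAT AAC", "D": "GAT GAC", "C": "TGT TGC",
    "Q": "CAA CAG", "E": "GAA GAG", "G": "GGT GGC GGA GGG",
    "H": "CAT CAC", "I": "ATT ATC ATA",
    "L": "TTA TTG CTT CTC CTA CTG", "K": "AAA AAG", "M": "ATG",
    "F": "TTT TTC", "P": "CCT CCC CCA CCG",
    "S": "TCT TCC TCA TCG AGT AGC", "T": "ACT ACC ACA ACG",
    "W": "TGG", "Y": "TAT TAC", "V": "GTT GTC GTA GTG",
    "*": "TAG TGA TAA",
}
# Invert to codon -> amino acid for direct lookup.
TABLE = {codon: aa for aa, group in CODONS.items() for codon in group.split()}


def translate(dna: str) -> str:
    """Translate in-frame DNA codon by codon, stopping at a STOP codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(len(TABLE))                       # 64
print(translate("ATGGTGCATCTGACTCCT"))  # MVHLTP
```

The second call uses the first 18 bases of the beta hemoglobin DNA and gives the first six amino acids of its primary structure.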
16Protein primary structure
173D views of proteins
18DNA for beta hemoglobin
- ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGG
CAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGG
TGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCC
ACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAA
AGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGG
GCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGAT
CCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCA
TCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAG
TGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAA
19Primary structure for beta hemoglobin
- MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYWTQRFFESFGDLST
PDAVMGNPKVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDP
ENFRLLGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
20Part of the two strands for beta hemoglobin
- ATGGTGCATCTGACTCCT
- TACCACGTAGACTGAGGA
- The top is the sense (coding) strand; the bottom is
the antisense (template) strand.
21Transcription Make mRNA
- ATGGTGCATCTGACTCCT sense (coding)
- TACCACGTAGACTGAGGA antisense (template)
- AUGGUGCAUCUGACUCCU mRNA
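As a sketch of the three lines above: the mRNA can be obtained either by pairing against the strand that starts TACCAC (A with U, T with A, C with G, G with C) or by copying the strand that starts ATGGTG with every T replaced by U. The function names below are made up for this example, and the strands are handled position by position, as on the slide, without tracking strand direction.

```python
# Pairing rule used when RNA is built against a DNA strand.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_from_template(template: str) -> str:
    """Build the mRNA by pairing each base of the template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


def transcribe_from_coding(coding: str) -> str:
    """Shortcut: the mRNA reads like the coding strand with T replaced by U."""
    return coding.replace("T", "U")


print(transcribe_from_template("TACCACGTAGACTGAGGA"))  # AUGGUGCAUCUGACUCCU
print(transcribe_from_coding("ATGGTGCATCTGACTCCT"))    # AUGGUGCAUCUGACUCCU
```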
22Structure of mRNA
23mRNA goes to a ribosome, outside the nucleus
24Eukaryotic cell
- (1) nucleolus
- (2) nucleus
- (3) ribosomes (little dots)
- (5) rough endoplasmic reticulum (ER)
- (9) mitochondria
- (10) vacuole
- (11) cytoplasm
25Ribosomes
- The ribosome functions as a factory to make
proteins. It uses two kinds of input:
- (a) mRNA
- (b) tRNA
- It outputs a protein.
26Ribosome translates mRNA
- Ribosome (2) straddles mRNA (1)
- It makes the protein (3).
- It starts at AUG and ends at a stop codon (UAG, UGA, or UAA).
27Ribosome large subunit
28Transfer RNA (tRNA)
- Each tRNA molecule has on one side a conformation
that binds to the specific codon and on the other
side a conformation that binds to the
corresponding amino acid.
29tRNA
- CCA tail in orange, Acceptor stem in purple, D
arm in red, Anticodon arm in blue with Anticodon
in black, T arm in green.
30tRNA carries the amino acid matched to the codon
- The tRNA with anticodon UAC carries M and will bind
with the codon AUG in the mRNA.
- The tRNA with anticodon CAC carries V and will bind
with the codon GUG in the mRNA.
31mRNA in a ribosome has the genetic information
- AUGGUGCAUCUGACUCCU
- The tRNA with anticodon UAC carries M and will bind with the codon AUG.
- The tRNA with anticodon CAC carries V and will bind with the codon GUG.
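The codon and anticodon pairs above follow the RNA pairing rule (A with U, C with G). A minimal sketch, with made-up names `RNA_PAIR`, `codons`, and `anticodon`, and with anticodons written position by position as on the slide rather than in a fixed 5' to 3' direction:

```python
RNA_PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}


def codons(mrna: str) -> list[str]:
    """Split an mRNA into consecutive three-base codons."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]


def anticodon(codon: str) -> str:
    """Return the tRNA triplet that pairs with the given codon."""
    return "".join(RNA_PAIR[base] for base in codon)


for codon in codons("AUGGUGCAUCUGACUCCU"):
    print(codon, "pairs with", anticodon(codon))
# The first two lines printed are "AUG pairs with UAC" and "GUG pairs with CAC".
```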
32Translating codons
- Ala/A: GCT, GCC, GCA, GCG
- Arg/R: CGT, CGC, CGA, CGG, AGA, AGG
- Asn/N: AAT, AAC
- Asp/D: GAT, GAC
- Cys/C: TGT, TGC
- Gln/Q: CAA, CAG
- Glu/E: GAA, GAG
- Gly/G: GGT, GGC, GGA, GGG
- His/H: CAT, CAC
- Ile/I: ATT, ATC, ATA
- Leu/L: TTA, TTG, CTT, CTC, CTA, CTG
- Lys/K: AAA, AAG
- Met/M: ATG
- Phe/F: TTT, TTC
- Pro/P: CCT, CCC, CCA, CCG
- Ser/S: TCT, TCC, TCA, TCG, AGT, AGC
- Thr/T: ACT, ACC, ACA, ACG
- Trp/W: TGG
- Tyr/Y: TAT, TAC
- Val/V: GTT, GTC, GTA, GTG
- START: ATG
- STOP: TAG, TGA, TAA
33mRNA goes to a ribosome
- AUGGUGCAUCUGACUCCU mRNA
- UAC M tRNA
- CAC V tRNA
- The ribosome matches UAC on tRNA with AUG on
mRNA, then uses the M on the other end in the
protein.
34mRNA goes to a ribosome
- AUGGUGCAUCUGACUCCU mRNA
- UAC M tRNA
- CAC V tRNA
- The ribosome matches CAC on tRNA with GUG on
mRNA, then uses the V on the other end to extend
the protein.
35Ribosome
- In this manner, the ribosome continues to make
the protein until it reaches a STOP codon.
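The whole loop can be sketched as a toy ribosome. Everything named here is hypothetical: `TRNA_POOL` holds only the six tRNAs needed for this fragment (anticodon mapped to the amino acid it carries), and the trailing UAA is appended for illustration so that the loop has a STOP codon to reach. Real translation involves much more machinery.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
RNA_PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}

# Hypothetical tRNA pool: anticodon -> amino acid carried by that tRNA.
TRNA_POOL = {"UAC": "M", "CAC": "V", "GUA": "H",
             "GAC": "L", "UGA": "T", "GGA": "P"}


def ribosome(mrna: str) -> str:
    """Read codons from the first AUG until a STOP codon is reached."""
    start = mrna.find("AUG")
    if start < 0:
        return ""  # no START codon, so nothing is made
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        anticodon = "".join(RNA_PAIR[base] for base in codon)
        protein.append(TRNA_POOL[anticodon])  # tRNA delivers its amino acid
    return "".join(protein)


print(ribosome("AUGGUGCAUCUGACUCCUUAA"))  # MVHLTP
```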
36When is a given gene being expressed?
- A given protein is being made when its mRNA is
present in the cell.
- The DNA is always present.
37When is a given gene being expressed?
- To tell what is being expressed at a given time
in a given cell, find out which mRNAs are
present.
- For each kind of mRNA, measure the quantity
present.
38A microarray
39Microarrays
- A microarray consists of a pattern of thousands
of features.
- Each feature has some DNA that will probe and
possibly bind with an mRNA sample.
- Typically the feature is made to fluoresce when
mRNA binds to it.
- The brightness of the dot corresponds to the
quantity of mRNA of the given sort that is
present.
40Two gene chips
41Microarrays
- Typically the probes are attached to a solid
surface, such as a glass or silicon chip. The result
is commonly called a gene chip; Affymetrix is one
well-known maker of such microarrays.
42Introns
- Introns are non-coding inserts in the DNA within
portions that code for one protein.
- The parts that code are exons.
43Introns must be removed to make the mature mRNA
44cDNA
- Complementary DNA (cDNA) is DNA synthesized from
mature mRNA using reverse transcriptase.
- AUGGUGCAUCUG mRNA
- TACCACGTAGAC cDNA
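The mRNA to cDNA example above is the DNA to mRNA pairing run in reverse (A with T, U with A, C with G, G with C). A minimal sketch, with made-up names and position-by-position pairing as on the slide:

```python
# Pairing rule used when DNA is built against an RNA strand.
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}


def reverse_transcribe(mrna: str) -> str:
    """Return the cDNA strand that pairs with the given mRNA."""
    return "".join(RNA_TO_DNA[base] for base in mrna)


print(reverse_transcribe("AUGGUGCAUCUG"))  # TACCACGTAGAC
```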
45cDNA
- cDNA is more stable than RNA.
- cDNA corresponds with the part of the genome from
which introns have been removed.
- cDNA does not correspond exactly to nuclear DNA.
46The mature mRNA
47The probes
- Each dot can contain DNA, cDNA, or an
oligonucleotide (oligo).
- An oligonucleotide is a short fragment of
single-stranded DNA, typically 5 to 50
nucleotides long.
48Gene expression profiling
- In an mRNA or gene expression profiling
experiment, the expression levels of thousands of
genes are monitored simultaneously.
This can be used to distinguish:
- (a) the effects of certain treatments
- (b) the effects of diseases
- (c) the effects of different stages of
development.
49Gene expression profiling
- For example, microarrays can identify genes whose
expression is changed in response to pathogens
by comparing gene expression in infected cells to
that in uninfected cells.
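One common way to express such a comparison is the log2 ratio of the two intensities. The sketch below is only an illustration: the gene names and intensity values are invented, and the two-fold cutoff (`threshold=1.0` on the log2 scale) is an assumption, not something the slides specify.

```python
import math


def log2_fold_change(infected: float, uninfected: float) -> float:
    """Positive means higher in infected cells, negative means lower."""
    return math.log2(infected / uninfected)


def changed_genes(infected: dict, uninfected: dict, threshold: float = 1.0) -> list:
    """Return genes whose expression changed at least 2**threshold-fold."""
    return sorted(
        gene for gene in infected
        if abs(log2_fold_change(infected[gene], uninfected[gene])) >= threshold
    )


# Hypothetical spot intensities (arbitrary units), invented for this example.
infected = {"geneA": 800.0, "geneB": 100.0, "geneC": 210.0}
uninfected = {"geneA": 200.0, "geneB": 400.0, "geneC": 200.0}

print(changed_genes(infected, uninfected))  # ['geneA', 'geneB']
```

Here geneA is four times higher and geneB four times lower in infected cells, while geneC barely changes and is not reported.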
50A microarray experiment
- Suppose there are two cells--type 1, healthy, and
type 2, diseased. Both have four genes A, B, C,
D. We want to compare the expression of these
genes in the two types of cell.
51Procedure
- 1. Prepare the DNA chip using the chosen probe
DNAs.
- 2. From the cells, isolate the mRNA.
- 3. Use the mRNA as templates to generate cDNA
with a fluorescent tag attached. Typically a
green fluorescent tag is used for mRNA from
healthy cells, while a red tag is used for mRNA
from diseased cells.
52Procedure
- 4. Prepare a hybridization solution with a
mixture of the fluorescently labeled cDNAs.
- 5. Incubate the hybridization solution with the
DNA chip.
- 6. Detect bound cDNA using laser technology.
- 7. Analyze the data.
53Appearance afterwards
54Interpreting colors
- A spot with just healthy cDNA is green.
- A spot with just diseased cDNA is red.
- A spot with both is yellow.
- A spot with neither is black.
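The four color rules above amount to a small decision table. A sketch under stated assumptions: `spot_color` and the `background` cutoff of 0.1 are made up for this example, intensities are taken to be scaled between 0 and 1, and the readings for genes A to D are invented. Real scanners report graded intensities rather than four clean colors.

```python
def spot_color(green: float, red: float, background: float = 0.1) -> str:
    """Classify a spot from its green (healthy) and red (diseased) signal."""
    has_green = green > background
    has_red = red > background
    if has_green and has_red:
        return "yellow"  # cDNA from both cell types bound
    if has_green:
        return "green"   # only healthy cDNA bound
    if has_red:
        return "red"     # only diseased cDNA bound
    return "black"       # neither bound


# Hypothetical (green, red) readings for the four genes, invented here.
readings = {"A": (0.9, 0.0), "B": (0.0, 0.8), "C": (0.7, 0.6), "D": (0.0, 0.05)}
for gene, (green, red) in readings.items():
    print(gene, spot_color(green, red))
```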
55Comparison of cells
- Microarrays are used to compare the genome
content in different cells for the same organism.
56Single Nucleotide Polymorphisms
- A single nucleotide polymorphism (SNP) is a
single-nucleotide substitution in the genome.
- Example (shown as the mRNA transcript):
- AUGGUGCAUCUGACUCCU standard
- AUGGUGUAUCUGACUCCU SNP
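Finding the substitution in the example above is a position-by-position comparison. A minimal sketch, with the made-up name `differences`, that assumes both sequences have the same length and are already aligned:

```python
def differences(reference: str, sample: str) -> list:
    """List (index, reference base, sample base) wherever the two differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must have the same length")
    return [
        (i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


print(differences("AUGGUGCAUCUGACUCCU", "AUGGUGUAUCUGACUCCU"))
# [(6, 'C', 'U')]
```

The index is counted from 0, so the change is at the seventh base: the third codon changes from CAU to UAU.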
57Detecting SNPs
- Microarrays can be used to detect SNPs between or
within populations.
- This can measure predisposition to diseases or
identify appropriate drugs.
58How are chips made?
- In spotted microarrays the probes may be small
fragments of DNA. A robotic arm controls an array
of fine needles, which is dipped into wells
containing the DNA probes. Each needle
then deposits a probe at the desired location on
the surface. The probes are fixed to the
surface. Then the chip is ready to be washed in
a solution containing the targets.
59DNA microarray being printed by a robot
60Flexibility of microarrays
- Thus scientists can produce arrays in their own
labs, customized to an experiment.
61Bioinformatics problems
- 1. How long should the probes be (i.e., how many
nucleotides long)?
- If too short, you get false signals.
- If too long, it is expensive.
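A rough calculation shows why short probes give false signals. The sketch below rests on strong simplifying assumptions that are not from the slides: bases are treated as independent and equally likely, only exact matches on one strand count, and the genome length defaults to about 3 billion bases (roughly the size of the human genome). Under those assumptions a probe of length n is expected to match about G / 4^n positions by chance.

```python
def expected_chance_matches(probe_length: int,
                            genome_length: int = 3_000_000_000) -> float:
    """Expected exact matches of a random probe in a random genome."""
    positions = genome_length - probe_length + 1
    return positions / 4 ** probe_length


for n in (10, 16, 25):
    print(n, expected_chance_matches(n))

# Shortest probe whose expected number of chance matches drops below one.
shortest = next(n for n in range(1, 60) if expected_chance_matches(n) < 1)
print(shortest)  # 16
```

Under this model a 10-base probe would match thousands of places by chance, while at 25 bases chance matches become very unlikely. Real genomes are not random, so this is only a lower-bound intuition for probe length.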
62Bioinformatics problems
- 2. Which parts of a sequence should be cloned in
the probe?
63DNA for beta hemoglobin
- ATGGTGCATCTGACTCCTGAGGAGAAGTCTGCCGTTACTGCCCTGTGGGG
CAAGGTGAACGTGGATGAAGTTGGTGGTGAGGCCCTGGGCAGGCTGCTGG
TGGTCTACCCTTGGACCCAGAGGTTCTTTGAGTCCTTTGGGGATCTGTCC
ACTCCTGATGCTGTTATGGGCAACCCTAAGGTGAAGGCTCATGGCAAGAA
AGTGCTCGGTGCCTTTAGTGATGGCCTGGCTCACCTGGACAACCTCAAGG
GCACCTTTGCCACACTGAGTGAGCTGCACTGTGACAAGCTGCACGTGGAT
CCTGAGAACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCA
TCACTTTGGCAAAGAATTCACCCCACCAGTGCAGGCTGCCTATCAGAAAG
TGGTGGCTGGTGTGGCTAATGCCCTGGCCCACAAGTATCACTAA
64Statistical issues in microarrays
- 1. There is variability in how well each probe
in the microarray was made.
- 2. There is variability in how uniformly the
target got washed across the chip.
- 3. There is variability in how accurately the
probe binds with the target.
65Statistical questions
- What level of expression is statistically
significant?
- If there are 20,000 probes, a 95% confidence
level means there are ? events with probability less
than 5%.
66Statistical questions
- What level of expression is statistically
significant?
- If there are 20,000 probes, a 95% confidence
level means there are about 1000 events (5% of 20,000)
with probability less than 5%, expected by chance alone.
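The arithmetic behind the 1000 is simply the number of probes times the per-probe false positive rate. The sketch below also shows one widely used remedy, the Bonferroni correction, which is not mentioned in the slides and is included here only as an illustration; the variable names are made up.

```python
n_probes = 20_000
alpha = 0.05  # per-probe significance level (95% confidence)

# Expected number of probes that look significant purely by chance.
expected_false_positives = n_probes * alpha
print(expected_false_positives)  # 1000.0

# Bonferroni correction: divide alpha by the number of tests so that the
# chance of even one false positive across all probes stays near alpha.
bonferroni_alpha = alpha / n_probes
print(bonferroni_alpha)  # about 2.5e-06
```

The corrected threshold is very strict, which reduces false positives at the cost of more false negatives.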
67Statistical issues
- How can the data be normalized (i.e., compared with
known probability distributions, like the normal
curve)?
- P-values: there will be false positives and
false negatives.
68Experimental design issues
- Replication of biological samples
- Replication of RNA samples from each experiment
- Replicate each spot on the microarray
69Data warehousing
- The databases are huge and hence hard to understand.