Title: Functional Genomics
1Functional Genomics
- Functional genomic datasets
- Biological networks
- Integrating genomic datasets
BIO520 Bioinformatics Jim Lund
2Functional genomics
- Many different experimental designs
- Different kinds of information generated.
- Each has experimental limitations
- Coverage: full genome or limited?
- False positives.
- False negatives.
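The three limitations above can be made concrete by scoring a screen against a trusted reference set. The sketch below is only an illustration: the function name `screen_quality`, the toy protein pairs, and the existence of a gold-standard reference are all assumptions, and it assumes none of the sets is empty.

```python
def pair(a, b):
    """Unordered protein pair, so (A, B) and (B, A) compare equal."""
    return frozenset((a, b))


def screen_quality(detected, reference, tested):
    """Score a screen against a (hypothetical) gold-standard reference.

    detected  -- pairs reported by the screen
    reference -- pairs believed to be real
    tested    -- all pairs the screen was able to assay
    """
    detected, reference, tested = set(detected), set(reference), set(tested)
    assayed_reference = reference & tested
    false_pos = detected - reference
    # Only reference pairs that were actually assayed count as misses.
    false_neg = assayed_reference - detected
    return {
        "coverage": len(assayed_reference) / len(reference),
        "false_positive_fraction": len(false_pos) / len(detected),
        "false_negative_fraction": len(false_neg) / len(assayed_reference),
    }


# Toy data: all protein names are placeholders.
reference = {pair("A", "B"), pair("B", "C"), pair("C", "D"), pair("D", "E")}
tested = {pair("A", "B"), pair("B", "C"), pair("C", "D"),
          pair("A", "C"), pair("B", "D")}
detected = {pair("A", "B"), pair("A", "C")}

result = screen_quality(detected, reference, tested)
print(result)
```

Separating coverage from the false-negative fraction keeps "never assayed" apart from "assayed but missed", which are different experimental limitations.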
3The Two-Hybrid System
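- Two hybrid proteins are generated, each carrying
one transcription factor domain: the bait protein
is fused to the DNA-binding domain and the prey
protein to the activation domain.
- Both fusions are expressed in a yeast cell that
carries a reporter gene whose expression is under
the control of binding sites for the DNA-binding
domain.
[Figure: bait protein fused to the binding domain and prey protein fused to the activation domain, upstream of the reporter gene.]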
4The Two-Hybrid System
- Interaction of bait and prey proteins localizes
the activation domain to the reporter gene, thus
activating transcription.
- Since the reporter gene typically codes for a
survival factor, yeast colonies will grow only
when an interaction occurs.
[Figure: the bait-prey interaction brings the activation domain to the binding domain at the reporter gene, producing reporter mRNA.]
5Interactions shown as a network
6Networks
- When methods of detecting functional linkages are
applied to all the proteins of an organism,
a network of interacting, functionally linked
proteins can be traced.
- As methods improve for detecting protein
linkages, it seems likely that most of the
proteins will be included in the network.
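As a rough illustration of tracing such a network, the sketch below builds an undirected graph from pairwise linkages and groups proteins into connected components. The protein names and the helper names `build_network` and `connected_components` are hypothetical; a real analysis would load pairs from a screen or database.

```python
from collections import defaultdict


def build_network(pairs):
    """Return an undirected adjacency map from (protein, protein) pairs."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Group proteins that are linked directly or through intermediates."""
    seen = set()
    components = []
    for start in list(adjacency):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy linkages; the protein names are placeholders.
pairs = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]
network = build_network(pairs)
components = connected_components(network)
largest = max(components, key=len)
print(f"{len(largest)} of {len(network)} proteins are in the largest component")
```

The share of proteins in the largest component is one simple way to track how much of the proteome the network includes as more linkages are detected.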
7What do you miss?
- Tertiary interactions
- Regulated interactions
- Subcellular localization dependent
- Cofactor dependent (e.g. hormone-regulated)
- Low-affinity (Kd > 10^-6 M)
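To see why low-affinity pairs tend to be missed, a simple 1:1 equilibrium binding model gives the fraction of bait occupied as [prey] / ([prey] + Kd). The sketch below is a back-of-the-envelope illustration only: the prey concentration is an assumed value, not a measured one, and the model ignores depletion of free prey.

```python
def fraction_bound(prey_conc_molar, kd_molar):
    """Fraction of bait occupied by prey for simple 1:1 equilibrium binding."""
    return prey_conc_molar / (prey_conc_molar + kd_molar)


# Assumed free prey concentration in mol/L (hypothetical, for illustration).
ASSUMED_PREY_CONC = 1e-7

occupancy = {kd: fraction_bound(ASSUMED_PREY_CONC, kd)
             for kd in (1e-8, 1e-6, 1e-5)}

for kd, frac in occupancy.items():
    print(f"Kd = {kd:.0e} M -> {frac:.1%} of bait bound")
```

Under this assumed concentration, occupancy drops steeply once Kd exceeds the prey concentration, so weak interactions may not drive enough reporter expression to score as positive.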
8Cellular Location
- Immunolocalization
- FUSION PROTEINS
- Prediction
- Membrane vs non-membrane
- improved by homology
- WHICH MEMBRANE
- Nuclear vs cytoplasmic
9Drosophila Fusion Project (FlyTrap)
- Exon GFP vector
- Inserts fairly randomly.
- Fluorescent sort thousands of embryos.
- Find embryos with an insertion that produces GFP
expression.
- Image
- Capture and analyze images
- Curate by hand.
- Computer image analysis and classification.
10Developmental Localization
11Mouse genomic gene expression
- Allen Brain Atlas (ABA) is an interactive,
genome-wide image database of gene expression in
the mouse brain
12Allen Brain Atlas
133D mouse gene expression project
- Single gene expression database for the mouse
research community. Integrated in the Mouse
Genome Database (MGD) at the Jackson Laboratory.
- 3600 expression entries
- http://genex.hgu.mrc.ac.uk/MouseGeneExpInfoRes/
WT1 expression (red) on a section of the E9
(Theiler Stage 14) embryo from the Edinburgh
Mouse Atlas. The gut epithelium is shown in
yellow and the neural tube in a blue overlay. WT1
is expressed in the presumptive mesothelium of
the coelom and in the intermediate mesoderm
(ventral to the somites).
14Protein Function
Gene Knockouts
- Automated Binding Assays
- High Throughput Enzyme Assays
Protein Assay
15Genome-wide Knockouts
- Yeast Genome
- Recombination strategy
- Mouse Genome
- More in Functional Genomics!!!
16Essential vs Non-essential
- Transcription similar
- >99% of essential genes transcribed
- Transcript level 70% higher
- >90% of non-essential genes transcribed
- Genome locations similar
- Not clustered
- Essential genes rarely near telomeres
17Why only 20% essential?
- Redundant
- 8.5% of non-essential genes had a CLOSE homolog
in the genome (P < 10^-150)
- Essential in another condition
- Marginal Benefit
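The redundancy argument rests on a simple count: how many non-essential genes have a close homolog elsewhere in the genome. A minimal sketch of that calculation follows. The gene names and best-hit P-values are invented for illustration, and using the best within-genome hit as the test is an assumption; only the P < 10^-150 cutoff comes from the slide.

```python
# Cutoff for a "close" homolog, as given on the slide.
CLOSE_HOMOLOG_P = 1e-150

# Hypothetical non-essential genes mapped to the P-value of their best
# within-genome similarity hit (made-up numbers).
best_hit_p = {
    "geneA": 1e-200,
    "geneB": 1e-30,
    "geneC": 0.5,
    "geneD": 1e-160,
    "geneE": 1e-12,
}

# Genes whose best hit passes the cutoff are candidates for redundancy.
redundant = sorted(g for g, p in best_hit_p.items() if p < CLOSE_HOMOLOG_P)
fraction = len(redundant) / len(best_hit_p)

print(f"{fraction:.1%} of non-essential genes have a close homolog: {redundant}")
```

A low fraction from this kind of count suggests that redundancy alone does not explain why most genes are non-essential, which is why the other explanations on the slide are needed.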
18Resources
- YEAST
- Saccharomyces Genome Deletion Project
- http://www-sequence.stanford.edu/group/yeast_deletion_project/deletions3.html
- MOUSE
- Mouse Phenome Database
- http://phenome.jax.org/pub-cgi/phenome/mpdcgi?rtn=docs/home
- Knockout Mouse Project
- http://www.knockoutmouse.org/
19Genome-Scale Biochemical Assay
- Protein arrays (biochemically active)
20Databases
- Relationships between genes/proteins.
- How are different types of experimental data
integrated?
- Schema
- Data quality
- Who curates?
- Who revises?
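One way to think about the schema question is a single evidence table in which every row records a gene pair, the method that linked them, the source, and whether the entry was hand-curated. The sketch below uses Python's built-in `sqlite3` module. The table layout, column names, and rows are hypothetical, not the schema of any database named in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE evidence (
        gene_a  TEXT NOT NULL,
        gene_b  TEXT NOT NULL,
        method  TEXT NOT NULL,     -- e.g. 'two-hybrid', 'coexpression'
        source  TEXT NOT NULL,     -- who reported the relationship
        curated INTEGER NOT NULL,  -- 1 = hand-curated, 0 = automated
        CHECK (gene_a < gene_b)    -- store each pair in one canonical order
    )
""")

# Toy rows; gene and source names are placeholders.
rows = [
    ("G1", "G2", "two-hybrid", "screenX", 0),
    ("G1", "G2", "coexpression", "arrayY", 0),
    ("G2", "G3", "two-hybrid", "screenX", 1),
]
conn.executemany("INSERT INTO evidence VALUES (?, ?, ?, ?, ?)", rows)

# Relationships supported by at least two independent methods.
supported = conn.execute("""
    SELECT gene_a, gene_b, COUNT(DISTINCT method)
    FROM evidence
    GROUP BY gene_a, gene_b
    HAVING COUNT(DISTINCT method) >= 2
""").fetchall()

print(supported)
```

Keeping the method, source, and curation status on every row lets data quality be judged per relationship instead of per database.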
21Proteome Projects
- SwissProt (ExPASy)
- http://expasy.org/ch2d/
- Saccharomyces Genome Database (SGD) Function
Junction
- 2-hybrid, functional assignments, pathways.
- http://db.yeastgenome.org/cgi-bin/functionJunction
- Yale TRIPLES
- Database of TRansposon-Insertion Phenotypes,
Localization, and Expression in Saccharomyces.
- 2-hybrid databases
- http://proteome.wayne.edu/YTHwebsites.html
22Pathway and interaction databases
- KEGG (http://www.genome.jp/kegg/)
- Metabolic and signaling pathways
- PUMA (http://compbio.mcs.anl.gov/puma2/cgi-bin/index.cgi)
- Metabolic and signaling pathways
- DIP (http://dip.doe-mbi.ucla.edu/)
- Protein-protein interactions
- BIND (http://bind.ca/)
- Molecular and genetic interactions
23KEGG pathway map
[Figure: KEGG pathway map of histidine metabolism. Compounds shown include PRPP, phosphoribosyl-ATP, phosphoribosyl-AMP, AICAR, L-histidinol-P, L-histidinal, L-histidine, histamine, carnosine, and anserine. Enzymes are labeled by EC number (e.g. 2.4.2.17, 3.6.1.31, 3.5.4.19, 5.3.1.16, 4.2.1.19, 3.1.3.15, 1.1.1.23). The map links to the pentose phosphate cycle and purine metabolism.]
24Integrating pathway and expression data
A list of genes that are activated, inactivated,
or unaffected between two samples becomes more
informative when the genes are mapped onto
pathway maps from which their functions can be
deduced.
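The mapping step can be sketched as a set intersection between the changed genes and each pathway's members, followed by a hypergeometric test for over-representation. Everything below is illustrative: the pathway memberships, the changed-gene list, and the genome size of 20 are toy assumptions, and the hypergeometric test is one common choice, not necessarily the method behind any tool named in these slides.

```python
from math import comb


def map_to_pathways(changed_genes, pathways):
    """Return, per pathway, the changed genes that belong to it."""
    changed = set(changed_genes)
    return {name: sorted(changed & set(members))
            for name, members in pathways.items()}


def enrichment_p(hits, pathway_size, n_changed, n_genes):
    """Hypergeometric P(X >= hits): chance of seeing at least this many
    pathway members among n_changed genes drawn from n_genes."""
    upper = min(pathway_size, n_changed)
    favorable = sum(
        comb(pathway_size, k) * comb(n_genes - pathway_size, n_changed - k)
        for k in range(hits, upper + 1)
    )
    return favorable / comb(n_genes, n_changed)


# Toy inputs (assumptions for illustration only).
pathways = {
    "histidine metabolism": ["HIS1", "HIS2", "HIS3", "HIS4"],
    "purine metabolism": ["ADE1", "ADE2", "ADE4"],
}
changed_genes = ["HIS1", "HIS3", "HIS4", "ADE2"]
N_GENES = 20  # pretend genome size

mapped = map_to_pathways(changed_genes, pathways)
p_his = enrichment_p(len(mapped["histidine metabolism"]),
                     len(pathways["histidine metabolism"]),
                     len(changed_genes), N_GENES)

print(mapped)
print(f"histidine metabolism enrichment P = {p_his:.4f}")
```

A small P-value suggests the changed genes cluster in that pathway more than chance would predict, which turns a flat gene list into a statement about function.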