Title: Last time:
Slide 1: Last time
Analyzing gene expression microarray data
- Similarity metrics
- Clustering methods
- Other methods of data organization
Lab: hierarchical clustering with different similarity metrics, clustering methods, and datasets
Slide 2: Homework 5
- Why might you have missed genes by using p/q values?
  An example of when this is important: comparing significant genes from 3 different conditions. If there is noise in a subset of the conditions, some of the lists of significant genes might be artifactually small, which may produce artificially little overlap between lists.
- What does it mean for an individual gene to have q < 0.35?
- Sensitivity: use a subset of the data as a test case. Of the known Hsf1p targets, we had data for 62 of them. 42 of 62 (67.7%) were identified as positives. Therefore, our sensitivity is about 68%.
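The sensitivity arithmetic above can be checked with a short Python sketch. The function name `sensitivity` is an illustrative choice, not something from the homework; the counts (42 detected out of 62 known Hsf1p targets with data) are the ones quoted on the slide.

```python
def sensitivity(detected_positives: int, known_positives: int) -> float:
    """Fraction of known true targets that the analysis called significant."""
    if known_positives <= 0:
        raise ValueError("need at least one known positive")
    return detected_positives / known_positives


# Counts quoted on the slide: 42 of the 62 known Hsf1p targets were detected.
hsf1_sensitivity = sensitivity(42, 62)
print(f"Sensitivity: {hsf1_sensitivity:.1%}")  # prints "Sensitivity: 67.7%"
```

Note that this only measures how many known targets were recovered; it says nothing about how many of the called genes are false positives.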
Slide 3: Homework 6
The microarray data follow the response of yeast cells to different stresses (heat shock, H2O2 treatment, amino acid starvation, and nutrient limitation). The goal was to experiment with different clustering parameters to see how they affected the clustering.
Which parameters you use in clustering depends on what experimental goals you have.
Slide 4: Which of these is clustered with Euclidean distance?
When would you want to use Euclidean distance?
Slide 5: When would you want to use the absolute value of the Pearson correlation?
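One way to build intuition for the two questions above is to compute both metrics on toy profiles. The sketch below is a minimal illustration in plain Python, not the clustering software used in the lab; the three expression profiles and the function names are made up for the example. Euclidean distance is sensitive to the magnitude of the expression changes, while a distance based on the absolute Pearson correlation treats profiles with the same shape, or the mirror-image shape, as close.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance; sensitive to the magnitude of expression changes."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_correlation(x, y):
    """Pearson correlation; compares the shape of two profiles, not their scale."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def abs_pearson_distance(x, y):
    """Distance of 0 for perfectly correlated OR perfectly anti-correlated profiles."""
    return 1.0 - abs(pearson_correlation(x, y))


# Hypothetical log-ratio profiles over four time points (illustrative values only).
gene_a = [1.0, 2.0, 3.0, 2.0]
gene_b = [2.0, 4.0, 6.0, 4.0]      # same shape as gene_a, twice the magnitude
gene_c = [-1.0, -2.0, -3.0, -2.0]  # mirror image of gene_a (repressed instead of induced)

for name, other in (("gene_b", gene_b), ("gene_c", gene_c)):
    print(name,
          "euclidean:", round(euclidean_distance(gene_a, other), 3),
          "abs-Pearson distance:", round(abs_pearson_distance(gene_a, other), 3))
```

With these toy values, Euclidean distance separates all three genes, whereas the absolute-Pearson distance places the induced and repressed genes together, which can be useful when oppositely regulated genes are expected to share a regulator.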
Slide 6: What kinds of information can we extract from whole-genome expression data?
- 1. Hypothetical functions for uncharacterized genes
  -- genes encoding subunits of multi-subunit protein complexes are often highly coregulated
     (example: ribosomal protein genes, proteasome genes in yeast)
  -- genes involved in the same cellular processes are often coregulated
- 2. New roles for characterized genes
- 3. Better understanding of the experimental conditions
  -- based on expression patterns of characterized genes
- 4. Implications of gene regulation
  -- WT vs. mutants can identify transcription factor targets
  -- promoter analysis of coregulated genes: upstream elements
  -- gene coregulation with known pathway targets can implicate pathway activity
- 5. Understanding developmental pathways
- 6. Defining experimental samples based on expression profiles
     (example: comparing tumor samples from patients)
Slide 7: Genes involved in the same cellular process are often coregulated
These genes may not have the same annotation, but they still function together and are thus co-expressed.
Slide 8: GO (Gene Ontology): a common language to describe gene function
http://www.geneontology.org/
Initiated by the GO Consortium (which started as model-organism DBs)
Controlled vocabulary for gene product function and the relationships between terms, structured as a DAG (directed acyclic graph); i.e., a parent can have more than one child, and a child can have more than one parent.
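To make the DAG structure concrete, here is a minimal Python sketch. The term names in `PARENTS` are invented for illustration and are not real GO identifiers, and the `ancestors` helper is a hypothetical name. The point is only that one term can sit under two parents, so annotating a gene to a specific term implicitly annotates it to every ancestor along every path.

```python
# Toy ontology: each term maps to its direct parents.
# Term names are illustrative placeholders, NOT real GO terms or IDs.
PARENTS = {
    "biological process": [],
    "metabolic process": ["biological process"],
    "response to stress": ["biological process"],
    # A term with two parents: this is what makes the structure a DAG, not a tree.
    "stress-induced metabolic process": ["metabolic process", "response to stress"],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


print(sorted(ancestors("stress-induced metabolic process")))
```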
Slide 9: GO (Gene Ontology): a common language to describe gene function
Initiated by the GO Consortium (which started as model-organism DBs)
Three ontologies:
- Biological Process
- Molecular Function
- Cellular Component
Slide 10: Enrichment of specific biological-process annotations
Slide 11: Enrichment of specific molecular-function annotations
Be careful of relying too heavily on annotations.
Slide 12: How can you tell if your clustering is significant?
Genes induced by carbon starvation
Slide 13: M choose i = the number of possible groups of size i composed of the M objects:
$\binom{M}{i} = \frac{M!}{(M-i)!\, i!}$
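The binomial coefficient above is the counting step behind a common enrichment test. The sketch below computes "M choose i" with Python's `math.comb` and then, as an assumption about where this counting is headed (the slide itself only gives the binomial coefficient), uses it in a hypergeometric tail probability: the chance of seeing at least `n_hits` annotated genes in a randomly drawn cluster. The function and parameter names are illustrative.

```python
from math import comb


def n_groups(M: int, i: int) -> int:
    """Number of distinct groups of size i drawn from M objects: M! / ((M-i)! i!)."""
    return comb(M, i)


def hypergeom_enrichment_pvalue(M: int, n_annotated: int,
                                cluster_size: int, n_hits: int) -> float:
    """P(at least n_hits annotated genes in a random cluster of cluster_size genes).

    M            total genes in the genome (or on the array)
    n_annotated  genes in the genome carrying the annotation of interest
    cluster_size genes in the cluster
    n_hits       genes in the cluster carrying the annotation
    """
    total = comb(M, cluster_size)  # all possible clusters of this size
    upper = min(n_annotated, cluster_size)
    tail = sum(comb(n_annotated, k) * comb(M - n_annotated, cluster_size - k)
               for k in range(n_hits, upper + 1))
    return tail / total


print(n_groups(5, 2))  # 10 possible pairs from 5 objects

# Hypothetical numbers: 6000 genes, 100 with the annotation, a cluster of 50 with 10 hits.
print(hypergeom_enrichment_pvalue(6000, 100, 50, 10))
```

A small p-value here means the overlap is unlikely by chance under random sampling; when many annotation categories are tested, some correction for multiple testing is still needed.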
Slide 14: Homework question: what is a hypothetical function for YGR136W?
Goal: cluster different datasets, identify the YGR136W cluster, and look for enrichment of GO categories using the FUNSPEC website.
- What functions were significantly enriched in the cluster chosen based on the HS_timecourse data only?
- What functions were significantly enriched in the cluster chosen based on the multi-stress dataset?
- Which do you trust more?
Note when you might want to use a lot of experiments vs. a few in your clustering.
Slide 15: What kinds of information can we extract from whole-genome expression data?
- 1. Hypothetical functions for uncharacterized genes
  -- genes encoding subunits of multi-subunit protein complexes are often highly coregulated
     (example: ribosomal protein genes, proteasome genes in yeast)
  -- genes involved in the same cellular processes are often coregulated
- 2. New roles for characterized genes
- 3. Better understanding of the experimental conditions
  -- based on expression patterns of characterized genes
- 4. Implications of gene regulation
  -- WT vs. mutants can identify transcription factor targets
  -- promoter analysis of coregulated genes: upstream elements
  -- gene coregulation with known pathway targets can implicate pathway activity
- 5. Understanding developmental pathways
- 6. Defining experimental samples based on expression profiles
     (example: comparing tumor samples from patients)
Slide 16: Many similarly expressed genes are coregulated by the same transcription factor(s)
Therefore, we can search the promoters of coregulated genes for binding sites.
Genes induced by carbon starvation
Slide 17: Many similarly expressed genes are coregulated by the same transcription factor(s)
Therefore, we can search the promoters of coregulated genes for binding sites.
Genes induced by carbon starvation
ORFs
Upstream region
Slide 18: Many similarly expressed genes are coregulated by the same transcription factor(s)
Therefore, we can search the promoters of coregulated genes for binding sites.
Genes induced by carbon starvation
ORFs
Upstream region
Similar sequence found in most upstream regions (here CCAAT, which is the Hap4p binding site)
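The promoter search described above can be sketched as a simple scan for a known motif. This is a minimal illustration, not the tool used in the lecture: the gene names and upstream sequences are made up, the helper names are hypothetical, and real motif discovery usually looks for unknown or degenerate motifs rather than one exact string. Both strands are scanned because a binding site can lie in either orientation.

```python
MOTIF = "CCAAT"  # the site named on the slide

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_occurrences(seq: str, pattern: str) -> int:
    """Count (possibly overlapping) exact matches of pattern in seq."""
    return sum(1 for i in range(len(seq) - len(pattern) + 1)
               if seq[i:i + len(pattern)] == pattern)


def count_sites(upstream: str, motif: str = MOTIF) -> int:
    """Matches on either strand. A palindromic motif would be counted twice."""
    upstream = upstream.upper()
    return (count_occurrences(upstream, motif)
            + count_occurrences(upstream, reverse_complement(motif)))


# Made-up upstream sequences for three hypothetical coregulated genes.
upstream_regions = {
    "geneA": "TTGACCAATCGGA",  # CCAAT on the forward strand
    "geneB": "GGATTGGTTACGT",  # ATTGG, i.e. CCAAT on the reverse strand
    "geneC": "ACGTACGTACGT",   # no site
}

genes_with_site = [gene for gene, seq in upstream_regions.items()
                   if count_sites(seq) > 0]
print(genes_with_site)  # ['geneA', 'geneB']
```

A short motif like this occurs by chance fairly often, so the count in a cluster is only meaningful when compared with its frequency in upstream regions genome-wide.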
Slide 19: Sequencing