Oct 4 - PowerPoint PPT Presentation

Provided by: shen161
Transcript and Presenter's Notes

Title: Oct 4


1
Presentation volunteer needed
  • Oct 4
  • A genome-wide study of gene activity reveals
    developmental signaling pathways in the
    preimplantation mouse embryo. Wang et al, Dev
    Cell 2004.
  • Gene expression in the preimplantation embryo:
    in-vitro developmental changes. Reproductive
    BioMedicine 2005

2
Dimension Reduction Methods
3
Motivation
  • High-dimensional data points are difficult to
    visualize, which makes it hard to detect or
    confirm relationships among them
  • In microarray data, each sample point has
    thousands of genes as coordinates, and each gene
    point has tens of samples as coordinates

4
If applying dimension reduction
  • Better visualize the unsupervised clustering
    results
  • Color hierarchical or K-means clusters in reduced
    dimensions (2D or 3D) to assess cluster tightness
    and outliers
  • Discover clusters visually in lower dimensions

5
Better visualize unsupervised clustering results
  • Hierarchical clustering (HC) of samples uses the
    167 filtered genes
  • Three major sample clusters are identified and
    colored by HC
  • Use the cluster information to project samples
    from 167 dimensions to 2D using Linear
    Discriminant Analysis (LDA)
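A minimal sketch of this kind of supervised projection, assuming NumPy is available. The function name lda_project and the synthetic 6-dimensional data are illustrative stand-ins, not the 167-gene data set from the slide. With real data that has more genes than samples, the within-cluster scatter matrix is singular, so the pseudo-inverse used here is only one possible workaround.

```python
import numpy as np

def lda_project(X, labels, n_components=2):
    """Project rows of X onto the leading Fisher discriminant directions."""
    overall_mean = X.mean(axis=0)
    k = X.shape[1]
    S_within = np.zeros((k, k))    # scatter of samples around their own cluster mean
    S_between = np.zeros((k, k))   # scatter of cluster means around the overall mean
    for cls in np.unique(labels):
        Xc = X[labels == cls]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        S_within += centered.T @ centered
        diff = (mean_c - overall_mean)[:, None]
        S_between += len(Xc) * (diff @ diff.T)
    # Leading eigenvectors of pinv(S_within) @ S_between maximize
    # between-cluster spread relative to within-cluster spread.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_within) @ S_between)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    W = eigvecs[:, order].real
    return (X - overall_mean) @ W

# Synthetic stand-in: 3 clusters x 20 samples in 6 dimensions (not real expression data).
rng = np.random.default_rng(0)
centers = np.zeros((3, 6))
centers[1, 0] = 10.0
centers[2, 1] = 10.0
labels = np.repeat(np.arange(3), 20)
X = centers[labels] + rng.normal(size=(60, 6))

projected = lda_project(X, labels)   # 60 x 2 coordinates, ready to plot colored by cluster
```

Because the projection is fitted to the cluster labels, the clusters will usually look well separated in the resulting plot; that is a property of the method, not independent confirmation of the clustering.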
6
Discover clusters visually in lower dimensions
  • Hierarchical clustering (HC) of samples uses the
    167 filtered genes
  • Three major sample clusters are identified and
    colored by HC
  • Project samples from 167 dimensions to 2D using
    Principal Component Analysis (PCA), which does
    NOT use the cluster information
  • The visual clustering after PCA may not agree
    well with the HC results
7
Viewing clustering results through dimension
reduction: more examples
8
Viewing clustering results through dimension
reduction: more examples
Clustering of gene expression data. a,
Hierarchical clustering dendrogram with the
cluster of 19 melanomas at the centre. b, MDS
three-dimensional plot of all 31 cutaneous
melanoma samples showing major cluster of 19
samples (blue, within cylinder), and remaining 12
samples (gold). Bittner et al., Nature, Vol. 406,
p. 536, 2000
9
Dimension Reduction Methods
  • Principal Component Analysis (PCA)
  • Linear Discriminant Analysis (LDA)
  • Multi-Dimensional Scaling (MDS)

10
Principal Component Analysis (PCA)
  • Given N data vectors in k dimensions, find c < k
    orthogonal vectors that best represent the data
  • The original data set is reduced to one
    consisting of N data vectors on c principal
    components (reduced dimensions)
  • Each data vector is approximated by a linear
    combination of the c principal component vectors
  • Project onto the subspace that preserves most
    of the data variability
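The steps above can be sketched with a singular value decomposition, assuming NumPy is available. The function name pca_project and the toy data are illustrative assumptions, not material from the slides.

```python
import numpy as np

def pca_project(X, c):
    """Project the rows of X (N samples x k features) onto the top c principal components."""
    X_centered = X - X.mean(axis=0)            # PCA operates on mean-centered data
    # Rows of Vt are orthonormal principal directions, ordered by
    # decreasing singular value (decreasing variance explained).
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:c]                        # c x k matrix of directions
    scores = X_centered @ components.T         # N x c reduced coordinates
    explained = (S ** 2) / np.sum(S ** 2)      # fraction of variance per component
    return scores, components, explained[:c]

# Toy data: 50 samples x 10 features whose variance lies mostly along 2 hidden directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2)) * [5.0, 2.0]
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(50, 10))

scores, components, explained = pca_project(X, c=2)
```

Here scores holds the N data vectors expressed on c = 2 principal components, and explained reports how much of the total variability each retained component preserves.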

11
Principal Component Analysis (PCA)
12
E.g., a sample point X = (gene 1, gene 2, gene 3,
..., gene n); a gene point X = (sample 1, sample 2,
..., sample p)
16
Figure panels: first three PC directions; first
two PC directions
17
Homework (Due Sept 25)
  • Read more on Principal Component Analysis
  • http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
  • (Team) Download a gene expression dataset:
    http://bioinfor.bioen.uiuc.edu/bioe598/data/colon-cancer.xls
    Read it into R. Find the top 200 genes with the
    largest standard deviation. Use these 200 genes
    to cluster the samples (the pam function performs
    k-medoids clustering, a close relative of
    k-means). Are you satisfied with your clustering
    result? Are there alternative ways to do this?
    Hand in your report.
  • Inspect the course project datasets on the course
    webpage. What can you do with them?
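The gene-filtering step of the team exercise can be sketched as follows. This is a Python illustration using only the standard library, not the R workflow the homework asks for; the helper name top_variable_genes and the synthetic matrix (genes in rows, samples in columns) are assumptions, and the real colon-cancer file is not read here.

```python
import random
import statistics

def top_variable_genes(expression, n_top):
    """Return indices of the n_top rows (genes) with the largest sample standard deviation."""
    sds = [statistics.stdev(row) for row in expression]
    ranked = sorted(range(len(expression)), key=lambda g: sds[g], reverse=True)
    return ranked[:n_top]

# Synthetic stand-in: 1000 genes x 20 samples; genes 0-9 vary far more than the rest.
random.seed(0)
expression = [
    [random.gauss(0.0, 5.0 if g < 10 else 1.0) for _ in range(20)]
    for g in range(1000)
]

selected = top_variable_genes(expression, n_top=10)
filtered = [expression[g] for g in selected]   # rows to pass on to the clustering step
```

For the homework itself, n_top would be 200, and the filtered matrix would be transposed so that samples, not genes, are the objects being clustered.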

18
Self-organizing maps (SOM)
Interpreting patterns of gene expression with
self-organizing maps: Methods and application to
hematopoietic differentiation. Tamayo et al., PNAS
Vol. 96, pp. 2907, 1999
19
  • Method
  • Choose a geometry of nodes (e.g. a 6 by 5
    grid)
  • The nodes are mapped into k-dimensional gene
    expression space (k = no. of conditions),
    initially at random, and then iteratively
    adjusted
  • Each iteration involves randomly selecting a data
    point P and moving the nodes in the direction of
    P. The closest node N_P is moved the most,
    whereas other nodes are moved by smaller amounts
    depending on their distance from N_P in the
    initial geometry.
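The method above can be sketched in plain Python. The linear decay of the step size, the Gaussian neighborhood on the grid, and the names train_som and sq_dist are assumptions made for illustration; the slides do not specify these choices.

```python
import math
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_som(data, rows, cols, n_iter=2000, seed=0):
    """Fit a rows x cols grid of nodes to data (a list of k-dimensional points)."""
    rng = random.Random(seed)
    k = len(data[0])
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    lo = [min(p[j] for p in data) for j in range(k)]
    hi = [max(p[j] for p in data) for j in range(k)]
    # Initial mapping: each node gets a random position inside the data's bounding box.
    pos = {n: [rng.uniform(lo[j], hi[j]) for j in range(k)] for n in grid}
    for i in range(n_iter):
        frac = 1.0 - i / n_iter                      # 1 at the start, near 0 at the end
        rate = 0.5 * frac + 0.01                     # assumed step-size schedule
        radius = 0.5 * max(rows, cols) * frac + 0.1  # assumed neighborhood width
        p = rng.choice(data)                         # randomly selected data point P
        winner = min(grid, key=lambda n: sq_dist(pos[n], p))   # closest node N_P
        for n in grid:
            # Nodes farther from N_P on the grid move by smaller amounts.
            step = rate * math.exp(-sq_dist(n, winner) / (2 * radius ** 2))
            pos[n] = [w + step * (x - w) for w, x in zip(pos[n], p)]
    return pos

# Synthetic stand-in for expression profiles: two groups of points over 3 conditions.
data_rng = random.Random(1)
data = [[data_rng.gauss(mu, 0.5) for _ in range(3)] for mu in (0.0, 10.0) for _ in range(20)]
pos = train_som(data, rows=2, cols=2)
```

After training, each data point (gene) can be assigned to its nearest node, which gives the SOM clusters.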

21
The position of node N at iteration i is denoted
f_i(N). The initial mapping f_0 is random. On
subsequent iterations, a data point P is selected
and the node N_P that maps nearest to P is
identified. The mapping of nodes is then adjusted
by moving points toward P.
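One common way to write such an update is given below as a sketch. The learning-rate function \tau and the grid distance d(N, N_P) are symbols introduced here as assumptions, since the formula itself did not survive in this transcript; \tau is taken to decrease with both the distance of N from N_P in the node geometry and the iteration number i.

```latex
f_{i+1}(N) = f_i(N) + \tau\bigl(d(N, N_P),\, i\bigr)\,\bigl(P - f_i(N)\bigr)
```

With \tau between 0 and 1, each node moves a fraction of the way from its current position toward P, and the closest node N_P moves the most.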
22
Yeast cell cycle data: Cho et al. 1998, Mol. Cell
2, 65-73
23
SOM clustering of periodic genes