Title: Structural Genomics

1
Structural Genomics

Entire genomes are being sequenced
Databases full of genomic sequences
Provides the string of nucleotides
Copy number
Intergenic regions

2
Information in a Genome
Genome (DNA)
Transcriptome (RNA)
Proteome (protein)
3
Functional Genomics

Understand gene function and gene interactions
Transcriptional Regulation
Which genes, and how much they change
What tissue
What stimulus
Proteins
difficult to purify
may require modification for activity
Monitoring RNA is relatively easy
Remember proteins do the work

4
Northern Blot Strategy

Harvest tissue and extract RNA
Electrophoresis yields size separation
Design and label probe of known sequence
Hybridize: complements anneal
Visualize and quantify signal

5
Northern Blot Strategy
---GUACCGUAGUCGACU---
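
As a rough illustration of "complements anneal", the sketch below derives a DNA probe that would base-pair with the RNA fragment shown on this slide. It assumes the fragment is written 5' to 3'; the helper name `complementary_dna_probe` is made up for this example and is not from the presentation.

```python
# Hypothetical helper: derive a DNA probe that anneals to an RNA target.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def complementary_dna_probe(rna_5_to_3: str) -> str:
    """Return the antisense DNA probe (written 5'->3') for an RNA written 5'->3'."""
    # Complement each base while walking the target backwards, so the
    # probe is also reported in the conventional 5'->3' direction.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_5_to_3))


target = "GUACCGUAGUCGACU"  # fragment from the slide, assumed 5'->3'
probe = complementary_dna_probe(target)
print(probe)  # AGTCGACTACGGTAC
```

A labeled probe with this sequence would hybridize only where the blot carries the matching transcript, which is what makes the signal specific.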
6
Northern Blot Strategy
7
Northern Blot Strategy
Dot Blot or Slot Blot
Northern Blot
Con
Increasing treatment level
8
Northern Blot Strategy
Final Product
9
Northern Blot Strategy: Expanded

Ideal situation would be to study expression of
all genes simultaneously
Problem
Probe generation requires prior knowledge of
sequence
Large number of probes to be made
Solution
Genome projects: genomic and EST/cDNA
fragments
Electronic databases
High-throughput, assembly line style,
semi-automated processes to make microarray chips

10
Microarray/Chip: Reverse Northern

Array Fabrication
Target Labeling and Hybridization
Detection and quantitation of signal

11
Microarray Fabrication

ssDNA (the probe) deposited on a solid surface in
a defined grid
Advantages
Can put many different DNAs on a single surface
Don't need prior knowledge of the gene or its
function

12
Microarray Fabrication

Spotting DNA on Glass Slides
Probe generation
Amplify cDNA inserts from a cDNA library
large fragments (full-length or near full-length)
Design synthetic oligos from electronic database
oligonucleotides (50-80 mers)
Deposit probe in defined positions
Requires automation or semi-automation

13
Microarray Fabrication

Robot deposits DNA at defined coordinates
Pins deposit small amounts of liquid on surface
ul containing 1-10 ng per spot
10,000-40,000 (100,000?) spots per slide

15
Microarray Fabrication
16
Microarray Fabrication

Each spot representing a different sequence,
has a unique physical location
May or may not have knowledge of the sequence

EST189022
M91589 Rat Beta-arrestin1 cds
U73142 Rat Mitogen activated protein kinase cds
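
One way to picture "each spot has a unique physical location" is a lookup table from grid coordinates to the spotted sequence. The sketch below is only illustrative: the three identifiers come from the slide, but the (row, column) positions and the `probe_at` helper are assumptions made up for this example.

```python
# Hypothetical layout: the grid positions are invented for illustration;
# only the three identifiers are taken from the slide.
array_layout = {
    (0, 0): "EST189022",  # an EST, known only by its identifier
    (0, 1): "M91589",     # Rat Beta-arrestin1 cds
    (0, 2): "U73142",     # Rat Mitogen activated protein kinase cds
}


def probe_at(row: int, col: int) -> str:
    """Look up which sequence was spotted at a given grid position."""
    return array_layout.get((row, col), "empty")


print(probe_at(0, 1))
```

Because the position alone identifies the probe, a signal at a coordinate can be assigned to a sequence even when nothing is known about that sequence's function.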
17
Basic Microarray Experiment
18
Assumptions of Gene Expression Studies

Tight correlation between gene function and
expression pattern
Expression patterns determine cell type and
function
Expressed genes reflect the environment
and/or internal state of the cell

19
Target Labeling and Hybridization

RNA extraction
mRNA extracted from control and treated
experimental units
Target Labeling
cDNA synthesis used to incorporate labeled
nucleotides
Hybridization
Labeled target binds specifically and
quantitatively to its complement (probe) on the
microarray

20
RNA Isolation

Raw sample contains biochemical contaminants
which need to be removed (protein, DNA, cellular
debris)

mRNA quality has the largest effect on the
success of the experiment
21
RNA Isolation: Quality Assessment
22
Target Labeling

To allow quantitative measurements, nucleic acid
is labeled with nucleotides that contain a fluor,
biotin or radioactivity

23
Target Labeling
24
Target Labeling
RNA Extraction
25
Target Labeling
26
Target Labeling
TTTT
(dT)24 Primed
27
Target Labeling
TTTT
Reverse Transcription
28
Target Labeling
T T T T
First Strand
29
Target Labeling
T T T T
RNaseH nicks RNA Strand
30
Target Labeling
Cells
A A A A
mRNA
T T T T
DNA Polymerase Extends From Nick
31
Target Labeling
Cells
A A A A
mRNA
T T T T
A A A A
cDNA
T T T T
DNA Polymerase Extends From Nick
35
Target Labeling
Cells
A A A A
mRNA
T T T T
A A A A
cDNA
T T T T
Second Strand
36
Target Labeling

Prepare mRNA from different sources
Incorporation of different fluorescent labels
Cy3 (green) for one sample
Cy5 (red) used for other sample
Mix samples
Hybridize

37
Hybridization

The labeled target is selectively bound to
complementary probes
Note: the location of each probe on the array
is known

x 12,000
38

Competitive Hybridization
39
Competitive Hybridization Reference Sample
40
Scanning / Visualization

Signal intensity for each probe is quantitatively
measured at each of two wavelengths
Signal intensity represents the quantity of the
transcript from each original sample
Thus, the ratio of the signal intensities
represents the relative change in gene expression
between samples
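
The ratio described above is often reported on a log2 scale so that up- and down-regulation are symmetric around zero. The log transform and the spot intensities below are assumptions for illustration, not values from the presentation.

```python
import math


def log2_ratio(cy5_signal: float, cy3_signal: float) -> float:
    """Return log2(Cy5 / Cy3); positive means higher signal in the Cy5 sample."""
    if cy5_signal <= 0 or cy3_signal <= 0:
        raise ValueError("signals must be positive")
    return math.log2(cy5_signal / cy3_signal)


# Hypothetical spot intensities as (Cy5, Cy3) pairs.
spots = {
    "spot_A": (4000.0, 1000.0),  # 4x higher in the Cy5 sample
    "spot_B": (500.0, 2000.0),   # 4x lower in the Cy5 sample
    "spot_C": (1500.0, 1500.0),  # unchanged
}
ratios = {name: log2_ratio(cy5, cy3) for name, (cy5, cy3) in spots.items()}
print(ratios)
```

On this scale a four-fold increase and a four-fold decrease give +2 and -2, and an unchanged transcript gives 0.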

41
Data Analysis

Large data sets require computing power
Clustering genes with common expression patterns
is a common way to show microarray results

42
Data Analysis

After the Chips

43
Microarray Data Analysis

Data Analysis
Lists
Differentially Expressed Genes
Above a fold change (2X, 3X, 5X)
ANOVA
Cluster
Groups similarly responding genes or arrays
Hierarchical
K-Means
PCA
Functional Annotation
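
The fold-change list in the outline above can be sketched as a simple filter. The gene names, intensities, and the `differentially_expressed` helper are hypothetical; the 2X and 3X cut-offs are the ones named on the slide.

```python
def differentially_expressed(expression, fold_threshold=2.0):
    """Return genes whose treated/control ratio meets the threshold in either direction."""
    selected = []
    for gene, (control, treated) in expression.items():
        ratio = treated / control
        # Express down-regulation as a fold change >= 1 as well.
        fold = ratio if ratio >= 1 else 1 / ratio
        if fold >= fold_threshold:
            selected.append(gene)
    return selected


# Hypothetical (control, treated) signal intensities.
expression = {
    "gene_1": (100.0, 250.0),  # 2.5-fold up
    "gene_2": (100.0, 120.0),  # 1.2-fold up
    "gene_3": (300.0, 90.0),   # about 3.3-fold down
}
print(differentially_expressed(expression, fold_threshold=2.0))
print(differentially_expressed(expression, fold_threshold=3.0))
```

A fold-change cut-off ignores replicate variability, which is why the outline lists ANOVA as an alternative way to build the gene list.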

44
Data Analysis: Lists
45
Data Analysis: ANOVA Table
46
Data Analysis Agglomerative Hierarchical
Clustering

Bottom-up clustering method
Clusters have sub-clusters, which have
sub-clusters, etc.
Process
Each signal value is a separate cluster
Evaluate all pair-wise distances between clusters
Construct a distance matrix using the distance
values
Look for the pair of clusters with the shortest
distance
Remove the pair from the matrix and merge them
Evaluate all distances from this new cluster to
all other clusters
Repeat until the distance matrix is reduced to a
single element
Visualize tree
Results
It can produce an ordering of the objects, which
may be informative
Smaller clusters are generated, which may be
helpful for discovery
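
The process above can be sketched in plain Python. The slide does not say how the distance from a merged cluster to the others is computed, so average linkage with Euclidean distance is assumed here; the function names and the five example profiles are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise distance between the members of two clusters (an assumed choice)."""
    total = sum(euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(points):
    """Merge the closest pair of clusters until one remains; return the merge history."""
    clusters = [(i,) for i in range(len(points))]  # every item starts as its own cluster
    merges = []
    while len(clusters) > 1:
        # Look for the pair of clusters with the shortest distance.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], points))
        distance = average_linkage(a, b, points)
        # Remove the pair and add the merged cluster in their place.
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(a + b)
        merges.append((a, b, distance))
    return merges


# Hypothetical expression profiles (one tuple per gene).
profiles = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
merges = agglomerate(profiles)
for left, right, dist in merges:
    print(left, right, round(dist, 2))
```

The merge history is the information a dendrogram draws: each row is one join, and the recorded distance sets the height of that branch. This version recomputes distances on every pass for clarity rather than maintaining a distance matrix, so it is only suitable for small examples.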

47
Data Analysis Agglomerative Hierarchical
Clustering
Points in n-dimensional space
Create difference measure between each pair
48
Data Analysis Agglomerative Hierarchical
Clustering
Points in n-dimensional space
Create difference measure between each pair
Find the 2 most similar and consider them as 1
49
Data Analysis Agglomerative Hierarchical
Clustering
Points in n-dimensional space
Create difference measure between each pair
Find the 2 most similar and consider them as 1
Find the most similar to the group, consider as 1
50
Data Analysis Agglomerative Hierarchical
Clustering
Points in n-dimensional space
Create difference measure between each pair
Find the 2 most similar and consider them as 1
Find the most similar to the group, consider as 1
Repeat process until only 1 point remains
Create tree
51
Data Analysis Agglomerative Hierarchical
Clustering
Data set contains only significant genes
Green (-), Red (+), Black (0)
Black lines indicate similarity
Short lines imply greater similarity
Rows are individual genes
Columns are different chips
52
Data Analysis Agglomerative Hierarchical
Clustering
Two different cluster analyses
Small branches of larger clusters
53
Data Analysis Agglomerative Hierarchical
Clustering
Small branch of large cluster analysis
Lines show similarity between treatments
10 KPa most similar to each other
2 most similar to 10 KPa
101 KPa somewhat dissimilar to each other
Gene ID list on right
54
Data Analysis K-Means Clustering

K-Means Clustering
creates a specific number of non-hierarchical
clusters
non-deterministic and iterative
Properties
always K clusters.
always at least one item in each cluster.
clusters are non-hierarchical
every member is closer to its own cluster's
centroid than to any other cluster's

55
Data Analysis K-Means Clustering

Process
The dataset is partitioned into K clusters
randomly with roughly the same number of data
points
Calculate cluster centroid
For each data point
Calculate the distance from data point to each
cluster centroid
If data point is closest to its own cluster,
leave it where it is. If not, move it into the
closest cluster
Recalculate centroid
Repeat second step until a complete pass through
of all the data points results in no data point
moving from one cluster to another
The initial partition can greatly affect the final
clusters that result, in terms of inter-cluster
and intra-cluster distances and cohesion
Try different number of groups
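
A minimal sketch of the process above, in plain Python. Squared Euclidean distance, the `seed` and `max_passes` parameters, and the rule that a cluster's last member is never moved (to keep every cluster non-empty) are assumptions added for this example; it also assumes there are at least K data points.

```python
import random


def centroid(points):
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, seed=0, max_passes=100):
    """Return one cluster label (0..k-1) per point."""
    rng = random.Random(seed)
    # Random initial partition with roughly the same number of points per cluster.
    order = list(range(len(points)))
    rng.shuffle(order)
    labels = [0] * len(points)
    for position, index in enumerate(order):
        labels[index] = position % k
    for _ in range(max_passes):
        moved = False
        for i, point in enumerate(points):
            # Recalculate centroids so each decision sees the current partition.
            centroids = [centroid([p for p, lab in zip(points, labels) if lab == c])
                         for c in range(k)]
            nearest = min(range(k), key=lambda c: sq_dist(point, centroids[c]))
            # Assumed safeguard: never move the last member out of a cluster.
            if nearest != labels[i] and labels.count(labels[i]) > 1:
                labels[i] = nearest
                moved = True
        if not moved:  # a complete pass with no moves: converged
            break
    return labels


# Hypothetical data: two well-separated groups of three points.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = k_means(data, k=2)
print(labels)
```

Which group receives label 0 depends on the random initial partition, so only the grouping is meaningful, not the label values. Running with several seeds and several values of K is one way to check how sensitive the result is to the starting partition.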

56
Data Analysis K-Means Clustering
Approximately 250 genes
Significant treatment effect
Partitioned into 10 clusters
Identify different expression patterns
Isolate individual groups
57
Data Analysis K-Means Clustering
Isolate individual groups
Find mean of each column
Plot mean
Graphs display different expression patterns
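
Finding the mean of each column for one cluster might look like the sketch below. The expression values are made up; each row stands for a gene in the cluster and each column for a chip.

```python
def mean_profile(cluster_rows):
    """Mean of each column (chip) across the genes in one cluster."""
    return [sum(column) / len(column) for column in zip(*cluster_rows)]


# Hypothetical cluster: 3 genes (rows) measured on 3 chips (columns).
cluster = [
    [1.0, 2.0, 4.0],
    [3.0, 2.0, 0.0],
    [2.0, 5.0, 2.0],
]
profile = mean_profile(cluster)
print(profile)  # [2.0, 3.0, 2.0]
```

Plotting one such profile per cluster gives the kind of summary graph the slide refers to.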
58
Data Analysis Principal Component Analysis

Reduces the dimensionality of the data
Axis of greatest effect identified
Relation between this line and points determined
Repeat process
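
A rough sketch of the first step, finding the axis of greatest variation and the position of each point along it. Power iteration on the sample covariance matrix is an assumed technique, chosen because it needs no external libraries; the function name, iteration count, and example data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (axis, scores): the direction of greatest variance and each row's projection."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the dominant axis.
    # Caveat: this fails if the all-ones start vector is orthogonal to that axis.
    axis = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        axis = [v / norm for v in nxt]
    # Relation between the axis and each point: its coordinate along the axis.
    scores = [sum(c[j] * axis[j] for j in range(d)) for c in centered]
    return axis, scores


# Hypothetical data lying exactly on the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
axis, scores = first_principal_component(data)
print(axis, scores)
```

For this data the axis points along (1, 2), normalized to unit length. "Repeat process" would correspond to removing the variation along this axis and searching again for the next one; that step is not shown here.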

59
Data Analysis Principal Component Analysis: Time
Zero
60
Data Analysis Principal Component Analysis
68
Data Analysis Functional Annotation

Annotation
Gene description
thioredoxin reductase 1
Gene function (Ontology)
Signal transducer
Pathway information
Proteasome degradation

69
Data Analysis Functional Annotation
Genes within a cluster or influenced by a
treatment can be annotated
70
Data Analysis Functional Annotation
General
Specific
71
Data Analysis Functional Annotation
