Structural Genomics - PowerPoint PPT Presentation

Slides: 72
Provided by: bill251

Transcript and Presenter's Notes


1
Structural Genomics
  • Entire genomes are being sequenced
  • Databases full of genomic sequences
  • Provides the string of nucleotides
  • Copy number
  • Intergenic regions

2
Information in a Genome
Genome (DNA)
Transcriptome (RNA)
Proteome (protein)
3
Functional Genomics
  • Understand gene function and gene interactions
  • Transcriptional Regulation
  • Which genes, and how much they change
  • What tissue
  • What stimulus
  • Proteins
  • difficult to purify
  • may require modification for activity
  • Monitoring RNA is relatively easy
  • Remember proteins do the work

4
Northern Blot Strategy
  • Harvest tissue and extract RNA
  • Electrophoresis yields size separation
  • Design and label a probe of known sequence
  • Hybridize: complementary strands anneal
  • Visualize and quantify signal

5
Northern Blot Strategy
---GUACCGUAGUCGACU---
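A probe anneals to its target when the two strands are complementary and antiparallel. The following is a minimal sketch of how a DNA probe for the fragment shown above could be derived, assuming the slide's sequence is the RNA target written 5' to 3' (the slide does not state the orientation). The function name `dna_probe_for` is illustrative only.

```python
# Watson-Crick pairing from an RNA target base to the DNA probe base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def dna_probe_for(rna_target: str) -> str:
    """Return the DNA reverse complement of an RNA target (both read 5'->3')."""
    # Walk the target backwards so the probe is also written 5'->3'.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna_target))


# Fragment taken from the slide; orientation is an assumption.
probe = dna_probe_for("GUACCGUAGUCGACU")
print(probe)
```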
6
Northern Blot Strategy
7
Northern Blot Strategy
Dot Blot or Slot Blot
Northern Blot
Con
Increasing treatment level
8
Northern Blot Strategy
Final Product
9
Northern Blot Strategy Expanded
  • Ideal situation would be to study expression of
    all genes simultaneously
  • Problem
  • Probe generation requires prior knowledge of
    sequence
  • Large number of probes to be made
  • Solution
  • Genome projects: genomic and EST/cDNA
    fragments
  • Electronic databases
  • High-throughput, assembly line style,
    semi-automated processes to make microarray chips

10
Microarray/Chip Reverse Northern
  • Array Fabrication
  • Target Labeling and Hybridization
  • Detection and quantitation of signal

11
Microarray Fabrication
  • ssDNA deposited on a solid surface in a defined
    grid (the probe)
  • Advantages
  • Can put many different DNAs on a single surface
  • Don't need prior knowledge of the gene or its
    function

12
Microarray Fabrication
  • Spotting DNA on Glass Slides
  • Probe generation
  • Amplify cDNA inserts from a cDNA library
  • large fragments (full-length or near full-length)
  • Design synthetic oligos from electronic database
  • oligonucleotides (50-80 mers)
  • Deposit probe in defined positions
  • Requires automation or semi-automation

13
Microarray Fabrication
  • Robot deposits DNA at defined coordinates
  • Pins deposit small amounts of liquid on surface
  • ul containing 1-10 ng per spot
  • 10,000-40,000 (100,000?) spots per slide

14
(No Transcript)
15
Microarray Fabrication
16
Microarray Fabrication
  • Each spot representing a different sequence,
    has a unique physical location
  • May or may not have knowledge of the sequence

EST189022
M91589 Rat Beta-arrestin1 cds
U73142 Rat Mitogen activated protein kinase cds
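Because every spot has a unique physical location, an array layout can be treated as a lookup from grid position to probe identity. A minimal sketch follows; the identifiers are the three listed above, while the (row, column) coordinates and the name `probe_at` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple

# (row, column) grid coordinates -> probe identifier.
# Coordinates are hypothetical; identifiers are the ones listed on the slide.
array_layout: Dict[Tuple[int, int], str] = {
    (0, 0): "EST189022",  # an EST: sequence known, function possibly unknown
    (0, 1): "M91589 Rat Beta-arrestin1 cds",
    (0, 2): "U73142 Rat Mitogen activated protein kinase cds",
}


def probe_at(row: int, col: int) -> str:
    """Look up which probe was deposited at a given grid position."""
    return array_layout[(row, col)]


print(probe_at(0, 1))
```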
17
Basic Microarray Experiment
18
Assumptions of Gene Expression Studies
  • Tight correlation between gene function and
    expression pattern
  • Expression patterns determine cell type and
    function
  • Expressed genes reflect the environment
    and/or internal state of the cell

19
Target Labeling and Hybridization
  • RNA extraction
  • mRNA extracted from control and treated
    experimental units
  • Target Labeling
  • cDNA synthesis used to incorporate labeled
    nucleotides
  • Hybridization
  • Labeled target binds specifically and
    quantitatively to its complement (probe) on the
    microarray

20
RNA Isolation
  • Raw sample contains biochemical contaminants
    which need to be removed (protein, DNA, cellular
    debris)

mRNA quality has the largest effect on the
success of the experiment
21
RNA Isolation Quality Assessment
22
Target Labeling
  • To allow quantitative measurements, nucleic acid
    is labeled with nucleotides that contain a fluor,
    biotin or radioactivity

23
Target Labeling
24
Target Labeling
RNA Extraction
25
Target Labeling
26
Target Labeling
TTTT
(dT)24 Primed
27
Target Labeling
TTTT
Reverse Transcription
28
Target Labeling
T T T T
First Strand
29
Target Labeling
T T T T
RNaseH nicks RNA Strand
30
Target Labeling
Cells
A A A A
mRNA
T T T T
DNA Polymerase Extends From Nick
31
Target Labeling
Cells
A A A A
mRNA
T T T T
A A A A
cDNA
T T T T
DNA Polymerase Extends From Nick
35
Target Labeling
Cells
A A A A
mRNA
T T T T
A A A A
cDNA
T T T T
Second Strand
36
Target Labeling
  • Prepare mRNA from different sources
  • Incorporation of different fluorescent labels
  • Cy3 (green) for one sample
  • Cy5 (red) used for other sample
  • Mix samples
  • Hybridize

37
Hybridization
  • The labeled target is selectively bound to
    complementary probes
  • Note: the location of each probe on the
    array is known

x 12,000
38

Competitive Hybridization
39
Competitive Hybridization Reference Sample
40
Scanning / Visualization
  • Signal intensity for each probe is quantitatively
    measured at each of two wavelengths
  • Signal intensity represents the quantity of the
    transcript from each original sample
  • Thus, the ratio of the signal intensities
    represents the relative change in gene expression
    between samples
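
The ratio described above is often reported on a log2 scale so that equal increases and decreases are symmetric around zero. The sketch below assumes that convention (the slides do not specify it) and assumes Cy5 (red) is the numerator; the intensities are made up for illustration.

```python
import math


def log2_ratio(cy5_signal: float, cy3_signal: float) -> float:
    """Log2 of the red/green intensity ratio for one spot."""
    if cy5_signal <= 0 or cy3_signal <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(cy5_signal / cy3_signal)


# Made-up intensities, for illustration only.
print(log2_ratio(2000.0, 500.0))  # red four times brighter than green
print(log2_ratio(500.0, 2000.0))  # green four times brighter than red
print(log2_ratio(800.0, 800.0))   # equal signal in both channels
```

A positive value then means more transcript in the Cy5-labeled sample, a negative value means more in the Cy3-labeled sample, and zero means no change.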

41
Data Analysis
  • Large data sets require computing power
  • Clustering genes with common expression patterns
    is a common way to show microarray results

42
Data Analysis
  • After the Chips

43
Microarray Data Analysis
  • Data Analysis
  • Lists
  • Differentially Expressed Genes
  • Above a fold change (2X, 3X, 5X)
  • ANOVA
  • Cluster
  • Groups similarly responding genes or arrays
  • Hierarchical
  • K-Means
  • PCA
  • Functional Annotation
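
A fold-change list like the one mentioned above can be sketched as a simple filter. This is an illustrative example only: it assumes each gene already has a treated/control ratio, treats a change in either direction as differential, and uses hypothetical gene names and values.

```python
from typing import Dict, List


def differentially_expressed(ratios: Dict[str, float], fold: float = 2.0) -> List[str]:
    """Gene IDs whose ratio changed by at least `fold`, up or down."""
    return sorted(
        gene
        for gene, ratio in ratios.items()
        if ratio >= fold or ratio <= 1.0 / fold
    )


# Hypothetical ratios, not data from the presentation.
example = {"geneA": 3.1, "geneB": 1.2, "geneC": 0.4, "geneD": 0.9}
print(differentially_expressed(example, fold=2.0))
print(differentially_expressed(example, fold=5.0))
```

Raising the threshold from 2X to 5X shortens the list, which is the trade-off between sensitivity and stringency that the cutoff controls.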

44
Data Analysis Lists
45
Data Analysis ANOVA Table
46
Data Analysis Agglomerative Hierarchical Clustering
  • Bottom-up clustering method
  • Clusters have sub-clusters, which have
    sub-clusters, etc.
  • Process
  • Each signal value is a separate cluster
  • Evaluate all pair-wise distances between clusters
  • Construct a distance matrix using the distance
    values
  • Look for the pair of clusters with the shortest
    distance
  • Remove the pair from the matrix and merge them
  • Evaluate all distances from this new cluster to
    all other clusters
  • Repeat until the distance matrix is reduced to a
    single element
  • Visualize tree
  • Results
  • It can produce an ordering of the objects, which
    may be informative
  • Smaller clusters are generated, which may be
    helpful for discovery
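
The process above can be sketched in plain Python. This is a minimal illustration, not the presenter's implementation: it assumes Euclidean distance and average linkage (the slides name neither), and for brevity it recomputes cluster distances on each pass instead of maintaining an explicit distance matrix. The sample points are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points: List[Point]) -> List[Tuple[List[int], List[int], float]]:
    """Return the merge history as (members of A, members of B, distance)."""
    # Every observation starts as its own cluster (stored as index lists).
    clusters: List[List[int]] = [[i] for i in range(len(points))]
    merges: List[Tuple[List[int], List[int], float]] = []

    def linkage(c1: List[int], c2: List[int]) -> float:
        # Average of all pairwise distances between the two clusters' members.
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Evaluate all cluster pairs and keep the one with the shortest distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Remove the pair and merge it into a single new cluster.
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges


# Hypothetical points in 2-dimensional space.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (20.0, 20.0)]
merges = agglomerate(points)
for left, right, distance in merges:
    print(left, right, round(distance, 2))
```

The merge history is what a tree drawing is built from: each merge is one branch point, and its distance sets the branch length.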

47
Data Analysis Agglomerative Hierarchical Clustering
  • Points in n-dimensional space
  • Create difference measure between each pair
48
Data Analysis Agglomerative Hierarchical Clustering
  • Points in n-dimensional space
  • Create difference measure between each pair
  • Find the 2 most similar and consider them as 1
49
Data Analysis Agglomerative Hierarchical Clustering
  • Points in n-dimensional space
  • Create difference measure between each pair
  • Find the 2 most similar and consider them as 1
  • Find the most similar to the group, consider
    them as 1
50
Data Analysis Agglomerative Hierarchical Clustering
  • Points in n-dimensional space
  • Create difference measure between each pair
  • Find the 2 most similar and consider them as 1
  • Find the most similar to the group, consider
    them as 1
  • Repeat process until only 1 point remains
  • Create tree
51
Data Analysis Agglomerative Hierarchical Clustering
  • Data set contains only significant genes
  • Green (-), Red (+), Black (0)
  • Black lines indicate similarity
  • Short lines imply greater similarity
  • Rows are individual genes
  • Columns are different chips
52
Data Analysis Agglomerative Hierarchical Clustering
  • Two different cluster analyses
  • Small branches of larger clusters
53
Data Analysis Agglomerative Hierarchical Clustering
  • Small branch of large cluster analysis
  • Lines show similarity between treatments
  • 10 KPa most similar to each other
  • 2 most similar to 10 KPa
  • 101 KPa somewhat dissimilar to each other
  • Gene Id list on right
54
Data Analysis K-Means Clustering
  • K-Means Clustering
  • creates a specific number of non-hierarchical
    clusters
  • non-deterministic and iterative 
  • Properties
  • always K clusters.
  • always at least one item in each cluster.
  • clusters are non-hierarchical
  • every member is closer to its own cluster than
    to any other cluster

55
Data Analysis K-Means Clustering
  • Process
  • The dataset is partitioned into K clusters
    randomly with roughly the same number of data
    points
  • Calculate cluster centroid
  • For each data point
  • Calculate the distance from data point to each
    cluster centroid
  • If data point is closest to its own cluster,
    leave it where it is. If not, move it into the
    closest cluster
  • Recalculate centroid
  • Repeat second step until a complete pass through
    of all the data points results in no data point
    moving from one cluster to another
  • initial partition can greatly affect the final
    clusters that result, in terms of inter-cluster
    and intra-cluster distances and cohesion
  • Try different number of groups 
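
The process above can be sketched as follows. This is a minimal illustration under stated assumptions: Euclidean distance, a seeded random initial partition, centroids recalculated once per full pass (rather than after every single move), and an emptied cluster simply keeping its previous centroid. The function names and sample points are hypothetical.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def centroid(members: List[Point]) -> List[float]:
    """Column-wise mean of a non-empty list of points."""
    return [sum(col) / len(members) for col in zip(*members)]


def k_means(points: List[Point], k: int, seed: int = 0, max_passes: int = 100) -> List[int]:
    """Return a cluster label (0..k-1) for each point; needs len(points) >= k."""
    rng = random.Random(seed)
    # Random initial partition with roughly the same number of points per cluster.
    order = list(range(len(points)))
    rng.shuffle(order)
    labels = [0] * len(points)
    for position, index in enumerate(order):
        labels[index] = position % k

    centroids = [
        centroid([p for p, lab in zip(points, labels) if lab == c]) for c in range(k)
    ]
    for _ in range(max_passes):
        moved = False
        for i, p in enumerate(points):
            # Distance from this point to every cluster centroid.
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            if nearest != labels[i]:
                labels[i] = nearest  # move the point to the closest cluster
                moved = True
        # Recalculate centroids; an emptied cluster keeps its previous centroid.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = centroid(members)
        if not moved:
            break  # a complete pass with no point changing cluster
    return labels


# Hypothetical data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = k_means(points, k=2)
print(labels)
```

Changing `seed` changes the initial partition, which is one way to see how much the final clusters depend on the starting point; changing `k` tries a different number of groups.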

56
Data Analysis K-Means Clustering
  • Approximately 250 genes
  • Significant treatment effect
  • Partitioned into 10 clusters
  • Identify different expression patterns
  • Isolate individual groups
57
Data Analysis K-Means Clustering
  • Isolate individual groups
  • Find mean of each column
  • Plot mean
  • Graphs display different expression patterns
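
Finding the mean of each column for an isolated group gives one average expression profile per cluster, which is what gets plotted. A minimal sketch, assuming rows are genes, columns are chips, and cluster labels are already available; all values and the name `mean_profiles` are hypothetical.

```python
from typing import Dict, List


def mean_profiles(expression: List[List[float]], labels: List[int]) -> Dict[int, List[float]]:
    """Column-wise mean expression (one value per chip) for each cluster."""
    grouped: Dict[int, List[List[float]]] = {}
    for row, label in zip(expression, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


# Hypothetical values: three genes measured on three chips.
expression = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [-1.0, 0.0, 1.0]]
labels = [0, 0, 1]
profiles = mean_profiles(expression, labels)
print(profiles)
```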
58
Data Analysis Principal Component Analysis
  • Reduces the dimensionality of the data
  • Axis of greatest effect identified
  • Relation between this line and points determined
  • Repeat process
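
One way to make these steps concrete is power iteration on the covariance matrix, sketched below. This is an illustrative choice, not necessarily the method used for the slides: `first_component` finds the axis of greatest variance and `scores` gives each point's position along it. The data are hypothetical and lie exactly on the line y = 2x.

```python
import math
from typing import List


def center(rows: List[List[float]]) -> List[List[float]]:
    """Subtract each column's mean so the data are centered on the origin."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]


def first_component(rows: List[List[float]], iterations: int = 200) -> List[float]:
    """Unit vector along the axis of greatest variance, by power iteration."""
    centered = center(rows)
    n, dims = len(centered), len(centered[0])
    # Sample covariance matrix (needs at least two rows).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dims)]
           for a in range(dims)]
    # Arbitrary asymmetric starting guess; the method fails only if this
    # happens to be exactly orthogonal to the leading axis.
    vec = [1.0 / (j + 1) for j in range(dims)]
    start_norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / start_norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(dims)) for a in range(dims)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break  # no variance along the current direction
        vec = [x / norm for x in nxt]
    return vec


def scores(rows: List[List[float]], axis: List[float]) -> List[float]:
    """Position of each centered point along the axis (its projection)."""
    return [sum(v * a for v, a in zip(row, axis)) for row in center(rows)]


# Hypothetical data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
axis = first_component(data)
print(axis)
print(scores(data, axis))
```

Repeating the process means removing the variation already explained by this axis and searching the remainder for the next one; plotting the first two or three sets of scores is how high-dimensional chip data are reduced to a viewable picture.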

59
Data Analysis Principal Component Analysis Time Zero
60
Data Analysis Principal Component Analysis
61
Data Analysis Principal Component Analysis
62
Data Analysis Principal Component Analysis
63
Data Analysis Principal Component Analysis
64
Data Analysis Principal Component Analysis
65
Data Analysis Principal Component Analysis
66
Data Analysis Principal Component Analysis
67
Data Analysis Principal Component Analysis
68
Data Analysis Functional Annotation
  • Annotation
  • Gene description
  • thioredoxin reductase 1
  • Gene function (Ontology)
  • Signal transducer
  • Pathway information
  • Proteasome degradation

69
Data Analysis Functional Annotation
Genes within a cluster or influenced by a
treatment can be annotated
70
Data Analysis Functional Annotation
General
Specific
71
Data Analysis Functional Annotation