ENCODE Chromatin Replication Subgroup

1
ENCODE Chromatin / Replication Subgroup
Summary of Preliminary Analyses, July 18, 2005
2
Major Questions
How can chromatin & replication analysis deepen
and extend the current annotation of the human
genome?
How can large-scale chromatin/replication
analyses illuminate the biology of human gene
regulation and relate it to the physical
organization of the genome?
3
Overall Goals
Chromatin Accessibility (DNaseI)
Histone Modifications
Genes / Transcription
1° Sequence Features
Conservation
Replication
4
The ENCODE Chromatin/Replication Workshop Team
Shamil Sunyaev (Harvard) Conservation stats
Chris Spencer (Oxford) Recomb hotspot stats
Rob Andrews (Sanger) Histone mods
Greg Crawford (NHGRI) HS stats
Jay Greenbaum (BU) ORCHID stats
Terry Furey (Duke) CpG effects
Scot Kuehn (U Wash) Annotation pipeline
Ian Dunham (Sanger) Histone mods
Chris Taylor (UVa) Replication stats
Ankit Malhotra (UVa) Replication stats
Anindya Dutta (UVa) Replication stats
John Stam (Regulome) DNaseI stats
Bob Thurman (U Wash) Wavelet correlation pipeline
Mike Hawrylycz (Regulome/AIBS) Mutual information correlations
5
The Data Sets
DNaseI sensitivity / hypersensitivity (UW/Regulome, NHGRI)
Histone modifications (Sanger)
DNA Replication (UVa)
Transcription (Affy and Yale)
OH radical cleavage prediction (BU)
Recombination rate (Oxford)
Gencode (ENCODE Genes & Transcripts group/Havana)
6
Specific Aims and Approach
  • Define the union/intersection of major experimental data sets
    and genomic features/annotations
    • Approach: Genomic feature annotation pipeline (>30 features)
  • Segment continuous data types using a standardized approach
    • Approach: 2-, 3-, and 4-state HMM segmentation pipeline
  • Examine the short- and long-range correlations between
    major experimental data sets and genomic features
    • Approaches: Wavelet heatmap and correlation pipeline
    • Mutual information correlation pipeline

7
Distribution of DNaseI HSs vs. TSS in Different
Gene Annotations
8
DNaseI HSs, CpG islands, and Conservation
CpG
CNSs
HSs
9-22% (depending on definition of CNS)
25-64% (depending on definition of CpG island)
9
Sorting Out CpG Effects
Histone Mods
Tissue HSs
Gene Annotations
Transfrags / TARs
10
Functionality of CpG islands
conservative criteria
11
Frequency of different histone codes at TSSs
H4Ac H3Ac H3K4Me3 H3K4Me2 H3K4Me1
12
Histone H3 Methylation Code at Transcription
Start Sites
13
Histone H3 Methylation Combinatorial Code for
DNaseI HSs
14
DNA Replication & Chromatin
15
Replication dynamics of chromosomes
Time
16
TR50 Calculation
TR50: time at which 50% of the locus is replicated.
In the example, probe A has a TR50 of 1.25hr (80% at 2hr, 0% at 0hr);
probe B has a TR50 of 6.33hr (100% at 8hr, 40% at 6hr).
[Figure: example replication time courses for the two probes]
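Both worked values on this slide are consistent with linear interpolation between the two time points that flank 50% replication. The following is a minimal sketch under that assumption; the function name `tr50` and its arguments are hypothetical and are not taken from the slides.

```python
def tr50(timepoints_hr, percent_replicated, target=50.0):
    """Estimate the time at which `target` percent of a locus is replicated.

    Assumes linear interpolation between adjacent time points (an assumption
    that reproduces the slide's two examples, not a documented method).
    """
    pairs = list(zip(timepoints_hr, percent_replicated))
    for (t0, p0), (t1, p1) in zip(pairs, pairs[1:]):
        # Find the first interval in which replication rises through the target.
        if p0 <= target <= p1 and p1 > p0:
            return t0 + (target - p0) / (p1 - p0) * (t1 - t0)
    return None  # the profile never crosses the target


# Values from the slide: probe A (0% at 0hr, 80% at 2hr),
# probe B (40% at 6hr, 100% at 8hr).
probe_a = tr50([0, 2], [0, 80])
probe_b = tr50([6, 8], [40, 100])
print(round(probe_a, 2), round(probe_b, 2))  # prints: 1.25 6.33
```

Returning `None` when the profile never reaches the target keeps unreplicated or flat probes from being assigned an arbitrary time.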
17
  • TR50 improves the analysis of the data:
    • For segregating pan-S and various temporally specific segments
    • For defining chromatin domains
    • For predicting origins

18
Specific Classification
  • Within a specific region, we classify sub-regions
    as early, mid, or late based on the average TR50
  • For ENCODE regions:
    • 23% Pan-S
    • 77% Specific
      • 35% Early
      • 38% Mid
      • 27% Late

Taylor, Malhotra
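A minimal sketch of the early/mid/late call described on this slide. The slides do not state the cut points used, so the two thresholds below are purely hypothetical placeholders, as are the function and constant names; the pan-S criterion is not defined in the slides and is not modeled here.

```python
from statistics import mean

# Hypothetical cut points in hours (assumption; not given in the slides).
EARLY_MAX_HR = 3.5
MID_MAX_HR = 6.5


def classify_subregion(tr50_values, early_max=EARLY_MAX_HR, mid_max=MID_MAX_HR):
    """Label a sub-region by the average TR50 of its probes."""
    avg = mean(tr50_values)
    if avg < early_max:
        return "early"
    if avg < mid_max:
        return "mid"
    return "late"


print(classify_subregion([1.25, 2.0]))   # average 1.625 hr
print(classify_subregion([6.33, 8.0]))   # average 7.165 hr
```

With real data the thresholds would need to be chosen from the observed TR50 distribution rather than fixed in advance.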
19
Examples
ENm005
ENm012
20
TR50 for defining chromosomal domains
21
ENm005 - Temporal Profile of Replication
ENm005 Replicates with Possible origins
[Figure labels: A through L]
22
Time of replication confirmed by interphase FISH
Replicated
Unreplicated
23
Confirmation of replication time by interphase
FISH
Time points: 0hr, 2hr, 4hr, 6hr, 8hr, 10hr (early to late)
Karnani
24
Chromosomal domain & transcription
25
TR50 for defining origins
26
Sequence features of known metazoan replicators
IR (label repeated across the figure)
27
     Conf. Start   Conf. End   Difference   Avg. Pred.
A    5,065,180     5,092,935   27,755       5,082,768
B    5,178,000     5,219,000   41,000       5,178,000-5,219,000
C    5,263,570     5,292,750   29,180       5,271,110
D    5,366,290     5,460,645   94,355       5,399,952
E    5,543,905     5,568,880   24,975       5,558,825
F    5,650,750     5,681,275   30,525       5,667,011
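The table on this slide can be sanity-checked directly: the Difference column should equal Conf. End minus Conf. Start, and each single-coordinate Avg. Pred. value should fall inside its interval. The sketch below uses only the numbers from the slide; the variable and function names are hypothetical, and row B's prediction is left as `None` because the slide lists a range there rather than one coordinate.

```python
# (conf_start, conf_end, difference, avg_pred) per row, copied from the slide.
ROWS = {
    "A": (5_065_180, 5_092_935, 27_755, 5_082_768),
    "B": (5_178_000, 5_219_000, 41_000, None),  # slide gives a range, not a point
    "C": (5_263_570, 5_292_750, 29_180, 5_271_110),
    "D": (5_366_290, 5_460_645, 94_355, 5_399_952),
    "E": (5_543_905, 5_568_880, 24_975, 5_558_825),
    "F": (5_650_750, 5_681_275, 30_525, 5_667_011),
}


def check_row(start, end, difference, avg_pred):
    """Return (difference matches end - start, prediction inside interval or None)."""
    diff_ok = (end - start) == difference
    inside = None if avg_pred is None else (start <= avg_pred <= end)
    return diff_ok, inside


for label, row in ROWS.items():
    print(label, check_row(*row))
```

Every row passes the difference check, and each of the five point predictions lies within its interval.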
28
LCR
29
  • Correlation of replication dynamics with:
    • DNA features
    • Chromatin features

30
High gene-density correlates with early
replication
TR50
Gene Density (50 kb window)
31
High AT content correlates with late replication
TR50
AT Content
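As an illustration of the x-axis quantity on this slide, AT content can be computed as the fraction of A and T bases in fixed-size windows. This is only a sketch: the slides do not give the window size used for AT content, so `window` is a free parameter here, and the function name is hypothetical.

```python
def at_content(sequence, window):
    """Fraction of A/T bases in consecutive non-overlapping windows.

    A trailing partial window is dropped. Ambiguous bases such as N count
    as non-AT, which is a simplification.
    """
    sequence = sequence.upper()
    fractions = []
    for start in range(0, len(sequence) - window + 1, window):
        chunk = sequence[start:start + window]
        fractions.append((chunk.count("A") + chunk.count("T")) / window)
    return fractions


print(at_content("AATTGGCC", 4))  # prints: [1.0, 0.0]
```

Each window's value could then be paired with the TR50 of the probes it contains to examine the relationship shown on the slide.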
32
Predicted origins distributed equally in all
temporal segments
33
DNaseI hypersensitive sites correlate with early
replication (and with pan-S replication)
Taylor, Malhotra, Stam, Crawford, Collins, Kuehn,
Noble
34
DNaseI hypersensitive sites correlate with early
replication (and with pan-S replication)
35
Histone modification marks correlate with early
replication (and pan-S)
Taylor, Malhotra, Dunham, Stam, Kuehn, Noble
36
Are HeLa cell replication dynamics saying
something general about DNA/chromatin structure
across cell lines?
Why is pan-S replication correlated with DNaseI
hypersensitive sites, histone modifications, and
recombination hot spots?
Do predicted origins correlate with something:
proximity to genes, motifs, MCS, etc.?
37
Chromatin & Conservation
38
Conservation Patterns in CpG Islands
39
Conservation Patterns in HSs and HS-CpG Islands
40
Region-specific Variation in Conservation Patterns
CpG
-CpG
ENm006
ENm010
Without CpG islands
With CpG islands
41
Correlating Chromatin Features
42
Visualizing and quantifying higher-order
chromatin features using wavelet analysis
DNaseI Sensitivity
Scale of feature (kb)
1Mb
Wavelet analyses allow simultaneous visualization
of features and the scale over which they occur
43
Visualizing and quantifying higher-order
chromatin features using wavelet analysis
DNaseI Sensitivity
Gencode annotation
Scale of feature (kb)
1Mb
Wavelet analyses allow simultaneous visualization
of features and the scale over which they occur
44
Chromatin accessibility (DNaseI) vs. Histone
modifications
DNaseI
H3K4Me3
H3K4Me2
H3K4Me1
45
Chromatin accessibility (DNaseI) vs. Histone
modifications
DNaseI
H3K4Me3
H3Ac
H4Ac
46
Chromatin accessibility (DNaseI) vs. Histone
modifications
DNaseI
H3K4Me3
Correlation coefficient (-1 to 1)
H3K4Me2
H3K4Me1
Scale over which correlation occurs (kb)
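The idea of a correlation coefficient reported per scale can be illustrated with a small wavelet sketch. The slides do not say which wavelet or pipeline was used, so the Haar wavelet below is a swapped-in assumption chosen for simplicity, and all function names are hypothetical. Each signal is decomposed into detail coefficients per level, and the Pearson correlation is taken between the two signals' coefficients at each level.

```python
import math


def haar_details(signal):
    """Haar detail coefficients per level, finest scale first.

    The signal length must be a power of two.
    """
    approx = list(signal)
    levels = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        levels.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return levels


def pearson(x, y):
    """Pearson correlation, or None if undefined (fewer than 2 points or zero variance)."""
    n = len(x)
    if n < 2:
        return None
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def scale_correlations(x, y):
    """Correlation between two equal-length signals at each Haar scale."""
    return [pearson(dx, dy) for dx, dy in zip(haar_details(x), haar_details(y))]


track_a = [1, 3, 2, 5, 4, 6, 8, 7]
track_b = [2 * v for v in track_a]  # toy data: perfectly correlated at every scale
print(scale_correlations(track_a, track_b))
```

The coarsest level has a single coefficient, so its correlation is undefined and reported as `None`; real tracks would be far longer and each level would map to a scale in kb.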
47
Cross-Correlating Experimental and Genomic
Feature Sets
48
Wavelet correlations over 72 experimental/feature
pairs
Ubiquitous correlations (DNaseI vs. Histone Mods)
Region-specific correlations (high correlation is
specific to subset of ENCODE regions)
correlation coefficient > 0.7 (at one or more
scales)
49
Negative correlation between recombination rate
and chromatin accessibility
P < 0.005
50
Mutual Information (MI) correlation analysis: In
progress
DNaseI
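The slide marks the MI analysis as in progress and does not describe the estimator. As a generic illustration only, mutual information between two tracks can be estimated by discretizing each track and applying the standard plug-in formula over the joint counts. The equal-width binning and both function names below are assumptions, not the group's method.

```python
import math
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map continuous values to integer bin labels 0..n_bins-1 (equal-width bins)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete label sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written in terms of counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))  # prints: 1.0
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))  # prints: 0.0
```

Unlike a correlation coefficient, this measure also responds to non-linear dependence, though the plug-in estimate is biased upward for short tracks or many bins.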
51
Next Steps
  • Unraveling region-specific correlations: What
    features are driving the correlation?
  • Patterns of conservation in DNaseI HSs:
    Regional or focal?
  • Quantification of chromatin domains/features
    correlation with the gene/transcript annotation
  • Needed:
    • Further methods development (segmentation,
      correlation)
    • More experimental data from the same cell
      type(s)!