Title: Comparative genomics, ChIP-chip and transfections to find cis-regulatory modules
1. Comparative genomics, ChIP-chip and transfections to find cis-regulatory modules
What is conservation good for?
- Penn State University, Center for Comparative Genomics and Bioinformatics: Webb Miller, Francesca Chiaromonte, Ross Hardison
- Children's Hospital of Philadelphia: Mitch Weiss, Lou Dore
- NimbleGen: Roland Green, Xinmin Zhang
Cold Spring Harbor, March 2007
2. Ideal cases for interpretation by comparative genomics
3. Putative transcriptional regulatory regions (pTRRs)
- Antibodies vs 10 sequence-specific factors
- Sp1, Sp3, E2F1, E2F4, cMyc, STAT1, cJun, CEBPe, PU1, RA Receptor A
- High resolution ChIP-chip platforms: Affymetrix and NimbleGen
- Data from several different labs in ENCODE consortium
- High likelihood hits for ChIP-chip
- 5% false discovery rate
- Supported by chromatin modification data
- Modified histones in chromatin: H4Ac, H3Ac, H3K4me, H3K4me2, H3K4me3, etc.
- DNase hypersensitive sites (DHSs) or nucleosome depleted sites
- Result: set of 1369 pTRRs
4. Functional classes show distinctive trends in phylogenetic depth of conservation
5. Genes likely regulated by clade-specific pTRRs are enriched for distinctive functions
Percentage of pTRRs that align no further than (David King):
- Primates: 3%
- Eutherians: 71%
- Marsupials: 21%
- Tetrapods: 4%
- Vertebrates: 1%
Time axis on the figure (millions of years): 91, 173, 310, 450
6. Regulatory potential (RP) captures pattern, composition and constraint in alignments
Genome Research 16:1585 (2006)
- High RP for an aligned sequence means it contains patterns similar to those found in gene regulatory regions
- Positive training set: alignments of known regulatory regions
- Negative training set: alignments of likely neutral DNA (ancestral repeats)
- Human and mouse RP scores are on UCSC Genome Browser and PSU's Galaxy
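As a rough illustration of how a score of this kind can be built, the sketch below trains two first-order Markov models over alignment-column symbols, one on a positive set and one on a negative set, and scores a new alignment by the log-odds between them. This is a simplified stand-in, not the published RP method. The three-letter column encoding, the pseudocount, the model order, the toy training strings and the function names are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def train_markov(sequences, alphabet, pseudocount=1.0):
    """Estimate first-order transition probabilities P(b | a) with additive smoothing."""
    counts = {a: defaultdict(float) for a in alphabet}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1.0
    probs = {}
    for a in alphabet:
        total = sum(counts[a].values()) + pseudocount * len(alphabet)
        probs[a] = {b: (counts[a][b] + pseudocount) / total for b in alphabet}
    return probs

def log_odds_score(seq, pos_model, neg_model):
    """Sum of log2 probability ratios over adjacent symbol pairs.

    A score above zero means the sequence resembles the positive set more
    than the negative set.
    """
    return sum(
        math.log2(pos_model[a][b] / neg_model[a][b])
        for a, b in zip(seq, seq[1:])
    )

# Hypothetical encoding: one symbol per alignment column
# (M = match in all species, S = substitution, G = gap).
ALPHABET = "MSG"
positives = ["MMMMSMMMM", "MMMSMMMMM", "MMMMMMSMM"]   # toy "regulatory" alignments
negatives = ["MSGSGMSGS", "SGSMGSGSM", "GSMSGSMGS"]   # toy "neutral" alignments

pos_model = train_markov(positives, ALPHABET)
neg_model = train_markov(negatives, ALPHABET)

conserved_like = log_odds_score("MMMMMSMM", pos_model, neg_model)
neutral_like = log_odds_score("SGSMSGSG", pos_model, neg_model)
print(f"conserved-like: {conserved_like:.2f}, neutral-like: {neutral_like:.2f}")
```

With these toy inputs the conserved-looking string scores above zero and the gap-rich string scores below zero, which is the sense in which "positive RP" is used on the later slides.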
7. High RP plus conserved consensus motif is a good predictor of CRMs around GATA-1 regulated genes
Genome Research 16:1480 (2006)
8. Genes Co-expressed in Late Erythroid Maturation
G1E cells: proerythroblast line lacking the transcription factor GATA-1.
G1E-ER cells: rescued by expressing an estrogen-responsive form of GATA-1 (Rylski et al., Mol Cell Biol. 2003)
9. Predict CRMs based on alignment and expression of nearby genes
- Gene is up- or down-regulated by GATA-1
- Noncoding DNA sequence
- Aligns between mouse and other mammals and has a positive RP score
- Contains a conserved consensus binding site motif for GATA-1
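The criteria above amount to a conjunction of tests applied to each candidate segment. A minimal sketch, assuming a hypothetical `Segment` record whose field names are invented here to mirror the bullets (they do not come from any published pipeline):

```python
from dataclasses import dataclass

@dataclass
class Segment:
    # All field names are hypothetical, chosen only to mirror the listed criteria.
    near_gata1_responsive_gene: bool   # nearby gene is up- or down-regulated by GATA-1
    is_noncoding: bool                 # segment does not overlap coding sequence
    aligns_to_other_mammals: bool      # mouse segment aligns to at least one other mammal
    rp_score: float                    # regulatory potential score of the alignment
    has_conserved_gata1_motif: bool    # conserved consensus GATA-1 binding site present

def is_pre_crm(seg: Segment) -> bool:
    """Return True only when the segment satisfies every listed criterion."""
    return (
        seg.near_gata1_responsive_gene
        and seg.is_noncoding
        and seg.aligns_to_other_mammals
        and seg.rp_score > 0
        and seg.has_conserved_gata1_motif
    )

candidate = Segment(True, True, True, 0.12, True)
low_rp = Segment(True, True, True, -0.05, True)
print(is_pre_crm(candidate), is_pre_crm(low_rp))
```

Keeping each criterion as a separate field makes it easy to relax one test at a time, for example to compare segments with and without a conserved motif.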
10. preCRMs with conserved consensus GATA-1 BS tend to be active on transfected plasmids
11. DNA segments with positive RP and a GATA-1 binding motif validate as enhancers at a good rate
RP        Consensus motif  Tested  Validated  Success (%)
Positive  conserved        44      23         52
Positive  mouse            6       4          67
Negative  conserved        6       1          17
Negative  none             17      0          0
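The success figures are simply validated divided by tested, rounded to a whole percent. A quick check of that arithmetic, using only the counts given above (the variable names are illustrative):

```python
# (RP class, motif class, tested, validated), copied from the slide
rows = [
    ("Positive", "conserved", 44, 23),
    ("Positive", "mouse", 6, 4),
    ("Negative", "conserved", 6, 1),
    ("Negative", "none", 17, 0),
]

# Success rate as a rounded percentage for each category
success = {
    (rp, motif): round(100 * validated / tested)
    for rp, motif, tested, validated in rows
}

for (rp, motif), pct in success.items():
    print(f"{rp:8s} {motif:10s} {pct:3d}%")
```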
12. Design of ChIP-chip for occupancy by GATA-1
- Non-overlapping tiling array with 50bp probe and 100bp resolution (NimbleGen)
- Cover range: mouse chr7:57225996-123812258 (70Mbp)
- Antibody against the ER portion of GATA-1-ER protein in rescued G1E-ER4 cells
Yong Cheng (PSU), with Mitch Weiss, Lou Dore (CHoP), Roland Green, Xinmin Zhang (NimbleGen)
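For a sense of scale, the stated interval and spacing imply a probe count in the hundreds of thousands. The sketch below assumes that "100bp resolution" means one 50 bp probe starting every 100 bp. That reading is an assumption, and the real array design (repeat masking, probe selection) is not described on the slide.

```python
REGION_START = 57_225_996     # chr7 start coordinate from the slide
REGION_END = 123_812_258      # chr7 end coordinate from the slide
PROBE_LENGTH = 50             # bp, from the slide
PROBE_SPACING = 100           # bp between probe starts (assumed meaning of "resolution")

span_bp = REGION_END - REGION_START
n_probes = span_bp // PROBE_SPACING          # upper bound, ignoring masked sequence
fraction_covered = PROBE_LENGTH / PROBE_SPACING

print(f"span: {span_bp / 1e6:.1f} Mbp")
print(f"probes (upper bound): {n_probes:,}")
print(f"fraction of bases under a probe: {fraction_covered:.0%}")
```

The coordinates span about 66.6 Mbp, which the slide rounds to 70 Mbp.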
13. Signals in known occupied sites in Hbb LCR
Figure panels: HS1, HS2, HS3
1) Cluster of high signals
2) Hill shape of the signals
14. ChIP-chip hits are high quality and tend to have GATA-1 binding motifs
- Peak calling by Mpeak (Ren) and Tamalpais (Bieda and Farnham) gave 321 ChIP-chip hits
- 19 hits were tested by qPCR
- 13 were validated (70%)
- 267 out of the 321 (83%) have WGATAR motifs, the binding site for GATA-1
- Random sampling on average gives 102 DNA segments with the motif
- The ChIP-chip hits are 2.6-fold enriched for the GATA-1 binding site motif
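The motif scan and the enrichment figure can be reproduced in a few lines. The sketch below expands the IUPAC consensus WGATAR (W = A or T, R = A or G) into a regular expression, also searches the reverse complement YTATCW, and recomputes the fold enrichment from the counts on this slide. The function names and the example sequence are made up for illustration, and this is not necessarily how the original scan was done.

```python
import re

# WGATAR on the forward strand; its reverse complement is YTATCW (Y = C or T).
# Lookahead groups are used so that overlapping matches are all reported.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")

def wgatar_positions(seq):
    """Return sorted (0-based start, strand) pairs for every WGATAR match."""
    seq = seq.upper()
    hits = [(m.start(), "+") for m in FORWARD.finditer(seq)]
    hits += [(m.start(), "-") for m in REVERSE.finditer(seq)]
    return sorted(hits)

def fold_enrichment(observed, expected):
    """Ratio of observed motif-containing segments to the random expectation."""
    return observed / expected

example = "ccTGATAAggCTATCTtt"            # hypothetical sequence with one site per strand
print(wgatar_positions(example))

hits_with_motif, total_hits, random_expectation = 267, 321, 102   # from the slide
print(f"{hits_with_motif / total_hits:.0%} of hits contain the motif")
print(f"{fold_enrichment(hits_with_motif, random_expectation):.1f}-fold enrichment")
```

Searching both strands matters because a site's orientation relative to the reference sequence is arbitrary.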
15. Only HALF the GATA-1 binding site motifs are conserved outside rodents
- Of the GATA-1 binding motifs in those 249 hits, 112 (45%) are conserved between mouse and at least one non-rodent species.
16. Distribution of ChIP-chip hits on 70Mb of mouse chr7
Yong Cheng, Yuepin Zhou and Christine Dorman
17. 36% of ChIP-chip hits act as enhancers in K562 cells
21 out of 59 ChIP-chip hits increase activity of HBGpr-Luc in K562 cells.
Figure labels: GATA-1 occupied sites by ChIP-chip; No GATA-1; values marked on the plot: 14.5, 5.7
18. 30% of ChIP-chip hits act as enhancers in MEL cells
15 out of 50 ChIP-chip hits increase activity of HBGpr-Luc in MEL cells.
Figure labels: GATA-1 occupied sites by ChIP-chip; No GATA-1
19. Validated ChIP hit, enhancer, deep conservation
20. Validated ChIP hit, enhancer, limited conservation
21. ChIP-chip hit, enhancer, rodent specific
22. Test of neutrality using polymorphism and divergence data
23. A promoter distal to the beta-like globin genes has a signal for recent purifying selection
24. The distal promoter is close to the locus control region for beta-globin genes
25. Evolutionary approaches to predicting and analyzing regulatory regions
- Sequence comparison alone will not detect all regulatory regions
- Need comprehensive protein-binding data
- Comparative genomics can help interpret the binding data
- Aspects of regulation of some functional groups are clade-specific
- Depth of conservation may correlate with certain types of function
- Strong constraint on basal mechanisms?
- Lineage-specific fine tuning?
- A majority of sites occupied by GATA-1 in G1E-ER cells have some function other than enhancement (by our assays)
- Incorporation of pattern and composition information along with conservation can lead to effective discrimination of functional classes (regulatory potential).
26. Many thanks
PSU Database crew: Belinda Giardine, Cathy Riemer, Yi Zhang, Anton Nekrutenko
B: Yong Cheng, Ross, Yuepin Zhou, David King
F: Ying Zhang, Joel Martin, Christine Dorman, Hao Wang
RP scores and other bioinformatic input: Francesca Chiaromonte, James Taylor, Shan Yang, Diana Kolbe, Laura Elnitski
Alignments, chains, nets, browsers, ideas: Webb Miller, Jim Kent, David Haussler
Funding from NIDDK, NHGRI, Huck Institutes of Life Sciences at PSU
27. Categories of Tested DNA Segments
28. Regulatory potential (RP) to distinguish functional classes
29. Examples of validated preCRMs
30. ChIP-chip hits for GATA-1 occupancy
Technical replicates of ChIP-chip with antibody against GATA1-ER
Venn diagram: Mpeak, 275 hits in both; TAMALPAIS, 276 hits in both; region counts 216, 60, 59
321 total ChIP-chip hits
19 ChIP-chip hits were tested by qPCR; 13 were validated (70%)