Title: ABSTRACT
Causal analysis of the androgen-induced
transcriptional program of human prostate cancer
Dexter Pratt1, William Hahn2, Jack Pollard1,
Andrea Matthews1, Phillip Febbo2, Ranann Berger2,
Brian Duckworth1, Josh Levy1, Toby Segaran1,
Justin Sun1, David Kightley1, Bill Ladd1, Jacques
Pappo1, and Keith Elliston1
1Genstruct, Inc., Cambridge, MA 02140
2Department of Medical Oncology, Dana-Farber
Cancer Institute, Boston MA 02115
Androgen-induced gene expression changes
explained by upstream causal networks
ABSTRACT A critical event in the progression of
prostate cancer (CaP) is the transition from
androgen-dependent to androgen-independent
growth. The LNCaP cell line is a model for
androgen-dependent CaP, and many expression
profiling experiments have measured its
transcriptional response to androgen stimulation.
Gene expression microarray experiments typically
show changes in hundreds, even thousands of
genes, making it difficult to develop
systematically coherent hypotheses to explain the
observed changes. Here we describe a causal
network model of prostate cancer biology and the
use of expression profiling data from an
experiment in which LNCaP cells were stimulated
with synthetic androgen. The data were used to
interrogate the model and generate rational and
testable hypotheses for the mechanism of
androgen-stimulated cell proliferation in prostate cancer.
Automated causal analysis was used to define the
upstream network of molecular events that could
result in the observed expression changes. The
generated hypotheses were ranked by the
concordance of their predictions with the
expression dataset. In contrast to previous LNCaP
studies in which genes were hierarchically
clustered by their pattern of response (1), the
causal analysis approach identifies possible
explanations in terms of discrete molecular
mechanisms. We have defined changes in cell
proliferation and fatty acid synthesis
transcriptional control mechanisms based on the
expression changes in transcriptional targets of
proteins such as RB1, E2F1/2/3, and SREBF1/2.
Further analysis has identified multiple causal
pathways linking the activity of the androgen
receptor (AR) to these processes, providing
succinct, testable hypotheses for subsequent
experiments. These findings show the utility of
knowledge assembly models and causal reasoning,
and suggest a powerful new approach for
interpreting molecular profiling data in drug
discovery.
ANDROGEN STIMULATION AND EXPRESSION PROFILING
LNCaP cells were cultured in the presence or
absence of the synthetic androgen R1881 (0.1 nM),
and RNA was sampled at 3, 6, 12 and 24 hr
post-stimulation. Gene expression profiling of
the RNA samples was performed using Affymetrix
HU133a chip sets. Raw intensity data were
analyzed in R using the
Bioconductor affy and limma packages, starting
with intra-array normalization, background
correction, low-intensity filtering and scaling.
Differentially expressed genes were selected
using gene-by-gene ANOVA, then filtered by
fold-change (> 1.3) and p-value (< 0.01).
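The final filtering step can be illustrated with a minimal Python sketch. The rows, gene names, and the helper `select_changed` are hypothetical placeholders, not data or code from the experiment, and the sketch assumes the fold-change cutoff is applied symmetrically to increases and decreases, which the text does not state.

```python
# Hypothetical per-gene summaries: (gene, mean treated, mean control, ANOVA p-value).
# These values are illustrative placeholders, not measurements from the study.
rows = [
    ("GENE_A", 260.0, 100.0, 0.001),
    ("GENE_B", 120.0, 100.0, 0.0005),
    ("GENE_C", 50.0, 100.0, 0.004),
    ("GENE_D", 300.0, 100.0, 0.2),
]


def select_changed(rows, min_fold=1.3, max_p=0.01):
    """Return {gene: +1 or -1} for genes passing both thresholds."""
    selected = {}
    for gene, treated, control, p_value in rows:
        ratio = treated / control
        # Assumption: a decrease counts if its inverse ratio exceeds the cutoff.
        fold = max(ratio, 1.0 / ratio)
        if fold > min_fold and p_value < max_p:
            selected[gene] = 1 if ratio > 1.0 else -1
    return selected


changes = select_changed(rows)
```

The signed output (+1 increased, -1 decreased) is the form in which observed changes can be assigned to expression nodes for causal analysis.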
CAUSAL ANALYSIS OF EXPRESSION CHANGES Reverse
causal analysis generated hypotheses for the
observed changes by assigning an increased or
decreased state to individual nodes
representing the expression of each gene. The KA
model was then interrogated to hypothesize
changes in upstream control points which could
explain the observed changes. Each hypothetical
change identified in this way became a candidate
hypothesis, which was then ranked by comparing
its predictions against the observed expression
changes.
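The ranking idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Genstruct implementation: the node names, the `causal_targets` map, and the simple "correct minus contradictory" score are all hypothetical stand-ins for the model and for the concordance statistic used in the study.

```python
# Hypothetical causal knowledge: upstream node -> {target gene: sign of its
# effect on that gene's expression}. Names and signs are placeholders.
causal_targets = {
    "HYP_X": {"G1": 1, "G2": 1, "G3": -1},
    "HYP_Y": {"G2": -1, "G4": 1},
}

# Hypothetical observed expression changes: +1 increased, -1 decreased.
observed = {"G1": 1, "G2": 1, "G3": -1, "G4": -1}


def score(targets, observed, direction):
    """Count predictions that agree or disagree with the observations."""
    correct = contradictory = 0
    for gene, sign in targets.items():
        if gene not in observed:
            continue  # no measurement, so no evidence either way
        if sign * direction == observed[gene]:
            correct += 1
        else:
            contradictory += 1
    return correct, contradictory


def rank_hypotheses(causal_targets, observed):
    """Score each node as increased (+1) and decreased (-1), best first."""
    ranked = []
    for node, targets in causal_targets.items():
        for direction in (1, -1):
            correct, contradictory = score(targets, observed, direction)
            ranked.append((correct - contradictory, node, direction))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


ranked = rank_hypotheses(causal_targets, observed)
```

In this toy example the top candidate is "HYP_X increased", because all three of its predictions match the observed changes.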
RESULTS 694 genes were found to be differentially
expressed in LNCaP at 24 hours following androgen
stimulation. The LNCaP KA model contained causal
transcriptional regulation information for a
total of 385 of the 694 genes (55%). Of these 385
changes, 104 can be explained by the combined RB1
and SREBF hypotheses.
CONCLUSIONS Of the 694 genes whose expression
changed in LNCaP, the KA model represented
transcriptional control information for 385
(55%). Reverse causal analysis inferred
transcriptional control hypotheses for the 385
genes and ranked these hypotheses by statistical
significance. Two of the top-ranking
hypotheses explained 104 expression
changes (27% of 385). The analysis defined
networks which link these hypotheses to the
activity of the androgen receptor (AR). These
networks are composed of chains of causal
relationships which form testable hypotheses. To
name one testable hypothesis out of many, the
model predicts that inhibition of fatty acid
synthase (FASN) would decrease proliferation in
response to androgen treatment. The LNCaP
Knowledge Assembly™ model can be redeployed for
analysis of systems involving similar pathways,
such as cell models of androgen-independent
prostate cancer. The underlying knowledge, such
as causal mechanisms of cell cycle control, is
reusable in many contexts.
Fig.2 Causal Map showing pathways downstream from
the androgen receptor (AR) which are consistent
with experimentally observed gene expression
changes
MATERIALS AND METHODS A Knowledge Assembly™ (KA)
model describing LNCaP human prostate cancer
biology was constructed by a process of data
mining and curation (Fig. 1A) to select and
represent experimental findings using Biological
Case Frames (Fig. 1B). KA models are very large
computable models of biological interactions,
specifically capturing cause and effect
relationships such as transcriptional control,
catalysis, or post-translational modification.
The KA model was enriched in areas such as
prostate cancer-associated genes, AR signaling,
cell cycle transcriptional control, and fatty
acid synthesis. It contained 56,893 nodes and 154,440
connections, and included 15,313 cause and effect
relationships (Fig. 1C). The competency of the
model was assessed by forward causal analysis,
testing predictions based on perturbation of
well-studied pathways.
THE AR -> RB1 STORY
86 of the 385 expression changes (22%) can be
explained by the phosphorylation of RB1 (Fig.
2C), which decreases its transcriptional
repressor activity and increases the
transcriptional activity of E2F1/2/3. The RB1-E2F
system is the key driver of the G1/S
transcriptional program and its activation is
consistent with the observed increase in
proliferation of the androgen-treated LNCaP
cells. Causal analysis was used to explore the
causal networks that can link the stimulation of
the androgen receptor to cell cycle regulation
through the phosphorylation of RB1, and
subsequently to the activation of E2F1/2/3.
This analysis defined the network shown in Fig.
2A, which consists of possible AR -> cell cycle
paths known to the LNCaP KA model within a
12-step search. Of the many distinct paths shown,
some are additionally supported by observed
increases in the expression of ZBTB16, ODC1,
Cyclin D3 and E2F1.
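A step-limited path search of this kind can be sketched as follows. The graph is a toy example: the intermediate nodes `X1` to `X3` and the target label `RB1_phos` are hypothetical placeholders, not relationships taken from the KA model, and the function `paths_within` is an illustrative name.

```python
def paths_within(graph, source, target, max_steps):
    """Enumerate simple directed paths from source to target using at most
    max_steps edges (depth-limited search with an explicit stack)."""
    found = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            found.append(path)
            continue
        if len(path) - 1 >= max_steps:
            continue  # step budget exhausted on this branch
        for nxt in graph.get(node, ()):
            if nxt not in path:  # keep paths simple (no revisits)
                stack.append((nxt, path + [nxt]))
    return found


# Toy directed graph with hypothetical intermediate nodes.
toy_graph = {
    "AR": ["X1", "X2"],
    "X1": ["X3"],
    "X2": ["RB1_phos"],
    "X3": ["RB1_phos"],
}

all_paths = paths_within(toy_graph, "AR", "RB1_phos", max_steps=12)
short_paths = paths_within(toy_graph, "AR", "RB1_phos", max_steps=2)
```

Paths returned by such a search could then be prioritized by how many of their intermediate nodes show supporting expression changes, as the text describes for ZBTB16, ODC1, Cyclin D3 and E2F1.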
THE FATTY ACID SYNTHESIS STORY
17 of the 385 expression changes can be explained
by the activation of one or more of the SREBF
transcription factors. Further analysis
identified a causal pathway where AR increases
the transcription of SCAP, which in turn
facilitates the translocation, cleavage, and
activation of SREBF1a, SREBF1c, and SREBF2 (Fig. 2B). This
hypothesis is additionally supported by the
observed increase in the expression of SCAP.
Increased fatty acid synthesis, in particular an
increase in FASN activity, can cause SKP2 to
increase the degradation of p27, leading to
increased CDK activity which drives the cell
cycle and proliferation.
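The chain from FASN to proliferation can be checked with a small forward sign-propagation sketch. The edge signs below are read from the sentence above (SKP2 promotes p27 degradation, and p27 loss raises CDK activity); the list `chain` and the function `propagate` are illustrative names, and the sketch assumes each node has a single upstream cause.

```python
# Signed causal edges as read from the text: +1 = increases, -1 = decreases.
chain = [
    ("FASN", "SKP2", 1),           # FASN activity increases SKP2 action
    ("SKP2", "p27", -1),           # SKP2 increases degradation of p27
    ("p27", "CDK", -1),            # p27 restrains CDK activity
    ("CDK", "proliferation", 1),   # CDK activity drives proliferation
]


def propagate(chain, start, change):
    """Propagate a +1/-1 perturbation of `start` along ordered signed edges."""
    state = {start: change}
    for source, target, sign in chain:
        if source in state:
            state[target] = state[source] * sign
    return state


# Predicted consequences of inhibiting FASN (change = -1).
prediction = propagate(chain, "FASN", -1)
```

Under these assumptions the net sign from FASN to proliferation is positive, so a decrease in FASN activity propagates to a predicted decrease in proliferation.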
REFERENCES
1. DePrimo SE, Diehn M, Nelson JB, Reiter RE,
Matese J, Fero M, Tibshirani R, Brown PO, Brooks
JD. Transcriptional programs activated by
exposure of human prostate cancer cells to
androgen. Genome Biol. 2002 Jun 14;3(7):
research0032.1-0032.12.
Fig.3 High-Ranking Transcriptional Hypotheses