Title: Consensus on Platform draft
1 Main objectives of this meeting
- Consensus on Platform draft first phase
- How to get feedback from users
- Data/knowledge base
- Website / deliverables
- How to fill the knowledge base
2 Wet-lab experiments (DI)
- Functional Genomics Technologies
- Transcript microarrays custom array
- Genome tiling arrays for ChIP analysis
- Y2H / TAP complex information
- Experiments
- Optimise synchronisation protocol(s)
- produce 24-sample aphidicolin-synchronised dataset - livey, gejae
- produce unspecified number of timepoints post-DNA damage, asynchronous culture sekus? gejae?
- nitrogen or phosphorus starvation G1 phase release
- transcriptional response to proliferative or antiproliferative stimuli.
- Asynchronous cells after hormone treatment? Nutrient depletion?
- identification of dynamics of alternative splice variants
- Y2H and TAP-tag analysis
- Functional validation with RNAi, GFP fusions etc.
- Hypothesis validation by wet-lab experiments
3 Data Integration (MK)
- Customise GIN-db
- classification and annotation of components (VIB, Brunak)
- Detailed structural analysis / comparison of gene sequences
- Microarray data pre-processing pipeline CAGE
- Microarray pre-processing Affymetrix
- PANDORA (Linial) integrated information including quantitative info
- harvest all available data on cell cycle control, in all organisms, for key components (100)
- link with COMBIO and MITOCHECK
- feed data into mature GIN-db
- Set up (limited) web service
4 Data Mining (SB)
- Network stacking: take yeast network and overlay plant and human networks
- analysis of functional sub-networks by simulations
- Combinatorial analysis of promoter elements
- text mining of abstracts, titles, full text
- extract directionality and interaction types from text
- comparative analysis of cell cycle genes
- mining transcriptome data
- linking to databases (KEGG, Transpath, ProtFun)
- analysis of promoters (GIBBS, saco-patterns, TRANSFAC)
- Mining proteome data INTACT
- Design targeted deregulation of cell cycle
5 Cell cycle simulation and modelling (DT)
- Determine regulatory graphs
- Qualitative dynamical models by logical formalism, PLDE
- Quantitative modelling by ODE, stochastic equations
- Exploit Cytoscape as visualisation environment
- Explore alternative visualisation strategies
- Assemble all available knowledge on Arabidopsis cell cycle into series of core network modules
- Assemble all available knowledge on cell cycle of human and yeast into series of core network modules
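As a minimal sketch of what "quantitative modelling by ODE" can look like in practice, the Python snippet below integrates a single synthesis/degradation equation, dx/dt = k_syn - k_deg * x, with a fixed-step Euler scheme. The equation, the parameter names (`k_syn`, `k_deg`) and the values used are illustrative assumptions and are not taken from the project; a real cell cycle module would couple many such equations.

```python
def simulate(k_syn, k_deg, x0=0.0, dt=0.01, t_end=50.0):
    """Euler integration of dx/dt = k_syn - k_deg * x.

    All names and values are hypothetical, chosen only to illustrate
    the technique; they do not describe a real cell cycle component.
    """
    steps = int(round(t_end / dt))
    x = x0
    trajectory = [(0.0, x)]
    for i in range(1, steps + 1):
        # Explicit Euler update: x(t + dt) = x(t) + dt * dx/dt
        x += dt * (k_syn - k_deg * x)
        trajectory.append((i * dt, x))
    return trajectory


# Hypothetical parameters: the analytical steady state is k_syn / k_deg.
trajectory = simulate(k_syn=2.0, k_deg=1.0)
final_time, final_level = trajectory[-1]
print(f"level at t={final_time:.0f}: {final_level:.4f}")
```

The fixed-step Euler scheme is used here only because it is easy to read; stiff or larger models would normally call for an adaptive solver.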
6 Cell Cycle portal (MA)
- Integration of computational resources as web-service www.Sbcellcycle.org
- Windowing system, imaging model, API
7 Action List (1)
- Establish contact with coordinators of related projects (ASAP)
- DIAMONDS web site available (3 months)
- www.SBcellcycle.org
- Definition of the essential design characteristics of the toolbox (6 months)
- Survey Report on state of the art classification methods (6 months)
- Set of Guidelines for data validation and assessing of confidence of prediction (6 months)
- GUI draft (milestone, 9 months)
8 Action List (2)
- Optimised synchronisation protocol for yeasts, Arabidopsis, human cells (12 months).
- Methods for data integration and functional classification and prediction (12 months).
- Algorithms for transcription profile correlation and graph-based clustering (12 months).
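The last action item (transcription profile correlation and graph-based clustering) can be illustrated with a small sketch. The Python code below is one simple way to do it, not the project's algorithm: it computes Pearson correlations between expression profiles, links genes whose correlation exceeds a threshold, and reports the connected components of that graph as clusters. The gene names, profile values and the 0.9 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long profiles (0.0 if one is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_clusters(profiles, threshold=0.9):
    """Link genes with correlation >= threshold; return connected components."""
    genes = sorted(profiles)
    neighbours = {gene: set() for gene in genes}
    for i, gene in enumerate(genes):
        for other in genes[i + 1:]:
            if pearson(profiles[gene], profiles[other]) >= threshold:
                neighbours[gene].add(other)
                neighbours[other].add(gene)
    seen, clusters = set(), []
    for gene in genes:
        if gene in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [gene], []
        seen.add(gene)
        while stack:
            current = stack.pop()
            component.append(current)
            for nb in neighbours[current]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        clusters.append(sorted(component))
    return clusters


# Hypothetical expression profiles over five timepoints.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 2.0, 1.0],
    "geneB": [2.0, 4.1, 6.0, 3.9, 2.0],
    "geneC": [3.0, 2.0, 1.0, 2.0, 3.0],
    "geneD": [5.0, 5.0, 5.0, 5.0, 5.0],
}
clusters = correlation_clusters(profiles)
print(clusters)
```

Connected components are the simplest graph-based grouping; denser criteria (cliques, probabilistic clustering) would replace the traversal step without changing the correlation graph itself.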
9 Platform Core
- Transcriptome analysis Expression profiler / GenePublisher
- Extensions on analysis, visualisation, context (databases, literature)
- Data pre-processing tools
- Data uploading tools
- Proteomics analysis
- Y2H data
- Complex info from TAP-tag
- Extensions on analysis, visualisation (graphs), context (databases, literature)
- GIN-db
- Information uploading tools
- Text mining tools
- Simulation tools
- GIN-sim
- SIM-plex
- ODE
10 Wet-lab experiments (DI)
Four partners (DI, KH, SB, JB) will optimise or
use synchronised cells (Arabidopsis, human cells,
budding yeast, and fission yeast, respectively)
for the extraction of RNA and microarray-based
expression profiling, to monitor in detail the
dynamics of transcriptome events. In addition,
TAP technology will be used, focusing on the key
protein components of the cell cycle (DI, KH).
The data will be subsequently pre-processed,
annotated and integrated into the central cell
cycle data warehouse (WP2). Based on the
dynamical simulations performed in WP4,
additional perturbations will be designed, and
carried out by partners DI, KH, SB, and JB. For
this later stage we will design custom arrays,
with fewer features, to generate fine
transcription maps focusing on cell cycle
components (DI, SB). The perturbations that we
will apply on A. thaliana include release from
mitotic and DNA replication arrest, DNA damage,
nutrient starvation, and auxin/cytokinin (DI). In
fission yeast and budding yeast we will study
cells entering stationary G0 phase (due to
starvation) and cells re-entering the cell cycle
after re-feeding, and we will use cell cycle
mutants (SB, JB). Perturbations applied to human
fibroblast cells include release from mitotic and
DNA replication arrest, DNA damage, and/or the
use of cells separated by elutriation (KH). To
complement the feature finding of WP2
(clustering, and cis-element analysis) partners
DI, KH (in collaboration with Richard Young,
MIT), and JB will design specific ChIP
experiments for global mapping of cell cycle
control transcription factor binding sites. For
final validation, siRNA technology will be used
(DI, KH) on a hundred selected genes involved in
the control of the cell cycle.
11 Wet-lab experiments PSB
Exp. PSB: Synchronised cells (aphidicolin), 0, 2, 4, 6, 8, 10, 12, 14, 16, 18 hrs
http://exgen.ma.umist.ac.uk/cgi-bin/R.cgi/EDv3.R
- Wet-lab experiments
- Synchronisation experiments data pre-processing
- Synchronisation optimisation, deliverable!
12 Follow-up experiments
- Block cells with hydroxyurea or propyzamide, nocodazole, colcemid
- Analysis of mutants, double mutants
- Col-0
- DEL1 / overexpresser (2)
- DEL1 / Knock-out
- KRP2 / overexpresser (2 strong, 2 weak)
- CDKB11.N161 (2)
- CCS52A2KO
- ATM mutant, UV
13 Kristian Helin
- DNA damage experiments.
- treat wild type and p53 mutated diploid fibroblasts with UV or ionizing radiation
- Affy gene expression profiling after different lengths of time.
- What is the required number of replicates?
14 Data Integration (MK)
In order to produce the data warehouse for all
cell cycle data (either novel, from WP1, or
extracted from text or existing databases, or
through mining activities in WP3), GIN-db
relational database will be extended and
customised to support all model organisms of the
project (DT, MK). This knowledge integration
platform will be connected to relevant local or
public databases (BASE for microarrays,
ArrayExpress, ProtoNet, ...) through data
transfer pipelines (MAGE-ML, other XML type). All
novel data produced within the project will be
subjected to primary analysis: microarray data
normalization, statistical analysis, curation,
annotation, clustering, classification and
visualization, with an emphasis on new analysis
tools (MK, KH, AB, SB, EH, ML, AV). Common data
formats, descriptions, and curation standards
will be implemented (MIAME, GO, other). We will
extract the relevant cell cycle information from
literature and from other databases (EH, AV). For
particular types of data retrieved through text
mining we will consider the construction of
ontologies from text (AV). We will identify and
integrate cell cycle components in other
organisms as data becomes available (partners MK,
DT, SB, EH, JB). We will train a model for
identification of cell cycle proteins without
obvious sequence similarity to known cell cycle
proteins, using sequence-derived protein features
(SB, ML, AV). We will combine the protein- and
transcript clustering results with sequence
analysis, for identification of promoter motifs,
and protein domains (MK, DT, SB, AB, ML, AV). For
the annotation of novel and prior data on the
(cell cycle) proteome we will use an integrated
classification (ML): the partners will survey the
techniques for protein classification and data
mining, to develop a consolidated classification
methodology; methods will be devised for
measuring the validity of predictions, and later
on for providing a measure of confidence; and
post-translational information will be extracted
based on proteomics experimental data (ML,
PANDORA). We will provide web-servers (and to
some extent web-services) to allow researchers to
access classification and annotation tools (DT,
EH, ML, MA). Naturally, gene annotation will
focus on cell-cycle regulators and regulatees
(DI, KH, DT, AB, SB, ML). Finally, the data
warehouse will be embedded in the integrative
layer that will be developed in WP5 (MA, in
collaboration with all other partners).
15 Data Mining (SB)
The components in the dynamical profiles
(transcriptome, protein complexes) from WP1 will
be clustered and analysed for periodicity, and
matched with functional annotation (to e.g. KEGG,
TRANSPATH, ProtFun predictions) (MK, YP, KH, SB,
ML, AV). Next to prior knowledge on cell cycle
mode of action, we will delineate further
conserved and organism-specific dynamical
patterns in S. cerevisiae and S. pombe (SB, AB,
JB). By applying feature selection tools we want
to identify subnets, crosstalk and experimental
information content (partners DI, KH, SB). A
major effort will be spent on functional
annotation of transcriptional networks, using the
characterised putative promoter elements of WP2,
in combination with the dynamical transcriptome
events and ChIP data obtained in WP1.
Complementary to this, transcriptional networks
will be extracted from a combination of
literature and transcription data that will be
gathered from available databases (EH). We will
use combinatorial analysis of promoter elements,
to allow a multivariate correlation with global
transcriptome dynamics (MK, KH, DT, SB). To
determine interactions between cell cycle genes
we will use graph-based probabilistic and
combinatorial clustering techniques, combining
microarray data, protein complex information and
GO (partners MK, DT, SB, AB). The integrated data
resource generated in WP2 will also be used to
identify and compare transcriptional programs and
other regulatory mechanisms, such as protein
degradation and activation/deactivation by
post-translational modifications. An additional
approach to predict protein interaction networks
is based on genome prediction methods that use
information from multiple sequence alignments
(AV). We will pursue the identification of
possible partner proteins to build
three-dimensional models of the corresponding
complexes, their domain decomposition, and the
careful construction of corresponding alignments
between target and sequences to model (AV). For
this, we will use structural models of key CC
proteins to map the binding characteristics
differentiating binders from non-binders
(AV). Subsequently, we will identify the regions
of interaction and the key residues responsible
for the specificity of interaction (AV), as a
prelude to targeted deregulation (or therapeutic
perturbation) of the cell cycle.
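To make the "analysed for periodicity" step concrete, the Python sketch below scores how well a single expression profile is described by one sinusoid of a given period, using a least-squares fit and reporting the fraction of variance explained. This is only one of several possible periodicity measures and is not claimed to be the method used in the project. The sampling times follow the 0 to 18 hrs, 2-hourly scheme of the aphidicolin experiment; the 10 h period and both example profiles are assumptions made purely for illustration.

```python
from math import cos, sin, pi


def periodicity_score(times, values, period):
    """Fraction of variance explained by a sinusoid of the given period.

    Fits values ~ mean + a*cos(wt) + b*sin(wt) by least squares and
    returns R^2 (0.0 for a flat profile or a degenerate design).
    """
    omega = 2 * pi / period

    def centre(vector):
        mean = sum(vector) / len(vector)
        return [v - mean for v in vector]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    c = centre([cos(omega * t) for t in times])
    s = centre([sin(omega * t) for t in times])
    y = centre(values)
    scc, sss, scs = dot(c, c), dot(s, s), dot(c, s)
    scy, ssy, syy = dot(c, y), dot(s, y), dot(y, y)
    det = scc * sss - scs * scs
    if syy == 0 or abs(det) < 1e-12:
        return 0.0
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (scy * sss - ssy * scs) / det
    b = (ssy * scc - scy * scs) / det
    return (a * scy + b * ssy) / syy


times = list(range(0, 20, 2))   # 0, 2, ..., 18 hrs
assumed_period = 10.0           # hypothetical cycle length in hours

# Two made-up profiles: one oscillating at the assumed period, one not.
cycling = [sin(2 * pi * t / assumed_period + 0.7) for t in times]
alternating = [(-1) ** k for k in range(len(times))]

cycling_score = periodicity_score(times, cycling, assumed_period)
alternating_score = periodicity_score(times, alternating, assumed_period)
print(cycling_score, alternating_score)
```

In practice the period would be estimated from the data or scanned over a range, and the score would be compared against a permutation-based background before calling a gene periodic.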
16 Cell cycle simulation and modelling (DT)
All available relevant data and knowledge about
the cell cycle will be assembled in a series of
core network modules (MK, DI, KH, DT, SB, EH,
JB). This will include the information on
predicted complexes and protein interactions,
including sequence and structural aspects, in
relation with the cell-cycle interaction network
(AV, ML). These regulatory modules will be
further modelled in terms of a graph-based
formalism, enabling a structural analysis of the
corresponding networks across several model
species. These regulatory graphs will then serve
as a basis to build qualitative yet rigorous
dynamical models, using the logic approach at the
core of GIN-sim (partner DT). These qualitative
dynamical models will be validated on novel
expression data sets to test their descriptive and
predictive power (MK, DI, KH, DT, SB, JB). For
key network control modules, we will launch a
prospective quantitative analysis (ODE,
stochastic equations, parameterisation) (DT, SB).
For such components, we will initiate the
development of (semi-) quantitative models (MK,
DT, SB). Hybrid simulation tools will be applied
to perform the integrative analysis of novel data
in collaboration with the different wet-lab
partners (DI, KH, SB, JB), to perform in silico
experiments to evaluate perturbation effects and
to further optimise the network model itself ().
In order to accommodate the visualisation of the
output in an interactive way, MK (in
collaboration with the Whitehead Institute,
Cytoscape), DT (GIN-sim), SB and MA will exploit
or develop visualisation tools enabling an
intuitive representation of large gene networks,
as well as of their crucial dynamical properties.
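The idea of building a qualitative dynamical model on top of a regulatory graph can be sketched in a few lines of Python. The example below is a generic Boolean network with synchronous updating; it does not use GIN-sim or reproduce its API, and the three nodes (A, B, C) and their rules are hypothetical rather than real cell cycle regulators. It encodes a negative feedback loop (A activates B, B activates C, C inhibits A) and searches for the attractor reached from a given initial state.

```python
def step(state, rules):
    """Synchronous update: every node is recomputed from the current state."""
    return {node: rule(state) for node, rule in rules.items()}


def find_attractor(initial, rules, max_steps=1000):
    """Iterate until a state repeats; return the states forming the cycle."""
    seen = {}
    trajectory = []
    state = dict(initial)
    for index in range(max_steps):
        key = tuple(sorted(state.items()))
        if key in seen:
            # States from the first visit onwards form the attractor.
            return trajectory[seen[key]:]
        seen[key] = index
        trajectory.append(state)
        state = step(state, rules)
    raise RuntimeError("no attractor found within max_steps")


# Hypothetical regulatory graph: A -> B -> C -| A (negative feedback loop).
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}
initial = {"A": False, "B": False, "C": False}
cycle = find_attractor(initial, rules)

for state in cycle:
    print("".join("1" if state[node] else "0" for node in ("A", "B", "C")))
```

A negative feedback loop of this kind yields a cyclic attractor rather than a stable state, which is the qualitative signature one looks for in an oscillating module; asynchronous or multi-valued updating would change the details of the dynamics.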
17 Cell Cycle portal (MA)
Partner 9 will determine the main characteristics
and carry out the requirement analysis that the
GUI has to fulfil. For this, the consortium will
hold two technical meetings during the first 6
months of the project (coordinated by Partner
MA). At the first meeting, the wet-lab partners
(MK, DI, KH, SB, and JB) will convene to define
the main characteristics and functionalities that
the GUI has to deliver in support of System
Biology. At the second meeting, the partners
working on the dry-lab part of the project will
convene to
define operating systems, programming languages,
the integration of databases (selection), tools
and algorithms. Essential for the function of the
GUI is the development of a project web-portal
for data submission and mining, as well as for
model building/analysis (DT, MA, EH). To
facilitate its performance we will consider
distributed schemas and/or mirror sites (e.g.
for the key project databases and computational
tools).
2) Training of project participants in System Biology Toolbox
After the delivery of a Beta version, the System
Biology Toolbox will be introduced to all
participants in a technical meeting that will
take place during M24-M26 of the project. The
main objective of this meeting is to teach
participants how to use the common platform.
Participants will then extensively use the
platform and propose modifications and upgrades
during the following months. All feedback will be
used to optimise the platform,
in order to deliver an optimised version by M34.
The last two months of the project will further
serve to disseminate the mining and modelling
environment among the European Scientific
Community. In parallel, another dissemination
activity will include the organisation of one
Workshop in each participating country (in the
last 6 months of the project) to enlarge the
visibility and accessibility of our System
Biology Toolbox and to create national
communication points that will facilitate the
exchange of knowledge and feedback on System
Biology research at the European level. Workshops
will be organised regularly (every 6 months) to
circulate information on progress in all aspects
of the project (MK, DT, SB, MA). Finally,
partner MK will take care of all
patenting/protection issues that arise during the
project.
18 Review and assessment (MK)
Through careful and regular assessment of the
state of core components of the project (specific
tools, datasets, databases), the DIAMONDS
Management Committee will monitor the progress
made within the project. The workpackage
coordinators will also assess the progress and
collaborative efforts made in concert with
related ongoing (EU) projects, including a review
of project results that may need to be
communicated to the responsible coordinators of
such related projects prior to publication. The
partners in the respective workpackages will
communicate major hurdles and advances to the
responsible workpackage coordinators. These
coordinators will also regularly solicit feedback
from contributing partners. The results of this
survey of project progress will be discussed at
the MC level,
prior to the 6-monthly plenary meetings. In case
the review and assessment effort indicates that
re-steering is necessary, (a) proposal(s) to that
end will be placed on the agenda of the plenary
meeting, such that this can be properly discussed
and put to a vote.
21 Website
- Experiment overview
- posting of designs?
- posting of results?
- Deliverables need to be published on website
- reports, platform draft
- Cell cycle models?
- Demos of simulations?
- Links to all the tools proposed for integration?