Title: Welcome to the Manchester Bio-Modelling Network Lecture Series
Welcome to the Manchester Bio-Modelling Network Lecture Series
- System Identification and Experimental Design
- How to best use your data
- Dr Martin Brown
- Slides information: http://www.mcisb.org/bmnet/ or v.simeonidis@manchester.ac.uk
Aims and Objectives
- Aims
- To review how system identification and experimental design techniques provide tools for systems biology researchers
- Objectives
- Define system identification and experimental design
- Introduce relevant statistics, optimization and control theory
- Demonstrate, using Matlab, how they solve systems biology problems
- Motivate current research topics
Talk Outline
- Exemplar compartmental modelling problem
- Model representation, identifiability, experimental design
- Systems biology modelling
- Modelling framework
- Cellular modelling problem
- System identification
- Model representation
- Parameter estimation
- Parameter uncertainty
- Model selection
- Experimental design
- Common problems
- Optimization criteria
- Examples
0. Design-Model Cycle
[Diagram: the design-model cycle. System identification (statistics, optimization, dynamics, linear algebra) and experimental design (optimization, statistics, linear algebra) exchange data and hypotheses H(E, M, q).]
- Experimental design provides good quality data to the analysts
- Analysts inform the experimental designers about unknown/partially known hypotheses
- Experimental factors (disturbances) E
- Model structure M
- Parameters q
- The design <-> model process may be cyclic (sequential), or the design may be robust (valid over a range of hypotheses)
- Want to avoid "Garbage In, Garbage Out"
1. Example: Pharmacokinetics Modelling
- A drug x is injected into the blood with profile u(t)
- The drug moves from the central compartment C to the peripheral compartment P with profiles xC(t) and xP(t).
- Task: identify the parameters in the reactions and use this knowledge to infer the hidden state responses
[Diagram: two-compartment model. Input u(t) enters the central compartment C with state xC(t); the drug exchanges with the peripheral compartment P with state xP(t); the observation is y(t).]
1. ODE Representation
- Linear ODE model (linear first-order reactions)
- We can observe one noisy, transformed drug concentration
- The additive measurement noise (errors) is assumed to be i.i.d. N(0, s^2)
- Inputs: u(t)
- States: x = [xC(t), xP(t)]
- Parameters: q = [KEL, KCP, KPC, 1/V], with KEL, KCP, KPC > 0
- Observation: y(t)
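The slides use Matlab throughout; as a self-contained sketch, the two-compartment model above can be simulated in Python/SciPy. The rate and volume values below are illustrative assumptions, not taken from the lecture:

```python
# Two-compartment pharmacokinetic model (sketch; parameter values assumed):
#   dxC/dt = -(KEL + KCP) xC + KPC xP + u(t)
#   dxP/dt =   KCP xC - KPC xP
#   y      =   xC / V   (plus i.i.d. N(0, s^2) measurement noise)
import numpy as np
from scipy.integrate import solve_ivp

KEL, KCP, KPC, V = 0.0242, 0.1, 0.05, 12.0   # assumed values

def u(t):
    # 1-minute loading infusion, then a small maintenance infusion
    return 75.0 if t < 1.0 else 1.45

def rhs(t, x):
    xC, xP = x
    return [-(KEL + KCP) * xC + KPC * xP + u(t),
            KCP * xC - KPC * xP]

# max_step ensures the integrator does not step over the 1-minute pulse
sol = solve_ivp(rhs, (0.0, 720.0), [0.0, 0.0], max_step=0.5)
y = sol.y[0] / V          # noise-free observation
print(y[-1])
```

Measurement noise would be added to `y` when generating synthetic identification data.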
1. State Space Modelling
- In this problem, we have a linear state space model
- Simulation: given u(t), x0 and A, B, C, calculate x(t) and y(t) by integrating the ODEs
- Parameter estimation: given u(t), x0 and y(t), estimate the A (and B, C) matrices and the noise model.
1. Observability and Identifiability
- Observability: given the system (A, B, C) and a sequence of input/output measurements {u(t), y(t)}, can the hidden states x(t) be estimated?
- Identifiability: given a sequence of input/output measurements {u(t), y(t)}, can the parameters (A, B, C) be uniquely identified?
- For this example, dG/dq is full rank.
- While these questions are yes/no, in practice sparse, noisy data means that the answer is graded -> experimental design
1 Page Matlab (Symbolic)
- Symbolic state space representation
- syms Kpc Kcp Kel V
- A = [-Kel-Kcp, Kpc; Kcp, -Kpc]
- B = [1; 0]
- C = [1/V, 0]
- Observability
- E = [C; C*A]
- rank(E)
- Identifiability
- G = [C*B, C*A*B, C*A*A*B, C*A*A*A*B]
- dG = jacobian(G, [Kpc, Kcp, Kel, V])
- rank(dG)
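The same symbolic rank checks can be reproduced outside Matlab; here is an equivalent Python/SymPy sketch of the observability and identifiability tests:

```python
# SymPy analogue of the symbolic observability/identifiability check
# (the lecture uses Matlab's Symbolic Toolbox; this is an equivalent sketch).
import sympy as sp

Kpc, Kcp, Kel, V = sp.symbols('Kpc Kcp Kel V', positive=True)
A = sp.Matrix([[-Kel - Kcp, Kpc],
               [Kcp, -Kpc]])
B = sp.Matrix([1, 0])
C = sp.Matrix([[1 / V, 0]])

# Observability: rank of [C; C*A]
O = sp.Matrix.vstack(C, C * A)
print(O.rank())    # 2 -> the two states are observable

# Identifiability: Jacobian of the Markov parameters [CB, CAB, CA^2B, CA^3B]
G = sp.Matrix([[(C * A**i * B)[0] for i in range(4)]])
dG = G.jacobian(sp.Matrix([Kpc, Kcp, Kel, V]))
print(dG.rank())   # 4 -> the four parameters are identifiable
```

The full-rank result for dG matches the slide's statement that dG/dq is full rank for this example.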
1. Experimental Design Problem
- The input profile u(t) consists of a 1-minute loading infusion of 75 mg/min followed by a continuous maintenance infusion of 1.45 mg/min up to 720 min
- The experimental design problem consists of selecting which discrete time measurements to use, 1 <= ti <= 720
- Let's assume that only 8 points can be used
- Conventional design:
- t = (5, 10, 30, 60, 120, 180, 360, 720)
- D-optimal design:
- t = (1, 1, 10, 10, 74, 74, 720, 720)
1. Uncertainty in the Parameter Estimates
- For each design, 400 experiments were performed; the histograms show the least squares parameter estimates (KEL = 0.0242)
- Dashed line: the (fitted) marginal density under the conventional design; solid line: the (fitted) marginal density under the optimal design
- The optimal design significantly reduces the uncertainty in the identified system; similar results hold for the other 3 parameters
1. Summary
- Today, we're concentrating on systems of reactions that can be described by ODEs.
- Tools and techniques for the system identification, analysis and validation of linear and non-linear systems are well developed.
- Computational tools (Matlab, Mathematica, ...) should be your first port of call for developing novel algorithms
- It is important to map your problem domain and model representation onto an appropriate framework and exploit the appropriate set of tools.
- Challenges remain in problem scale, data availability and insight
2. Systems Biology
- What is systems biology?
- Ask Doug Kell?
- The quantitative study of biological systems (virus, cell, ...) aided by recent technological advances
- Biology is an experimentally driven science, simply because evolutionary processes are not sufficiently well understood for first-principles modelling.
- A reductionist approach is not appropriate, as complex adaptive systems are inherently non-linear.
2.1 Cellular Dynamic Models
- The complexity of cellular systems resides not only in the sheer number of components, but also in their operation on multiple levels and timescales.
- Metabolic pathways are sequences of chemical reactions, each catalyzed by an enzyme, in which certain product molecules are formed from other small substrates. Metabolites are usually small molecules, while enzymes are proteins.
- Signal transduction pathways are networks of molecular interactions that provide communication between the cell membrane and intracellular end-points, leading to some change in the cell. Signals are transduced by the modification of one protein's activity or location by another protein.
- Gene regulation pathways determine whether or not a particular gene is expressed at any particular time. Transcription factors, proteins that promote or repress transcription, bind regulatory DNA elements either directly or indirectly.
2.2 ODE Modelling Assumptions
- It should be noted that using ODEs to model biochemical reactions assumes that:
- 1. The system is well-stirred in a homogeneous medium
- 2. Spatial effects (such as diffusion) are irrelevant
- 3. There is a large number of molecules of each species
- If 1 and/or 2 are not valid, partial differential equations, which model/represent spatial effects, should be used
- If 3 is not valid, a discrete, stochastic model, which represents individual molecular interactions rather than average concentrations, will be more appropriate.
- In this talk, we're primarily concerned with ODEs
Example 1: Michaelis-Menten Pathway
- The k's are reaction parameters, the x's are metabolite concentrations
- Well-stirred and significant-concentration assumptions
[Diagram: kinetic reaction mechanism. Enzyme (E, x2) binds substrate (S, x1) with rates k1/k2 to form a complex (x3), which is converted at rate k3 into product (P, x4); the pathway is then modelled dynamically via ODEs.]
http://en.wikipedia.org/wiki/Michaelis-Menten_kinetics
1 Page Magic Matlab
- Simulation
- [t, x] = ode45(@mmode, [0 10], [1 0.2 0 0])
- plot(t, x)
- ODE representation
- function dx = mmode(t, x)
- k1 = 2; k2 = 1; k3 = 0.1
- dx = zeros(4,1)
- dx(1) = -k1*x(1)*x(2) + k2*x(3)
- dx(2) = -k1*x(1)*x(2) + k2*x(3) + k3*x(3)
- dx(3) = k1*x(1)*x(2) - k2*x(3) - k3*x(3)
- dx(4) = k3*x(3)
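For readers without Matlab, the same Michaelis-Menten simulation can be sketched in Python/SciPy with the same rate constants and initial conditions, and the conserved moieties checked numerically:

```python
# Python/SciPy analogue of the Matlab Michaelis-Menten simulation
import numpy as np
from scipy.integrate import solve_ivp

k1, k2, k3 = 2.0, 1.0, 0.1

def mmode(t, x):
    x1, x2, x3, x4 = x          # S, E, ES, P
    v1 = k1 * x1 * x2           # binding
    v2 = k2 * x3                # unbinding
    v3 = k3 * x3                # catalysis
    return [-v1 + v2,           # substrate
            -v1 + v2 + v3,      # free enzyme
             v1 - v2 - v3,      # complex
             v3]                # product

sol = solve_ivp(mmode, (0, 10), [1.0, 0.2, 0.0, 0.0], rtol=1e-8, atol=1e-10)
x = sol.y
# Conserved moieties: total enzyme x2 + x3, and total substrate mass x1 + x3 + x4
print(x[1] + x[2])        # approximately constant at 0.2
print(x[0] + x[2] + x[3]) # approximately constant at 1.0
```

The two conservation relations give a quick sanity check that the right-hand side has been typed in correctly.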
Example 2: IκB-NF-κB Signaling Pathway
NF-κB (nuclear factor kappa B) is a protein complex which acts as a transcription factor. NF-κB is found in all cell types and is involved in cellular responses to stimuli such as stress, cytokines, free radicals, ultraviolet irradiation, and bacterial or viral antigens. NF-κB plays a key role in regulating the immune response to infection.
[Figures: the IκB-NF-κB signal pathway module, and the graphical connection map for the IκB-NF-κB model.]
Example 2: IκB-NF-κB Signaling Pathway
[Figures: the NF-κB reaction mechanisms and the corresponding dynamic ODE model.]
2. Summary
- Several common pathway models can be represented as bilinear state space models
- Bilinear models involve both a linear component and a cross-product (product-of-states) term
- Many software tools (Matlab, Copasi, ...) can be used to simulate the systems, but for system identification, local sensitivity information can be useful.
3. What is System Identification?
- System identification is a general term describing the mathematical tools and algorithms that build dynamical models from measured data.
- System identification is generally done in the time domain, but analysis can also be done in the frequency domain
3. The System Identification Process
- In building a model, the designer has control over three parts of the process:
- 1. Generating the data set D
- 2. Selecting a (set of) model structure(s)
- 3. Selecting the criterion (least squares, for instance) used to specify the optimal parameter estimates and validate the model
- Section 3 is really about steps 2-3; section 4 is about step 1.
[Flowchart: prior knowledge feeds experiment design -> data -> choose model set -> choose criterion of fit -> calculate model -> validate model -> accept or revise.]
3. Elements of System Identification
- Understanding system identification involves a mix of techniques:
- 1. A statistical view of induction and inference, because all work is relative to the finite, sampled, noisy data set
- 2. Optimization: continuous parameter estimation and discrete model selection
- 3. A dynamics (control/state space) representation of the ODE/difference equation
- As well as a good knowledge of the application domain, in order to validate the model.
3.1 Statistical Modelling
- An old statistical proverb is worth remembering whenever an engineer attempts to develop a grey/black-box model:
- "All models are wrong (bad), it's just that some are more useful than others"
- Typically, models are developed for two main reasons:
- Hypothesis testing: how do different model structures compare, when their assumptions are different?
- Predictive accuracy: how can a model be constructed to offer adequate predictive performance in simulation?
- Parametric and non-parametric modelling are both widely used, but we'll concentrate on parametric modelling
3.1 Statistical Assumptions
- The aim of statistical induction is to try and deduce a general rule (model) from a finite data set
- The exemplar data is stochastic, as it involves random measurement errors/disturbances.
- Repeating the experiment will generate a different exemplar data set.
- Induction is stochastic, and we can only infer a probability distribution on the parameters q: p(q|D)
- Any model inference is therefore also uncertain: p(y|q)
[Diagram: data -> (induction) -> model -> (deduction) -> prediction/inference.]
3.1 Statistical Induction
- Let's make 10 repeated (noisy) measurements of a parameter
- Inductive estimates
- Even in this simple case, it's important to quantify the uncertainty. Experiments can then be designed to minimize its effect (maximize the signal-to-noise ratio)
- In addition, we can quantify the amount of prediction uncertainty
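As a minimal sketch of this induction step (the measurement values below are made up for illustration), the sample mean gives the point estimate and the standard error quantifies its uncertainty:

```python
# Ten repeated noisy measurements of a scalar parameter q (illustrative values)
import numpy as np

y = np.array([9.8, 10.3, 9.9, 10.1, 10.4, 9.7, 10.0, 10.2, 9.6, 10.0])

q_hat = y.mean()              # inductive point estimate
s = y.std(ddof=1)             # sample standard deviation
se = s / np.sqrt(len(y))      # standard error of the mean

print(q_hat, se)
# Quadrupling the number of i.i.d. measurements halves se: this is the
# trade-off that experimental design manipulates.
```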
3.1 Gaussian Measurement Errors?
- A frequent assumption is that the noise is i.i.d. (independent, identically distributed), additive and Gaussian
- i.i.d.: noise structure may otherwise be incorrectly attributed to other effects; identically distributed noise allows us to easily aggregate errors
- Additive: the noise is added onto the true signal
- Gaussian: Gaussian distributions are completely characterised by their mean and variance
- A linear transformation of a Gaussian is another Gaussian (with known mean and variance)
3.2 Dynamic Model Framework
- What signals impact on our model?
[Diagram: system with parameters q and internal states x(t); inputs u(t) and w(t), disturbance v(t), output y(t).]
- The following signals define the model:
- u(t) are external input signals that can be controlled (factors)
- w(t) are measurable disturbances
- v(t) are unmeasurable disturbances
- y(t) is the output signal
- x(t) are the dynamic, unobservable states of the model
- In addition, the (time invariant) model has a set of unknown parameters q
3.2 ODE/State Space Models
- In this talk we're mainly concerned with ordinary differential equation (ODE) systems
- The generalized representation is dx/dt = f(x, u; q), y = g(x; q)
- f() determines the dynamics of the ODE
- g() determines the structure of the observer
- Re-visiting linear state space models: dx/dt = Ax + Bu, y = Cx
3.2 States and Mass Conservation
- A key part of understanding state space/ODE models is the idea of a state.
- It is a minimal signal set which, at any time, contains enough information to predict future behaviour, coupled with knowledge of the external signals
- In pathway modelling problems, the states generally correspond to molecular concentrations
- NB: we often have mass-conserving constraints, such as a fixed total amount of enzyme (free plus bound).
- These can be easily represented in the ODE, so that the conserved combination is a left null vector of the state transition matrix
- This definition works in reverse on the left singular vectors of A
3.2 Linear State Space Responses
- Linear state space models and their dynamics are very well understood
- Model form: dx/dt = Ax + Bu, with parameters A, B and C
- General solution: the autonomous response plus the forced/convolved response
- Both the autonomous and forced dynamical behaviour is determined by the matrix exponential e^{At}; for eigenvalues lambda of A:
- lambda real positive: unstable
- lambda real negative: stable
- lambda imaginary: sinusoidal
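The eigenvalue rules above can be checked numerically; the matrix below is an assumed stable, oscillatory example (not from the slides), with the autonomous response computed via the matrix exponential:

```python
# Autonomous response x(t) = expm(A t) x0; eigenvalues of A set the behaviour
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0,  2.0],
              [-2.0, -1.0]])     # eigenvalues -1 +/- 2j: stable and oscillatory
x0 = np.array([1.0, 0.0])

eigs = np.linalg.eigvals(A)
print(eigs)                      # real parts negative -> stable

x5 = expm(A * 5.0) @ x0          # autonomous response at t = 5
print(np.linalg.norm(x5))        # decayed to roughly exp(-5) of |x0| = 1
```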
3.2 Non-linear State Space Models
- In reality, any non-linear functions could be used for the state transition and observer functions, f() and g()
- Often the observer function is a simple binary matrix which represents which states can be measured
- Remember, missing states may still be observable
- There has been a lot of work by researchers developing and analysing new non-linear state transition functions
- Bilinear non-linear state space models are a simple/popular tool for cellular models (dimerization)
3.3 Parameter Estimation
- A key part of system identification is to assume that the model structure is given (and correct!)
- The goal is to estimate parameters whose values cannot be directly measured or derived from first-principles models
- Since one of the goals of systems biology is to better understand first-principles models, and reaction rates are difficult to estimate, parameter estimation is a frequent problem in systems biology
3.3 Model Parameters
- Let's consider the case where we have an autonomous, bilinear state space model
- This is a linear-in-the-parameters model, but non-linear in the state.
- NB: often A1 and A2 are quite sparse
- Consider the extended state
- Then the linear dependency of the model parameters on the extended state is easy to see
3.3 Optimality Criterion
- For simplicity, let's assume that all the states can be directly measured
- For parameter estimation, we have the input-output measured, sampled data {u(k), x(k)}
- Then the basic parameter estimation problem can be posed as minimizing the sum of squared residuals over q
- This is least squares parameter estimation
- There are many variations on this basic optimality criterion; a key thing is to recognise what assumptions they impose on the problem domain
[Plot: cost function f(q) against the parameters q.]
3.3 Optimization for Dynamic Systems
- The optimality criterion contains the (discrete time) state estimates, which must be obtained by integrating the current model estimate.
- Optimization iterates the parameter estimates
- We have the following scenario:
[Diagram: sampled data u(k), x(k) feed the parameter iterations; each iteration solves the model; final estimates are returned.]
- Even when dx/dt depends linearly on q, the dynamic model means that the optimality criterion is not quadratic (the states depend exponentially on the parameters)
1 Page Matlab
- [theta, fval, exitflag] = fminunc(@mmobj, [3 0.5 0.2])
- function f = mmobj(theta)
- dt = 0:0.05:10
- theta1 = [2 1 0.1]
- [t1, x1] = ode45(@(t,x) mmode(t,x,theta1), dt, [1 0.2 0 0])
- [t2, x2] = ode45(@(t,x) mmode(t,x,theta), dt, [1 0.2 0 0])
- f = sum(sum((x1-x2).^2))
- function dx = mmode(t, x, theta)
- dx = zeros(4,1)
- dx(1) = -theta(1)*x(1)*x(2) + theta(2)*x(3)
- dx(2) = -theta(1)*x(1)*x(2) + theta(2)*x(3) + theta(3)*x(3)
- dx(3) = theta(1)*x(1)*x(2) - theta(2)*x(3) - theta(3)*x(3)
- dx(4) = theta(3)*x(3)
3.3 Least Squares Assumptions
- Why is the least squares criterion so widely used?
- For statistical models, it corresponds to minimizing the variance of the measured signal with independent and identically distributed noise.
- The optimal model is the mean
- One common variation in systems biology is to assume that the noise variance differs for each state (weighted least squares)
3.3 Gradient Estimates
- Without wanting to go over all the 1st/2nd order optimization theory, let's consider the very simple gradient descent parameter estimation procedure
- We can view this as sliding down a bowl to the bottom. The choice of the step size eta determines the stability/rate of convergence.
- We can show
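The "sliding down a bowl" picture can be sketched in a few lines; the quadratic objective below is an assumed toy example, not from the slides:

```python
# Gradient descent q <- q - eta * df/dq on the toy bowl f(q) = (q - 3)^2
def f_grad(q):
    return 2.0 * (q - 3.0)

eta = 0.1        # step size: too large -> unstable, too small -> slow
q = 0.0
for _ in range(100):
    q -= eta * f_grad(q)

print(q)         # converges towards the minimum at q = 3
```

For this quadratic, each step multiplies the error (q - 3) by (1 - 2*eta), so eta must lie in (0, 1) for stability.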
3.3 Local Sensitivity
- Using the least squares cost function, the gradient can be evaluated directly
- The objective gradient depends on the model sensitivity dx/dq
- Using the model sensitivity in optimization algorithms, we're implicitly performing some form of gradient descent
- Using the normalized sensitivity implicitly normalizes the cost function
3.3 Local Sensitivity for a Linear Model
- In gradient-based optimization, obtaining the sensitivity derivatives is very important.
- The sensitivity can be calculated from an ODE, just like the state space equations
- Solving for the sensitivity states is a joint ODE problem (an extended ODE function in Matlab)
1 Page Matlab
- Pseudo/untested code
- Simulation
- [t, xS] = ode45(@mmode, [0 10], [1 0.2 0 0 zeros(1,12)])
- plot(t, xS)
- ODE representation
- function dx = mmode(t, x)
- k1 = 2; k2 = 1; k3 = 0.1
- dx = zeros(16,1)
- % States
- dx(1) = -k1*x(1)*x(2) + k2*x(3)
- dx(2) = -k1*x(1)*x(2) + k2*x(3) + k3*x(3)
- dx(3) = k1*x(1)*x(2) - k2*x(3) - k3*x(3)
- dx(4) = k3*x(3)
- % Sensitivities S = dx/dk (4x3, stored in x(5:16)): dS/dt = (df/dx)*S + df/dk
- J = [-k1*x(2), -k1*x(1), k2, 0; -k1*x(2), -k1*x(1), k2+k3, 0; k1*x(2), k1*x(1), -k2-k3, 0; 0, 0, k3, 0]
- dfdk = [-x(1)*x(2), x(3), 0; -x(1)*x(2), x(3), x(3); x(1)*x(2), -x(3), -x(3); 0, 0, x(3)]
- S = reshape(x(5:16), 4, 3)
- dS = J*S + dfdk
- dx(5:16) = dS(:)
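The joint state-plus-sensitivity integration can also be sketched in Python/SciPy, with the analytic sensitivities checked against a finite-difference estimate (the check itself is an addition, not part of the slides):

```python
# Joint state + sensitivity ODE dS/dt = (df/dx) S + df/dk for Michaelis-Menten,
# verified against a finite-difference perturbation of k1.
import numpy as np
from scipy.integrate import solve_ivp

k = np.array([2.0, 1.0, 0.1])
x0 = np.array([1.0, 0.2, 0.0, 0.0])

def f(x, k):
    k1, k2, k3 = k
    x1, x2, x3, x4 = x
    return np.array([-k1*x1*x2 + k2*x3,
                     -k1*x1*x2 + (k2 + k3)*x3,
                      k1*x1*x2 - (k2 + k3)*x3,
                      k3*x3])

def rhs(t, z, k):
    x, S = z[:4], z[4:].reshape(4, 3)
    k1, k2, k3 = k
    x1, x2, x3, _ = x
    J = np.array([[-k1*x2, -k1*x1,  k2,      0],     # df/dx
                  [-k1*x2, -k1*x1,  k2 + k3, 0],
                  [ k1*x2,  k1*x1, -k2 - k3, 0],
                  [ 0,      0,      k3,      0]])
    dfdk = np.array([[-x1*x2,  x3,  0],              # df/dk
                     [-x1*x2,  x3,  x3],
                     [ x1*x2, -x3, -x3],
                     [ 0,      0,   x3]])
    return np.concatenate([f(x, k), (J @ S + dfdk).ravel()])

z0 = np.concatenate([x0, np.zeros(12)])
sol = solve_ivp(rhs, (0, 10), z0, args=(k,), rtol=1e-9, atol=1e-11)
S_end = sol.y[4:, -1].reshape(4, 3)    # analytic dx(10)/dk

# Finite-difference check on dx/dk1
eps = 1e-5
kp = k.copy(); kp[0] += eps
x_p = solve_ivp(lambda t, x: f(x, kp), (0, 10), x0, rtol=1e-9, atol=1e-11).y[:, -1]
x_m = solve_ivp(lambda t, x: f(x, k),  (0, 10), x0, rtol=1e-9, atol=1e-11).y[:, -1]
fd = (x_p - x_m) / eps
print(np.max(np.abs(fd - S_end[:, 0])))   # small
```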
3.3 Parameter Variance/Covariance
- A key measure of identifiability, and a key link with experimental design, is the estimated parameter variance/covariance matrix
- It forms part of 2nd order optimization procedures
- It measures the amount of uncertainty and correlation in the parameter estimates
- For simple, static, linear systems it measures all the uncertainty; more generally, it's a local approximation
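For the simple static, linear case the covariance has a closed form; the design matrix and noise level below are illustrative assumptions:

```python
# For y = X q + e with i.i.d. N(0, s^2) noise, the least squares parameter
# covariance is S_q = s^2 (X^T X)^{-1} (a local approximation more generally).
import numpy as np

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])      # assumed design matrix (intercept + slope)
sigma2 = 0.5 ** 2               # assumed noise variance

cov = sigma2 * np.linalg.inv(X.T @ X)
print(cov)
# Off-diagonal terms measure the correlation between the two parameter
# estimates; experimental design tries to make this matrix "small".
```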
Example: Parameter Validation
- Consider a two-parameter model with uncertainty regions
- We can marginalise the distribution to perform single-hypothesis testing on each parameter.
- We can compare whether a new estimate is statistically significantly different from a previous estimate
- In many systems biology system identification problems there are significant problems with identifiability
3.4 Pathway Selection
- A practical implication for the modelling tasks is the emphasis on network structure over the exact values of the kinetic parameters
- A key assumption in parameter estimation is that we know the model structure:
- Non-linear: f() and g()
- Linear: the order and sparse structure of A, B and C
- This means the estimation residual corresponds to the stochastic noise term
- In practice, we rarely know the parametric model structure exactly.
- Model selection is a discrete, non-convex search through the space of all possible models in order to find one that is suitably predictive and suitably simple.
3.4 Selection and Identifiability
- In practice, even though a reaction may exist, the corresponding parameter value may be very difficult to estimate.
- This is the problem of identifiability
- The problem is badly conditioned
3.4 Null Hypothesis Test
- Locally, the estimated parameter uncertainty is Gaussian when the measurement errors are Gaussian
- To determine whether each feature/parameter is significant, we can apply the null hypothesis test.
- This essentially determines whether there is enough information in the training data set to say that the particular parameter is significantly non-zero, i.e. does the parameter distribution contain the value 0, at a 68%, 95% or 99.7% certainty
- Zero parameter -> no reaction
- This is a simple way to compare two models differing by one extra parameter
[Plot: parameter density p(q) centred at E(q), with the 68%, 95% and 99.7% certainty intervals and the value 0 marked.]
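A minimal sketch of this test on a single parameter (the estimate and its standard deviation below are illustrative values, not from the slides):

```python
# Null hypothesis test: is 0 inside the ~95% (1.96 standard deviation)
# interval of the locally Gaussian parameter estimate?
q_hat, q_std = 0.8, 0.3            # assumed estimate and standard deviation

z = q_hat / q_std                  # standardized distance of the estimate from 0
significant = abs(z) > 1.96        # outside the 95% interval?

print(z, significant)
# |z| = 2.67 > 1.96, so the parameter is significantly non-zero at 95%,
# and the corresponding reaction is retained in the model.
```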
3.4 Convex Re-formulations
- Remember, there is no "correct" parameter estimate or discrete model structure!
- Parametric model selection is a discrete combinatorial problem, and when the parameter estimation problem is badly conditioned, there are often numerous local minima
- We're looking at regularization approaches for parameter estimation and approximate sparse model selection
3.4 Example: RKIP-Regulated ERK Signalling Pathway
- 11 states
- 11 parameters
- 0.5s worth of identification data
Calculated Parameter Locus
- Estimated parameter locus (lambda = 0 -> 8)
- Many parameters are largely conditionally independent; however, a couple of parameters exhibit Simpson's paradox (they change sign when a new parameter is introduced/removed)
3. Summary
- We've really just scratched the surface of parametric modelling techniques.
- Don't reinvent the wheel; learn how to use existing libraries
- Important issues not really discussed:
- Model validation (this is more than just looking at the MSE)
- Non-convex optimization (very important for medium-large parameter estimation problems)
- Non-linear extensions to well-founded linear state space models
- Hidden state estimation/filtering
- Current work is looking at:
- Regularization for parameter estimation and sparse model selection
- Non-parametric (kernel/Gaussian process) modelling
4. Experimental Design
- The basic assumption here is that it is expensive to perform an experiment; hence the aim of experimental design is to:
- maximise the identification information while minimizing the number of experiments
- This is relative to the question being asked about the model:
- How predictive is the model?
- How well identified are the parameters?
- How does M1 compare with M2?
- If the modelling purpose is not well understood, the experimental design problem is ill-posed.
- Like parameter estimation, experimental design is an optimization problem
4. Why Do It?
- Consider a single-factor, linear, static model, where the measurements contain some additive, zero mean, i.i.d. Gaussian noise
- The inputs are constrained to a bounded range
- Where should we place 2 experiments to best identify q?
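The answer can be made concrete with a short sketch. Assuming the model y = q*u + e with noise variance s^2 and constraint |u| <= 1, the least squares estimate has var(q_hat) = s^2 / sum(u_i^2), so both measurements belong at the constraint boundary:

```python
# Variance of the least squares estimate of q in y = q*u + e, |u| <= 1
import numpy as np

sigma2 = 1.0   # assumed noise variance

def var_qhat(u):
    u = np.asarray(u, dtype=float)
    return sigma2 / np.sum(u ** 2)

print(var_qhat([0.5, 0.5]))    # 2.0  (interior placement)
print(var_qhat([1.0, -1.0]))   # 0.5  (boundary placement: 4x better)
```

Pushing the inputs to the boundary maximizes the signal-to-noise ratio, which is the essence of this design problem.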
4. Experimental Design Parameters
- Given a (linear) model, there are many experimental parameters that can be optimized:
- The initial state values x0
- Which states to observe, C
- The input/excitation signal u(k)
- The sampling time/rate, tk = kT
- These are a mix of continuous, discrete and time-varying variables which are problem dependent.
4. Experimental Design Optimality
- The basic measure of optimality is the Fisher information matrix, whose inverse forms a lower bound for the parameter variance/covariance matrix
- This produces a vector optimization problem; common matrix ("alphabet") scalarization measures are:
- A-optimality: tr(Sq) (eigenvalue sum)
- D-optimality: det(Sq) (eigenvalue product)
- E-optimality: max_j lambda_j(Sq) (largest eigenvalue)
- All give (different) estimates of the amount of parametric uncertainty resulting from the experimental data.
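The three scalarizations are one-liners once the covariance is available; the Fisher information matrix below is an assumed toy example:

```python
# "Alphabet" optimality criteria applied to S_q = F^{-1}
import numpy as np

F = np.array([[4.0, 0.0],
              [0.0, 1.0]])             # assumed Fisher information matrix
S = np.linalg.inv(F)                   # parameter covariance (lower bound)

A_opt = np.trace(S)                    # A-optimality: eigenvalue sum
D_opt = np.linalg.det(S)               # D-optimality: eigenvalue product
E_opt = np.max(np.linalg.eigvalsh(S))  # E-optimality: largest eigenvalue

print(A_opt, D_opt, E_opt)             # 1.25 0.25 1.0
```

A design is chosen to minimize one of these scalars; they generally disagree, since each weights the uncertainty ellipsoid's axes differently.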
4. Dynamic Experimental Design
- The basic problem with applying experimental design to dynamic models is that the sensitivity derivative depends on the current parameter estimates
- Sequential experimental design:
- Perform a new experimental design Dk after each parameter estimate
- Robust experimental design:
- Calculate an experimental design which is valid for a range of parameter values
[Diagram: loop between sensitivity calculation, experimental design and estimation.]
4. Example: Observer Selection
- For the NF-κB model/experimental set-up, there exists a constraint that only 4 states (out of 20) can be directly measured
- This is a discrete optimization problem, and is often relaxed to a continuous one
- Note that this assumes all observations can be made in order to select which are the most informative! An iterative, sampled approach may be more realistic if there are measurement constraints.
4. Example: Robust Experimental Design
- Produce a design which minimizes the uncertainty in a space around the nominal model estimate.
- Because of the uncertainty in the parameter estimates, we have an associated bounded uncertainty in the sensitivity estimates
- Like the regularization approach to sparse model selection, this uses a convex re-formulation of the problem, and we can exploit a lot of semi-definite programming theory.
4. Example: Robust Experimental Design
- Increasing the uncertainty (p large) means that all observations are equally valid, but when the uncertainty is small, 4 observations are weighted more heavily than the others.
4. Conclusions
- Experimental design is a bran-flakes-type technology: not necessarily very exciting, but very useful.
- There is a variety of ways of posing the questions
- It is better to do some analysis, even approximate, than none at all
- System identification and experimental design are mature techniques and have a lot to offer systems biology researchers.
- However, they do not pretend to be a complete theory, and problems in systems biology offer challenges in terms of scalability and robustness
- Thanks to Fei He, George Papadopoulos, Liang Sun, Hong Yue, Doug Kell, Steve Wilkinson