Title: Welcome to the Manchester Bio-Modelling Network Lecture Series
Welcome to the Manchester Bio-Modelling Network Lecture Series
- System Identification and Experimental Design
- How to best use your data
- Dr Martin Brown
- Slides information: http://www.mcisb.org/bmnet/ or v.simeonidis@manchester.ac.uk
Aims and Objectives
- Aims
- To review how system identification and experimental design techniques provide tools for systems biology researchers
- Objectives
- Define system identification and experimental design
- Introduce relevant statistics, optimization and control theory
- Demonstrate, using Matlab, how they solve systems biology problems
- Motivate current research topics
Talk Outline
- Exemplar compartmental modelling problem
- Model representation, identifiability, experimental design
- Systems biology modelling
- Modelling framework
- Cellular modelling problem
- System identification
- Model representation
- Parameter estimation
- Parameter uncertainty
- Model selection
- Experimental design
- Common problems
- Optimization criteria
- Examples
0. Design-Model Cycle
[Diagram: the design-model cycle. System identification (statistics, optimization, dynamics, linear algebra) and experimental design (optimization, statistics, linear algebra) exchange data and hypotheses H(E, M, q).]
- Experimental design provides good quality data to the analysts
- Analysts inform the experimental designers about unknown/partially known hypotheses
- Experimental factors (disturbances) E
- Model structure M
- Parameters q
- The design <-> model process may be cyclic (sequential), or the design may be robust (valid over a range of hypotheses)
- Want to avoid "Garbage In, Garbage Out"
1. Example: Pharmacokinetics Modelling
- A drug x is injected into the blood with profile u(t)
- The drug moves from the central compartment C to the peripheral compartment P with profiles xC(t) and xP(t).
- Task: identify the parameters in the reactions and use this knowledge to infer the hidden state responses
[Diagram: two-compartment model. Input u(t) enters the central compartment C with state xC(t); the drug exchanges with the peripheral compartment P with state xP(t); the observation is y(t).]
1. ODE Representation
- Linear ODE model (linear first-order reactions)
- We can observe one noisy, transformed drug concentration
- The additive measurement noise (errors) is assumed to be i.i.d. N(0, s^2)
- Inputs: u(t)
- States: x = [xC(t), xP(t)]
- Parameters: q = [KEL, KCP, KPC, 1/V], with KEL, KCP, KPC > 0
- Observation: y(t)
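The slides use Matlab throughout; as a self-contained sketch, the two-compartment model above can be simulated in Python/SciPy. The rate and volume values below are illustrative assumptions, not taken from the lecture:

```python
# Two-compartment pharmacokinetic model (sketch; parameter values assumed):
#   dxC/dt = -(KEL + KCP) xC + KPC xP + u(t)
#   dxP/dt =   KCP xC - KPC xP
#   y      =   xC / V   (plus i.i.d. N(0, s^2) measurement noise)
import numpy as np
from scipy.integrate import solve_ivp

KEL, KCP, KPC, V = 0.0242, 0.1, 0.05, 12.0   # assumed values

def u(t):
    # 1-minute loading infusion, then a small maintenance infusion
    return 75.0 if t < 1.0 else 1.45

def rhs(t, x):
    xC, xP = x
    return [-(KEL + KCP) * xC + KPC * xP + u(t),
            KCP * xC - KPC * xP]

# max_step ensures the integrator does not step over the 1-minute pulse
sol = solve_ivp(rhs, (0.0, 720.0), [0.0, 0.0], max_step=0.5)
y = sol.y[0] / V          # noise-free observation
print(y[-1])
```

Measurement noise would be added to `y` when generating synthetic identification data.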
1. State Space Modelling
- In this problem, we have a linear state space model
- Simulation: given u(t), x0 and A, B, C, calculate x(t) and y(t) by integrating the ODEs
- Parameter estimation: given u(t), x0 and y(t), estimate the A (and B, C) matrices and the noise model.
1. Observability and Identifiability
- Observability: given the system (A, B, C) and a sequence of input/output measurements {u(t), y(t)}, can the hidden states x(t) be estimated?
- Identifiability: given a sequence of input/output measurements {u(t), y(t)}, can the parameters (A, B, C) be uniquely identified?
- For this example, dG/dq is full rank.
- While these questions are yes/no, in practice sparse, noisy data means that the answer is graded -> experimental design
1 Page Matlab (Symbolic)
- Symbolic state space representation
- syms Kpc Kcp Kel V
- A = [-Kel-Kcp, Kpc; Kcp, -Kpc]
- B = [1; 0]
- C = [1/V, 0]
- Observability
- E = [C; C*A]
- rank(E)
- Identifiability
- G = [C*B, C*A*B, C*A*A*B, C*A*A*A*B]
- dG = jacobian(G, [Kpc, Kcp, Kel, V])
- rank(dG)
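The same symbolic rank checks can be reproduced outside Matlab; here is an equivalent Python/SymPy sketch of the observability and identifiability tests:

```python
# SymPy analogue of the symbolic observability/identifiability check
# (the lecture uses Matlab's Symbolic Toolbox; this is an equivalent sketch).
import sympy as sp

Kpc, Kcp, Kel, V = sp.symbols('Kpc Kcp Kel V', positive=True)
A = sp.Matrix([[-Kel - Kcp, Kpc],
               [Kcp, -Kpc]])
B = sp.Matrix([1, 0])
C = sp.Matrix([[1 / V, 0]])

# Observability: rank of [C; C*A]
O = sp.Matrix.vstack(C, C * A)
print(O.rank())    # 2 -> the two states are observable

# Identifiability: Jacobian of the Markov parameters [CB, CAB, CA^2B, CA^3B]
G = sp.Matrix([[(C * A**i * B)[0] for i in range(4)]])
dG = G.jacobian(sp.Matrix([Kpc, Kcp, Kel, V]))
print(dG.rank())   # 4 -> the four parameters are identifiable
```

The full-rank result for dG matches the slide's statement that dG/dq is full rank for this example.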
1. Experimental Design Problem
- The input profile u(t) consists of a 1-minute loading infusion of 75 mg/min followed by a continuous maintenance infusion of 1.45 mg/min up to 720 min
- The experimental design problem consists of selecting which discrete time measurements to use, 1 <= ti <= 720
- Let's assume that only 8 points can be used
- Conventional design:
- t = (5, 10, 30, 60, 120, 180, 360, 720)
- D-optimal design:
- t = (1, 1, 10, 10, 74, 74, 720, 720)
1. Uncertainty in the Parameter Estimates
- For each design, 400 experiments were performed; the histograms show the least squares parameter estimates (KEL = 0.0242)
- Dashed line: the (fitted) marginal density under the conventional design; solid line: the (fitted) marginal density under the optimal design
- The optimal design significantly reduces the uncertainty in the identified system; similar results hold for the other 3 parameters
1. Summary
- Today, we're concentrating on systems of reactions that can be described by ODEs.
- Tools and techniques for the system identification, analysis and validation of linear and non-linear systems are well developed.
- Computational tools (Matlab, Mathematica, ...) should be your first port of call for developing novel algorithms
- It is important to map your problem domain and model representation onto an appropriate framework and exploit the appropriate set of tools.
- Challenges remain in problem scale, data availability and insight
2. Systems Biology
- What is systems biology?
- Ask Doug Kell?
- The quantitative study of biological systems (virus, cell, ...) aided by recent technological advances
- Biology is an experimentally driven science, simply because evolutionary processes are not sufficiently well understood for first-principles modelling.
- A reductionist approach is not appropriate, as complex adaptive systems are inherently non-linear.
2.1 Cellular Dynamic Models
- The complexity of cellular systems resides not only in the sheer number of components, but also in their operation on multiple levels and timescales.
- Metabolic pathways are sequences of chemical reactions, each catalyzed by an enzyme, in which certain product molecules are formed from other small substrates. Metabolites are usually small molecules, while enzymes are proteins.
- Signal transduction pathways are networks of molecular interactions that provide communication between the cell membrane and intracellular end-points, leading to some change in the cell. Signals are transduced by the modification of one protein's activity or location by another protein.
- Gene regulation pathways determine whether or not a particular gene is expressed at any particular time. Transcription factors, proteins that promote or repress transcription, bind regulatory DNA elements either directly or indirectly.
2.2 ODE Modelling Assumptions
- It should be noted that using ODEs to model biochemical reactions assumes that:
- 1. The system is well-stirred in a homogeneous medium
- 2. Spatial effects (such as diffusion) are irrelevant
- 3. There is a large number of molecules of each species
- If 1 and/or 2 are not valid, partial differential equations, which model/represent spatial effects, should be used
- If 3 is not valid, a discrete, stochastic model, which represents individual molecular interactions rather than average concentrations, will be more appropriate.
- In this talk, we're primarily concerned with ODEs
Example 1: Michaelis-Menten Pathway
- The k's are reaction parameters, the x's are metabolite concentrations
- Well-stirred and significant-concentration assumptions
[Diagram: kinetic reaction mechanism. Enzyme (E, x2) binds substrate (S, x1) with rates k1/k2 to form a complex (x3), which is converted at rate k3 into product (P, x4); the pathway is then modelled dynamically via ODEs.]
http://en.wikipedia.org/wiki/Michaelis-Menten_kinetics
1 Page Magic Matlab
- Simulation
- [t, x] = ode45(@mmode, [0 10], [1 0.2 0 0])
- plot(t, x)
- ODE representation
- function dx = mmode(t, x)
- k1 = 2; k2 = 1; k3 = 0.1
- dx = zeros(4,1)
- dx(1) = -k1*x(1)*x(2) + k2*x(3)
- dx(2) = -k1*x(1)*x(2) + k2*x(3) + k3*x(3)
- dx(3) = k1*x(1)*x(2) - k2*x(3) - k3*x(3)
- dx(4) = k3*x(3)
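For readers without Matlab, the same Michaelis-Menten simulation can be sketched in Python/SciPy with the same rate constants and initial conditions, and the conserved moieties checked numerically:

```python
# Python/SciPy analogue of the Matlab Michaelis-Menten simulation
import numpy as np
from scipy.integrate import solve_ivp

k1, k2, k3 = 2.0, 1.0, 0.1

def mmode(t, x):
    x1, x2, x3, x4 = x          # S, E, ES, P
    v1 = k1 * x1 * x2           # binding
    v2 = k2 * x3                # unbinding
    v3 = k3 * x3                # catalysis
    return [-v1 + v2,           # substrate
            -v1 + v2 + v3,      # free enzyme
             v1 - v2 - v3,      # complex
             v3]                # product

sol = solve_ivp(mmode, (0, 10), [1.0, 0.2, 0.0, 0.0], rtol=1e-8, atol=1e-10)
x = sol.y
# Conserved moieties: total enzyme x2 + x3, and total substrate mass x1 + x3 + x4
print(x[1] + x[2])        # approximately constant at 0.2
print(x[0] + x[2] + x[3]) # approximately constant at 1.0
```

The two conservation relations give a quick sanity check that the right-hand side has been typed in correctly.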
Example 2: IκB-NF-κB Signaling Pathway
NF-κB (nuclear factor kappa B) is a protein complex which acts as a transcription factor. NF-κB is found in all cell types and is involved in cellular responses to stimuli such as stress, cytokines, free radicals, ultraviolet irradiation, and bacterial or viral antigens. NF-κB plays a key role in regulating the immune response to infection.
[Figures: the IκB-NF-κB signal pathway module, and the graphical connection map for the IκB-NF-κB model.]
Example 2: IκB-NF-κB Signaling Pathway
[Figures: the NF-κB reaction mechanisms and the corresponding dynamic ODE model.]
2. Summary
- Several common pathway models can be represented as bilinear state space models
- Bilinear models involve both a linear component and a cross-product (product-of-states) term
- Many software tools (Matlab, Copasi, ...) can be used to simulate the systems, but for system identification, local sensitivity information can be useful.
3. What is System Identification?
- System identification is a general term describing the mathematical tools and algorithms that build dynamical models from measured data.
- System identification is generally done in the time domain, but analysis can also be done in the frequency domain
3. The System Identification Process
- In building a model, the designer has control over three parts of the process:
- 1. Generating the data set D
- 2. Selecting a (set of) model structure(s)
- 3. Selecting the criterion (least squares, for instance) used to specify the optimal parameter estimates and validate the model
- Section 3 is really about steps 2-3; section 4 is about step 1.
[Flowchart: prior knowledge feeds experiment design -> data -> choose model set -> choose criterion of fit -> calculate model -> validate model -> accept or revise.]
3. Elements of System Identification
- Understanding system identification involves a mix of techniques:
- 1. A statistical view of induction and inference, because all work is relative to the finite, sampled, noisy data set
- 2. Optimization: continuous parameter estimation and discrete model selection
- 3. A dynamics (control/state space) representation of the ODE/difference equation
- As well as a good knowledge of the application domain, in order to validate the model.
3.1 Statistical Modelling
- An old statistical proverb is worth remembering whenever an engineer attempts to develop a grey/black-box model:
- "All models are wrong (bad), it's just that some are more useful than others"
- Typically, models are developed for two main reasons:
- Hypothesis testing: how do different model structures compare, when their assumptions are different?
- Predictive accuracy: how can a model be constructed to offer adequate predictive performance in simulation?
- Parametric and non-parametric modelling are both widely used, but we'll concentrate on parametric modelling
3.1 Statistical Assumptions
- The aim of statistical induction is to try and deduce a general rule (model) from a finite data set
- The exemplar data is stochastic, as it involves random measurement errors/disturbances.
- Repeating the experiment will generate a different exemplar data set.
- Induction is stochastic, and we can only infer a probability distribution on the parameters q: p(q|D)
- Any model inference is therefore also uncertain: p(y|q)
[Diagram: data -> (induction) -> model -> (deduction) -> prediction/inference.]
3.1 Statistical Induction
- Let's make 10 repeated (noisy) measurements of a parameter
- Inductive estimates
- Even in this simple case, it's important to quantify the uncertainty. Experiments can then be designed to minimize its effect (maximize the signal-to-noise ratio)
- In addition, we can quantify the amount of prediction uncertainty
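As a minimal sketch of this induction step (the measurement values below are made up for illustration), the sample mean gives the point estimate and the standard error quantifies its uncertainty:

```python
# Ten repeated noisy measurements of a scalar parameter q (illustrative values)
import numpy as np

y = np.array([9.8, 10.3, 9.9, 10.1, 10.4, 9.7, 10.0, 10.2, 9.6, 10.0])

q_hat = y.mean()              # inductive point estimate
s = y.std(ddof=1)             # sample standard deviation
se = s / np.sqrt(len(y))      # standard error of the mean

print(q_hat, se)
# Quadrupling the number of i.i.d. measurements halves se: this is the
# trade-off that experimental design manipulates.
```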
3.1 Gaussian Measurement Errors?
- A frequent assumption is that the noise is i.i.d. (independent, identically distributed), additive and Gaussian
- i.i.d.: noise structure may otherwise be incorrectly attributed to other effects; identically distributed noise allows us to easily aggregate errors
- Additive: the noise is added onto the true signal
- Gaussian: Gaussian distributions are completely characterised by their mean and variance
- A linear transformation of a Gaussian is another Gaussian (with known mean and variance)
3.2 Dynamic Model Framework
- What signals impact on our model?
[Diagram: system with parameters q and internal states x(t); inputs u(t) and w(t), disturbance v(t), output y(t).]
- The following signals define the model:
- u(t) are external input signals that can be controlled (factors)
- w(t) are measurable disturbances
- v(t) are unmeasurable disturbances
- y(t) is the output signal
- x(t) are the dynamic, unobservable states of the model
- In addition, the (time invariant) model has a set of unknown parameters q
3.2 ODE/State Space Models
- In this talk we're mainly concerned with ordinary differential equation (ODE) systems
- The generalized representation is dx/dt = f(x, u; q), y = g(x; q)
- f() determines the dynamics of the ODE
- g() determines the structure of the observer
- Re-visiting linear state space models: dx/dt = Ax + Bu, y = Cx
3.2 States and Mass Conservation
- A key part of understanding state space/ODE models is the idea of a state.
- It is a minimal signal set which, at any time, contains enough information to predict future behaviour, coupled with knowledge of the external signals
- In pathway modelling problems, the states generally correspond to molecular concentrations
- NB: we often have mass-conserving constraints, such as a fixed total amount of enzyme (free plus bound).
- These can be easily represented in the ODE, so that the conserved combination is a left null vector of the state transition matrix
- This definition works in reverse on the left singular vectors of A
3.2 Linear State Space Responses
- Linear state space models and their dynamics are very well understood
- Model form: dx/dt = Ax + Bu, with parameters A, B and C
- General solution: the autonomous response plus the forced/convolved response
- Both the autonomous and forced dynamical behaviour is determined by the matrix exponential e^{At}; for eigenvalues lambda of A:
- lambda real positive: unstable
- lambda real negative: stable
- lambda imaginary: sinusoidal
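The eigenvalue rules above can be checked numerically; the matrix below is an assumed stable, oscillatory example (not from the slides), with the autonomous response computed via the matrix exponential:

```python
# Autonomous response x(t) = expm(A t) x0; eigenvalues of A set the behaviour
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0,  2.0],
              [-2.0, -1.0]])     # eigenvalues -1 +/- 2j: stable and oscillatory
x0 = np.array([1.0, 0.0])

eigs = np.linalg.eigvals(A)
print(eigs)                      # real parts negative -> stable

x5 = expm(A * 5.0) @ x0          # autonomous response at t = 5
print(np.linalg.norm(x5))        # decayed to roughly exp(-5) of |x0| = 1
```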
3.2 Non-linear State Space Models
- In reality, any non-linear functions could be used for the state transition and observer functions, f() and g()
- Often the observer function is a simple binary matrix which represents which states can be measured
- Remember, missing states may still be observable
- There has been a lot of work by researchers developing and analysing new non-linear state transition functions
- Bilinear non-linear state space models are a simple/popular tool for cellular models (dimerization)
3.3 Parameter Estimation
- A key part of system identification is to assume that the model structure is given (and correct!)
- The goal is to estimate parameters whose values cannot be directly measured or derived from first-principles models
- Since one of the goals of systems biology is to better understand first-principles models, and reaction rates are difficult to estimate, parameter estimation is a frequent problem in systems biology
3.3 Model Parameters
- Let's consider the case where we have an autonomous, bilinear state space model
- This is a linear-in-the-parameters model, but non-linear in the state.
- NB: often A1 and A2 are quite sparse
- Consider the extended state
- Then the linear dependency of the model parameters on the extended state is easy to see
3.3 Optimality Criterion
- For simplicity, let's assume that all the states can be directly measured
- For parameter estimation, we have the input-output measured, sampled data {u(k), x(k)}
- Then the basic parameter estimation problem can be posed as minimizing the sum of squared residuals over q
- This is least squares parameter estimation
- There are many variations on this basic optimality criterion; a key thing is to recognise what assumptions they impose on the problem domain
[Plot: cost function f(q) against the parameters q.]
3.3 Optimization for Dynamic Systems
- The optimality criterion contains the (discrete time) state estimates, which must be obtained by integrating the current model estimate.
- Optimization iterates the parameter estimates
- We have the following scenario:
[Diagram: sampled data u(k), x(k) feed the parameter iterations; each iteration solves the model; final estimates are returned.]
- Even when dx/dt depends linearly on q, the dynamic model means that the optimality criterion is not quadratic (the states depend exponentially on the parameters)
1 Page Matlab
- [theta, fval, exitflag] = fminunc(@mmobj, [3 0.5 0.2])
- function f = mmobj(theta)
- dt = 0:0.05:10
- theta1 = [2 1 0.1]
- [t1, x1] = ode45(@(t,x) mmode(t,x,theta1), dt, [1 0.2 0 0])
- [t2, x2] = ode45(@(t,x) mmode(t,x,theta), dt, [1 0.2 0 0])
- f = sum(sum((x1-x2).^2))
- function dx = mmode(t, x, theta)
- dx = zeros(4,1)
- dx(1) = -theta(1)*x(1)*x(2) + theta(2)*x(3)
- dx(2) = -theta(1)*x(1)*x(2) + theta(2)*x(3) + theta(3)*x(3)
- dx(3) = theta(1)*x(1)*x(2) - theta(2)*x(3) - theta(3)*x(3)
- dx(4) = theta(3)*x(3)
3.3 Least Squares Assumptions
- Why is the least squares criterion so widely used?
- For statistical models, it corresponds to minimizing the variance of the measured signal with independent and identically distributed noise.
- The optimal model is the mean
- One common variation in systems biology is to assume that the noise variance differs for each state (weighted least squares)
3.3 Gradient Estimates
- Without wanting to go over all the 1st/2nd order optimization theory, let's consider the very simple gradient descent parameter estimation procedure
- We can view this as sliding down a bowl to the bottom. The choice of the step size eta determines the stability/rate of convergence.
- We can show
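The "sliding down a bowl" picture can be sketched in a few lines; the quadratic objective below is an assumed toy example, not from the slides:

```python
# Gradient descent q <- q - eta * df/dq on the toy bowl f(q) = (q - 3)^2
def f_grad(q):
    return 2.0 * (q - 3.0)

eta = 0.1        # step size: too large -> unstable, too small -> slow
q = 0.0
for _ in range(100):
    q -= eta * f_grad(q)

print(q)         # converges towards the minimum at q = 3
```

For this quadratic, each step multiplies the error (q - 3) by (1 - 2*eta), so eta must lie in (0, 1) for stability.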
3.3 Local Sensitivity
- Using the least squares cost function, the gradient can be evaluated directly
- The objective gradient depends on the model sensitivity dx/dq
- Using the model sensitivity in optimization algorithms, we're implicitly performing some form of gradient descent
- Using the normalized sensitivity implicitly normalizes the cost function
3.3 Local Sensitivity for a Linear Model
- In gradient-based optimization, obtaining the sensitivity derivatives is very important.
- The sensitivity can be calculated from an ODE, just like the state space equations
- Solving for the sensitivity states is a joint ODE problem (an extended ODE function in Matlab)
1 Page Matlab
- Pseudo/untested code
- Simulation
- [t, xS] = ode45(@mmode, [0 10], [1 0.2 0 0 zeros(1,12)])
- plot(t, xS)
- ODE representation
- function dx = mmode(t, x)
- k1 = 2; k2 = 1; k3 = 0.1
- dx = zeros(16,1)
- % States
- dx(1) = -k1*x(1)*x(2) + k2*x(3)
- dx(2) = -k1*x(1)*x(2) + k2*x(3) + k3*x(3)
- dx(3) = k1*x(1)*x(2) - k2*x(3) - k3*x(3)
- dx(4) = k3*x(3)
- % Sensitivities S = dx/dk (4x3, stored in x(5:16)): dS/dt = (df/dx)*S + df/dk
- J = [-k1*x(2), -k1*x(1), k2, 0; -k1*x(2), -k1*x(1), k2+k3, 0; k1*x(2), k1*x(1), -k2-k3, 0; 0, 0, k3, 0]
- dfdk = [-x(1)*x(2), x(3), 0; -x(1)*x(2), x(3), x(3); x(1)*x(2), -x(3), -x(3); 0, 0, x(3)]
- S = reshape(x(5:16), 4, 3)
- dS = J*S + dfdk
- dx(5:16) = dS(:)
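The joint state-plus-sensitivity integration can also be sketched in Python/SciPy, with the analytic sensitivities checked against a finite-difference estimate (the check itself is an addition, not part of the slides):

```python
# Joint state + sensitivity ODE dS/dt = (df/dx) S + df/dk for Michaelis-Menten,
# verified against a finite-difference perturbation of k1.
import numpy as np
from scipy.integrate import solve_ivp

k = np.array([2.0, 1.0, 0.1])
x0 = np.array([1.0, 0.2, 0.0, 0.0])

def f(x, k):
    k1, k2, k3 = k
    x1, x2, x3, x4 = x
    return np.array([-k1*x1*x2 + k2*x3,
                     -k1*x1*x2 + (k2 + k3)*x3,
                      k1*x1*x2 - (k2 + k3)*x3,
                      k3*x3])

def rhs(t, z, k):
    x, S = z[:4], z[4:].reshape(4, 3)
    k1, k2, k3 = k
    x1, x2, x3, _ = x
    J = np.array([[-k1*x2, -k1*x1,  k2,      0],     # df/dx
                  [-k1*x2, -k1*x1,  k2 + k3, 0],
                  [ k1*x2,  k1*x1, -k2 - k3, 0],
                  [ 0,      0,      k3,      0]])
    dfdk = np.array([[-x1*x2,  x3,  0],              # df/dk
                     [-x1*x2,  x3,  x3],
                     [ x1*x2, -x3, -x3],
                     [ 0,      0,   x3]])
    return np.concatenate([f(x, k), (J @ S + dfdk).ravel()])

z0 = np.concatenate([x0, np.zeros(12)])
sol = solve_ivp(rhs, (0, 10), z0, args=(k,), rtol=1e-9, atol=1e-11)
S_end = sol.y[4:, -1].reshape(4, 3)    # analytic dx(10)/dk

# Finite-difference check on dx/dk1
eps = 1e-5
kp = k.copy(); kp[0] += eps
x_p = solve_ivp(lambda t, x: f(x, kp), (0, 10), x0, rtol=1e-9, atol=1e-11).y[:, -1]
x_m = solve_ivp(lambda t, x: f(x, k),  (0, 10), x0, rtol=1e-9, atol=1e-11).y[:, -1]
fd = (x_p - x_m) / eps
print(np.max(np.abs(fd - S_end[:, 0])))   # small
```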
3.3 Parameter Variance/Covariance
- A key measure of identifiability, and a key link with experimental design, is the estimated parameter variance/covariance matrix
- It forms part of 2nd order optimization procedures
- It measures the amount of uncertainty and correlation in the parameter estimates
- For simple, static, linear systems it measures all the uncertainty; more generally, it's a local approximation
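For the simple static, linear case the covariance has a closed form; the design matrix and noise level below are illustrative assumptions:

```python
# For y = X q + e with i.i.d. N(0, s^2) noise, the least squares parameter
# covariance is S_q = s^2 (X^T X)^{-1} (a local approximation more generally).
import numpy as np

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])      # assumed design matrix (intercept + slope)
sigma2 = 0.5 ** 2               # assumed noise variance

cov = sigma2 * np.linalg.inv(X.T @ X)
print(cov)
# Off-diagonal terms measure the correlation between the two parameter
# estimates; experimental design tries to make this matrix "small".
```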
Example: Parameter Validation
- Consider a two-parameter model with uncertainty regions
- We can marginalise the distribution to perform single-hypothesis testing on each parameter.
- We can compare whether a new estimate is statistically significantly different from a previous estimate
- In many systems biology system identification problems there are significant problems with identifiability
3.4 Pathway Selection
- A practical implication for the modelling tasks is the emphasis on network structure over the exact values of the kinetic parameters
- A key assumption in parameter estimation is that we know the model structure:
- Non-linear: f() and g()
- Linear: the order and sparse structure of A, B and C
- This means the estimation residual corresponds to the stochastic noise term
- In practice, we rarely know the parametric model structure exactly.
- Model selection is a discrete, non-convex search through the space of all possible models in order to find one that is suitably predictive and suitably simple.
3.4 Selection and Identifiability
- In practice, even though a reaction may exist, the corresponding parameter value may be very difficult to estimate.
- This is the problem of identifiability
- The problem is badly conditioned
3.4 Null Hypothesis Test
- Locally, the estimated parameter uncertainty is Gaussian when the measurement errors are Gaussian
- To determine whether each feature/parameter is significant, we can apply the null hypothesis test.
- This essentially determines whether there is enough information in the training data set to say that the particular parameter is significantly non-zero, i.e. does the parameter distribution contain the value 0, at a 68%, 95% or 99.7% certainty
- Zero parameter -> no reaction
- This is a simple way to compare two models differing by one extra parameter
[Plot: parameter density p(q) centred at E(q), with the 68%, 95% and 99.7% certainty intervals and the value 0 marked.]
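A minimal sketch of this test on a single parameter (the estimate and its standard deviation below are illustrative values, not from the slides):

```python
# Null hypothesis test: is 0 inside the ~95% (1.96 standard deviation)
# interval of the locally Gaussian parameter estimate?
q_hat, q_std = 0.8, 0.3            # assumed estimate and standard deviation

z = q_hat / q_std                  # standardized distance of the estimate from 0
significant = abs(z) > 1.96        # outside the 95% interval?

print(z, significant)
# |z| = 2.67 > 1.96, so the parameter is significantly non-zero at 95%,
# and the corresponding reaction is retained in the model.
```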
3.4 Convex Re-formulations
- Remember, there is no "correct" parameter estimate or discrete model structure!
- Parametric model selection is a discrete combinatorial problem, and when the parameter estimation problem is badly conditioned, there are often numerous local minima
- We're looking at regularization approaches for parameter estimation and approximate sparse model selection
3.4 Example: RKIP-Regulated ERK Signalling Pathway
- 11 states
- 11 parameters
- 0.5s worth of identification data
Calculated Parameter Locus
- Estimated parameter locus (lambda = 0 -> 8)
- Many parameters are largely conditionally independent; however, a couple of parameters exhibit Simpson's paradox (they change sign when a new parameter is introduced/removed)
3. Summary
- We've really just scratched the surface of parametric modelling techniques.
- Don't reinvent the wheel; learn how to use existing libraries
- Important issues not really discussed:
- Model validation (this is more than just looking at the MSE)
- Non-convex optimization (very important for medium-large parameter estimation problems)
- Non-linear extensions to well-founded linear state space models
- Hidden state estimation/filtering
- Current work is looking at:
- Regularization for parameter estimation and sparse model selection
- Non-parametric (kernel/Gaussian process) modelling
4. Experimental Design
- The basic assumption here is that it is expensive to perform an experiment; hence the aim of experimental design is to:
- maximise the identification information while minimizing the number of experiments
- This is relative to the question being asked about the model:
- How predictive is the model?
- How well identified are the parameters?
- How does M1 compare with M2?
- If the modelling purpose is not well understood, the experimental design problem is ill-posed.
- Like parameter estimation, experimental design is an optimization problem
4. Why Do It?
- Consider a single-factor, linear, static model, where the measurements contain some additive, zero mean, i.i.d. Gaussian noise
- The inputs are constrained to a bounded range
- Where should we place 2 experiments to best identify q?
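The answer can be made concrete with a short sketch. Assuming the model y = q*u + e with noise variance s^2 and constraint |u| <= 1, the least squares estimate has var(q_hat) = s^2 / sum(u_i^2), so both measurements belong at the constraint boundary:

```python
# Variance of the least squares estimate of q in y = q*u + e, |u| <= 1
import numpy as np

sigma2 = 1.0   # assumed noise variance

def var_qhat(u):
    u = np.asarray(u, dtype=float)
    return sigma2 / np.sum(u ** 2)

print(var_qhat([0.5, 0.5]))    # 2.0  (interior placement)
print(var_qhat([1.0, -1.0]))   # 0.5  (boundary placement: 4x better)
```

Pushing the inputs to the boundary maximizes the signal-to-noise ratio, which is the essence of this design problem.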
4. Experimental Design Parameters
- Given a (linear) model, there are many experimental parameters that can be optimized:
- The initial state values x0
- Which states to observe, C
- The input/excitation signal u(k)
- The sampling time/rate, tk = kT
- These are a mix of continuous, discrete and time-varying variables which are problem dependent.
4. Experimental Design Optimality
- The basic measure of optimality is the Fisher information matrix, whose inverse forms a lower bound for the parameter variance/covariance matrix
- This produces a vector optimization problem; common matrix ("alphabet") scalarization measures are:
- A-optimality: tr(Sq) (eigenvalue sum)
- D-optimality: det(Sq) (eigenvalue product)
- E-optimality: max_j lambda_j(Sq) (largest eigenvalue)
- All give (different) estimates of the amount of parametric uncertainty resulting from the experimental data.
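The three scalarizations are one-liners once the covariance is available; the Fisher information matrix below is an assumed toy example:

```python
# "Alphabet" optimality criteria applied to S_q = F^{-1}
import numpy as np

F = np.array([[4.0, 0.0],
              [0.0, 1.0]])             # assumed Fisher information matrix
S = np.linalg.inv(F)                   # parameter covariance (lower bound)

A_opt = np.trace(S)                    # A-optimality: eigenvalue sum
D_opt = np.linalg.det(S)               # D-optimality: eigenvalue product
E_opt = np.max(np.linalg.eigvalsh(S))  # E-optimality: largest eigenvalue

print(A_opt, D_opt, E_opt)             # 1.25 0.25 1.0
```

A design is chosen to minimize one of these scalars; they generally disagree, since each weights the uncertainty ellipsoid's axes differently.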
4. Dynamic Experimental Design
- The basic problem with applying experimental design to dynamic models is that the sensitivity derivative depends on the current parameter estimates
- Sequential experimental design:
- Perform a new experimental design Dk after each parameter estimate
- Robust experimental design:
- Calculate an experimental design which is valid for a range of parameter values
[Diagram: loop between sensitivity calculation, experimental design and estimation.]
4. Example: Observer Selection
- For the NF-κB model/experimental set-up, there exists a constraint that only 4 states (out of 20) can be directly measured
- This is a discrete optimization problem, and is often relaxed to a continuous one
- Note that this assumes all observations can be made in order to select which are the most informative! An iterative, sampled approach may be more realistic if there are measurement constraints.
4. Example: Robust Experimental Design
- Produce a design which minimizes the uncertainty in a space around the nominal model estimate.
- Because of the uncertainty in the parameter estimates, we have an associated bounded uncertainty in the sensitivity estimates
- Like the regularization approach to sparse model selection, this uses a convex re-formulation of the problem, and we can exploit a lot of semi-definite programming theory.
4. Example: Robust Experimental Design
- Increasing the uncertainty (p large) means that all observations are equally valid, but when the uncertainty is small, 4 observations are weighted more heavily than the others.
4. Conclusions
- Experimental design is a bran-flakes-type technology: not necessarily very exciting, but very useful.
- There is a variety of ways of posing the questions
- It is better to do some analysis, even approximate, than none at all
- System identification and experimental design are mature techniques and have a lot to offer systems biology researchers.
- However, they do not pretend to be a complete theory, and problems in systems biology offer challenges in terms of scalability and robustness
- Thanks to Fei He, George Papadopoulos, Liang Sun, Hong Yue, Doug Kell, Steve Wilkinson