Title: Stochastic Estimation of Fluxes in Metabolic Networks
1 Stochastic Estimation of Fluxes in Metabolic Networks
- Visakan Kadirkamanathan
- Signal Processing and Complex Systems Research Group
- Department of Automatic Control Systems Engineering
- The University of Sheffield, United Kingdom
2 Overview
- Background to metabolic systems
- Metabolic Flux Analysis (MFA)
  - MFA based on stoichiometry
  - MFA based on 13C tracer experiments
- Steady-state metabolic systems analysis
  - Least Squares Estimation
  - Expectation-Conditional-Maximisation
  - Markov Chain Monte Carlo
- Dynamic metabolic systems analysis
  - Particle filtering based flux estimation
3 Systems Biology
- Systems biology [1]: the study of the interactions between the components of a biological system, and how these interactions give rise to the function and behaviour of that system.
[1] Alberghina L. and Westerhoff H.V. (Eds.) (2005). "From isolation to integration, a systems biology approach for building the Silicon Cell". Systems Biology: Definitions and Perspectives, Springer-Verlag.
4 Metabolic Systems
- Metabolic System
  - The metabolic system in a cell governs the intracellular chemical and physical changes that enable growth and function of the cell
- Metabolic Engineering
  - Improvement of cellular activities by manipulation of the enzymatic, transport and regulatory functions of the cell
- Metabolic Pathways
  - Sequence of feasible and observable biochemical reaction steps connecting a specified set of input and output metabolites
- Metabolic Fluxes
  - The rate at which input metabolites are processed to form output metabolites
5 Metabolic Network Map of E. coli [2]
[2] E. Almaas, B. Kovács, T. Vicsek, Z. N. Oltvai and A.-L. Barabási, Global organization of metabolic fluxes in the bacterium Escherichia coli, Nature 427, 839-843, 2004.
6 Metabolic Network Analysis
- Qualitative and Quantitative Analysis
7 Metabolic Flux Analysis
- Goal: to calculate and analyse the steady-state conversion rates (fluxes)
- Steady state: from the law of conservation of mass, the rate of consumption is equal to the rate of production for each metabolite
8 Stoichiometric Equation
- The linear system of equations (the stoichiometric equation) can be solved for the metabolic fluxes. With m metabolites and n fluxes:
  - m > n: overdetermined system
  - m = n: determined system
  - m < n: underdetermined system
- The number of fluxes is usually greater than the number of metabolites.
- To obtain a unique solution
  - Constraints must be included and/or
  - Additional independent measurements must be made
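The counting argument above can be checked numerically. The sketch below assumes the stoichiometric equation takes the common steady-state form S v = 0, with S an m-by-n stoichiometric matrix; the symbol S and the five-reaction toy network are illustrative assumptions, not taken from the slides.

```python
import numpy as np

# Hypothetical toy network (an assumption for illustration only):
#   v1: -> A,  v2: A -> B,  v3: A -> C,  v4: B -> ,  v5: C ->
# Rows are metabolites, columns are fluxes v1..v5.
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],  # metabolite A
    [0.0,  1.0,  0.0, -1.0,  0.0],  # metabolite B
    [0.0,  0.0,  1.0,  0.0, -1.0],  # metabolite C
])

m, n = S.shape                       # m metabolites, n fluxes
rank = np.linalg.matrix_rank(S)
degrees_of_freedom = n - rank        # fluxes left undetermined by S v = 0

# Basis of all steady-state flux vectors: right singular vectors of S
# belonging to zero singular values.
_, _, vt = np.linalg.svd(S)
null_basis = vt[rank:].T             # shape (n, n - rank)

print(f"m = {m}, n = {n}, rank = {rank}, free fluxes = {degrees_of_freedom}")
```

Here m < n, so the balances alone leave two fluxes free; fixing them requires extra constraints or independent measurements, as the slide states.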
9 Flux Balance Analysis
- Flux balance analysis [3]: the flux distribution is obtained by calculating the solution that optimises an objective function while fulfilling the constraints imposed by the metabolite balances.
such that
and
- Flux estimates are obtained through linear programming
- Objective function aligned to the cell's metabolic objective
  - maximisation of biomass (ATP and NADH)
  - minimising total flow of metabolites
- Method suitable only for identifying general tendencies and inter-relations
[3] A. Varma and B. O. Palsson, Metabolic flux balancing: basic concepts, scientific and practical use, Bio/Technology, vol. 12, pp. 994-998, 1994.
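A flux balance problem of this kind can be posed as a small linear programme. The sketch below is a minimal illustration under stated assumptions: the balances are written as S v = 0, every flux is bounded to [0, 10], and the objective is to maximise the output flux v4. The network, the bounds and the objective are hypothetical stand-ins; only `scipy.optimize.linprog` is a real library call.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (illustration only):
#   v1: -> A,  v2: A -> B,  v3: A -> C,  v4: B -> ,  v5: C ->
S = np.array([
    [1.0, -1.0, -1.0,  0.0,  0.0],  # metabolite A
    [0.0,  1.0,  0.0, -1.0,  0.0],  # metabolite B
    [0.0,  0.0,  1.0,  0.0, -1.0],  # metabolite C
])

# linprog minimises c @ v, so maximising v4 means minimising -v4.
c = np.zeros(5)
c[3] = -1.0

# Assumed capacity constraints: 0 <= v_i <= 10 for every flux.
bounds = [(0.0, 10.0)] * 5

res = linprog(c, A_eq=S, b_eq=np.zeros(3), bounds=bounds)
fluxes = res.x

print("optimal fluxes:", np.round(fluxes, 6))
print("maximised v4:", -res.fun)
```

The optimiser routes all uptake through the branch that feeds the objective, which illustrates why such solutions indicate tendencies of the network rather than measured flux values.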
10 13C Tracer-based Flux Analysis
- MFA based on 13C tracer experiments [4]: utilise labelled substrates (usually 13C, measured by nuclear magnetic resonance or gas chromatography/mass spectrometry) to obtain the carbon mass balance constraints of the system, so as to form an overdetermined system.
- The 13C labelling patterns lead to different metabolite balancing representations of varying complexity
  - Isotopomer balancing: 2^n variables per metabolite with n carbon atoms
  - Positional enrichment balancing: n variables per metabolite with n carbon atoms
- The trade-off for simplicity in models with positional enrichment balancing is the loss of detail in the estimated model.
[4] C. Zupke and G. Stephanopoulos, Modeling of isotope distributions and intracellular fluxes in metabolic networks using atom mapping matrices, Biotechnol. Prog., vol. 10, no. 5, pp. 489-498, 1994.
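The difference in model size between the two balancing schemes can be made concrete. The sketch below enumerates the labelling patterns of a metabolite with n carbon atoms and collapses a set of isotopomer fractions into per-position enrichments; the function names and the example fractions are hypothetical, chosen only for illustration.

```python
from itertools import product

def isotopomers(n_carbons):
    """All labelling patterns of a metabolite: '1' = 13C, '0' = 12C."""
    return ["".join(bits) for bits in product("01", repeat=n_carbons)]

def positional_enrichments(isotopomer_fractions, n_carbons):
    """Collapse isotopomer fractions into the 13C fraction at each position."""
    enrichment = [0.0] * n_carbons
    for pattern, fraction in isotopomer_fractions.items():
        for position, bit in enumerate(pattern):
            if bit == "1":
                enrichment[position] += fraction
    return enrichment

# A three-carbon metabolite has 2**3 = 8 isotopomers but only 3 enrichments.
patterns = isotopomers(3)

# Illustrative (made-up) isotopomer distribution.
fractions = {"000": 0.5, "100": 0.25, "111": 0.25}
enrichments = positional_enrichments(fractions, 3)

print(len(patterns), enrichments)
```

Many different isotopomer distributions map to the same enrichment vector, which is the loss of detail mentioned on the slide.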
11 MFA based on 13C Tracer Experiment
or
where x is the metabolite labelling information,
, ,
and l is the number of carbon atoms
of all the intracellular metabolites.
12 Metabolic Flux Estimation
Assumption: The metabolic network structure is known. The fractional labelling of metabolites is measured.
Problem: Estimate the flux v given the stoichiometry and carbon mass balance relations. The unknown fluxes can be solved by linear least squares techniques [5].
- Ordinary Least Squares (LS): if b is subject to model error or residual e, i.e.
then
- Constrained Least Squares (CLS): consider only non-negative solutions, i.e.
or
[5] C. L. Lawson and R. J. Hanson, Solving Least Squares Problems. Prentice-Hall, Englewood Cliffs, New Jersey, 1995.
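As a minimal sketch of the ordinary least squares step, assume the stoichiometric and carbon balance relations have been stacked into a linear system A v = b + e with a known matrix A. The matrix, the "true" fluxes and the residual vector below are made-up numbers, not data from the talk; the non-negativity constraint of CLS is not shown.

```python
import numpy as np

# Hypothetical overdetermined system: 6 balance equations, 3 unknown fluxes.
A = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
])
v_true = np.array([2.0, 0.5, 1.0])              # illustrative fluxes
b_clean = A @ v_true
e = np.array([0.05, -0.02, 0.03, -0.04, 0.01, 0.02])  # fixed residual
b = b_clean + e

def ordinary_least_squares(A, b):
    """Minimise ||A v - b||_2 over v."""
    v_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v_hat

v_ls = ordinary_least_squares(A, b)

# Same estimate from the normal equations (A^T A) v = A^T b.
v_normal = np.linalg.solve(A.T @ A, A.T @ b)

print("LS estimate:", np.round(v_ls, 4))
```

`lstsq` is generally preferred over forming the normal equations explicitly because it is better conditioned when A is close to rank deficient.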
13 Total Least Squares Estimation
- Measurement Noise
  - Metabolite fractional labelling data is noisy, so the ordinary least squares and constrained least squares formulations are inappropriate
- Total Least Squares (TLS): if A(x) and b are both subject to model error/measurement noise, i.e.
then
- Robustness to Noise
  - The solution is essentially a bias-compensated least squares estimate
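The bias-compensation remark can be illustrated with the classical SVD construction of the TLS solution. This is a sketch of the textbook method, assuming errors in both A and b, and not necessarily the exact variant used in the talk; the matrices and perturbations are made-up numbers.

```python
import numpy as np

def total_least_squares(A, b):
    """Classical TLS: use the right singular vector of [A | b] that
    belongs to the smallest singular value."""
    n = A.shape[1]
    _, _, vt = np.linalg.svd(np.column_stack([A, b]))
    z = vt[-1]
    return -z[:n] / z[n]

def bias_compensated_ls(A, b):
    """Equivalent form: (A^T A - sigma^2 I) v = A^T b, where sigma is the
    smallest singular value of [A | b]."""
    n = A.shape[1]
    sigma = np.linalg.svd(np.column_stack([A, b]), compute_uv=False)[-1]
    return np.linalg.solve(A.T @ A - sigma**2 * np.eye(n), A.T @ b)

# Hypothetical noise-free system and fixed small perturbations.
A_clean = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
                    [1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])
v_true = np.array([2.0, 0.5, 1.0])
b_clean = A_clean @ v_true
E = 0.01 * np.array([[1, -1, 0], [0, 1, -1], [-1, 0, 1],
                     [1, 0, -1], [0, -1, 1], [-1, 1, 0]], dtype=float)
A_noisy = A_clean + E
b_noisy = b_clean + np.array([0.05, -0.02, 0.03, -0.04, 0.01, 0.02])

v_tls = total_least_squares(A_noisy, b_noisy)
v_bc = bias_compensated_ls(A_noisy, b_noisy)
print("TLS estimate:", np.round(v_tls, 4))
```

The second function shows the sense in which TLS is a least squares estimate with the noise-induced bias in A^T A subtracted.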
14 Cyclic Pentose Phosphate Pathway and its 13C Enrichment Balance
15 Least Squares Flux Estimates
Means of the flux estimates using the LS, TLS, CLS1 and CLS2 algorithms with 5% measurement errors.
16 Linear Least Squares Performance
- NMSE performance of flux estimation using LS, TLS, CLS1 and CLS2. Measurement errors (5-20%) are added to the true labelling data.
17 Nonlinear Least Squares Estimation
Alternative approach to accommodate noise and incomplete labelling data
Stoichiometry
Carbon Mass Balance
Measurement equation
is the measurement noise
The above problem can be solved iteratively by constrained optimisation techniques.
18 Incomplete Data and Noise
System Equations
Measurement Equation: The incomplete labelling data can be represented by the following measurement equation
where is the measurement error and x is the vector of labelling data, which can be written in terms of the flux vector v as
where is the modelling error
Estimate the unknown fluxes from the available labelling data, y
19 Maximum Likelihood Estimation
Maximum Likelihood (ML) Flux Estimation
In ML estimation, the unknown parameters θ are estimated by maximising the likelihood function, i.e. the pdf of the observed data y for given θ, over the parameter space Θ.
The ML flux identification problem is a complicated nonlinear optimisation problem in several variables, and the estimation becomes hard, with high computational effort, especially when the number of unmeasured labelling data is high.
20 Expectation Maximisation
- The EM algorithm [6] simplifies the direct ML estimation.
Initial guess of θ
E-step: compute the expectation of the complete-data log-likelihood with respect to the conditional distribution of the unobserved data, given y and the current estimate of θ
[6] G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, John Wiley and Sons, New York, 1997.
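In standard notation the two EM steps can be written as below. This is the textbook form, given as a sketch: here x stands for the labelling data that are not fully observed, y for the measurements, and θ^k for the parameter estimate after k iterations; the exact symbols used in the talk are an assumption.

```latex
\begin{align}
  \text{E-step:}\quad
  Q(\theta \mid \theta^{k})
    &= \mathbb{E}\!\left[\, \log p(x, y \mid \theta) \;\middle|\; y, \theta^{k} \right] \\
  \text{M-step:}\quad
  \theta^{k+1}
    &= \arg\max_{\theta \in \Theta} \; Q(\theta \mid \theta^{k})
\end{align}
```

Each iteration does not decrease the observed-data likelihood p(y | θ), which is what makes the iteration a usable substitute for direct ML optimisation.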
21 Expectation Conditional Maximisation
- Generalised EM (GEM): in the M-step, the new estimate of θ is chosen such that the Q-function increases, rather than being maximised.
- Expectation/Conditional Maximisation (ECM) [7] is a class of GEM which replaces a complicated M-step of EM with N computationally simpler CM-steps.
Concept: In the next iteration, a set of appropriate values of the unknown fluxes and associated variances are calculated such that
where θ^(k+n/N) denotes the value of θ on the nth CM-step of the (k+1)th iteration, with N the number of CM-steps of the ECM algorithm.
[7] X. L. Meng and D. B. Rubin, Maximum likelihood estimation via the ECM algorithm: a general framework. Biometrika, 80:267-278, 1993.
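The property that makes ECM a generalised EM algorithm can be stated compactly. The inequality below is the standard one for ECM, written with Q(θ | θ^k) denoting the expected complete-data log-likelihood given the current estimate θ^k; it is offered as a sketch of the general framework rather than a transcription of the slide.

```latex
\begin{align}
  Q\!\left(\theta^{k+n/N} \mid \theta^{k}\right)
    &\;\ge\; Q\!\left(\theta^{k+(n-1)/N} \mid \theta^{k}\right),
    \qquad n = 1, \dots, N, \\
  \theta^{k+1} &\equiv \theta^{k+N/N}
\end{align}
```

Each CM-step maximises Q over a subset of the parameters while the rest are held fixed, so Q cannot decrease from one step to the next.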
22 Metabolic Flux Estimation by ECM
- ECM algorithm applied to metabolic flux estimation
Given current
CM1
- Calculate the conditional mean of p(x | y, θ^k)
- Form A(x)^k by treating as the complete data.
- Calculate v^(k+1) using a linear LS-based technique.
CM2
- Calculate which maximises
CM3
- Calculate which maximises
Stop if or is sufficiently small
23 Central metabolism of Corynebacterium glutamicum
24 ECM Estimation Results
Assumption: the labelling data of the precursors which can be derived from the amino acids (GAP, PYR, CO2, P5P, E4P, OAA, AKG), together with G6P and F6P, are measured, resulting in around 30% missing data.
25 ECM Algorithm Efficiency
26 Incorporating Noise
[Diagram labels: DNA, RNA, Metabolites, Proteins, Cells, Noise]
Assume that the noise has not influenced the stoichiometric structure of the system
Bayesian analysis is used to obtain the posterior distribution of the unknown fluxes
27 Bayesian Approach
From Bayesian analysis
From
From
From
However, it is still difficult to obtain an analytical form for the posterior distribution of v due to the integration involved, except for some special noise distributions.
28 Markov Chain Monte Carlo Method
Directed acyclic graph representation of the model [9]
Nodes: v, x, y
The full conditional distribution of v
The full conditional distribution of x
Use Gibbs sampling [10] to obtain MCMC samples from the posterior distribution of v
[9] Gilks, W. R., Richardson, S. and Spiegelhalter, D. J. Markov Chain Monte Carlo in Practice. Chapman & Hall/CRC, Boca Raton, Florida, 1996.
[10] Geman, S. and Geman, D. Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell., 6, 721-741, 1984.
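The alternation between the two full conditionals can be sketched on a deliberately simple stand-in model. The scalar linear-Gaussian chain below (prior v ~ N(0, tau^2), x | v ~ N(v, sigma_x^2), y | x ~ N(x, sigma_y^2)) is an assumption made for illustration; the flux model of the talk is multivariate and nonlinear in general. The stand-in is chosen because its posterior mean is known in closed form, so the sampler can be checked.

```python
import random

def gibbs_flux_posterior(y, n_samples=20000, burn_in=1000,
                         tau=2.0, sigma_x=1.0, sigma_y=1.0, seed=0):
    """Gibbs sampler for the toy chain v -> x -> y (all scalar Gaussians)."""
    rng = random.Random(seed)
    prec_v = 1.0 / tau**2 + 1.0 / sigma_x**2       # precision of p(v | x)
    prec_x = 1.0 / sigma_x**2 + 1.0 / sigma_y**2   # precision of p(x | v, y)
    v, x = 0.0, y                                  # arbitrary starting point
    samples = []
    for iteration in range(burn_in + n_samples):
        # Draw v from its full conditional p(v | x).
        v = rng.gauss((x / sigma_x**2) / prec_v, prec_v ** -0.5)
        # Draw x from its full conditional p(x | v, y).
        mean_x = (v / sigma_x**2 + y / sigma_y**2) / prec_x
        x = rng.gauss(mean_x, prec_x ** -0.5)
        if iteration >= burn_in:
            samples.append(v)
    return samples

y_obs = 3.0
samples = gibbs_flux_posterior(y_obs)
posterior_mean = sum(samples) / len(samples)

# Closed-form posterior mean for this toy model: tau^2 y / (tau^2 + sx^2 + sy^2).
analytic_mean = 4.0 * y_obs / (4.0 + 1.0 + 1.0)
print(round(posterior_mean, 3), analytic_mean)
```

The retained samples of v approximate the whole posterior distribution, so spread and credible intervals can be read off as well as the mean.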
29 Central Metabolism of Corynebacterium glutamicum
30 Flux Distribution Results
31 Dynamic Metabolic System Analysis
- The intracellular fluxes and metabolite concentrations vary with time: dynamic analysis
- Rapid sampling and fast quenching [11]
  - The cells are stimulated by a quick substrate pulse, and the sample volumes taken afterwards are withdrawn by rapid sampling and their metabolism is stopped by fast quenching. The sample volumes are then separated into extracellular and intracellular samples by centrifugation and extraction steps. A combination of enzymatic assays, HPLC and ESI-LC-MS can be used to quantify the concentration data.
- Drawbacks of rapid sampling experiment
  - Limit of detection (concentration has to be over 1 mM)
  - Intracellular volume is often less than 3% of the total sample volume
[11] A. Buchholz, J. Hurlebaus, C. Wandrey, and R. Takors, Metabolomics: Quantification of intracellular metabolite dynamics, Biomol. Eng., vol. 19, pp. 5-15, 2002.
32 Intracellular Metabolite Estimation
- Time-series extracellular concentration measurements provide the known information from the system
- Available Michaelis-Menten kinetics for the reactions provide the transfer function of the system
- The quantification problem is then transformed into a system state estimation problem
Measured input
Measured output
Known structure with unknown states
33 Dynamic System State-Space Model
For a metabolite X,
For a flux v between X1 and X2, from Michaelis-Menten kinetics,
The measurement equation
noise
State-space model
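A minimal sketch of such a state-space model, assuming a single reaction X1 -> X2 with the usual Michaelis-Menten rate v = Vmax X1 / (Km + X1) and a forward Euler discretisation of the mass balances. The parameter values, the step size and the function names are hypothetical.

```python
def michaelis_menten_flux(x, v_max, k_m):
    """Reaction rate for substrate concentration x."""
    return v_max * x / (k_m + x)

def euler_step(state, dt, v_max, k_m):
    """One discrete-time state update for the reaction X1 -> X2."""
    x1, x2 = state
    v = michaelis_menten_flux(x1, v_max, k_m)
    # Mass balance: what leaves X1 arrives in X2.
    return (x1 - dt * v, x2 + dt * v)

def simulate(initial_state, n_steps, dt=0.01, v_max=1.0, k_m=0.5):
    """Propagate the state-space model without noise."""
    trajectory = [initial_state]
    for _ in range(n_steps):
        trajectory.append(euler_step(trajectory[-1], dt, v_max, k_m))
    return trajectory

trajectory = simulate((2.0, 0.0), n_steps=500)
x1_end, x2_end = trajectory[-1]
print(round(x1_end, 4), round(x2_end, 4))
```

Adding process noise to the update and measurement noise to the observed concentrations turns this deterministic recursion into the stochastic state-space model that a filter operates on.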
34 Sequential Monte Carlo Filter [12]
The target distribution
[12] A. Doucet, N. de Freitas, and N. Gordon, Eds., Sequential Monte Carlo Methods in Practice. Springer-Verlag, 2001.
35 Sequential Importance Sampling
For samples z_i from the distribution p(z)
Importance sampling
where z_i is the ith sample from a proposal distribution q(z)
Assume proposal distribution
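The idea of importance sampling can be sketched in a few lines. The example assumes a standard normal target p(z), a wider normal proposal q(z) with standard deviation 2, and a self-normalised estimator; all of these are illustrative choices, not the distributions used in the talk.

```python
import math
import random

def normal_pdf(z, sigma):
    """Density of a zero-mean normal with standard deviation sigma."""
    return math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def importance_estimate(f, n_samples=50000, sigma_q=2.0, seed=0):
    """Estimate E_p[f(z)] for p = N(0, 1) using samples from q = N(0, sigma_q^2)."""
    rng = random.Random(seed)
    zs = [rng.gauss(0.0, sigma_q) for _ in range(n_samples)]
    # Importance weights w_i = p(z_i) / q(z_i).
    ws = [normal_pdf(z, 1.0) / normal_pdf(z, sigma_q) for z in zs]
    total = sum(ws)
    # Self-normalised estimate: weighted average of f over the samples.
    return sum(w * f(z) for w, z in zip(ws, zs)) / total

second_moment = importance_estimate(lambda z: z * z)
print(round(second_moment, 3))  # close to 1, the variance of N(0, 1)
```

The sequential version applies the same reweighting recursively in time, updating each weight with the newest measurement instead of recomputing it from scratch.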
36 Particle Filter
Generate N random samples
For each time
Generate new samples from
Update weights
Resample if effective sample size below threshold
Estimate
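The loop above can be sketched as a bootstrap particle filter. The example assumes the proposal is the state transition density and uses a scalar linear-Gaussian model (x_k = a x_{k-1} + w_k, y_k = x_k + e_k) in place of the metabolic model; the model, its parameters and the N/2 resampling threshold are hypothetical choices.

```python
import math
import random

def simulate_data(n_steps, a, q_std, r_std, rng):
    """Generate hidden states and noisy measurements from the toy model."""
    x, states, measurements = 0.0, [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, q_std)
        states.append(x)
        measurements.append(x + rng.gauss(0.0, r_std))
    return states, measurements

def bootstrap_particle_filter(measurements, n_particles=500,
                              a=0.9, q_std=0.1, r_std=0.5, seed=1):
    rng = random.Random(seed)
    # Generate N random samples from an assumed initial distribution.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    estimates = []
    for y in measurements:
        # Generate new samples from the transition density (the proposal).
        particles = [a * p + rng.gauss(0.0, q_std) for p in particles]
        # Update weights with the measurement likelihood, then normalise.
        weights = [w * math.exp(-0.5 * ((y - p) / r_std) ** 2)
                   for w, p in zip(weights, particles)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Estimate: weighted mean of the particles.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample if the effective sample size falls below N / 2.
        ess = 1.0 / sum(w * w for w in weights)
        if ess < n_particles / 2:
            particles = rng.choices(particles, weights=weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles
    return estimates

def rmse(estimates, truth):
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))

states, measurements = simulate_data(100, 0.9, 0.1, 0.5, random.Random(0))
estimates = bootstrap_particle_filter(measurements)
print(round(rmse(estimates, states), 3), round(rmse(measurements, states), 3))
```

Multinomial resampling is used here for brevity; systematic or stratified resampling are common lower-variance alternatives.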
37 Simulated Measurements
- Set the initial condition of all metabolites and keep the input glucose concentration at a fixed amount
- In steady state, increase the concentration of glucose to a larger amount and keep all the other metabolite concentrations unchanged (in order to simulate the pulse experiment during rapid sampling).
- Add Gaussian noise to the recorded data to get simulated measurement data.
38 The Simulated Measurements
Input: glucose, initial concentration 20 mM, then a jump to 40 mM during the pulse experiment
Output: metabolites ethanol and glycerol, initial concentration 0.0 mM
39 SMC Estimation Results
40 Summary
- Metabolic flux estimation is commonly viewed as a least squares estimation problem
- Missing intracellular metabolite measurements are handled by a maximum likelihood formulation solved by EM
- Incorporating limited uncertainties via a Bayesian approach using MCMC allows for a distribution of flux estimates
- A sequential Monte Carlo filter approach is used to estimate intracellular metabolite quantities using enzyme kinetic models
41 Acknowledgements
- Dr. Jing Yang (PhD Student, Oct 2003-Dec 2006)
- Dr. Sarawan Wongsa (PhD Student, Oct 2002-Jan 2007)
- Professor Steve Billings (Signal Processing & Complex Systems)
- Professor Phillip Wright (Systems Biology)
- Professor Mike Williamson (Biochemistry & Biotechnology)