1
GLOBAL SENSITIVITY ANALYSIS BY RANDOM SAMPLING -
HIGH DIMENSIONAL MODEL REPRESENTATION (RS-HDMR)
Herschel Rabitz
Department of Chemistry, Princeton University,
Princeton, New Jersey 08544
2
HDMR Methodology
  • HDMR expresses a system output as a hierarchical
    correlated function expansion of inputs
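
For reference, the expansion is usually written in the following form. This is the standard statement of HDMR for n inputs x = (x1, ..., xn); the notation is assumed here rather than taken from the slide.

```latex
f(\mathbf{x}) = f_0
  + \sum_{i=1}^{n} f_i(x_i)
  + \sum_{1 \le i < j \le n} f_{ij}(x_i, x_j)
  + \cdots
  + f_{12 \ldots n}(x_1, x_2, \ldots, x_n)
```

In this form f_0 is a constant, each f_i(x_i) is the independent contribution of one input, and each f_ij(x_i, x_j) is a pairwise cooperative contribution; in practice the expansion is truncated at low order.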

3
HDMR Methodology (Contd.)
  • HDMR component functions are optimally defined
    as
  • where
    are unconditional and conditional probability
    density functions

4
RS (Random Sampling) HDMR (Contd.)
  • RS-HDMR component functions are approximated by
    expansions of orthonormal polynomials
  • Inputs can be sampled independently and/or in a
    correlated fashion
  • Only one set of data is needed to determine all
    of the component functions
  • Statistical analysis (F-test) is used for proper
    truncation of the RS-HDMR expansion
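
A minimal sketch of how first-order component functions might be estimated from a single random sample, assuming inputs scaled to a uniform distribution on [0, 1] and only two orthonormal polynomials per input. The function names, the test model y = x1 + 2*x2, and the sample size are illustrative assumptions; the F-test truncation step is not shown.

```python
import numpy as np

def phi1(x):
    """First orthonormal polynomial for a uniform weight on [0, 1]."""
    return np.sqrt(3.0) * (2.0 * x - 1.0)

def phi2(x):
    """Second orthonormal polynomial for a uniform weight on [0, 1]."""
    return 6.0 * np.sqrt(5.0) * (x**2 - x + 1.0 / 6.0)

def fit_first_order(x, y):
    """Estimate f0 and the first-order expansion coefficients by Monte Carlo averages.

    x: (N, n) inputs sampled uniformly on [0, 1]; y: (N,) model outputs.
    Returns f0 and alpha with shape (n, 2), where alpha[i, r] multiplies phi_{r+1}(x_i).
    """
    f0 = y.mean()
    # Centering y lowers the sampling noise; the expectation is unchanged
    # because each polynomial has zero mean under the uniform weight.
    resid = y - f0
    alpha = np.array(
        [[np.mean(resid * phi(x[:, i])) for phi in (phi1, phi2)]
         for i in range(x.shape[1])]
    )
    return f0, alpha

def first_order_component(xi, alpha_i):
    """Evaluate the approximated component function f_i at the points xi."""
    return alpha_i[0] * phi1(xi) + alpha_i[1] * phi2(xi)

# Hypothetical test model: one set of samples determines every component function.
rng = np.random.default_rng(0)
x = rng.random((20000, 2))
y = x[:, 0] + 2.0 * x[:, 1]
f0, alpha = fit_first_order(x, y)
```

For this linear test model the first coefficient of each input should approach sqrt(3)/6 times its slope, and the second should approach zero.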

5
Global Sensitivity Analysis by RS-HDMR
  • Individual RS-HDMR component functions have a
    direct statistical correlation interpretation,
    which permits the model output variance to be
    decomposed into its input contributions
  • Where are defined as the
    covariances of
  • with f(x),
    respectively
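
The covariance interpretation can be sketched as follows, assuming each sensitivity index is the covariance of a component function with f(x) divided by the total output variance. The normalization, the least-squares fit (used here in place of Monte Carlo averaging), and the test model are all assumptions made for illustration.

```python
import numpy as np

SQRT3, SQRT5 = np.sqrt(3.0), np.sqrt(5.0)

def basis(xi):
    """Two orthonormal polynomials of one input (uniform weight on [0, 1])."""
    return np.column_stack(
        [SQRT3 * (2.0 * xi - 1.0), 6.0 * SQRT5 * (xi**2 - xi + 1.0 / 6.0)]
    )

def covariance(a, b):
    """Sample covariance of two equally long arrays."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))

def first_order_indexes(x, y):
    """Return Cov(f_i, f) / Var(f) for every input, with f_i fitted by least squares."""
    blocks = [basis(x[:, i]) for i in range(x.shape[1])]
    design = np.column_stack([np.ones(len(y))] + blocks)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    var_y = covariance(y, y)
    indexes = []
    for i, block in enumerate(blocks):
        f_i = block @ coef[1 + 2 * i : 3 + 2 * i]  # fitted component function of input i
        indexes.append(covariance(f_i, y) / var_y)
    return np.array(indexes)

# Hypothetical test model with independent uniform inputs.
rng = np.random.default_rng(1)
x = rng.random((5000, 2))
y = x[:, 0] + 2.0 * x[:, 1]
S = first_order_indexes(x, y)
```

For this additive model the two indexes should be close to 0.2 and 0.8 and should sum to one, since the first-order terms account for all of the output variance.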

6
A Propellant Ignition Model
Calculated profiles of temperature and major mole
fractions for the ignition and combustion of the
M10 solid propellant
7
A Propellant Ignition Model
  • 10 independent and 44 cooperative contributions
    of inputs were identified as significant

8
A Propellant Ignition Model
  • Nonlinear global sensitivity indexes efficiently
    identified all significant contributions of inputs

9
Trichloroethylene (TCE) Microenvironmental/Pharmacokinetic Modeling
Microenvironmental/exposure/dose modeling system
Structure of TCE-PBPK model (adapted from Fisher
et al., 1998)
10
Example: Trichloroethylene (TCE)
Microenvironmental/Pharmacokinetic Modeling
  • The coupled microenvironmental/pharmacokinetic
    model
  • Three exposure routes (inhalation, ingestion, and
    dermal absorption)
  • Release of TCE from water into the air within the
    residence
  • Activities of individuals and physiological
    uptake processes
  • Seven input variables, age (x1), tap water
    concentration (x2), shower stall volume (x3),
    drinking water consumption rate (x4), shower flow
    rate (x5), shower time (x6), and time in bathroom
    after shower (x7), are used to construct the
    RS-HDMR orthonormal polynomials
  • Target outputs: the total internal doses from
    intake (inhalation and ingestion) and uptake
    (dermal absorption)
  • The amount inhaled or ingested
  • The amount absorbed
  • C(t): exposure concentration; IR(t): inhalation
    or ingestion rate; Kp: permeability coefficient;
    SA(t): surface area exposed
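
A minimal numerical sketch of the two dose quantities, assuming they take the usual time-integral forms: the integral of C(t) IR(t) for the amount inhaled or ingested, and the integral of Kp C(t) SA(t) for the amount absorbed. These forms, the function names, and the constant profiles below are assumptions for illustration, not values from the model.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal-rule integral of samples y over the time grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def intake_dose(t, conc, intake_rate):
    """Assumed form of the amount inhaled or ingested: integral of C(t) * IR(t) dt."""
    return trapezoid(conc * intake_rate, t)

def uptake_dose(t, conc, kp, area):
    """Assumed form of the amount absorbed: integral of Kp * C(t) * SA(t) dt."""
    return trapezoid(kp * conc * area, t)

# Hypothetical constant profiles in arbitrary units.
t = np.linspace(0.0, 10.0, 101)
conc = np.full_like(t, 2.0)   # C(t)
rate = np.full_like(t, 3.0)   # IR(t)
area = np.full_like(t, 4.0)   # SA(t)
inhaled = intake_dose(t, conc, rate)
absorbed = uptake_dose(t, conc, 0.5, area)
```

With constant profiles the integrals reduce to products, which makes the sketch easy to check by hand.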

11
Trichloroethylene (TCE) Microenvironmental/Pharmacokinetic Modeling
  • Inputs (x1, x2, x3, x4) have a uniform
    distribution, and inputs (x5, x6, x7) have a
    triangular distribution; 10,000 input-output data
    points were generated
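
The sampling step might look like the sketch below, assuming every input is scaled to [0, 1] and each triangular input peaks at 0.5. The bounds, the mode, and the seed are placeholders; the actual ranges of the seven inputs are not given here.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 10_000

# x1..x4: uniform inputs (placeholder range [0, 1]).
uniform_part = rng.uniform(0.0, 1.0, size=(n_samples, 4))

# x5..x7: triangular inputs (placeholder left, mode, right = 0, 0.5, 1).
triangular_part = rng.triangular(0.0, 0.5, 1.0, size=(n_samples, 3))

# One row per input vector; the model would be run once per row to
# produce the matching output value.
X = np.column_stack([uniform_part, triangular_part])
```

Each row of X is one random input vector, so a single pass of the model over X yields the one data set from which all component functions are determined.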

The data distributions for the uniformly
distributed variable x1 and the triangularly
distributed variable x5
12
Trichloroethylene (TCE) Microenvironmental/Pharmacokinetic Modeling
  • Seven independent, fifteen 2nd order and one 3rd
    order cooperative contributions of inputs were
    identified as significant

First order sensitivity indexes
13
Trichloroethylene (TCE) Microenvironmental/Pharmacokinetic Modeling
  • Nonlinear global sensitivity indexes (2nd order
    and above) efficiently identified all significant
    contributions of inputs

The ten largest 2nd and 3rd order sensitivity
indexes
14
Identification of bionetwork model parameters
  • Characteristics of the problem
  • System nonlinearity
  • Limited number and type of experiments
  • Considerable biological and measurement noise

Multiple solutions exist!
  • Problems with traditional identification methods
  • Provide only one or a few solutions for each
    parameter
  • Assume linear propagation from data noise to
    parameter uncertainties
  • The closed-loop identification protocol (CLIP)
  • Extract the full parameter distribution by
    global identification
  • Iteratively look for the most informative
    experiments for minimizing parameter uncertainty

15
General operation of CLIP
Pre-lab analysis and design of the most
informative experiments
Iterative experiment optimization and data
acquisition
Global parameter identification
16
Isoleucyl-tRNA synthetase proofreading
valyl-tRNAIle
Rate constants to be identified
Okamoto and Savageau, Biochemistry, 23:1701-1709
(1984)
17
The inversion module: identifying the rate
constant distribution
  • The Genetic Algorithm (GA)
  • Mutation
  • 1101 1111 1100 0010
  • 1101 1101 1100 0110
  • Crossover
  • 1101 1100 1111 0010
  • 1101 0010 1111 1100
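
The two operators can be sketched on bit strings as below. This is a generic bit-flip mutation and single-point crossover, not necessarily the variant used in the inversion module; the function names, the crossover point, and the use of the two Crossover strings as parents are assumptions for illustration.

```python
import random

def mutate_at(bits, positions):
    """Flip the bits of a 0/1 string at the given zero-based positions."""
    flipped = list(bits)
    for p in positions:
        flipped[p] = "1" if flipped[p] == "0" else "0"
    return "".join(flipped)

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    positions = [i for i in range(len(bits)) if rng.random() < rate]
    return mutate_at(bits, positions)

def crossover(a, b, point):
    """Single-point crossover: swap the tails of two parents after `point`."""
    return a[:point] + b[point:], b[:point] + a[point:]

# The mutation example above corresponds to flipping positions 6 and 13.
parent = "1101111111000010"
mutant = mutate_at(parent, [6, 13])

# Illustrative crossover at the midpoint (hypothetical choice of point).
child1, child2 = crossover("1101110011110010", "1101001011111100", 8)

# Random mutation with a hypothetical per-bit rate.
random_mutant = mutate(parent, 0.1, random.Random(0))
```

In a full GA these operators would be applied to a population of encoded rate-constant sets, with selection driven by the inversion cost function.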

The inversion cost function
Typical rate constant distribution after random
perturbation/control
Inversion quality index Q
18
  • The analysis module: estimating the most
    informative experiments
  • Estimate the best species for monitoring system
    behavior
  • Determine the best species for perturbing the
    system
  • Nonlinear sensitivity analysis by
    Random-Sampling High Dimensional Model
    Representation (RS-HDMR)

19
Optimally controlled identification: squeezing on
the rate constant distribution
  • The control cost function

Inversion quality
Feng and Rabitz, Biophys. J., 86:1270-1281 (2004)
Feng, Rabitz, Turinici, and LeBris, J. Phys.
Chem. A, 110:7755-7762 (2006)
20
  • Network property optimization
  • A. Identifying the best targeted
    network locations for intervention
  • B. Identifying the optimal network control

Observed Response
Biological System
Learning Algorithm
Control Objective
Control Design
Optimal Network Performance
Optimal Controls
Initial Guess/ Random Control
21
A. Molecular target identification for network
engineering
Random-sampling high dimensional model
representation (RS-HDMR)
Randomly sample k
  • Advantages of RS-HDMR
  • Global sensitivity analysis
  • Nonlinear component functions
  • Physically meaningful representation
  • Favorable scalability

Li, Rosenthal, and Rabitz, J. Phys. Chem. A,
105:7765-7777 (2001)
22
Laboratory data on the mutants
k10 - k13 fixed
k6 fixed
k6
k10 - k13
Feng, Hooshangi, Chen, Li, Weiss, and Rabitz,
Biophys. J., 87:2195-2202 (2004)
23
Example: Biochemical multi-component formulation
mapping
  • Allosteric regulation of aspartate
    transcarbamoylase (ATcase) in vitro by all four
    ribonucleotide triphosphates (NTPs)
  • ATcase activity (output) was measured for 300
    random NTP concentration combinations (inputs) in
    the laboratory
  • A second order RS-HDMR as an input → output map
    was constructed. Its accuracy is comparable with
    the laboratory error

The absolute error of repeated measurements
24
Biochemical multi-component formulation mapping
The comparison of the laboratory data and the 2nd
order RS-HDMR approximation for used and test data
Note: The two parallel lines indicate an absolute
error of 0.2
25
The s-space network identification procedure
(SNIP)
Laboratory data on the transcriptional cascade
aTc: x1; IPTG: x2; EYFP: y(x1, x2)
Encode: x1 → x1 m1(s), x2 → x2 m2(s)
Response measurement: y → y(s)
Decode: Fourier transform
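
As a toy illustration of the encode, measure, and decode steps, the sketch below assumes cosine modulation functions m1(s) and m2(s) at distinct integer frequencies and a hypothetical response with one cooperative term. The frequencies, the response function, and its coefficients are invented for the example and do not come from the laboratory system.

```python
import numpy as np

n_points = 64
s = np.arange(n_points)
w1, w2 = 3, 5  # assumed encoding frequencies (cycles per record)

# Encode: each input is multiplied by its own modulation function.
m1 = np.cos(2.0 * np.pi * w1 * s / n_points)
m2 = np.cos(2.0 * np.pi * w2 * s / n_points)

def response(x1, x2):
    """Hypothetical system: two independent terms plus one cooperative term."""
    return 1.0 * x1 + 0.5 * x2 + 0.3 * x1 * x2

x1 = x2 = 1.0
y_s = response(x1 * m1, x2 * m2)  # response measurement y(s)

# Decode: the Fourier transform separates the contributions by frequency.
spectrum = np.abs(np.fft.rfft(y_s)) * 2.0 / n_points
independent_1 = spectrum[w1]       # carried at w1
independent_2 = spectrum[w2]       # carried at w2
cooperative_low = spectrum[w2 - w1]   # product term appears at the difference
cooperative_high = spectrum[w1 + w2]  # and at the sum frequency
```

The independent terms show up at the encoding frequencies themselves, while the cooperative term appears only at their sum and difference, which is what lets the transform distinguish the two kinds of contribution.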
26
Nonlinear property prediction by SNIP
Unmeasured region correctly predicted
Nonlinear, cooperative behavior revealed
Feng, Nichols, Mitra, Hooshangi, Weiss, and
Rabitz, In preparation
27
SNIP application to an intracellular signaling
network
Laboratory single cell measurement data
Sachs et al., Science, 308:523-529 (2005)
28
Identified network with predictive capability
Network connections identified by SNIP and
Bayesian analysis
Reliable SNIP prediction of Akt levels
29
Example: Ionospheric measured data
  • The ionospheric critical frequencies determined
    from ground-based ionosonde measurements at
    Huancayo, Peru for the years 1957-1987 (8694
    points)
  • Inputs: year, day, solar flux (f10.7), magnetic
    activity index (kp), geomagnetic field index
    (dst), previous day's value of foE
  • Output: ionospheric critical frequencies foE
  • The inputs are not controllable and not
    independent; the pdf of the inputs is not
    separable and was not explicitly known

30
Ionospheric measured data
The dependence of foE on the input day
Ionosonde data distribution: the dependences
between normalized input variables year and
f10.7, kp and dst for the data at 12 UT
31
Ionospheric measured data
The accuracy of the 2nd order RS-HDMR expansion
for the output, foE
32
Quantitative molecular property prediction
Standard QSAR
General strategy: Molecular activity is a
function of its chemical/physical/structural
descriptors
  • Problems
  • Overfitting (choice of descriptors)
  • Underlying physics

A simple solution: y = f(x1, x2), x1 = 1, 2, ..., N1;
x2 = 1, 2, ..., N2
Descriptor-free quantitative molecular property
interpolation
33
Descriptor-free property prediction from an
arbitrary substituent order
34
Property prediction from the optimal substituent
order
Cost function
Complexity of the search: N1! N2! = 14! 8! ≈ 10^15
Shenvi, Geremia, and Rabitz, J. Phys. Chem. A,
107:2066 (2003)
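
The size of the search space for N1 = 14 and N2 = 8 can be checked with a few lines of arithmetic; the variable names are illustrative.

```python
import math

n1, n2 = 14, 8

# Every ordering of the first substituent set can be paired with
# every ordering of the second, so the counts multiply.
orderings = math.factorial(n1) * math.factorial(n2)
magnitude = math.log10(orderings)
```

The product is roughly 3.5 × 10^15, which rules out exhaustive enumeration and motivates a guided search over orderings.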
35
Application to a chromophore transition metal
complex library
Before reordering
After reordering
Cost function
Outliers captured by the reordering algorithm
Liang, Feng, Lowry, and Rabitz, J. Phys. Chem. B,
109:5842-5854 (2005)
36
Application to a drug compound library
15% of data
>14,000 compounds
Cost function
Reorder
Prediction
37
THE MODERN WAY TO DO SCIENCE
Adaptively under high duty cycle and automated
You should understand the physics, write down
the correct equations, and let nature do the
calculations. Peter Debye