Title: Realistic Uncertainty Bounds for Complex Dynamic Models
1 Realistic Uncertainty Bounds for Complex Dynamic Models
Andrew Packard, Michael Frenklach
CTS-0113985, April 2005
- Developed a formalism involving assertions expressed as polynomial inequalities on a parameter space. Use global optimization methods, developed in control systems analysis, with origins in algebraic geometry. Novel re-analysis of the GRI Data Set.
- Reasoning on collections of assertions
  - test for consistency
    - inconsistency falsifies at least one assertion
    - sensitivity of consistency to data
    - which are likely false?
  - infer additional implications from assertions
    - sensitivity of inferred conclusions to data
    - which assertions have the most impact?
Our research focuses on the benefits of treating model/data pairs as assertions that can be shared and reasoned with using automated algorithms.
Message: Use collaboration through model/data sharing and automated reasoning to extract the totality of information in the community data sets.
Frenklach, Packard, Seiler and Feeley, Collaborative data processing in developing predictive models of complex reaction systems, International Journal of Chemical Kinetics, vol. 36, issue 1, pp. 57-66, 2004.
Frenklach, Packard and Seiler, Prediction uncertainty from models and data, 2002 American Control Conference, pp. 4135-4140, Anchorage, Alaska, May 8-10, 2002.
Seiler, Frenklach, Packard and Feeley, Numerical approaches for collaborative data processing, to appear, Optimization and Engineering, Kluwer, 2005.
Feeley, Seiler, Packard and Frenklach, Consistency of a reaction data set, Journal of Physical Chemistry A, vol. 108, pp. 9573-9583, 2004.
Project website: http://jagger.me.berkeley.edu/pack/nsfuncertainty
2 Chemical Kinetics Modeling
- Chemical kinetics modeling is a form of
  - high dimensional (mechanisms are complex),
  - distributed (efforts of many)
  - system identification.
- The effort of researchers yields complex, intertwined, factual assertions about the possible values of the model parameters
- Handbook style of (parameter, nominal, range, reference) will not work
- Each individual assertion is usually not illuminating in the problem's natural coordinates. Concise individual conclusions are rare.
- Information-rich, anonymous collaboration is necessary
- Machines must do the heavy lifting.
  - Managing lists of assertions
  - Reasoning and inference
3 Separate asserted facts from analysis
- Two types of assertions: models and observed behavior
  - (Web-based) assertion of models of physical processes (e.g., if we knew the parameter values, this parametrized mathematics would accurately model the process)
  - (Web-based) assertion of measured outcomes of physical processes (e.g., I performed the experiment, and the process behaved as follows)
- Together, these form constraints in "world"-parameter space of physical constants. Parameters which satisfy all constraints are feasible (or unfalsified).
- Analysis (global optimization) on the assertions
  - Check consistency of a collection of assertions
  - Sensitivity of consistency to changes in a single assertion
  - Discover highly informative (or highly suspect) assertions
  - Explore the information implied by the assertions
  - Determine possible range of different scalar functions on the feasible set.
  - (old standby) Generate parameter samples from the feasible set.
We've taken this perspective and re-analyzed the GRI-Mech data set. The results are very encouraging.
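The feasible-set view on this slide can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the two toy models, the data values, and the function names are hypothetical and are not taken from the GRI-Mech DataSet. The sketch uses the "old standby" of sampling, so it can only find feasible points and inner estimates of a range; it cannot prove inconsistency.

```python
import random

# Hypothetical toy models: each maps a normalized parameter vector
# x in [-1, 1]^2 to a predicted observable. They stand in for real process models.
def model_1(x):
    return x[0] + 0.5 * x[1]

def model_2(x):
    return x[0] * x[1] + 0.2 * x[0] ** 2

# Each assertion is (model, measured value d, uncertainty u),
# read as the constraint |model(x) - d| <= u. Values are made up.
ASSERTIONS = [(model_1, 0.3, 0.2), (model_2, 0.1, 0.15)]

def is_feasible(x, assertions=ASSERTIONS):
    """True if x lies in the prior cube and satisfies every model/data assertion."""
    in_prior = all(-1.0 <= xk <= 1.0 for xk in x)
    return in_prior and all(abs(m(x) - d) <= u for m, d, u in assertions)

def sample_feasible(n_draws=20000, dim=2, seed=0):
    """Rejection-sample the prior cube and keep the unfalsified points."""
    rng = random.Random(seed)
    draws = ([rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_draws))
    return [x for x in draws if is_feasible(x)]

def inner_range(f, points):
    """Range of a scalar function over the samples (an inner estimate only)."""
    values = [f(x) for x in points]
    return min(values), max(values)

feasible = sample_feasible()
print("feasible samples found:", len(feasible))
if feasible:
    print("sampled range of x[1]:", inner_range(lambda x: x[1], feasible))
```

Finding at least one feasible sample shows the toy assertions are consistent. An empty result would be inconclusive, which is why the analysis described above relies on global optimization rather than sampling alone.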
4 GRI DataSet
- The GRI-Mech (www.me.berkeley.edu/gri_mech) DataSet is a collection of 77 experimental reports, consisting of models and "raw" measurement data, compiled/arranged towards obtaining a complete mechanism for CH4 + 2O2 → 2H2O + CO2 capable of accurately predicting pollutant formation. The DataSet consists of
  - Reaction model: 53 chemical species, 325 reactions (nonlinear).
  - Unknown parameters ($r$): 102 parameters, essentially the various rate constants.
  - Prior information: each normalized parameter is known to lie between -1 and 1.
  - Processes ($P_i$): 77 widely trusted, high-quality laboratory experiments, all involving methane combustion, but under different physical manifestations and different conditions.
  - Process models ($M_i$): 77 1-d and 2-d numerical PDE models, coupled with the common reaction model.
  - Measured data ($d_i$, $u_i$): data and measurement uncertainty from 77 peer-reviewed papers reporting the above experiments.
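[Figure: each process $P_i$ ($i = 1, \dots, 77$) is described by a model $M_i(r)$ that couples the common chemistry model, Chemistry($r$) (CH4 + 2O2 → 2H2O + CO2; about 300 reactions, 50 species and 100 unknown parameters, each with $-1 \le r_k \le 1$), to a process-specific transport model (Transport 1 for $P_1$), and is paired with measured data $d_i \pm u_i$.]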
The prior information, models and measured data
constitute assertions about possible parameter
values.
- kth assertion associated with prior info: $-1 \le r_k \le 1$
- Assertions associated with ith dataset unit: $|M_i(r) - d_i| \le u_i$
5 Manual management of uncertainty propagation
- Manual (paper/email) mode would require an efficient uncertainty description (linear in the number of model parameters, say).
  - E.g., use a handbook-type description
    - parameter values
    - plus/minus uncertainty
  - Equivalent to requiring a coordinate-aligned cube to contain the feasible set.
- Very ineffective in extracting the predictive capability of the GRI data, i.e., using assertions to predict the outcome (a range) of another model
  - (M1) Use 76 assertions to reduce the parameter uncertainty to a cube (as above), then do prediction of the 77th model's outcome on this cube
  - (M2) Use 76 assertions directly to predict the range of the 77th model's outcome
- Loss value: L = 1 means M1 is no better than just using the prior info; L = 0 means M1 is as effective as M2
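The comparison between M1 and M2 can be sketched on a two-parameter toy problem. The loss formula used below is an assumption: it is one normalization that reproduces the two stated endpoints (1 when M1 is no better than the prior, 0 when M1 matches M2), and the slide's own definition is not available here. The assertions, the prediction model, and all names are hypothetical.

```python
from itertools import product

def prediction(x):
    """Hypothetical model whose outcome range we want to predict."""
    return x[0] - x[1]

def feasible(x):
    """Hypothetical assertions from the other experiments (a tilted, thin region)."""
    return abs(x[0] - x[1]) <= 0.25 and abs(x[0] + x[1] - 0.5) <= 0.5

def grid(lo, hi, n=101):
    """Full grid on the coordinate-aligned box [lo[0], hi[0]] x [lo[1], hi[1]]."""
    axes = [[l + (h - l) * i / (n - 1) for i in range(n)] for l, h in zip(lo, hi)]
    return list(product(*axes))

def width(f, points):
    """Width of the range of f over a finite set of points."""
    values = [f(x) for x in points]
    return max(values) - min(values)

prior_pts = grid([-1.0, -1.0], [1.0, 1.0])
feas_pts = [x for x in prior_pts if feasible(x)]

# M1: summarize the feasible set by its bounding box, then predict on the box.
box_lo = [min(x[k] for x in feas_pts) for k in range(2)]
box_hi = [max(x[k] for x in feas_pts) for k in range(2)]
w_m1 = width(prediction, grid(box_lo, box_hi))

# M2: predict directly on the feasible points.
w_m2 = width(prediction, feas_pts)

# Prior only: predict on the whole prior cube.
w_prior = width(prediction, prior_pts)

# Assumed loss normalization (not taken from the source).
loss = (w_m1 - w_m2) / (w_prior - w_m2)
print("widths (prior, M1, M2):", w_prior, w_m1, w_m2)
print("loss:", loss)
```

Because the feasible region is thin along a diagonal, its bounding box is much larger than the region itself, so the box-based prediction is noticeably wider than the direct one. That is the effect the handbook-style description suffers from.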
6 Consistency results for GRI-DataSet assertions
- Collection of 77 assertions is consistent.
- Nevertheless, a quantitative consistency measure
was found to be very sensitive (using multipliers
from the dual form) to 2 particular experimental
assertions, but not to the prior info.
[Figure: plot over the experiments (axis label: Experiment)]
The scientists involved rechecked their calculations and concluded that reporting errors had been made.
Both reports were updated (one measurement value increased, one decreased), exactly as the consistency analysis had suggested.
After the updates, the sensitivity of the consistency measure to individual assertions is greatly reduced and spread more evenly across the data set.
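The idea of ranking assertions by how strongly they affect a consistency measure can be sketched on a one-parameter toy problem. This is a sketch under stated assumptions, not the analysis used for the GRI DataSet: the consistency measure is assumed to be the largest margin by which all assertions can be satisfied at once, the sensitivities are finite differences standing in for the dual multipliers, and the three linear assertions are made up.

```python
# Each hypothetical assertion is (a, d, u), read as |a * x - d| <= u
# for a single normalized parameter x in [-1, 1].
ASSERTIONS = [(1.0, 0.0, 0.3), (1.0, 0.5, 0.3), (2.0, 0.4, 1.0)]

def margin(x, assertions):
    """Smallest slack over all assertions at x (negative if any is violated)."""
    return min(u - abs(a * x - d) for a, d, u in assertions)

def consistency(assertions, n=2001):
    """Assumed measure: best achievable margin over a grid on the prior interval.

    A positive value means the assertions are jointly satisfiable on the grid.
    """
    xs = (-1.0 + 2.0 * i / (n - 1) for i in range(n))
    return max(margin(x, assertions) for x in xs)

def sensitivities(assertions, h=0.01):
    """Finite-difference sensitivity of the measure to each uncertainty u_i."""
    base = consistency(assertions)
    result = []
    for i, (a, d, u) in enumerate(assertions):
        bumped = list(assertions)
        bumped[i] = (a, d, u + h)  # loosen only the i-th assertion
        result.append((consistency(bumped) - base) / h)
    return result

gamma = consistency(ASSERTIONS)
sens = sensitivities(ASSERTIONS)
print("consistency measure:", gamma)
for i, s in enumerate(sens, start=1):
    print(f"sensitivity to assertion {i}: {s:.3f}")
```

In this toy case the first two assertions pull the parameter in opposite directions and share all of the sensitivity, while the third is slack and has none. A concentrated pattern like this is what flags particular assertions for rechecking.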
7 How are we computing?
- Transforming real models to polynomial models
  - Large-scale computer experimentation on M(r).
  - Random sampling and sensitivity calculations to determine active parameters
  - Factorial design-of-experiments on active parameter cube
  - Linear, Quadratic or Polynomial (stay in Sum-of-Squares hierarchy) fit
  - Assess the residuals, account for fit error in assertion
- Assertions become polynomial inequality constraints
- Most analysis is optimization subject to these constraints
  - S-procedure, sum-of-squares (scalable emptiness proofs, outer bounds)
  - Outer bounds are also interpreted as solutions to the original problem when cost is an expected value, constraints are only satisfied on average, and the decision variable is a random variable.
  - Branch & Bound (or increase order) to eliminate ambiguity due to fit errors
  - Off-the-shelf constrained nonlinear optimization for inner bounds
  - Use stochastic interpretation of outer bounds to aid search
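The first group of steps above (run the model on a factorial design, fit a quadratic, assess the residuals, fold the fit error into the assertion) can be sketched as follows. The `simulate` function is a hypothetical stand-in for an expensive process model with two active parameters, and the measured value `d` and uncertainty `u` are made up. The fit error is estimated from random check points, so it is an estimate rather than a guaranteed bound.

```python
import math
import random
from itertools import product

def simulate(x):
    """Hypothetical stand-in for an expensive process model M(r)."""
    return math.exp(0.3 * x[0]) + 0.5 * x[0] * x[1] - 0.2 * x[1]

def features(x):
    """Monomials of a full quadratic in two active parameters."""
    return [1.0, x[0], x[1], x[0] * x[0], x[0] * x[1], x[1] * x[1]]

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def fit_quadratic(points, values):
    """Least-squares fit via the normal equations (adequate for this tiny design)."""
    F = [features(p) for p in points]
    m = len(F[0])
    AtA = [[sum(f[i] * f[j] for f in F) for j in range(m)] for i in range(m)]
    Atb = [sum(f[i] * v for f, v in zip(F, values)) for i in range(m)]
    return solve(AtA, Atb)

def surrogate(coef, x):
    """Evaluate the fitted quadratic at x."""
    return sum(c * phi for c, phi in zip(coef, features(x)))

# Three-level full factorial design on the active parameter cube [-1, 1]^2.
design = list(product([-1.0, 0.0, 1.0], repeat=2))
coef = fit_quadratic(design, [simulate(p) for p in design])

# Assess residuals on random check points to estimate the fit error.
rng = random.Random(1)
checks = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
fit_error = max(abs(simulate(x) - surrogate(coef, x)) for x in checks)

# The assertion becomes a pair of polynomial inequalities, widened by the fit error.
d, u = 1.0, 0.1  # hypothetical measurement and uncertainty

def assertion_holds(x):
    return abs(surrogate(coef, x) - d) <= u + fit_error

print("estimated fit error:", fit_error)
print("assertion at (0, 0):", assertion_holds((0.0, 0.0)))
```

Widening the uncertainty by the fit error keeps the polynomial assertion from excluding parameters that the original model would accept, at the cost of a slightly looser constraint.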