Title: Third Summer School on Sensitivity Analysis of Model Output - Variance-based Methods

1 - Third Summer School on
- Sensitivity Analysis of Model Output
- Variance-based Methods
- Stefano Tarantola
- September 14th, 2004 - Venice
2 - Plan of Presentation
- Terminology and generalities on variance-based methods
- Methods for non-correlated input factors
- Methods for correlated input factors
3 - Introduction to variance-based methods
4 - Specification of the input factors
- Marginal p.d.f.s and correlation structure
5 - Propagation of uncertainty
[Diagram: input factors x1, x2, x3, x4, ..., xk enter the model; the model produces the output y.]
6 - HDMR
Some SA methods use the decomposition of Y = f(x) into main effects and interactions. This is called High Dimensional Model Representation (HDMR).
E.g., if k = 3:
f(x) = f0 + f1(x1) + f2(x2) + f3(x3) + f12(x1,x2) + f13(x1,x3) + f23(x2,x3) + f123(x1,x2,x3)
In general, the HDMR is non-unique.
The total number of summands in the HDMR is 2^k (the constant f0 plus 2^k - 1 non-constant terms).
7 - Properties of HDMR
8 - Properties of HDMR
If the input factors are independent and each summand has zero mean over each of its own variables, then the HDMR decomposition has the properties:
- f0 is the mean value of f(x): f0 = E(Y)
- all the summands are orthogonal
- the HDMR is unique, and each term can be defined via conditional expectations, e.g. fi(xi) = E(Y|Xi = xi) - f0 and fij(xi,xj) = E(Y|Xi,Xj) - fi - fj - f0
9 - Properties of HDMR
10 - ANOVA-HDMR
11 - Sensitivity indices
An exhaustive sensitivity analysis should provide all these indices (i.e. 2^k - 1 estimates).
12 - Sensitivity indices
Consider an example with k = 4.
Total number of terms: 4 + 6 + 4 + 1 = 15 = 2^4 - 1 (4 first-order, 6 second-order, 4 third-order and 1 fourth-order term).
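This count can be verified directly (a minimal sketch; the function name is illustrative):

```python
from math import comb

def n_sensitivity_indices(k):
    """Number of variance-based sensitivity indices for k factors:
    sum of C(k, j) over j = 1..k, which equals 2**k - 1."""
    return sum(comb(k, j) for j in range(1, k + 1))

print(n_sensitivity_indices(4))  # 4 + 6 + 4 + 1 = 15 = 2**4 - 1
```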
13 - Main effects
The main effect of xi is Vi = V(E(Y|Xi)): the variance, taken over xi, of the conditional expectation E(Y|Xi = xi); normalised, Si = Vi / V(Y).
E(Y|Xi) is the approximation of f(x) by a function of xi alone with minimum expected loss L.
Interpretation: the expected amount of variance that would be removed from the total output variance, if we were able to learn the true value of xi.
On the computation, more soon.
14 - Joint effects
The closed joint effect of xi and xj is Vij = V(E(Y|Xi, Xj)): the expected amount of variance that would be removed from the total output variance, if we were able to learn the true value of both xi and xj.
It measures the relative importance of any pair of input variables in driving the output uncertainty.
15 - Total effects
The total effect of xi is STi = E(V(Y|X-i)) / V(Y), where X-i denotes all factors except xi: the expected amount of variance that would remain unexplained (residual variance) if xi, and only xi, were left free to vary over its uncertainty range, all other variables having been learnt.
Use for model simplification: it identifies unessential variables in the model, i.e. variables that are important neither singly nor in combination with others.
A variable with a small value of its total-effect sensitivity index can be frozen to any value within its range.
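To see why a variable can matter through interactions only, consider the toy model Y = X1 + X2·X3 with independent Xi ~ U(-1, 1) (an illustrative example, not from the slides): X3 has a null main effect, since E(Y|X3) is constant, but a non-zero total effect through its interaction with X2. A minimal brute-force Monte Carlo sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x1, x2, x3: x1 + x2 * x3     # x3 acts only through the interaction

N = 200_000
x1, x2, x3 = rng.uniform(-1, 1, (3, N))
V = f(x1, x2, x3).var()                 # total variance, analytically 4/9

r, M = 500, 10_000
# main effect of x3: variance over x3 of the conditional mean E(Y | X3)
cond_means = [f(rng.uniform(-1, 1, M), rng.uniform(-1, 1, M), v).mean()
              for v in rng.uniform(-1, 1, r)]
S3 = np.var(cond_means) / V             # ~ 0, since E(Y|X3) is constant

# total effect of x3: expected residual variance once x1 and x2 are learnt
cond_vars = [f(a, b, rng.uniform(-1, 1, M)).var()
             for a, b in rng.uniform(-1, 1, (r, 2))]
ST3 = np.mean(cond_vars) / V            # analytically (1/9) / (4/9) = 0.25

print(round(S3, 3), round(ST3, 3))
```

A small S3 together with a sizeable ST3 is exactly the pattern the slide warns about: x3 could not be frozen despite its null main effect.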
16 - Total sensitivity indices
17 - Calculation of main effects
The naive (double-loop) estimation of V(E(Y|Xi)) is computationally expensive.
18 - Calculation of the sensitivity indices
For non-correlated input factors, shortcuts are available:
- FAST (Fourier Amplitude Sensitivity Test) (Cukier et al., 1973)
- The method of Sobol' (Sobol', 1993)
- EFAST (Extended FAST) (Saltelli, Tarantola and Chan, 1999)
- An improvement of the Sobol' method (Saltelli, 2002)
19 - The FAST method
The variance of Y (an integral in a space of dimension k) is re-written as a 1-dimensional integral with respect to a scalar variable s, along a parametric search curve x(s).
Weyl's theorem holds if the parametric curve is space-filling (i.e. the frequencies assigned to the factors are incommensurate).
First-order indices are estimated from the Fourier spectrum of Y(s), calculated at each factor's characteristic frequency and its harmonics.
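A minimal sketch of the idea, assuming the classic search curve x_i(s) = 1/2 + arcsin(sin(w_i s))/pi and integer characteristic frequencies (the function name, frequency choice and additive test model are illustrative, not from the slides):

```python
import numpy as np

def fast_first_order(f, freqs, N=1025, harmonics=4):
    """Crude FAST estimate of first-order indices: sample the model along
    a 1-d search curve and read the spectrum at each factor's frequency."""
    s = np.linspace(-np.pi, np.pi, N, endpoint=False)
    # search curve: each x_i is a triangle wave in s, uniform on [0, 1]
    x = 0.5 + np.arcsin(np.sin(np.outer(freqs, s))) / np.pi
    y = f(x)
    c = np.fft.rfft(y) / N                 # Fourier coefficients of Y(s)
    power = 2 * np.abs(c)**2               # spectral power per frequency
    V = y.var()                            # total variance of Y along the curve
    return [sum(power[p * w] for p in range(1, harmonics + 1)) / V
            for w in freqs]

# additive test model Y = X1 + 2*X2 with Xi ~ U(0, 1): S1 = 0.2, S2 = 0.8
S = fast_first_order(lambda x: x[0] + 2 * x[1], freqs=[11, 21])
print([round(v, 3) for v in S])
```

The frequencies 11 and 21 are chosen so that their low-order harmonics do not interfere; in a real FAST implementation the frequency set is selected carefully for the dimension at hand.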
20 - The method of Sobol'
Transforms the double-loop integral (in a space of dimension k) ... into the integral of the product f(x)·f(x') (in a space of dimension 2k - 1), where x' differs from x in all coordinates except xi (Ishigami and Homma, 1990).
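A common Monte Carlo formulation of this product trick (a later variant of the Sobol' scheme, shown here as an illustrative sketch rather than the exact estimator of the slides) uses two independent input matrices A and B; for factor i, the matrix AB_i equals A except for column i, which is taken from B. The Ishigami function serves as a standard test case:

```python
import numpy as np

def ishigami(x, a=7.0, b=0.1):
    """Ishigami test function; inputs uniform on (-pi, pi)."""
    return (np.sin(x[:, 0]) + a * np.sin(x[:, 1])**2
            + b * x[:, 2]**4 * np.sin(x[:, 0]))

def sobol_indices(f, k, N=200_000, seed=0):
    """First-order and total-effect estimates from two independent
    input matrices A and B (product-of-model-evaluations trick)."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-np.pi, np.pi, (N, k))
    B = rng.uniform(-np.pi, np.pi, (N, k))
    fA, fB = f(A), f(B)
    V = np.var(np.concatenate([fA, fB]))    # total output variance
    S, ST = [], []
    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                 # A with column i taken from B
        fABi = f(ABi)
        S.append(np.mean(fB * (fABi - fA)) / V)       # first-order index
        ST.append(0.5 * np.mean((fA - fABi)**2) / V)  # total-effect index
    return S, ST

S, ST = sobol_indices(ishigami, k=3)
print(np.round(S, 2), np.round(ST, 2))  # analytic S ~ (0.31, 0.44, 0.00)
```

Note the pattern the Ishigami function is chosen to exhibit: S3 = 0 but ST3 > 0, i.e. x3 matters only through its interaction with x1.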
21 - The method of Sobol'
22 - The available methods
23 - A further extension (SAMO 2004, Santa Fe)
Exploiting additional symmetries in the formulas yields more estimates of the first-order and total indices at no extra cost.
24 - A further extension: notations
25 - A further extension: procedure
Cost: r(2k + 2) model runs.
26 - A further extension: substitutions
27 - A further extension: exploitation of symmetries
Only the symmetries that guarantee numerical accuracy are exploited.
28 - A further extension: total indices
Exploitation of the symmetries yields 4 different estimates of the total effects, at no extra cost.
29 - Convergence tests for total effects (g-function)
30 - Convergence tests for first-order indices (g-function)
31 - Conclusions
- Using the averaged indices obtained after exploiting the symmetries is generally more efficient.
- For important variables we get better total indices.
- For less important variables we get better first-order indices.
- All of this at no extra cost.
32 - Calculation of the sensitivity indices
For correlated input variables the computations are heavier:
- Brute-force methods (cost: N·r·k model runs)
- Correlation ratios with r-LHS (cost: r·N model runs)
33 - Brute-force approach
Estimate the conditional mean E(Y|Xi = xi) at a large number of xi values (r ~ 100), each time sampling the model at N points over the remaining k - 1 dimensions, and then calculate the variance over xi.
Cost for one main effect: N·r. Total cost over all factors: N·r·k.
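With correlated factors, the inner sample must be drawn from the conditional distribution of the remaining inputs given xi, which is what makes the brute-force loop unavoidable. A minimal sketch for two standard-normal factors with correlation rho and the toy model Y = X1 + X2, for which E(Y|X1 = a) = (1 + rho)·a and hence S1 = (1 + rho)/2 analytically (model, names and sample sizes are illustrative):

```python
import numpy as np

rho = 0.5                       # correlation between X1 and X2
rng = np.random.default_rng(0)

def main_effect_x1(r=4000, N=1000):
    """Brute-force estimate of S1 = V(E(Y|X1)) / V(Y) for Y = X1 + X2,
    with (X1, X2) bivariate standard normal, correlation rho."""
    x1_vals = rng.standard_normal(r)        # outer loop: r values of x1
    cond_means = []
    for a in x1_vals:
        # inner loop: N draws of x2 from its conditional given X1 = a
        x2 = rng.normal(rho * a, np.sqrt(1 - rho**2), N)
        cond_means.append(np.mean(a + x2))  # estimate of E(Y | X1 = a)
    V = 2 * (1 + rho)                       # Var(Y), known analytically here
    return np.var(cond_means) / V           # cost: r * N model runs

S1 = main_effect_x1()
print(round(S1, 2))   # analytic value: (1 + rho) / 2 = 0.75
```

Note that with rho = 0.5 both S1 and S2 equal 0.75, so the main effects of correlated factors can sum to more than one.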
34 - Correlation ratios with r-LHS
Use of r-LHS (replicated Latin hypercube sampling). Example: size N = 5 with r = 2 replicates.
35 - Correlation ratios with r-LHS
Estimator proposed by McKay (1995): the variability of the model predictions is decomposed over the N distinct values of each factor,
SS_total = SS_between + SS_within,
and the main effect is estimated as the correlation ratio, i.e. the ratio of the between-value variance to the total variance.
Cost: r·N model runs (the same runs serve all k factors).
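A sketch of the replicated-LHS idea (assumptions: independent uniform factors, midpoint stratification, and the between/total variance ratio as the correlation-ratio estimator; names and sample sizes are illustrative). The same r·N runs yield estimates for every factor:

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x[:, 0] + 2 * x[:, 1]   # toy model: true S1 = 0.2, S2 = 0.8

def rlhs_correlation_ratios(f, k, N=2000, r=100):
    """Replicated LHS: every factor keeps the same N stratified values in
    each replicate; only the random pairing (permutation) changes."""
    vals = (np.arange(N) + 0.5) / N                       # stratum midpoints
    perms = np.array([[rng.permutation(N) for _ in range(k)]
                      for _ in range(r)])                 # shape (r, k, N)
    y = np.stack([f(vals[perms[t]].T) for t in range(r)]) # shape (r, N)
    V = y.var()
    S = np.zeros(k)
    for i in range(k):
        # regroup runs so that column m always holds the run where factor i
        # sits in stratum m (argsort inverts the permutation)
        g = np.stack([y[t, np.argsort(perms[t, i])] for t in range(r)])
        S[i] = g.mean(axis=0).var() / V   # between-stratum / total variance
    return S

S = rlhs_correlation_ratios(f, k=2)
print(np.round(S, 2))
```

McKay's estimator is slightly biased upward for small r (roughly by (1 - Si)/r), which is visible here for the weaker factor.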
36 - Limitations of the variance-based methods
- (1) They are more expensive to compute than regression coefficients.
- (2) They are elaborate to code.
- (3) They assume that all the information about the uncertainty of Y is captured by its variance.