Title: John Paul Gosling
1 GEM-SA: a tutorial
- John Paul Gosling
- University of Sheffield
2 Overview
- GEM-SA
- Gaussian Emulation Machine for Sensitivity Analysis
- It's a Windows-based program with a graphical interface, created by Marc Kennedy during his time in CTCD
- It does emulation for prediction, uncertainty analysis and sensitivity analysis
- It also has a facility to create experimental designs for the analysis of computer models.
3 Starting the program
- On the desktop, there is a folder <GEM-SA tutorial>; opening it will reveal two other folders
- Inside the folder <GEM-SA1.1> is the program
- Double-clicking this will start the program
4 Main window
menu
toolbar
Sensitivity Analysis output grid
log window
5 Generating input designs
Press this button to create a file of inputs for
your computer model
- There are two designs available: LP-TAU and maximin Latin hypercube. Both have good space-filling properties.
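A minimal sketch of the maximin Latin hypercube idea, not GEM-SA's own algorithm: draw many random Latin hypercubes on the unit cube and keep the one whose closest pair of points is furthest apart. The function names, the number of candidates and the seed are assumptions made for illustration.

```python
import math
import random

def latin_hypercube(n_points, n_inputs, rng):
    """One random Latin hypercube on [0, 1]^n_inputs: each input's range is
    cut into n_points equal strata and every stratum is used exactly once."""
    columns = []
    for _ in range(n_inputs):
        strata = list(range(n_points))
        rng.shuffle(strata)
        # Place each point uniformly at random inside its stratum.
        columns.append([(s + rng.random()) / n_points for s in strata])
    # Transpose the per-input columns into a list of points.
    return [tuple(col[i] for col in columns) for i in range(n_points)]

def min_distance(design):
    """Smallest Euclidean distance between any two design points."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(design)
        for q in design[i + 1:]
    )

def maximin_latin_hypercube(n_points, n_inputs, n_candidates=200, seed=0):
    """Keep the candidate hypercube whose closest pair is furthest apart."""
    rng = random.Random(seed)
    candidates = (
        latin_hypercube(n_points, n_inputs, rng) for _ in range(n_candidates)
    )
    return max(candidates, key=min_distance)

# A 20-point design for two inputs on the unit square.
design = maximin_latin_hypercube(n_points=20, n_inputs=2)
```

Searching over random candidates is the simplest way to show the criterion; a dedicated optimiser would usually find a better-spread design for the same effort.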
6 Generating input designs
- Then we specify ranges over which the inputs will be of interest
- These must cover your beliefs about the range of each input
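As a rough illustration of what specifying ranges does to a design, the sketch below stretches points from the unit cube onto a box of (low, high) ranges. The ranges and points shown are hypothetical, and this is not a description of GEM-SA's internals.

```python
def scale_design(unit_design, ranges):
    """Map points from [0, 1]^d onto the box given by (low, high) pairs."""
    return [
        tuple(low + u * (high - low) for u, (low, high) in zip(point, ranges))
        for point in unit_design
    ]

# Hypothetical ranges for three inputs; use ranges that cover your own beliefs.
ranges = [(0.0, 1.0), (10.0, 50.0), (-2.0, 2.0)]
unit_design = [(0.0, 0.5, 1.0), (0.25, 0.25, 0.75)]
scaled = scale_design(unit_design, ranges)
```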
7 The design
- Here's a 50-point LP-TAU design for three inputs
- You'll also find they've been written to the file you specified (LP_TAU50.txt) in GEM-SA's working directory
8 Creating/Editing a project
- Now, we'll run through some of the options available to us for emulator building.
- We can create a new project or edit an existing project by selecting the appropriate item from the project menu.
- Or we can use these toolbar buttons.
New Edit
9 Edit Project - Files
Names of input files
Names of output files
10 Edit Project - Options
How many inputs?
Edit input names
11 Edit Project - Options
What should be calculated, and how?
Which joint effects should be calculated?
12 Edit Project - Options
What prior mean for the output?
Are the inputs uncertain?
13 Edit Project - Options
What kind of predictions and cross validation?
14 Edit Project - Simulations
MCMC control parameters
Number of realisations for prediction and ME/JE
How many points used to calculate main effects,
joint effects
15 Input names
- By clicking the <Names> button, a window opens that allows us to name each of the inputs.
- This can be handy when viewing the variance decomposition results and main effects plots.
16 Distributions for inputs
- When we click the <OK> button, the following window opens.
- This window allows us to specify our beliefs about the inputs.
17 A first run through
- Consider the simple nonlinear model we saw earlier
- y = sin(x1) / (1 + exp(x1 + x2))
- We have 2 inputs, x1 and x2, and we assume they both must be valued in the range [0,1].
- 20 points will give us a decent coverage of the unit square that is the input space here.
- Two files have already been saved in the folder <Examples\Eg1> to help save us time.
18 Monte Carlo method
- Here's the result of a Monte Carlo analysis using 30 input pairs.
- Mean = 0.139, median = 0.142
- Std. dev. = 0.053
- Variance = 0.0028
19 Monte Carlo method
- Here's the result of a Monte Carlo analysis using 10,000 input pairs.
- Mean = 0.114, median = 0.115
- Std. dev. = 0.054
- Variance = 0.0029
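A Monte Carlo analysis of this kind can be sketched as below. Two assumptions are made: the model has the form y = sin(x1) / (1 + exp(x1 + x2)), and the two inputs are independent and uniform on [0,1]. The seed is arbitrary, so the 30-pair summary will not reproduce the slide's figures exactly; the point is how much it can move compared with the 10,000-pair one.

```python
import math
import random
import statistics

def model(x1, x2):
    """Assumed form of the simple nonlinear model from the first run-through."""
    return math.sin(x1) / (1.0 + math.exp(x1 + x2))

def monte_carlo(n_pairs, seed=0):
    """Summarise the output over n_pairs uniform draws from the unit square."""
    rng = random.Random(seed)
    outputs = [model(rng.random(), rng.random()) for _ in range(n_pairs)]
    return {
        "mean": statistics.fmean(outputs),
        "median": statistics.median(outputs),
        "std_dev": statistics.stdev(outputs),
        "variance": statistics.variance(outputs),
    }

small = monte_carlo(30)       # noisy: only 30 model runs
large = monte_carlo(10_000)   # stable, but needs 10,000 model runs
```

For a cheap function like this one, 10,000 runs cost nothing; for an expensive simulator they are usually unaffordable, which is the motivation for emulation.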
20 Prediction
- Predictions can be
  - Correlated realisations of outputs at the prediction inputs
    - Similar to main effect outputs
  - Marginal means and variances of outputs at the prediction inputs
    - Faster to compute, especially with many prediction points
    - Easy to interpret
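To make "marginal means and variances" concrete, here is a minimal sketch of Gaussian process prediction at a single new input. It is a simplification rather than GEM-SA's implementation: a zero prior mean, a Gaussian correlation function, and a roughness and variance that are fixed by hand instead of estimated. The model form and the slice at x2 = 0.5 are assumptions as well.

```python
import math

def model(x1, x2=0.5):
    # Assumed form of the simple nonlinear model, with x2 held at 0.5.
    return math.sin(x1) / (1.0 + math.exp(x1 + x2))

def correlation(a, b, roughness):
    # Gaussian correlation; a larger roughness makes correlation decay faster.
    return math.exp(-roughness * (a - b) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def predict(x_new, x_train, y_train, roughness=5.0, sigma2=1.0, nugget=1e-10):
    """Marginal posterior mean and variance at x_new for a zero-mean GP."""
    n = len(x_train)
    # Correlation matrix of the training inputs, with a tiny nugget for stability.
    big_k = [
        [correlation(x_train[i], x_train[j], roughness) + (nugget if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    k = [correlation(x_new, xi, roughness) for xi in x_train]
    weights = solve(big_k, k)  # K^{-1} k
    mean = sum(w * y for w, y in zip(weights, y_train))
    variance = sigma2 * (1.0 - sum(w * ki for w, ki in zip(weights, k)))
    return mean, max(variance, 0.0)  # clip tiny negative rounding error

x_train = [0.0, 0.25, 0.5, 0.75, 1.0]
y_train = [model(x) for x in x_train]
mean_06, var_06 = predict(0.6, x_train, y_train)
```

Each prediction point needs only one linear solve here, whereas correlated realisations would also need the joint covariance between all prediction points, which is why the marginal option scales better.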
21 A plot of the predictions
- Here are the prediction output files plotted with the real function, with x2 fixed at 0.5.
22 Cross validation
- Choice of none, leave-one-out or leave final 20% out
- Leave-one-out
  - Hyperparameters are estimated using all data and are then fixed when prediction is carried out for each omitted point
- Leave final 20% out
  - Hyperparameters are estimated using the reduced data subset
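The leave-one-out scheme can be sketched as follows, under strong simplifying assumptions: a zero-mean Gaussian process, a Gaussian correlation function, hand-picked roughness values, and a variance estimated once from all the data. The model form, the 4 x 4 grid design and every function name are illustrative, not GEM-SA's.

```python
import math

def model(x):
    # Assumed form of the simple nonlinear model from the first run-through.
    x1, x2 = x
    return math.sin(x1) / (1.0 + math.exp(x1 + x2))

def correlation(a, b, roughness):
    # Gaussian correlation with one roughness parameter per input.
    return math.exp(-sum(r * (ai - bi) ** 2 for r, ai, bi in zip(roughness, a, b)))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def corr_matrix(xs, roughness, nugget=1e-10):
    return [
        [correlation(a, b, roughness) + (nugget if i == j else 0.0)
         for j, b in enumerate(xs)]
        for i, a in enumerate(xs)
    ]

def leave_one_out(xs, ys, roughness):
    """Standardised prediction errors with hyperparameters fixed from all data."""
    n = len(xs)
    # Hyperparameters from ALL data: roughness is supplied, and sigma2 is the
    # zero-mean maximum-likelihood estimate y' K^{-1} y / n.
    alpha = solve(corr_matrix(xs, roughness), ys)
    sigma2 = sum(a * y for a, y in zip(alpha, ys)) / n
    errors = []
    for i in range(n):
        # Omit point i, then predict it from the rest with the SAME hyperparameters.
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        k = [correlation(xs[i], x, roughness) for x in x_rest]
        weights = solve(corr_matrix(x_rest, roughness), k)
        mean = sum(w * y for w, y in zip(weights, y_rest))
        variance = sigma2 * (1.0 - sum(w * ki for w, ki in zip(weights, k)))
        errors.append((ys[i] - mean) / math.sqrt(variance))
    return sigma2, errors

grid = [i / 3 for i in range(4)]
xs = [(a, b) for a in grid for b in grid]
ys = [model(x) for x in xs]
sigma2, errors = leave_one_out(xs, ys, roughness=(5.0, 5.0))
```

The "leave final 20% out" option differs in that the hyperparameters would be re-estimated from the reduced data set, so it also tests the estimation step rather than only the interpolation.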
23 A real example
- A dynamic vegetation model is being used to predict the NBP of deciduous broadleaf woodland in the vicinity of Whitby, North Yorkshire.
- The scientists are uncertain about ten inputs of the model and want to know how this uncertainty affects the NBP output of the model. Monte Carlo methods are out of the question as the model is too complex.
- When they used their best guesses for these inputs, the model returned an NBP of 146.4 gC/m2.
24 The input names in order
- Maximum age (years) N(200,625)
- Water potential (MPa) N(3,0.25)
- Leaf life span (days) N(190,1600)
- Leaf mortality index N(0.005,6.25e-6)
- Bud burst limit (degree days) N(135,6.25)
- Seeding density (m2) N(0.1,0.0001)
- Soil sand (%) N(43.27,222.12)
- Soil clay (%) N(22.36,49.21)
- log(stem growth rate) N(-5.116,0.041209)
- Bulk density N(1.214,0.0325)
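The second number in each N(., .) looks like a variance rather than a standard deviation (625 = 25 squared, 1600 = 40 squared, 6.25 = 2.5 squared), although the slide does not say so. Under that assumption, the sketch below converts each entry to a standard deviation and draws one set of inputs; the dictionary keys and the seed are illustrative.

```python
import math
import random

# (name, mean, second parameter) copied from the slide. The second parameter
# is ASSUMED to be a variance, so the standard deviation is its square root.
inputs = [
    ("Maximum age", 200.0, 625.0),
    ("Water potential", 3.0, 0.25),
    ("Leaf life span", 190.0, 1600.0),
    ("Leaf mortality index", 0.005, 6.25e-6),
    ("Bud burst limit", 135.0, 6.25),
    ("Seeding density", 0.1, 0.0001),
    ("Soil sand", 43.27, 222.12),
    ("Soil clay", 22.36, 49.21),
    ("log(stem growth rate)", -5.116, 0.041209),
    ("Bulk density", 1.214, 0.0325),
]

std_devs = {name: math.sqrt(var) for name, _, var in inputs}

def draw_inputs(rng):
    """One random draw of all ten inputs, treated as independent normals."""
    return {name: rng.gauss(mean, math.sqrt(var)) for name, mean, var in inputs}

sample = draw_inputs(random.Random(0))
```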
25 Main effects plots
- The plug-in estimate of the NBP is far away from
our mean for NBP as the main effect plot for bulk
density is concave around its expected value of
1.214.
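One way to read this, assuming the plug-in estimate means running the model once at the input means: for a concave function, Jensen's inequality puts the mean output below the output at the mean input, and a second-order expansion indicates the size of the gap. The notation (f for the output as a function of a single input X with mean mu and variance sigma squared) is introduced here for illustration.

```latex
% Jensen's inequality for a concave function f of an uncertain input X
\[
  f \text{ concave} \;\Longrightarrow\; \mathbb{E}[f(X)] \le f(\mathbb{E}[X]).
\]
% Second-order Taylor expansion about the mean \mu, with variance \sigma^2
\[
  \mathbb{E}[f(X)] \approx f(\mu) + \tfrac{1}{2} f''(\mu)\,\sigma^{2},
  \qquad f''(\mu) < 0 \text{ where } f \text{ is concave}.
\]
```

The simple two-input model shows the same effect on a small scale, if it has the form y = sin(x1) / (1 + exp(x1 + x2)): the plug-in value at (0.5, 0.5) is about 0.129, while the Monte Carlo mean over the unit square was 0.114.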
26 Producing main/joint effects plots for publication
- In the files section of the edit project window, there are two fields that allow the user to specify where the main/joint effects data should be written.
- These files can be used to produce graphs like the one I showed earlier.
- The main effects file is structured as follows
  - There are a number of blocks of function realisations, one for each input.
  - These are controlled by
27 Limitations of GEM-SA
- In theory, the methods used by GEM-SA are limitless; however, the program itself isn't.
- It can handle up to 30 inputs and 400 training data points.
- Also, the distributions that are used to express our uncertainty about the inputs are limited to uniform or normal.
28 When it all goes wrong
- How do we know when the emulator is not working?
- Large roughness parameters
- Especially ones hitting the limit of 99
- Large emulation variance on UA mean
- Poor CV standardised prediction error
- Especially when some are extremely large
- In such cases, see if a larger training set helps
- Other ideas like transforming output scale
29 Where to find the program
- GEM-SA is available on the web along with tutorial slides from a longer course and further example data sets.
- Links to it can be found on my website, where there is also a technical report explaining the perils of using the plug-in approach
- j-p-gosling.staff.shef.ac.uk