Getting started with GEM-SA

Provided by: marc256

1
Getting started with GEM-SA
2
This talk
  • Starting GEM-SA program
  • Creating input and output files
  • Explanation of the menus, toolbars, etc.
  • Description of the project window

3
Starting GEM-SA
  • Double-click the GEM-SA icon to start
  • The main window appears, with
  • Menu
  • Toolbar
  • Main results area with three tabs
  • Sensitivity Analysis, Main Effects and Results
    Summary
  • Initially all empty
  • Log window

4
The main GEM-SA window
menu
toolbar
Sensitivity analysis output grid
Log window
5
Toolbar icons
  • New project
  • Open project
  • Save project
  • Print output report
  • Edit project
  • Generate input design points
  • Rescale an input
  • Standardise design
  • Copy input design to clipboard
  • Convert input to integer
  • Run the analysis
  • Help

6
Output tabs
  • When an emulator has been fitted, the contents of
    these tabs will provide the main results
  • Sensitivity Analysis. This will report the SA
    variance decompositions
  • One line for each input parameter
  • One line for each pair of inputs, if joint
    effects are selected
  • Main effects. This will plot the main effects of
    the various inputs
  • Results Summary. This will present numerical
    summaries of emulator fit and uncertainty analysis
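To make the idea of a variance decomposition concrete, here is a minimal sketch on a toy model. It is not GEM-SA's algorithm (GEM-SA works from the fitted emulator). The function `toy_model`, the grid size and the assumption of independent uniform inputs on [0, 1] are all illustrative choices.

```python
import statistics

def toy_model(x1, x2):
    # Hypothetical stand-in for a simulator: x1 matters more than x2.
    return 3.0 * x1 + 1.0 * x2

def main_effect_shares(model, n=50):
    """Share of output variance explained by each input's main effect."""
    # Midpoint grid on [0, 1] for each (uniform, independent) input.
    grid = [(k + 0.5) / n for k in range(n)]
    outputs = [model(a, b) for a in grid for b in grid]
    total_var = statistics.pvariance(outputs)
    # Main effect of x1: average the output over x2 at each fixed x1,
    # and the other way round for x2.
    effect1 = [statistics.fmean(model(a, b) for b in grid) for a in grid]
    effect2 = [statistics.fmean(model(a, b) for a in grid) for b in grid]
    return (statistics.pvariance(effect1) / total_var,
            statistics.pvariance(effect2) / total_var)

shares = main_effect_shares(toy_model)
print("x1: {:.1%}, x2: {:.1%}".format(*shares))
```

Because the toy model is additive, the two shares sum to one. With an interaction term they would sum to less, and the remainder is what the joint-effect lines report.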

7
Log Window output
  • Tells us
  • Which training data are being loaded/saved
  • Transformations applied to the data
  • Fitted Gaussian process parameters
  • Summary of cross-validation analysis
  • Summary of the uncertainty analysis

8
Creating a GEM project
  • To build the emulator we first need 3 files
  • Data file of code inputs
  • Data file of code outputs
  • GEM-SA project file

9
Restrictions on input/output data
  • Single output
  • Multiple outputs must be treated individually
  • GEM-SA can read an outputs file with several
    columns, but only a single column is specified
    within a project
  • Max 30 input parameters
  • Max 400 training points
  • The data files are plain text files
  • One row for each point
  • Rows can be space or tab delimited
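The restrictions above can be checked before loading data into GEM-SA. The sketch below is an independent helper, not part of GEM-SA; the function names and the `output_column` parameter are assumptions made for illustration.

```python
MAX_INPUTS = 30             # limit stated on the slide
MAX_TRAINING_POINTS = 400   # limit stated on the slide

def parse_table(text):
    """Parse space- or tab-delimited text into rows of floats."""
    return [[float(token) for token in line.split()]
            for line in text.splitlines() if line.strip()]

def check_training_data(inputs_text, outputs_text, output_column=0):
    """Return (inputs, output) after checking the stated restrictions."""
    inputs = parse_table(inputs_text)
    outputs = parse_table(outputs_text)
    if not inputs:
        raise ValueError("no training points found")
    if len({len(row) for row in inputs}) != 1:
        raise ValueError("input rows have different numbers of columns")
    if len(inputs[0]) > MAX_INPUTS:
        raise ValueError("more than 30 input parameters")
    if len(inputs) > MAX_TRAINING_POINTS:
        raise ValueError("more than 400 training points")
    if len(outputs) != len(inputs):
        raise ValueError("need exactly one output row per input row")
    # Only one output column is used within a project.
    return inputs, [row[output_column] for row in outputs]
```

For example, `check_training_data("0.1 0.2\n0.3\t0.4\n", "1.0 5.0\n2.0 6.0\n", output_column=1)` returns the two input rows together with the second output column.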

10
Generating a new input design
  • Designs can be generated using the toolbar icon
    or the menu Input → Generate
  • The design dialog appears

11
Generating a new input design
  • Click OK and fill in the required range for each
    input
  • Click OK again

12
Editing input designs
  • If you select a column, you can rescale values of
    that input or round values to be integers
  • Designs can be loaded into or saved from this
    window using the Inputs menu. Use the Copy input
    design to clipboard toolbar icon to copy the
    points to the clipboard for use in other programs
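The two column operations can be sketched outside GEM-SA as follows. This assumes rescaling is a linear map from a known old range to a new range, and that conversion to integer means rounding to the nearest whole number. How GEM-SA itself does either is not stated in the slides, and both function names are hypothetical.

```python
def rescale(design, column, old_range, new_range):
    """Linearly map one column of the design from old_range to new_range."""
    old_lo, old_hi = old_range
    new_lo, new_hi = new_range
    scale = (new_hi - new_lo) / (old_hi - old_lo)
    return [[new_lo + (value - old_lo) * scale if j == column else value
             for j, value in enumerate(row)]
            for row in design]

def to_integer(design, column):
    """Round one column to whole numbers (Python rounds exact halves to even)."""
    return [[round(value) if j == column else value
             for j, value in enumerate(row)]
            for row in design]

design = [[0.0, 0.5], [1.0, 0.25]]
print(to_integer(rescale(design, 0, (0.0, 1.0), (10.0, 20.0)), 0))
```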

13
Types of design
  • GEM-SA can generate 2 types of design
  • LP-τ
  • Maximin Latin Hypercube designs
  • Both have good space-filling properties
  • Ensure all regions of the input space are well
    represented

14
LP-τ design
  • Very quick to generate
  • Deterministic set of uniform points
  • Increasing the sample size just adds points to
    the smaller design
  • Making it useful for sequential analysis
  • Only have to generate the extra runs
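The nesting property described above (a larger design contains the smaller one) can be illustrated with any deterministic low-discrepancy sequence. The sketch below uses a Halton sequence as a simpler stand-in; it is not the sequence GEM-SA generates, and the choice of bases is an assumption.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton(n_points, bases=(2, 3)):
    """First n_points of a Halton sequence on [0, 1]^d, one base per input."""
    return [[radical_inverse(i, base) for base in bases]
            for i in range(1, n_points + 1)]

small = halton(8)
large = halton(16)
# The 16-point design starts with exactly the 8-point design,
# so only the 8 extra runs would need to be generated.
print(large[:8] == small)
```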

15
Maximin Latin hypercube design
  • Maximin Latin Hypercube designs
  • Maximise the minimum distance amongst all pairs
    of points
  • Can take a long time to generate
  • Projections also generally space-filling
  • Lower dimensional projections are also Latin
    hypercubes
  • Good when only a few inputs are active
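A minimal sketch of the maximin idea: draw many random Latin hypercubes and keep the one whose closest pair of points is furthest apart. GEM-SA's own search method is not described in the slides, so the random-restart strategy, the candidate count and the function names here are assumptions.

```python
import itertools
import math
import random

def latin_hypercube(n_points, n_inputs, rng):
    """Random Latin hypercube on [0, 1]^d: one point per stratum in every input."""
    columns = []
    for _ in range(n_inputs):
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [list(point) for point in zip(*columns)]

def min_pairwise_distance(design):
    """Smallest Euclidean distance between any two design points."""
    return min(math.dist(p, q) for p, q in itertools.combinations(design, 2))

def maximin_latin_hypercube(n_points, n_inputs, n_candidates=200, seed=0):
    """Best of n_candidates random Latin hypercubes under the maximin criterion."""
    rng = random.Random(seed)
    candidates = (latin_hypercube(n_points, n_inputs, rng)
                  for _ in range(n_candidates))
    return max(candidates, key=min_pairwise_distance)

design = maximin_latin_hypercube(10, 3)
print(round(min_pairwise_distance(design), 3))
```

Raising `n_candidates` tends to improve the design at the cost of a longer search, which matches the note that these designs can take a long time to generate.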

16
Creating output points
  • Each row from the input design must be used to
    generate outputs by running the computer code
  • One run for each row
  • Various methods to automate this
  • Spreadsheet
  • Simple, but requires functional form
  • Script
  • Only need executable code
  • Loop through inputs, modify code input file
  • Modify code to loop through the points
  • Messy, need source code

17
Example using a spreadsheet
  • Copy the input design to the clipboard using the
    toolbar icon
  • Open Excel and paste inputs
  • Create formula in final column
  • Copy formula for all rows of the design
  • Cut and paste special (values) in a new sheet
  • Save as text file

18
Example using a script
  • Read simulator's base input file
  • Read training inputs file
  • Loop through training file lines
  • Replace target inputs using training line
  • Write new base input file
  • Run code
  • Calculate output(s) and add to training output
    file

19
my $pftchangeline = 21;              # change line 21 within the input file for each run
my @pftchangecols = (11,14,23,19);   # columns within $pftchangeline to modify
my @pftinlh = (0,1,2,3);             # ordering of these parameters within training inputs

open(BASEINFILE, "input.dat") or die $!;   # get initial (fixed) input file used by sdgvmd
my @lines = <BASEINFILE>;                  # and store the input lines in @lines
close BASEINFILE;

open(LHFILE, "training_inputs.txt") or die $!;
my $newpftline = $lines[$pftchangeline];
my @newpftpoints = split(" ", $newpftline);

while (<LHFILE>) {                   # assigns each line in turn to $_
    chomp;
    my @lhpoints = split;            # whitespace-separated values of this training point
    open(INFILE, "> inputfile.dat") or die $!;
    @newpftpoints[@pftchangecols] = @lhpoints[@pftinlh];
    # modify lines
    $lines[$pftchangeline] = join(' ', @newpftpoints)."\n";
    print INFILE @lines;
    close INFILE;
    system("sdgvm0 input.dat");      # run sdgvm0 with modified input
    # now do something with the output files....
    # ...
}
close LHFILE;
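The same loop can be sketched in Python with the simulator call factored out, so the substitution logic can be tested without the executable. The names `substitute_inputs`, `run_design` and `run_simulator` are hypothetical. In practice `run_simulator` would write the lines to the simulator's input file, run the executable, and read back the output of interest.

```python
def substitute_inputs(base_lines, line_index, columns, values):
    """Copy base_lines, replacing selected whitespace-separated fields on one line."""
    fields = base_lines[line_index].split()
    for column, value in zip(columns, values):
        fields[column] = value
    new_lines = list(base_lines)
    new_lines[line_index] = " ".join(fields) + "\n"
    return new_lines

def run_design(base_lines, design_lines, line_index, columns, order, run_simulator):
    """Run the simulator once per design row and collect its outputs."""
    outputs = []
    for design_line in design_lines:
        point = design_line.split()
        # 'order' gives the position of each target parameter in the design row.
        values = [point[i] for i in order]
        run_lines = substitute_inputs(base_lines, line_index, columns, values)
        outputs.append(run_simulator(run_lines))
    return outputs

# Stand-in "simulator" for demonstration only: sums the fields on line 1.
fake = lambda lines: sum(float(f) for f in lines[1].split())
base = ["title line\n", "1 2 3 4\n"]
print(run_design(base, ["9 8\n", "1 1\n"], 1, [0, 2], [0, 1], fake))
```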
20
The project window
  • Appears whenever you
  • Load a project
  • Edit a project
  • Create a new project
  • This window also has 3 tabs
  • Files
  • Options
  • Simulations

21
Names for the input files
Names for the output files
22
How many inputs?
What are the input names?
Which column from output file?
23
What should be calculated, and how?
Which joint effects should be calculated?
24
What prior mean for the output?
How are the inputs uncertain?
25
What kind of prediction?
What kind of cross validation?
26
MCMC control parameters
How many realisations of predictions, main and
joint effects to generate
How many points used to calculate main effects,
joint effects
27
The options tab
28
Input parameter names
  • This window appears if you press the Names
    button
  • Giving names is optional, but useful later when
    looking at GEM-SA output
  • Ordering can be changed using the arrows

29
Selecting joint effects
  • Select calculate joint effects if in sensitivity
    analysis you want to see the joint effects
    (interactions) of pairs of inputs as well as
    their individual effects
  • Use Inputs to include in joint effects panel to
    select which ones
  • Default All inputs computes joint effects for all
    pairs
  • Can take a lot of computation
  • To compute only the joint effects between
    selected inputs, deselect All inputs and select
    the two or more inputs whose joint effects are of
    interest

30
Other checkboxes
  • Sum effects
  • There are two ways to plot the joint effect of
    two inputs
  • A combined effect in which the value plotted is
    the mean output value at that combination of
    input values
  • A pure interaction, in which the main
    effects of those inputs are subtracted from the
    combined effect
  • Select sum effects if you want to see combined
    effects, and deselect it to see interactions
  • This selection is ignored if you don't ask for
    joint effects to be computed
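The difference between the two plots can be seen on a toy model. The sketch below assumes the usual convention that a main effect is measured relative to the overall mean output, so the interaction is the combined effect minus both main effects and the overall mean. The model, the grid and the function names are illustrative; whether GEM-SA centres its plots in exactly this way is not stated in the slides.

```python
from statistics import fmean

GRID = [(k + 0.5) / 20 for k in range(20)]   # midpoints on [0, 1]

def model(x1, x2, x3):
    # Hypothetical simulator with an x1*x2 interaction.
    return x1 + 2 * x2 + 4 * x1 * x2 + x3

def overall_mean():
    return fmean(model(a, b, c) for a in GRID for b in GRID for c in GRID)

def main_effect_x1(x1):
    return fmean(model(x1, b, c) for b in GRID for c in GRID) - overall_mean()

def main_effect_x2(x2):
    return fmean(model(a, x2, c) for a in GRID for c in GRID) - overall_mean()

def combined_effect(x1, x2):
    # "Sum effects" selected: mean output at this (x1, x2), averaged over x3.
    return fmean(model(x1, x2, c) for c in GRID)

def interaction(x1, x2):
    # "Sum effects" deselected: remove both main effects and the overall mean.
    return (combined_effect(x1, x2) - main_effect_x1(x1)
            - main_effect_x2(x2) - overall_mean())

print(combined_effect(1.0, 1.0), interaction(1.0, 1.0))
```

For this model the interaction works out to 4(x1 - 0.5)(x2 - 0.5), which is what remains of the 4·x1·x2 term once its contribution to the two main effects is removed.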

31
Other checkboxes
  • Code has numerical error
  • We generally assume that the model output is
    computed exactly every time
  • So the meta-model passes exactly through all the
    training points
  • There are two situations in which this assumption
    is not right
  • Your code has numerical errors which you want to
    smooth out
  • Your code is stochastic and the output values
    have random noise
  • Selecting code has numerical error turns the
    assumption off
  • The variance of the error will be estimated as
    part of the fitting process
  • The meta-model will smooth out the training
    points to a degree depending on the estimated
    error variance
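The effect of this checkbox can be sketched with a tiny one-input Gaussian process. With no error term the posterior mean passes through every training point; adding an error variance (often called a nugget) to the diagonal of the correlation matrix makes it smooth instead. The squared-exponential kernel, the fixed length scale and the fixed nugget below are assumptions. GEM-SA estimates these quantities during fitting, and its exact correlation function is not given in the slides.

```python
import math

def kernel(a, b, length=0.3):
    """Squared-exponential correlation between two scalar inputs."""
    return math.exp(-((a - b) / length) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gp_mean(train_x, train_y, nugget=0.0):
    """Return a function giving the zero-mean GP posterior mean at a new input."""
    k = [[kernel(a, b) + (nugget if i == j else 0.0)
          for j, b in enumerate(train_x)]
         for i, a in enumerate(train_x)]
    weights = solve(k, train_y)
    return lambda x: sum(w * kernel(x, a) for w, a in zip(weights, train_x))

xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.0, 0.0, 1.0, 0.0, 0.0]           # one spike, as if from noise
exact = gp_mean(xs, ys)                  # interpolates the spike
smooth = gp_mean(xs, ys, nugget=0.1)     # pulls the spike towards its neighbours
print(round(exact(0.5), 6), round(smooth(0.5), 6))
```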
  • Can make the fitting process quite unstable, so
    beware!

32
Other checkboxes
  • Use MCMC for emulator parameters
  • By default, GEM-SA estimates the underlying
    smoothness parameters and then pretends that the
    estimates are exact
  • Selecting use MCMC for emulator parameters takes
    into account uncertainty in the fitting of the
    emulator
  • Slows down the computation substantially, often
    with minimal effect on the results
  • Auto-tune Metropolis algorithm
  • Use only with MCMC
  • If not selected, you must supply a tuning file

33
Input uncertainty options
  • These options are for specifying what kind of
    distribution each uncertain input has
  • There is a limited range of options
  • All unknown, product normal/uniform
  • Inputs are independent, with either normal or
    uniform distributions
  • All known
  • No uncertainty analysis required
  • Some known, rest product normal/uniform
  • Some input values will be fixed (in the dialog
    window or in a prediction file)
  • Others will be given independent distributions,
    either normal or uniform
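The three options amount to different ways of drawing input vectors. The sketch below propagates such draws through a cheap stand-in function by plain Monte Carlo. GEM-SA performs its uncertainty analysis through the emulator, so this illustrates only the input specifications, not GEM-SA's computation. The spec tuples and function names are hypothetical.

```python
import random
import statistics

def sample_inputs(specs, rng):
    """Draw one input vector.

    Each spec is ('fixed', value), ('uniform', low, high)
    or ('normal', mean, variance); inputs are independent.
    """
    point = []
    for spec in specs:
        kind = spec[0]
        if kind == "fixed":
            point.append(spec[1])
        elif kind == "uniform":
            point.append(rng.uniform(spec[1], spec[2]))
        elif kind == "normal":
            # random.gauss takes a standard deviation, not a variance.
            point.append(rng.gauss(spec[1], spec[2] ** 0.5))
        else:
            raise ValueError("unknown input specification: %r" % (kind,))
    return point

def uncertainty_analysis(model, specs, n_samples=2000, seed=1):
    """Monte Carlo estimate of the mean and variance of the model output."""
    rng = random.Random(seed)
    outputs = [model(*sample_inputs(specs, rng)) for _ in range(n_samples)]
    return statistics.fmean(outputs), statistics.variance(outputs)

# "Some known, rest product normal/uniform": first input fixed at 2.0.
specs = [("fixed", 2.0), ("uniform", 0.0, 1.0), ("normal", 5.0, 4.0)]
print(uncertainty_analysis(lambda a, b, c: a + b + c, specs))
```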

34
Input uniform ranges
  • If you say that some or all have uniform
    distributions, a window appears (when you click
    OK) to specify ranges
  • Option to use ranges in input data file

Some fixed, rest uniform
All uniform
35
Input normal parameters
  • If you say that some or all have normal
    distributions, a window appears (when you click
    OK) to specify the mean and variance of each
    distribution
  • Option to use ranges in input data file

Some fixed, rest normal
All normal
36
Prior mean options
  • The emulator will fit better if it knows roughly
    how the output is expected to respond to the
    inputs
  • You have just two choices
  • If you expect to see a trend in the output in
    response to changes in its inputs, select linear
    term for each input
  • Otherwise, selecting constant mean results in no
    overall trends being expected or fitted

37
Selecting prediction type
  • Having fitted the Gaussian process emulator,
    GEM-SA can predict what the output would be if
    the computer code were run at new input sets
  • These are specified in a prediction file
  • If there is no prediction file, selecting the
    prediction type has no effect
  • Predictions can be
  • Simulated realisations of outputs at the
    prediction inputs
  • Similar to main effect outputs
  • Takes account of correlation between predictions
  • Marginal means and variances of outputs at the
    prediction inputs
  • Faster to compute, especially with many
    prediction points
  • Easy to interpret

38
Selecting cross validation type
  • Cross-validation is a way of checking the
    validity of the predictions made by GEM-SA
  • The idea is to fit the emulator leaving out some
    of the training data points, then predict the
    missing points and see how well the predictions
    do
  • Choice of none, leave-one-out or leave final 20%
    out
  • Leave-one-out
  • Hyper-parameters are estimated using all data and
    are then fixed when prediction is carried out for
    each omitted point
  • Leave final 20% out
  • Hyper-parameters are estimated using the reduced
    (80%) data subset
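The two schemes differ in which training points are held out. A minimal sketch of the index splits follows, assuming the second scheme holds out the final fifth of the training points in file order. The function names are hypothetical, and the emulator fit itself is left out.

```python
import math

def leave_one_out_splits(n_points):
    """Yield (train_indices, test_indices) with one point held out each time."""
    for held_out in range(n_points):
        train = [i for i in range(n_points) if i != held_out]
        yield train, [held_out]

def leave_final_fraction_split(n_points, fraction=0.2):
    """Hold out the last `fraction` of the points, in file order."""
    n_test = max(1, round(n_points * fraction))
    cut = n_points - n_test
    return list(range(cut)), list(range(cut, n_points))

def rmse(predicted, actual):
    """Root mean squared prediction error over the held-out points."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual))
                     / len(actual))

print(leave_final_fraction_split(10))
print(sum(1 for _ in leave_one_out_splits(10)), "leave-one-out fits")
```

The splits alone do not capture the other difference described above: in leave-one-out the hyper-parameters are estimated once from all the data, whereas in the held-out-block scheme they are estimated from the reduced set only.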

39
The files and simulations tabs
40
GEM-SA files
  • You always have to specify an Inputs File and an
    Outputs File
  • You only need to specify a Prediction Inputs File
    if you want to generate predictions
  • You only need to specify a Metropolis-Hastings
    Tuning File if you select MCMC for computation
    and deselect auto-tuning
  • The Main effects file will always be created when
    you do sensitivity analysis
  • The Joint Effects file will be created if you ask
    for joint effects to be computed
  • The Predictions File will be created if you ask
    for predictions (by specifying a Prediction
    Inputs File)
  • It will contain simulated predictions or
    prediction means
  • The Predictions Variance File is created if you
    ask for predictions and specify prediction means
    and variances

41
Simulations
  • The first three of these settings apply only if
    you select MCMC computation
  • For expert users only!
  • You could choose the number of simulations that
    are computed for each main effect and interaction
  • But the default is generally plenty
  • You might want to increase the number of points
    on each main effect
  • To get more detail in the plots
  • But at the cost of longer computations