Crystal Linkletter and Derek Bingham - PowerPoint PPT Presentation

Variable Selection for Gaussian Process Models in Computer Experiments

Crystal Linkletter and Derek Bingham
Department of Statistics and Actuarial Science, Simon Fraser University

David Higdon and Nick Hengartner
Statistical Sciences, Discrete Event Simulations, Los Alamos National Laboratory

Kenny Q. Ye
Department of Epidemiology and Population Health, Albert Einstein College of Medicine

Introduction

Computer simulators often require a large number of inputs and are computationally demanding. A main goal of computer experimentation may be screening: identifying which inputs have a significant impact on the process being studied. Gaussian spatial process (GASP) models are commonly used to model computer simulators. These models are flexible, but they make variable selection challenging. We present reference distribution variable selection (RDVS) as a new approach to screening for GASP models.

Results

Simulated Example

We used a 54-run space-filling Latin hypercube design with p = 10 factors. The response is generated by [equation missing from the transcript]. A GASP model is used to analyse the generated response, and the RDVS algorithm identifies the first four factors as active.

Figure: Posterior distributions for the correlation parameters of the 10 factors. The horizontal line marks the 10th percentile of the reference distribution. Correlation parameters with posterior medians below this line indicate active factors.

Taylor Cylinder Experiment

A 118-run, 5-level, nearly-orthogonal design was used. Exploratory analysis suggests factor 6 is important; otherwise, significant factors are difficult to identify. RDVS identifies factor 6 and six other factors as having a significant impact on cylinder deformation.
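
As an illustration of the kind of design used in the simulated example, the following is a minimal sketch of a random Latin hypercube with 54 runs and 10 factors. It rests on several assumptions: the function name `latin_hypercube` is hypothetical, the inputs are scaled to [0, 1], and no space-filling criterion is optimized, so this is not necessarily the design the authors used.

```python
import numpy as np

def latin_hypercube(n_runs, n_factors, rng):
    """Random Latin hypercube on [0, 1]^p.

    Each column places exactly one point in each of the n_runs
    equal-width strata of [0, 1].
    """
    # One independent random permutation of the strata 0..n_runs-1 per column
    strata = np.column_stack(
        [rng.permutation(n_runs) for _ in range(n_factors)]
    )
    # Jitter each point uniformly within its stratum, then rescale to [0, 1]
    return (strata + rng.uniform(size=(n_runs, n_factors))) / n_runs

rng = np.random.default_rng(0)
X = latin_hypercube(54, 10, rng)  # 54 runs, p = 10 factors
```

A space-filling variant would typically generate many such candidates and keep the one that scores best on a distance criterion; that refinement is omitted here.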

Discussion

RDVS is able to correctly identify when none of the true factors are active. This variable selection technique complements methods in sensitivity analysis. It can be used as a precursor to alternative visualization and ANOVA approaches to screening. The method is robust to the specification of the prior distributions. Since the inert variable is assigned the same prior as the true factors, the method self-calibrates.

Gaussian Spatial Process Model

To model the response from a computer experiment, we use a Bayesian version of the GASP model originally used by Sacks et al. (1989). The model has the following components:

  • y(X): simulator response, an (n x 1) vector
  • X: input to the computer code, an (n x p) design matrix
  • ε: white-noise process, independent of z(X)

The Gaussian spatial process, z(X), is specified to have mean zero and covariance function [equation missing from the transcript]. Under this parameterization, if ρ_k is close to one, the kth input is not active. RDVS is a method for gauging the relative magnitudes of the correlation parameters ρ_k.
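
The covariance function itself did not survive in this transcript. As a hedged illustration only, the sketch below assumes a product power correlation, Cov(z(x_i), z(x_j)) = (1/lam_z) * prod_k rho_k^(4 (x_ik - x_jk)^2), with the white noise added on the diagonal. That form is an assumption chosen because it matches the stated behaviour (a correlation parameter close to one makes the corresponding input inactive); the names `gasp_covariance`, `lam_z`, and `lam_eps` are hypothetical.

```python
import numpy as np

def gasp_covariance(X, rho, lam_z=1.0, lam_eps=1e4):
    """Covariance matrix of the response under an ASSUMED correlation form.

    Assumed (not taken from the text):
        Cov(z(x_i), z(x_j)) = (1 / lam_z) * prod_k rho[k] ** (4 * (x_ik - x_jk) ** 2)
    Independent white noise with variance 1 / lam_eps is added on the diagonal.
    Each rho[k] must lie in (0, 1].
    """
    X = np.asarray(X, dtype=float)
    rho = np.asarray(rho, dtype=float)
    # Squared differences for every pair of runs and every input: shape (n, n, p)
    sq_diff = (X[:, None, :] - X[None, :, :]) ** 2
    # Product over inputs, computed as a sum in log space
    log_corr = (4.0 * sq_diff * np.log(rho)).sum(axis=2)
    K = np.exp(log_corr) / lam_z
    return K + np.eye(X.shape[0]) / lam_eps
```

Under this assumed form, setting a correlation parameter to exactly one removes that input from the covariance, which is the sense in which the input is inert.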

Conclusions and Future Research

  • RDVS is a new method for variable selection for Bayesian Gaussian Spatial Process models.
  • The methodology is motivated by asking: what would the posterior distribution of the correlation parameter for an inert factor look like, given the data?
  • The approach is Bayesian and only requires the generation of an inert factor, but the screening has a frequentist flavour, using the distribution of the inert factor as a reference distribution.
  • Future research:
      • Using a linear regression model for the mean of the GASP model
      • Using RDVS for variable selection for other models

Computer Experiment Example: Taylor Cylinder Experiment (Los Alamos National Lab)

The simulator is a finite element code used to simulate the high-velocity impact of a cylinder. In the experiment, copper cylinders (length 5.08 cm, radius 1 cm) are fired into a fixed barrier at a velocity of 177 m/s. The cylinder length after impact is used as the outcome. The process is governed by 14 parameters which control the behaviour of the cylinder after impact. Over the limited range that the computer experiment exercises the simulator, it is expected that the response is dominated by only a few of the 14 parameters.

RDVS Algorithm

To implement RDVS, a factor which is known to be inert is appended to the design matrix X. This provides a benchmark against which the other input factors can be compared.

Algorithm:

  1. Augment the design matrix by adding a new design column corresponding to an inert factor.
  2. Find the posterior median of the correlation parameter corresponding to the inert factor.
  3. Repeat steps 1 and 2 many times to obtain the distribution of the posterior median of an inert factor to use as a reference distribution.
  4. Compare the posterior medians of the correlation parameters of the true factors to the reference distribution. The percentile of the reference distribution used for comparison reflects the rate of falsely identifying an inert factor as active.
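
The steps of the algorithm can be sketched in code. The following is a simplified illustration, not the authors' implementation, and it makes several assumptions that do not come from the text: a product power correlation rho_k^(4 d^2), fixed precisions `lam_z` and `lam_eps`, a Uniform(0, 1) prior on every correlation parameter, a basic componentwise random-walk Metropolis sampler, and averaging of the true factors' posterior medians across repetitions. All function names are hypothetical, and the response `y` is expected to be standardized.

```python
import numpy as np

def log_post(rho, X, y, lam_z=1.0, lam_eps=100.0):
    """Log posterior of rho (up to a constant) for a zero-mean GP.

    ASSUMED: correlation rho_k ** (4 d^2), fixed precisions, Uniform(0, 1) prior.
    """
    d2 = (X[:, None, :] - X[None, :, :]) ** 2
    K = np.exp((4.0 * d2 * np.log(rho)).sum(axis=2)) / lam_z
    K += np.eye(len(y)) / lam_eps
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, y)
    # -0.5 * log|K| - 0.5 * y' K^{-1} y, with log|K| = 2 * sum(log(diag(L)))
    return -np.log(np.diag(L)).sum() - 0.5 * alpha @ alpha

def posterior_medians(X, y, rng, n_iter=300, step=0.15):
    """Componentwise random-walk Metropolis; posterior median of each rho_k."""
    p = X.shape[1]
    rho = np.full(p, 0.5)
    lp = log_post(rho, X, y)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        for k in range(p):
            prop = rho.copy()
            prop[k] += step * rng.normal()
            if 0.0 < prop[k] < 1.0:  # prior density is zero outside (0, 1)
                lp_prop = log_post(prop, X, y)
                if np.log(rng.uniform()) < lp_prop - lp:
                    rho, lp = prop, lp_prop
        draws[t] = rho
    # Discard the first half of the chain as burn-in
    return np.median(draws[n_iter // 2:], axis=0)

def rdvs(X, y, rng, n_rep=20, percentile=10.0):
    """Flag factors whose posterior median falls below the reference percentile."""
    n, p = X.shape
    true_medians = np.zeros(p)
    reference = np.empty(n_rep)
    for r in range(n_rep):
        # Step 1: append an inert column generated independently of y
        inert = rng.uniform(size=(n, 1))
        # Step 2: posterior medians for all columns, inert column last
        meds = posterior_medians(np.hstack([X, inert]), y, rng)
        reference[r] = meds[-1]
        # Average the true factors' medians over repetitions (a choice made here)
        true_medians += meds[:p] / n_rep
    # Step 4: compare against a lower percentile of the reference distribution
    cutoff = np.percentile(reference, percentile)
    return true_medians < cutoff, true_medians, reference
```

Calling `rdvs(X, y, np.random.default_rng(0))` returns a boolean vector of flagged factors, the averaged posterior medians, and the reference distribution. The `percentile` argument plays the role described in step 4: it sets the rate at which an inert factor would be falsely flagged as active. A real analysis would need far longer chains and more repetitions than these defaults.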

Acknowledgements

This research was initiated while Linkletter, Bingham and Ye were visiting the Statistical Sciences group at Los Alamos National Laboratory. This work was supported by a grant from the Natural Sciences and Engineering Research Council of Canada. Ye's research was supported by NSF DMS-0306306.