Title: CHAPTER 17 OPTIMAL DESIGN FOR EXPERIMENTAL INPUTS
1 CHAPTER 17 OPTIMAL DESIGN FOR EXPERIMENTAL INPUTS
Slides for Introduction to Stochastic Search and Optimization (ISSO) by J. C. Spall
- Organization of chapter in ISSO
- Background
- Motivation
- Finite sample and asymptotic (continuous) designs
- Precision matrix and D-optimality
- Linear models
- Connections to D-optimality
- Key equivalence theorem
- Response surface methods
- Nonlinear models
- Note: Appendix to these slides is a brief discussion of factorial design (not in ISSO)
2 Optimal Design in Simulation
- Two roles for experimental design in simulation
- Building approximation to existing large-scale simulation via metamodel
- Building simulation model itself
- Metamodels are curve fits that approximate simulation input/output
- Usual form is low-order polynomial in the inputs; linear in parameters $\theta$
- Linear design theory useful
- Building simulation model
- Typically need nonlinear design theory
- Some terminology distinctions
- Factors (statistics term) ↔ Inputs (modeling and simulation terms)
- Levels ↔ Values
- Treatments ↔ Runs
3 Unique Advantages of Design in Simulation
- Simulation experiments may be considered special case of general experiments
- Some unique benefits occur due to simulation structure
- Can control factors not generally controllable (e.g., arrival rates into network)
- Direct repeatability due to deterministic nature of random number generators
- Variance reduction (common random numbers (CRNs), etc.) may be helpful
- Not necessary to randomize runs to avoid systematic variation due to inherent conditions
- E.g., randomization in run order and input levels in biological experiment to reduce effects of change in ambient humidity in laboratory
- In simulation, systematic effects can be eliminated since analyst controls nature
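The repeatability and CRN points above can be illustrated with a small sketch. The toy `simulate` function, its exponential model, and the two input rates are hypothetical stand-ins chosen for illustration, not anything taken from ISSO.

```python
import numpy as np

def simulate(rate, seed, n=1000):
    """Toy stochastic simulation (hypothetical): mean of n exponential draws scaled by 1/rate."""
    rng = np.random.default_rng(seed)
    return float(np.mean(rng.exponential(1.0, size=n) / rate))

# Direct repeatability: the same seed reproduces the same output exactly.
repeatable = simulate(2.0, seed=7) == simulate(2.0, seed=7)

# Compare two input levels over 200 replications, with and without CRNs.
seeds = range(200)
diff_crn = [simulate(2.0, s) - simulate(2.5, s) for s in seeds]           # same seed for both runs
diff_ind = [simulate(2.0, s) - simulate(2.5, s + 10_000) for s in seeds]  # different seeds

var_crn = float(np.var(diff_crn))
var_ind = float(np.var(diff_ind))
print(f"repeatable: {repeatable}")
print(f"variance of difference, CRN: {var_crn:.2e}, independent: {var_ind:.2e}")
```

In this toy case the CRN differences vary much less because both input levels see the same underlying random draws.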
4 Design of Computer Experiments in Statistics
- There exists significant activity among statisticians for experimental design based on computer experiments
- T. J. Santner et al. (2003), The Design and Analysis of Computer Experiments, Springer-Verlag
- J. Sacks et al. (1989), Design and Analysis of Computer Experiments (with discussion), Statistical Science, pp. 409–435
- Etc.
- Above statistical work differs from experimental design with Monte Carlo simulations
- Above work assumes deterministic function evaluations via computer (e.g., solution to complicated ODE)
- One implication of deterministic function evaluations: no need to replicate experiments for given set of inputs
- Contrasts with Monte Carlo, where replication provides variance reduction
5 General Optimal Design Formulation (Simulation or Non-Simulation)
- Assume model
- $z = h(\theta, x) + v$,
- where $x$ is an input we are trying to pick optimally
- Experimental design $\xi$ consists of $N$ specific input values $x = \chi_i$ and proportions (weights) $w_i$ assigned to these input values
- Finite-sample design allocates $n \ge N$ available measurements exactly; asymptotic (continuous) design allocates based on $n \to \infty$
6 D-Optimal Criterion
- Picking optimal design $\xi$ requires criterion for optimization
- Most popular criterion is D-optimal measure
- Let $M(\theta, \xi)$ denote the precision matrix for an estimate of $\theta$ based on a design $\xi$
- $M(\theta, \xi)$ is inverse of covariance matrix for estimate
- and/or
- $M(\theta, \xi)$ is Fisher information matrix for estimate
- D-optimal solution is
- $\xi^* = \arg\max_{\xi} \det M(\theta, \xi)$
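A minimal sketch of comparing designs under the D-optimal measure, assuming a linear-in-parameters model with unit noise variance so that the normalized precision matrix is the weighted sum of outer products of the regressor vectors. The helper names `precision_matrix` and `line`, and the two candidate designs, are hypothetical.

```python
import numpy as np

def precision_matrix(points, weights, regressors):
    """Normalized precision matrix: sum_i w_i h(x_i) h(x_i)^T (unit noise variance assumed)."""
    H = np.array([regressors(x) for x in points])    # N x p matrix of regressor vectors
    return H.T @ (np.asarray(weights)[:, None] * H)

def line(x):
    """Regressor vector for the straight-line model z = theta_0 + theta_1 x + v on [-1, 1]."""
    return np.array([1.0, x])

endpoint_design = ([-1.0, 1.0], [0.5, 0.5])               # all weight at the interval ends
spread_design = ([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3]) # weight spread over three points

det_endpoint = np.linalg.det(precision_matrix(*endpoint_design, line))
det_spread = np.linalg.det(precision_matrix(*spread_design, line))
print(det_endpoint, det_spread)
```

Under these assumptions the endpoint design has the larger determinant, so it is preferred by the D-optimal measure among the two candidates.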
7 Equivalence Theorem
- Consider linear model
- $z_k = h_k^T \theta + v_k$
- Prediction based on parameter estimate $\hat{\theta}_n$ and future measurement vector $h^T$ is
- $\hat{z} = h^T \hat{\theta}_n$
- Kiefer-Wolfowitz equivalence theorem states:
- D-optimal solution for determining $\xi$ to be used in forming $\hat{z}$ is the same $\xi$ that minimizes the maximum variance of predictor $\hat{z}$
- Useful in practical determination of optimal $\xi$
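A numerical check of the equivalence idea, as a sketch. It uses quadratic regression on [-1, 1], for which equal weight at -1, 0, 1 is the standard textbook D-optimal asymptotic design (this may or may not be the example used in ISSO). A known consequence of the equivalence theorem is that the maximum normalized predictor variance at a D-optimal design equals the number of parameters p. The function names and the alternative design are assumptions.

```python
import numpy as np

def h(x):
    """Regressor vector for the quadratic model z = theta_0 + theta_1 x + theta_2 x^2 + v."""
    return np.array([1.0, x, x ** 2])

def variance_function(x, points, weights):
    """Normalized predictor variance h(x)^T M^{-1} h(x) for the design (points, weights)."""
    M = sum(w * np.outer(h(p), h(p)) for p, w in zip(points, weights))
    return float(h(x) @ np.linalg.solve(M, h(x)))

grid = np.linspace(-1.0, 1.0, 201)

# Candidate 1: equal weight at -1, 0, 1 (D-optimal for this model).
max_var_opt = max(variance_function(x, [-1.0, 0.0, 1.0], [1 / 3] * 3) for x in grid)

# Candidate 2: equal weight at -1, -0.5, 0.5, 1 (not D-optimal).
max_var_alt = max(variance_function(x, [-1.0, -0.5, 0.5, 1.0], [0.25] * 4) for x in grid)

print(max_var_opt, max_var_alt)  # first is p = 3 up to rounding; second is larger
```

In practice this is how the theorem helps: a candidate design can be tested by checking whether its maximum variance over the design region reaches p.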
8 Variance Function as It Depends on Input: Optimal Asymptotic Design for Example 17.6 in ISSO
9 Orthogonal Designs
- With linear models, usually more than one solution is D-optimal
- Orthogonality is means of reducing number of solutions
- Orthogonality also introduces desirable secondary properties
- Separates effects of input factors (avoids aliasing)
- Makes estimates for elements of $\theta$ uncorrelated
- Orthogonal designs are not generally D-optimal; D-optimal designs are not generally orthogonal
- However, some designs are both
- Classical factorial (cubic) designs are orthogonal (and often D-optimal)
10 Example Orthogonal Designs, r = 2 Factors
[Figure: two panels with axes $x_{k1}$ and $x_{k2}$: cube ($2^r$ design) and star ($2r$ design)]
11 Example Orthogonal Designs, r = 3 Factors
[Figure: example designs in three dimensions, with third axis $x_{k3}$]
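The orthogonality of the cube and star designs can be checked numerically. This sketch assumes levels coded as -1 and +1, star points at unit distance from the center, and a first-order model with intercept; the function name `information_matrix` is hypothetical.

```python
import itertools
import numpy as np

r = 2
# Cube (2^r) design: every combination of low (-1) and high (+1) levels.
cube = np.array(list(itertools.product([-1.0, 1.0], repeat=r)))
# Star (2r) design: one factor at -1 or +1 with the other factors at the center (0).
star = np.vstack([sign * np.eye(r) for sign in (-1.0, 1.0)])

def information_matrix(design):
    """X^T X for the first-order model z = theta_0 + theta_1 x_1 + ... + theta_r x_r + v."""
    X = np.column_stack([np.ones(len(design)), design])
    return X.T @ X

for name, design in (("cube", cube), ("star", star)):
    XtX = information_matrix(design)
    off_diagonal = XtX - np.diag(np.diag(XtX))
    print(name, "orthogonal:", np.allclose(off_diagonal, 0.0))
```

A diagonal X^T X means the least-squares estimates of the individual parameters are uncorrelated, which is the secondary property listed for orthogonal designs.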
12 Response Surface Methodology (RSM)
- Suppose want to determine inputs $x$ that minimize the mean response $z$ of some process ($E(z)$)
- There are also other (nonoptimization) uses for RSM
- RSM can be used to build local models with the aim of finding the optimal $x$
- Based on building a sequence of local models as one moves through factor ($x$) space
- Each response surface is typically a simple regression polynomial
- Experimental design can be used to determine input values for building response surfaces
13 Steps of RSM for Optimizing x
- Step 0 (Initialization): Initial guess at optimal value of $x$.
- Step 1 (Collect data): Collect responses $z$ from several $x$ values in neighborhood of current estimate of best $x$ value (can use experimental design).
- Step 2 (Fit model): From the $x$, $z$ pairs in step 1, fit regression model in region around current best estimate of optimal $x$.
- Step 3 (Identify steepest descent path): Based on response surface in step 2, estimate path of steepest descent in factor space.
- Step 4 (Follow steepest descent path): Perform series of experiments at $x$ values along path of steepest descent until no additional improvement in $z$ response is obtained. This $x$ value represents new estimate of best vector of factor levels.
- Step 5 (Stop or return): Go to step 1 and repeat process until final best factor level is obtained.
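The steps above can be sketched as follows. The noisy response function `respond`, its minimizer at (2, -1), the local design half-width, the step length, and the fixed budget of six passes are all hypothetical choices for illustration; a first-order model fitted on a 2^2 factorial design stands in for the local response surface.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def respond(x):
    """Hypothetical noisy process; its mean response is minimized at x = (2, -1)."""
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2 + rng.normal(0.0, 0.02)

corners = np.array(list(itertools.product([-1.0, 1.0], repeat=2)))  # 2^2 factorial, coded units
half_width, step = 0.5, 0.25    # local design size and step length along the path (assumed)

x_best = np.array([0.0, 0.0])                          # Step 0: initial guess
for _ in range(6):                                     # Step 5: fixed budget as stopping rule
    X_local = x_best + half_width * corners            # Step 1: design around current estimate
    z_local = np.array([respond(x) for x in X_local])
    A = np.column_stack([np.ones(len(X_local)), X_local])
    coef, *_ = np.linalg.lstsq(A, z_local, rcond=None) # Step 2: first-order regression fit
    gradient = coef[1:]
    if np.linalg.norm(gradient) < 1e-12:
        break
    direction = -gradient / np.linalg.norm(gradient)   # Step 3: steepest descent direction
    z_best = respond(x_best)
    while True:                                        # Step 4: move while response improves
        x_try = x_best + step * direction
        z_try = respond(x_try)
        if z_try >= z_best:
            break
        x_best, z_best = x_try, z_try

print("estimated minimizer:", x_best)
```

A more careful version would shrink the local design and step length near the solution, or switch to a second-order model there.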
14 Conceptual Illustration of RSM for Two Variables in x; Shows More Refined Experimental Design Near Solution
Adapted from Montgomery (2005), Design and Analysis of Experiments, Fig. 11-3
15 Nonlinear Design
- Assume model
- $z = h(\theta, x) + v$,
- where $\theta$ enters nonlinearly and $x$ is r-dimensional input vector
- D-optimality remains dominant measure
- Maximization of determinant of Fisher information matrix (from Chapter 13 of ISSO: $F_n(\theta, X)$ is Fisher information matrix based on $n$ inputs in $n \times r$ matrix $X$)
- Fundamental distinction from linear case is that D-optimal criterion depends on $\theta$
- Leads to conundrum:
- Choosing $X$ to best estimate $\theta$, yet need to know $\theta$ to determine $X$
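A small sketch of the dependence on $\theta$, assuming the hypothetical scalar model h(theta, x) = exp(-theta x) with independent Gaussian noise. With a scalar parameter the determinant of the Fisher information is the information itself. The function names and the candidate input range are assumptions.

```python
import numpy as np

def fisher_information(theta, x, sigma=1.0):
    """Fisher information for z = exp(-theta x) + v, v ~ N(0, sigma^2), summed over inputs x."""
    x = np.atleast_1d(x)
    dh_dtheta = -x * np.exp(-theta * x)         # sensitivity of h to theta at each input
    return float(np.sum(dh_dtheta ** 2) / sigma ** 2)

candidates = np.linspace(0.01, 5.0, 500)        # candidate input values (assumed design region)

def best_single_input(theta):
    """Input that maximizes the information from one measurement at the given theta."""
    info = [fisher_information(theta, x) for x in candidates]
    return float(candidates[int(np.argmax(info))])

for theta in (0.5, 1.0, 2.0):
    print(f"theta = {theta}: best x is about {best_single_input(theta):.2f} (analytically 1/theta)")
```

The best input moves with $\theta$, which is the conundrum in miniature: the design cannot be fixed without some value for the parameter being estimated.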
16 Strategies for Coping with Dependence on $\theta$
- Assume nominal value of $\theta$ and develop an optimal design based on this fixed value
- Sequential design strategy based on an iterated design and model fitting process
- Bayesian strategy where a prior distribution is assigned to $\theta$, reflecting uncertainty in the knowledge of the true value of $\theta$
17 Sequential Approach for Parameter Estimation and Optimal Design
- Step 0 (Initialization): Make initial guess at $\theta$, $\hat{\theta}_0$. Allocate $n_0$ measurements to initial design. Set $k = 0$ and $n = 0$.
- Step 1 (D-optimal maximization): Given $X_n$, choose the $n_k$ inputs in $X$ to maximize $\det[F_n(\hat{\theta}_n, X_n) + F_{n_k}(\hat{\theta}_n, X)]$
- Step 2 (Update $\theta$ estimate): Collect $n_k$ measurements based on inputs from step 1. Use measurements to update from $\hat{\theta}_n$ to $\hat{\theta}_{n+n_k}$
- Step 3 (Stop or return): Stop if the value of $\theta$ in step 2 is satisfactory. Else return to step 1 with the new $k$ set to the former $k + 1$ and the new $n$ set to the former $n + n_k$ (updated $X_n$ now includes inputs from step 1).
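A minimal sketch of the sequential loop, assuming the hypothetical scalar model z = exp(-theta x) + v with Gaussian noise, one new input per pass, and a crude grid-search least-squares fit standing in for whatever estimator would actually be used. The true value, noise level, candidate inputs, and the budget of 30 measurements are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
theta_true, sigma = 2.0, 0.05              # hypothetical truth, used only to generate data
candidates = np.linspace(0.05, 3.0, 60)    # allowable input values (assumed)
theta_grid = np.linspace(0.1, 5.0, 491)    # search grid for the least-squares fit

def info(theta, x):
    """Fisher information contributed by inputs x for z = exp(-theta x) + v."""
    x = np.atleast_1d(x)
    return float(np.sum((x * np.exp(-theta * x)) ** 2) / sigma ** 2)

def fit(x, z):
    """Crude least-squares estimate of theta by grid search (stand-in for any estimator)."""
    sse = ((z[None, :] - np.exp(-theta_grid[:, None] * x[None, :])) ** 2).sum(axis=1)
    return float(theta_grid[int(np.argmin(sse))])

theta_hat = 1.0                            # Step 0: initial guess
X, Z = np.array([]), np.array([])
for k in range(30):
    # Step 1: pick the next input to maximize the updated information.
    # For scalar theta the determinant is just the information itself.
    total = [info(theta_hat, X) + info(theta_hat, x) for x in candidates]
    x_new = candidates[int(np.argmax(total))]
    # Step 2: collect the measurement and update the estimate from all data so far.
    z_new = np.exp(-theta_true * x_new) + rng.normal(0.0, sigma)
    X, Z = np.append(X, x_new), np.append(Z, z_new)
    theta_hat = fit(X, Z)
    # Step 3: the fixed budget of 30 measurements stands in for "stop if satisfactory".

print("final estimate:", theta_hat)
```

Each pass treats the current estimate as a nominal value, so the chosen inputs drift toward the region that is most informative for the true parameter as the estimate improves.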
18 Comments on Sequential Design
- Note two optimization problems being solved: one for $\theta$, one for $\xi$
- Determine next $n_k$ input values (step 1) conditioned on current value of $\theta$
- Each step analogous to nonlinear design with fixed (nominal) value of $\theta$
- Full sequential mode ($n_k = 1$) updates $\theta$ based on each new input-output pair $(x_k, z_k)$
- Can use stochastic approximation to update $\theta$
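One common stochastic-gradient form of such an update, given here as a sketch under the assumption of a squared-error loss on each new measurement. The gain sequence $a_k > 0$ and the indexing of the newest pair as $(x_{k+1}, z_{k+1})$ are assumptions, not necessarily the notation of the original slide.

```latex
\hat{\theta}_{k+1} = \hat{\theta}_k - a_k \, Y_k(\hat{\theta}_k), \qquad
Y_k(\theta) = \frac{\partial}{\partial \theta} \, \tfrac{1}{2}\bigl[z_{k+1} - h(\theta, x_{k+1})\bigr]^2
            = -\bigl[z_{k+1} - h(\theta, x_{k+1})\bigr] \frac{\partial h(\theta, x_{k+1})}{\partial \theta}
```

Here $Y_k$ is the gradient of the instantaneous squared error, so each new input-output pair nudges the estimate in the direction that reduces the prediction error for that pair.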
19 Bayesian Design Strategy
- Assume prior distribution (density) for $\theta$, $p(\theta)$, reflecting uncertainty in the knowledge of the true value of $\theta$
- There exist multiple versions of D-optimal criterion
- One possible D-optimal criterion:
- $E[\log \det F_n(\theta, X)] = \int \log \det F_n(\theta, X) \, p(\theta) \, d\theta$
- Above criterion related to Shannon information
- While log transform makes no difference with fixed $\theta$, it does affect integral-based solution
- To simplify integral, may be useful to choose discrete prior $p(\theta)$
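A sketch of a Bayesian criterion with a discrete prior, again assuming the hypothetical scalar model z = exp(-theta x) + v with unit noise variance and a single measurement. The prior support, probabilities, and candidate inputs are made up for illustration. The integral reduces to a weighted sum, and the sketch also evaluates the same sum without the log to show that the transform changes the chosen input.

```python
import numpy as np

thetas = np.array([1.0, 2.0, 3.0])         # support of a hypothetical discrete prior
prior = np.array([0.25, 0.5, 0.25])        # prior probabilities p(theta)
candidates = np.linspace(0.01, 3.0, 300)   # candidate input values (assumed design region)

def info(theta, x):
    """Fisher information of one measurement of z = exp(-theta x) + v, unit noise variance."""
    return (x * np.exp(-theta * x)) ** 2

# Information for every (theta, x) pair: rows index theta, columns index x.
F = info(thetas[:, None], candidates[None, :])

expected_log = prior @ np.log(F)           # discrete version of the expected log determinant
expected_plain = prior @ F                 # same average without the log transform

x_log = float(candidates[int(np.argmax(expected_log))])
x_plain = float(candidates[int(np.argmax(expected_plain))])
print(f"with log: x = {x_log:.2f}; without log: x = {x_plain:.2f}")
```

For this toy model the log criterion is maximized at the reciprocal of the prior mean of theta, while the untransformed average favors a noticeably larger input.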
20 Appendix to Slides for Chapter 17: Factorial Design (not in ISSO)
- Classical experimental design deals with linear models
- Factorial design is most popular classical method
- All r inputs (factors) changed at one time
- Factorial design provides two key advantages over one-at-a-time changes
- Greater efficiency in extracting information from a given number of experiments
- Ability to determine if there are interaction effects
- Standard method is $2^r$ factorial; the 2 comes about by looking at each input at two levels: low (−) and high (+)
- E.g., if r = 3, then have $2^3 = 8$ input combinations:
- (− − −), (+ − −), (− + −), (− − +),
- (+ + −), (+ − +), (− + +), (+ + +)
21 Appendix to Slides (cont'd): Factorial Design with 3 Inputs
- Consider r = 3 linear model
- $z_k = \theta_0 + \theta_1 x_{k1} + \theta_2 x_{k2} + \theta_3 x_{k3} + \theta_4 x_{k1} x_{k2} + \theta_5 x_{k1} x_{k3} + \theta_6 x_{k2} x_{k3} + \theta_7 x_{k1} x_{k2} x_{k3} + \text{noise}$,
- where $\theta = [\theta_0, \theta_1, \ldots, \theta_7]^T$ represents vector of (unknown) parameters and $x_{ki}$ represents ith term in input vector $x_k$
- $2^3$ factorial design allows for efficient estimation of all parameters in $\theta$
- In contrast, one-at-a-time provides no information for estimating $\theta_4$ to $\theta_7$
- However, $2^3$ factorial design must be augmented in some way if wish to add quadratic (e.g., $x_{k1}^2$) or other higher-order polynomial terms to model
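The contrast between the factorial and one-at-a-time designs can be seen from the rank of the regression matrix for the eight-parameter model. This sketch assumes levels coded as -1 and +1 and a one-at-a-time plan consisting of a baseline run plus one run per input; `model_matrix` is a hypothetical helper.

```python
import itertools
import numpy as np

def model_matrix(design):
    """Columns for theta_0..theta_7: intercept, main effects, two-way and three-way interactions."""
    x1, x2, x3 = design.T
    return np.column_stack([np.ones(len(design)), x1, x2, x3,
                            x1 * x2, x1 * x3, x2 * x3, x1 * x2 * x3])

# 2^3 factorial: all combinations of low (-1) and high (+1).
factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))

# One-at-a-time: a baseline run with all inputs low, then each input raised alone.
one_at_a_time = np.array([[-1, -1, -1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], dtype=float)

rank_factorial = np.linalg.matrix_rank(model_matrix(factorial))
rank_oat = np.linalg.matrix_rank(model_matrix(one_at_a_time))
print("rank with 2^3 factorial:", rank_factorial)   # 8: all parameters estimable
print("rank with one-at-a-time:", rank_oat)         # 4: interactions cannot be separated
```

With the factorial design the columns are also mutually orthogonal, so each parameter estimate uses all eight runs.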
22 Appendix to Slides (cont'd): Illustration of Interaction with 2 Inputs
- Example responses for r = 2: no interaction and interaction between input variables
- Left plot (no interaction) shows that change in $z_k$ with change in $x_{k2}$ does not depend on $x_{k1}$; right plot (interaction) shows change in $z_k$ does depend on $x_{k1}$
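[Figure: two panels, "No interaction" and "Interaction", each plotting $z_k$ against $x_{k2}$ with separate lines for $x_{k1}$ low and $x_{k1}$ high; endpoints labeled (− −), (+ −), (− +), (+ +)]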