1
Sequential Supersaturated Designs for Effective Screening
  • Angela Jugan
  • University of Missouri-Rolla
  • October 21, 2005

2
Acknowledgements
  • This research was conducted under the guidance
    of my advisor, Dr. David Drain, for my thesis,
    "A Sequential Approach to Supersaturated
    Design," completed in June 2005

3
Continuing Research
  • This talk explains the thought process behind
    the work and its progress
  • We have learned how to adjust our parameters and
    the algorithm to improve our results
  • Each simulation tells us something about where to
    go next

4
Purpose
  • Experimentation in science or industry can
    involve a great number of variables that need to
    be tested for effects on a response or the
    outcome of a process.
  • Researchers are looking for ways to reduce the
    number of trials needed in experiments while
    still reaching accurate conclusions.

5
Supersaturated Designs
  • SSDs have fewer trials than parameters to be
    estimated (illustrated below)
  • Assumptions
  • Effect sparsity: very few factors have an
    effect on the response
  • Linear relationship between the factors and the
    response
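
A minimal numpy sketch of what "supersaturated" means for the model matrix, using an illustrative random two-level design rather than any design from this work: with 8 runs and 18 factors the matrix has more columns than rows, so X'X is singular and the main effects cannot all be estimated at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative only: 8 runs, 18 two-level (+/-1) factors.
n_runs, n_factors = 8, 18
X = rng.choice([-1, 1], size=(n_runs, n_factors))

# More columns than rows: rank(X'X) <= 8 < 18, so X'X is singular
# and the 18 main effects cannot all be estimated simultaneously.
print(X.shape)                         # (8, 18)
print(np.linalg.matrix_rank(X.T @ X))  # at most 8
```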

6
Original Contributions
  • Create the SSD sequentially, using information
    from previous trials to choose the next trial
  • Use a genetic-algorithm hybrid, an evolutionary
    computing technique, to search a space of
    candidate designs for optima

7
Scope
  • 18-variable, 16-run design
  • Start with an initial 8 runs, then add 8 more
    runs sequentially based on previous results

8
Heuristic Optimization
  • Heuristic means related to improving
    problem-solving performance; the term is also
    used for any method or trick that improves the
    efficiency of a problem-solving system

9
History of Heuristic Methods
  • In 1961, Marvin Minsky found it worthwhile to
    introduce a heuristic method, even one that
    causes occasional failures, if it brought an
    overall improvement in performance
  • Minsky did a great deal of work toward the
    discovery and mechanization of problem-solving
    processes in general-purpose computers
  • Search, pattern recognition, learning, planning,
    and induction

10
Methods
  • Hill climbing
  • Simulated Annealing
  • Genetic Algorithm

11
Genetic Algorithm Structure
  • The algorithm searches a population of candidate
    designs to identify the best one based on a
    measure of fitness
  • Fitness measures how well a candidate meets the
    optimization goal
  • A new population is created from selected
    parents
  • Repeat until the optimum is achieved (see the
    loop sketched below)
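
A minimal sketch of this generation loop. Every name here (initial_population, fitness, select_parents, breed) is a hypothetical placeholder for the corresponding step on the slide, not the implementation from the thesis.

```python
def genetic_algorithm(n_generations, initial_population, fitness,
                      select_parents, breed):
    """Generic GA loop: evaluate, select, breed, repeat."""
    population = initial_population()
    best = max(population, key=fitness)   # best design seen so far
    for _ in range(n_generations):
        # Each candidate is scored with the same fitness measure.
        parents = select_parents(population, fitness)
        population = breed(parents)       # crossover + mutation
        champion = max(population, key=fitness)
        if fitness(champion) > fitness(best):
            best = champion
    return best
```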

12
(No Transcript)
13
Starting Population
  • It is best to start with the best designs
  • They let the search cover the space efficiently
    in the first generations

14
Fitness Evaluation
  • Every candidate design goes through an evaluation
    of fitness.
  • Fitness is defined before the experiment and is
    the same for every candidate in every generation.

15
Allowed to Breed
  • A number of elites, the candidates with the
    highest fitness values
  • A percentage of highly fit models
  • A random selection of low-fitness models, drawn
    at a rate called the roulette rate
  • These rates and counts are parameters of the
    experiment and can change across the generations
    (see the selection sketch below)
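
A sketch of this three-tier selection, assuming candidates come as a list and fitness as a function. The default values echo the parameter settings reported for the GA-SA hybrid on slide 29, but the function itself is an illustration, not the thesis code.

```python
import random

def select_parents(population, fitness, n_elites=3,
                   breed_fraction=0.3, roulette_rate=0.05):
    """Elites + a fraction of highly fit models + a random draw
    from the low-fitness remainder (the "roulette rate")."""
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:n_elites]                      # highest fitness, always kept
    n_fit = int(breed_fraction * len(ranked))
    highly_fit = ranked[n_elites:n_elites + n_fit]  # percentage of fit models
    low_fit = ranked[n_elites + n_fit:]
    n_lucky = int(roulette_rate * len(ranked))      # random low-fitness picks
    lucky = random.sample(low_fit, min(n_lucky, len(low_fit)))
    return elites + highly_fit + lucky
```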

16
Breeding
  • A crossover creates a new individual from the
    information contained within two or more parents
  • Mutation applies some kind of randomized change
    to the offspring (both operators are sketched
    below)
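
For two-level designs stored as +/-1 numpy arrays, the two operators might look like the following. The midpoint crossover matches the setting reported on slide 29; the sign-flip mutation is a plausible reading of the slides, not the exact operator from the thesis.

```python
import numpy as np

rng = np.random.default_rng()

def crossover(parent_a, parent_b):
    """Midpoint crossover: the child takes the first half of one
    parent's runs and the second half of the other's."""
    mid = parent_a.shape[0] // 2
    return np.vstack([parent_a[:mid], parent_b[mid:]])

def mutate(design, rate=0.05):
    """Flip each -1/+1 cell independently with probability `rate`."""
    flips = rng.random(design.shape) < rate
    return np.where(flips, -design, design)
```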

17
Tuning the Algorithm
  • The performance of heuristic optimization
    methods was studied by Wolpert and Macready, who
    developed the No Free Lunch (NFL) theorem (1997)
  • It states that if an algorithm does particularly
    well on average on one class of problems, then
    it must do worse on average on the remaining
    problems

18
Tuning the Algorithm
  • The NFL theorem also shows that no heuristic
    method can be applied successfully to all
    problems; rather, the method must be chosen to
    match the problem being solved
  • Not a black-box tool
  • Many choices to make and adjust to optimize a
    specific problem

19
Starting Population
  • I chose the initial population
  • Placed existing designs into the population
  • Start with good designs to breed and create
    better designs

20
Choosing the Additional Runs
  • The original choice for the new run was based on
    breaking the highest aliasing between the
    variables found to be significant and the
    variables found not to be significant
  • Alias matrix: A = (X1' X1)^(-1) X1' X2
  • This is only one possible choice for the new run
    (see the sketch below)
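
A direct numpy rendering of the alias-matrix formula, where X1 holds the columns of the factors found significant and X2 the remaining columns. Using lstsq instead of an explicit inverse is a choice made here for numerical stability; it falls back to a pseudoinverse when X1'X1 is singular (the case the next slide handles separately by balancing the design).

```python
import numpy as np

def alias_matrix(X1, X2):
    """Compute A = (X1' X1)^(-1) X1' X2. Entry (i, j) measures how
    strongly nonsignificant column j of X2 is aliased with
    significant column i of X1; the largest entries indicate the
    aliasing to break with the next run."""
    A, *_ = np.linalg.lstsq(X1, X2, rcond=None)
    return A
```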

21
Choosing the Additional Runs
  • Changed the new run to break all aliasing
    between the significant and nonsignificant
    variables that was higher than a preset value
  • If the matrix X1 is singular, then the new run
    was instead chosen to help balance the design

22
Classic Analysis Methods
  • Westfall, Young, and Lin suggested a forward
    selection error control (1998)
  • Lin and Li proposed a method based on non-convex,
    penalized least squares (2003)
  • Holcomb, Montgomery, and Carlyle used contrasts
    to determine active factors by checking for a
    significant response change when the factor
    setting was changed (2003)
  • Lu and Wu introduced a new way to search for
    active factors in SSDs based on the idea of
    staged dimensionality reduction

23
Chosen Analysis Method
  • Stepwise regression
  • Adds one variable at a time to the model, based
    on partial correlations with the response
  • Re-checks the variables already in the model
    once the new variable has been added
  • Alpha risk: 0.15
  • The model includes the constant term and up to
    three variables
  • A forward regression procedure was used in
    earlier simulations (the stepwise procedure is
    sketched below)
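
A self-contained sketch of the stepwise procedure just described, under the stated settings (alpha = 0.15, at most three variables plus the constant). It enters the candidate most correlated with the current residuals, then drops any entered term whose p-value exceeds alpha; this is an illustration, whereas the thesis implementation runs on the SWEEP operator (slide 27).

```python
import numpy as np
from scipy import stats

def _ols(X, y, cols):
    """OLS with an intercept on the chosen columns; returns the
    fitted values and two-sided p-values for the slope terms."""
    M = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(M, y, rcond=None)
    fitted = M @ beta
    dof = len(y) - M.shape[1]
    s2 = np.sum((y - fitted) ** 2) / dof
    se = np.sqrt(s2 * np.diag(np.linalg.pinv(M.T @ M)))
    p = 2 * stats.t.sf(np.abs(beta / se), dof)
    return fitted, p[1:]                 # skip the intercept's p-value

def stepwise(X, y, alpha=0.15, max_vars=3):
    selected = []
    while len(selected) < max_vars:
        fitted, _ = _ols(X, y, selected)
        resid = y - fitted
        rest = [j for j in range(X.shape[1]) if j not in selected]
        if not rest:
            break
        # Forward step: best correlation with the current residuals
        # (a stand-in for the partial correlation to the response).
        entering = max(rest, key=lambda j: abs(np.corrcoef(X[:, j], resid)[0, 1]))
        selected.append(entering)
        # Backward check of everything already in the model.
        _, p = _ols(X, y, selected)
        selected = [j for j, pj in zip(selected, p) if pj <= alpha]
        if entering not in selected:     # new term not significant: stop
            break
    return selected
```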

24
Fitness Evaluation Details
  • The fitness evaluation was based on how
    accurately the design identified the true
    significant factors
  • This was done by comparing the results of the
    stepwise regression against a challenging
    variety of test vectors
  • Each Type I error received a weight of -1
  • Each Type II error received a weight of -3

25
Fitness Evaluation
  • A test vector is chosen to create response
    values
  • Stepwise regression finds the significant
    variables
  • The alias matrix is used to create the new run
  • Repeat until eight additional runs have been
    added
  • Fitness is assessed on the alpha and beta risks
  • A new test vector is chosen to simulate data,
    and the process above is repeated several times
  • An average fitness score is given to the design
    (scoring sketched below)
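
One reading of the scoring on slides 24-25, assuming the declared-active and truly active factors are available as sets. The weights come from the slides; the averaging over test vectors is summarized in the comment, and any rescaling behind the fitness values reported later is not stated on the slides.

```python
def fitness_score(declared_active, truly_active,
                  w_type1=-1.0, w_type2=-3.0):
    """Penalty for one simulated analysis: each Type I error (inert
    factor declared active) weighs -1; each Type II error (active
    factor missed) weighs -3."""
    type1 = len(declared_active - truly_active)
    type2 = len(truly_active - declared_active)
    return w_type1 * type1 + w_type2 * type2

# Design fitness = average over many simulated test vectors, e.g.
# (analyze and active_set are hypothetical helpers):
# np.mean([fitness_score(analyze(design, v), active_set(v))
#          for v in test_vectors])
```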

26
(No Transcript)
27
Regression Using Matrix Algebra
  • The regression procedure was carried out through
    a series of matrix manipulations described by
    Jennrich (1977) and Thisted (1988).
  • These manipulations start with the Sum of
    Squares and Cross-Products (SSCP) matrix and use
    the SWEEP operator. The result is a matrix that
    contains all the information needed for the
    regression analysis (SWEEP is sketched below).
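
A sketch of the SWEEP operator on a symmetric SSCP matrix, in one common sign convention (conventions vary between sources). Sweeping the pivots belonging to the predictors leaves the inverse, the coefficients, and the residual sum of squares in a single matrix, and each sweep is reversible, which is what makes the operator convenient for adding and removing variables in stepwise regression.

```python
import numpy as np

def sweep(A, k):
    """Sweep the symmetric matrix A on pivot k. For the SSCP matrix
    A = [[X'X, X'y], [y'X, y'y]], sweeping every predictor pivot
    leaves (X'X)^(-1) in the top-left block, the regression
    coefficients in the X'y column, and the residual sum of squares
    in the lower-right corner."""
    A = np.array(A, dtype=float)   # work on a copy
    d = A[k, k]
    A[k, :] /= d
    for i in range(A.shape[0]):
        if i != k:
            b = A[i, k]
            A[i, :] -= b * A[k, :]
            A[i, k] = -b / d
    A[k, k] = 1.0 / d
    return A
```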

28
GA-SA
  • This research uses a genetic algorithm /
    simulated annealing (GA-SA) hybrid to select a
    supersaturated design
  • The GA-SA first explores the space broadly, then
    in later generations exploits the good solutions
    to approach the global optimum

29
GA-SA Hybrid
  • Predetermined set of starting candidates
  • Crossover set at the midpoint
  • Mutation rate from 0.05 to 0.0001
  • Roulette rate from 0.3 to 0.05
  • Proportion breeding: 0.3
  • Three elites
  • 20 to 100 test vectors
  • 5 generations (decay schedules sketched below)
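
The slides give only the endpoints for the mutation and roulette rates, so the decay shape across the five generations has to be assumed; a geometric interpolation in the spirit of simulated-annealing cooling might look like this.

```python
import numpy as np

def schedule(start, end, n_generations=5):
    """Geometric decay from `start` to `end` across the generations
    (an assumed, annealing-style cooling shape; the exact schedule
    is not stated on the slides)."""
    t = np.arange(n_generations) / (n_generations - 1)
    return start * (end / start) ** t

mutation_rates = schedule(0.05, 0.0001)  # broad search -> fine exploitation
roulette_rates = schedule(0.30, 0.05)    # fewer random low-fitness parents
```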

30
Tuning the Algorithm
31
Tune the Objective Function
32
Benchmark Designs
  • Comparisons were made to 16-run, 18-variable
    designs
  • Nguyen's cyclic replication of the BIBD
  • D-optimal design

33
Forward Regression Results
  • Fitness by design, effect sizes 0.75 to 1.5:
    Replicated BIBD 0.69, D-Optimal 0.56,
    Winner 0.40
  • Fitness by design, effect sizes 0.25 to 0.75:
    Replicated BIBD 0.18, D-Optimal 0.16,
    Winner 0.13

34
Adjustments to Algorithm
  • Wanted to balance the alpha and beta risks
  • Changed the weight on the beta risk to -3
  • More efficient use of each new run
  • Break several aliased variables with each new
    run
  • Add balance to the design
  • Added to the initial candidates

35
Adjustments to Algorithm
  • Stepwise regression
  • Stepwise regression significantly raised the
    fitness evaluation
  • More accurate check of the variables found
    significant by the design
  • The necessary data are already in the SWEEP
    matrix

36
Stepwise Regression Results
(Results charts for effect sizes 0.75 to 1.5 and 0.25 to 0.75)
37
Future Work
  • Improve the performance of the objective
    function (the fitness value)
  • Explore different choices for the new run
  • Try different weights for the alpha and beta
    risks
  • Continue tuning the algorithm parameters
  • Continue improving the designs

38
Questions?