1
Sequential Supersaturated Designs for Effective Screening
  • Angela Jugan
  • University of Missouri-Rolla
  • October 21, 2005

2
Acknowledgements
  • This research was conducted under the guidance
    of my advisor, Dr. David Drain, for my thesis,
    "A Sequential Approach to Supersaturated
    Design," completed in June 2005

3
Continuing Research
  • This talk explains the thought process behind
    the work and its progress
  • We have learned how to adjust our parameters and
    the algorithm to improve our results
  • Each simulation tells us something about where to
    go next

4
Purpose
  • Experimentation in science or industry can
    involve a great number of variables that need to
    be tested for effects on a response or the
    outcome of a process.
  • Researchers are looking for ways to reduce the
    number of trials needed in experiments while
    still reaching accurate conclusions.

5
Supersaturated Designs
  • SSDs have fewer trials than parameters to be
    estimated (illustrated below)
  • Assumptions
  • Effect sparsity: very few factors have an
    effect on the response
  • Linear relationship between the factors and the
    response
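
A minimal numpy sketch of what "supersaturated" means for the model matrix, using an illustrative random two-level design rather than any design from this work: with 8 runs and 18 factors the matrix has more columns than rows, so X'X is singular and the main effects cannot all be estimated at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative only: 8 runs, 18 two-level (+/-1) factors.
n_runs, n_factors = 8, 18
X = rng.choice([-1, 1], size=(n_runs, n_factors))

# More columns than rows: rank(X'X) <= 8 < 18, so X'X is singular
# and the 18 main effects cannot all be estimated simultaneously.
print(X.shape)                         # (8, 18)
print(np.linalg.matrix_rank(X.T @ X))  # at most 8
```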

6
Original Contributions
  • Create the SSD sequentially, using information
    from previous trials to choose the next trial
  • Use a genetic-algorithm hybrid, an evolutionary
    computing technique, to search a space of
    candidate designs for optima

7
Scope
  • 18-variable, 16-run design
  • Start with an initial 8 runs, then add 8 more
    runs sequentially based on previous results

8
Heuristic Optimization
  • Heuristic means related to improving
    problem-solving performance; the term is also
    used for any method or trick that improves the
    efficiency of a problem-solving system

9
History of Heuristic Methods
  • In 1961, Marvin Minsky found it worthwhile to
    introduce a heuristic method, even one that
    causes occasional failures, if it brought an
    overall improvement in performance
  • Minsky did a great deal of work toward the
    discovery and mechanization of problem-solving
    processes in general-purpose computers
  • Search, pattern recognition, learning, planning,
    and induction

10
Methods
  • Hill climbing
  • Simulated Annealing
  • Genetic Algorithm

11
Genetic Algorithm Structure
  • The algorithm searches a population of candidate
    designs to identify the best one based on a
    measure of fitness
  • Fitness measures how well a candidate meets the
    optimization goal
  • A new population is created from selected
    parents
  • Repeat until the optimum is achieved (see the
    loop sketched below)
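
A minimal sketch of this generation loop. Every name here (initial_population, fitness, select_parents, breed) is a hypothetical placeholder for the corresponding step on the slide, not the implementation from the thesis.

```python
def genetic_algorithm(n_generations, initial_population, fitness,
                      select_parents, breed):
    """Generic GA loop: evaluate, select, breed, repeat."""
    population = initial_population()
    best = max(population, key=fitness)   # best design seen so far
    for _ in range(n_generations):
        # Each candidate is scored with the same fitness measure.
        parents = select_parents(population, fitness)
        population = breed(parents)       # crossover + mutation
        champion = max(population, key=fitness)
        if fitness(champion) > fitness(best):
            best = champion
    return best
```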

12
(No Transcript)
13
Starting Population
  • It is best to start with the best designs
  • They let the search cover the space efficiently
    in the first generations

14
Fitness Evaluation
  • Every candidate design goes through an evaluation
    of fitness.
  • Fitness is defined before the experiment and is
    the same for every candidate in every generation.

15
Allowed to Breed
  • A number of elites, the candidates with the
    highest fitness values
  • A percentage of highly fit models
  • A random selection of low-fitness models, drawn
    at a rate called the roulette rate
  • These rates and counts are parameters of the
    experiment and can change across the generations
    (see the selection sketch below)
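
A sketch of this three-tier selection, assuming candidates come as a list and fitness as a function. The default values echo the parameter settings reported for the GA-SA hybrid on slide 29, but the function itself is an illustration, not the thesis code.

```python
import random

def select_parents(population, fitness, n_elites=3,
                   breed_fraction=0.3, roulette_rate=0.05):
    """Elites + a fraction of highly fit models + a random draw
    from the low-fitness remainder (the "roulette rate")."""
    ranked = sorted(population, key=fitness, reverse=True)
    elites = ranked[:n_elites]                      # highest fitness, always kept
    n_fit = int(breed_fraction * len(ranked))
    highly_fit = ranked[n_elites:n_elites + n_fit]  # percentage of fit models
    low_fit = ranked[n_elites + n_fit:]
    n_lucky = int(roulette_rate * len(ranked))      # random low-fitness picks
    lucky = random.sample(low_fit, min(n_lucky, len(low_fit)))
    return elites + highly_fit + lucky
```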

16
Breeding
  • A crossover creates a new individual from the
    information contained within two or more parents
  • Mutation applies some kind of randomized change
    to the offspring (both operators are sketched
    below)
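
For two-level designs stored as +/-1 numpy arrays, the two operators might look like the following. The midpoint crossover matches the setting reported on slide 29; the sign-flip mutation is a plausible reading of the slides, not the exact operator from the thesis.

```python
import numpy as np

rng = np.random.default_rng()

def crossover(parent_a, parent_b):
    """Midpoint crossover: the child takes the first half of one
    parent's runs and the second half of the other's."""
    mid = parent_a.shape[0] // 2
    return np.vstack([parent_a[:mid], parent_b[mid:]])

def mutate(design, rate=0.05):
    """Flip each -1/+1 cell independently with probability `rate`."""
    flips = rng.random(design.shape) < rate
    return np.where(flips, -design, design)
```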

17
Tuning the Algorithm
  • The performance of heuristic optimization
    methods was studied by Wolpert and Macready, who
    developed the No Free Lunch (NFL) theorem (1997)
  • It states that if an algorithm does particularly
    well on average on one class of problems, then
    it must do worse on average on the remaining
    problems

18
Tuning the Algorithm
  • The NFL theorem also shows that no heuristic
    method can be applied successfully to all
    problems; rather, the method must be chosen to
    match the problem being solved
  • Not a black-box tool
  • Many choices to make and adjust to optimize a
    specific problem

19
Starting Population
  • I chose the initial population
  • Placed existing designs into the population
  • Start with good designs to breed and create
    better designs

20
Choosing the Additional Runs
  • The original choice for the new run was based on
    breaking the highest aliasing between the
    variables found to be significant and the
    variables found not to be significant
  • Alias matrix: A = (X1' X1)^(-1) X1' X2
  • This is only one possible choice for the new run
    (see the sketch below)
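
A direct numpy rendering of the alias-matrix formula, where X1 holds the columns of the factors found significant and X2 the remaining columns. Using lstsq instead of an explicit inverse is a choice made here for numerical stability; it falls back to a pseudoinverse when X1'X1 is singular (the case the next slide handles separately by balancing the design).

```python
import numpy as np

def alias_matrix(X1, X2):
    """Compute A = (X1' X1)^(-1) X1' X2. Entry (i, j) measures how
    strongly nonsignificant column j of X2 is aliased with
    significant column i of X1; the largest entries indicate the
    aliasing to break with the next run."""
    A, *_ = np.linalg.lstsq(X1, X2, rcond=None)
    return A
```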

21
Choosing the Additional Runs
  • Changed the new run to break all aliasing
    between the significant and nonsignificant
    variables that was higher than a preset value
  • If the matrix X1 is singular, then the new run
    was instead chosen to help balance the design

22
Classic Analysis Methods
  • Westfall, Young, and Lin suggested a forward
    selection error control (1998)
  • Lin and Li proposed a method based on non-convex,
    penalized least squares (2003)
  • Holcomb, Montgomery, and Carlyle used contrasts
    to determine active factors by checking for a
    significant response change when the factor
    setting was changed (2003)
  • Lu and Wu introduced a new way to search for
    active factors in SSDs based on the idea of
    staged dimensionality reduction

23
Chosen Analysis Method
  • Stepwise regression
  • Adds one variable at a time to the model, based
    on partial correlations with the response
  • Re-checks the variables already in the model
    once the new variable has been added
  • Alpha risk: 0.15
  • The model includes the constant term and up to
    three variables
  • A forward regression procedure was used in
    earlier simulations (the stepwise procedure is
    sketched below)
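
A self-contained sketch of the stepwise procedure just described, under the stated settings (alpha = 0.15, at most three variables plus the constant). It enters the candidate most correlated with the current residuals, then drops any entered term whose p-value exceeds alpha; this is an illustration, whereas the thesis implementation runs on the SWEEP operator (slide 27).

```python
import numpy as np
from scipy import stats

def _ols(X, y, cols):
    """OLS with an intercept on the chosen columns; returns the
    fitted values and two-sided p-values for the slope terms."""
    M = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(M, y, rcond=None)
    fitted = M @ beta
    dof = len(y) - M.shape[1]
    s2 = np.sum((y - fitted) ** 2) / dof
    se = np.sqrt(s2 * np.diag(np.linalg.pinv(M.T @ M)))
    p = 2 * stats.t.sf(np.abs(beta / se), dof)
    return fitted, p[1:]                 # skip the intercept's p-value

def stepwise(X, y, alpha=0.15, max_vars=3):
    selected = []
    while len(selected) < max_vars:
        fitted, _ = _ols(X, y, selected)
        resid = y - fitted
        rest = [j for j in range(X.shape[1]) if j not in selected]
        if not rest:
            break
        # Forward step: best correlation with the current residuals
        # (a stand-in for the partial correlation to the response).
        entering = max(rest, key=lambda j: abs(np.corrcoef(X[:, j], resid)[0, 1]))
        selected.append(entering)
        # Backward check of everything already in the model.
        _, p = _ols(X, y, selected)
        selected = [j for j, pj in zip(selected, p) if pj <= alpha]
        if entering not in selected:     # new term not significant: stop
            break
    return selected
```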

24
Fitness Evaluation Details
  • The fitness evaluation was based on how
    accurately the design identified the true
    significant factors
  • This was done by comparing the results of the
    stepwise regression against a challenging
    variety of test vectors
  • Each Type I error received a weight of -1
  • Each Type II error received a weight of -3

25
Fitness Evaluation
  • A test vector is chosen to create response
    values
  • Stepwise regression finds the significant
    variables
  • The alias matrix is used to create the new run
  • Repeat until eight additional runs have been
    added
  • Fitness is assessed on the alpha and beta risks
  • A new test vector is chosen to simulate data,
    and the process above is repeated several times
  • An average fitness score is given to the design
    (scoring sketched below)
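
One reading of the scoring on slides 24-25, assuming the declared-active and truly active factors are available as sets. The weights come from the slides; the averaging over test vectors is summarized in the comment, and any rescaling behind the fitness values reported later is not stated on the slides.

```python
def fitness_score(declared_active, truly_active,
                  w_type1=-1.0, w_type2=-3.0):
    """Penalty for one simulated analysis: each Type I error (inert
    factor declared active) weighs -1; each Type II error (active
    factor missed) weighs -3."""
    type1 = len(declared_active - truly_active)
    type2 = len(truly_active - declared_active)
    return w_type1 * type1 + w_type2 * type2

# Design fitness = average over many simulated test vectors, e.g.
# (analyze and active_set are hypothetical helpers):
# np.mean([fitness_score(analyze(design, v), active_set(v))
#          for v in test_vectors])
```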

26
(No Transcript)
27
Regression Using Matrix Algebra
  • The regression procedure was carried out through
    a series of matrix manipulations described by
    Jennrich (1977) and Thisted (1988).
  • These manipulations start with the Sum of
    Squares and Cross-Products (SSCP) matrix and use
    the SWEEP operator. The result is a matrix that
    contains all the information needed for the
    regression analysis (SWEEP is sketched below).
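
A sketch of the SWEEP operator on a symmetric SSCP matrix, in one common sign convention (conventions vary between sources). Sweeping the pivots belonging to the predictors leaves the inverse, the coefficients, and the residual sum of squares in a single matrix, and each sweep is reversible, which is what makes the operator convenient for adding and removing variables in stepwise regression.

```python
import numpy as np

def sweep(A, k):
    """Sweep the symmetric matrix A on pivot k. For the SSCP matrix
    A = [[X'X, X'y], [y'X, y'y]], sweeping every predictor pivot
    leaves (X'X)^(-1) in the top-left block, the regression
    coefficients in the X'y column, and the residual sum of squares
    in the lower-right corner."""
    A = np.array(A, dtype=float)   # work on a copy
    d = A[k, k]
    A[k, :] /= d
    for i in range(A.shape[0]):
        if i != k:
            b = A[i, k]
            A[i, :] -= b * A[k, :]
            A[i, k] = -b / d
    A[k, k] = 1.0 / d
    return A
```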

28
GA-SA
  • This research uses a genetic algorithm /
    simulated annealing (GA-SA) hybrid to select a
    supersaturated design
  • The GA-SA first explores the space broadly, then
    in later generations exploits the good solutions
    to approach the global optimum

29
GA-SA Hybrid
  • Predetermined set of starting candidates
  • Crossover set at the midpoint
  • Mutation rate from 0.05 to 0.0001
  • Roulette rate from 0.3 to 0.05
  • Proportion breeding: 0.3
  • Three elites
  • 20 to 100 test vectors
  • 5 generations (decay schedules sketched below)
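
The slides give only the endpoints for the mutation and roulette rates, so the decay shape across the five generations has to be assumed; a geometric interpolation in the spirit of simulated-annealing cooling might look like this.

```python
import numpy as np

def schedule(start, end, n_generations=5):
    """Geometric decay from `start` to `end` across the generations
    (an assumed, annealing-style cooling shape; the exact schedule
    is not stated on the slides)."""
    t = np.arange(n_generations) / (n_generations - 1)
    return start * (end / start) ** t

mutation_rates = schedule(0.05, 0.0001)  # broad search -> fine exploitation
roulette_rates = schedule(0.30, 0.05)    # fewer random low-fitness parents
```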

30
Tuning the Algorithm
31
Tune the Objective Function
32
Benchmark Designs
  • Comparisons were made to 16-run, 18-variable
    designs
  • Nguyen's cyclic replication of the BIBD
  • D-optimal design

33
Forward Regression Results
  • Fitness by design, effect sizes 0.75 to 1.5:
    Replicated BIBD 0.69, D-Optimal 0.56,
    Winner 0.40
  • Fitness by design, effect sizes 0.25 to 0.75:
    Replicated BIBD 0.18, D-Optimal 0.16,
    Winner 0.13

34
Adjustments to Algorithm
  • Wanted to balance the alpha and beta risks
  • Changed the weight on the beta risk to -3
  • More efficient use of each new run
  • Break several aliased variables with each new
    run
  • Add balance to the design
  • Added to the initial candidates

35
Adjustments to Algorithm
  • Stepwise regression
  • Stepwise regression significantly raised the
    fitness evaluation
  • More accurate check of the variables found
    significant by the design
  • The necessary data are already in the SWEEP
    matrix

36
Stepwise Regression Results
(Results charts for effect sizes 0.75 to 1.5 and 0.25 to 0.75)
37
Future Work
  • Improve the performance of the objective
    function (the fitness value)
  • Explore different choices for the new run
  • Try different weights for the alpha and beta
    risks
  • Continue tuning the algorithm parameters
  • Continue improving the designs

38
Questions?