SEED Center for Data Farming Overview - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
SEED Center for Data Farming Overview
  • Tom Lucas and Susan Sanchez
  • Operations Research Department
  • Naval Postgraduate School
  • Monterey, CA.

Mission: Advance the collaborative development
and use of simulation experiments and efficient
designs to provide decision makers with timely
insights on complex systems and operations.
2
Simulation studies underpin many DoD decisions
  • DoD uses complex, high-dimensional simulation
    models as an important tool in its
    decision-making process.
  • Used when it is too difficult or costly to
    experiment on real systems
  • Needed for future systems: we shouldn't wait
    until they're operational to decide on
    appropriate capabilities and operational
    tactics, or to evaluate their potential
    performance
  • Investigate the impact of randomness and other
    uncertainties

Many complex simulations involve hundreds or
thousands of factors that can be set to
different levels.
3
Design of experiments
  • An experimental design is the complete
    specification of input settings and runs
  • The choice of design constrains the information
    we can extract from the model
  • A table (matrix) of factor levels describes the
    design
  • Each column corresponds to a factor (input
    variable)
  • Each row to a design point (combination of
    factor settings)

"2^100 is forever." --General Jasper Welch
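As a concrete illustration of such a design matrix, a gridded (full factorial) design can be built directly; the factor names and levels below are hypothetical, chosen only to show the row/column structure:

```python
from itertools import product

# Hypothetical factors and their low/high levels for a 3-factor,
# 2-level full factorial design.
factors = {"stealth": (0, 1), "range": (100, 300), "speed": (10, 40)}

# Each row of the design matrix is one design point (a complete
# combination of factor settings); each column corresponds to a factor.
design = [dict(zip(factors, levels)) for levels in product(*factors.values())]

# 2^3 = 8 design points; k two-level factors require 2^k runs, which is
# why gridded designs become infeasible as the number of factors grows.
```

This also makes the "2^100 is forever" point tangible: a full grid over 100 two-level factors has 2^100 rows.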
4
A simple example
  • Without examining multiple factors
    simultaneously, we
  • limit the insights possible (can't look for
    interactions - places where interesting things
    happen for specific combinations of factors), so
    we only tell part of the story
  • have less chance for surprises

Ex: which is more important, stealth or
range?
Ex: suppose your factors include
fuel, air, and spark. You'll NEVER find
fire by examining only two at a time.
Excursions from a base case wouldn't show anything.
5
The traditional view
Philosophy: "The three primary objectives of
computer experiments are (i) predicting the
response at untried inputs, (ii) optimizing a
function of the input factors, or (iii)
calibrating the computer code to physical
data." --Sacks, Welch, Mitchell, and Wynn
(1989)
For many (military) applications, these can be
problematic!
  • Approach
  • Limit yourself to just a few factors or scenario
    alternatives
  • Fix all other factors in the simulation to
    specified values
  • At each design point, run the experiment a small
    number of times (once for deterministic
    simulations)

"The purpose of computing is insight, not
numbers." --Hamming
6
The new view
  • We contend that appropriate goals are
  • (i) Developing a basic understanding of a
    particular model or system
  • seeking insights into high-dimensional space.
  • identifying significant factors and interactions.
  • finding regions, ranges, and thresholds where
    interesting things happen.
  • (ii) Finding robust decisions, tactics, or
    strategies
  • (iii) Comparing the merits of various decisions
    or policies (Kleijnen, Sanchez, Lucas, and
    Cioppa 2005)

"Models are for thinking." --Sir Maurice Kendall
Once you have invested the effort to build (and
perhaps verify, validate, and accredit) a
simulation model, it's time to let the model work
for you!
7
These goals mean fewer assumptions...
  • Traditional DOE assumptions
  • Small/moderate # of factors
  • Univariate response
  • Homogeneous error
  • Linear
  • Sparse effects
  • Higher-order interactions negligible
  • Normal errors
  • Black-box model
  • Assumptions for defense and
    homeland-security simulations
  • Large # of factors
  • Many output measures of interest
  • Heterogeneous error
  • Non-linear
  • Many significant effects
  • Significant higher-order interactions
  • Varied error structure
  • Substantial expertise exists

"The idea behind Monte Carlo simulation...is to
replace theory by experiment whenever the
former falters." --Hammersley and Handscomb
"We use simulations to avoid making Type III
errors--working on the wrong model." --W. David Kelton
8
...that, in turn, call for different designs
We have focused on Latin hypercubes
and sequential approaches
Efficient resolution V (R5) fractional factorials (FF) and central composite designs (CCD)
Factorial (gridded) designs are most familiar
9
Choosing an experimental design
  • Plant the seeds for successful data farming
  • Explore landscapes by running experiments while
    varying factors
  • Sequential process, human in the loop to
    interpret results, plan new experiments
  • Where you should plant depends on what you want
    to harvest!
  • For developing an understanding, you may wish to
    identify
  • important factors: e.g., fractional factorial or
    sequential screening
  • what factors matter? (when interactions are NOT
    sizeable)
  • interactions: e.g., higher-resolution fractional
    factorial
  • are stealth and range synergistic?
  • quadratic effects: e.g., central composite
  • does increasing USVs have diminishing returns?
  • thresholds, change points, and robust regions:
    e.g., Latin hypercubes
  • What does the landscape look like? What decision
    factors, interactions, and higher-order terms
    matter?

"We seek designs that allow one to fit a
variety of models and provide information about
all portions of the experimental
region." --Santner, Williams, and Notz (2003)
10
An all-purpose experimental design
  • We have found Latin hypercubes a very good
    all-purpose design, particularly when factors are
    quantitative and there is considerable a priori
    uncertainty about the response, because of
  • Efficiency
  • Space-filling (if we look at any group of
    factors, we'll find a variety of combinations of
    levels)
  • Design flexibility
  • few restrictions on factors, levels, sampling
    budget
  • Analysis flexibility
  • good at screening
  • many cameras on the landscape
  • accidental VV&A
  • allow you to fit many different types of complex
    metamodels to multiple MOEs
  • Orthogonal, nearly-orthogonal, and space-filling
    Latin hypercubes have advantages in fitting
    landscapes to the data

11
So, what is a Latin hypercube?
A 6-run, 2 factor design
  • In its basic form, each column in an n-run,
    k-factor LH is a permutation of the integers
    1, 2, ..., n
  • The n integers correspond to levels across the
    range of the factor
  • For exploratory purposes, we use a uniform spread
    over the range (but may round to integer values)
  • slightly different designs arise if you force
    sampling at the low and high values

[Figure: pairwise projection of the 6-run, 2-factor design over the
range low 0 to high 60; Factor 1 at levels 0, 12, 24, 36, 48, 60
(endpoints forced), Factor 2 at levels 5, 15, 25, 35, 45, 55 (cell
midpoints)]
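The basic construction above can be sketched in a few lines of NumPy. This variant forces sampling at the low and high endpoints; the 6-run, two-factor call is only illustrative:

```python
import numpy as np

def latin_hypercube(n_runs, ranges, rng=None):
    """Basic Latin hypercube: each column is a random permutation of
    n integer levels, mapped to an even spread over the factor's range
    (this variant forces sampling at the low and high endpoints)."""
    rng = np.random.default_rng(rng)
    low = np.array([r[0] for r in ranges], dtype=float)
    high = np.array([r[1] for r in ranges], dtype=float)
    # One independent permutation of the levels 0..n-1 per factor (column).
    perms = np.column_stack([rng.permutation(n_runs) for _ in ranges])
    # Map level i to the i-th of n evenly spaced points in [low, high].
    return low + perms * (high - low) / (n_runs - 1)

# A 6-run, 2-factor LH over [0, 60] x [0, 60], as in the example above.
design = latin_hypercube(6, [(0, 60), (0, 60)], rng=1)
# Every column hits each of the 6 evenly spaced levels exactly once.
```

Because each column is a separate permutation, any single factor is sampled evenly across its whole range regardless of how the other factors are set.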
12
Nearly orthogonal and space-filling Latin
hypercubes
  • The pairwise projections for a 17-run, 7-factor
    orthogonal LH show
  • orthogonality (no pairwise correlations)
  • space-filling behavior (points fill the
    sub-plots)
  • 17 total runs!
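Orthogonality is easy to check numerically for any candidate design; a small sketch (the example design is an ordinary 2^2 full factorial, not the 17-run LH itself):

```python
import numpy as np

def max_pairwise_correlation(design):
    """Largest absolute off-diagonal correlation between design columns;
    0 for a perfectly orthogonal design."""
    corr = np.corrcoef(design, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return np.max(np.abs(off_diagonal))

# A 2^2 full factorial in coded (-1, +1) units: its columns are
# uncorrelated, so the maximum pairwise correlation is zero.
factorial = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
```

Nearly orthogonal LHs accept a small nonzero value of this statistic in exchange for better space-filling.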

13
Other possibilities
  • Very large resolution V fractional factorials and
    central composite designs
  • Standard DOE literature: up to 2^(11-3)
  • New: an easy way to catalogue and generate up to
    2^(443-423)
  • Two-phase adaptive sequential procedure for
    factor screening
  • New procedure that requires fewer assumptions,
    improves efficiency
  • Frequency domain experiments
  • Naturally samples factors at coarser/finer levels
  • Crossed/combined designs to identify robust
    decision factor settings
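The frequency-domain idea can be sketched in NumPy: drive each factor sinusoidally at its own frequency across the runs, then look for power at those frequencies in the response spectrum. The two-factor model and frequencies below are hypothetical:

```python
import numpy as np

# Each factor oscillates at its own driving frequency over the n runs;
# influential factors reveal themselves as power at their frequency
# (and interactions at sums/differences) in the response spectrum.
n = 256
t = np.arange(n)
freqs = {"x1": 5, "x2": 11}               # hypothetical driving frequencies
x1 = np.sin(2 * np.pi * freqs["x1"] * t / n)
x2 = np.sin(2 * np.pi * freqs["x2"] * t / n)

# Toy response model: only x1 actually influences the output.
response = 3.0 * x1 + 0.0 * x2

# Spectrum of the response across the run sequence.
power = np.abs(np.fft.rfft(response))
# The spectrum peaks at bin 5 (x1's frequency); bin 11 stays near zero.
```

This is why the approach "naturally samples factors at coarser/finer levels": a high-frequency factor sweeps its range many times while a low-frequency factor changes slowly.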

14
Our portfolio of designs
  • Kleijnen, J. P. C., S. M. Sanchez, T. W. Lucas,
    and T. M. Cioppa, "A User's Guide to the Brave
    New World of Designing Simulation Experiments,"
    INFORMS Journal on Computing, Vol. 17, No. 3,
    2005, pp. 263-289.
  • Cioppa, T. M. and T. W. Lucas, "Efficient Nearly
    Orthogonal and Space-filling Latin Hypercubes,"
    Technometrics, forthcoming.
  • Sanchez, S. M. and P. J. Sanchez, "Very Large
    Fractional Factorials and Central Composite
    Designs," ACM Transactions on Modeling and
    Computer Simulation, Vol. 15, No. 4, 2005, pp.
    362-377.
  • Sanchez, S. M., H. Wan, and T. W. Lucas, "A
    Two-phase Screening Procedure for Simulation
    Experiments," Proc. 2005 Winter Simulation
    Conference, eds. M. E. Kuhl, N. M. Steiger, F. B.
    Armstrong, and J. A. Joines, Institute of
    Electrical and Electronic Engineers, Piscataway,
    New Jersey, 2005, pp. 223-230.
  • Sanchez, S. M., F. Moeeni, and P. J. Sanchez, "So
    Many Factors, So Little Time: Simulation
    Experiments in the Frequency Domain,"
    International Journal of Production Economics,
    Vol. 103, 2006, pp. 149-165.

15
Other publications
  • Sanchez, Lucas, Agent-based Simulations Simple
    Models, Complex Analyses, Invited paper, Proc.
    2002 Winter Simulation Conference, 116-126.
  • Lucas, Sanchez, Brown, Vinyard, Better Designs
    for High-Dimensional Explorations of
    Distillations,  Maneuver Warfare Science 2002,
    Marine Corps Combat Development Command, 2002,
    17-46.
  • Vinyard, Lucas, Exploring Combat Models for
    Non-monotonicities and Remedies, PHALANX, 35,
    No. 1, March 2002, 19, 36-38.
  • Lucas, McGunnigle, When is Model Complexity Too
    Much? Illustrating the Benefits of Simple Models
    with Hughes Salvo Equations, Naval Research
    Logistics, Vol. 50, April 2003, 197-217.
  • Lucas, Sanchez, Cioppa, Ipekci, Generating
    Hypotheses on Fighting the Global War on
    Terrorism,  Maneuver Warfare Science 2003,
    Marine Corps Combat Development Command, 2003,
    117-137.
  • Lucas, Sanchez, Smart Experimental Designs
    Provide Military Decision-Makers With New
    Insights From Agent-Based Simulations, Naval
    Postgraduate School RESEARCH, 13, 2, 20-21,
    57-59, 63.
  • Lucas, Sanchez, NPS Hosts the Marine Corps
    Warfighting Laboratory's Sixth Project Albert
    International Workshop, Naval Postgraduate
    School RESEARCH, 13, 2, 45-46.
  • Sanchez, Wu, Frequency-Based Designs for
    Terminating Simulation Experiments: A
    Peace-enforcement Example, Proc. 2003 Winter
    Simulation Conference, 952-959.
  • Brown, Cioppa, Objective Force Urban Operations
    Agent Based Simulation Experiment, Technical
    Report TRAC-M-TR-03-021, Monterey, CA, June 2003.
  • Cioppa, Brown, Jackson, Muller, Allison,
    Military Operations in Urban Terrain Excursions
    and Analysis With Agent-Based Models, Maneuver
    Warfare Science 2003, Quantico, VA, 2003.
  • Cioppa, Advanced Experimental Designs for
    Military Simulations, Technical Report
    TRAC-M-TR-03-011, Monterey, CA, February 2003.
  • Brown, Cioppa, Lucas, Agent-based Simulation
    Supporting Military Analysis, PHALANX, Vol. 37,
    No. 3, Sept 2004.
  • Cioppa, Lucas, Sanchez, Military Applications of
    Agent-Based Simulations, Proceedings of the 2004
    Winter Simulation Conference, 171-179.
  • Allen, Buss, Sanchez, Assessing Obstacle
    Location Accuracy in the REMUS Unmanned
    Underwater Vehicle, Proceedings of the 2004
    Winter Simulation Conference, 940-948.
  • Cioppa, An Efficient Screening Methodology For a
    Priori Assessed Non-Influential Factors, Proc.
    2004 Winter Simulation Conference, 171-180.
  • Sanchez, Work Smarter, Not Harder Guidelines
    for Designing Simulation Experiments. Proc. of
    the 2005 Winter Simulation Conference,
    forthcoming.
  • Wolf, Sanchez, Goerger, Brown, Using Agents to
    Model Logistics, under revision for Military
    Operations Research.
  • Baird, Paulo, Sanchez, Crowder, Measuring
    Information Gain in the Objective Force, under
    revision for Military Operations Research.

16
Student theses: note the breadth of applications
  • 2000 Brown (Captain, USMC)
  • Human Dimension of Combat
  • 2001 Vinyard (Major, USMC)
  • Reducing Non-monotonicities in Combat Models,
  • MORS/Tisdale Winner, MORS Walker Award
  • 2002 Erlenbruch (Captain, German Army)
  • German Peacekeeping Operations, MORS/Tisdale
    Finalist
  • 2002 Pee (Singapore DSTA)
  • Information Superiority and Battle Outcomes,
    MORS/Tisdale Finalist
  • 2002 Wan (Major, Singapore Army)
  • Effects of Human Factors on Combat Outcomes
  • Dickie (Major, Australian Army)
  • Swarming Unmanned Vehicles, MORS/Tisdale
    Finalist
  • 2002 Ipekci (1st Lieutenant, Turkish Army)
  • Guerrilla Warfare, MORS/Tisdale Winner
  • 2002 Wu (Lieutenant, USN)
  • Spectral Analysis and Sonification of Simulation
    Data
  • 2002 Cioppa (Lieutenant Colonel, US Army, PhD)
  • Experimental Designs for High-dimensional
    Complex Models, ASA 3rd Annual Prize for Best
    Student Paper Applying Statistics to Defense
  • 2004 Berner (LCDR, US Navy)
  • Multiple UAVs in Maritime Search and Control
  • 2004 Tan (Singapore ST)
  • Checkpoint Security
  • 2005 Babilot (USMC)
  • DO versus Traditional Force in Urban Terrain
  • 2005 Bain (USMC)
  • Logistics Support for Distributed Ops,
    MORS/Tisdale Finalist
  • 2005 Gun (Turkish Army)
  • Sunni Participation in Iraqi Elections
  • 2005 McMindes (USMC)
  • UAV Survivability
  • 2005 Sanders (USMC)
  • Marine Expeditionary Rifle Squad
  • 2005 Ang (Singapore Technologies Engineering)
  • Increasing Participation and Decreasing
    Escalation in Elections
  • 2005 Chang (Singapore DSTA)
  • Edge vs. Hierarchical Organizations for
    Collaborative Tasks
  • 2005 Liang (Singapore DSTA)

17
An environment for exploration requires
  • Flexible models or tools to build them
  • High-performance computing
  • Experimental design (already talked about)
  • Data analysis and visualization

18
Agent-based distillations from Project Albert
  • Distillations: fast, robust, easy-to-use,
    transparent agent-based simulations that focus on
    specific aspects of operational scenarios (they
    only strive to capture the essence)
  • Project Albert collaborators have developed six
    (agent-based) distillation models which are
    currently implemented in a data farming
    environment
  • ISAAC/EINSTein, Socrates, NetLogo,
    Pythagoras, MANA, PAX

19
The Big Iron
  • Hardware
  • Maui High Performance Computing Center (MHPCC)
    An Air Force Research Laboratory Center Managed
    by the University of Hawai'i
  • Enabling Software
  • Web-based Maui High Performance Computing Center
  • Local resources: OldMcData

20
Interpreting the results
  • Standard statistical graphics tools (regression
    trees, 3-D scatter plots, contour plots, plots of
    average results for a single factor, interaction
    profiles) can be used to gain insights from the
    data
  • Stepwise regression and regression trees
    identify important factors, interactions, and
    thresholds
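The threshold-finding step of a regression tree can be sketched in pure NumPy: a single variance-reduction split over one factor. The data below are synthetic, loosely echoing a flight-hours threshold:

```python
import numpy as np

def best_split(x, y):
    """One step of a regression tree: find the threshold on factor x that
    most reduces the squared error of response y (variance-reduction split)."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_threshold, best_sse = None, np.inf
    for i in range(1, len(xs)):
        left, right = ys[:i], ys[i:]
        # Sum of squared errors around each side's mean after splitting here.
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_threshold, best_sse = (xs[i - 1] + xs[i]) / 2, sse
    return best_threshold

# Synthetic MOE that jumps once flight time exceeds about 7 hours;
# best_split recovers a threshold between 7.0 and 7.5.
hours = np.arange(0.0, 14.0, 0.5)
moe = np.where(hours > 7.0, 0.9, 0.4)
```

Real tree packages apply this split recursively and over all factors at once, which is how they surface the important factors and thresholds.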

21
Decision tree: time and routing (Raffetto, 2004)
MOE: proportion of enemy classified
Most important factor: needs to fly over 7 hours
Think like the enemy! Rt. 2 planned with intel
In either case, throw more forces/capabilities at
it next if available
22
Example: Regression analysis (Raffetto, 2004)
  • Across the noise factors, the regression models
    produce R-squared values from 0.906 to 0.921 with
    seven to nine terms for 1-3 UAVs
  • Provides a means to compare expected effects of
    different configurations
  • Parameter estimates are put into a simple Excel
    spreadsheet GUI to allow decision makers to view
    relative effects of configurations within this
    scenario

23
Example: Interactions (Steele, 2004): camera
range and speed
  • At low speeds, camera range is unimportant
  • At higher speeds, camera range has big impact
  • One of several technological challenges for
    systems design
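A simple way to quantify such an interaction is to fit a metamodel with a cross-product term; the data and coefficients below are synthetic, not Steele's results:

```python
import numpy as np

# Synthetic runs: the effect of camera range grows with speed
# (an interaction), as in the camera range/speed example.
rng = np.random.default_rng(2)
speed = rng.uniform(5, 40, 200)
cam_range = rng.uniform(1, 10, 200)
y = 2.0 + 0.05 * speed + 0.3 * speed * cam_range + rng.standard_normal(200)

# Metamodel with a cross-product (interaction) term, fit by least squares.
X = np.column_stack([np.ones_like(speed), speed, cam_range, speed * cam_range])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
# A sizeable coefficient on the speed*range column flags the interaction:
# camera range matters little at low speeds but a lot at high speeds.
```

Space-filling designs like Latin hypercubes supply the varied factor combinations this fit needs; one-factor-at-a-time excursions would leave the cross-product column nearly confounded with the main effects.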

24
Example: One-way analysis (Hakola, 2004)
25
Example: MART (Ipekci, 2002)
[Figure: relative variable importance, shown separately for Blue
casualties and for Red casualties]
26
Example: Contour plot (Allen, 2004)
27
Resources: SEED Center for Data Farming
  • http://harvest.nps.navy.mil
  • Check here for
  • lists of student theses (available online)
  • spreadsheets and software
  • pdf files for several of our publications,
    publication info for the rest
  • links to other resources
  • updates

"All models are wrong, but some are useful." --George
Box
28
Questions?
SEED Center for Data Farming Mission: Advance the
collaborative development and use of simulation
experiments and efficient designs to provide
decision makers with timely insights on complex
systems and operations.
Primary Sponsors / International Collaborators
Applications include peacekeeping operations,
convoy protection, networked future forces,
unmanned vehicles, anti-terror emergency
response, urban operations, humanitarian relief,
and more.
Products include new downloadable experimental
designs, plus over 40 student theses and a dozen
articles.
http://diana.gl.nps.navy.mil/SeedLab/