Title: Validating Models of Complex Phenomena
1. Validating Models of Complex Phenomena
- Some Ideas About Mathematical Decision Aids for Complex Human-Social-Cultural-Behavior Simulation Models
- Dec 5, 2008
- Dr. Michael Smeltzer, smeltzer_at_ebrinc.com, 703.287.0376
Based on Phase I DARPA SBIR Contract: Validating Large Scale Simulation of Socio-Political Phenomena
2. Agenda
- HSCB models
- Valuable and sometimes very complex
- Not always properly validated
- Validation techniques
- Leverage user/developer/SME expertise and expectations
- Address applicability to specific problems
- Traditional sensitivity analysis approach to validation
- Visual one-by-one factor analysis
- Simulation of Irish insurgency against the British in 1916
- A tool leveraging Bayesian inference for active factor screening
- A real simulation, and a real examination of alternatives
- Technical details
- Challenges
3. The Opportunities and Challenges Associated with HSCB Simulations
- Potential for significant contribution
- Identification of the effects of policy on alternative economic, social, and political futures
- Large and complex
- Hundreds of variables, non-linear relationships, multiple feedback loops, delays, environmental sensitizers or dampeners, potential for emergent behavior
- The complexity often results in incomplete validation
- Uncertain applicability in specific circumstances
- Not validated over the full range of possible configurations and uses
- Lack of credibility may result in user avoidance
- Not trusted, not understood, not used
They Lack Transparency and Credibility
HSCB: Human, Social, Cultural Behavior
4. Definition of User
- This collective term, as used in this presentation, refers to three different user classes
- Subjects: people and possibly processes that run simulations and interpret the results
- R&D Managers: individuals responsible for managing the development of a simulation model and ensuring that the product is meaningful and useful
- Developers: model builders responsible for the construction of a simulation model under the guidance of R&D managers and/or Subject Matter Experts
- In many cases, the presentation does not identify a specific user class, and in fact it often refers to all three.
- To clearly articulate the issues and understand the benefits of the approach, it is important to reflect on all three classes.
5. Simulation Modelers
- Sometimes they are M&S experts
- Sometimes they are social scientists
- Sometimes they are R&D managers
- Sometimes they are software engineers
- Sometimes they are just smart people; sometimes they aren't
- Often the models integrate components from multiple complex domains
- Sometimes the SMEs are involved; sometimes not
- Users are rarely involved, and often only after the fact
6. The Problem
- Modelers make lots of assumptions and create very complicated black boxes that model even more complex dynamic systems
- COMPOEX
- Around 5000 simulation variables for one exercise
- 1000 equations
- 12,000 to 19,000 factors
- The science sometimes stops
- Are there scientific techniques that can aid users?
- Users include subjects, managers, modelers, and SMEs
- Validate complex simulations
- Provide users with some confidence that the model is viable
7. Agenda
- HSCB models
- Validation techniques
- Traditional sensitivity analysis approach to validation
- A tool leveraging Bayesian inference for active factor screening
- Technical details
- Challenges
8. Indicators That Help Users Assess Model Applicability for a Particular Use
- Simulation-produced results agree (or disagree) with user expectations
- Changes in an input variable value lead to expected (or unexpected) changes in output values
- Factors that drive simulation output agree with factors that the user believes should be drivers
- Chain of causes and effects within the simulation is judged appropriate to the situation and problem being addressed
User Needs Tools to Help Generate and Explore Such Indicators
9. Ultimate Goal and Guiding Principles
- Goal: To create a method to help a user assess simulation applicability for a specific use
- Not a substitute for a comprehensive validation campaign, which may include empirical and construct validation
- May also be used to support SMEs and developers during and after model development
- Principles
- Treat the user's mental model as the point of reference for judging applicability and validity of the model
- Help the user apply subject matter expertise without having to be familiar with model algorithms
- Facilitate user evaluation of model applicability
- Use loosely coupled statistical methods to clarify important model factors
10. Objective
- Achieve simulation transparency and credibility
- Do it with an automated tool
- Help users explore the simulation results in a specific problem context
11. Concept of Operations
- User defines his/her problem, configures the simulation, and frames the experiment
- Specifies cases to be explored and questions to be answered
- Identifies a set of relevant input variables
- Establishes hypotheses about expected outcomes
- Tool provides the following automated services
- Supports systematic variation of input variables and examination of output variables
- Identifies the subset of chosen input variables and interactions that most influence chosen output variables
- Presents information for visual interpretation
12. Agenda
- HSCB models
- Validation techniques
- Traditional sensitivity analysis approach to validation
- A tool leveraging Bayesian inference for active factor screening
- Technical details
- Challenges
13. Simulation Chosen for Illustration
- Anderson, Edward G. A Preliminary System Dynamics Model of Insurgency Management: The Anglo-Irish War of 1916-21 as a Case Study. Proceedings of the 2006 International System Dynamics Conference, Nijmegen, the Netherlands. March, 2006.
- University of Texas, McCombs School of Business, Department of Information, Risk and Operations Management.
- Developers employed the model to demonstrate the potential of using the system dynamics computer simulation methodology to gain insight into the dynamic behavior of insurgents.
14. About the Simulation
Input Parameters: 32 Variables, 6 Interacting Classes
- Citizens: Time to weary of war; Satisfaction parameter; Attrition parameter; Time to satisfy; Time to dissatisfy; Fraction of population attracted to insurgent action
- Insurgents: Initial active insurgents; Average insurgent career (yrs); Base insurgent density; Base insurgent fraction; Frac of males likely to join; Lifespan in years; Population annual growth rate; Base population; Time to join insurgency; Min demobilization time; Minimum insurgent frac activated; Insurgent parameter
- Incidents: Incidents/insurgent-month; Attrition rate per incident
- Coercion: Base coercion fruitfulness; Coercion response time; Max coercive acts/month; Coercion parameter
- Soldiers: Base troops in Ireland; Minimum troops to hold Ireland; Time to move troops; Troop parameter diminishing returns of troop presence
- Weapons: Weapons availability (y/n); Weapons parameter
Simulation Variables (Partial List)
- British public war weariness effect on troops in Ireland
- Effect of Irish public satisfaction on Irish insurgents
- Effect of insurgent density on identifying insurgents
- Insurgents captured per coercive act
- Number of insurgents dead or captured who would have remained active
- Inactive insurgent retirement rate
- Coercion acts per month (also per soldier, per citizen)
- Irish awareness of coercive acts
- Fraction of insurgents that are active
- Increase in insurgents
- Number of insurgents if there was no delay to activate
- Potential insurgents
- Pressure of British government to reduce incidents
Output Variables for Exploration
- British troops in Ireland (t = 120 months)
- Irish satisfaction (t = 120 months)
- Active insurgents (t = 120 months)
- British war weariness (t = 120 months)
Vensim instantiation of model enables
- Representation of time delays
- Integration over simulation intervals
15. User's Mental Model: Some Relationships and Competing Influences
[Causal diagram; node labels follow]
Fewer coercive acts, since fewer troops to do them
Fewer British troops in Ireland
More desire by British public to bring troops home
More insurgent incidents
More active insurgents
More coercive acts (killing, jailing insurgents)
More desire by British public to stamp out insurgency
Fewer insurgents
More insurgents killed or jailed
More coercive acts (killing, jailing) against insurgents
Irish public more dissatisfied
More insurgent recruits
More insurgents
16. Input Variables Changed
- Insurgent parameter (decreased): sensitivity of number of insurgent recruits to Irish dissatisfaction
- Time to satisfy (decreased): delay between a reduction of coercive acts and Irish satisfaction with British troops
- User will evaluate applicability of the model by comparing expectations generated by his mental model with the response of output variables to changes in input
17. User Expectations if Insurgent Parameter is Reduced
- New recruits will diminish
- Violent acts will diminish
- British coercive acts per British soldier will diminish
- British war weariness will diminish
- Number of British troops withdrawn will be less, so more will remain
- Number of coercive acts may increase or decrease depending on which has more impact: the decrease in coercive acts per soldier or the increase in British soldiers
- Irish dissatisfaction with the British will increase or decrease depending on whether the number of coercive acts increases or decreases
18. Simulation Results: Insurgent Parameter Reduced
Reduced war weariness at first
Fewer insurgents at first. More at end
More British troops
Decreased Irish satisfaction over longer term
19. User Evaluation of Simulation Results When Insurgent Parameter Reduced
Reduced war weariness at first, as expected due to fewer insurgent attacks
Fewer insurgents at first, as expected since new recruits reduced
More British troops, as expected, since reduced war weariness reduces pressure to bring troops home
No user expectations for Irish satisfaction, because the change depends on coercive acts, which the user's model says could either increase or decrease
20. User Conclusions: Decrease of Insurgent Parameter
- Simulation output for number of insurgents, war weariness, and British troops behaved as expected, increasing confidence in the simulation's applicability for this case
- Simulation output on Irish satisfaction could not affect the user's evaluation of the simulation, since the user had no prior expectations
- If the number of British soldiers matters most, then Irish satisfaction would be expected to decrease
- If coercive acts per soldier matters most, then Irish satisfaction would be expected to increase
But We've Only Looked at One Factor
21. User Expectations if Time to Satisfy is Reduced
This change reduces the time for Irish approval of the British to increase when the British reduce the number of coercive acts.
User expectations:
- Time constant change will have little if any impact on the final steady-state value of key variables
- Number of active insurgents
- British war weariness
- Number of British troops
- Irish satisfaction with British
- All changes to these values will simply occur more quickly
22. Simulation Results: Time to Satisfy Decreased (Quicker Reaction)
Peak reached sooner
Peak reached sooner
Change occurs more quickly
No change to steady state value or rate of change
No change to extrapolated steady state value
23. User Evaluation of Simulation Results: Time to Satisfy Reduced
Peak reached sooner, as expected
Peak reached sooner, as expected
No change to extrapolated steady-state value, as expected
No change to steady-state value, as expected
Change occurs more quickly, as expected
No significant impact on rate of change, contrary to expectations
No change to extrapolated steady-state value, as expected
24. User Conclusions: Decrease in Time to Satisfy
- No impact on steady-state response, as expected
- Decreased response time for number of active insurgents, British war weariness, and Irish satisfaction with British rule, as expected
- No change in time behavior of number of British troops, which is contrary to expectations
But We've Only Looked at Two Factors and No Interactions
25. User Conclusions From One-by-One Examination of Output Changes
- Output changes from insurgent parameter and time-to-satisfy match user expectations reasonably well, increasing credibility for the user
- Sensitivity analysis has not helped much with transparency.
- Sensitivity analysis is hard
- The inspection of additional variables would
- Be useful, but time consuming
- Provide additional insight into the applicability of the model
- Help focus the user on key variables to examine in more detail
User Would Like More Information on the Drivers for Selected Output Variables
26. Agenda
- HSCB models
- Validation techniques
- Traditional sensitivity analysis approach to validation
- A tool leveraging Bayesian inference for active factor screening
- Technical details
- Challenges
27. User Evaluation: Assessing Impact of Drivers Using a Bayesian Tool
- The sensitivity analysis inspected changes to output variables in response to changes to input variables, analyzed one-by-one.
- A Bayesian analysis is a way to automate and expedite the process
- This approach simultaneously identifies many input variables and variable interactions that cause a change in the value of output variables.
- User assessment of the applicability of the model will be based on a match between variables identified as drivers/non-drivers and those he expects to be drivers/non-drivers
- The technique can analyze both the input variables the user expects to drive output and those he is uncertain about or does not expect to drive output
28. Technical Approach: Innovative Use of Modified Box-Meyer Algorithm
- Employ an extension of the Box-Meyer (B-M) algorithm for finding active factors in fractionated screening experiments
- Originally intended to support experimental design
- Uses a Bayesian algorithm to compute the probability that a set of variables is active given experimental outcomes
- Uses marginal posterior probabilities to identify active factors
- Use the extended B-M method to help identify active and important factors that are small in magnitude and may be overwhelmed by large-magnitude factors
- Conduct computational experiments with a tool-generated experimental design for user-defined input variables
- Compute the probability that a variable is active given the simulation outputs
29. Example: 8 Input Variables Possibly Affecting the Number of Active Insurgents
- Weapons availability: affects number and strength of insurgent attacks and resulting pressure to reduce incidents
- Coercion parameter: how much an increase in the number and strength of insurgent attacks increases British coercive acts
- Max coercive acts per British soldier: prevents number of British attacks per soldier from exceeding specified value
- Reference incidents: affects impact of insurgent attacks on British desire to reduce attacks and on pressure on British to reduce incidents
- Time to weary of war: delay between a change in pressure to reduce incidents and British war weariness
- Troop parameter: how much war weariness impacts the withdrawal of British troops
- Time to satisfy: delay between a reduction of coercive acts and Irish satisfaction with British troops
- Insurgent parameter: sensitivity of number of insurgent recruits to Irish dissatisfaction
30. User Expectations if Numerous Input Variables Change Simultaneously
- User expects inputs that more directly affect insurgents to have the greatest impact
- Weapons availability: moderate impact, since it affects British reaction to attacks rather than troop levels directly
- Coercion parameter: moderate impact, since coercive acts can directly impact number of insurgents through attrition
- Max coercive acts per British soldier: uncertain impact, since it depends on whether this ceiling is activated
- Reference incidents: low impact, since it works indirectly through British reaction to insurgent attacks
- Time to weary of war: low impact, since it is a time constant
- Troop parameter: low impact, since it works indirectly through coercive acts
- Time to satisfy: low impact, since it is a time constant
- Insurgent parameter: high impact, since it directly affects recruitment of new members
Legend: Impact Expected; Minimal or No Impact Expected
31. Using Box-Meyer Bayesian Method to Identify Drivers
- User chooses
- Output variables whose drivers are to be determined
- Input variables that are candidates for being drivers
- Binary (low, high) values for each of these input variables
- Type of design
- Depth of interactions to be included in the tool's analysis
- X1, X2, X3 or X1, X2, X3, X1X2, X1X3, X2X3 or ...
- Tool
- Calculates the impact of individual and combined variables on the output variable
32. Design Matrix: Simulation Runs, 2^(8-4)
Insurgent Parameter | Time to Satisfy | Troop Parameter | Time to Weary of War | Reference Incidents (political) | Max Coercive Acts per Brit Soldier | Coercion Parameter (political) | Weapons Availability | Active Insurgents
-1 | -1 | -1 |  1 |  1 |  1 | -1 |  1 | 744
 1 | -1 | -1 | -1 | -1 |  1 |  1 |  1 | 727
-1 |  1 | -1 | -1 |  1 | -1 |  1 |  1 | 400
 1 |  1 | -1 |  1 | -1 | -1 | -1 |  1 | 575
-1 | -1 |  1 |  1 | -1 | -1 |  1 |  1 | 312
 1 | -1 |  1 | -1 |  1 | -1 | -1 |  1 | 1020
-1 |  1 |  1 | -1 | -1 |  1 | -1 |  1 | 650
 1 |  1 |  1 |  1 |  1 |  1 |  1 |  1 | 0
 1 |  1 |  1 | -1 | -1 | -1 |  1 | -1 | 762
-1 |  1 |  1 |  1 |  1 | -1 | -1 | -1 | 1347
 1 | -1 |  1 |  1 | -1 |  1 | -1 | -1 | 1154
-1 | -1 |  1 | -1 |  1 |  1 |  1 | -1 | 936
 1 |  1 | -1 | -1 |  1 |  1 | -1 | -1 | 1311
-1 |  1 | -1 |  1 | -1 |  1 |  1 | -1 | 572
 1 | -1 | -1 |  1 |  1 | -1 |  1 | -1 | 1098
-1 | -1 | -1 | -1 | -1 | -1 | -1 | -1 | 1248
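As a quick cross-check on the 16 runs in the design matrix, classical main effects (mean response at the high level minus mean response at the low level) can be computed directly from the table. This is a minimal sketch and not the Box-Meyer analysis the tool performs; the shortened factor labels and the function name main_effects are illustrative choices.

```python
import numpy as np

# Shortened labels for the eight design-matrix columns (labels are illustrative).
factors = ["Insurgent parameter", "Time to satisfy", "Troop parameter",
           "Time to weary of war", "Reference incidents",
           "Max coercive acts per soldier", "Coercion parameter",
           "Weapons availability"]

# Rows copied from the design matrix: 8 coded levels, then active insurgents.
runs = np.array([
    [-1, -1, -1,  1,  1,  1, -1,  1,  744],
    [ 1, -1, -1, -1, -1,  1,  1,  1,  727],
    [-1,  1, -1, -1,  1, -1,  1,  1,  400],
    [ 1,  1, -1,  1, -1, -1, -1,  1,  575],
    [-1, -1,  1,  1, -1, -1,  1,  1,  312],
    [ 1, -1,  1, -1,  1, -1, -1,  1, 1020],
    [-1,  1,  1, -1, -1,  1, -1,  1,  650],
    [ 1,  1,  1,  1,  1,  1,  1,  1,    0],
    [ 1,  1,  1, -1, -1, -1,  1, -1,  762],
    [-1,  1,  1,  1,  1, -1, -1, -1, 1347],
    [ 1, -1,  1,  1, -1,  1, -1, -1, 1154],
    [-1, -1,  1, -1,  1,  1,  1, -1,  936],
    [ 1,  1, -1, -1,  1,  1, -1, -1, 1311],
    [-1,  1, -1,  1, -1,  1,  1, -1,  572],
    [ 1, -1, -1,  1,  1, -1,  1, -1, 1098],
    [-1, -1, -1, -1, -1, -1, -1, -1, 1248],
])
X, y = runs[:, :8], runs[:, 8].astype(float)

def main_effects(X, y):
    """Mean response at +1 minus mean response at -1, one value per factor."""
    return np.array([y[X[:, j] == 1].mean() - y[X[:, j] == -1].mean()
                     for j in range(X.shape[1])])

effects = main_effects(X, y)
# Print factors from largest to smallest absolute main effect.
for name, e in sorted(zip(factors, effects), key=lambda p: -abs(p[1])):
    print(f"{name:32s} {e:9.2f}")
```

Because the eight columns are mutually orthogonal, each main effect is estimated independently of the others, though in a fractional design such as this one the estimates can still be aliased with higher-order interactions.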
33. Box-Meyer is a Snapshot of the Simulation Results
- The quantitative analysis shown next is from t = 120 months
34. Bayesian Results: Driver Identification
[Chart: Bayesian driver identification; parameter settings .25 and .7]
35. User Reaction to Bayesian Results
- Results are in general agreement with expectations
- User is surprised by reference incidents and insurgent parameter
- Inspection of the earlier insurgent parameter run indicates a high initial impact which diminishes over time, but not to zero.
- User will evaluate reference incidents more closely to understand the reasons for its high impact
[Chart: Bayesian driver identification; parameter settings .25 and .7. Legend: Match expectations; Contrary to expectations]
36. Value in Simulation Evaluation for Current Use
- Bayesian analysis identified the active factors that drive the output
- Impact of 6 input variables met expectations
- User should investigate reference incidents more closely to understand its importance
- User may want to investigate insurgent parameter more closely to understand its initial but diminishing impact
- Screening technology prompts user to focus on the most surprising variable (reference incidents)
- Typical follow-on analysis might
- Re-examine the four active factors and study the interactions
- Examine temporal behavior
- Examine the causal chains
37. User Assessment
- If results agree with expectations/hypothesis, it is an indicator that the simulation is consistent with the user's presumed understanding of the situation
- If the user is uncomfortable with results, it is an indicator that further exploration of the simulation's causal chains is warranted
- If deviation of results from user expectations remains after further investigation, it is an indicator that the model may not be applicable for the specific use being considered
38. Analytical Benefits and Application Strategy
- Benefits
- Parsimony: helps distinguish the vital few variables from the trivial many for a specific problem
- Explanatory power: helps find plausible variables that account for experimentation results or can be explored further
- Economy: puts focus on fractional or nearly-orthogonal designs that allow more variables to be considered
- Application Strategy
- Consider a set of main variables and interactions up to a specified order (level of complexity)
- Identify drivers (or unimportant parameters) and examine their interaction more closely through subsequent experiments
- Add additional variables or levels of complexity as needed
More Accurate than Standard Visual Statistical Methods
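39. Iterative Methodology
- Work bottom-up through multiple integrated models to identify globally important input factors
[Diagram: component models (Religious, Economic, Social, Cultural, Political) with input factor groups F1-F10, F11-F30, F31-F60, F61-F75, F76-F100; factors retained from the component models: F1-F7, F17-F19, F60, F61, F76-F80; globally important factors: F1, F60, F61, F80]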
40. Agenda
- HSCB models
- Validation techniques
- Traditional sensitivity analysis approach to validation
- A tool leveraging Bayesian inference for active factor screening
- Technical details
- Challenges
41. Algorithm Description: Application of Bayes' Formula
- Tool calculates p(Mi|y), the probability of each hypothesis given the simulation output
- Can be interpreted as the extent to which the set of variables and interactions in the regression formula associated with Mi can account for the simulation output, taking all the Mi into account
- Computes the marginal probabilities for each model variable and interaction.
- This is the sum of all the probabilities of hypotheses in which that term appears
Box and Meyer, 1993
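The marginal-probability step described on this slide is a simple sum over hypotheses. The sketch below assumes the posteriors p(Mi|y) have already been computed; the four hypotheses and their probabilities are made-up illustration values, and the function name marginal_probabilities is hypothetical.

```python
from collections import defaultdict

def marginal_probabilities(model_posteriors):
    """Sum p(Mi|y) over every hypothesis Mi that contains a given term."""
    marg = defaultdict(float)
    for terms, prob in model_posteriors.items():
        for term in terms:
            marg[term] += prob
    return dict(marg)

# Hypothetical posteriors for four hypotheses (keys list the active terms).
# The numbers are invented for illustration and sum to 1.
posteriors = {
    (): 0.05,
    ("X1",): 0.25,
    ("X1", "X3"): 0.60,
    ("X2",): 0.10,
}

marginals = marginal_probabilities(posteriors)
print(marginals)
```

In this made-up case X1 appears in two hypotheses, so its marginal probability is the sum of both, which is why a factor can be flagged as active even when no single hypothesis dominates.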
42. Extension: Modified Box-Meyer Method
- Samset, O., and J. Tyssedal. Study of the Box-Meyer Method for Finding Active Factors in Screening Experiments. 1998.
- As the difference in size between the largest and smallest effects in the true model increases, the Box-Meyer method seems to have a problem with identifying the smallest effects, even if these are highly significant
- They modified the Box-Meyer method to accommodate this
- The idea: identify interactions that are small and delete those columns from the analysis matrix. With noise-level factors removed, the significant but small-magnitude factors should show up
Samset and Tyssedal, 1998
43. Applying the Extension
[Charts: Samset-Tyssedal, 2^(8-2) partial design, δ = 0.6; Box-Meyer, 2^(8-2) partial design, δ = 0]
- The Samset-Tyssedal extension improves the active factor analysis for a partial design so that it is very similar to an active factor analysis for a full design.
44. Agenda
- HSCB models
- Validation techniques
- Traditional sensitivity analysis approach to validation
- A tool leveraging Bayesian inference for active factor screening
- Technical details
- Challenges
45. Challenges in Applying the Extension
- Without the full design, when does one stop?
- Strategy for choosing delta?
- 0 ≤ δ ≤ TBD?
- How is δ related to γ?
- Dependencies among π, γ, and δ and how to pick their values
- How do you know when to stop increasing δ?
- For 2^(k-p) designs, what are the limits on increasing the value of p when leveraging modified B-M?
Explore the Extension in Phase II
46. Follow-up Experiment: Identifying Driving Interactions
Consider this example: y = 5X1 + X3 + 4X1X3
M1: y = β0 + β1X1 + β3X3; P(M1|y) = P(X1, X3|y) = .95; sigma-square = 491
M1: y = β0 + β1X1 + β3X3 + β4X1X3; P(M1|y) = P(X1, X3|y) = .99; sigma-square = 35
- The X1X3 term isn't evaluated
- M1 accounts for much of the output, but the interaction term is required to capture all the effects.
- How do we know if the interactions are important?
- Rerun the algorithm with maximum interactions limited to one
Explore Strategies of Identifying Interactions in Phase II
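To make the point about the interaction term concrete, the sketch below fits a main-effects-only regression and a regression with the X1X3 term to a noise-free response, assuming the example is y = 5X1 + X3 + 4X1X3 on a two-level design coded as -1/+1. The coding and the helper residual_ss are assumptions for illustration, so the residuals here are not expected to match the sigma-square values quoted on the slide.

```python
import itertools
import numpy as np

# Full 2^2 design in coded units (-1 = low, +1 = high); the coding is assumed.
levels = np.array(list(itertools.product([-1, 1], repeat=2)), dtype=float)
x1, x3 = levels[:, 0], levels[:, 1]
y = 5 * x1 + x3 + 4 * x1 * x3

def residual_ss(columns, y):
    """Least-squares fit on an intercept plus `columns`; returns residual SS."""
    A = np.column_stack([np.ones_like(y)] + columns)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

rss_main = residual_ss([x1, x3], y)            # main effects only
rss_full = residual_ss([x1, x3, x1 * x3], y)   # main effects plus interaction
print(rss_main, rss_full)
```

The main-effects model leaves the whole interaction contribution in the residual, while adding the X1X3 column removes it, which mirrors the drop in sigma-square between the two fits on the slide.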
47. Other Challenges
- How to pick min/max for two factor designs
- Experimentation design strategies for sampling a large space of variables
- Number of factors and degree of interaction associated with partial designs
- Explore the accuracy-time tradeoffs to appropriately address a very large number of variables in very large simulations
- Strategies for investigating interactions
- Complementary methods and tools for follow-on analysis
- Means of identifying the time sequence of causes and effects
- Apply Box-Meyer at different time intervals
- Develop performance metrics and system limitations/boundaries
- Designs
- Orthogonality
- Scalability and full designs
- False negatives and limitation of partial designs
48. Conclusions
- Large complex simulation models are often little more than black boxes to users
- The utilization of rigorous quantitative techniques like Bayesian inference and statistical analysis can
- Improve both the credibility and transparency of the model
- Do this for a range of user classes
- Subjects, SMEs, R&D managers and developers
- The biggest challenge is how to attack scalability in a systematic way
50. Recursive Methodology Idea
51. Recursive Methodology for Integrated Models
- Work top-down to identify important intermediate outputs and thus relevant models
[Diagram: component models (Religious, Economic, Social, Cultural, Political) with input factor groups F1-F10, F11-F30, F31-F60, F61-F75, F76-F100; outputs 1-20 are inputs to the next stage; outputs O21-O25]
52. Bayes
53. Bayesian Refresher
- We have enough information about the outputs to predict the values of the pertinent density's defining parameters
- We don't have enough information, so we make an assumption about the prior distribution of the defining parameters and watch the densities update as we gain information from the outputs
54. Likelihood Function and the Sequential Nature of Bayes
Learning from Experience: Our knowledge of μ, σ² grows as new data become available
55. Genetics Example
MICE | BB (black) | Bb (black) | bb (brown)
BB mate bb | 0 | 1 | 0
Bb mate bb | 0 | ½ | ½
Bb mate Bb | ¼ | ½ | ¼
- Suppose a test mouse is black and its parents were Bb and Bb
- As offspring are born to the test mouse and a bb mate, what can we say about the probability that the test mouse is BB versus Bb?
- The prior probability for the test mouse is
- P(BB|black) = (1/4)/(1/4 + 1/2) = 1/3
- P(Bb|black) = (1/2)/(1/4 + 1/2) = 2/3
- As seven black offspring are born, the added information changes these probabilities as we apply Bayes' theorem successively
- P(BB|black): 1/3 → 1/2 → 2/3 → 4/5 → 8/9 → 16/17 → 32/33 → 64/65
- P(Bb|black): 2/3 → 1/2 → 1/3 → 1/5 → 1/9 → 1/17 → 1/33 → 1/65
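The successive updates in the genetics example can be reproduced with exact fractions. This is a minimal sketch; the function name update_bb is illustrative, and the likelihoods come from the mating table (a BB parent with a bb mate always produces black offspring, a Bb parent does so half the time).

```python
from fractions import Fraction

def update_bb(prior_bb, n_black_offspring):
    """Posterior P(BB) after each successive black offspring from a bb mate."""
    like_bb = Fraction(1)        # P(black offspring | test mouse is BB)
    like_het = Fraction(1, 2)    # P(black offspring | test mouse is Bb)
    p = prior_bb
    history = [p]
    for _ in range(n_black_offspring):
        numerator = p * like_bb
        p = numerator / (numerator + (1 - p) * like_het)
        history.append(p)
    return history

bb_history = update_bb(Fraction(1, 3), 7)
bb_het_history = [1 - p for p in bb_history]
print([str(p) for p in bb_history])
print([str(p) for p in bb_het_history])
```

Each pass through the loop uses the previous posterior as the new prior, which is the sequential character of Bayes' theorem that the example is meant to show.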
56. Questions
- What does the probability mean?
- Frequency of occurrence for the genetics example
- Mathematical expression of our belief in a certain proposition
- Bayes allows us to recalibrate our belief
- Choice and necessity of prior distribution?
- Known genetic facts
- What our belief is before we begin experimenting, e.g. betting odds
- Non-informative uniform distribution: tails go to zero
- Integrity of the density function
- Likelihood dominates the prior with enough information
57. Algorithm Description: Application of Bayesian Regression
- Tool creates a set of hypotheses or models (Mi) from user-defined variables
- Regression formulas for a subset of variables and variable interactions
- Examples
- Calculates each regression term and variance for each of the hypotheses
- Treats each βi in each regression formula as a random variable with an i.i.d. N(0, γ²σ²) Bayesian prior
- From regression data, determines the predictive density of the simulation output, given a hypothesized model, Mi
Box and Meyer, 1993
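The steps on this slide can be sketched end to end on a small synthetic problem. The posterior form used below is an assumption based on one reading of Box and Meyer (1993): p(Mi|y) proportional to π^f (1-π)^(k-f) γ^(-t) |Γ + X'X|^(-1/2) [S + β'Γβ]^(-(n-1)/2), with f active factors, t effect terms, and Γ = diag(0, 1/γ², ..., 1/γ²). The function names, the default settings π = 0.25 and γ = 2, and the synthetic response are illustrative and are not the tool's implementation.

```python
import itertools
import numpy as np

def model_matrix(X, active):
    """Intercept, main effects, and two-factor interactions of active factors."""
    cols = [np.ones(X.shape[0])] + [X[:, j] for j in active]
    cols += [X[:, a] * X[:, b] for a, b in itertools.combinations(active, 2)]
    return np.column_stack(cols)

def log_posterior(X, y, active, pi, gamma):
    """Unnormalized log p(Mi|y) under the assumed Box-Meyer form."""
    n, k = X.shape
    Xi = model_matrix(X, active)
    t = Xi.shape[1] - 1                          # effect terms, no intercept
    Gamma = np.diag([0.0] + [1.0 / gamma**2] * t)
    A = Gamma + Xi.T @ Xi
    beta = np.linalg.solve(A, Xi.T @ y)          # shrunken regression estimate
    resid = y - Xi @ beta
    S = resid @ resid + beta @ Gamma @ beta
    f = len(active)
    return (f * np.log(pi) + (k - f) * np.log(1 - pi) - t * np.log(gamma)
            - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * (n - 1) * np.log(S))

def factor_marginals(X, y, pi=0.25, gamma=2.0):
    """Marginal posterior probability that each factor is active."""
    k = X.shape[1]
    models = [m for f in range(k + 1)
              for m in itertools.combinations(range(k), f)]
    logp = np.array([log_posterior(X, y, m, pi, gamma) for m in models])
    p = np.exp(logp - logp.max())
    p /= p.sum()
    return np.array([sum(pm for m, pm in zip(models, p) if j in m)
                     for j in range(k)])

# Synthetic check: full 2^4 design where only factors 0 and 2 drive the output.
# The small four-factor interaction term is a deterministic stand-in for noise.
X = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))
y = 10 + 5 * X[:, 0] + 3 * X[:, 2] + 0.3 * X.prod(axis=1)
marg = factor_marginals(X, y)
print(np.round(marg, 3))
```

Enumerating every subset of factors is only practical for a handful of candidates; with many factors the hypotheses would need to be limited to a maximum number of active factors.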
58. Experimental Design
59. Screening Designs: Scalability
- Low Resolution Designs
- Screen out the few important main effects from the many less important others.
- Main effects are confounded with two-factor interactions
- High Resolution Designs
- Estimate interaction effects
- No main or two-factor interaction is confounded with any other main or two-factor interaction
- Two-factor interactions are confounded with three-factor interactions
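Confounding in a low-resolution design can be seen in the smallest case. The sketch below builds a half fraction of a three-factor design using the generator C = AB, which is one illustrative choice of a resolution III design and is not taken from the presentation.

```python
import itertools
import numpy as np

# Half fraction of a 2^3 design: full factorial in A and B, with C = A*B.
base = np.array(list(itertools.product([-1, 1], repeat=2)))
A, B = base[:, 0], base[:, 1]
C = A * B

# Each main-effect column is identical to a two-factor interaction column,
# so the four runs cannot separate the two effects in each pair.
c_aliased_with_ab = np.array_equal(C, A * B)
a_aliased_with_bc = np.array_equal(A, B * C)
b_aliased_with_ac = np.array_equal(B, A * C)
print(c_aliased_with_ab, a_aliased_with_bc, b_aliased_with_ac)
```

The design screens three factors in four runs instead of eight, at the price that any apparent main effect could equally be the aliased interaction.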
60. Design Matrix
Run X1 X2 X3
1 -1 -1 -1
2 1 -1 -1
3 -1 1 -1
4 1 1 -1
5 -1 -1 1
6 1 -1 1
7 -1 1 1
8 1 1 1
61. Analysis Matrix
I X1 X2 X1X2 X3 X1X3 X2X3 X1X2X3 Y
1 -1 -1 1 -1 1 1 -1 Y1
1 1 -1 -1 -1 -1 1 1 Y2
1 -1 1 -1 -1 1 -1 1 Y3
1 1 1 1 -1 -1 -1 -1 Y4
1 -1 -1 1 1 -1 -1 1 Y5
1 1 -1 -1 1 1 -1 -1 Y6
1 -1 1 -1 1 -1 1 -1 Y7
1 1 1 1 1 1 1 1 Y8
- Orthogonality eliminates correlation between estimates of main effects and estimates of interaction effects
- Note that in the regression solution the H and L values have to replace the 1s and -1s
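The design and analysis matrices for three factors can be generated rather than typed by hand. This is a minimal sketch; the function names are illustrative, and the columns come out grouped by interaction order (I, X1, X2, X3, X1X2, X1X3, X2X3, X1X2X3) rather than in the order shown on the slide.

```python
import itertools
import numpy as np

def full_factorial(k):
    """2^k design in standard order: the first factor alternates fastest."""
    rows = itertools.product([-1, 1], repeat=k)
    return np.array([r[::-1] for r in rows])

def analysis_matrix(design):
    """Intercept plus a product column for every non-empty subset of factors."""
    n, k = design.shape
    names, cols = ["I"], [np.ones(n, dtype=int)]
    for order in range(1, k + 1):
        for subset in itertools.combinations(range(k), order):
            names.append("".join(f"X{j + 1}" for j in subset))
            cols.append(design[:, list(subset)].prod(axis=1))
    return names, np.column_stack(cols)

design = full_factorial(3)
names, M = analysis_matrix(design)
print(names)
print(M)

# Orthogonality check: every pair of distinct columns has zero inner product.
gram = M.T @ M
print(np.array_equal(gram, 8 * np.eye(8, dtype=int)))
```

The diagonal Gram matrix is the orthogonality property noted in the bullets above: each effect estimate is uncorrelated with every other.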
62. Parameters
63. Input Parameters
- π is the probability that a single factor is active, and since we expect only half the factors to be active in any analysis we can assume it is between 0 and 0.5
- π is chosen with a nominal value of 0.25, but the choice normally has negligible effect
- γ is chosen by evaluating p(γ|y) for 1 ≤ γ ≤ 10 in increments of 1 and choosing the one that gives the maximum for the analysis
64. Formula Parameters
65. Simulation