1
Simulation and Uncertainty
  • Tony O'Hagan
  • University of Sheffield

2
Outline
  • Uncertainty
  • Example: bovine tuberculosis
  • Uncertainty analysis
  • Elicitation
  • Case study 1: inhibiting platelet aggregation
  • Propagating uncertainty
  • Case study 2: cost-effectiveness
  • Conclusions

3
Two kinds of uncertainty
  • Aleatory (randomness)
  • Number of heads in 10 tosses of a fair coin
  • Mean of a sample of 25 from a N(0,1) distribution
  • Epistemic (lack of knowledge)
  • Atomic weight of Ruthenium
  • Number of deaths at Agincourt
  • Often, both arise together
  • Number of patients who respond to a drug in a
    trial
  • Mean height of a sample of 25 men in Fontainebleau
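Both aleatory examples on this slide can be reproduced by direct simulation; a minimal Python sketch (the seed and replication counts are arbitrary choices, not from the talk):

```python
import random

random.seed(1)

# Aleatory example 1: number of heads in 10 tosses of a fair coin
heads = sum(random.random() < 0.5 for _ in range(10))

# Aleatory example 2: mean of a sample of 25 from a N(0,1) distribution
sample_mean = sum(random.gauss(0.0, 1.0) for _ in range(25)) / 25

# Repeating either experiment gives a different answer each time:
# that run-to-run variation is the aleatory uncertainty.
reps = [sum(random.random() < 0.5 for _ in range(10)) for _ in range(10000)]
avg_heads = sum(reps) / len(reps)   # close to the expected value 5
```

Re-running without the fixed seed gives different values of `heads` and `sample_mean` each time, which is what makes these uncertainties aleatory rather than epistemic.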

4
Two kinds of probability
  • Frequency probability
  • Long run frequency in many repetitions
  • Appropriate only for purely aleatory uncertainty
  • Subjective (or personal) probability
  • Degree of belief
  • Appropriate for both aleatory and epistemic (and
    mixed) uncertainties
  • Consider, for instance
  • Probability that next president of USA is
    Republican

5
Uncertainty and statistics
  • Data are random
  • Repeatable
  • Parameters are uncertain but not random
  • Unique
  • Uncertainty in data is mixed
  • But aleatory if we condition on (fix) the
    parameters
  • E.g. likelihood function
  • Uncertainty in parameters is epistemic
  • If we condition on the data, nothing aleatory
    remains

6
Two kinds of statistics
  • Frequentist
  • Based on frequency probability
  • Confidence intervals, significance tests etc
  • Inferences valid only in long run repetition
  • Does not make probability statements about
    parameters
  • Bayesian
  • Based on personal probability
  • Inferences conditional on the actual data
    obtained
  • Makes probability statements about parameters

7
Example: bovine tuberculosis
  • Consider a model for the spread of tuberculosis
    (TB) in cows
  • In the UK, TB is primarily spread by badgers
  • Model in order to assess reduction of TB in cows
    if we introduce local culling (i.e. killing) of
    badgers

8
How the model might look
  • Simulation model components
  • Location of badger setts, litter size and
    fecundity
  • Spread of badgers
  • Rates of transmission of disease
  • Success rate of culling

9
Uncertainty in the TB model
  • Simulation
  • Replicate runs give different outcomes (aleatory)
  • Parameter uncertainty
  • E.g. mean (and distribution of) litter size,
    dispersal range, transmission rates (epistemic)
  • Structural uncertainty
  • Alternative modelling assumptions (epistemic)
  • Interest in properties of simulation distribution
  • E.g. probability of reducing bovine TB incidence
    below threshold (with optimal culling)
  • All are functions of parameters and model
    structure

10
General structure
  • Uncertain model parameters (structure) X
  • With known distribution
  • True value XT
  • Object of interest YT = Y(XT)
  • Possibly optimised over control parameters
  • Model output Z(X), related to Y(X)
  • E.g. Z(X) = Y(X) + error
  • Can run model for any X
  • Uncertainty about YT due to two sources
  • We don't know XT (epistemic)
  • Even if we knew XT, we can only observe Z(XT)
    (aleatory)

11
Uncertainty analysis
  • Find the distribution of YT
  • Challenges
  • Specifying distribution of X
  • Computing Z(X)
  • Identifying distribution of Z(X) given Y(X)
  • Propagating uncertainty in X

12
Parameter distributions
  • Necessarily personal
  • Even if we have data
  • E.g. sample of badger litter sizes
  • Expert judgement generally plays a part
  • May be formal or informal
  • Formal elicitation of expert knowledge
  • A seriously non-trivial business
  • Substantial body of literature, particularly in
    psychology

13
Case study 1
  • A pharmaceutical company is developing a new drug
    to reduce platelet aggregation for patients with
    acute coronary syndrome (ACS)
  • Primary comparator is clopidogrel
  • Case study concerns elicitation of expert
    knowledge prior to reporting of Phase 2a trial
  • Required in order to do Bayesian clinical trial
    simulation
  • 5 elicitation sessions with several experts over
    a total of about 3 days
  • Analysis revisited after Phase 2a and 2b trials

14
Simulating SVEs
SVE = secondary vascular event
IPA = inhibition of platelet aggregation
15
Distributions elicited
  • Many distributions were actually elicited
  • Mean IPA (efficacy on biomarker) for each drug
    and dose
  • Patient-level variation in IPA around mean
  • Relative risk of SVE conditional on individual
    patient IPA
  • Baseline SVE risk
  • Other things to do with side effects
  • We will just look here at elicitation of the
    distribution of mean IPA for a high dose of the
    new drug
  • Judgements made at the time
  • Knowledge now is of course quite different!
  • But decisions had to be made then about Phase 2b
    trial
  • Whether to go ahead or drop the drug
  • Size of sample, how many doses, etc

16
Elicitation record
17
Eliciting one distribution
  • Mean IPA (%) for high dose
  • Range: 80 to 100
  • Median: 92
  • Probabilities: P(over 95) = 0.4, P(under 85) = 0.2
  • Chosen distribution: Beta(11.5, 1.2)
  • Median: 93
  • P(over 95) = 0.36, P(under 85) = 0.20, P(under
    80) = 0.11
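The fitted Beta(11.5, 1.2) can be checked against the elicited summaries by simulation; a sketch using Python's standard library (the sample size and seed are arbitrary choices):

```python
import random

random.seed(0)

# Monte Carlo check of the chosen Beta(11.5, 1.2) against the elicited
# judgements (median 92, P(over 95) = 0.4, P(under 85) = 0.2)
N = 200_000
draws = sorted(random.betavariate(11.5, 1.2) for _ in range(N))

median = draws[N // 2]                          # close to the quoted 0.93
p_over_95 = sum(x > 0.95 for x in draws) / N    # close to 0.36
p_under_85 = sum(x < 0.85 for x in draws) / N   # close to 0.20
p_under_80 = sum(x < 0.80 for x in draws) / N   # close to 0.11
```

The fitted distribution does not reproduce the elicited probabilities exactly (0.36 vs 0.4 for the upper tail); that kind of feedback is how the elicitation is iterated with the experts.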

18
Propagating uncertainty
  • Usual approach is by Monte Carlo
  • Randomly draw parameter sets Xi, i = 1, 2, …, N
    from distribution of X
  • Run model for each parameter set to get outputs
    Yi = Y(Xi), i = 1, 2, …, N
  • Assume for now that we can do big enough runs to
    ignore the difference between Z(X) and Y(X)
  • These are a sample from distribution of YT
  • Use sample to make inferences about this
    distribution
  • Generally frequentist but fundamentally epistemic
  • Impractical if computing each Yi is
    computationally intensive
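The Monte Carlo recipe above can be sketched in a few lines. The model Y(X) = exp(-X) and the N(1, 0.2²) distribution for X are made-up stand-ins for a real simulator and an elicited parameter distribution:

```python
import math
import random

random.seed(42)

def Y(x):
    # Stand-in for an expensive simulator (hypothetical toy model)
    return math.exp(-x)

# Step 1: draw parameter sets Xi from the distribution of X
N = 50_000
xs = [random.gauss(1.0, 0.2) for _ in range(N)]

# Step 2: run the model for each draw
ys = [Y(x) for x in xs]

# Step 3: the Yi are a sample from the distribution of YT;
# summarise that distribution however we need
mean_y = sum(ys) / N
var_y = sum((y - mean_y) ** 2 for y in ys) / (N - 1)
ys.sort()
ci = (ys[int(0.025 * N)], ys[int(0.975 * N)])   # central 95% interval
```

The interval here is an epistemic statement about YT, even though it is computed by frequentist-style sampling, which is the point of the "generally frequentist but fundamentally epistemic" bullet.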

19
Optimal balance of resources
  • Consider the situation where each Z(Xi) is an
    average over n individuals
  • And Y(Xi) could be got by using very large n
  • Then total computing effort is Nn individuals
  • Simulation within simulation
  • Suppose
  • The variance between individuals is v
  • The variance of Y(X) is w
  • We are interested in E(Y(X)) and w
  • Then optimally n = 1 + v/w (approx)
  • Of order 36 times more efficient than large n
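A sketch of the nested scheme, with made-up variance components v and w (not the talk's numbers), using the approximate optimum n = 1 + v/w:

```python
import random

random.seed(7)

# Hypothetical variance components (illustrative values only)
v = 4.0    # variance between individuals within one parameter set
w = 0.25   # variance of Y(X) across parameter sets

# Approximate optimal inner sample size: n = 1 + v/w
n = round(1 + v / w)

def simulate_trial(x, n):
    # Z(x): average outcome over n simulated individuals (toy model
    # with individual-level variance v around the parameter value x)
    return sum(random.gauss(x, v ** 0.5) for _ in range(n)) / n

# Nested simulation: outer loop over parameter draws X,
# inner loop over n individuals per draw; total effort is N*n
N = 20_000
zs = [simulate_trial(random.gauss(0.0, w ** 0.5), n) for _ in range(N)]

mean_z = sum(zs) / N                      # estimates E(Y(X)) = 0 here
var_z = sum((z - mean_z) ** 2 for z in zs) / (N - 1)
w_hat = var_z - v / n                     # Var(Z) = w + v/n, so back out w
```

Because Var(Z(X)) = w + v/n, the inner-simulation noise inflates the raw variance of the outputs; subtracting v/n recovers an estimate of w, the quantity of interest.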

20
Emulation
  • When even this efficiency gain is not enough
  • Or when the conditions don't hold
  • We may be able to propagate uncertainty through
    emulation
  • An emulator is a statistical model/approximation
    for the function Y(X)
  • Trained on a set of model runs Yi = Y(Xi) or
    Zi = Z(Xi)
  • But the Xi are not chosen randomly (inference is
    now Bayesian)
  • Runs much faster than the original simulator
  • Think neural net or response surface, but better!

21
Gaussian process
  • The emulator represents Y(.) as a Gaussian
    process
  • Prior distribution embodies only a belief that
    Y(X) is a smooth, continuous function of X
  • Condition on training set to get posterior GP
  • Posterior mean function is a fast approximation
    to Y(.)
  • Posterior variance expresses additional
    uncertainty
  • Unlike neural net or response surface, the GP
    emulator correctly encodes the training data
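A minimal one-input GP emulator showing the conditioning step; the squared-exponential kernel, its fixed hyperparameters, and the toy simulator are all illustrative choices, not from the talk:

```python
import numpy as np

# Squared-exponential covariance: encodes the belief that Y(.) is a
# smooth, continuous function of X (hyperparameters fixed by hand here)
def kernel(a, b, scale=1.0, length=0.3):
    return scale * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

# Training runs of the simulator (toy function standing in for Y(.))
x_train = np.array([0.0, 0.4, 1.0])
y_train = np.sin(2 * np.pi * x_train)

# Condition the GP prior on the training runs
K = kernel(x_train, x_train) + 1e-10 * np.eye(len(x_train))  # jitter
alpha = np.linalg.solve(K, y_train)

def emulate(x_new):
    # Posterior mean and variance of the GP at new inputs
    Ks = kernel(np.asarray(x_new), x_train)
    mean = Ks @ alpha
    var = kernel(np.asarray(x_new), np.asarray(x_new)).diagonal() \
          - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    return mean, var

# The posterior mean interpolates the training data exactly, and the
# posterior variance at the training points is (numerically) zero --
# this is what "correctly encodes the training data" means
m, s2 = emulate(x_train)
```

Between training points the posterior variance grows, quantifying the extra uncertainty introduced by using the emulator in place of the simulator, exactly as the next three slides illustrate.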

22
2 code runs
  • Consider one input and one output
  • Emulator estimate interpolates data
  • Emulator uncertainty grows between data points

23
3 code runs
  • Adding another point changes estimate and reduces
    uncertainty

24
5 code runs
  • And so on

25
Then what?
  • Given enough training data points we can emulate
    any model accurately
  • So that posterior variance is small everywhere
  • Typically, this can be done with orders of
    magnitude fewer model runs than traditional
    methods
  • Use the emulator to make inference about other
    things of interest
  • E.g. uncertainty analysis, calibration,
    optimisation
  • Conceptually very straightforward in the Bayesian
    framework
  • But of course can be computationally hard

26
Case study 2
  • Clinical trial simulation coupled to economic
    model
  • Simulation within simulation
  • Outer simulation of clinical trials, producing
    trial outcome results
  • In the form of posterior distributions for drug
    efficacy
  • Incorporating parameter uncertainty
  • Inner simulation of cost-effectiveness (NICE
    decision)
  • For each trial outcome simulate patient outcomes
    with those efficacy distributions (and many other
    uncertain parameters)
  • Like the optimal balance of resources slide
  • But complex clinical trial simulation replaces
    simply drawing from distribution of X

27
Emulator solution
  • 5 emulators built
  • Means and variances of (population mean)
    incremental costs and QALYs, and their covariance
  • Together these characterised the Cost
    Effectiveness Acceptability Curve
  • Which was basically our Y(X)
  • For any given trial design and drug development
    protocols, we could assess the uncertainty (due
    to all causes) regarding whether the final Phase
    3 trial would produce good enough results for the
    drug to be
  • Licensed for use
  • Adopted as cost-effective by the UK National
    Health Service

28
Conclusions
  • The distinction between epistemic and aleatory
    uncertainty is useful
  • Recognising that uncertainty about parameters of
    a model (and structural assumptions) is epistemic
    is useful
  • Expert judgement is an integral part of
    specifying distributions
  • Uncertainty analysis of a stochastic simulation
    model is conceptually a nested simulation
  • Optimal balance of sample sizes
  • More efficient computation using emulators

29
References
  • On elicitation
  • O'Hagan, A. et al (2006). Uncertain Judgements:
    Eliciting Expert Probabilities. Wiley.
  • www.shef.ac.uk/beep
  • On optimal resource allocation
  • O'Hagan, A., Stevenson, M.D. and Madan, J.
    (2007). Monte Carlo probabilistic sensitivity
    analysis for patient level simulation models:
    Efficient estimation of mean and variance using
    ANOVA. Health Economics (in press).
  • Download from tonyohagan.co.uk/academic
  • On emulators
  • O'Hagan, A. (2006). Bayesian analysis of computer
    code outputs: a tutorial. Reliability Engineering
    and System Safety 91, 1290-1300.
  • mucm.group.shef.ac.uk