Fast Simulators for Assessment and Propagation of Model Uncertainty*


1
Fast Simulators for Assessment and Propagation of
Model Uncertainty
  • Jim Berger, M.J. Bayarri, German Molina
  • June 20, 2001
  • SAMO 2001, Madrid
  • Project of the National Institute of Statistical
    Sciences

2
Some activities requiring numerous runs of a
complex computer model
  • Output analysis: with random inputs, what is the
    distribution of the output variables?
  • Optimization: finding the optimal setting for
    process control variables (e.g., signal timing).
  • Design of computer or field experiments.
  • Bayesian inference: learning about unknown model
    parameters or inputs from field data (i.e., data
    from the process being modeled).

3
The problem and solution
  • If runs of the computer model are too slow, the
    activity cannot be completed.
  • The natural solution is to approximate the
    computer model; the most common approach is
    approximation by a faster computer model:
  • models of lower resolution
  • linearized versions of the model
  • response surface (or Gaussian process)
    approximations
  • probability networks of various types.

4
An Example: Bayesian input analysis for CORSIM
  • The microsimulator CORSIM is a computer model of
    street and highway traffic.
  • It models vehicles entering the network and
    moving according to interaction rules.
  • The traffic network studied consists of a
    44-intersection neighborhood in Chicago.
  • CORSIM was applied to model a one-hour period
    during rush hour.

5
Network (Chicago)
[Figure: map of the Chicago network. Street labels: Kingsbury, Huron, Erie, Ontario, Ohio, Grand, Illinois, Hubbard, Dearborn, Orleans, Franklin, LaSalle, Clark, Wells. Other labels: O'Hare, LOOP.]

6
Key Unknown Inputs
  • Demands, λ: the means of the exponential
    inter-arrival time distributions that determine
    the (random) numbers of vehicles that enter the
    system from external streets. λ is
    16-dimensional.
  • Turning probabilities, P: the probabilities that
    vehicles turn right, turn left, or go through at
    each intersection. P is 84-dimensional.
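
As a rough illustration of how one demand parameter produces a random vehicle count, the sketch below draws exponential inter-arrival times with a given mean and counts the arrivals within one hour. The function name, the seed, and the 4-second mean gap are hypothetical; the talk does not show how CORSIM itself handles these inputs.

```python
import random

def count_arrivals(mean_gap_s, horizon_s=3600.0, rng=None):
    """Count vehicles entering within horizon_s seconds when the gaps between
    arrivals are exponential with mean mean_gap_s (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    t, n = 0.0, 0
    while True:
        t += rng.expovariate(1.0 / mean_gap_s)  # expovariate takes the rate, 1 / mean
        if t > horizon_s:
            return n
        n += 1

rng = random.Random(0)
counts = [count_arrivals(4.0, rng=rng) for _ in range(200)]
mean_count = sum(counts) / len(counts)  # expected value is 3600 / 4 = 900
```

Counting exponential gaps up to a fixed horizon gives a Poisson-distributed count, which is why the demand enters the later model through a Poisson term.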

7
Data: vehicle counts, C
  • Demand counts: the numbers of vehicles entering
    the network at each street, recorded by observers
    placed on the external streets.
  • Turning counts: made by observers over short time
    intervals at all intersections.
  • Video counts: at central intersections, cameras
    were placed that produced an exact count of
    vehicles.

8
Problems with the Data
  • Demand counts are inaccurate, some by as much as
    40%.
  • Turning counts were made over short time periods.
  • Some of the turning counts were missing.
  • The observer counts were incompatible with the
    video counts (reality), so they were tuned to
    bring them into accordance.

9
Example of a tuning adjustment

Observer reported 1969 vehicles entering
here. This was adjusted to 1790 vehicles to fit
the observed video count here.
[Figure: map detail labeled Erie, Ontario, and LaSalle.]

10
Problems with tuning
  • Often, too few inputs are tuned, and those that
    are tuned are then over-tuned.
  • The often considerable uncertainty in the tuned
    inputs is ignored, resulting in an overly
    optimistic assessment of output variance.
  • Tuning can mask model biases that actually exist,
    making the model less accurate for prediction
    outside the range of the data (not applicable
    here).

11
A solution: Bayesian analysis
  • Compute the posterior distribution of the true
    model inputs, given the data.
  • But this typically requires use of Markov chain
    Monte Carlo (MCMC) methods, involving thousands
    of model runs, which is too time-consuming for
    CORSIM.
  • Thus a fast simulator is needed, one which
    represents those features of CORSIM that allow
    the data to be related to model inputs.

12
(No Transcript)
13
Structure of the fast simulator
  • It is a probability network
  • with the same nodal structure as CORSIM's
  • with unknown inputs λ (vehicle inter-arrival
    rates) and P (turning probabilities) that mean
    the same as in CORSIM
  • but with instantaneous vehicles that (i) enter
    the network, (ii) turn appropriately, and (iii)
    exit.
  • Note: fast simulators often have a limited
    purpose and are not general replacements for the
    computer model; here, we ignore the key features
    of time, interactions, signals, etc.
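
The structure described above can be sketched with a toy probability network. This is only an illustration under stated assumptions: the two-intersection layout, the node names, and the turning probabilities are all hypothetical, and the real fast simulator mirrors the 44-intersection CORSIM network.

```python
import random

# Hypothetical toy network: for each intersection, where each move leads.
# "exit" means the vehicle leaves the network.
NEXT = {
    "A": {"left": "exit", "through": "B", "right": "exit"},
    "B": {"left": "exit", "through": "exit", "right": "exit"},
}
# Hypothetical turning probabilities P for each intersection.
P = {
    "A": {"left": 0.2, "through": 0.7, "right": 0.1},
    "B": {"left": 0.3, "through": 0.5, "right": 0.2},
}

def route_vehicle(entry, rng):
    """Move one instantaneous vehicle from its entry node to the exit,
    returning the (intersection, move) pairs it made along the way."""
    path, node = [], entry
    while node != "exit":
        moves = list(P[node])
        move = rng.choices(moves, weights=[P[node][m] for m in moves])[0]
        path.append((node, move))
        node = NEXT[node][move]
    return path

def simulate(n_entering, entry="A", seed=0):
    """Tally turning counts at every intersection for n_entering vehicles."""
    rng = random.Random(seed)
    counts = {node: {m: 0 for m in P[node]} for node in P}
    for _ in range(n_entering):
        for node, move in route_vehicle(entry, rng):
            counts[node][move] += 1
    return counts

counts = simulate(1000)
```

Because there is no notion of time or vehicle interaction, a run costs only a handful of random draws per vehicle, which is what makes thousands of runs affordable.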

14
(No Transcript)
15
Modeling the demand counts data
  • Demand counts: each demand count, C_i^D, is
    modelled by a Poisson distribution with mean
    b_i N_i, where N_i is the true count and b_i - 1
    is the unknown observer bias.
  • The b_i are modelled as being i.i.d. Gamma(α, β),
    with α < 2β (so that the expected bias is less
    than 100%); α and β are otherwise unknown and
    are assigned a uniform prior distribution.
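
The restriction on the two Gamma parameters can be read as a bound on the expected bias. The short derivation below assumes the Gamma distribution is parameterized by a shape and a rate, written here as α and β; both the labels and the parameterization are assumptions, since the slide does not state them.

```latex
% Assumption: Gamma(\alpha, \beta) has shape \alpha and rate \beta, with \beta > 0.
\[
  \mathbb{E}[b_i] = \frac{\alpha}{\beta},
  \qquad
  \alpha < 2\beta
  \;\Longleftrightarrow\;
  \mathbb{E}[b_i] < 2
  \;\Longleftrightarrow\;
  \mathbb{E}[b_i - 1] < 1 = 100\%.
\]
```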

16
Modeling the turning counts data
  • If N_i vehicles arrive at an intersection from a
    given direction, the numbers turning right, left,
    and going through, (N_i^R, N_i^L, N_i^T), are
    assumed to follow a multinomial distribution with
    probabilities (P_i^R, P_i^L, P_i^T).
  • The (P_i^R, P_i^L, P_i^T) are assigned the
    Jeffreys prior distribution,
    π(P_i^R, P_i^L, P_i^T) ∝ (P_i^R P_i^L P_i^T)^(-1/2).
  • The observed turning counts, C_i^T, were assumed
    to be accurate.
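
With multinomial counts and this Jeffreys prior, the turning probabilities at one approach have a Dirichlet distribution once the counts are given, with each count increased by 1/2. The sketch below draws from that distribution using only the standard library. The counts are hypothetical, and in the full model the true counts are latent, so this shows only the conditional step for fixed counts.

```python
import random

def sample_turning_probs(n_right, n_left, n_through, rng):
    """Draw (P_R, P_L, P_T) from Dirichlet(n + 1/2), the conjugate update of
    the Jeffreys prior for multinomial counts, via normalized Gamma draws."""
    gammas = [rng.gammavariate(n + 0.5, 1.0) for n in (n_right, n_left, n_through)]
    total = sum(gammas)
    return tuple(g / total for g in gammas)

rng = random.Random(1)
# Hypothetical turning counts at one approach: 30 right, 20 left, 150 through.
draws = [sample_turning_probs(30, 20, 150, rng) for _ in range(2000)]
mean_through = sum(d[2] for d in draws) / len(draws)  # exact mean is 150.5 / 201.5
```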

17
Latent Variables and Restrictions
  • Introduce latent N_i, the counts on all streets,
    subject to restrictions:
  • the total number of vehicles entering an
    intersection must equal the number leaving
  • the video counts, assumed to be accurate, lead to
    known values of some sums of these N_i
  • Eliminate excess N_i (from an initial ?? to 74),
    in such a way that the restrictions have a simple
    structure. (Poster by G. Molina.)
  • Let N denote the constrained region of the N_i.
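
The two kinds of restriction can be written as a simple membership check for the constrained region. The data layout, the function name, and the numbers below are hypothetical; the talk does not describe how the restrictions were actually encoded.

```python
def satisfies_restrictions(inflow, outflow, video_sums):
    """Return True when the latent counts obey both kinds of restriction.

    inflow, outflow: dicts mapping an intersection to the list of latent
    counts on the streets entering / leaving it (hypothetical layout).
    video_sums: list of (counts, known_total) pairs from the video data.
    """
    conserved = all(sum(inflow[k]) == sum(outflow[k]) for k in inflow)
    matches_video = all(sum(counts) == total for counts, total in video_sums)
    return conserved and matches_video

# Hypothetical intersection: 120 + 80 vehicles enter, 150 + 50 leave.
inflow = {"x": [120, 80]}
outflow = {"x": [150, 50]}
ok = satisfies_restrictions(inflow, outflow, [([120, 80], 200)])
# Ten vehicles go missing on the way out, so conservation fails.
bad = satisfies_restrictions(inflow, {"x": [150, 40]}, [([120, 80], 200)])
```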

18
(No Transcript)
19
(No Transcript)
20
(No Transcript)
21
(No Transcript)
22
The posterior distribution
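  • By Bayes' theorem, the posterior distribution,
    π(N, λ, P, b, α, β | C), of all unknowns given the
    data C, is simply proportional to the product of
    the likelihood and the prior, i.e.
  • \[
    \pi(N, \lambda, P, b, \alpha, \beta \mid C) \propto
    f_{\text{Poisson}}(C^D \mid N^D, b)\, f_{\text{multinomial}}(C^T \mid P)
    \times \pi_{\text{multinomial}}(N \mid P)\, \pi_{\text{Poisson}}(N^D \mid \lambda)
    \times \pi_{\text{Jeffreys}}(P, \lambda)\, \pi_{\text{Gamma}}(b \mid \alpha, \beta)\,
    \mathbf{1}_{\{\alpha < 2\beta\}}\, \mathbf{1}_{N}.
    \]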

23
Computation
  • The posterior has 192 unknown parameters.
  • Computation must be done by MCMC. We utilize a
    Gibbs sampling scheme.
  • The full conditional distributions for P, λ, b,
    and β are, respectively, Dirichlet, Gamma, Gamma,
    and restricted Gamma; these are easy to sample.
  • α has a log-concave density, so rejection
    sampling is used.
  • Each N_i is sampled directly from its discrete
    distribution (restricted range).
  • Roughly 100,000 iterations are needed.
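
Sampling a latent count directly from a discrete distribution on a restricted range can be sketched by enumerating the allowed values and normalizing their weights. The Poisson-shaped weight, the range 40 to 60, and the function names are stand-ins chosen for illustration; the actual full conditional of each N_i is not given in the transcript.

```python
import math
import random

def sample_restricted_discrete(log_weight, lo, hi, rng):
    """Draw an integer in [lo, hi] with probability proportional to
    exp(log_weight(n)), by enumerating the whole restricted range."""
    support = list(range(lo, hi + 1))
    logs = [log_weight(n) for n in support]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in logs]
    return rng.choices(support, weights=weights)[0]

def poisson_log_weight(mean):
    """Unnormalized log Poisson pmf, used here only as a stand-in conditional."""
    return lambda n: n * math.log(mean) - math.lgamma(n + 1)

rng = random.Random(2)
draws = [sample_restricted_discrete(poisson_log_weight(50.0), 40, 60, rng)
         for _ in range(1000)]
mean_draw = sum(draws) / len(draws)
```

Enumeration is exact and needs no tuning, at a cost proportional to the width of the restricted range per draw.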

24
(No Transcript)
25
(No Transcript)
26
(No Transcript)
27
Gridlock and model constraints
  • In CORSIM, gridlock (all vehicles stopped) can
    occur (20% of the runs in the last graph).
  • This essentially defines the unfeasibility
    region of the parameter space.
  • This can be handled in CORSIM by simply ignoring
    runs that yield gridlock (in the Bayesian
    inference, this corresponds to multiplying the
    posterior by the indicator function of the
    complement of this region).
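
Ignoring gridlock runs amounts to filtering draws with an indicator function. In the sketch below, the gridlock rule and the scalar demand values are hypothetical stand-ins for a full CORSIM run; only the filtering pattern is the point.

```python
import random

def keep_feasible(draws, causes_gridlock):
    """Discard draws in the unfeasibility region, which is equivalent to
    weighting each draw by an indicator that is 0 under gridlock, 1 otherwise."""
    return [d for d in draws if not causes_gridlock(d)]

def causes_gridlock(demand):
    """Hypothetical stand-in for a CORSIM run: gridlock when demand exceeds 1.5."""
    return demand > 1.5

rng = random.Random(3)
draws = [rng.uniform(0.5, 2.0) for _ in range(1000)]
feasible = keep_feasible(draws, causes_gridlock)
```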

28
Conclusions
  • Tuning should be replaced by Bayesian inference
    for unknown parameters or inputs.
  • It may be necessary to constrain the parameter
    space by ignoring model runs that fall inside the
    unfeasibility region.
  • If evaluation of the computer model is too slow,
    fast simulators should be sought for which
    Bayesian inference is feasible.