Title: The case for aggregate econometric accident models
1 The case for aggregate econometric accident models
- Presentation at the 2006 TRB Annual Meeting
- Lasse Fridstrøm
- Managing Director
- Institute of Transport Economics (TØI)
- Oslo, Norway
- lef_at_toi.no
- www.toi.no
2 Outline
- Conclusions
- Errors of aggregation and disaggregation
- On the fundamental randomness of accident counts
- The case for (generalized) Poisson regression
- Random vs. systematic variation
- Interpreting the goodness-of-fit
- The normal approximation to the Poisson model
- Selected results from the TRULS model
3 Conclusions (1)
- The econometric toolbox is extremely well suited
for accident analysis. In fact, it is better
suited for this than for economics.
- Various substantive as well as methodological
arguments suggest that accident models should be
at least moderately aggregate.
- Econometric accident models could be temporal,
cross-sectional or, preferably, combined
spatio-temporal.
- Autoregressive accident models make little sense.
4 Conclusions (2)
- Accidents are fundamentally random and subject to
the whitest noise in behavioral science.
- The analyst working with accident data has access
to more information than is usual in econometrics.
Accident counts are Poisson distributed and
should be analyzed as such. Their (objective)
variance equals their mean. This information
should be exploited.
- There is an upper bound on the amount of
variation which an accident relation can and
should explain. This bound is computable.
- Although random, accident counts obey a causal
(systematic) probability law. Models should
distinguish clearly between random and systematic
variation.
5 Conclusions (3)
- Victim counts are overdispersed, i.e. they
exhibit larger variances than under the Poisson
law.
- Victim counts may be analyzed by generalized
Poisson models (negative binomial models).
- Large accident counts can be modeled as a
heteroskedastic Gaussian process. Even so, the
variance structure should reflect the Poisson
property var(y) = E(y).
6 Conclusions (4)
- The most important explanatory factor in any
accident model is exposure.
- Accident and risk functions are multiplicative
rather than additive.
- Accident models are useful in estimating a
variety of policy relevant parameters, including:
- the marginal external accident cost,
- the (marginal) contribution of various road user
categories to risk,
- the effect of accident countermeasures, and
- the importance of behavioral response (risk
compensation).
7 Determinants of road accident counts: a general
taxonomy
- Factors outside the national social system
- General socioeconomic conditions
- Size and structure of transportation sector
(affecting, e.g., risk exposure)
- Accident reporting
- Randomness
- Accident countermeasures
8 Why aggregate models?
- The accident generating process is very complex,
involving a large number of determinants, forces
and policy variables operating at the societal
level.
- Accident data are non-experimental and must be
analyzed by non-experimental methods.
- The atomistic fallacy: exposure and accidents
migrate between micro units. Units should be
large enough to absorb such migration.
- Micro units may be subject to selectivity bias.
- Accidents are rare.
- One needs to strike a balance between (i) the
accuracy and (ii) cost of measurement, (iii) the
random noise affecting casualty counts, and (iv)
the atomistic and (v) aggregative fallacies of
inference.
9 The individual accident is unpredictable
- Had the individual accident been anticipated, it
would not have happened! It is thus logically
unpredictable.
- No matter how much we learn about accident
generating mechanisms or countermeasures, we
would never be able to predict exactly where,
when, and by whom the single accident is going to
occur.
- Our failure to predict the single accident is not
a matter of incomplete knowledge. The randomness
involved is ontological (objective) rather than
epistemic (subjective). It is a feature of the
real world, not only of how we (fail to)
understand it.
10 Eeyore is right
- "I'm not saying there won't be an Accident now,
mind you. They're funny things, Accidents. You
never have them till you're having them." (A.A.
Milne, The House at Pooh Corner)
11 The law of rare events
Consider a time-varying random variable Y(t) such
that
Then
the number of events occurring during any
interval of length t (say) has a Poisson
distribution with mean
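One standard textbook way to state such conditions, and the resulting distribution, is sketched here. The constant intensity \lambda and the short interval \Delta t are notation assumed for this sketch, not taken from the slide.

```latex
\begin{align*}
&\text{(i)}\quad P\bigl[Y(t+\Delta t)-Y(t)=1\bigr] = \lambda\,\Delta t + o(\Delta t),\\
&\text{(ii)}\quad P\bigl[Y(t+\Delta t)-Y(t)\ge 2\bigr] = o(\Delta t),\\
&\text{(iii)}\quad \text{increments over disjoint intervals are independent.}\\[4pt]
&\text{Then}\quad P\bigl[Y(s+t)-Y(s)=m\bigr]
   = \frac{(\lambda t)^{m}\,e^{-\lambda t}}{m!},\qquad m = 0, 1, 2, \dots
\end{align*}
```

Under these assumptions the count over an interval of length t has mean (and variance) \lambda t.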
12 The Poisson distribution
- There are compelling theoretical and empirical
reasons to assume that accident counts are
Poisson distributed.
- The Poisson is a one-parameter distribution.
When we know the mean, we also know how much
variance to expect around it! The coefficient of
variation decreases with the mean.
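A minimal sketch of the last point: because the Poisson variance equals the mean, the coefficient of variation is sqrt(mean)/mean. The function name `poisson_cv` and the example means are illustrative choices, not taken from the slides.

```python
import math


def poisson_cv(mean: float) -> float:
    """Coefficient of variation (std dev / mean) of a Poisson variate."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    # For a Poisson variate the variance equals the mean,
    # so the standard deviation is sqrt(mean).
    return math.sqrt(mean) / mean


# Illustrative expected accident counts (hypothetical values).
for m in (1, 4, 25, 100, 400):
    print(f"mean={m:>4}  sd={math.sqrt(m):6.2f}  cv={poisson_cv(m):.3f}")
```

The relative noise shrinks as counts grow, which is one reading of why moderately aggregate units are easier to model.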
13 Poisson distribution: 95 per cent probability
bounds around expected value
[Figure: axes are observed number and expected number]
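Bounds of this kind can be computed directly from the Poisson probability function. The sketch below assumes a central interval with 2.5 per cent in each tail; the slide does not say how its bounds were constructed, and the names `poisson_pmf` and `poisson_bounds` are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poisson(lam), computed on the log scale for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_bounds(lam: float, coverage: float = 0.95):
    """Return (lower, upper) quantiles leaving (1 - coverage)/2 in each tail."""
    tail = (1.0 - coverage) / 2.0
    cdf, k, lower = 0.0, 0, None
    while True:
        cdf += poisson_pmf(k, lam)
        if lower is None and cdf >= tail:
            lower = k  # smallest k whose cumulative probability reaches the lower tail
        if cdf >= 1.0 - tail:
            return lower, k  # smallest k whose cumulative probability reaches the upper tail
        k += 1


# Illustrative expected values (hypothetical).
for lam in (1.0, 10.0, 100.0):
    lo, hi = poisson_bounds(lam)
    print(f"expected={lam:>6}: bounds=({lo}, {hi}), relative width={(hi - lo) / lam:.2f}")
```

The interval widens in absolute terms but narrows relative to the expected value as that value grows.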
14 The negative binomial distribution
Suppose the Poisson parameter is itself random,
and drawn from a gamma distribution with a given
shape parameter. In this case the
observed number of accidents can be shown to
follow a negative binomial distribution whose
variance exceeds its expected value, the excess
being governed by the overdispersion parameter.
- Two interpretations:
- Unobserved heterogeneity (Greenwood and Yule 1920)
- True contagion (Eggenberger and Pólya 1923)
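A numerical check of the gamma-Poisson mixture, as a sketch. It assumes the common parameterization in which the gamma shape is 1/theta and the mean is mu, giving variance mu(1 + theta*mu); the symbols `mu` and `theta` and the function names are assumptions of this example, not notation from the slide.

```python
import math


def negbin_pmf(y: int, mu: float, theta: float) -> float:
    """Negative binomial P(Y = y) with mean mu and overdispersion theta > 0."""
    xi = 1.0 / theta  # gamma shape parameter of the mixing distribution
    log_p = (
        math.lgamma(y + xi) - math.lgamma(xi) - math.lgamma(y + 1)
        + xi * math.log(xi / (xi + mu))
        + y * math.log(mu / (xi + mu))
    )
    return math.exp(log_p)


def negbin_moments(mu: float, theta: float, k_max: int = 2000):
    """Mean and variance obtained by summing the pmf over 0..k_max-1."""
    probs = [negbin_pmf(y, mu, theta) for y in range(k_max)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


mu, theta = 5.0, 0.5  # hypothetical values
mean, var = negbin_moments(mu, theta)
print(f"mean={mean:.4f}  variance={var:.4f}  mu*(1+theta*mu)={mu * (1 + theta * mu):.4f}")
```

As theta approaches zero the variance falls back to the mean, i.e. the pure Poisson case.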
15 Generalized Poisson variates
- Integer valued: 0, 1, 2, ...
- Zero occurrences OK.
- Poisson: invariance under summation.
- Non-negative outcome and positive expected
value. Suggests multiplicative structure of
cofactors/independent variables.
- Estimable through maximum likelihood (ML)
methods.
- ML implicitly takes account of heteroskedasticity.
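A minimal sketch of ML estimation for a multiplicative Poisson model with a single factor, E(y) = exp(b0) * exposure^b1, fitted by Newton-Raphson. The data, the function name `fit_poisson_loglinear` and the one-variable specification are hypothetical simplifications, not the specification of any model discussed in this presentation.

```python
import math


def fit_poisson_loglinear(exposure, counts, max_iter=50, tol=1e-10):
    """ML fit of E(y) = exp(b0 + b1 * ln(exposure)) by Newton-Raphson."""
    z = [math.log(x) for x in exposure]
    b0 = math.log(sum(counts) / len(counts))  # start at the overall mean
    b1 = 0.0
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * zi) for zi in z]
        # Score vector: residuals, unweighted and weighted by the regressor.
        g0 = sum(y - m for y, m in zip(counts, mu))
        g1 = sum((y - m) * zi for y, m, zi in zip(counts, mu, z))
        # Information matrix: each observation is weighted by its fitted
        # mean, which is also its Poisson variance (heteroskedasticity).
        h00 = sum(mu)
        h01 = sum(m * zi for m, zi in zip(mu, z))
        h11 = sum(m * zi * zi for m, zi in zip(mu, z))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical data: counts equal to 2 * sqrt(exposure).
exposure = [1.0, 4.0, 9.0, 16.0, 25.0]
counts = [2, 4, 6, 8, 10]
b0, b1 = fit_poisson_loglinear(exposure, counts)
print(f"scale={math.exp(b0):.3f}  elasticity w.r.t. exposure={b1:.3f}")
```

In this log-linear form the coefficient b1 is directly the elasticity of the expected count with respect to exposure.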
16 Probabilistic theories are complete
- Einstein: "He [God] does not play dice."
- Salmon (1984): Certain laws are irreducibly
statistical, i.e. they include an inevitable, objectively
random component. Single events may occur at
random intervals, but with an almost constant
overall frequency in the long run. Such laws are
common in particle physics, but rare in
behavioral science.
- Although the single event is all but impossible
to predict, the collection of such events may
very well behave in a perfectly predictable way,
amenable to description by means of precise
mathematical-statistical relationships.
- Examples: radioactive decay (C14 method), die tossing,
road accidents.
17 Random and systematic variation coexist
While the u terms are probabilistically
independent, the systematic terms are
functionally dependent on certain common factors
and hence empirically correlated.
18 The linear probability model
y = systematic (causal) part + random part
19 Autoregression is overfitting
Trying to explain the causal part in period t by
means of the white noise in periods t-1, t-2, etc.!
20 Misspecification may show up as overdispersion
- Suppose one relevant variable has been left out.
- In this case some systematic variation is indeed
contained in the error term.
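The mechanism can be sketched with the law of total variance. The omitted variable z and the conditional mean \lambda(z) are notation assumed here for illustration.

```latex
\begin{align*}
y \mid z &\sim \text{Poisson}\bigl(\lambda(z)\bigr),\\
E(y) &= E\bigl[\lambda(z)\bigr],\\
\operatorname{var}(y)
  &= E\bigl[\operatorname{var}(y \mid z)\bigr] + \operatorname{var}\bigl(E[y \mid z]\bigr)
   = E\bigl[\lambda(z)\bigr] + \operatorname{var}\bigl[\lambda(z)\bigr]
   \;\ge\; E(y).
\end{align*}
```

Unless \lambda(z) is constant in z, the marginal variance exceeds the mean, so the omitted systematic variation appears as overdispersion.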
21 The upper bound on explanatory power is
computable
- On account of the Poisson assumption, it is
possible, for a given accident data set, to
calculate the normal amount of random variation
and hence also the maximal amount of explainable,
systematic variation.
- Using this information, one may calculate
goodness-of-fit measures for the systematic
variation of interest, thus comparing the
explained to the explainable.
- See AAP vol. 27, pp. 1-20 (1995)
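A simplified sketch of the idea, not necessarily the exact measures defined in the cited paper: under the Poisson assumption the expected random variation of each count equals its expected value, so the sum of fitted values estimates the unexplainable part of the total sum of squares. The data and the function names `explainable_share` and `explained_share` are hypothetical.

```python
def total_variation(observed):
    """Total sum of squares around the sample mean."""
    mean_y = sum(observed) / len(observed)
    return sum((y - mean_y) ** 2 for y in observed)


def explainable_share(observed, fitted):
    """Upper bound on the share of variation a model should explain."""
    # Poisson property: var(y_i) = E(y_i), estimated by the fitted value.
    random_part = sum(fitted)
    return 1.0 - random_part / total_variation(observed)


def explained_share(observed, fitted):
    """Ordinary R-squared of the fitted values."""
    residual = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1.0 - residual / total_variation(observed)


# Hypothetical accident counts and fitted expected values.
observed = [2, 8, 5, 15, 20]
fitted = [5, 4, 9, 11, 21]

bound = explainable_share(observed, fitted)
r2 = explained_share(observed, fitted)
print(f"explainable={bound:.3f}  explained={r2:.3f}  ratio={r2 / bound:.3f}")
```

A ratio well above one would suggest that the model is fitting noise rather than systematic variation.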
22 Randomness accounts for a large part of the variation
in smaller accident counts
Source: AAP 27 (1): 1-20 (1995)
23 Victim counts are overdispersed.
Source: AAP 27 (1): 1-20 (1995)
24 Overdispersed 95 per cent probability interval
around trend-fitted annual road fatalities in
Norway.
Source: Elvik (2005), TØI report 792
25 Is (generalized) Poisson regression the only way
to go?
- No. The limiting distribution of the Poisson is
the normal. Approximation is good already for
mean 10 and above.
- But dependent variable should be log-transformed.
- Heteroskedasticity should be accounted for
through appropriate weighting. This requires
iteration and sometimes cumbersome
transformations.
- Box-Cox regression models are useful, since for
many partial relationships, curvature is not
known a priori.
26 For large Poisson counts y, the variance of ln(y)
is inversely proportional to the expected value E(y).
The Box-Tukey constant a is needed, since the
log of a Poisson variate has infinite variance
(y = 0 occurs with positive probability).
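The claim can be checked numerically by summing over the Poisson probability function. This sketch writes the expected value as `lam` and the added constant as `a`; both names, the value a = 0.1 and the chosen expected values are assumptions of this example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poisson(lam), computed on the log scale."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def var_log_shifted(lam: float, a: float) -> float:
    """Variance of ln(y + a), y ~ Poisson(lam), by direct summation (a > 0)."""
    # Sum far enough into the upper tail that the omitted mass is negligible.
    k_max = int(lam + 12.0 * math.sqrt(lam) + 30.0)
    m1 = m2 = 0.0
    for k in range(k_max + 1):
        p = poisson_pmf(k, lam)
        v = math.log(k + a)
        m1 += p * v
        m2 += p * v * v
    return m2 - m1 * m1


a = 0.1  # hypothetical small constant keeping ln(y + a) finite at y = 0
for lam in (0.5, 2.0, 10.0, 100.0):
    exact = var_log_shifted(lam, a)
    print(f"lam={lam:>6}: var ln(y+a)={exact:.4f}  1/lam={1 / lam:.4f}")
```

For large expected values the two columns nearly coincide; for small ones they diverge and the result depends strongly on the constant a.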
27 The variance of ln(y + a), where y is Poisson
distributed with expected value E(y).
Source: TØI report 457
28 The asymptotic approximation is very inaccurate
for small accident counts
Expected values ranging from 0.000248 to 692
Source: TØI report 457
29 The TRULS model for Norway: a member of the DRAG
family
- Recursive system of equations at the county and
month level: 19 counties x 264 months (22 years)
= 5016 observations
- Equations:
- Car ownership
- Exposure: light and heavy vehicle road use, MCs,
and public transport
- Seat belt use
- Injury accident frequency
- Severity: fatalities, dangerously injured,
severely injured
- Various casualty subset equations: single vs
multiple vehicle crashes; heavy vehicle crashes;
car occupant, bicyclist, and pedestrian victims;
(non-)seat belt users injured
30 The TRULS model
- Injury accident frequency
- Severity
31 The TRULS model for Norway
- Estimated elasticities w.r.t. exposure, by
severity.
Source: TØI report 457
32 The TRULS model for Norway
- Estimated elasticities w.r.t. exposure, by road
user category.
Source: TØI report 457
33 The TRULS model for Norway: injury accident risk
plotted against traffic density (monthly vehicle
kms per road km). 5016 sample points (19
counties x 264 months).
34 The TRULS model for Norway: elasticities w.r.t.
exposure for various accident types, plotted
against traffic density. 5016 sample points (19
counties x 264 months).
35 The TRULS model for Norway: relative injury
accident frequency as a function of aggregate seat
belt use. 5016 sample points (19 counties x 264
months).
36 According to TRULS, heavy vehicles are 3.8 times
(1.321/0.345) more dangerous than light ones.
Light vehicle road users generate a positive
external accident cost only if their own share
of the accident cost is less than 34 per cent.
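The arithmetic can be retraced as follows. Reading the two quoted figures as accident elasticities with respect to heavy and light vehicle exposure, and using the rule that an external cost arises when the elasticity exceeds the road user's own cost share, are assumptions of this sketch; the function name is hypothetical.

```python
# Figures quoted on the slide; interpreting them as elasticities is an assumption.
heavy = 1.321
light = 0.345

ratio = heavy / light
print(f"heavy/light ratio: {ratio:.2f}")  # the slide rounds this to 3.8


def generates_external_cost(own_share: float, elasticity: float) -> bool:
    """Assumed rule: external cost is positive when elasticity > own cost share."""
    return own_share < elasticity


# Hypothetical own-cost shares for light vehicle road users.
for share in (0.30, 0.40):
    print(f"own share {share:.0%}: external cost positive = "
          f"{generates_external_cost(share, light)}")
```

Under this reading the break-even own share is the light vehicle figure itself, about 34 per cent.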
37 Summary
Occasionally, a humble donkey may have better
ideas than the most ingenious scientist.
38 Thank you for listening!
- Read more:
- www.toi.no
- http://www.toi.no/attach/985/R457_1999.pdf
- Acc. Anal. Prev. 27 (1): 1-20 (1995)