Title: Discrete Choice Models for Incident Prediction
1 Discrete Choice Models for Incident Prediction
- Donald E. Brown
- Calcott Professor and Chair, Department of Systems and Information Engineering
2 AGENDA
- Incident Prediction
- Spatial Models
- Preferential Point Process Models
- Discrete Choice Models
- Application for Incident Prediction
- Conclusions
3 Incident Prediction Problem
- Inputs
- A series of incidents (e.g., crimes, attacks) in an area of interest and over a fixed time interval
- (Optional) doctrine or subjective behavioral descriptions of the criminals or attackers
- A formal description of the named areas of interest and actions by friendly elements, given by values of features that are known or believed to be relevant to the occurrence of the attacks or incidents
- Output 1: The likelihood that another attack or incident occurs at specified locations within the named area of interest and within a specified time range
- Output 2: The change that occurs in the likelihood of attack over multiple periods
4 Spatial Models
- Grid region into discrete cells
- Cells show measurements and are vector valued
- As with time, space is correlated. Points that
are close in space are more similar in their
measurements than far away points.
x11 x12 x13 x14 x15 x16
x21 x22 x23 x24 x25 x26
x31 x32 x33 x34 x35 x36
x41 x42 x43 x44 x45 x46
x51 x52 x53 x54 x55 x56
x61 x62 x63 x64 x65 x66
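As a minimal sketch of the gridding step, the snippet below bins incident coordinates into a 6 x 6 grid and counts incidents per cell. The function name `grid_counts`, the unit cell size, and the incident coordinates are illustrative assumptions, not values from the talk; a real cell would hold a vector of measurements rather than a single count.

```python
from collections import Counter


def grid_counts(points, x_min, y_min, cell_size, n_rows, n_cols):
    """Count incidents per grid cell; returns a row-major list of lists."""
    counts = Counter()
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        # Points that fall outside the study region are ignored.
        if 0 <= row < n_rows and 0 <= col < n_cols:
            counts[(row, col)] += 1
    return [[counts[(r, c)] for c in range(n_cols)] for r in range(n_rows)]


# Hypothetical incident coordinates (x, y); the last one lies outside the grid.
incidents = [(0.5, 0.5), (0.7, 0.2), (2.1, 1.9), (5.9, 5.9), (7.0, 1.0)]
grid = grid_counts(incidents, 0.0, 0.0, 1.0, 6, 6)
for row in grid:
    print(row)
```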
5 Selected Literature in Spatial Modeling
- STARMA (Cliff et al., 1975)
- Spatial Autoregression (Anselin, 1980)
- Spatial Point Processes (Snyder and Miller, 1991)
- Components of Spatial Modeling (Cressie, 1993)
- Spatial Scan Statistic (Kulldorff, 1997)
- Point Patterns (Diggle, 2003)
- Spatial Preferential Point Processes (Liu and Brown, 2003)
- Discrete Choice Models for Spatial Incident Prediction (Brown and Xue, 2003)
6 Kernel Density Estimation
- Common method for visually identifying hot spots
- Implies only spatial relationships are important
- As a predictive tool the method assumes that what
happened yesterday will happen tomorrow.
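A minimal sketch of a kernel density hot-spot surface, assuming a Gaussian kernel with a single fixed bandwidth (the talk does not say which kernel or bandwidth was used). The names `kde_surface`, `centers`, and the coordinates are hypothetical. The surface is built only from past incident locations, which is why, as a predictor, it repeats yesterday's pattern.

```python
import math


def kde_surface(incidents, cell_centers, bandwidth):
    """Gaussian kernel density at each cell center, normalized to sum to 1."""
    if not incidents:
        raise ValueError("at least one incident is required")
    raw = []
    for cx, cy in cell_centers:
        total = 0.0
        for x, y in incidents:
            d2 = (cx - x) ** 2 + (cy - y) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        raw.append(total)
    z = sum(raw)
    return [v / z for v in raw]


# Centers of a hypothetical 3 x 3 grid of unit cells, listed row by row.
centers = [(c + 0.5, r + 0.5) for r in range(3) for c in range(3)]
# Hypothetical past incidents: two near the first cell, one in the last cell.
past_incidents = [(0.4, 0.6), (0.6, 0.4), (2.5, 2.5)]
surface = kde_surface(past_incidents, centers, bandwidth=0.5)
hottest = max(range(len(surface)), key=lambda i: surface[i])
print(centers[hottest])
```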
7 Preferential Point Processes
- Given a realization of a marked space-time shock point process $\{s \in D,\ t \in T,\ X_{s,t} \in \chi\}$ (locations, times, and feature values), where
- $D \subset \mathbb{R}^2$ is the study region or geographical space
- $T \subset \mathbb{R}$ is the study horizon
- $\chi \subset \mathbb{R}^p$ is the feature space
- Estimate the transition density $\psi_n(s_{n+1}, t_{n+1} \mid D_n, T_n, \chi_n)$, where
- $D_n = \{s_1, s_2, \ldots, s_n\}$
- $T_n = \{t_1, t_2, \ldots, t_n\}$, and
- $\chi_n = \{X_1, X_2, \ldots, X_n\}$
8 Preferential Point Processes Model Construction
- First decomposition: separating space and time, we model each aspect with a conditional density function
Assumptions
1. Feature space does not contain temporal features
2. Temporal evolution does not depend on spatial evolution (not essential)
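One way to write this decomposition is the chain rule below, a sketch rather than the exact form on the original slide. Here $\psi_n$ is the transition density for the next location $s_{n+1}$ and time $t_{n+1}$ given the past locations $D_n$, times $T_n$, and feature values $\chi_n$. The superscripts (1) and (2) are labels introduced here for the spatial and temporal factors. The second line applies assumption 2, under which the temporal factor is taken to depend only on past event times.

```latex
\begin{align}
\psi_n(s_{n+1}, t_{n+1} \mid D_n, T_n, \chi_n)
  &= \psi_n^{(1)}(s_{n+1} \mid t_{n+1}, D_n, T_n, \chi_n)\,
     \psi_n^{(2)}(t_{n+1} \mid D_n, T_n, \chi_n) \\
  &= \psi_n^{(1)}(s_{n+1} \mid t_{n+1}, D_n, T_n, \chi_n)\,
     \psi_n^{(2)}(t_{n+1} \mid T_n)
     \quad \text{(under assumption 2)}
\end{align}
```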
9 Model Components
10 Components of Preferential Point Process Model
11 Point Processes in Feature Space
- Each event location in space/time maps to a location in feature space
- Some feature values (key features) are related to the occurrence of events
- Cliques in key feature space define site selection preferences
- Models in feature space enable us to predict events outside the hot spot regions: Anticipate!
12 Example Applications
- Law enforcement
- Breaking and entering analysis for Richmond, VA (Liu and Brown 2003) showed significant improvement over kernel density estimates for predicting criminal incidents
- Counter-Terrorism
- Model developed for suicide bombings in Israel
- Significant performance improvements over kernel density estimates (Brown, et al., 2004)
13 Richmond Application - Data Acquisition
- 579 completed forcible Breaking and Entering incidents between July 1, 1997 and Aug. 31, 1997.
- Feature data (100 features)
- Demographic counts
- Consumer expenditures
- Distances to geographic landmarks
- Feature data are coarse
- Areal census data
- Errors inherent in distance to highway calculation
14 Preferential Point Process (Mixture)
- Training: July 7-20. Testing: the following 1 week and 2 weeks.
15 Preferential Point Process (WPK)
- Training: July 7-20. Testing: the following 1 week and 2 weeks.
16 Preferential Point Process (FPK)
- Training: July 7-20. Testing: the following 1 week and 2 weeks.
17 Suicide Bombing Study Region
Suicide bombing incidents were analyzed for all of Israel. To evaluate the model, a smaller study region was selected in the Jerusalem area. The preliminary urban model for a particular group was calculated for the area defined by the cyan box on the image to the left. This area represents most if not all of Jerusalem proper, with leading edges into the West Bank.
18 Test Results with Later Incidents
19 Approaches with Explicit Decision Models
Motivation
- Spatial decision making: offenders choose the place of a crime based on attributes at that place (Brantingham and Brantingham 1975, Molumby 1976, Newman 1972, Repetto 1974, Scarr 1973)
- Journey to crime: distance to the place of the crime is important (Amir 1971, Baldwin and Bottoms 1976, Capone and Nichols 1976, LeBeau 1987, Rossmo 1993, Rossmo 1994)
- Spatial alternatives have three components
- target attributes (e.g., protection characteristics of the victim)
- location (e.g., distance to other features)
- time (e.g., time from a motivating speech)
20 Random Utility Maximization
21 Modeling Criminal and Terrorist Spatial Decisions
- Derived from discrete choice model
- Alternatives are discrete spatial and temporal points
- The number of alternatives is very large
- Depends on the size of the grid
- Feature components: characteristics of the spatial alternatives
- Aggregate alternatives
- Decision makers are not considering all possible alternatives
- Chunk alternatives using clustering
- Hierarchical DCM
- Aggregation based on feature selection
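To make the choice step concrete, here is a minimal sketch of a multinomial logit, the standard discrete choice form in which the probability of picking alternative i is exp(V_i) divided by the sum of exp(V_j) over all alternatives. It assumes a utility V_i that is linear in the features of each aggregated spatial alternative. The feature columns, weights, and function name are hypothetical and are not estimates from the talk.

```python
import math


def choice_probabilities(features, weights):
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j), with V_i = w . x_i."""
    utilities = [sum(w * x for w, x in zip(weights, row)) for row in features]
    v_max = max(utilities)  # subtract the max for numerical stability
    exp_v = [math.exp(v - v_max) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]


# Hypothetical aggregated alternatives: [distance to road, population density].
alternatives = [[0.2, 1.0], [1.5, 0.3], [0.2, 1.0]]
# Hypothetical weights: distance lowers utility, density raises it.
weights = [-1.0, 0.5]
probs = choice_probabilities(alternatives, weights)
print(probs)
```

Aggregating cells into clusters before this step keeps the denominator small, which is the practical reason for chunking a very large set of alternatives.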
22 Feature Selection
- Feature selection methods
- Simple attribute ranking
- Forward and backward selection
- Branch-and-bound selection
- Clustering
- Example feature selection criteria
- Gini index
- Entropy
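The two criteria can be sketched as follows, together with a simple attribute ranking that scores each feature by the weighted impurity left after a single split. Splitting at the feature's median is an assumption made here for illustration; the feature names and values are hypothetical.

```python
import math
import statistics
from collections import Counter


def gini_index(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_score(values, labels, threshold, impurity):
    """Weighted impurity after splitting on value <= threshold (lower is better)."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    return sum(len(part) / n * impurity(part) for part in (left, right) if part)


# Hypothetical cells: label 1 = incident occurred, 0 = no incident.
labels = [1, 1, 0, 0]
features = {
    "dist_highway": [0.1, 0.2, 0.9, 1.0],
    "income": [3.0, 1.0, 2.0, 4.0],
}
ranking = sorted(
    ((name, split_score(vals, labels, statistics.median(vals), gini_index))
     for name, vals in features.items()),
    key=lambda item: item[1],
)
print(ranking)
```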
23 Hierarchical Choice
24 Estimation for Logistic Models
25 Models Considered
- Logistic Models
- main effects
- quadratic
- interaction
- Tree-based
- Generalized Additive Models
26 Modeling Nonlinearities in the Choice Process
Generalized Additive Models (GAM) provide a mechanism to model nonlinearities in the relationship between the spatial features and the probability that a location is chosen for the attack. The nonlinear functions are shown as $f(X_i)$.
27 Splines in GAM
- We use restricted cubic splines for $f(X_i)$.
- Spline Components
- Connection points are called knots and their number can vary depending on the data
- Cubic splines fit curved data better than linear splines
- Cubic splines can be made to join smoothly at the knots
- Constraining the function to be linear in the tails improves performance
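A minimal sketch of a restricted cubic spline basis, using the common truncated-power construction in which k knots yield one linear term plus k - 2 nonlinear terms. Whether the talk used exactly this parameterization is an assumption, and the knot positions are hypothetical. The check at the end illustrates the tail constraint: beyond the last knot the nonlinear term grows linearly.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for a scalar x: [x, nonlinear terms...]."""
    if len(knots) < 3:
        raise ValueError("need at least 3 knots")
    t = sorted(knots)
    t_last, t_prev = t[-1], t[-2]
    span = t_last - t_prev

    def pos3(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    basis = [x]
    for tj in t[:-2]:
        term = (pos3(x - tj)
                - pos3(x - t_prev) * (t_last - tj) / span
                + pos3(x - t_last) * (t_prev - tj) / span)
        basis.append(term)
    return basis


knots = [0.0, 1.0, 2.0, 3.0]  # hypothetical knot positions
# Beyond the last knot the cubic and quadratic parts cancel, so equal steps
# in x give equal steps in the basis term.
tail = [rcs_basis(x, knots)[1] for x in (4.0, 5.0, 6.0)]
print(tail[1] - tail[0], tail[2] - tail[1])
```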
28 Geographic Information System Implementation
We use multiple GIS layers (topography, transportation networks, demographics, economic features, etc.) to construct a discrete suitability surface representation. The algorithm searches over cells and scores them accordingly.
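The cell-scoring idea can be sketched as a weighted overlay of raster layers. The weighted sum is an assumption made here for illustration (the talk does not state how layers are combined), and the layer names, values, and weights are hypothetical.

```python
def suitability_surface(layers, weights):
    """Combine equally sized raster layers into one score per cell (weighted sum)."""
    names = list(layers)
    n_rows = len(layers[names[0]])
    n_cols = len(layers[names[0]][0])
    return [[sum(weights[name] * layers[name][r][c] for name in names)
             for c in range(n_cols)]
            for r in range(n_rows)]


def top_cells(surface, k):
    """Return the (row, col) indices of the k highest-scoring cells."""
    scored = [(score, r, c)
              for r, row in enumerate(surface)
              for c, score in enumerate(row)]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(r, c) for _, r, c in scored[:k]]


# Hypothetical 2 x 2 layers, each already scaled to the range [0, 1].
layers = {
    "road_proximity": [[1.0, 0.0], [0.5, 0.0]],
    "population": [[0.2, 0.9], [0.5, 0.1]],
}
layer_weights = {"road_proximity": 0.6, "population": 0.4}
surface_scores = suitability_surface(layers, layer_weights)
best = top_cells(surface_scores, 2)
print(best)
```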
29 Example Terrain Suitability
Slope
Surface Material
Terrain/Doctrine-based prior field
Vegetation
Roads, Water, Obstacles
30 Example Applications
- Law Enforcement
- Richmond breaking and entering data
- Linear main effects model
- Compared predictions on test sets
- Reject hypothesis of equality in methods (p = 0.005)
- Counter-Terrorism
- Data from asymmetric warfare attacks
- GAM
- Method showed significance in ROC
31 Richmond Choice Model Results
32 Asymmetric Warfare Attacks Against the U.S.
- Attacks take many forms
- suicide bombings
- improvised explosive devices
- hostage taking
- mortar and rocket attacks
- Complex attacks
- The incident on the right was a suicide bombing at a police station in Iraq that occurred on February 12, 2004 and killed 47 people
Hull, Bryson, "100 die in two Iraq suicide bombings," The Age, February 12, 2004, http://www.theage.com.au/
33 Example of Asymmetric Warfare: IED Attacks in Iraq
- Major method of attacking U.S. forces in Iraq
- Responsible for more U.S. deaths than any other attack mode
- Inexpensive, easy to deploy, and deadly
- Picture on right shows U.S. troops with IED on March 15, 2004
- Models of insurgent decision making are predictive of attacks
Picture from http://www.middle-east-online.com/english/?id=9250
34 Example Features F3
35 Example Features F5
36 October Threat Surface
37 October Surface with Attack Points
38 Evaluating Predictive Models
ROC Curve
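A minimal sketch of how an ROC curve can be computed for a threat surface: cells are ranked by predicted score, and each threshold yields a false positive rate and a true positive rate. The scores and labels below are hypothetical, with label 1 marking a cell where an attack later occurred.

```python
def roc_curve(scores, labels):
    """Return ROC points (FPR, TPR) from sweeping a threshold over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    pairs = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every cell tied at this score before emitting a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical surface scores and later-attack indicators for six cells.
cell_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
attacked = [1, 1, 0, 1, 0, 0]
roc_points = roc_curve(cell_scores, attacked)
area = auc(roc_points)
print(roc_points, area)
```

An area of 0.5 corresponds to random ranking, so a useful model should sit clearly above that.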
39 DCM Evaluation with KDE
- Comparison of density (surface) values at actual attack points
- KDE and DCM were normalized to 1
- Hypotheses
- $H_0: \mu_D - \mu_K = 0$
- $H_a: \mu_D - \mu_K > 0$
- DCM results show we can reject $H_0$
- Wilcoxon $p < .01$
- Results true for multiple DCM forms
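The comparison can be sketched as a one-sided Wilcoxon signed-rank test on the paired surface values at each attack point. Treating the test as the paired (signed-rank) variant is an assumption, since the slide says only "Wilcoxon". The p-value uses the normal approximation without continuity or tie corrections, and the surface values are hypothetical.

```python
import math


def wilcoxon_signed_rank(dcm_values, kde_values):
    """One-sided signed-rank test of H_a: DCM values tend to exceed KDE values.

    Returns (w_plus, z, p_value) using the normal approximation, so it is
    only a rough guide for small samples.
    """
    diffs = [d - k for d, k in zip(dcm_values, kde_values) if d != k]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for m in range(i, j + 1):
            ranks[order[m]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail probability
    return w_plus, z, p_value


# Hypothetical normalized surface values at ten actual attack points.
dcm = [0.012, 0.015, 0.009, 0.020, 0.011, 0.018, 0.014, 0.016, 0.013, 0.017]
kde = [0.010, 0.011, 0.010, 0.012, 0.009, 0.011, 0.010, 0.012, 0.011, 0.010]
w_plus, z, p_value = wilcoxon_signed_rank(dcm, kde)
print(w_plus, z, p_value)
```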
40 Conclusions
- Process models that account for preference can perform incident prediction
- Discrete choice models provide explicit representations of an opponent's utility functions
- Both modeling approaches have shown good results on real data from law enforcement and terrorism
- Models can account for multiple decision making groups but performance has yet to be tested
41 Questions?