Title: Spatial Distribution Hot Spot Analysis I
1Spatial Distribution Hot Spot Analysis I
2Approaching Hot Spots Analysis
- Determine whether the approach should be general or focused.
- The definition of a Hot Spot must be specified; using theory or empirical evidence will give an indication of the scale, time, similarity and distances to select.
- Select intensity and/or weight variables beyond the X Y locations.
- The number of clusters must be specified, in that there is to be either a fixed or a variable set of clusters.
- Visual display should be based on thematic mapping principles.
3Nearest Neighbor Methods
[Figure: event points and sample points (x), with the distances W, X, X2, Y and Z marked]
W: Event to Nearest Event
X: Sample Point to Nearest Event
X2: Sample Point to 2nd Nearest Event
Y: Event-Nearest-Sample-Point to Nearest Event
Z: Event-Nearest-Sample-Point to Nearest Event in Half-Plane Not Containing Sample Point
Source: Cressie, Noel A.C. (1993) Statistics for Spatial Data, Revised Edition. John Wiley and Sons, Inc., New York, NY, pp. 602-603.
4Hot Spot I Measures
- Point Techniques
  - Mode and Fuzzy Mode
- Hierarchical Techniques
  - Nearest Neighbor Hierarchical Clustering
  - Risk-Adjusted Nearest Neighbor Hierarchical Clustering
5Mode and Fuzzy Mode
6Mode and Fuzzy Mode
- Mode
  - For locations that have multiple incidents.
  - Calculates the frequency of incidents occurring at each single location and ranks the locations from highest to lowest.
  - Is more precise but less flexible.
- Fuzzy Mode
  - For individual incident locations.
  - A fixed-distance search radius counts the number of incidents near a single incident location.
  - Incidents are counted multiple times as neighboring incidents are visited.
  - Is less precise but more flexible.
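The two counting rules above can be sketched in a few lines of Python. This is a minimal illustration, not the routine used by any particular software package; the function names, the coordinates and the 15-unit radius are all hypothetical.

```python
from collections import Counter
from math import hypot

def mode_counts(points):
    """Mode: count incidents at each exact location, ranked highest to lowest."""
    return Counter(points).most_common()

def fuzzy_mode_counts(points, radius):
    """Fuzzy Mode: for each distinct location, count every incident within `radius`."""
    results = []
    for cx, cy in set(points):
        n = sum(1 for (x, y) in points if hypot(x - cx, y - cy) <= radius)
        results.append(((cx, cy), n))
    # Rank from highest to lowest count; ties are broken by coordinates.
    return sorted(results, key=lambda r: (-r[1], r[0]))

# Hypothetical incident coordinates (illustrative only, not data from the slides).
incidents = [(0, 0), (0, 0), (0, 0), (10, 0), (10, 0), (200, 200)]
mode_result = mode_counts(incidents)
fuzzy_result = fuzzy_mode_counts(incidents, radius=15)
print(mode_result)
print(fuzzy_result)
```

In this example the incidents at (0, 0) and (10, 0) fall inside each other's search radius, so each location reports all five of them. That is the multiple counting noted in the Fuzzy Mode bullets.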
7Mode and Fuzzy Mode Input
8Mode and Fuzzy Mode Output
11Fuzzy Mode Statistics Burglary
12Fuzzy Mode Statistics Theft
15Fuzzy Mode Statistics TABC
18Nearest Neighbor Hierarchical Clustering
19NN Hierarchical Clustering
- Two or more incidents are grouped on the basis of a criterion, such as being nearest neighbors, to form first-order clusters.
- Those first-order clusters are then grouped, again on the basis of being nearest neighbors, to form second-order clusters.
- Those second-order clusters are in turn grouped with their nearest neighboring groups to form third-order clusters.
- Those groups converge into the final group, forming the fourth order.
- OR
- The grouping criteria fail and the points are not included in the hierarchy.
20NN Hierarchical Clustering Technique
21Clustering Criteria 1
- Threshold Distance
  - Random Nearest Neighbor Distance (default). This is based on a one-tailed confidence interval around the random expected NN distance. A t-value is computed under the assumption that the degrees of freedom are at least 120 (the next entry in the t-distribution table is infinity). This creates a distance probability between two points.
  - Fixed Distance. The search distance is specified exactly. This allows for comparisons across crime types. The flexibility for exploring various distances is greater, as theory or empirical findings can be tested. Distances from the Moran Correlogram and Ripley's K can be examined.
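A rough sketch of how the default threshold could be computed is given below. It assumes the usual expressions for the random expected nearest neighbor distance and its standard error, and it substitutes a standard normal quantile for the t-value, which is a close approximation once the degrees of freedom reach 120 or more. The function names, the sign convention (a probability below 0.5 shortens the threshold) and the example figures are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def mean_random_distance(area, n):
    """Expected nearest neighbor distance for n points scattered randomly over `area`."""
    return 0.5 * sqrt(area / n)

def se_random_distance(area, n):
    """Standard error of the mean random nearest neighbor distance."""
    return 0.26136 / sqrt(n * n / area)

def threshold_distance(area, n, p):
    """One-tailed limit around the mean random distance (assumed convention).

    p = 0.5 returns the mean random distance itself; smaller p gives a
    shorter, stricter threshold. A normal quantile stands in for the t-value.
    """
    z = NormalDist().inv_cdf(p)
    return mean_random_distance(area, n) + z * se_random_distance(area, n)

# Hypothetical study region: 100 square miles containing 400 incidents.
default_threshold = threshold_distance(100.0, 400, 0.5)
strict_threshold = threshold_distance(100.0, 400, 0.05)
print(default_threshold, strict_threshold)
```

Because both terms depend on the number of incidents, the same probability level yields a different search distance for each crime type. A fixed distance avoids that dependence.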
22Mean Random Distance
Random Nearest Neighbor Distance Index.
Total area of the region.
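For reference, the mean nearest neighbor distance expected under complete spatial randomness is commonly written as follows. The symbols are assumptions, since the slide's own notation is not shown: A is the total area of the region and N is the number of incidents.

```latex
d_{\mathrm{ran}} = 0.5\,\sqrt{\frac{A}{N}}
```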
23Confidence Interval for Mean Random Distance
This confidence interval defines the probability
for the distance between any pair of points.
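One common way to write this interval is given below, as a sketch under assumed notation: d_ran is the mean random distance, A is the total area of the region, N is the number of incidents, and t is the t-value for the chosen probability level.

```latex
d_{\mathrm{ran}} = 0.5\,\sqrt{\frac{A}{N}}, \qquad
SE_{d_{\mathrm{ran}}} = \frac{0.26136}{\sqrt{N^{2}/A}}, \qquad
\mathrm{CI} = d_{\mathrm{ran}} \pm t \cdot SE_{d_{\mathrm{ran}}}
```

For the one-tailed case used as a clustering threshold, only one side of the interval applies.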
24Confidence Interval for Mean Random Distance
[Figure: probability levels for the confidence interval: 0.00001, 0.0001, 0.001, 0.01, 0.05, 0.1, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999]
25Clustering Criteria 2
- Minimum Number of Points
  - Can be specified exactly. This allows for comparisons across crime types. Again, the number selected could come from theory or empirical findings, as they can be tested.
  - Used in combination with the search distance to minimize the possibility of over-identifying clusters based on a minimum distance only; that is, it reduces the chance of finding numerous small clusters. This is particularly needed if a small scale such as a city is being searched.
  - More points reduce the number of clusters found, and fewer points increase it.
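How the two criteria interact can be illustrated with a simplified first-order grouping. The greedy rule below (seed on the point with the most neighbors inside the threshold, keep the group only if it reaches the minimum size) is an assumption chosen for brevity and is not claimed to match the routine demonstrated in these slides. The function names and coordinates are hypothetical.

```python
from math import hypot

def neighbors_within(points, i, candidates, threshold):
    """Indices in `candidates` no farther than `threshold` from point i (includes i)."""
    xi, yi = points[i]
    return [j for j in candidates
            if hypot(points[j][0] - xi, points[j][1] - yi) <= threshold]

def first_order_clusters(points, threshold, min_points):
    """Greedy sketch of first-order clustering under both criteria."""
    unassigned = list(range(len(points)))
    clusters = []
    while unassigned:
        # Seed on the unassigned point with the most unassigned neighbors.
        seed = max(unassigned,
                   key=lambda i: len(neighbors_within(points, i, unassigned, threshold)))
        members = neighbors_within(points, seed, unassigned, threshold)
        if len(members) < min_points:
            break  # no remaining group can meet the minimum size
        clusters.append([points[j] for j in members])
        unassigned = [j for j in unassigned if j not in members]
    return clusters

# Hypothetical incident coordinates: two tight groups and one isolated point.
pts = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)]
clusters_min3 = first_order_clusters(pts, threshold=1.5, min_points=3)
clusters_min4 = first_order_clusters(pts, threshold=1.5, min_points=4)
print(len(clusters_min3), len(clusters_min4))
```

Raising the minimum from three to four points drops the three-point group, which is the trade-off described in the last bullet.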
26Selecting Parameters
- Threshold Distance
  - Theory or empirical evidence is best for starting the exploration. For example, it has been found that convenience store crime occurs within ¾ of a mile of limited-access highway intersections.
  - Use the distance found from Ripley's K or the Moran Correlogram.
- Minimum Number of Points
  - Theory or empirical evidence does not really exist, as the number of incidents does not follow known patterns. However, results from the NNI would be useful.
  - Assign incident counts to areal units and calculate descriptive statistics such as mean, median, standard deviation and percentiles.
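As a small illustration of the last bullet, the statistics can be computed with Python's standard library once incidents have been counted per areal unit. The counts below are invented for the example, and the choice of the "inclusive" percentile method is an assumption; other percentile definitions give slightly different values.

```python
from statistics import mean, median, stdev, quantiles

# Hypothetical incident counts for ten areal units (illustrative only).
counts = [0, 1, 1, 2, 3, 3, 4, 5, 8, 13]

# quantiles(..., n=100) returns the 99 cut points; index k-1 is the k-th percentile.
cut_points = quantiles(counts, n=100, method="inclusive")

summary = {
    "mean": mean(counts),
    "median": median(counts),
    "std_dev": stdev(counts),
    "p75": cut_points[74],
    "p80": cut_points[79],
    "p90": cut_points[89],
}
print(summary)
```

An upper percentile from such a table is one candidate for the minimum number of points, to be tested against other values.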
27Descriptive Statistics Burglary
28Descriptive Statistics Theft
29NNH Clustering Input
30NNH Clustering Output
40Simulation of NNH Clustering
- The routine is not just clustering pairs or fixed orders of points, but clustering as many points as possible within both the threshold and the minimum number of points.
- Given that the threshold and minimum number of points can vary, the probability distribution is not, or rather cannot be, known.
- Monte Carlo simulation of randomness, therefore, needs to be employed. It produces approximate confidence intervals for the first-order clusters but not for higher orders.
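The idea can be sketched as follows: scatter the same number of points at random over the study area many times, count the first-order clusters found in each run, and read approximate limits from the resulting distribution. Everything here is an assumption made for illustration, including the rectangular study area, the simplified greedy clustering rule, the nearest-rank percentile and all parameter values.

```python
import random
from math import hypot

def count_clusters(points, threshold, min_points):
    """Count first-order clusters using a simplified greedy rule (an assumption)."""
    remaining = list(points)
    n_clusters = 0
    while remaining:
        # Largest group of remaining points within `threshold` of any one point.
        best = max(
            ([q for q in remaining if hypot(q[0] - p[0], q[1] - p[1]) <= threshold]
             for p in remaining),
            key=len,
        )
        if len(best) < min_points:
            break
        n_clusters += 1
        remaining = [q for q in remaining if q not in best]
    return n_clusters

def simulate_cluster_counts(n_points, width, height, threshold, min_points,
                            n_sims, seed=0):
    """Cluster counts from `n_sims` random point patterns, sorted ascending."""
    rng = random.Random(seed)  # fixed seed so the runs are repeatable
    counts = []
    for _ in range(n_sims):
        pts = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(n_points)]
        counts.append(count_clusters(pts, threshold, min_points))
    return sorted(counts)

def percentile(sorted_values, p):
    """Nearest-rank style percentile of an ascending list."""
    k = round(p / 100 * (len(sorted_values) - 1))
    return sorted_values[k]

sim = simulate_cluster_counts(n_points=60, width=10.0, height=10.0,
                              threshold=0.5, min_points=4, n_sims=100)
low, high = percentile(sim, 2.5), percentile(sim, 97.5)
print(low, high)
```

An observed number of first-order clusters above the upper limit would suggest more clustering than chance alone produces under these settings.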
41NNH Output with Simulation
42Advantages of NNH Clustering
- Can identify small geographical environments with concentrated incidents.
- Can be applied to the full data set instead of having to carve up the study area into sub-jurisdictions.
- Links between clusters, and among other types of clusters, can be made.
- Demonstrates at which level various neighborhood, policing, community, policy-making, etc. strategies can be focused.
43Limitations of NNH Clustering
- No intensity or weight variable can be assigned. Therefore, this technique is based solely on locations.
- The size of the grouping area depends on the sample size when the confidence interval around the mean random distance is used for the threshold.
- There is a certain arbitrariness to the selection of the minimum number of points. There is no theory to serve as a guide and empirical evidence likely varies from place to place, but descriptive statistics help.
- There is no theory regarding the clusters themselves; they must be interpreted in relation to their environment.
44Risk-Adjusted Nearest Neighbor Hierarchical
Clustering
45Risk Adjusted Process
- Primary and Secondary files are required
  - Primary: observation/incident point locations.
  - Secondary: areal unit with baseline or other location.
- The grid is defined from the minimum bounding rectangle (MBR) specified in the Reference file tab. A standard number of 50 columns is specified, making up the number of cells.
- The area is defined from the Measurement Parameters tab. If no area is defined, then the area of the grid is used. Thus it is based on the MBR of the Secondary file.
46Risk Adjusted Process
- Kernel Density parameters must be specified.
  - Type of Density Estimation/Interpolation
  - Type of Bandwidth
  - Minimum Number of Points
- The Secondary file is interpolated to the grid using the density estimation parameters. It uses the absolute densities, which are the number of points per unit of analysis assigned to each grid cell, rescaled to add up to the same number of points as in the Primary file (incidents). This provides a standardized distribution against which to compare.
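The interpolation and rescaling step might look like the sketch below. It assumes a normal kernel with a fixed bandwidth, a square-celled grid and Secondary records given as (x, y, weight) tuples such as areal-unit centroids with population. The function name and every value are hypothetical, and a 10-column grid is used instead of 50 only to keep the example short.

```python
from math import exp

def baseline_grid(secondary, x_min, y_min, cell, n_cols, n_rows,
                  bandwidth, n_primary):
    """Interpolate weighted Secondary points to a grid, rescaled to n_primary.

    Returns grid[row][col] of expected incident counts whose total equals
    n_primary. The kernel's normalizing constant is omitted because the
    rescaling makes it irrelevant.
    """
    grid = []
    for r in range(n_rows):
        cy = y_min + (r + 0.5) * cell  # cell-center y coordinate
        row = []
        for c in range(n_cols):
            cx = x_min + (c + 0.5) * cell  # cell-center x coordinate
            density = sum(
                w * exp(-((cx - x) ** 2 + (cy - y) ** 2) / (2 * bandwidth ** 2))
                for x, y, w in secondary
            )
            row.append(density)
        grid.append(row)
    total = sum(sum(row) for row in grid)
    scale = n_primary / total
    return [[value * scale for value in row] for row in grid]

# Hypothetical Secondary file: two centroids with populations 1000 and 250.
secondary = [(2.0, 2.0, 1000), (8.0, 8.0, 250)]
expected = baseline_grid(secondary, x_min=0.0, y_min=0.0, cell=1.0,
                         n_cols=10, n_rows=10, bandwidth=2.0, n_primary=120)
print(round(sum(sum(row) for row in expected), 6))
```

Cells near the larger population receive a larger share of the 120 expected incidents, which is what makes the later comparison risk-adjusted.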
47Risk Adjusted Process
- Each incident in the Primary file is assigned to the grid cell that it falls in (point-in-polygon). A unique threshold distance, based on the confidence interval set on the slider bar, is also assigned to that cell.
- Once the pairs of points are selected, the Risk-Adjusted NNH proceeds in the same fashion as the NNH.
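A minimal sketch of the cell assignment and a cell-specific threshold follows. It assumes that the threshold is the random expected nearest neighbor distance evaluated with the cell's own area and its expected (baseline) incident count, and it covers only the 0.5 probability level, where the threshold equals that mean distance. The function names, the handling of empty cells and the example values are assumptions.

```python
from math import sqrt

def cell_of(x, y, x_min, y_min, cell):
    """Row and column of the square grid cell containing the point (x, y)."""
    return int((y - y_min) // cell), int((x - x_min) // cell)

def cell_threshold(expected_count, cell_area):
    """Mean random nearest neighbor distance at the cell's baseline density.

    Returns None when the cell has no expected incidents (an assumed
    convention for this sketch).
    """
    if expected_count <= 0:
        return None
    return 0.5 * sqrt(cell_area / expected_count)

# Hypothetical incident location on a grid of 1 x 1 cells starting at (0, 0).
row, col = cell_of(2.3, 7.9, x_min=0.0, y_min=0.0, cell=1.0)
sparse = cell_threshold(expected_count=4.0, cell_area=1.0)
dense = cell_threshold(expected_count=16.0, cell_area=1.0)
print((row, col), sparse, dense)
```

Under this assumption a cell with a higher baseline gets a shorter threshold, so incidents there must be closer together before they count as clustered.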
48Risk-Adjusted NNH Input
52Exercise
- Calculate Mode and Fuzzy Mode and exclude values within 2 standard deviations and/or percentiles.
- Aggregate crime counts to block groups and derive descriptive statistics of mean, median, mode, standard deviation and the 75th, 80th and 90th percentiles.
- Use the descriptive statistics to set the minimum number of points and thresholds for at least two crime types. Compare the various order ellipses for linkages and similar/different patterns.
53Exercise
- Make several thematic maps showing population, social, economic or other distributions in comparison to the ellipses. Map out any other data layers that might have relevance to the NNH output.
- Run simulations on a subset, adjust the minimum number of points and thresholds, and compare the output with previous output.
54To Do
- Definition of Hot Spot and the 3 examples of what might constitute one.
- Cautions with Fuzzy Mode.
- Graphics of
- Some Process.