Spatial Distribution Hot Spot Analysis I presentation

Title: Spatial Distribution Hot Spot Analysis I

1
Spatial Distribution: Hot Spot Analysis I
2
Approaching Hot Spots Analysis

Determine whether the approach should be general
or focused.
Definition of a hot spot must be specified.
Theory or empirical evidence will give an
indication of the scale, time, similarity and
distances to select.
Selection of intensity and/or weight variables
beyond the X, Y locations.
Number of clusters must be specified as either a
fixed or a variable set of clusters.
Visual display should be based on thematic
mapping principles.

3
Nearest Neighbor Methods
[Figure: event points and sample points (x), with
the distances W, X, X2, Y and Z marked]
W: Event to Nearest Event
X: Sample Point to Nearest Event
X2: Sample Point to 2nd Nearest Event
Y: Event-Nearest-Sample-Point to Nearest Event
Z: Event-Nearest-Sample-Point to
Nearest-Event-in-Half-Plane-Not-Containing-Sample-Point
Source: Cressie, Noel A.C. (1993). Statistics for
Spatial Data, Revised Edition. John Wiley and
Sons, Inc., New York, NY, pp. 602-603.
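As a minimal sketch of the two simplest distances in the figure, the following computes W (event to nearest event) and X (sample point to nearest event) by brute force. The coordinates and function names are illustrative assumptions, not part of the original slides.

```python
import math

def nearest_distance(point, others):
    """Distance from `point` to the closest point in `others`."""
    return min(math.dist(point, other) for other in others)

def event_to_nearest_event(events):
    """W: for each event, the distance to its nearest other event."""
    return [
        nearest_distance(e, events[:i] + events[i + 1:])
        for i, e in enumerate(events)
    ]

def sample_to_nearest_event(samples, events):
    """X: for each sample point, the distance to its nearest event."""
    return [nearest_distance(s, events) for s in samples]

# Hypothetical coordinates, for illustration only.
events = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
samples = [(0.0, 1.0)]

w = event_to_nearest_event(events)
x = sample_to_nearest_event(samples, events)
```

The brute-force search is quadratic in the number of events, which is acceptable for a small illustration; a spatial index would be the usual choice for large incident files.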
4
Hot Spot I Measures

Point Techniques
Mode and Fuzzy Mode
Hierarchical Techniques
Nearest Neighbor Hierarchical Clustering
Risk Adjusted Nearest Neighbor Hierarchical
Clustering

5
Mode and Fuzzy Mode
6
Mode and Fuzzy Mode

Mode
For locations that have multiple incidents.
Calculates the frequency of incidents occurring
at each single location; locations are ranked
from highest to lowest.
Is more precise but less flexible.
Fuzzy Mode
For individual incident locations.
A fixed-distance search radius is used to count
the number of incidents near each incident
location.
Incidents are counted multiple times as
neighboring incidents are visited.
Is less precise but more flexible.
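The two counts can be sketched as follows. This is a minimal illustration, assuming that the fuzzy mode count for a location includes the incidents at that location itself; the function names, coordinates and radius are hypothetical.

```python
import math
from collections import Counter

def mode(points):
    """Count incidents at each exact location, ranked highest to lowest."""
    return Counter(points).most_common()

def fuzzy_mode(points, radius):
    """For each distinct location, count the incidents within `radius`.

    An incident is counted once for every location whose search circle
    contains it, so neighboring incidents are counted multiple times.
    """
    counts = {
        center: sum(1 for p in points if math.dist(center, p) <= radius)
        for center in set(points)
    }
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Hypothetical incident coordinates, for illustration only.
incidents = [(1.0, 1.0), (1.0, 1.0), (1.0, 1.5), (5.0, 5.0)]

mode_ranking = mode(incidents)
fuzzy_ranking = fuzzy_mode(incidents, radius=1.0)
```

With these inputs the mode ranks the exact location (1.0, 1.0) first, while the fuzzy mode also credits the nearby location (1.0, 1.5) with the incidents inside its search radius.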

7
Mode and Fuzzy Mode Input
8
Mode and Fuzzy Mode Output
11
Fuzzy Mode Statistics Burglary
12
Fuzzy Mode Statistics Theft
15
Fuzzy Mode Statistics TABC
18
Nearest Neighbor Hierarchical Clustering
19
NN Hierarchical Clustering

Two or more incidents are grouped on the basis of
a criterion, such as being nearest neighbors, to
form first-order clusters.
Those clusters are then grouped, again on the
basis of being nearest neighbors, to form
second-order clusters.
Those second-order clusters are in turn grouped
with their nearest neighboring groups to form
third-order clusters.
Those groups converge into the final group,
forming the fourth order.
OR
The grouping criteria fail and the points are not
included in the hierarchy.
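The grouping sequence above can be sketched in a simplified form. This is not the routine used by any particular package: it assumes that points are linked whenever they lie within a caller-supplied threshold distance, that a group needs a minimum number of members, and that each higher order is built from the mean centers of the groups below it. All names and values are hypothetical.

```python
import math

def group_points(points, threshold, min_points):
    """Group points linked by distances <= threshold.

    Groups with fewer than `min_points` members are dropped, so their
    points are not included in the hierarchy.
    """
    unvisited = set(range(len(points)))
    groups = []
    while unvisited:
        stack = [unvisited.pop()]
        members = []
        while stack:
            i = stack.pop()
            members.append(i)
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= threshold]
            unvisited.difference_update(near)
            stack.extend(near)
        if len(members) >= min_points:
            groups.append([points[i] for i in members])
    return groups

def center(group):
    """Mean center of a group of points."""
    xs, ys = zip(*group)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def hierarchy(points, thresholds, min_points):
    """Return a list of orders; each order is a list of groups.

    `thresholds` holds one distance per order (an assumption of this
    sketch). Grouping stops when an order produces no groups.
    """
    orders = []
    current = points
    for threshold in thresholds:
        groups = group_points(current, threshold, min_points)
        if not groups:
            break
        orders.append(groups)
        current = [center(g) for g in groups]
    return orders

# Hypothetical incidents: two tight groups and one isolated point.
incidents = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
             (10.0, 0.0), (10.0, 1.0), (11.0, 0.0),
             (50.0, 50.0)]
orders = hierarchy(incidents, thresholds=[1.5, 15.0], min_points=2)
```

Here the first order contains two groups of three incidents, the second order joins their centers into one group, and the isolated incident fails the criteria and is left out.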

20
NN Hierarchical Clustering Technique
21
Clustering Criteria 1

Threshold Distance
Random Nearest Neighbor Distance (default). This
is based on a one-tailed confidence interval
around the random expected NN distance. A
t-value is computed under the assumption that the
degrees of freedom are at least 120 (the next
degree in the t-distribution is ∞). This attaches
a probability to the distance between two points.
Fixed Distance. The search distance is specified
exactly. This allows for comparisons across
crime types. The flexibility for exploration of
various distances is greater, as theory or
empirical findings can be tested. Distances from
the Moran Correlogram and Ripley's K can be
examined.

22
Mean Random Distance
Mean random nearest neighbor distance:
$d_{ran} = 0.5\sqrt{A/N}$
A: total area of the region.
N: number of incidents.
23
Confidence Interval for Mean Random Distance
This confidence interval defines the probability
for the distance between any pair of points.
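A minimal sketch of the threshold computation, assuming the usual expressions for the mean random distance, 0.5 * sqrt(A / N), and its standard error, sqrt((4 - pi) * A / (4 * pi * N^2)), and using the normal quantile as the large-sample stand-in for the t-value. The function names and inputs are illustrative.

```python
import math
from statistics import NormalDist

def mean_random_distance(area, n):
    """Expected nearest neighbor distance under spatial randomness."""
    return 0.5 * math.sqrt(area / n)

def standard_error(area, n):
    """Standard error of the mean random distance.

    The leading constant sqrt((4 - pi) / (4 * pi)) is about 0.26136.
    """
    return math.sqrt((4 - math.pi) / (4 * math.pi)) * math.sqrt(area) / n

def threshold_distance(area, n, p):
    """One-tailed confidence limit around the mean random distance.

    `p` is the selected probability level. The normal quantile is used
    in place of the t-value (an assumption that holds for large N).
    """
    z = NormalDist().inv_cdf(p)
    return mean_random_distance(area, n) + z * standard_error(area, n)

# Hypothetical study region: 400 incidents in 100 square miles.
d_ran = mean_random_distance(100.0, 400)
strict = threshold_distance(100.0, 400, p=0.05)
loose = threshold_distance(100.0, 400, p=0.95)
```

A smaller probability gives a shorter threshold distance, so fewer pairs of points qualify for grouping; at p = 0.5 the threshold equals the mean random distance itself.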
24
Confidence Interval for Mean Random Distance
[Figure: probability levels for the confidence
interval, in ascending order: 0.00001, 0.0001,
0.001, 0.01, 0.05, 0.1, 0.5, 0.75, 0.9, 0.95,
0.99, 0.999]
25
Clustering Criteria 2

Minimum Number of Points
Can be specified exactly. This allows for
comparisons across crime types. Again, the
number selected could come from theory or
empirical findings, as they can be tested.
Used in combination with the search distance to
minimize the possibility of over-identifying
clusters based on a minimum distance only; that
is, it reduces the chance of finding numerous
small clusters. This is particularly needed if a
small scale, such as a city, is being searched.
A higher minimum reduces the number of clusters
found, and a lower minimum increases it.

26
Selecting Parameters

Threshold Distance
Theory or empirical evidence is best for
starting the exploration. For example, it has
been found that convenience store crime occurs
within ¾ of a mile of limited-access highway
intersections.
Use a distance found from Ripley's K or the Moran
Correlogram.
Minimum Number of Points
Theory or empirical evidence does not really
exist, as the number of incidents does not follow
known patterns. However, results from the NNI
would be useful.
Assign incident counts to areal units and
calculate descriptive statistics such as the mean,
median, standard deviation and percentiles.
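The descriptive statistics step can be sketched with the Python standard library. The unit identifiers and incident assignments below are hypothetical, and the choice of the "inclusive" percentile method is an assumption made for the illustration.

```python
import statistics
from collections import Counter

def counts_per_unit(incident_units, all_units):
    """Number of incidents in each areal unit, including units with none."""
    tally = Counter(incident_units)
    return [tally.get(unit, 0) for unit in all_units]

def describe(counts):
    """Mean, median, sample standard deviation and selected percentiles."""
    # quantiles(n=100) returns the 99 cut points for the 1st..99th percentile.
    pct = statistics.quantiles(counts, n=100, method="inclusive")
    return {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "stdev": statistics.stdev(counts),
        "p75": pct[74],
        "p90": pct[89],
    }

# Hypothetical data: the areal unit each incident falls in.
incident_units = ["A", "A", "B", "C", "C", "C", "C"]
all_units = ["A", "B", "C", "D"]

counts = counts_per_unit(incident_units, all_units)
summary = describe(counts)
```

An upper percentile of the counts per unit is one candidate for the minimum number of points, since it separates unusually busy units from typical ones.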

27
Descriptive Statistics Burglary
28
Descriptive Statistics Theft
29
NNH Clustering Input
30
NNH Clustering Output
40
Simulation of NNH Clustering

The routine does not just cluster pairs or fixed
orders of points; it clusters as many points as
possible that satisfy both the threshold distance
and the minimum number of points.
Given that the threshold and minimum number of
points can vary, the probability distribution is
not, or rather cannot be, known.
Monte Carlo simulation of randomness, therefore,
needs to be employed. It produces approximate
confidence intervals for the first-order clusters
but not for higher orders.
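A Monte Carlo run of this kind can be sketched as follows. This is a simplified illustration, not the routine of any particular package: it assumes a rectangular study area, uniformly random points, and a first-order clustering that links points within the threshold distance. All names and parameter values are hypothetical.

```python
import math
import random

def count_first_order(points, threshold, min_points):
    """Count groups of points linked by distances <= threshold that have
    at least `min_points` members (a simplified first-order clustering)."""
    unvisited = set(range(len(points)))
    clusters = 0
    while unvisited:
        stack = [unvisited.pop()]
        size = 0
        while stack:
            i = stack.pop()
            size += 1
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= threshold]
            unvisited.difference_update(near)
            stack.extend(near)
        if size >= min_points:
            clusters += 1
    return clusters

def simulate(n_points, width, height, threshold, min_points, runs, seed=0):
    """Cluster counts from `runs` random point patterns, sorted ascending."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        pts = [(rng.uniform(0, width), rng.uniform(0, height))
               for _ in range(n_points)]
        results.append(count_first_order(pts, threshold, min_points))
    return sorted(results)

def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list (0 < p <= 100)."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

sim = simulate(n_points=50, width=10.0, height=10.0,
               threshold=0.5, min_points=3, runs=200)
low, high = percentile(sim, 2.5), percentile(sim, 97.5)
```

An observed number of first-order clusters above `high` would be more than expected from random patterns with the same number of points, threshold and minimum number of points.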

41
NNH Output with Simulation
42
Advantages of NNH Clustering

Can identify small geographical environments with
concentrated incidents.
Can be applied to the full data set instead of
having to carve up the study area into
sub-jurisdictions.
Links between clusters, and among clusters of
other types, can be made.
Shows the level at which various strategies
(neighborhood, policing, community, policy
making, etc.) can be focused.

43
Limitations of NNH Clustering

No intensity or weight variable can be assigned.
Therefore, this technique is based solely on
locations.
Size of the grouping area depends on the sample
size when the confidence interval around the mean
random distance is used for the threshold.
There is a certain arbitrariness to the selection
of the minimum number of points. There is no
theory to serve as a guide and empirical evidence
likely varies from place to place, but
descriptive statistics help.
There is no theory regarding the clusters
themselves; they must be interpreted with regard
to their environment.

44
Risk-Adjusted Nearest Neighbor Hierarchical
Clustering
45
Risk Adjusted Process

Primary and Secondary files are required.
Primary: observation/incident point locations.
Secondary: areal unit with baseline or other
location.
Grid is defined from the minimum bounding
rectangle (MBR) specified in the Reference file
tab. A standard number of 50 columns is
specified, which determines the number of cells.
Area is defined from the Measurement Parameters
tab. If no area is defined, then the area of the
grid is used. Thus it is based on the MBR of the
Secondary file.

46
Risk Adjusted Process

Kernel Density parameters must be specified.
Type of Density Estimation/Interpolation
Type of Bandwidth
Minimum Number of Points
The Secondary file is interpolated to the grid
using the density estimation parameters. It uses
the absolute densities, which are the number of
points per unit of analysis assigned to each grid
cell, rescaled to add up to the same number
of points as in the Primary file (incidents).
This provides a standardized distribution
against which to compare.

47
Risk Adjusted Process

Each incident in the Primary file is assigned to
the grid cell that it falls in
(point-in-polygon). A unique threshold distance,
based on the confidence interval set on the
slider bar, is also assigned to that cell.
Once the pairs of points are selected, the
Risk-Adjusted NNH proceeds in the same fashion as
the NNH.
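Two steps of the risk-adjusted process, rescaling the baseline densities and assigning incidents to grid cells, can be sketched as follows. This is a minimal illustration that assumes square cells, row-major cell numbering and already interpolated baseline densities; the per-cell threshold distance is not computed here. All names and values are hypothetical.

```python
import math

def rescale_to_total(densities, total):
    """Scale the cell densities so that they sum to `total`."""
    current = sum(densities)
    return [d * total / current for d in densities]

def cell_index(point, min_x, min_y, cell_size, n_cols, n_rows):
    """Row-major index of the square grid cell containing `point`.

    Points on the upper or right edge of the grid are placed in the
    last row or column.
    """
    col = min(int((point[0] - min_x) // cell_size), n_cols - 1)
    row = min(int((point[1] - min_y) // cell_size), n_rows - 1)
    return row * n_cols + col

# Hypothetical MBR of 100 x 40 map units split into 50 columns.
min_x, min_y, max_x, max_y = 0.0, 0.0, 100.0, 40.0
n_cols = 50
cell_size = (max_x - min_x) / n_cols
n_rows = math.ceil((max_y - min_y) / cell_size)

# Hypothetical baseline densities (one per cell) and incident points.
baseline = [1.0] * (n_cols * n_rows)
baseline[51] = 5.0
incidents = [(3.0, 2.5), (3.5, 3.0), (99.0, 39.0)]

expected = rescale_to_total(baseline, total=len(incidents))
cells = [cell_index(p, min_x, min_y, cell_size, n_cols, n_rows)
         for p in incidents]
```

After rescaling, the expected values sum to the number of incidents, so the observed count in each cell can be compared directly with its baseline expectation.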

48
Risk-Adjusted NNH Input
52
Exercise

Calculate the Mode and Fuzzy Mode and exclude
values within 2 standard deviations and/or
percentiles.
Aggregate crime counts to block groups and derive
descriptive statistics: the mean, median, mode,
standard deviation and the 75th, 80th and 90th
percentiles.
Use the descriptive statistics to set the minimum
number of points and thresholds for at least two
crime types.
Compare the various order ellipses for linkages
and similar/different patterns.

53
Exercise

Make several thematic maps showing population,
social, economic or other distributions in
comparison to the ellipses. Map out any other
data layers that might have relevance to the NNH
output.
Run simulations on a subset, adjust the minimum
number of points and thresholds, and compare
the output with the previous output.
