Title: Introduction to Spatial Statistics
1Introduction to Spatial Statistics
- Geostatistics Group
- Faye Belshe, Smitri Bhotika, Mike Gil,
- Mike Hyman, Kenny Lopiano, Jada White
- Slides contributed by Dr. Christman and Dr. Young
- October 9, 2009
2Types of Spatial Data
- Continuous Random Field
- Lattice Data
- Point Pattern Data
- Note: Each type of data is analyzed differently
3Geostatistics
- Geostatistical analysis is distinct from other
spatial models in the statistics literature in
that it assumes the region of study is continuous
- Observations could be taken at any point within
the study area
- Interpolation at points in between observed
locations makes sense
4Spatial Autocorrelation
- Spatial modeling is based on the assumption that
observations close in space tend to co-vary more
strongly than those far from each other
- Positively co-vary: values are similar in value
- E.g. elevation (or depth) tends to be similar for
locations close together
- Negatively co-vary: values tend to be opposite in
value
- E.g. density of an organism that is highly
spatially clustered, where observations in
between clusters are low and values within
clusters are high
5Covariance
- Definition: two variables are said to co-vary if
their correlation coefficient is not zero
- $\mathrm{Cov}(X, Y) = \rho\,\sigma_X\,\sigma_Y$
- where $\rho$ is the correlation coefficient between X
and Y and $\sigma_X$ ($\sigma_Y$) is the standard deviation of
X (Y)
- Consider this in the context of a single variable
- E.g. do nearest neighbors have non-zero
covariance?
6Continuous Data Geostatistics
- Notation
- Z(s) is the random process at location s = (x, y)
- z(s) is the observed value of the process at
location s = (x, y)
- D is the study region
- The sample is the set {z(s) : s ∈ D}. We say
that it is a partial realization of the random
spatial process {Z(s) : s ∈ D}
7Conceptual Model
- $Z(s) = \mu(s) + W(s) + \eta(s) + \varepsilon(s)$
- where
- μ(s) is the mean structure, called large-scale
non-spatial trend
- W(s) is a zero-mean, stationary process whose
autocorrelation range is larger than
$\min\{\|s_i - s_j\| : i, j = 1, 2, \ldots, n\}$, called smooth
small-scale variation
- η(s) is a zero-mean, stationary process whose
autocorrelation range is smaller than
$\min\{\|s_i - s_j\| : i, j = 1, 2, \ldots, n\}$ and which is
independent of W(s), called micro-scale variation
or measurement error
- ε(s) is the random noise term with zero mean and
constant variance and which is independent of
W(s) and η(s)
8Simpler Conceptual Model
- $Z(s) = \mu(s) + \delta(s) + \varepsilon(s)$
- where
- μ(s) is the mean structure, called large-scale
non-spatial trend
- δ(s) = W(s) + η(s) is a zero-mean, stationary
process with autocorrelation which combines the
smooth small-scale and micro-scale variation
- ε(s) is the random noise term with zero mean and
constant variance which is independent of W(s)
and η(s)
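The decomposition into trend, autocorrelated variation, and noise can be made concrete with a small simulation. The sketch below is only an illustration on a 1-D transect: the linear trend, the AR(1) process standing in for the stationary autocorrelated term, and all parameter values are assumptions chosen for the example, not part of the slides.

```python
import math
import random


def simulate_profile(n=200, phi=0.9, sd_delta=1.0, sd_eps=0.3, seed=1):
    """Simulate Z(x) = mu(x) + delta(x) + eps(x) at x = 0, 1, ..., n-1.

    mu    : large-scale trend (a hypothetical linear trend)
    delta : zero-mean stationary AR(1) process (assumed stand-in for the
            autocorrelated small-scale variation)
    eps   : independent zero-mean white noise
    """
    rng = random.Random(seed)
    # Innovation sd chosen so the AR(1) process has marginal sd `sd_delta`.
    sd_innov = sd_delta * math.sqrt(1.0 - phi ** 2)
    mu, delta, eps = [], [], []
    d = rng.gauss(0.0, sd_delta)  # start in the stationary distribution
    for x in range(n):
        mu.append(2.0 + 0.05 * x)
        delta.append(d)
        eps.append(rng.gauss(0.0, sd_eps))
        d = phi * d + rng.gauss(0.0, sd_innov)
    z = [m + dl + e for m, dl, e in zip(mu, delta, eps)]
    return mu, delta, eps, z


def lag1_correlation(values):
    """Sample correlation between neighbouring values along the transect."""
    n = len(values)
    mean = sum(values) / n
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in values)
    return num / den


mu, delta, eps, z = simulate_profile(n=2000)
print("lag-1 correlation of delta:", round(lag1_correlation(delta), 2))
print("lag-1 correlation of eps:  ", round(lag1_correlation(eps), 2))
```

Neighbouring values of the autocorrelated term should be strongly correlated, while the noise term should show a lag-1 correlation near zero.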
9Graphical Concept with Trend
Red line indicates large-scale trend. Green line
shows how the data are arranged around the
trend. Note that there is a pattern to the points
around the red line. The pattern implies possible
positive autocorrelation in Z(x). Finally, there
is white noise.
10Graphical Concept without Trend
Red line indicates a constant mean, i.e. no
large-scale trend. Green line shows how the data
are arranged around the trend. Again, the pattern
of the green line implies possible positive
autocorrelation in Z(x).
11Important Point
- The model indicates that Z can be decomposed
into large-scale variation, small-scale and
micro-scale variation, and noise
- The reality is that any estimated decomposition
is not unique
- E.g. in the graph just shown, we could have
instead added a sinusoidal aspect to the
large-scale trend and hence captured much of the
apparent autocorrelation
12Example
Red line indicates large-scale trend captured by
a sinusoidal plus linear trend. Green line shows how
the data are arranged around the trend. Note that
now there is no obvious pattern and so the
remaining unexplained variation in Z(x) is likely
white noise.
13Modeling
- Ultimately we want to do modeling of Z using the
geostatistical model
- Requires estimates of the model components
- The mean
- The small-scale variation and the covariances
among Z values at different locations
- Any leftovers, i.e. the unexplained or residual
variability
14Important Point
- The choice of approach (detailed fit of a trend
vs. large-scale trend plus autocorrelation) to
estimating/predicting Z depends strongly on the
reason for and uses of the model
- E.g. if you are interested in predicting Z at
unsampled locations within the study area, then
any model that uses covariates to estimate
large-scale trend must also have the covariates
known for the unsampled locations
- E.g. if you are interested in understanding the
reasons for the spatial distribution of Z then
you may or may not want to incorporate a spatial
correlation component
15Correlation Structure (Semivariogram)
- Now, to assess spatial autocorrelation we look at
the behavior of the following
- $[Z(s_i) - Z(s_j)]^2$
- for every possible pair of locations in the
dataset (N locations yields N(N-1)/2 pairs).
- Correlated: we would expect Z(s_i) to be similar
in value to Z(s_j) and hence the squared
difference to be small.
- Independent: we would expect the squared
difference to be relatively large since the two
numbers would vary according to the population
variability.
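The pairwise computation described above can be sketched in a few lines. This is a minimal illustration, assuming 2-D coordinates and Euclidean distance; the function name `variogram_cloud` and the four toy locations are made up for the example.

```python
import math
from itertools import combinations


def variogram_cloud(coords, values):
    """Return (distance, squared difference) for every pair of locations."""
    cloud = []
    for i, j in combinations(range(len(values)), 2):
        dist = math.dist(coords[i], coords[j])    # separation of the pair
        sq_diff = (values[i] - values[j]) ** 2    # squared difference in Z
        cloud.append((dist, sq_diff))
    return cloud


# Toy data (hypothetical): 4 locations give 4 * 3 / 2 = 6 pairs.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
values = [1.0, 2.0, 4.0, 1.5]
cloud = variogram_cloud(coords, values)
for dist, sq_diff in sorted(cloud):
    print(f"distance {dist:6.3f}  squared difference {sq_diff:5.2f}")
```

Plotting the squared differences against the distances gives the kind of cloud discussed on the next slide.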
16Plot (Variogram Cloud)
Variogram cloud for a dataset of 400 observations.
Looking for pattern, i.e. is there a trend in γ
with respect to distance between two locations?
17Empirical Variogram
- The variogram cloud is usually very uninformative
- Difficult to discern trend or pattern
- More pertinent is to calculate the average values
of γ for different distances
- Problem is we don't usually have discrete
distances between locations (happens only when
data are on a perfect grid).
- A common method for averaging γ at specific
distances is to bin the distances into intervals
(called lag distances), i.e. use all points
within some bin width around a given distance
value
19Continuous Data Geostatistics
- Because we do not usually have lots of values at
discrete distances, a common method for averaging
the values at discrete distances is to use all
points within some bin width around a given
distance value.
- So we choose several levels of h (distances) and
calculate the empirical semivariogram
- $\hat{\gamma}(h) = \frac{1}{2\,|N(h)|} \sum_{(s_i, s_j) \in N(h)} [Z(s_i) - Z(s_j)]^2$
- where N(h) is the set of all pairs of locations that are a
distance of h apart within a tolerance region
around h, i.e.
- $N(h) = \{(s_i, s_j) : h - \Delta \le \|s_i - s_j\| \le h + \Delta\}$ for some tolerance $\Delta$
- and |N(h)| is the number of pairs in N(h).
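A minimal sketch of the binned (classical) estimator follows, assuming Euclidean distances and a symmetric tolerance `tol` around each lag. The function name, the argument names, and the toy transect are illustrative assumptions.

```python
import math
from itertools import combinations


def empirical_semivariogram(coords, values, lags, tol):
    """Classical estimator: for each lag h, average the squared differences
    over all pairs whose separation is within `tol` of h, then halve it."""
    sums = [0.0] * len(lags)
    counts = [0] * len(lags)
    for i, j in combinations(range(len(values)), 2):
        dist = math.dist(coords[i], coords[j])
        for k, h in enumerate(lags):
            if abs(dist - h) <= tol:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # None marks lags for which no pair fell inside the tolerance region.
    gamma = [s / (2 * c) if c > 0 else None for s, c in zip(sums, counts)]
    return gamma, counts


# Toy transect (hypothetical): five equally spaced points on a line.
coords = [(float(i), 0.0) for i in range(5)]
values = [1.0, 2.0, 4.0, 7.0, 11.0]
gamma, counts = empirical_semivariogram(coords, values, lags=[1.0, 2.0], tol=0.5)
print(gamma, counts)
```

With real data the choice of lags and tolerance matters: bins that are too narrow leave few pairs per lag, while bins that are too wide smooth away the shape of the curve.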
20Empirical Semivariogram
- This plot is called an omnidirectional classical
empirical semivariogram
- Omnidirectional because the direction between the
pairs of locations was ignored
- Classical because of the equation used to estimate
the mean (alternatives exist that are robust to
outliers or to failure of assumptions of the
model)
- Semi because of the division by 2 in the equation
used
Graph based on a set of 20 distance lags
21Important Points
- The constantly increasing semivariogram
indicates that there is a problem with this
dataset
- Ideally, it should at some distance level off at
the variance of the process, implying that at some
distance the relationship between 2 locations is
the same regardless of the distance between them
(i.e. observations are independent at large
distances)
- This graph indicates that
- The data imply correlation exists at all
distances (and therefore the study region is
small relative to the range of autocorrelation)
or
- The data have a large-scale trend which may
account for most of the seeming autocorrelation
(small-scale trend)
22Semivariogram
Empirical semivariogram for a different dataset in
which there was no large-scale trend but definite
autocorrelation
Note the rise and then leveling off of the γ(h)
values as distance increases
We'll cover shapes for variograms in more detail
later
23Semivariogram
Empirical semivariogram for a different dataset in
which there was no large-scale trend and no
autocorrelation
Note that the γ(h) values are more-or-less the
same regardless of distance
24Important Points
- If the empirical semivariogram increases with
distance between locations, then the correlation
between points is decreasing as distance
increases
- The point at which it flattens to a constant
value is the distance at which any two points
that distance or larger apart are independent.
The value of γ there is the variance of the spatial
process
- At this point in our analyses, the number of lag
distances you use is not that critical, but when
we try to fit a curve to the empirical
semivariogram later the number of lags becomes
very important
25Important Point About Directionality
- Another point to consider is whether the pattern
of autocorrelation, i.e. the shape of the curve
describing the semivariogram, is the same in
every direction.
- Can't tell from the omnidirectional plot.
- Need to check if there is a directional effect
26Directional Semivariograms
- To check directionality in the covariance, plot γ
for each h for different directions
- Modify the sets of locations over which the
averaging occurs
- Typically done using a set of binned directions
(wedges of the compass)
- Requires that you modify the definition of
neighborhood
27Directional Semivariograms
EXAMPLE: calculate mean variability for the
angles 0, 22.5, 45, 67.5, 90, and 112.5° with a
tolerance of 11.25° on each side.
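Selecting the pairs that fall in one directional wedge might look like the sketch below. It assumes angles are measured counterclockwise from the x-axis and that a direction and its reverse (180° apart) are treated as the same; other software may use a compass azimuth instead. All function names are hypothetical.

```python
import math


def pair_direction(p, q):
    """Direction of the line through p and q, in degrees within [0, 180)."""
    angle = math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))
    return angle % 180.0


def in_wedge(angle, center, tol=11.25):
    """True if `angle` is within `tol` degrees of `center`, treating
    directions that differ by 180 degrees as identical."""
    diff = abs(angle - center) % 180.0
    return min(diff, 180.0 - diff) <= tol


def directional_pairs(coords, center, tol=11.25):
    """Index pairs (i, j), i < j, whose separation direction is in the wedge."""
    pairs = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if in_wedge(pair_direction(coords[i], coords[j]), center, tol):
                pairs.append((i, j))
    return pairs


# Corners of a unit square (hypothetical data).
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
for center in (0.0, 45.0, 90.0):
    print(center, directional_pairs(coords, center))
```

A directional semivariogram then applies the usual averaging of squared differences, by distance lag, to the pairs returned for each wedge.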
28Need for Assumptions in Order to Proceed Beyond
This Point
- The data that are collected are a partial
observation of the spatial surface (e.g. map)
that we are interested in
- In addition, it is usually assumed that there is
some super process that created the particular
surface for which we have this partial view
- To estimate the spatial autocorrelation we need
to make some assumptions.
- Otherwise, we don't have sufficient information
to make any inferences.
29Two Assumptions
- Stationarity, specifically second-order
stationarity
- Isotropy
30Stationarity
- The mean of the process is constant, i.e. no
trend
- μ(s) = μ for all s ∈ D (1)
- The covariance between any pair of points depends
only on the distance (and possibly direction) of
the points, NOT the location of the points in
space
- $\mathrm{Cov}[Z(s_i), Z(s_j)] = C(s_i - s_j)$
- where C(.) is the covariance function
- This implies that the variance of Z is constant
everywhere
- If both conditions are met then the spatial process
we are studying is said to be second-order
stationary.
31Relationship between Semivariogram and Correlation
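- Assuming intrinsic stationarity, we have
$2\gamma(h) = \mathrm{Var}[Z(s + h) - Z(s)] = E\{[Z(s + h) - Z(s)]^2\}$
- Now, assuming that
$\mathrm{Cov}[Z(s), Z(s + h)] = C(h)$, we have
$\gamma(h) = C(0) - C(h)$
- where $C(0) = \mathrm{Var}[Z(s)]$. Thus,
$\rho(h) = C(h) / C(0) = 1 - \gamma(h) / C(0)$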
32Isotropy
- The covariance between any pair of points does
not depend on direction but only distance
If this holds then the spatial process is said to
be isotropic
33Non-Constant Mean
- Two ways to handle a trend when it does exist
- Detrend the data using regression (or similar)
with covariates and then use the residuals from
the trend analysis for the spatial
autocorrelation analysis
- E.g. disease rates as a function of population
density
- Universal kriging (UK), which allows for
estimating the trend as a global polynomial in
s = (x, y) and estimating the spatial
autocorrelation simultaneously
- UK ignores other explanatory covariates, which can
be advantageous or not depending on the purpose
of your study
34Non-Constant Variance
- To account for heterogeneity (non-constant
variance):
- Estimate variability in smaller subregions of
the study area
- Need to make decisions about the size and extent
of the subregions
- Need sufficient numbers of observations within
each subregion
- Transform or standardize your data so that the
variability of the transformed values is constant
over the region
35Anisotropy
- Two types of anisotropy
- Geometric
- The range over which correlation is non-zero
depends on direction
- The variance is constant over all directions
- This type can be adjusted for in geostatistical
analyses
- Zonal
- Anything not geometric anisotropy
- Anisotropy implies that the spatial process
evolves differentially throughout the study region
36Variography
- Fitting a valid semivariogram function to the
empirical semivariogram
- Now we are interested in describing the variogram
as an equation in which variance is a function
of the distance.
- We shall assume that the spatial process is
second-order stationary and isotropic in the
following.
37Semivariogram
- We have already seen how to obtain the empirical
variogram
- γ(h) is the semivariogram and is the primary
quantity of interest
- Now we are interested in describing the
semivariogram as a function of the distance.
- We shall assume that the spatial process is
second-order stationary and isotropic in the
following.
38Semivariogram
- Semivariogram models have the following
properties
- 1) Many are not linear in their parameters
- 2) Must be conditionally negative-definite,
i.e. the function must satisfy
- $\sum_{i=1}^{m} \sum_{j=1}^{m} a_i a_j \gamma(s_i - s_j) \le 0$
- for any locations $s_1, \ldots, s_m$ and any real numbers $a_1, \ldots, a_m$ satisfying
- $\sum_{i=1}^{m} a_i = 0$
- 3) If $\gamma(h) \to c_0 > 0$ as $h \to 0$, there is
microscale variation which is assumed to be due
to measurement error (ME) or a process occurring
at the microscale. ME is measurable only if we
have replicate values at each location in the
sample.
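Property 2 can be spot-checked numerically for a candidate model. The sketch below assumes an exponential semivariogram with unit sill and unit range parameter (a commonly used valid model, chosen here purely for illustration), draws random locations and random weights that sum to zero, and evaluates the double sum.

```python
import math
import random


def exponential_semivariogram(h, sill=1.0, range_par=1.0):
    """Assumed example model: gamma(h) = sill * (1 - exp(-h / range_par))."""
    return sill * (1.0 - math.exp(-h / range_par))


def quadratic_form(coords, weights, gamma):
    """Double sum of a_i * a_j * gamma(||s_i - s_j||) over all i and j."""
    total = 0.0
    for si, ai in zip(coords, weights):
        for sj, aj in zip(coords, weights):
            total += ai * aj * gamma(math.dist(si, sj))
    return total


rng = random.Random(0)
coords = [(rng.random(), rng.random()) for _ in range(15)]
raw = [rng.uniform(-1.0, 1.0) for _ in range(15)]
mean = sum(raw) / len(raw)
weights = [a - mean for a in raw]  # centred so the weights sum to zero

q = quadratic_form(coords, weights, exponential_semivariogram)
print("sum of weights:", sum(weights))
print("quadratic form:", q)  # should not be positive for a valid model
```

A single draw cannot prove validity, but a clearly positive value for any zero-sum set of weights would show that a candidate function is not a valid semivariogram.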
39Semivariogram
- Semivariogram models have the following
properties
- If γ(h) is constant for every h except h = 0,
where γ(0) = 0, then Z(s) and Z(t) are
uncorrelated for any pair of locations s and t
- $\gamma(h) / h^2 \to 0$ as $h \to \infty$, i.e. h² is increasing faster than
γ(h) as h increases
40A Typical Semivariogram
41Characteristics of the Semivariogram
- It is 0 when the separation distance is 0
(Var(0) = 0).
- Nugget effect: variation in two points very
close together.
- May be measurement error
- May be indicative of erratic process (gold ore).
- The sill corresponds to the overall variance of
the data.
- Data separated by distances less than the range
are spatially autocorrelated (less variation
between close observations than between far
observations).
42Estimating the Semivariogram
- Take all pairwise differences in the data
- (Z(s_i) - Z(s_j)), where s = (x, y) is a point in the 2-D
plane.
- Compute the Euclidean distance between the
spatial locations
- Average pairs that have the same distance class
- Binning like a 2-D histogram.
43End Result Empirical Semivariogram
44Modeling the Semivariogram
- The semivariogram measures variation among units
h units apart.
- Note: We do not want negative standard errors.
- So, we model the semivariogram with selected
parametric functions ensuring all standard errors
are nonnegative.
- We estimate the nugget, sill, and range
parameters of the model that best fit the
empirical semivariogram (nonlinear least squares
problem).
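One simple way to approach this least-squares problem is sketched below, assuming a spherical model in a common textbook parameterization. For a fixed range the nugget and partial sill enter linearly, so they can be solved in closed form, and the range is then chosen from a grid of candidates. This profile-over-a-grid approach is a stand-in for a general nonlinear optimizer, not the method used to produce the slides, and it does not constrain the nugget or partial sill to be nonnegative.

```python
def spherical_shape(h, a):
    """Spherical shape term: rises from 0 at h = 0 to 1 for h >= a."""
    if h >= a:
        return 1.0
    r = h / a
    return 1.5 * r - 0.5 * r ** 3


def fit_spherical(lags, gamma_hat, range_grid):
    """Least-squares fit of gamma(h) = nugget + psill * shape(h; a), h > 0.

    For each candidate range a, nugget and psill are the intercept and
    slope of an ordinary regression of gamma_hat on shape(h; a); the
    candidate with the smallest sum of squared errors is returned."""
    n = len(lags)
    mean_g = sum(gamma_hat) / n
    best = None
    for a in range_grid:
        f = [spherical_shape(h, a) for h in lags]
        mean_f = sum(f) / n
        sxx = sum((fi - mean_f) ** 2 for fi in f)
        if sxx == 0.0:
            continue  # every lag lies beyond a: psill is not identifiable
        sxy = sum((fi - mean_f) * (gi - mean_g) for fi, gi in zip(f, gamma_hat))
        psill = sxy / sxx
        nugget = mean_g - psill * mean_f
        sse = sum((gi - nugget - psill * fi) ** 2 for fi, gi in zip(f, gamma_hat))
        if best is None or sse < best[0]:
            best = (sse, nugget, psill, a)
    sse, nugget, psill, a = best
    return {"nugget": nugget, "sill": nugget + psill, "range": a, "sse": sse}


# Noise-free synthetic semivariogram values (hypothetical parameters).
true_nugget, true_psill, true_range = 0.1, 0.4, 50.0
lags = [5.0 * k for k in range(1, 21)]
gamma_hat = [true_nugget + true_psill * spherical_shape(h, true_range) for h in lags]
fit = fit_spherical(lags, gamma_hat, range_grid=[10.0 * k for k in range(1, 11)])
print(fit)
```

With real empirical values the fit is only as fine as the range grid, so a denser grid or a proper nonlinear optimizer would normally be used to refine it.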
45Selected semivariogram models
46Covariogram Models
Spherical Model
Gaussian Model
Exponential Model
Power Model is simply a reparameterization of the
exponential model.
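For reference, the sketch below writes the three named models as semivariogram functions, using one common textbook parameterization with nugget `nugget`, partial sill `psill`, and range parameter `a`. These exact forms are assumptions for illustration (some texts scale the exponent, for example using 3h/a), and the covariogram is obtained from the relation C(h) = C(0) - γ(h) with C(0) = nugget + psill.

```python
import math


def spherical(h, nugget, psill, a):
    """Spherical model: reaches the sill exactly at h = a."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + psill
    r = h / a
    return nugget + psill * (1.5 * r - 0.5 * r ** 3)


def exponential(h, nugget, psill, a):
    """Exponential model: approaches the sill asymptotically."""
    if h == 0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-h / a))


def gaussian(h, nugget, psill, a):
    """Gaussian model: very smooth (parabolic) behaviour near the origin."""
    if h == 0:
        return 0.0
    return nugget + psill * (1.0 - math.exp(-((h / a) ** 2)))


def covariogram(model, h, nugget, psill, a):
    """C(h) = C(0) - gamma(h), valid under second-order stationarity."""
    return (nugget + psill) - model(h, nugget, psill, a)


for h in (0.0, 10.0, 50.0, 100.0):
    row = [round(m(h, 0.1, 0.4, 50.0), 4) for m in (spherical, exponential, gaussian)]
    print(h, row, round(covariogram(spherical, h, 0.1, 0.4, 50.0), 4))
```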
47Covariogram vs. Semivariogram
The covariogram and semivariogram are related: $\gamma(h) = C(0) - C(h)$
48The fitted semivariogram model
Estimates: nugget = 0.084, sill = 0.269, range = 110.3
miles
49
- Common methods for fitting these functions to a
set of empirical semivariogram means
- 1) Choose the most likely candidate model
- 2) Methods for estimating the parameters of the
model
- Non-linear least squares: allows for
the estimation of parameters that enter the
equation non-linearly but ignores any dependences
among the empirical variogram values
- Non-linear weighted least squares / generalized
least squares: the variance-covariance
of the variogram data points is accounted for in
the estimation procedure
- Maximum likelihood: assumes the data are Normally
distributed, but the estimators are likely to be
highly biased, especially in small samples (the
usual remedy is jackknifing)
- Restricted maximum likelihood: maximizes a
slightly altered likelihood function which
reduces the bias of the MLEs
50Properties of Variogram Models
- If $\gamma(h) \to c_0 > 0$ as $h \to 0$ then there is
microscale variation
- Usually assumed to be due to measurement error
(ME)
- ME is measurable only if we have replicate values
at each location in the sample
- When fitting a variogram function, you may estimate a
non-zero value for c0 even when you do not have
replicate observations at sites. This is called
the nugget.
- If γ(h) is constant for every h except h = 0, where
γ(0) = 0, then Z(s_i) and Z(s_j) are uncorrelated
for any pair of locations s_i and s_j
51Properties of Variogram Models
52Choosing a Best Model
- Need to choose the variogram model that best fits
the data
- Best: minimum unexplained variation after
fitting
- Look at a measure of deviance that compares
$\hat{\gamma}(h_i)$, the empirical semivariogram
for the ith lag, with $\gamma(h_i; \hat{\theta})$, the value
predicted by the fitted semivariogram model
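As one possible deviance measure, the sketch below uses the unweighted sum of squared differences between the empirical and fitted values; weighted versions are also common, and this choice is an assumption made for illustration. The two candidate models and the empirical values are made up.

```python
import math


def sum_squared_deviation(lags, gamma_hat, model):
    """Unweighted sum over lags of (empirical - fitted) squared."""
    return sum((g - model(h)) ** 2 for h, g in zip(lags, gamma_hat))


def exponential_model(h):
    """Hypothetical fitted model: nugget 0.1, partial sill 0.4, range 20."""
    return 0.1 + 0.4 * (1.0 - math.exp(-h / 20.0))


def flat_model(h):
    """Hypothetical pure-nugget model: constant at 0.5 for every h > 0."""
    return 0.5


lags = [5.0, 10.0, 20.0, 40.0, 80.0]
# Made-up empirical values: the exponential curve plus small perturbations.
perturbations = [0.01, -0.01, 0.02, -0.01, 0.0]
gamma_hat = [exponential_model(h) + e for h, e in zip(lags, perturbations)]

sse_exponential = sum_squared_deviation(lags, gamma_hat, exponential_model)
sse_flat = sum_squared_deviation(lags, gamma_hat, flat_model)
print("exponential:", sse_exponential)
print("flat:       ", sse_flat)
```

The model with the smaller value leaves less unexplained variation, although models with different numbers of parameters are not directly comparable on this measure alone.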
53Choosing a Best Model
- In the absence of comparing deviance (or similar)
measures to determine if the model seems
appropriate
- Compare fits visually
- Use prior knowledge from other studies to
determine
54Next Steps
- Using the results of the variography to do
statistical modeling of the spatial process
- Kriging